[System Design] System Design (1) -- SNAKE & Twitter

Date: 2019-12-08

Tags: system design, snake

Grading Criteria

Workable Solution	15%
Special Cases	20%
Analysis	25%
Trade-offs	15%
Knowledge Base	25%

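The SNAKE Analysis Method

Scenario

Which features? Feature/Interface

Needs

How powerful must the system be? Constraints/Hypotheses/QPS/DAU

Application

What are the main modules? Service/Module

Kilobyte

How is data stored and accessed? Data/Storage/SQL vs. NoSQL/File System/Schema

Evolve

How does the system evolve, fix its shortcomings, and handle problems? Optimize/Special Cases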

Design a Twitter -- An Example

Scenario

Step 1 -- Enumerate: list the features

Register/Login
User Profile Display/Edit
Upload Image/Video
Search
Post/Share a tweet
Timeline/News Feed
Follow/Unfollow a user

Step 2 -- Sort: pick the core features

Post a tweet
Timeline
News Feed
Follow/Unfollow
Register/Login

Needs

Step 1 -- Ask

DAU -- Daily Active Users -- the usual yardstick for how impressive a system is
Twitter: MAU 320M, DAU ~150M+
Read More: http://bit.ly/1Knl0M7

Step 2 -- Predict

Concurrent Users
- Avg Concurrent Users = DAU * average requests per user per day / seconds per day = 150M * 60 / 86400 ~= 100k

Peak: Peak Users = Avg Concurrent Users * 3 ~ 300k
Fast-growing product: Fast Growing = Peak Users * 2 ~ 600k

Read QPS (Queries Per Second): 300k
Write QPS (Queries Per Second): 5k
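
The Predict step is plain arithmetic, so it can be checked with a few lines of code. This is a minimal sketch that uses only the figures quoted above (150M DAU, 60 requests per user per day, and the multipliers 3 and 2). The function name `estimate_load` and its parameter names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(dau, requests_per_user_per_day, peak_factor=3, growth_factor=2):
    """Back-of-envelope load estimate following the Predict step."""
    average = dau * requests_per_user_per_day / SECONDS_PER_DAY
    peak = average * peak_factor            # Peak Users = Avg * 3
    fast_growing = peak * growth_factor     # Fast Growing = Peak * 2
    return {"average": average, "peak": peak, "fast_growing": fast_growing}


load = estimate_load(dau=150_000_000, requests_per_user_per_day=60)
for name, value in load.items():
    print(f"{name}: {value:,.0f}")
```

The unrounded results are about 104k, 312k, and 625k, which the post rounds down to 100k, 300k, and 600k.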

Application -- Service/Module

Receptionist

User Service: Register/Login
Tweet Service: Post a tweet/News Feed/Timeline
Media Service: Upload Picture/Video
Friendship Service: Follow/Unfollow

Replay -- replay the requirements

Merge -- merge the requirements

Kilobyte -- Data/Storage

Basics

Relational database (SQL Database): User Table
Non-relational database (NoSQL Database): Tweets, Social Graph (Followers)
File System: Images, Videos, other media files

Program = Algorithm + Data Structure
System = Service + Data Storage

User Service: SQL
Tweet Service: NoSQL
Media Service: File System
Friendship Service: SQL/NoSQL

Select

Choose a suitable storage structure for each App/Service

Schema

Refine the database structure

Please Design Schema

User Table
userId	integer
username	varchar
email	varchar
password	varchar

Friendship Table
relationshipId	integer
from_userId	foreign key
to_userId	foreign key

Tweet Table
tweetId	integer
userId	foreign key
time	timestamp
content	text
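
To make the three tables concrete, here is a minimal sketch that creates them in an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for whichever SQL or NoSQL stores are chosen. The plural table names, the `NOT NULL` and `UNIQUE` constraints, and the sample rows are assumptions added for illustration.

```python
import sqlite3

# Assumption: column names follow the tables above; constraints are illustrative.
SCHEMA = """
CREATE TABLE users (
    userId   INTEGER PRIMARY KEY,
    username VARCHAR NOT NULL UNIQUE,
    email    VARCHAR NOT NULL,
    password VARCHAR NOT NULL          -- in practice a password hash, not plaintext
);
CREATE TABLE friendships (
    relationshipId INTEGER PRIMARY KEY,
    from_userId    INTEGER NOT NULL REFERENCES users(userId),
    to_userId      INTEGER NOT NULL REFERENCES users(userId)
);
CREATE TABLE tweets (
    tweetId INTEGER PRIMARY KEY,
    userId  INTEGER NOT NULL REFERENCES users(userId),
    time    TIMESTAMP NOT NULL,
    content TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com', 'hash-a')")
conn.execute("INSERT INTO users VALUES (2, 'bob', 'bob@example.com', 'hash-b')")
conn.execute("INSERT INTO friendships VALUES (1, 2, 1)")  # bob (2) follows alice (1)
conn.execute("INSERT INTO tweets VALUES (1, 1, '2019-12-08 10:00:00', 'hello')")

# Tweets written by the users that bob (userId 2) follows, newest first.
rows = conn.execute(
    """
    SELECT t.tweetId, t.content
    FROM tweets AS t
    JOIN friendships AS f ON f.to_userId = t.userId
    WHERE f.from_userId = ?
    ORDER BY t.time DESC
    """,
    (2,),
).fetchall()
print(rows)
```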

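How is the News Feed stored and accessed?

Pull vs. Push (the celebrity problem and the zombie-follower problem)

Pull Model

Fetch the first k tweets from each friend and merge them into a k-item news feed
- K-way merge algorithm: Merge K sorted arrays
With N friends, the time to build the news feed is ==>
- the time for N DB reads + the K-way merge time (negligible)
Post a tweet ==>
- the time for 1 DB write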
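
The merge step of the Pull model can be sketched with the standard library's `heapq.merge`. The sketch assumes each friend's timeline has already been read from the database as a list of `(timestamp, tweet_id)` tuples sorted newest first. The function name `pull_news_feed` and the sample data are hypothetical.

```python
import heapq
from itertools import islice


def pull_news_feed(timelines, k):
    """Merge per-friend timelines (each sorted newest first) into the k newest tweets."""
    # Each friend contributes at most k tweets, since older ones cannot reach the top k.
    heads = [timeline[:k] for timeline in timelines]
    # heapq.merge needs sorted inputs; reverse=True means they are in descending order.
    merged = heapq.merge(*heads, reverse=True)
    return list(islice(merged, k))


timelines = [
    [(9, "a3"), (4, "a2"), (1, "a1")],   # friend A
    [(8, "b2"), (2, "b1")],              # friend B
    [(7, "c1")],                         # friend C
]
feed = pull_news_feed(timelines, k=3)
print(feed)
```

The merge runs in memory over at most N * k items, which is why its cost is treated as negligible next to the N database reads.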

Pull Workflow Diagram

Client ---->sends a get News Feed request to----> Server
Server <----gets the followings from----> Friendship Table
Server <----gets the followings' tweets from----> Tweet Table
Server ---->merges the tweets and returns them to----> Client

Drawbacks of the Pull Model

Reads are slow (N DB reads take a long time)
The reads happen while the user is waiting for the News Feed request, so the user sees the latency

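Push Model

Algorithm

Keep a list for each user that stores their News Feed;
When a user posts a tweet, push it (fanout) to the list of each follower, one by one;
When a user views the News Feed, just read the latest 100 items from the list

Complexity

Each News Feed request needs only one DB read;
Each Post Tweet fans out to N followers and needs N DB writes;
However, the fanout for Post Tweet can run as an asynchronous background task, so the user does not have to wait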

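postTweet(POST, tweet_info) {
    // 1 DB write: store the tweet. userId identifies the author and keys that user's News Feed List.
    tweet = DB.insertTweet(userId, tweet_info);
    // Hand the fanout to a background task so the user does not wait for the N writes.
    AsyncService.fanoutTweet(userId, tweet);
    return success;
}
AsyncService::fanoutTweet(userId, tweet) {
    followers = DB.getFollowers(userId);
    // N DB writes: add the tweet to each follower's News Feed List.
    for (follower : followers) {
        DB.insertNewsFeed(follower.userId, tweet);
    }
}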
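
The same flow can be tried out in a runnable form. The following is a minimal Python sketch. In-memory dictionaries stand in for the database and a `ThreadPoolExecutor` stands in for the asynchronous service. All names here (`followers`, `news_feeds`, `post_tweet`, `fanout_tweet`, `get_news_feed`) are illustrative assumptions rather than a real API.

```python
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor

FEED_LIMIT = 100  # the News Feed read returns the latest 100 items

followers = defaultdict(set)                                # author -> follower ids
news_feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))  # user -> feed, newest first
executor = ThreadPoolExecutor(max_workers=4)                # stand-in for the async service


def fanout_tweet(user_id, tweet):
    """N writes: copy the tweet into every follower's News Feed list."""
    for follower_id in followers[user_id]:
        news_feeds[follower_id].appendleft(tweet)


def post_tweet(user_id, text):
    """Create the tweet, then hand the fanout to a background worker."""
    tweet = {"user_id": user_id, "text": text}
    future = executor.submit(fanout_tweet, user_id, tweet)
    return tweet, future  # a real caller would return without waiting on the future


def get_news_feed(user_id):
    """One read: the feed is already materialized."""
    return list(news_feeds[user_id])


followers["alice"].update({"bob", "carol"})
tweet, future = post_tweet("alice", "hello")
future.result()  # wait here only so the demo output is deterministic
print(get_news_feed("bob"))
```

The `deque(maxlen=...)` drops the oldest entry once a feed holds 100 items, which keeps each list bounded.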

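Drawbacks of the Push Model

The fanout behind postTweet() runs asynchronously, and fanoutTweet() can run into trouble when the number of followers is very large.

Push vs. Pull Comparison

Facebook	Pull
Twitter	Pull
Instagram	Pull + Push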

Evolve: Optimize/Maintenance

Step 1: Optimize

Solve Problems: Push vs. Pull; Normalize vs. De-normalize
More Features: Edit; Delete; Media; Ads
Special Cases: celebrity accounts ("big V"), hot tweets, inactive users

Step 2: Maintenance

Robustness: what happens if a server/DB goes down
Scalability: how to scale out when traffic surges

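Fixing the Pull Model's Drawback: DB Reads

Put a cache in front of the DB;
Cache each user's Timeline
- Building a feed takes N DB reads, so cache the most recent 100 tweets of each Timeline
Cache each user's News Feed
- Users whose News Feed has not been cached recently: merge the latest 100 tweets from each of the N friends and take the top 100;
- Users whose News Feed was cached recently: merge in only the tweets after a given timestamp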
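
The two News Feed cases can be sketched as one refresh function. This is a minimal in-memory sketch. The dictionaries stand in for the Timeline cache and the News Feed cache, tweets are `(timestamp, text)` tuples sorted newest first, and the name `refresh_news_feed` and the sample data are assumptions for illustration.

```python
import heapq
from itertools import islice

FEED_SIZE = 100

timelines = {}    # user -> list of (timestamp, text), newest first (Timeline cache)
following = {}    # user -> list of followed users
feed_cache = {}   # user -> (last_refresh_timestamp, cached News Feed)


def refresh_news_feed(user, now):
    friends = following.get(user, [])
    cached = feed_cache.get(user)
    if cached is None:
        # Cold path: merge the latest 100 tweets of each friend, keep the top 100.
        sources = [timelines.get(f, [])[:FEED_SIZE] for f in friends]
    else:
        # Warm path: merge only tweets newer than the last refresh into the old feed.
        since, old_feed = cached
        sources = [
            [t for t in timelines.get(f, [])[:FEED_SIZE] if t[0] > since]
            for f in friends
        ]
        sources.append(old_feed)
    feed = list(islice(heapq.merge(*sources, reverse=True), FEED_SIZE))
    feed_cache[user] = (now, feed)
    return feed


timelines.update({"a": [(5, "a5"), (1, "a1")], "b": [(3, "b3")]})
following["me"] = ["a", "b"]
first = refresh_news_feed("me", now=6)    # cold: full merge
timelines["b"].insert(0, (8, "b8"))       # friend b posts after the first refresh
second = refresh_news_feed("me", now=9)   # warm: only (8, "b8") is merged in
print(first)
print(second)
```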

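Fixing the Push Model's Drawbacks

It uses more disk storage
- compared with the Pull model, whose cache lives in memory, although disk is cheap
In terms of real-time freshness, Push actually does worse than Pull
So, to deal with inactive users, rank the followers (for example, by recent activity) and push to the active ones first
When the number of followers far exceeds the number of followings, add a few machines for the push tasks
If adding servers does not solve it: for sustained fast growth, evaluate the system and switch from the push model to the pull model
Tradeoff: use pull for celebrity users and push for ordinary users (as in WeChat Moments);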
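
The tradeoff in the last line can be sketched as a hybrid feed: ordinary users fan out on write, while celebrity timelines are merged in at read time. This is a minimal sketch under stated assumptions. The threshold of 3 followers is artificially small for the demo, timestamps are assumed to increase with each post, and every name in the code is hypothetical.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 3   # hypothetical cutoff; a real system would use a far larger number
FEED_SIZE = 100

followers = {}      # author -> set of followers
following = {}      # user -> set of followed authors
timelines = {}      # author -> list of (timestamp, text), newest first
pushed_feeds = {}   # user -> list of (timestamp, text), newest first


def is_celebrity(author):
    return len(followers.get(author, ())) >= CELEBRITY_THRESHOLD


def post_tweet(author, timestamp, text):
    tweet = (timestamp, text)
    timelines.setdefault(author, []).insert(0, tweet)
    if is_celebrity(author):
        return  # no fanout for celebrities; followers pull at read time
    for follower in followers.get(author, ()):
        feed = pushed_feeds.setdefault(follower, [])
        feed.insert(0, tweet)   # push: newest first
        del feed[FEED_SIZE:]    # keep the list bounded


def get_news_feed(user):
    # Pull only the celebrity timelines and merge them with the pushed feed.
    celebrity_timelines = [
        timelines.get(author, [])[:FEED_SIZE]
        for author in following.get(user, ())
        if is_celebrity(author)
    ]
    merged = heapq.merge(pushed_feeds.get(user, []), *celebrity_timelines, reverse=True)
    return list(islice(merged, FEED_SIZE))


followers.update({"star": {"u1", "u2", "u3"}, "friend": {"u1"}})
following["u1"] = {"star", "friend"}
post_tweet("friend", 1, "hi")       # pushed into u1's feed
post_tweet("star", 2, "big news")   # stored only on star's timeline
feed = get_news_feed("u1")
print(feed)
```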

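How to Implement Follow and Unfollow

After Follow: asynchronously merge that user's Timeline into your News Feed
After Unfollow: asynchronously remove that user's Tweets from your News Feed

Benefit of async: the user gets feedback quickly and sees the action succeed without waiting for the asynchronous work to actually finish
Drawback of async: if you refresh right after unfollowing, that user's Tweets may still be there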
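
The two background jobs can be sketched as plain functions. In a real service they would be queued and run asynchronously, but here they are called directly so the result is easy to inspect. Tweets are assumed to be `(timestamp, author, text)` tuples sorted newest first, and the function names are hypothetical.

```python
import heapq
from itertools import islice

FEED_SIZE = 100

timelines = {}    # author -> list of (timestamp, author, text), newest first
news_feeds = {}   # user -> list of (timestamp, author, text), newest first


def merge_timeline_into_feed(user, author):
    """Job run after Follow: merge the author's Timeline into the user's News Feed."""
    merged = heapq.merge(
        news_feeds.get(user, []), timelines.get(author, []), reverse=True
    )
    news_feeds[user] = list(islice(merged, FEED_SIZE))


def remove_author_from_feed(user, author):
    """Job run after Unfollow: drop the author's tweets from the user's News Feed."""
    news_feeds[user] = [t for t in news_feeds.get(user, []) if t[1] != author]


timelines["a"] = [(3, "a", "x"), (1, "a", "y")]
news_feeds["me"] = [(2, "b", "z")]

merge_timeline_into_feed("me", "a")
after_follow = list(news_feeds["me"])
remove_author_from_feed("me", "a")
after_unfollow = list(news_feeds["me"])
print(after_follow)
print(after_unfollow)
```

Until `remove_author_from_feed` has run, a refresh still shows the unfollowed user's tweets, which is the drawback described above.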

How to Store Likes

Normalized: two tables combined with a Join, which takes more time
So use the denormalized form: De-normalize
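
One common reading of this choice is sketched here with Python's `sqlite3`. The normalized read counts rows through a join, while the denormalized read uses a counter stored on the tweet row. The `like_count` column, the `likes` table layout, and the function names are assumptions for illustration, since the post does not spell out the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tweets (
        tweetId    INTEGER PRIMARY KEY,
        content    TEXT,
        like_count INTEGER NOT NULL DEFAULT 0   -- denormalized counter
    );
    CREATE TABLE likes (
        tweetId INTEGER NOT NULL,
        userId  INTEGER NOT NULL,
        PRIMARY KEY (tweetId, userId)
    );
    """
)
conn.execute("INSERT INTO tweets (tweetId, content) VALUES (1, 'hello')")
conn.commit()


def like(tweet_id, user_id):
    """Record the like and bump the counter in one transaction."""
    with conn:
        conn.execute("INSERT INTO likes VALUES (?, ?)", (tweet_id, user_id))
        conn.execute(
            "UPDATE tweets SET like_count = like_count + 1 WHERE tweetId = ?",
            (tweet_id,),
        )


def likes_normalized(tweet_id):
    """Read path that joins and counts rows in the likes table."""
    row = conn.execute(
        """
        SELECT COUNT(l.userId)
        FROM tweets AS t
        LEFT JOIN likes AS l ON l.tweetId = t.tweetId
        WHERE t.tweetId = ?
        """,
        (tweet_id,),
    ).fetchone()
    return row[0]


def likes_denormalized(tweet_id):
    """Read path that touches only the tweet row."""
    row = conn.execute(
        "SELECT like_count FROM tweets WHERE tweetId = ?", (tweet_id,)
    ).fetchone()
    return row[0]


like(1, 10)
like(1, 11)
print(likes_normalized(1), likes_denormalized(1))
```

The denormalized form pays one extra write per like to make every read cheaper, which suits a workload where reads far outnumber writes.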

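Problems After a Celebrity Posts a Tweet

A huge number of requests hit the same piece of data within a short time:

A load balancer, sharding, and consistent hashing are not very effective here, because every request targets the same item;
Adding a cache handles this well;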

Follow Up 1:

Like, Retweet, and Comment all change the tweet's basic information. How should it be updated?
- Write through; Write back; Look aside
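
The three strategies differ in when the database and the cache are written. This is a minimal sketch in which plain dictionaries stand in for the database and the cache. The function names are illustrative, and the look-aside variant shown (write to the DB, then invalidate the cached copy) is one common form rather than the only one.

```python
db = {}         # stand-in for the persistent store
cache = {}      # stand-in for an in-memory cache
dirty = set()   # keys changed in the cache but not yet persisted (write-back only)


def write_through(key, value):
    """Write to the DB and the cache together on every update."""
    db[key] = value
    cache[key] = value


def write_back(key, value):
    """Write to the cache now; the DB is updated later by flush()."""
    cache[key] = value
    dirty.add(key)


def flush():
    """Persist all pending write-back entries."""
    for key in list(dirty):
        db[key] = cache[key]
    dirty.clear()


def look_aside_write(key, value):
    """Write to the DB, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)


def look_aside_read(key):
    """Serve from the cache; on a miss, load from the DB and fill the cache."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]
```

For example, after `write_back("t1", 2)` the database still holds the old value until `flush()` runs, which is the risk that write-back trades for faster writes.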

Follow Up 2:

What if the cache fails, for example because memory runs short or a bad caching decision causes the tweet
Answer: http://www.cs.utah.edu/~stuts...

While building, maintaining, and evolving our system we have learned the following lessons. (1) Separating cache and persistent storage systems allows us to independently scale them. (2) Features that improve monitoring, debugging and operational efficiency are as important as performance. (3) Managing stateful components is operationally more complex than stateless ones. As a result keeping logic in a stateless client helps iterate on features and minimize disruption. (4) The system must support gradual rollout and rollback of new features even if it leads to temporary heterogeneity of feature sets. (5) Simplicity is vital.