Work Solution | 15% |
Special Case | 20% |
Analysis | 25% |
Trade-off | 15% |
Knowledge | 25% |
What features are needed? Feature/Interface
How strong must the system be? Constraints/Hypothesis/QPS/DAU
What are the main components? Service/Module
How is data stored and accessed? Data/Storage/SQL vs. NoSQL/File System/Schema
How does the system evolve, fix defects, and handle problems? Optimize/Special Case
Register/Login
User Profile Display/Edit
Upload Image/Video
Search
Post/Share a tweet
Timeline/News Feed
Follow/Unfollow a user
Post a tweet
Timeline
News Feed
Follow/Unfollow
Register/Login
DAU -- Daily Active Users -- the usual yardstick for how big a system is
Twitter: MAU 320M, DAU ~150M+
Read More: http://bit.ly/1Knl0M7
Concurrent Users
Avg Concurrent Users = DAU * average requests per user per day / seconds per day
= 150M * 60 / 86400 ~= 100k
Peak Users = Avg Concurrent Users * 3
~ 300k
Fast Growing (for a rapidly growing product) = Peak Users * 2
~ 600k
Read QPS (Queries Per Second): 300k
Write QPS (Queries Per Second): 5k
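The estimate above is plain arithmetic, so it is easy to re-run with other inputs. The sketch below is a minimal illustration; the function name `estimate_load` and its parameter names are made up for this example, and the factors 3 and 2 are the ones used in these notes.

```python
SECONDS_PER_DAY = 86_400

def estimate_load(dau, requests_per_user_per_day, peak_factor=3, growth_factor=2):
    """Return (average, peak, fast-growing peak) in requests per second.

    peak_factor and growth_factor are the rule-of-thumb multipliers from
    the notes, not measured values.
    """
    average = dau * requests_per_user_per_day / SECONDS_PER_DAY
    peak = average * peak_factor
    fast_growing = peak * growth_factor
    return average, peak, fast_growing

# Figures from the notes: 150M daily active users, 60 requests per user per day.
average, peak, fast_growing = estimate_load(150_000_000, 60)
print(f"average ~ {average:,.0f}/s, peak ~ {peak:,.0f}/s, fast growing ~ {fast_growing:,.0f}/s")
```

The unrounded average is about 104k requests per second, which the notes round down to 100k before multiplying.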
Receptionist
User Service: Register/Login
Tweet Service: Post a tweet/News Feed/Timeline
Media Service: Upload Picture/Video
Friendship Service: Follow/Unfollow
SQL Database (relational): User Table
NoSQL Database (non-relational): Tweets, Social Graph (Followers)
File System: Images, Videos, other media files
Program = Algorithm + Data Structure
System = Service + Data Storage
User Service: SQL
Tweet Service: NoSQL
Media Service: File System
Friendship Service: SQL/NoSQL
Choose a suitable storage structure for each App/Service
Refine the database schema
User Table | |
---|---|
userId | integer |
username | varchar |
varchar | |
password | varchar |
Friendship Table | |
---|---|
relationshipId | integer |
from_userId | foreign key |
to_userId | foreign key |
Tweet Table | |
---|---|
tweetId | integer |
userId | foreign key |
time | timestamp |
content | text |
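To make the three tables concrete, here is a minimal sketch that creates them in an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in for illustration (the notes place tweets in a NoSQL store), the column sizes are assumptions, and only the columns that are named in the tables above are included. The helper names `open_db` and `timeline` are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user (
    userId   INTEGER PRIMARY KEY,
    username VARCHAR(64)  NOT NULL,
    password VARCHAR(128) NOT NULL      -- assumed to hold a hash, not plain text
);
CREATE TABLE friendship (
    relationshipId INTEGER PRIMARY KEY,
    from_userId    INTEGER NOT NULL REFERENCES user(userId),  -- the follower
    to_userId      INTEGER NOT NULL REFERENCES user(userId)   -- the followed user
);
CREATE TABLE tweet (
    tweetId INTEGER PRIMARY KEY,
    userId  INTEGER NOT NULL REFERENCES user(userId),
    time    TIMESTAMP NOT NULL,
    content TEXT NOT NULL
);
"""

def open_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def timeline(conn, user_id, limit=100):
    """Return one user's own tweets, newest first."""
    rows = conn.execute(
        "SELECT tweetId, content FROM tweet "
        "WHERE userId = ? ORDER BY time DESC LIMIT ?",
        (user_id, limit),
    )
    return rows.fetchall()

conn = open_db()
conn.execute("INSERT INTO user VALUES (1, 'alice', 'hash-a')")
conn.execute("INSERT INTO user VALUES (2, 'bob', 'hash-b')")
conn.execute("INSERT INTO friendship VALUES (1, 2, 1)")  # bob follows alice
conn.execute("INSERT INTO tweet VALUES (1, 1, '2024-01-01 10:00:00', 'first')")
conn.execute("INSERT INTO tweet VALUES (2, 1, '2024-01-01 11:00:00', 'second')")
```

The `timeline` query corresponds to the Timeline feature: a single read on the Tweet Table filtered by `userId`.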
Pull Model
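Fetch the first k tweets from each friend and merge them into a k-item news feed
K-way merge algorithm: Merge K sorted arrays
With N friends, reading the News Feed costs ==>
N DB reads + the K-way merge (negligible)
Post a tweet ==>
1 DB write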
Client ----> sends a get News Feed request to ----> Server
Server <---- gets Followings from ----> Friendship Table
Server <---- gets Tweets of Followings from ----> Tweet Table
Server ----> merges Tweets and returns them to ----> Client
Reads are slow (N DB reads, which is very slow)
The reads happen while the user is waiting for the News Feed request, so the user sees the latency
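The pull steps above can be sketched with `heapq.merge` from the Python standard library, which performs the K-way merge lazily. This is an illustration under stated assumptions: `timelines` is a hypothetical in-memory stand-in for the Tweet Table, each timeline is already sorted newest first, and a tweet is a `(timestamp, text)` tuple.

```python
import heapq
from itertools import islice

def pull_news_feed(followings, timelines, k):
    """Merge the newest k tweets of every followed user into one k-item feed.

    followings: list of user ids the reader follows (N entries).
    timelines:  dict user id -> list of (timestamp, text), newest first.
    """
    # One read per followed user: this loop is the "N DB reads" in the notes.
    per_friend = [timelines.get(user, [])[:k] for user in followings]
    # K-way merge of lists that are each sorted by descending timestamp.
    merged = heapq.merge(*per_friend, key=lambda tweet: tweet[0], reverse=True)
    return list(islice(merged, k))

timelines = {
    "a": [(9, "a9"), (4, "a4"), (1, "a1")],
    "b": [(8, "b8"), (7, "b7")],
    "c": [],
}
feed = pull_news_feed(["a", "b", "c"], timelines, k=3)
```

Because the merge is lazy and stops after k items, its cost is small next to the N reads, which matches the "negligible" remark in the notes.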
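Push Model
Keep a List for each user that stores that user's News Feed;
When a user posts a tweet, push (fan out) the tweet to the List of every Follower, one by one;
When a user views the News Feed, read the latest 100 entries from the List
Each News Feed read needs only one DB read;
Each Post Tweet fans out to N Followers and needs N DB writes;
The fanout can run as an asynchronous background task, however, so the user does not have to wait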
postTweet(POST, tweet_info) {
    tweet = DB.insertTweet(userId, tweet_info);
    // userId maps to this user's News Feed List
    AsyncService.fanoutTweet(userId, tweet);
    return success;
}

AsyncService::fanoutTweet(userId, tweet) {
    followers = DB.getFollowers(userId);
    for (follower : followers) {
        DB.insertNewsFeed(follower.userId, tweet);
    }
}
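postTweet() returns right away because the fanout runs asynchronously; fanoutTweet(), however, can run into trouble when the number of followers is very large.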
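A runnable version of the push flow might look like the sketch below. It is not the author's implementation: the database is replaced by in-memory dictionaries, the asynchronous service is simulated by a job queue that `run_fanout_worker` drains on demand, and all names are hypothetical.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # each News Feed list keeps only the newest 100 tweets

followers = defaultdict(set)                                 # author id -> follower ids
news_feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))   # reader id -> tweets, newest first
tweets = []                                                  # stand-in for the Tweet Table
fanout_queue = deque()                                       # stand-in for the async task queue

def post_tweet(user_id, content):
    """Store the tweet (1 write) and defer the fanout; returns immediately."""
    tweet = (len(tweets) + 1, user_id, content)  # (tweetId, userId, content)
    tweets.append(tweet)
    fanout_queue.append(tweet)
    return tweet

def run_fanout_worker():
    """Background step: copy each queued tweet into every follower's list (N writes)."""
    while fanout_queue:
        tweet = fanout_queue.popleft()
        for follower_id in followers[tweet[1]]:
            # appendleft on a full deque drops the oldest entry from the right.
            news_feeds[follower_id].appendleft(tweet)

def get_news_feed(user_id):
    """A single read: the list is already materialized."""
    return list(news_feeds[user_id])

followers[1] = {2, 3}
post_tweet(1, "hello")
feed_before = get_news_feed(2)   # the fanout has not run yet
run_fanout_worker()
feed_after = get_news_feed(2)
```

The gap between `feed_before` and `feed_after` is the delay a follower can observe while the fanout is still queued.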
Pull | |
Pull | |
Pull + Push |
Solve Problems: Push vs. Pull; Normalize vs. De-normalize
More Features: Edit; Delete; Media; Ads
Special Cases: celebrity accounts, viral tweets, inactive users
Robustness: what happens if a server or DB goes down
Scalability: how to scale out when traffic surges
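Add a Cache in front of the DB;
Cache each user's Timeline
A pull otherwise costs N DB reads, so cache the latest 100 tweets per user
Cache each user's News Feed
For a user whose News Feed has not been cached recently: merge the latest 100 Tweets of each of the N friends and take the top 100;
For a user whose News Feed was cached recently: merge only the tweets posted after a given timestamp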
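The two News Feed cache cases can share one code path, as in the sketch below. It assumes tweets are `(timestamp, text)` tuples with strictly increasing timestamps, that each timeline is stored newest first, and that `feed_cache` is a plain dictionary standing in for a real cache; all names are hypothetical.

```python
import heapq
from itertools import islice

FEED_SIZE = 100
feed_cache = {}  # user -> (timestamp of last refresh, cached feed newest first)

def news_feed(user, followings, timelines, now):
    """Return the feed, merging only tweets newer than the last cached refresh."""
    cached = feed_cache.get(user)
    if cached is None:
        since, old_feed = float("-inf"), []   # cold cache: full merge
    else:
        since, old_feed = cached              # warm cache: incremental merge
    fresh = [
        [t for t in timelines.get(friend, [])[:FEED_SIZE] if t[0] > since]
        for friend in followings
    ]
    merged = heapq.merge(*fresh, old_feed, key=lambda t: t[0], reverse=True)
    feed = list(islice(merged, FEED_SIZE))
    feed_cache[user] = (now, feed)
    return feed

timelines = {"a": [(5, "a5"), (2, "a2")], "b": [(4, "b4")]}
first = news_feed("me", ["a", "b"], timelines, now=5)    # cold: merges everything
timelines["b"].insert(0, (7, "b7"))                      # b posts a new tweet
second = news_feed("me", ["a", "b"], timelines, now=7)   # warm: merges only b7
```

On the second call the per-friend lists are filtered down to tweets newer than the cached timestamp, so the merge touches only the new tweet plus the cached feed.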
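Push wastes more disk storage
This is in comparison with the Pull model, whose cache lives in memory, although disk is cheap
In terms of freshness, Push actually does worse than Pull
So, to deal with inactive users, the followers can be ranked and the fanout done in that order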
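When a user's follower count far exceeds the following count, add a few machines for the push tasks
If adding servers does not solve it: for sustained fast growth, evaluate the situation and switch from the push model to the pull model
Trade-off: use pull for celebrity users and push for ordinary users (as in WeChat Moments);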
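The push/pull trade-off can be sketched as a hybrid: ordinary authors are pushed into followers' lists at write time, while celebrity authors are pulled and merged at read time. Everything here is an assumption made for illustration, including the cutoff `CELEBRITY_THRESHOLD`, the in-memory dictionaries, and the `(timestamp, author, text)` tweet shape.

```python
import heapq
from collections import defaultdict
from itertools import islice

CELEBRITY_THRESHOLD = 3            # hypothetical cutoff; a real one would be far larger

followers = defaultdict(set)       # author -> readers
followings = defaultdict(set)      # reader -> authors
timelines = defaultdict(list)      # author -> own tweets, newest first
pushed_feeds = defaultdict(list)   # reader -> pushed tweets, newest first

def follow(reader, author):
    followers[author].add(reader)
    followings[reader].add(author)

def is_celebrity(user):
    return len(followers[user]) >= CELEBRITY_THRESHOLD

def post_tweet(author, timestamp, text):
    """Always write to the author's timeline; fan out only for ordinary users."""
    tweet = (timestamp, author, text)
    timelines[author].insert(0, tweet)       # assumes increasing timestamps
    if not is_celebrity(author):
        for reader in followers[author]:
            pushed_feeds[reader].insert(0, tweet)

def get_news_feed(reader, k=100):
    """Merge the pushed list with timelines pulled from followed celebrities."""
    pulled = [timelines[a][:k] for a in followings[reader] if is_celebrity(a)]
    merged = heapq.merge(pushed_feeds[reader], *pulled,
                         key=lambda t: t[0], reverse=True)
    return list(islice(merged, k))

for reader in ("r1", "r2", "r3"):
    follow(reader, "star")         # star reaches the celebrity threshold
follow("r1", "friend")
post_tweet("friend", 1, "hello")   # pushed to r1
post_tweet("star", 2, "big news")  # not pushed; pulled at read time
```

One caveat of this simple version: an author who crosses the threshold after some tweets were already pushed would appear twice in a feed, so a real system would need to deduplicate by tweet id.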
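After Follow: asynchronously merge their Timeline into your News Feed
After Unfollow: asynchronously remove their Tweets from your News Feed
Advantage of async: the user gets feedback quickly and sees the action as successful, without waiting for the asynchronous operation to actually finish
Disadvantage of async: if you refresh right after unfollowing, their Tweets may still be there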
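The sketch below illustrates both the benefit and the drawback of doing Follow/Unfollow asynchronously. The job queue, the in-memory dictionaries, and the `(timestamp, author, text)` tweet shape are assumptions made for this example; `run_jobs` plays the role of the background worker.

```python
import heapq
from collections import deque

timelines = {"bob": [(3, "bob", "c"), (1, "bob", "a")]}  # author -> tweets, newest first
news_feeds = {"me": [(2, "amy", "b")]}                   # reader -> feed, newest first
jobs = deque()                                           # stand-in for an async task queue

def follow(me, other):
    jobs.append(("merge", me, other))
    return "success"        # returned before the feed is actually updated

def unfollow(me, other):
    jobs.append(("remove", me, other))
    return "success"

def run_jobs():
    """Background worker: apply the queued feed updates."""
    while jobs:
        action, me, other = jobs.popleft()
        feed = news_feeds.get(me, [])
        if action == "merge":
            feed = list(heapq.merge(feed, timelines.get(other, []),
                                    key=lambda t: t[0], reverse=True))
        else:
            feed = [t for t in feed if t[1] != other]
        news_feeds[me] = feed

follow("me", "bob")
run_jobs()
after_follow = list(news_feeds["me"])
unfollow("me", "bob")
stale = list(news_feeds["me"])      # refresh before the worker runs: bob is still there
run_jobs()
after_unfollow = list(news_feeds["me"])
```

The `stale` snapshot is the case described in the notes: the unfollow call has already reported success, yet the feed still shows the unfollowed user's tweets until the worker catches up.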
Normalize: data is split across two tables and combined with a Join, which takes more time
So, De-normalize instead
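When a single piece of data receives a flood of requests within a short time:
load balancer, sharding, and consistent hashing are not very effective;
adding a cache solves this, because repeated reads of the same item are served from memory;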
Follow Up 1:
Like, Retweet, and Comment all change a tweet's basic information. How should it be updated?
Write through; Write back; Look aside
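The three strategies named above differ mainly in what a write does. The classes below are a minimal sketch using plain dictionaries for both the cache and the database; the class and method names are hypothetical, and the subclasses inherit from `LookAsideCache` only to reuse its `read` method.

```python
class LookAsideCache:
    """Reads fill the cache on a miss; writes go to the DB and invalidate the entry."""
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        if key not in self.cache:          # miss: load from the DB
            self.cache[key] = self.db[key]
        return self.cache[key]

    def write(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)          # the next read reloads the fresh value

class WriteThroughCache(LookAsideCache):
    """Writes update the DB and the cache together."""
    def write(self, key, value):
        self.db[key] = value
        self.cache[key] = value

class WriteBackCache(LookAsideCache):
    """Writes touch only the cache; dirty entries reach the DB on flush()."""
    def __init__(self, db):
        super().__init__(db)
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()

db = {"tweet:1:likes": 10}
likes = WriteBackCache(db)
likes.write("tweet:1:likes", 11)
likes.write("tweet:1:likes", 12)
db_before_flush = db["tweet:1:likes"]   # the DB has not seen the writes yet
likes.flush()
db_after_flush = db["tweet:1:likes"]
```

For counters such as likes, write back batches many updates into one DB write, at the cost of losing unflushed updates if the cache node fails.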
Follow Up 2:
What if the cache fails, for example when memory runs out or a bad caching decision causes the tweet
Answer: http://www.cs.utah.edu/~stuts...
While building, maintaining, and evolving our system we have learned the following lessons. (1) Separating cache and persistent storage systems allows us to independently scale them. (2) Features that improve monitoring, debugging and operational efficiency are as important as performance. (3) Managing stateful components is operationally more complex than stateless ones. As a result keeping logic in a stateless client helps iterate on features and minimize disruption. (4) The system must support gradual rollout and rollback of new features even if it leads to temporary heterogeneity of feature sets. (5) Simplicity is vital.
Please refer to next article.