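Designing large-scale distributed systems is a hard technical problem across the industry. This article reads Google's key papers on distributed systems along two dimensions, the data plane and the control plane, to help build a foundation for distributed system design.
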
Topic and paper | Title or subtitle | Year | Main authors |
---|---|---|---|
Search engine: Search Engine | The Anatomy of a Large-Scale Hypertextual Web Search Engine | 1998 | Sergey Brin, Lawrence Page |
Data mining: Mining Causal Structures | Scalable Techniques for Mining Causal Structures | 1998 | Craig Silverstein, Sergey Brin, Rajeev Motwani, et al. |
Search engine: Extracting Patterns | Extracting Patterns and Relations from the World Wide Web | 1998 | Sergey Brin |
Search engine: Web Search for a Planet | The Google Cluster Architecture | 2003 | Luiz André Barroso, Jeffrey Dean |
Distributed lock service: Chubby | The Chubby lock service for loosely-coupled distributed systems | 2006 | Mike Burrows |
Datacenter architecture: The Datacenter as a Computer | An Introduction to the Design of Warehouse-Scale Machines | 2009 | Luiz André Barroso, Urs Hölzle |
Datacenter profiling: Google-Wide Profiling | A Continuous Profiling Infrastructure for Data Centers | 2010 | Gang Ren, Eric Tune, Tipp Moseley, et al. |
Distributed tracing: Dapper | A Large-Scale Distributed Systems Tracing Infrastructure | 2010 | Benjamin H. Sigelman, Luiz André Barroso, Mike Burrows, et al. |
Multi-tenant elastic resource scaling: CloudScale | Elastic Resource Scaling for Multi-Tenant Cloud Systems | 2011 | Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu |
Network design: B4 | Experience with a Globally-Deployed Software Defined WAN | 2013 | Sushant Jain, Alok Kumar, Subhasree Mandal, et al. |
Low-latency design: The Tail at Scale | Software techniques that tolerate latency variability are vital to building responsive large-scale Web services | 2013 | Jeffrey Dean, Luiz André Barroso |
Cluster scheduling: Omega | Flexible, scalable schedulers for large compute clusters | 2013 | Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, et al. |
Performance isolation: CPI² | CPU performance isolation for shared compute clusters | 2013 | Xiao Zhang, Eric Tune, Robert Hagmann |
Large-scale cluster management: Borg | Large-scale cluster management at Google with Borg | 2015 | Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, et al. |
Auto-sharding: Slicer | Auto-Sharding for Datacenter Applications | 2016 | Atul Adya, Daniel Myers, Jon Howell, et al. |
Container scheduling: K8s | Borg, Omega, and Kubernetes | 2016 | Brendan Burns, Brian Grant, David Oppenheimer, et al. |
Graph partition management: Graph partitioning | Distributed Balanced Partitioning via Linear Embedding | 2016 | Kevin Aydin, MohammadHossein Bateni, Vahab Mirrokni |
Efficient, data-placement-aware cluster scheduling: Firmament | Fast, Centralized Cluster Scheduling at Scale | 2016 | Ionel Gog, Malte Schwarzkopf, Adam Gleave, et al. |
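
Starting from its search engine, Google built large-scale distributed systems on both the data plane and the control plane. The data plane grew out of three classic papers, GFS, MapReduce (MR), and BigTable, and kept evolving from there, while the control plane matured alongside it.

Building a large-scale distributed system is much like building a traditional ICT system: both the data plane and the control plane have to be designed well at the architecture level. Beyond concentrating on the design and optimization of the data path, the control plane also needs well-designed techniques for cluster control, lock management, log tracing, profiling, resource isolation, and hot-spot balancing. The difference is that the demands of large scale require the architecture to be redesigned.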
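
As a rough illustration of that split, the Python sketch below keeps the request path (data plane) separate from a control plane that does simple hot-spot balancing. Everything in it is hypothetical: the `DataPlane` and `ControlPlane` classes, the shard and server names, and the greedy rule of moving one shard per pass are assumptions made for this example and are not taken from any of the papers listed above.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataPlane:
    """Request path: looks up the routing table and counts traffic per shard."""
    routing: Dict[str, str]                          # shard -> server (hypothetical names)
    load: Counter = field(default_factory=Counter)   # requests seen per shard

    def handle(self, shard: str) -> str:
        self.load[shard] += 1
        return self.routing[shard]


class ControlPlane:
    """Off the request path: reads load statistics and rewrites the routing table."""

    def rebalance(self, plane: DataPlane) -> None:
        # Aggregate per-shard load into per-server load.
        per_server = {server: 0 for server in plane.routing.values()}
        if not per_server:
            return
        for shard, server in plane.routing.items():
            per_server[server] += plane.load[shard]

        hottest = max(per_server, key=per_server.get)
        coolest = min(per_server, key=per_server.get)
        gap = per_server[hottest] - per_server[coolest]

        # Only shards lighter than the gap can move without making the
        # destination at least as hot as the source was.
        movable = [shard for shard, server in plane.routing.items()
                   if server == hottest and 0 < plane.load[shard] < gap]
        if movable:
            busiest = max(movable, key=lambda shard: plane.load[shard])
            plane.routing[busiest] = coolest


plane = DataPlane(routing={"a": "s1", "b": "s1", "c": "s2"})
for shard in "aaaaabbbbc":      # 5 requests for a, 4 for b, 1 for c
    plane.handle(shard)
ControlPlane().rebalance(plane)
print(plane.routing)            # {'a': 's2', 'b': 's1', 'c': 's2'}
```

In this toy setup the data plane never changes routing on its own, so the balancing policy can be replaced without touching the request path. A real system would also have to move the shard's data and coordinate the switch, which the sketch leaves out.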