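[Interactive Q&A Sharing] Session 15: "Winning in the Era of Cloud Computing and Big Data", Spark Asia-Pacific Research Institute Public Lecture Series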

Date: 2019-11-08


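"Winning in the Era of Cloud Computing and Big Data"

Spark Asia-Pacific Research Institute 100-Session Public Lecture Series [Session 15 Interactive Q&A Sharing]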

Q1: What is the relationship between AppClient, the Workers, and the Master?

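AppClient is the application's representative on the client machine when SparkContext.runJob is invoked in Standalone mode; it is responsible for steps such as registerApplication.
Once the application has registered, the Master sends a message to the client over Akka to start the Driver.
The Driver then manages the Tasks and controls the Executors on the Workers so that they work together.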
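
The flow described above (register with the Master, then let the Driver coordinate Executors on the Workers) can be pictured with a toy model. This is only a sketch: the classes `Master`, `Worker`, `AppClient` and the function `run_driver` are hypothetical stand-ins written for illustration, not Spark's actual classes, and the round-robin task placement is an assumption rather than Spark's scheduling policy.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Toy Worker (hypothetical): hosts one executor per application."""
    name: str
    executors: dict = field(default_factory=dict)

    def launch_executor(self, app_id):
        # An "executor" here is just the list of tasks assigned to it.
        self.executors[app_id] = []
        return self.executors[app_id]


@dataclass
class Master:
    """Toy Master (hypothetical): records registered applications and known workers."""
    workers: list = field(default_factory=list)
    apps: dict = field(default_factory=dict)

    def register_application(self, app_name):
        # Assign a sequential id and return it as the acknowledgement.
        app_id = f"app-{len(self.apps):04d}"
        self.apps[app_id] = app_name
        return app_id


class AppClient:
    """Toy client-side representative of the application (hypothetical)."""

    def __init__(self, master, app_name):
        self.master = master
        self.app_name = app_name
        self.app_id = None

    def start(self):
        # Register the application with the Master and remember the id.
        self.app_id = self.master.register_application(self.app_name)
        return self.app_id


def run_driver(master, app_id, tasks):
    """Toy Driver: spread tasks round-robin over one executor per worker."""
    executors = [w.launch_executor(app_id) for w in master.workers]
    for i, task in enumerate(tasks):
        executors[i % len(executors)].append(task)
    return {w.name: w.executors[app_id] for w in master.workers}


master = Master(workers=[Worker("worker-1"), Worker("worker-2")])
client = AppClient(master, "demo-app")
app_id = client.start()
placement = run_driver(master, app_id, ["t0", "t1", "t2"])
print(app_id, placement)
```

The point of the sketch is the division of labor: the client only talks to the Master to register, while task-level coordination happens between the Driver and the Executors.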

Q2: Is there a big difference between Spark's shuffle and Hadoop's shuffle?

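Spark's Shuffle is a shuffle in a fairly strict sense: within the lineage of RDD dependencies, a Shuffle occurs when the contents of each partition of a parent RDD are handed to multiple partitions of the child RDD.
In Hadoop, Shuffle is a relatively loose concept: once the Mapper phase has finished and its data is handed to the Reducers, a Shuffle takes place, and Shuffle is the first of the Reducer's three phases.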
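
To make the "one parent partition feeds multiple child partitions" definition concrete, here is a minimal sketch in plain Python that models partitions as lists. It does not use Spark; `map_partitions`, `shuffle_by_key`, and the modulo partitioner are hypothetical helpers chosen for illustration.

```python
from collections import defaultdict


def map_partitions(parent, fn):
    """Narrow dependency: child partition i is computed only from parent partition i."""
    return [[fn(record) for record in part] for part in parent]


def shuffle_by_key(parent, num_child_partitions):
    """Wide (shuffle) dependency: every parent partition may feed every child partition."""
    children = [defaultdict(list) for _ in range(num_child_partitions)]
    for part in parent:
        for key, value in part:
            # Deterministic partitioner assumed for this sketch: integer key modulo N.
            children[key % num_child_partitions][key].append(value)
    return [dict(child) for child in children]


# Two parent partitions of (key, value) records.
parent = [[(1, "a"), (2, "b")], [(1, "c"), (4, "d")]]

narrow = map_partitions(parent, lambda kv: (kv[0], kv[1].upper()))
wide = shuffle_by_key(parent, 2)

print(narrow)
print(wide)
```

In the narrow case the partition boundaries are untouched. In the shuffled case the first parent partition contributes key 2 to child partition 0 and key 1 to child partition 1, which is exactly the one-to-many relationship that defines a Shuffle in the answer above.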

Q3: How does Spark handle HA?

In Standalone mode, Worker nodes are highly available automatically; for Master HA, ZooKeeper is typically used.
Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected "leader" and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master's state, and then resume scheduling. The entire recovery process (from the time the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling new applications; applications that were already running during Master failover are unaffected.
In YARN and Mesos modes, the resource manager typically also relies on ZooKeeper for HA.
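
As a rough illustration of the Standalone setup, the sketch below builds the JVM options that switch a Master to ZooKeeper-based recovery, plus a Master URL that lists several Masters. The property names `spark.deploy.recoveryMode`, `spark.deploy.zookeeper.url`, and `spark.deploy.zookeeper.dir` come from Spark's standalone documentation; the helper functions and all hostnames are hypothetical.

```python
def master_ha_java_opts(zk_url, zk_dir="/spark"):
    """Build the -D options that enable ZooKeeper recovery for a Standalone Master."""
    opts = {
        "spark.deploy.recoveryMode": "ZOOKEEPER",
        "spark.deploy.zookeeper.url": zk_url,   # comma-separated ZooKeeper host:port list
        "spark.deploy.zookeeper.dir": zk_dir,   # ZooKeeper path used for recovery state
    }
    return " ".join(f"-D{key}={value}" for key, value in opts.items())


def master_url(hosts, port=7077):
    """List every Master so that clients can locate whichever one is the current leader."""
    return "spark://" + ",".join(f"{host}:{port}" for host in hosts)


# Hypothetical hostnames, for illustration only.
java_opts = master_ha_java_opts("zk1:2181,zk2:2181")
url = master_url(["master1", "master2"])

print(java_opts)
print(url)
```

The options string would typically be assigned to `SPARK_DAEMON_JAVA_OPTS` in `conf/spark-env.sh` on each Master host, and the multi-Master URL passed to applications and Workers in place of a single `spark://host:port` address.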
