Background of the Spark History Server
Take standalone mode as an example: while a Spark Application is running, Spark provides a web UI that lists the application's runtime information. That web UI, however, shuts down when the Application finishes (whether it succeeds or fails), so once an Application has completed, its history can no longer be viewed.
The Spark History Server exists to address exactly this. With the right configuration, event-log information is recorded while the Application runs, and after the Application finishes the web UI can re-render that log into a UI page showing the Application's runtime information.
The same applies when Spark runs on YARN or Mesos: the history server can still reconstruct the runtime information of a completed Application, provided its event log was recorded.
Configuring the Spark History Server
1. In Spark's conf directory (/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf), rename spark-defaults.conf.template to spark-defaults.conf:
mv spark-defaults.conf.template spark-defaults.conf
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6$ ls
bin data examples licenses NOTICE README.md work
CHANGES.txt derby.log lib logs python RELEASE
conf ec2 LICENSE metastore_db R sbin
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6$ cd conf/
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$ ls
docker.properties.template metrics.properties.template spark-env.sh
fairscheduler.xml.template slaves
log4j.properties.template spark-defaults.conf.template
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$ mv spark-defaults.conf.template spark-defaults.conf
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$ ls
docker.properties.template metrics.properties.template spark-env.sh
fairscheduler.xml.template slaves
log4j.properties.template spark-defaults.conf
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$
2. Configure spark-defaults.conf:
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$ vim spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://SparkSingleNode:9000/historyserverforSpark
spark.history.ui.port 18080
spark.history.fs.logDirectory hdfs://SparkSingleNode:9000/historyserverforSpark
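Here spark.eventLog.enabled turns event logging on, spark.eventLog.dir is the HDFS directory that running applications write their event logs to, spark.history.ui.port is the port the history server's UI listens on, and spark.history.fs.logDirectory is where the history server reads logs from (the same directory in this setup). One step the transcript does not show: the HDFS directory generally has to exist before the first application writes to it. A minimal sketch, assuming HDFS is already up on SparkSingleNode:9000:

hadoop fs -mkdir -p /historyserverforSpark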
3. Start the history server:
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf$ cd ..
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6$ sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-SparkSingleNode.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
full log in /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-SparkSingleNode.out
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6$ jps
6816 Jps
3876 Worker
6772 HistoryServer
3174 NameNode
5990 CoarseGrainedExecutorBackend
3703 Master
3453 SecondaryNameNode
3293 DataNode
5887 SparkSubmit
spark@SparkSingleNode:/usr/local/spark/spark-1.5.2-bin-hadoop2.6$
Note that although the startup script printed "failed to launch", the jps output above shows the HistoryServer process (6772) did come up; if it does not appear, check the full log file the script points to.
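For completeness (not shown in the original transcript), the matching stop script ships in the same sbin directory and can be used to shut the server down cleanly:

sbin/stop-history-server.sh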
4. spark-env.sh
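The original leaves this step without content. As an alternative to setting the history-server properties in spark-defaults.conf as above, they can be passed to the server via SPARK_HISTORY_OPTS in spark-env.sh; this is a sketch of that alternative, not a setting shown in the original:

export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.fs.logDirectory=hdfs://SparkSingleNode:9000/historyserverforSpark"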
5. Open http://SparkSingleNode:18080/ in a web browser. The page shows:
1.5.2 History Server
Event log directory: hdfs://SparkSingleNode:9000/historyserverforSpark
Success!
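To see a completed application actually appear on that page, run any job with event logging enabled and then refresh. A hedged example using the SparkPi example bundled with this distribution (the jar name is assumed from the standard 1.5.2 binary layout, and 7077 is the default standalone master port):

bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://SparkSingleNode:7077 lib/spark-examples-1.5.2-hadoop2.6.0.jar 10

When the job finishes, it shows up in the history server's application list, and its web UI can be reopened from there.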