Build the Docker image
docker build --rm -t sequenceiq/spark:1.4.0 .
The -t option sets the tag of the sequenceiq/spark image you are building, just like ubuntu:13.10. The --rm option tells Docker to delete the intermediate containers once the build completes; each instruction in the Dockerfile creates a temporary container, and you usually have no use for these intermediate containers.
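To double-check that the tag from -t was applied, you can list the matching images after the build finishes (just a sanity check, not part of the original steps):

# list images in the repository named by -t
docker images sequenceiq/spark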
Run the image
docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.4.0 bash
or
docker run -d -h sandbox sequenceiq/spark:1.4.0 -d
If you use -p or -P, the container exposes ports to the host, so anyone who can reach the host can reach the inside of the container. With -P, Docker picks a random unused port between 49153 and 65535 on the host and binds it to the container; you can use docker port to find out which port was bound.
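A minimal sketch of the -P behaviour (the container name spark-test is an illustrative choice, and this assumes port 8088 is EXPOSEd by the image, as the -p flags above suggest):

# publish every EXPOSEd port to a random high port on the host
docker run -d -P --name spark-test -h sandbox sequenceiq/spark:1.4.0 -d
# ask Docker which host port container port 8088 was bound to
docker port spark-test 8088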
If you append -d=true or -d to docker run, the container runs in detached (background) mode. All I/O then has to happen over the network or through shared volumes, because the container no longer listens on the terminal where you executed docker run. You can re-attach to the container's session by running docker attach. Note that a container running in detached mode cannot use the --rm option.
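A short sketch of working with a detached container (the name spark-bg is hypothetical; note that the trailing -d in the run command appears to be consumed by the image's /etc/bootstrap.sh entrypoint to keep the services running, not by docker run itself):

# start the container in the background
docker run -d --name spark-bg -h sandbox sequenceiq/spark:1.4.0 -d
# inspect its output without attaching to it
docker logs spark-bg
# re-attach to the container's session
docker attach spark-bg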
-p 8088:8088 exposes the ResourceManager (cluster) web UI port, and -p 8042:8042 exposes the NodeManager port.
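Assuming you started the container with the -p mappings from the first run command, you can verify both web UIs from the host, for example:

# ResourceManager web UI
curl -s http://localhost:8088/cluster | head
# NodeManager web UI
curl -s http://localhost:8042/node | head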
Versions
Hadoop 2.6.0 and Apache Spark v1.4.0 on CentOS
Testing
There are two deploy modes that can be used to launch Spark applications on YARN.
In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.
Estimating Pi (yarn-cluster mode):
# execute the following command, which should write "Pi is roughly 3.1418" into the logs
# note: you must specify the --files argument in cluster mode to enable metrics
spark-submit \
--class org.apache.spark.examples.SparkPi \
--files $SPARK_HOME/conf/metrics.properties \
--master yarn-cluster \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.4.0-hadoop2.6.0.jar
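Since the driver output lands in the YARN logs in cluster mode, one way to find the result is through the yarn CLI inside the container (a sketch that assumes log aggregation is enabled in the sandbox; <application id> is a placeholder):

# list finished applications and note the application id
yarn application -list -appStates FINISHED
# dump the aggregated logs and search for the result
yarn logs -applicationId <application id> | grep "Pi is roughly"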
Estimating Pi (yarn-client mode): in yarn-client mode the driver runs in the client process, so the result is printed directly to your terminal rather than into the YARN logs.
# execute the following command, which should print "Pi is roughly 3.1418" to the screen
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-client \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.4.0-hadoop2.6.0.jar