Build the Docker image
docker build --rm -t sequenceiq/spark:1.4.0 .
The -t option sets the tag of the sequenceiq/spark image you are building, just like ubuntu:13.10. The --rm option tells Docker to delete the intermediate containers once the build completes; each instruction in the Dockerfile creates a temporary container, and you usually have no use for these intermediate containers.
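To double-check that the tag from -t was applied, you can list the matching images after the build finishes (just a sanity check, not part of the original steps):

# list images in the repository named by -t
docker images sequenceiq/spark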
Run the image
docker run -it -p 8088:8088 -p 8042:8042 -h sandbox sequenceiq/spark:1.4.0 bash
or
docker run -d -h sandbox sequenceiq/spark:1.4.0 -d
If you use -p or -P, the container exposes ports to the host, so anyone who can reach the host can reach the inside of the container. With -P, Docker picks a random unused port between 49153 and 65535 on the host and binds it to the container; you can use docker port to find out which port was bound.
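A minimal sketch of the -P behaviour (the container name spark-test is an illustrative choice, and this assumes port 8088 is EXPOSEd by the image, as the -p flags above suggest):

# publish every EXPOSEd port to a random high port on the host
docker run -d -P --name spark-test -h sandbox sequenceiq/spark:1.4.0 -d
# ask Docker which host port container port 8088 was bound to
docker port spark-test 8088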
If you append -d=true or -d to docker run, the container runs in detached (background) mode. All I/O then has to happen over the network or through shared volumes, because the container no longer listens on the terminal where you executed docker run. You can re-attach to the container's session by running docker attach. Note that a container running in detached mode cannot use the --rm option.
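A short sketch of working with a detached container (the name spark-bg is hypothetical; note that the trailing -d in the run command appears to be consumed by the image's /etc/bootstrap.sh entrypoint to keep the services running, not by docker run itself):

# start the container in the background
docker run -d --name spark-bg -h sandbox sequenceiq/spark:1.4.0 -d
# inspect its output without attaching to it
docker logs spark-bg
# re-attach to the container's session
docker attach spark-bg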
-p 8088:8088 exposes the ResourceManager (cluster) web UI port, and -p 8042:8042 exposes the NodeManager port.
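Assuming you started the container with the -p mappings from the first run command, you can verify both web UIs from the host, for example:

# ResourceManager web UI
curl -s http://localhost:8088/cluster | head
# NodeManager web UI
curl -s http://localhost:8042/node | head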
Versions
Hadoop 2.6.0 and Apache Spark v1.4.0 on CentOS
Testing
There are two deploy modes that can be used to launch Spark applications on YARN.
In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.
Estimating Pi (yarn-cluster mode):
# execute the following command, which should write "Pi is roughly 3.1418" into the logs
# note: you must specify the --files argument in cluster mode to enable metrics
spark-submit \
--class org.apache.spark.examples.SparkPi \
--files $SPARK_HOME/conf/metrics.properties \
--master yarn-cluster \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.4.0-hadoop2.6.0.jar
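Since the driver output lands in the YARN logs in cluster mode, one way to find the result is through the yarn CLI inside the container (a sketch that assumes log aggregation is enabled in the sandbox; <application id> is a placeholder):

# list finished applications and note the application id
yarn application -list -appStates FINISHED
# dump the aggregated logs and search for the result
yarn logs -applicationId <application id> | grep "Pi is roughly"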
Estimating Pi (yarn-client mode): in yarn-client mode the driver runs in the client process, so the result is printed directly to your terminal rather than into the YARN logs.
# execute the following command, which should print "Pi is roughly 3.1418" to the screen
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-client \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/lib/spark-examples-1.4.0-hadoop2.6.0.jar