A very good Spark analysis blog, from our team, haha: http://jerryshao.me/
Spark programming guide:
https://github.com/mesos/spark/wiki/Spark-Programming-Guide
-------------------------------------------------------------
Scala installation:
$ wget http://www.scala-lang.org/files/archive/scala-2.9.3.tgz
$ tar xvfz scala-2.9.3.tgz
Add to ~/.bashrc:
export SCALA_HOME=/usr/scala/scala-2.9.3
export PATH=$PATH:$SCALA_HOME/bin
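To pick up the new variables and confirm that Scala is on the PATH (a quick sanity check, not part of the original notes):
$ source ~/.bashrc
$ scala -version   # should report Scala code runner version 2.9.3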
-------------------------------------------------
Build:
SPARK_HADOOP_VERSION=1.2.1 sbt/sbt assembly
Hadoop must be installed first; SPARK_HADOOP_VERSION should match the Hadoop version running on the cluster.
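As a quick check that the build succeeded, the assembly jar should exist; the path below assumes the 0.8-era sbt layout, so adjust it to your checkout:
$ ls assembly/target/scala-2.9.3/spark-assembly-*.jar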
Master:
192.168.56.103
Slaves:
192.168.56.102
192.168.56.103
conf/spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/
export SCALA_HOME=/usr/local/src/scala-2.9.3/
export SPARK_MASTER_IP=192.168.56.103
export SPARK_MASTER_WEBUI_PORT=8080
export SPARK_WORKER_WEBUI_PORT=8081
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=512m
conf/slaves
# A Spark Worker will be started on each of the machines listed below.
192.168.56.102
192.168.56.103
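One minimal way to keep these files in sync, assuming passwordless SSH and that Spark is installed at the same path on every node (/path/to/spark below is illustrative):
$ scp conf/spark-env.sh conf/slaves 192.168.56.102:/path/to/spark/conf/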
The master and the slaves keep identical copies of these two files; afterwards, on the master, run:
bin/start-all.sh
Then check whether the cluster started successfully.
jps on the master:
8787 Worker
3017 NameNode
9366 Jps
3728 TaskTracker
8454 Master
2830 DataNode
2827 SecondaryNameNode
3484 JobTracker
jps on a slave:
6649 Worker
2592 DataNode
2997 TaskTracker
7105 Jps
Web UI (served by the master; shows the status of each worker):
http://localhost:8080/
Running the examples:
On the master:
./run-example org.apache.spark.examples.SparkPi spark://192.168.56.103:7077
./run-example org.apache.spark.examples.SparkLR spark://192.168.56.103:7077
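Beyond the bundled examples, an interactive smoke test; in this 0.8-era layout spark-shell reads the master URL from the MASTER environment variable (worth verifying against your version):
$ MASTER=spark://192.168.56.103:7077 ./spark-shell
scala> sc.parallelize(1 to 1000).count()   // should return 1000 if the workers are reachable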
Deploying Spark on Mesos
...
----------------------------------------------
Decentralized scheduler (Sparrow):
http://www.binospace.com/index.php/sparrow-sosp13-an-accelerated-short-job-scheduling-method/