Hadoop must be installed before Spark; the author has already set up three virtual machines in VMware.
1. Prerequisites
1) A Hadoop cluster: the author has already set up three virtual machines in VMware with a Hadoop cluster installed on them.
2. Required Software
1) Scala: 2.11.0
2) Spark: 2.2.0
3. Installing Scala (required on all three machines in the cluster; one machine is used as the example below)
1) Download scala-2.11.0 and extract it to /usr/local/; the author's downloaded archive is in /home/hadoop/tools
hadoop@Worker1:~$ sudo mv /home/hadoop/tools/scala-2.11.0.tgz /usr/local/
hadoop@Worker1:/usr/local$ sudo tar -zxvf scala-2.11.0.tgz
2) Edit the ~/.bashrc file and add SCALA_HOME
hadoop@Worker1:/usr/local$ vim ~/.bashrc
export SCALA_HOME=/usr/local/scala-2.11.0
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export PATH=${SCALA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:${HADOOP_HOME}/bin
3) Apply the changes
hadoop@Worker1:/usr/local$ source ~/.bashrc
4) Verify that Scala was installed successfully
hadoop@Worker1:/usr/local$ scala -version
If it prints: Scala code runner version 2.11.0 -- Copyright 2002-2013, LAMP/EPFL, the installation succeeded.
5) Repeat the steps above on the other machines (or copy the installation over, as sketched below).
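A minimal sketch of copying the installation instead of re-extracting it on every node, assuming passwordless SSH between the nodes (typical for a Hadoop cluster) and the same paths on every machine; Worker2 is used as the example target:
# Copy the extracted Scala directory to the target node's staging area, then move it into place
scp -r /usr/local/scala-2.11.0 hadoop@Worker2:/home/hadoop/tools/
ssh -t hadoop@Worker2 'sudo mv /home/hadoop/tools/scala-2.11.0 /usr/local/'
# The ~/.bashrc edit and `source ~/.bashrc` still need to be repeated on each machine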
4. Installing the Spark Cluster: Spark supports three deployment modes, Mesos, YARN, and Standalone; the author configures Standalone mode here.
1) Download spark-2.2.0-bin-hadoop2.7.tgz and extract it
hadoop@Master:/usr/local$ sudo mv /home/hadoop/tools/spark-2.2.0-bin-hadoop2.7.tgz /usr/local/
hadoop@Master:/usr/local$ sudo tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz
2) Grant ownership
hadoop@Master:/usr/local$ sudo chown -R hadoop:root ./spark-2.2.0-bin-hadoop2.7
3) Configure the Spark environment in ~/.bashrc
hadoop@Master:/usr/local$ vim ~/.bashrc
export SCALA_HOME=/usr/local/scala-2.11.0
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export PATH=${SCALA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:${HADOOP_HOME}/bin
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH
4) Apply the configuration
hadoop@Master:/usr/local$ source ~/.bashrc
5) Go into Spark's conf directory and configure spark-env.sh
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ cp spark-env.sh.template spark-env.sh
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ vim spark-env.sh
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export SCALA_HOME=/usr/local/scala-2.11.0
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop
export SPARK_MASTER_IP=Master
export SPARK_WORKER_CORES=2
export SPARK_DRIVER_MEMORY=1G
export SPARK_WORKER_MEMORY=1g
export SPARK_EXECUTOR_MEMORY=1g
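Note: Spark 2.x deprecates SPARK_MASTER_IP in favor of SPARK_MASTER_HOST; the startup scripts still honor the old name but print a deprecation warning, so the newer name is likely the cleaner choice:
export SPARK_MASTER_HOST=Master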
6) Configure slaves (one worker hostname per line)
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ cp slaves.template slaves
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ vim slaves
Worker1
Worker2
7) Configure spark-defaults.conf
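The original does not show how this file is created; presumably it is copied from the bundled template, mirroring the previous two files:
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ cp spark-defaults.conf.template spark-defaults.conf
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ vim spark-defaults.conf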
spark.executor.extraJavaOptions   -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs://Master:9000/historyserverforSpark
spark.yarn.historyServer.address  Master:18080
spark.history.fs.logDirectory     hdfs://Master:9000/historyserverforSpark
The historyserverforSpark directory must be created manually; otherwise spark-shell will report an error on startup.
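A minimal sketch of creating that directory on HDFS, assuming HDFS is already running with the NameNode at Master:9000 as configured above:
hadoop@Master:/usr/local$ hadoop fs -mkdir -p /historyserverforSpark
Also note that start-all.sh starts the workers over SSH, so the same spark-2.2.0-bin-hadoop2.7 directory (including conf/) must exist at the same path on Worker1 and Worker2; copying it with scp, as sketched for Scala above, is one way to get it there.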
8) Start the cluster
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/sbin$ ./start-all.sh
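A quick sanity check, assuming the hostnames used above: jps should list a Master process on Master and a Worker process on Worker1 and Worker2 (alongside whatever Hadoop daemons are running), the Master web UI defaults to http://Master:8080, and spark-shell can be pointed at the cluster explicitly:
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/sbin$ jps
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/sbin$ spark-shell --master spark://Master:7077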