Part 1: Installing Scala
The Scala website offers many versions of Scala; you need to download and install the version that your Spark release officially requires.
The version I downloaded is scala-2.11.8.tgz, from http://www.scala-lang.org/download/
1. Extract it under /mysoftware:
hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/scala-2.11.8.tgz
2. Configure the environment variables by adding the following to /etc/profile:
export SCALA_HOME=/mysoftware/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
3. Make the changes take effect: source /etc/profile
hadoop@master:~$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions for evaluation. Or try :help.

scala> 9*9
res0: Int = 81

scala>
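Besides simple arithmetic, the REPL is also a handy place to confirm that the variable exported in step 2 is visible. A quick sanity check (the res number depends on what you have already typed):

scala> sys.env.get("SCALA_HOME")
res1: Option[String] = Some(/mysoftware/scala-2.11.8)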
Part 2: Installing Spark
From the official site, download the Spark package built for your Hadoop version.
For Spark, the version I downloaded is spark-1.6.1.tgz.
Official download page: https://spark.apache.org/downloads.html
1. Extract it into the /mysoftware directory:
hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/spark-1.6.1.tgz
2. Add the following to /etc/profile:
export SPARK_HOME=/mysoftware/spark-1.6.1
export PATH=$SPARK_HOME/bin:$PATH
3. Configure conf/spark-env.sh:
hadoop@master:/mysoftware/spark-1.6.1/conf$ cp spark-env.sh.template spark-env.sh
hadoop@master:/mysoftware/spark-1.6.1/conf$ sudo gedit spark-env.sh
Append the following to the end of spark-env.sh:
export SCALA_HOME=/mysoftware/scala-2.11.8
export JAVA_HOME=/mysoftware/jdk1.7.0_80
export SPARK_MASTER_IP=192.168.226.129
export SPARK_WORKER_MEMORY=512m
export MASTER=spark://192.168.226.129:7077
The SPARK_WORKER_MEMORY parameter sets the maximum memory available on each worker node. Increasing it lets Spark cache more data in memory, but be sure to leave enough memory for each slave's operating system and other services.
SPARK_MASTER_IP and MASTER must be set; otherwise the slaves will fail to register with the master. (The same choices can also be made per application, as in the sketch below.)
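For reference, the same two decisions (which master to contact, and how much memory an executor may claim) can also be made per application through SparkConf. A minimal sketch, assuming the addresses above; the app name is hypothetical:

import org.apache.spark.{SparkConf, SparkContext}

// Per-application settings mirroring the spark-env.sh exports above.
val conf = new SparkConf()
  .setAppName("ConfigSketch")                  // hypothetical app name
  .setMaster("spark://192.168.226.129:7077")   // same URL as the MASTER export
  .set("spark.executor.memory", "512m")        // must fit within SPARK_WORKER_MEMORY
val sc = new SparkContext(conf)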
4. Starting Spark.
Start Spark from the Spark root directory:
hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh
However, after starting Spark, the following problem appeared:
hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
failed to launch org.apache.spark.deploy.master.Master:
Failed to find Spark assembly in /mysoftware/spark-1.6.1/assembly/target/scala-2.10.
You need to build Spark before running this program.
full log in /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
master: failed to launch org.apache.spark.deploy.worker.Worker:
master: Failed to find Spark assembly in /mysoftware/spark-1.6.1/assembly/target/scala-2.10.
master: You need to build Spark before running this program.
master: full log in /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
Strange: the Scala version I installed is clearly 2.11.8, yet the error says it cannot find scala-2.10.
Diagnosis:
The key sentence from the source I found:
You have to download one of pre-built version in "Choose a package type" section from the Spark download page.
哦,一開始覺得像以前下載包同樣,看到包就下載下來了,卻忽視了這句話:
進入官網下載對應 Hadoop 版本的 Spark 程序包,(原來是沒有下載對應的包。。。。)
My Hadoop version is hadoop-2.6.4, so I re-downloaded the matching package: spark-1.6.1-bin-hadoop2.6.tgz
Reinstalling Spark:
hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/spark-1.6.1-bin-hadoop2.6.tgz
After repeating the earlier setup, restart Spark:
hadoop@master:/mysoftware/hadoop-2.6.4$ cd ../spark-1.6.1/
hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
hadoop@master:/mysoftware/spark-1.6.1$ jps
2975 NameNode
4055 Worker
3964 Master
3611 NodeManager
3282 SecondaryNameNode
3482 ResourceManager
2769 MainGenericRunner
4104 Jps
3108 DataNode
hadoop@master:/mysoftware/spark-1.6.1$
Checking the processes with jps, note the two new entries: Master and Worker.
Then start spark-shell (you can pass --master spark://192.168.226.129:7077 to point it explicitly at the standalone cluster started above):
hadoop@master:/mysoftware/spark-1.6.1$ spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/06/02 04:09:09 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:10 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:21 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/06/02 04:09:21 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/06/02 04:09:28 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:28 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:34 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/06/02 04:09:35 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala>
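With the shell up, sc is already bound to a SparkContext, so a small computation verifies the installation end to end. Note the banner's "Using Scala version 2.10.5": the pre-built package bundles its own Scala, independent of the system-wide 2.11.8, and you can query it from the shell. A minimal check (res numbering depends on what you have already typed):

scala> util.Properties.versionString
res0: String = version 2.10.5

scala> sc.parallelize(1 to 100).sum
res1: Double = 5050.0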
Finally, you can open the following addresses in a browser to take a look:
http://192.168.226.129:4040/ (the application web UI, available while spark-shell is running)
http://192.168.226.129:8080/ (the standalone master web UI)