SparkSQL + Hive: http://lxw1234.com/archives/2015/06/294.htm
1. Install Scala
http://scala-lang.org/download/2.11.8.html
scala-2.11.8.tgz
Place it under the /usr/bigdata directory:
tar -zxvf scala-2.11.8.tgz
vi /etc/profile
export SCALA_HOME=/usr/bigdata/scala-2.11.8
export PATH=$PATH:$SCALA_HOME/bin
source /etc/profile
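To confirm the Scala setup took effect, a quick check (a minimal sketch; the exact banner text varies by build):
# should report version 2.11.8
scala -version
# Spark also needs a JDK on the PATH
java -version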
2. Install Spark
Version: spark-1.6.2-bin-hadoop2.6.tgz
Place it under /usr/bigdata; it extracts to /usr/bigdata/spark-1.6.2-bin-hadoop2.6:
tar -zxvf spark-1.6.2-bin-hadoop2.6.tgz
vi /etc/profile
export SPARK_HOME=/usr/bigdata/spark-1.6.2-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
source /etc/profile
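A quick sanity check that the new PATH entry works (sketch):
# prints the Spark 1.6.2 version banner and exits
spark-submit --version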
3. Configure Spark
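In a fresh unpack, conf/ typically contains only .template files; copy them before editing (a minimal sketch):
cd /usr/bigdata/spark-1.6.2-bin-hadoop2.6/conf
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves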
vi /usr/bigdata/spark-1.6.2-bin-hadoop2.6/conf/spark-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export SCALA_HOME=/usr/bigdata/scala-2.11.8
export HADOOP_HOME=/usr/bigdata/hadoop-2.6.2
#HADOOP_OPTS=-Djava.library.path=/usr/bigdata/hadoop-2.6.2/lib/native
export HADOOP_CONF_DIR=/usr/bigdata/hadoop-2.6.2/etc/hadoop
export SPARK_CLASSPATH=$SPARK_CLASSPATH:$SPARK_HOME/lib/mysql-connector-java-5.1.38.jar
vi /usr/bigdata/spark-1.6.2-bin-hadoop2.6/conf/slaves
vm-10-112-29-172
vm-10-112-29-174
Give every node the identical setup:
scp -r /usr/bigdata/scala-2.11.8 root@vm-10-112-29-172:/usr/bigdata/
scp -r /usr/bigdata/spark-1.6.2-bin-hadoop2.6 root@vm-10-112-29-172:/usr/bigdata/
scp /usr/bigdata/spark-1.6.2-bin-hadoop2.6/conf/slaves root@vm-10-112-29-172:/usr/bigdata/spark-1.6.2-bin-hadoop2.6/conf
Then start the cluster from the master:
./sbin/start-all.sh
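With more than a couple of workers, the scp steps above are easier as a loop (a sketch, assuming passwordless ssh for root to each node):
for h in vm-10-112-29-172 vm-10-112-29-174; do
  scp -r /usr/bigdata/scala-2.11.8 root@$h:/usr/bigdata/
  scp -r /usr/bigdata/spark-1.6.2-bin-hadoop2.6 root@$h:/usr/bigdata/
done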
4. Check that the setup succeeded
Run the jps command on each node:
"Master" should appear on the master node and "Worker" on each slave node.
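For example, jps output might look like this (PIDs are illustrative, and any Hadoop daemons on the host will also be listed):
# on the master
20481 Master
# on a slave
11672 Worker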
5. Run a test job
cd /usr/bigdata/spark-1.6.2-bin-hadoop2.6
./bin/run-example SparkPi
It should print a line like:
Pi is roughly 3.14506
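SparkPi also takes an optional partition count; more partitions sample more points and tighten the estimate (sketch):
./bin/run-example SparkPi 100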
6. Run one of Spark's bundled examples:
./bin/run-example org.apache.spark.examples.sql.JavaSparkSQL
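run-example is just a convenience wrapper around spark-submit; the same example can be launched directly (a sketch — the examples jar name below is assumed from the 1.6.2 binary layout and may need adjusting):
./bin/spark-submit --class org.apache.spark.examples.sql.JavaSparkSQL \
  --master yarn-client \
  lib/spark-examples-1.6.2-hadoop2.6.0.jar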
7. Trying out SparkSQL:
http://my.oschina.net/scipio/blog/284957
Start the JDBC (Thrift) server:
./start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.port=10009
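To confirm the server is listening on the chosen port (sketch, Linux):
netstat -anp | grep 10009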
Start the spark-sql client:
./bin/spark-sql --master yarn-client --jars /usr/bigdata/spark-1.6.2-bin-hadoop2.6/lib/mysql-connector-java-5.1.17.jar
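Once the client is up, a quick smoke test; spark-sql also accepts -e for a one-off statement, like the Hive CLI (sketch):
./bin/spark-sql --master yarn-client -e "show databases;"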
-------------------------Commands-----------------------------------
1: ./spark-sql --master yarn-client
2: ./spark-sql --master yarn-client --total-executor-cores 20 --driver-memory 1g --executor-memory 6g --executor-cores 6 --num-executors 100 --conf spark.default.parallelism=1000 --conf spark.storage.memoryFraction=0.5 --conf spark.shuffle.memoryFraction=0.3
3: ./spark-sql --master yarn-client --total-executor-cores 20 --driver-memory 1g --executor-memory 6g --executor-cores 6 --num-executors 200 --conf spark.default.parallelism=1200 --conf spark.storage.memoryFraction=0.4 --conf spark.shuffle.memoryFraction=0.4
4: ./spark-sql --master yarn-client --total-executor-cores 20 --driver-memory 1g --executor-memory 6g --executor-cores 6 --num-executors 200 --conf spark.default.parallelism=1200 --conf spark.storage.memoryFraction=0.4 --conf spark.shuffle.memoryFraction=0.4 --conf spark.sql.shuffle.partitions=300
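Whichever variant is used, running SET <key>; inside spark-sql echoes the effective value, which is handy for confirming that a --conf flag took hold (sketch):
./bin/spark-sql --master yarn-client -e "set spark.sql.shuffle.partitions;"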
./start-thriftserver.sh --hiveconf hive.server2.thrift.port=10009
!connect jdbc:hive2://node6:10009
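The !connect line is entered at the beeline prompt; Spark ships beeline under bin/, so a minimal session might look like this (no authentication assumed, prompts illustrative):
./bin/beeline
beeline> !connect jdbc:hive2://node6:10009
0: jdbc:hive2://node6:10009> show databases;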