[Spark] Consuming Kafka in real time with Spark Streaming via the Receiver approach (yarn-cluster)

1. Start ZooKeeper
2. Start the Kafka service (broker)
[root@master kafka_2.11-0.10.2.1]# ./bin/kafka-server-start.sh config/server.properties
3. Start the Kafka producer (prerequisite: the topic has already been created)
[root@master kafka_2.11-0.10.2.1]# ./bin/kafka-console-producer.sh --broker-list master:9092 --topic test
4. Start the Kafka console consumer (to verify that messages are flowing)
[root@master kafka_2.11-0.10.2.1]# ./bin/kafka-console-consumer.sh --zookeeper master:2181 --topic test --from-beginning
5. Build the jar and upload the jar with bundled dependencies to the cluster
mvn clean assembly:assembly
6. Write the launch script and start the job: sh run_receiver.sh
/usr/local/src/spark-2.0.2-bin-hadoop2.6/bin/spark-submit \
        --class com.skyell.streaming.ReceiverFromKafka \
        --master yarn \
        --deploy-mode cluster \
        --executor-memory 1G \
        --executor-cores 2 \
        --files $HIVE_HOME/conf/hive-site.xml \
        ./Spark8Pro-2.0-SNAPSHOT-jar-with-dependencies.jar
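The post never shows the job class that the script submits. A minimal sketch of a Receiver-based consumer is given below, matching the class name `com.skyell.streaming.ReceiverFromKafka` from the submit script; the consumer group name, the `test` topic, the ZooKeeper address `master:2181`, and the 5-second batch interval are assumptions. The Receiver API comes from the `spark-streaming-kafka-0-8` artifact via `KafkaUtils.createStream`, which stores consumed offsets in ZooKeeper:

```scala
package com.skyell.streaming

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Sketch of a Receiver-based Kafka consumer (not the author's actual code).
object ReceiverFromKafka {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReceiverFromKafka")
    // Batch interval of 5 seconds is an assumption.
    val ssc = new StreamingContext(conf, Seconds(5))

    val zkQuorum = "master:2181"            // ZooKeeper, as started in step 1
    val groupId  = "spark_receiver_group"   // hypothetical consumer group name
    val topics   = Map("test" -> 1)         // topic -> number of receiver threads

    // createStream launches a long-running Receiver on an executor;
    // each record arrives as a (key, message) pair.
    val stream = KafkaUtils.createStream(ssc, zkQuorum, groupId, topics)
    stream.map(_._2).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because the Receiver buffers data in executor memory before processing, enabling the write-ahead log (`spark.streaming.receiver.writeAheadLog.enable=true` together with checkpointing) is the usual way to avoid data loss on executor failure.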
Monitor the job and view its logs:

http://master:8088/cluster

Stop the Spark Streaming job:
yarn application -kill application_1539421032843_0093
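Note that `yarn application -kill` first sends SIGTERM to the driver, and only escalates to SIGKILL after a grace period. A common companion setting (an assumption here, not shown in the post) is to enable graceful shutdown when building the SparkConf, so in-flight batches finish before the context stops:

```scala
// Assumption: set before creating the StreamingContext so that a
// `yarn application -kill` (SIGTERM) drains in-flight batches
// instead of dropping them mid-batch.
val conf = new SparkConf()
  .setAppName("ReceiverFromKafka")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")
```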

數據驅動變革-雲將 (my blog)
