Flink Cluster Deployment and Startup: Flink on Yarn

Date: 2020-09-22

Deploying the Flink cluster

Flink can be deployed in three modes: Local, Standalone Cluster, and Yarn Cluster. This article focuses on how to configure Yarn Cluster.

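Before configuring Flink on Yarn, make sure that HDFS and YARN are both running (see "Hadoop Cluster Deployment and Startup"). In Yarn mode you also need to take container memory allocation into account.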
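One way to confirm this prerequisite is to look at the daemon names that jps prints on each node. The sketch below assumes the usual Hadoop 2.x daemon names (NameNode, ResourceManager, DataNode, NodeManager); missing_daemons is a hypothetical helper written for this article, not part of Hadoop or Flink.

```shell
# Print each expected daemon that does not appear in `jps` output.
# Reads `jps` output ("<pid> <name>" per line) on stdin; the expected
# daemon names are passed as arguments. Hypothetical helper.
missing_daemons() {
  jps_output=$(cat)
  for daemon in "$@"; do
    # Compare against the whole second field so that
    # "SecondaryNameNode" does not count as "NameNode".
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      printf '%s\n' "$daemon"
    fi
  done
}
```

For example, jps | missing_daemons NameNode ResourceManager could be run on the master and jps | missing_daemons DataNode NodeManager on the workers; no output means every listed daemon was found. Which daemons run on which host depends on how the Hadoop cluster was laid out.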

Installed version: flink-1.7.1-bin-hadoop28-scala_2.11.tgz

mkdir /usr/local/flink
tar zxvf flink-1.7.1-bin-hadoop28-scala_2.11.tgz -C /usr/local/flink

Map the hostnames to their IP addresses (the hosts file on hadoop2 and hadoop3 needs the same change)

vi /etc/hosts
10.2.15.176 hadoop1
10.2.15.177 hadoop2
10.2.15.170 hadoop3
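Since the same entries have to exist on all three machines, a quick check on each node can catch a forgotten edit. This is a minimal sketch that only inspects a hosts-format file; hosts_missing is a hypothetical helper name and does not test actual name resolution.

```shell
# Print every hostname from the arguments that has no entry in the
# given hosts-format file. Hypothetical helper for illustration.
hosts_missing() {
  hosts_file=$1
  shift
  for name in "$@"; do
    if ! awk -v n="$name" '
      { sub(/#.*/, "") }   # drop trailing comments
      { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }' "$hosts_file"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running hosts_missing /etc/hosts hadoop1 hadoop2 hadoop3 on a node prints nothing when all three names are present.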

Configure the environment variables (/etc/profile on hadoop2 and hadoop3 needs the same change)

vi /etc/profile
export FLINK_HOME=/usr/local/flink/flink-1.7.1
export PATH=$PATH:$FLINK_HOME/bin
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.8.3
export PATH=$HADOOP_HOME/bin:$PATH
source /etc/profile

Edit the flink-conf.yaml file

vi /usr/local/flink/flink-1.7.1/conf/flink-conf.yaml
#==============================================================================
# Common
#==============================================================================
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
env.java.home: /usr/local/jdk/jdk1.8.0_251
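# Note: *.heap.mb are the legacy counterparts of the *.heap.size keys above, so the two pairs overlap
jobmanager.heap.mb: 6192m
taskmanager.heap.mb: 8192m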
#==============================================================================
# High Availability
#==============================================================================
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: 10.2.15.181:2181,10.2.15.174:2181,10.2.15.172:2181
high-availability.zookeeper.path.root: /flink_on_yarn
high-availability.zookeeper.path.namespace: /cluster_yarn
#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
#==============================================================================
# Web Frontend
#==============================================================================
rest.port: 8081
#==============================================================================
# Advanced
#==============================================================================
taskmanager.memory.preallocate: false
taskmanager.network.numberOfBuffers: 64000
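# Note: /etc/profile above sets HADOOP_HOME to hadoop-2.8.3; this path should point to the same installation
fs.hdfs.hadoopconf: /usr/local/hadoop/hadoop-2.8.5/etc/hadoop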
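With this many settings in one file it is easy to lose track of which value is actually written for a key. The sketch below reads a single key back from the file. It assumes the file consists of flat "key: value" lines as in the listing above; conf_get is a hypothetical helper, not a Flink tool and not a general YAML parser.

```shell
# Print the value of one key from a flat "key: value" file such as
# flink-conf.yaml. If the key appears more than once, the last
# occurrence is printed. Hypothetical helper; not a YAML parser.
conf_get() {
  awk -v k="$2" '
    { sub(/#.*/, "") }                     # strip comments
    {
      i = index($0, ": ")                  # split on the first ": "
      if (i == 0) next
      key = substr($0, 1, i - 1)
      val = substr($0, i + 2)
      gsub(/^[ \t]+|[ \t]+$/, "", key)     # trim whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == k) result = val
    }
    END { if (result != "") print result }' "$1"
}
```

For instance, conf_get /usr/local/flink/flink-1.7.1/conf/flink-conf.yaml jobmanager.heap.size prints the configured JobManager heap size, and nothing is printed for a key that is not set.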

Edit the masters and slaves files

vi conf/masters
hadoop1:8081
hadoop2:8081
vi conf/slaves
hadoop2
hadoop3

Submitting a job

First, start the ZooKeeper quorum

./bin/start-zookeeper-quorum.sh

Next, start a per-job cluster. Running ./bin/flink run -m yarn-cluster -d -c mainClass /path/to/user/jar starts a cluster in detached mode, that is, one cluster per job.

./bin/flink run -m yarn-cluster  ./examples/batch/WordCount.jar -input hdfs://hadoop1:9000/input/input_hadoop_demo_test.txt  -output hdfs://hadoop1:9000/output/wordcount-result1.txt
or
./bin/flink run -m yarn-cluster -yn 2 -yjm 800 -ytm 800  ./examples/batch/WordCount.jar -input hdfs://hadoop1:9000/input/input_hadoop_demo_test.txt  -output hdfs://hadoop1:9000/output/wordcount-result2.txt
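The second command asks for 800 MB per JobManager and TaskManager container, which ties back to the container memory allocation mentioned earlier. As a rough sketch, and assuming the YARN scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb (my understanding of the Capacity Scheduler's behaviour; other schedulers or settings may differ), the size actually granted can be estimated as follows. yarn_container_mb is a hypothetical helper name.

```shell
# Round a requested container size (in MB) up to the next multiple of
# the scheduler's minimum allocation (in MB). Hypothetical helper;
# the rounding rule itself is an assumption about the YARN scheduler.
yarn_container_mb() {
  requested=$1
  min_alloc=$2
  echo $(( (requested + min_alloc - 1) / min_alloc * min_alloc ))
}
```

Under that assumption, yarn_container_mb 800 1024 prints 1024, so with a 1024 MB minimum allocation each of the requested containers would occupy 1024 MB rather than 800 MB.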

Open http://hadoop1:8088 (the YARN ResourceManager web UI) in a browser to view details of the submitted job.