This post covers setting up a Hadoop pseudo-cluster and running Flink on YARN.
The pieces to set up are Hadoop MapReduce (YARN) and HDFS.
Download the Hadoop 2.7.7 binary package and unpack it locally; this post assumes it lives at /usr/hadoop/hadoop-2.7.7. Then add the following environment variables to your shell profile (e.g. ~/.bashrc):
#HADOOP VARIABLES START
export HADOOP_INSTALL=/usr/hadoop/hadoop-2.7.7
export HADOOP_HOME=$HADOOP_INSTALL
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
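Assuming the variables went into ~/.bashrc, reload the profile and check that the hadoop binary is picked up:

$ source ~/.bashrc
$ hadoop version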
Run:

$ ssh localhost

If the login prompts for a password or fails (a message like "Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts." by itself is harmless), configure passwordless SSH:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa   # skip this step if a key pair already exists
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
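With the key in place, logging in should no longer prompt for a password:

$ ssh localhost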
$ cd /usr/hadoop/hadoop-2.7.7
$ vim etc/hadoop/core-site.xml

Set the contents of core-site.xml to:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.7/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
$ vim etc/hadoop/hdfs-site.xml

Set the contents of hdfs-site.xml to:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.7/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop/hadoop-2.7.7/tmp/dfs/data</value>
    </property>
</configuration>
Note: the official guide configures only fs.defaultFS and dfs.replication, which is enough to start the cluster, but if hadoop.tmp.dir is not set the default temporary directory is /tmp/hadoop-<username>, which the system may clean out on reboot and force you to rerun the format step.
$ vim etc/hadoop/hadoop-env.sh

JAVA_HOME must be declared explicitly here, even if it is already set in your environment; otherwise startup fails with "Error: JAVA_HOME is not set and could not be found."
## change this to your JDK home directory
export JAVA_HOME=/opt/jdk/jdk1.8
$ bin/hdfs namenode -format
$ sbin/start-dfs.sh
Once it starts successfully, the HDFS web UI is available at http://localhost:50070/. Running jps should show three processes: DataNode, NameNode, and SecondaryNameNode. If NameNode is missing, check whether the port is already in use, change the port in the fs.defaultFS setting in core-site.xml, and retry.
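As a quick smoke test that HDFS is usable (the directory name is just an example), create a directory and list it:

$ bin/hdfs dfs -mkdir -p /user/$(whoami)
$ bin/hdfs dfs -ls /user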
mapred-site.xml does not exist in the binary distribution; create it from the bundled template, then edit it:

$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
$ vim etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
$ vim etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start YARN:
$ sbin/start-yarn.sh
After it starts, the ResourceManager web UI is available at http://localhost:8088/, and jps should now also show the ResourceManager and NodeManager processes.
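You can also confirm from the command line that the NodeManager has registered:

$ bin/yarn node -list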
At this point the Hadoop pseudo-cluster setup is complete.
For Flink, download the build that matches your Hadoop version, otherwise you will hit errors; here we use Apache Flink 1.7.2 with Hadoop® 2.7 for Scala 2.11 and unpack it to flink-1.7.2.
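The Flink YARN client finds the cluster through the Hadoop configuration, so if HADOOP_CONF_DIR is not already exported, set it first (the path assumes the install location used above):

$ export HADOOP_CONF_DIR=/usr/hadoop/hadoop-2.7.7/etc/hadoop

Then submit the job: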
$ flink-1.7.2/bin/flink run -m yarn-cluster -yn 2 ../my-flink-project-0.1.jar
Here yarn-cluster means Flink brings up a cluster on YARN for the job, -yn 2 requests two TaskManager containers, and my-flink-project-0.1.jar is your own Flink program.
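If you don't have a job jar yet, one way to get a runnable skeleton is Flink's quickstart Maven archetype (a sketch; the com.example groupId and the project name here are just illustrative):

$ mvn archetype:generate \
    -DarchetypeGroupId=org.apache.flink \
    -DarchetypeArtifactId=flink-quickstart-java \
    -DarchetypeVersion=1.7.2 \
    -DgroupId=com.example \
    -DartifactId=my-flink-project \
    -Dversion=0.1 \
    -Dpackage=com.example \
    -DinteractiveMode=false
$ cd my-flink-project
$ mvn clean package   # builds target/my-flink-project-0.1.jar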
After submission you can watch the YARN application run through the ResourceManager UI at http://localhost:8088/.