1. Environment
Software download link: https://pan.baidu.com/s/1hpcXUSJe85EsU9ara48MsQ
Servers: CentOS 6.8, with 2 namenodes and 3 datanodes
ZooKeeper cluster address: 192.168.67.11:2181,192.168.67.12:2181
JDK: jdk-8u191-linux-x64.tar.gz
Hadoop: hadoop-3.1.1.tar.gz
Node layout:
| Node | IP | namenode | datanode | resourcemanager | journalnode |
| --- | --- | --- | --- | --- | --- |
| namenode1 | 192.168.67.101 | √ | | √ | √ |
| namenode2 | 192.168.67.102 | √ | | √ | √ |
| datanode1 | 192.168.67.103 | | √ | | √ |
| datanode2 | 192.168.67.104 | | √ | | √ |
| datanode3 | 192.168.67.105 | | √ | | √ |
2. Configure passwordless SSH
2.1 Run ssh-keygen -t rsa on every machine.
2.2 Collect the public key (~/.ssh/id_rsa.pub) from every machine into a single authorized_keys file and distribute that file to every machine.
2.3 Set permissions: chmod 600 ~/.ssh/authorized_keys
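Steps 2.1 to 2.3 can also be done with ssh-copy-id, which appends the key and sets the permissions in one step (a minimal sketch, assuming the root password is available for each host; run it on every machine, using IPs since /etc/hosts is not configured until section 3):

ssh-keygen -t rsa
# append this machine's public key to authorized_keys on every node
for h in 192.168.67.101 192.168.67.102 192.168.67.103 192.168.67.104 192.168.67.105; do
    ssh-copy-id root@$h
done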
3. Configure /etc/hosts
vim /etc/hosts
# add the following entries
192.168.67.101 namenode1
192.168.67.102 namenode2
192.168.67.103 datanode1
192.168.67.104 datanode2
192.168.67.105 datanode3

# distribute the hosts file to the other machines
scp /etc/hosts namenode2:/etc/hosts
scp /etc/hosts datanode1:/etc/hosts
scp /etc/hosts datanode2:/etc/hosts
scp /etc/hosts datanode3:/etc/hosts
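A quick check that every hostname now resolves and answers (a minimal sketch):

for h in namenode1 namenode2 datanode1 datanode2 datanode3; do
    ping -c 1 $h > /dev/null && echo "$h ok" || echo "$h FAILED"
done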
4. Disable the firewall
service iptables stop
chkconfig iptables off
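To confirm the firewall is stopped now and stays off after a reboot:

service iptables status      # should report that the firewall is not running
chkconfig --list iptables    # every runlevel should show "off"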
5. Install the JDK
tar -zxvf /usr/local/soft/jdk-8u191-linux-x64.tar.gz -C /usr/local/
vim /etc/profile
# add the JDK environment variables
export JAVA_HOME=/usr/local/jdk1.8.0_191
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

Apply the changes: source /etc/profile
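Verify the installation:

java -version   # should report version 1.8.0_191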
6. Install Hadoop
tar -zxvf /usr/local/soft/hadoop-3.1.1.tar.gz -C /usr/local/
vim /etc/profile
# add the Hadoop environment variables
export HADOOP_HOME=/usr/local/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib

Apply the changes: source /etc/profile
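Verify the installation:

hadoop version   # should report Hadoop 3.1.1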
# edit start-dfs.sh and stop-dfs.sh
vim /usr/local/hadoop-3.1.1/sbin/start-dfs.sh
vim /usr/local/hadoop-3.1.1/sbin/stop-dfs.sh
# add the startup users to both files
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root

# edit start-yarn.sh and stop-yarn.sh
vim /usr/local/hadoop-3.1.1/sbin/start-yarn.sh
vim /usr/local/hadoop-3.1.1/sbin/stop-yarn.sh
# add the startup users to both files
YARN_RESOURCEMANAGER_USER=root
HDFS_DATANODE_SECURE_USER=root
YARN_NODEMANAGER_USER=root
vim /usr/local/hadoop-3.1.1/etc/hadoop/hadoop-env.sh
# add
export JAVA_HOME=/usr/local/jdk1.8.0_191
export HADOOP_HOME=/usr/local/hadoop-3.1.1

# edit the workers file
vim /usr/local/hadoop-3.1.1/etc/hadoop/workers
# replace its contents with
datanode1
datanode2
datanode3
vim /usr/local/hadoop-3.1.1/etc/hadoop/core-site.xml
# replace with the following configuration

<configuration>
  <!-- use the HDFS nameservice mycluster as the default filesystem -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster/</value>
  </property>
  <!-- Hadoop temp directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop-3.1.1/hdfs/temp</value>
  </property>
  <!-- ZooKeeper quorum, matching the cluster address from section 1 -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>192.168.67.11:2181,192.168.67.12:2181</value>
  </property>
</configuration>
vim /usr/local/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
# replace with the following configuration

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop-3.1.1/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop-3.1.1/hdfs/data</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>namenode1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>namenode2:50070</value>
  </property>
  <!-- enable automatic HA failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- journalnode quorum for the shared edit log -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://namenode1:8485;namenode2:8485;datanode1:8485;datanode2:8485;datanode3:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- On failover the standby must "fence" the old active NameNode. The sshfence
       method would ssh in and use fuser to kill the NameNode process; here
       shell(/bin/true) always reports success, relying on the journalnode quorum
       to prevent split-brain writes. -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
  <!-- SSH private key -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- SSH timeout -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <!-- journalnode local storage -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/hadoop-3.1.1/hdfs/journaldata</value>
  </property>
  <property>
    <name>dfs.qjournal.write-txns.timeout.ms</name>
    <value>60000</value>
  </property>
</configuration>
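Optionally, pre-create the local directories referenced in core-site.xml and hdfs-site.xml on every node (the daemons create most of them on demand, so this is only a convenience):

mkdir -p /usr/local/hadoop-3.1.1/hdfs/{name,data,temp,journaldata}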
vim /usr/local/hadoop-3.1.1/etc/hadoop/mapred-site.xml
# replace with the following configuration

<configuration>
  <!-- run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

vim /usr/local/hadoop-3.1.1/etc/hadoop/yarn-site.xml
# replace with the following configuration

<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- RM cluster id -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- logical RM ids -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- hostnames of the two RMs -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>namenode1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>namenode2</value>
  </property>
  <!-- ZooKeeper cluster address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>192.168.67.11:2181,192.168.67.12:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
# distribute the modified files to the other 4 servers (a copy-loop sketch follows the list)
/usr/local/hadoop-3.1.1/sbin/start-dfs.sh
/usr/local/hadoop-3.1.1/sbin/stop-dfs.sh
/usr/local/hadoop-3.1.1/sbin/start-yarn.sh
/usr/local/hadoop-3.1.1/sbin/stop-yarn.sh
/usr/local/hadoop-3.1.1/etc/hadoop/hadoop-env.sh
/usr/local/hadoop-3.1.1/etc/hadoop/workers
/usr/local/hadoop-3.1.1/etc/hadoop/core-site.xml
/usr/local/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
/usr/local/hadoop-3.1.1/etc/hadoop/mapred-site.xml
/usr/local/hadoop-3.1.1/etc/hadoop/yarn-site.xml
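A minimal sketch of that distribution, assuming all the edits were made on namenode1:

for h in namenode2 datanode1 datanode2 datanode3; do
    # copy the patched start/stop scripts
    scp /usr/local/hadoop-3.1.1/sbin/{start-dfs.sh,stop-dfs.sh,start-yarn.sh,stop-yarn.sh} \
        $h:/usr/local/hadoop-3.1.1/sbin/
    # copy the configuration files
    scp /usr/local/hadoop-3.1.1/etc/hadoop/{hadoop-env.sh,workers,core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml} \
        $h:/usr/local/hadoop-3.1.1/etc/hadoop/
done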
First-time startup order:
1. Make sure the configured ZooKeeper servers are running.
2. Start a journalnode on every journalnode machine: hadoop-daemon.sh start journalnode
3. On namenode1, format the failover state in ZooKeeper: hdfs zkfc -formatZK
4. On namenode1, format the primary namenode: hdfs namenode -format
5. On namenode1, start the primary namenode: hadoop-daemon.sh start namenode
6. On namenode2, sync the standby namenode from the formatted primary: hdfs namenode -bootstrapStandby
7. Start the whole cluster: start-all.sh
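Once start-all.sh finishes, the daemons on each node should match the table in section 1 (a hedged check using the standard Hadoop daemon names):

jps
# namenode1/namenode2: NameNode, DFSZKFailoverController, ResourceManager, JournalNode
# datanode1..3: DataNode, NodeManager, JournalNode

# one namenode should report active, the other standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# same for the two resourcemanagers
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2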
7. Verification
7.1 Web UIs:
http://192.168.67.101:50070/
http://192.168.67.102:50070/
http://192.168.67.101:8088/
http://192.168.67.102:8088/
7.2 Shut down the server hosting the active namenode and watch the other namenode's state change from standby to active.
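For example (a sketch, assuming nn1 on namenode1 is currently active):

# check which namenode is active
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# on namenode1, stop the active namenode
hadoop-daemon.sh stop namenode

# on namenode2, the state should flip from standby to active within seconds
hdfs haadmin -getServiceState nn2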