1. Basic Environment Setup
Create three virtual machines running CentOS 6.5, named (the names are up to you) masternode, slavenode1, and slavenode2. For that process, refer to the previous post, "CentOS 6.5 Installation and Configuration in Detail".
2. Hadoop Cluster Setup (for steps that are identical on all three nodes, only the master node's screenshots are shown; where the nodes differ, all nodes are shown)
2.1 System Time Synchronization
Use the date command to check the current system time. Then set the time zone:
[root@masternode ~]# cd /usr/share/zoneinfo/
[root@masternode zoneinfo]# ls //locate Asia
[root@masternode zoneinfo]# cd Asia/ //enter the Asia directory
[root@masternode Asia]# ls //locate Shanghai
[root@masternode Asia]# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime //set the current time zone to Shanghai
We can then synchronize the system date and time with NTP (Network Time Protocol).
[root@masternode Asia]# yum install ntp //install ntp if the ntpdate command is missing
[root@masternode Asia]# ntpdate pool.ntp.org //synchronize the date and time
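The clocks will drift apart again over time; one way to keep them aligned (a minimal sketch, assuming ntpdate is installed on every node) is an hourly cron entry on each node:
[root@masternode ~]# crontab -e
0 * * * * /usr/sbin/ntpdate pool.ntp.org >/dev/null 2>&1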
Create a hadoop group and a hadoop user on masternode, slavenode1, and slavenode2, dedicated to operating and managing the Hadoop cluster. The commands are as follows:
[root@masternode ~]# groupadd hadoop
[root@masternode ~]# useradd -g hadoop hadoop
The group and user are now created.
Then set a password for the user. Note: nothing is echoed while you type, but the input is being registered; backspace does not work here either.
[root@masternode hadoop]# passwd hadoop
2.2 Directory Planning
First, assign IP addresses and roles to the three machines:
192.168.86.135-----master,namenode,jobtracker
192.168.86.136-----slave1,datanode,tasktracker
192.168.86.137-----slave2,datanode,tasktracker
Add static IP-to-hostname mappings to the hosts file on every node.
[root@masternode ~]# vi /etc/hosts
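Based on the assignments above, the entries to append are:
192.168.86.135 masternode
192.168.86.136 slavenode1
192.168.86.137 slavenode2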
Then plan the directories for master, slave1, and slave2 as follows.
Name                                              Path
Installation directory for all cluster software   /home/hadoop/app/
Temporary directories                             /tmp
Note that the system's default temporary directory is /tmp, which is wiped on every reboot; if cluster data lived there you would have to re-run the format step after each reboot or errors would occur.
2.3 Disable the Firewall
The firewall must be disabled on all nodes. Check its status:
[root@masternode ~]# service iptables status
iptables: Firewall is not running.
If it is not in the stopped state shown above, disable it:
[root@masternode ~]# chkconfig iptables off //disable the firewall permanently (on boot)
[root@masternode ~]# service iptables stop //stop the firewall immediately
2.4 Passwordless SSH Configuration
[hadoop@masternode ~]$ su root //switch to the root user
Password:
[root@masternode hadoop]# su hadoop //switch back to the hadoop user
[hadoop@masternode ~]$ mkdir .ssh
mkdir: cannot create directory `.ssh': File exists //mine already exists; this is harmless, continue
[hadoop@masternode ~]$ ssh-keygen -t rsa //press Enter at every prompt to generate the key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
4b:a8:30:35:0e:cc:82:3f:1b:78:81:9c:e2:ee:ca:7d hadoop@masternode
The key's randomart image is:
+--[ RSA 2048]----+ //the generated key's randomart image
| |
|o+. |
|=o= o |
|o+ = . . |
|..B . . S |
|.. * . . . |
| .. . . |
|o . E |
|oo .. |
+-----------------+
[hadoop@masternode ~]$ cd .ssh
[hadoop@masternode .ssh]$ ls
id_rsa id_rsa.pub
[hadoop@masternode .ssh]$ cat id_rsa.pub >> authorized_keys //append the public key to the authorized_keys file
[hadoop@masternode .ssh]$ ls
authorized_keys id_rsa id_rsa.pub
[hadoop@masternode .ssh]$ cd ..
[hadoop@masternode ~]$ chmod 700 .ssh
[hadoop@masternode ~]$ chmod 600 .ssh/*
[hadoop@masternode ~]$ ssh masternode
The authenticity of host 'masternode (192.168.86.135)' can't be established.
RSA key fingerprint is 45:13:ab:81:3a:53:44:2b:59:8f:06:fb:56:2f:b6:d8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'masternode,192.168.86.135' (RSA) to the list of known hosts.
Last login: Tue Apr 17 14:16:46 2018 from 192.168.86.1
[hadoop@masternode ~]$ ssh masternode
Last login: Tue Apr 17 15:45:44 2018 from masternode
Every node in the cluster must perform the operations above; then append each node's public key id_rsa.pub to the authorized_keys file on masternode.
[hadoop@masternode ~]$ cat ~/.ssh/id_rsa.pub | ssh hadoop@masternode 'cat >> ~/.ssh/authorized_keys'
//run this command on every node
Then distribute masternode's authorized_keys file to all nodes.
[hadoop@masternode ~]$ cd .ssh
[hadoop@masternode .ssh]$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
[hadoop@masternode .ssh]$ scp -r authorized_keys hadoop@slavenode1:~/.ssh/
hadoop@slavenode1's password:
authorized_keys 100% 1596 1.6KB/s 00:00
[hadoop@masternode .ssh]$ scp -r authorized_keys hadoop@slavenode2:~/.ssh/
hadoop@slavenode2's password:
authorized_keys 100% 1596 1.6KB/s 00:00
With that, passwordless SSH across the cluster is configured.
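As a quick check, every node should now be reachable from masternode without a password prompt; the loop below (a minimal sketch) should print the three hostnames without asking for a password:
[hadoop@masternode ~]$ for h in masternode slavenode1 slavenode2; do ssh hadoop@$h hostname; done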
2.5 Script Utilities
Create a /home/hadoop/tools directory on masternode.
[hadoop@masternode ~]$ mkdir /home/hadoop/tools
[hadoop@masternode ~]$ cd /home/hadoop/tools
Upload the local script files into /home/hadoop/tools. If you can read these scripts, feel free to write your own; if not, just use them as they are and catch up on the Linux knowledge later.
First create the script files, then fill in the contents below:
[hadoop@masternode tools]$ touch deploy.conf
[hadoop@masternode tools]$ vi deploy.conf
masternode,all,namenode,zookeeper,resourcemanager,
slavenode1,all,slave,namenode,zookeeper,resourcemanager,
slavenode2,all,slave,datanode,zookeeper,
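Each line of deploy.conf is "hostname,tag1,tag2,…"; the scripts below pick out the hosts whose tag list contains a given tag. For example, to see which hosts carry the zookeeper tag:
[hadoop@masternode tools]$ grep -v '^#' deploy.conf | grep ',zookeeper,' | awk -F',' '{print $1}'
masternode
slavenode1
slavenode2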
[hadoop@masternode tools]$ touch deploy.sh
[hadoop@masternode tools]$ vi deploy.sh
#!/bin/bash
#set -x
if [ $# -lt 3 ]
then
  echo "Usage: ./deploy.sh srcFile(or Dir) descFile(or Dir) MachineTag"
  echo "Usage: ./deploy.sh srcFile(or Dir) descFile(or Dir) MachineTag confFile"
  exit
fi

src=$1
dest=$2
tag=$3
if [ 'a'$4'a' == 'aa' ]
then
  confFile=/home/hadoop/tools/deploy.conf
else
  confFile=$4
fi

if [ -f $confFile ]
then
  if [ -f $src ]
  then
    for server in `cat $confFile|grep -v '^#'|grep ','$tag','|awk -F',' '{print $1}'`
    do
      scp $src $server":"${dest}
    done
  elif [ -d $src ]
  then
    for server in `cat $confFile|grep -v '^#'|grep ','$tag','|awk -F',' '{print $1}'`
    do
      scp -r $src $server":"${dest}
    done
  else
    echo "Error: No source file exist"
  fi
else
  echo "Error: Please assign config file or run deploy.sh command with deploy.conf in same directory"
fi
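Once it is executable (see the chmod step below), deploy.sh copies a file or directory to every matching host. For example (a sketch, assuming a scratch file test.txt in the current directory):
[hadoop@masternode tools]$ ./deploy.sh test.txt /home/hadoop/ all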
[hadoop@masternode tools]$ touch runRemoteCmd.sh
[hadoop@masternode tools]$ vi runRemoteCmd.sh
#!/bin/bash
#set -x
if [ $# -lt 2 ]
then
  echo "Usage: ./runRemoteCmd.sh Command MachineTag"
  echo "Usage: ./runRemoteCmd.sh Command MachineTag confFile"
  exit
fi

cmd=$1
tag=$2
if [ 'a'$3'a' == 'aa' ]
then
  confFile=/home/hadoop/tools/deploy.conf
else
  confFile=$3
fi

if [ -f $confFile ]
then
  for server in `cat $confFile|grep -v '^#'|grep ','$tag','|awk -F',' '{print $1}'`
  do
    echo "*******************$server***************************"
    ssh $server "source /etc/profile; $cmd"
  done
else
  echo "Error: Please assign config file or run runRemoteCmd.sh command with deploy.conf in same directory"
fi
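Once it is executable (next step), runRemoteCmd.sh runs a command on every matching host. For example, running date on every host tagged all doubles as a check of the time sync from 2.1:
[hadoop@masternode tools]$ ./runRemoteCmd.sh "date" all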
To call the scripts directly, we need to make them executable.
[hadoop@masternode tools]$ chmod u+x deploy.sh
[hadoop@masternode tools]$ chmod u+x runRemoteCmd.sh
Next, add /home/hadoop/tools to the PATH and make the change take effect.
[hadoop@masternode tools]$ su root
Password:
[root@masternode tools]# vi /etc/profile
PATH=/home/hadoop/tools:$PATH
export PATH
[root@masternode tools]# source /etc/profile
From masternode, use the runRemoteCmd.sh script to create the software installation directory /home/hadoop/app on all nodes in one go.
[hadoop@masternode tools]$ runRemoteCmd.sh "mkdir /home/hadoop/app" all
You can confirm on every node that /home/hadoop/app now exists.
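Or check it remotely in one command (a quick sketch using the same tool):
[hadoop@masternode tools]$ runRemoteCmd.sh "ls -ld /home/hadoop/app" all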
2.6 JDK Installation and Configuration
Upload the locally downloaded JDK package (jdk-8u60-linux-x64.tar.gz) to the /home/hadoop/app directory on masternode and extract it.
[hadoop@masternode ~]$ cd /home/hadoop/app/
[hadoop@masternode app]$ rz
[hadoop@masternode app]$ ls
jdk-8u60-linux-x64.tar.gz
[hadoop@masternode app]$ tar zxvf jdk-8u60-linux-x64.tar.gz //extract
[hadoop@masternode app]$ ls
jdk1.8.0_60 jdk-8u60-linux-x64.tar.gz
[hadoop@masternode app]$ rm -f jdk-8u60-linux-x64.tar.gz //remove the tarball
Then add the JDK environment variables.
[hadoop@masternode app]$ su root
Password:
[root@masternode app]# vi /etc/profile
JAVA_HOME=/home/hadoop/app/jdk1.8.0_60
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:/home/hadoop/tools:$PATH
export JAVA_HOME CLASSPATH PATH
(/home/hadoop/tools is the script-tools entry configured in section 2.5.)
[root@masternode app]# source /etc/profile
[root@masternode app]# java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
Repeat the same JDK setup from masternode on slavenode1 and slavenode2.
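Instead of re-uploading, you can push the extracted JDK to the slaves with the deploy.sh tool from 2.5 and verify it (a sketch; the slave tag matches slavenode1 and slavenode2 in deploy.conf). The /etc/profile changes still need to be made by hand on each slave:
[hadoop@masternode app]$ deploy.sh /home/hadoop/app/jdk1.8.0_60 /home/hadoop/app/ slave
[hadoop@masternode app]$ runRemoteCmd.sh "/home/hadoop/app/jdk1.8.0_60/bin/java -version" slave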
2.7 ZooKeeper Installation and Configuration
Upload the locally downloaded ZooKeeper tarball to the /home/hadoop/app directory on masternode.
[root@masternode app]# su hadoop
[hadoop@masternode app]$ rz //select the locally downloaded ZooKeeper tarball
[hadoop@masternode app]$ ls
jdk1.8.0_60 zookeeper-3.4.5-cdh5.10.0.tar.gz
[hadoop@masternode app]$ tar zxvf zookeeper-3.4.5-cdh5.10.0.tar.gz //extract
[hadoop@masternode app]$ mv zookeeper-3.4.5-cdh5.10.0 zookeeper //rename
[hadoop@masternode app]$ ls
jdk1.8.0_60 zookeeper
Edit the ZooKeeper configuration file. Be sure not to leave the inline comments shown below in the actual file, otherwise encoding problems will keep ZooKeeper from starting; in general, keep non-ASCII text and stray whitespace (spaces, tabs) out of configuration files!
[hadoop@masternode app]$ cd /home/hadoop/app/zookeeper/conf/
[hadoop@masternode conf]$ ls
configuration.xsl log4j.properties zoo_sample.cfg
[hadoop@masternode conf]$ cp zoo_sample.cfg zoo.cfg //複製生成zoo.cfg文件
[hadoop@masternode conf]$ vi zoo.cfg
dataDir=/home/hadoop/data/zookeeper/zkdata //data directory
dataLogDir=/home/hadoop/data/zookeeper/zkdatalog //log directory
# the port at which the clients will connect
clientPort=2181 //default client port
#server.N=hostname:port for peer sync and communication:leader-election port
server.1=masternode:2888:3888
server.2=slavenode1:2888:3888
server.3=slavenode2:2888:3888
Copy the ZooKeeper installation directory to the other nodes with scp:
[hadoop@masternode app]$ scp -r zookeeper slavenode1:/home/hadoop/app
[hadoop@masternode app]$ scp -r zookeeper slavenode2:/home/hadoop/app
Create the data directory on all nodes with runRemoteCmd.sh:
[hadoop@masternode app]$ runRemoteCmd.sh "mkdir -p /home/hadoop/data/zookeeper/zkdata" all
*******************masternode***************************
*******************slavenode1***************************
mkdir: cannot create directory `/home/hadoop/data/zookeeper': Permission denied
*******************slavenode2***************************
mkdir: cannot create directory `/home/hadoop/data/zookeeper': Permission denied
The permission-denied errors are an ownership problem: the data/ directory created earlier belongs to root, so we need to hand it over to the hadoop user and group (on each node that reported the error).
[hadoop@masternode hadoop]$ chown -R hadoop:hadoop data
After that, the directories can be created successfully:
[hadoop@masternode tools]$ runRemoteCmd.sh "mkdir -p /home/hadoop/data/zookeeper/zkdata" all
*******************masternode***************************
*******************slavenode1***************************
*******************slavenode2***************************
[hadoop@masternode tools]$ runRemoteCmd.sh "mkdir -p /home/hadoop/data/zookeeper/zkdatalog" all
*******************masternode***************************
*******************slavenode1***************************
*******************slavenode2***************************
Then on masternode, slavenode1, and slavenode2 respectively, go into the zkdata directory and create a file named myid containing 1, 2, and 3. We take masternode as the example; the other two nodes can be filled in remotely, as sketched below.
[hadoop@masternode tools]$ cd /home/hadoop/data/zookeeper/zkdata
[hadoop@masternode zkdata]$ vi myid
1
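The other two myid files can be written from masternode over the passwordless SSH set up in 2.4 (a minimal sketch):
[hadoop@masternode zkdata]$ ssh slavenode1 'echo 2 > /home/hadoop/data/zookeeper/zkdata/myid'
[hadoop@masternode zkdata]$ ssh slavenode2 'echo 3 > /home/hadoop/data/zookeeper/zkdata/myid'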
Configure the ZooKeeper environment variables.
[hadoop@masternode zkdata]$ su root
Password:
[root@masternode zookeeper]# vi /etc/profile
TOOL_HOME=/home/hadoop/tools
JAVA_HOME=/home/hadoop/app/jdk1.8.0_60
ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$TOOL_HOME:$PATH
export JAVA_HOME CLASSPATH PATH ZOOKEEPER_HOME
[root@masternode zookeeper]# source /etc/profile //make the configuration take effect
From masternode, start ZooKeeper on all nodes and check its status.
[hadoop@masternode ~]$ cd /home/hadoop/tools/
[hadoop@masternode tools]$ runRemoteCmd.sh "/home/hadoop/app/zookeeper/bin/zkServer.sh start" zookeeper
*******************masternode***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
*******************slavenode1***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
*******************slavenode2***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@masternode tools]$ runRemoteCmd.sh "/home/hadoop/app/zookeeper/bin/zkServer.sh status" zookeeper
*******************masternode***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
*******************slavenode1***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: leader //the leader is chosen by ZooKeeper's leader election algorithm and depends on startup order;
//normally the first node to start becomes the leader, and if it dies another node is elected leader.
*******************slavenode2***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
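As an extra sanity check (a sketch using the ZooKeeper CLI that ships in zookeeper/bin), connect to any node and list the root znode:
[hadoop@masternode tools]$ /home/hadoop/app/zookeeper/bin/zkCli.sh -server masternode:2181
[zk: masternode:2181(CONNECTED) 0] ls /
[zookeeper]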
2.8 Hadoop Environment Configuration
Upload the downloaded Apache hadoop-2.6.0.tar.gz package to the /home/hadoop/app directory on masternode, then extract it.
[hadoop@masternode tools]$ cd /home/hadoop/app //upload the local hadoop-2.6.0.tar.gz to this directory
[hadoop@masternode app]$ rz
[hadoop@masternode app]$ tar zvxf hadoop-2.6.0.tar.gz //extract
[hadoop@masternode app]$ ls
hadoop-2.6.0 hadoop-2.6.0.tar.gz jdk1.8.0_60 zookeeper
[hadoop@masternode app]$ rm -f hadoop-2.6.0.tar.gz //remove the tarball
[hadoop@masternode app]$ mv hadoop-2.6.0/ hadoop //rename
[hadoop@masternode app]$ ls
hadoop jdk1.8.0_60 zookeeper
Configure HDFS
Switch to /home/hadoop/app/hadoop/etc/hadoop/ and edit the configuration files.
[hadoop@masternode app]$ cd /home/hadoop/app/hadoop/etc/hadoop/
[hadoop@masternode hadoop]$ ls
capacity-scheduler.xml httpfs-env.sh mapred-env.sh
configuration.xsl httpfs-log4j.properties mapred-queues.xml.template
container-executor.cfg httpfs-signature.secret mapred-site.xml.template
core-site.xml httpfs-site.xml slaves
hadoop-env.cmd kms-acls.xml ssl-client.xml.example
hadoop-env.sh kms-env.sh ssl-server.xml.example
hadoop-metrics2.properties kms-log4j.properties yarn-env.cmd
hadoop-metrics.properties kms-site.xml yarn-env.sh
hadoop-policy.xml log4j.properties yarn-site.xml
hdfs-site.xml mapred-env.cmd
Configure hadoop-env.sh
[hadoop@masternode hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_60
Configure core-site.xml
[hadoop@masternode hadoop]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
  <!-- the default HDFS path, named cluster1 -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/tmp</value>
  </property>
  <!-- hadoop's temporary directory; multiple directories are comma-separated; the data directory must be created by us -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>masternode:2181,slavenode1:2181,slavenode2:2181</value>
  </property>
  <!-- the ZooKeeper quorum that manages HDFS HA -->
</configuration>
Configure hdfs-site.xml
[hadoop@masternode hadoop]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- 3 block replicas -->
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- permissions disabled -->
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
  </property>
  <!-- the nameservice; its value must match fs.defaultFS. With NameNode HA there are two NameNodes, and cluster1 is the single external entry point -->
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>masternode,slavenode1</value>
  </property>
  <!-- the NameNodes under nameservice cluster1; these are logical names, any unique names will do -->
  <property>
    <name>dfs.namenode.rpc-address.cluster1.masternode</name>
    <value>masternode:9000</value>
  </property>
  <!-- masternode RPC address -->
  <property>
    <name>dfs.namenode.http-address.cluster1.masternode</name>
    <value>masternode:50070</value>
  </property>
  <!-- masternode HTTP address -->
  <property>
    <name>dfs.namenode.rpc-address.cluster1.slavenode1</name>
    <value>slavenode1:9000</value>
  </property>
  <!-- slavenode1 RPC address -->
  <property>
    <name>dfs.namenode.http-address.cluster1.slavenode1</name>
    <value>slavenode1:50070</value>
  </property>
  <!-- slavenode1 HTTP address -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- enable automatic failover -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://masternode:8485;slavenode1:8485;slavenode2:8485/cluster1</value>
  </property>
  <!-- the JournalNode quorum -->
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- the class responsible for failover when cluster1 fails -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/data/journaldata/jn</value>
  </property>
  <!-- local disk path where the JournalNodes store the shared NameNode edits -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>10000</value>
  </property>
  <!-- default fencing (split-brain prevention) configuration -->
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
</configuration>
Configure the slaves file
[hadoop@masternode hadoop]$ vi slaves
slavenode2
Distribute the Hadoop installation to all nodes.
[hadoop@masternode app]$ scp -r hadoop slavenode1:/home/hadoop/app
[hadoop@masternode app]$ scp -r hadoop slavenode2:/home/hadoop/app
Startup order after configuring HDFS
1) Start the ZooKeeper process on all nodes
[hadoop@masternode app]$ cd /home/hadoop/tools/
[hadoop@masternode tools]$ runRemoteCmd.sh "/home/hadoop/app/zookeeper/bin/zkServer.sh start" zookeeper
*******************masternode***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
*******************slavenode1***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
*******************slavenode2***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@masternode tools]$ runRemoteCmd.sh "/home/hadoop/app/zookeeper/bin/zkServer.sh status" zookeeper
*******************masternode***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
*******************slavenode1***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: follower
*******************slavenode2***************************
JMX enabled by default
Using config: /home/hadoop/app/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[hadoop@masternode hadoop]$ jps
6560 Jps
6459 QuorumPeerMain
Here, QuorumPeerMain is the ZooKeeper process.
2) Start the JournalNode process on all nodes
[hadoop@masternode tools]$ runRemoteCmd.sh "/home/hadoop/app/hadoop/sbin/hadoop-daemon.sh start journalnode" all
*******************masternode***************************
starting journalnode, logging to /home/hadoop/app/hadoop/logs/hadoop-hadoop-journalnode-masternode.out
*******************slavenode1***************************
starting journalnode, logging to /home/hadoop/app/hadoop/logs/hadoop-hadoop-journalnode-slavenode1.out
*******************slavenode2***************************
starting journalnode, logging to /home/hadoop/app/hadoop/logs/hadoop-hadoop-journalnode-slavenode2.out
[hadoop@masternode tools]$ jps
6672 Jps
6624 JournalNode
6459 QuorumPeerMain
Alternatively, start it on each node individually:
[hadoop@masternode hadoop]$ sbin/hadoop-daemon.sh start journalnode
3) First, on the primary node (here, masternode), format and start the NameNode
[hadoop@masternode hadoop]$ bin/hdfs namenode -format //format the namenode
[hadoop@masternode hadoop]$ bin/hdfs zkfc -formatZK //format the HA failover state in ZooKeeper
[hadoop@masternode hadoop]$ bin/hdfs namenode //start the namenode (runs in the foreground)
4) Meanwhile, on the standby node (here, slavenode1), synchronize the metadata
[hadoop@slavenode1 hadoop]$ bin/hdfs namenode -bootstrapStandby //copy the metadata from the primary to the standby NameNode
5) Once slavenode1 has finished syncing, press Ctrl+C on masternode to stop the NameNode process, then stop the JournalNode processes on all nodes
[hadoop@masternode hadoop]$ runRemoteCmd.sh "/home/hadoop/app/hadoop/sbin/hadoop-daemon.sh stop journalnode" all
//this stops the journalnode on every node
[hadoop@masternode hadoop]$ jps
6842 Jps
6459 QuorumPeerMain
6) If all of the above went smoothly, we can now start every HDFS-related process with a single command
[hadoop@masternode hadoop]$ sbin/start-dfs.sh
[hadoop@masternode hadoop]$ jps
8640 DFSZKFailoverController
8709 Jps
6459 QuorumPeerMain
8283 NameNode
8476 JournalNode
[hadoop@slavenode1 hadoop]$ jps
5667 DFSZKFailoverController
5721 Jps
5562 JournalNode
4507 QuorumPeerMain
5487 NameNode
[hadoop@slavenode2 hadoop]$ jps
5119 Jps
5040 JournalNode
5355 DataNode
4485 QuorumPeerMain
As the jps output shows, masternode and slavenode1 act as NameNodes, while slavenode2 is the DataNode.
To verify the startup, check the NameNode status in the web UI:
http://masternode:50070
In the UI, masternode is in the active state and slavenode1 is standby.
7) Test that the cluster works
Create a directory in HDFS with the command below, then view it through the file-system browser in the web UI.
[hadoop@masternode hadoop]$ hdfs dfs -mkdir /test
You can also upload files into the new directory and check them the same way.
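For instance (a minimal sketch, assuming a scratch file /tmp/hello.txt):
[hadoop@masternode hadoop]$ echo "hello hadoop" > /tmp/hello.txt
[hadoop@masternode hadoop]$ hdfs dfs -put /tmp/hello.txt /test/
[hadoop@masternode hadoop]$ hdfs dfs -ls /test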
Note that which NameNode is active and which is standby is not fixed; it is decided by the election.
Now let's make slavenode1 the active node.
First kill the NameNode process on masternode, then refresh the web page and see what changes.
[hadoop@masternode hadoop]$ jps
8640 DFSZKFailoverController
8901 Jps
6459 QuorumPeerMain
8283 NameNode
8476 JournalNode
[hadoop@masternode hadoop]$ kill -9 8283
[hadoop@masternode hadoop]$ jps
8640 DFSZKFailoverController
8916 Jps
6459 QuorumPeerMain
8476 JournalNode
[hadoop@slavenode1 hadoop]$ jps
5986 Jps
5667 DFSZKFailoverController
5562 JournalNode
4507 QuorumPeerMain
5487 NameNode
As the page shows, slavenode1 has switched to the active state! Because masternode's NameNode was killed, the election algorithm promoted slavenode1 to the active NameNode.
2.9 YARN Installation and Configuration
Configure mapred-site.xml
[hadoop@masternode hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@masternode hadoop]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- run MapReduce on YARN; this is a difference from Hadoop 1 -->
</configuration>
Configure yarn-site.xml
[hadoop@masternode hadoop]$ vi yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <!-- retry/timeout interval -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- enable high availability -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- enable automatic failover -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <!-- failover uses the embedded election -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-rm-cluster</value>
  </property>
  <!-- name the YARN cluster yarn-rm-cluster -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- name the ResourceManagers rm1 and rm2 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>masternode</value>
  </property>
  <!-- hostname of ResourceManager rm1 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>slavenode1</value>
  </property>
  <!-- hostname of ResourceManager rm2 -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- enable ResourceManager automatic recovery -->
  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>masternode:2181,slavenode1:2181,slavenode2:2181</value>
  </property>
  <!-- ZooKeeper address for the state store -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>masternode:2181,slavenode1:2181,slavenode2:2181</value>
  </property>
  <!-- ZooKeeper address -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>masternode:8032</value>
  </property>
  <!-- rm1 port -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>masternode:8034</value>
  </property>
  <!-- rm1 scheduler port -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>masternode:8088</value>
  </property>
  <!-- rm1 webapp port -->
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>slavenode1:8032</value>
  </property>
  <!-- rm2 port -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>slavenode1:8034</value>
  </property>
  <!-- rm2 scheduler port -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>slavenode1:8088</value>
  </property>
  <!-- rm2 webapp port -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- shuffle service required to run MapReduce -->
</configuration>
Start YARN
1) Send yarn-site.xml to slavenode1 and slavenode2.
[hadoop@masternode hadoop]$ scp yarn-site.xml slavenode1:/home/hadoop/app/hadoop/etc/hadoop/
yarn-site.xml 100% 2782 2.7KB/s 00:00
[hadoop@masternode hadoop]$ scp yarn-site.xml slavenode2:/home/hadoop/app/hadoop/etc/hadoop/
yarn-site.xml 100% 2782 2.7KB/s 00:00
2) On masternode, run:
[hadoop@masternode hadoop]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop/logs/yarn-hadoop-resourcemanager-masternode.out
slavenode2: starting nodemanager, logging to /home/hadoop/app/hadoop/logs/yarn-hadoop-nodemanager-slavenode2.out
[hadoop@masternode hadoop]$ jps
8640 DFSZKFailoverController
8969 ResourceManager
6459 QuorumPeerMain
8476 JournalNode
9054 Jps
The ResourceManager is YARN's corresponding daemon.
3) On slavenode1, run:
[hadoop@slavenode1 hadoop]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop/logs/yarn-hadoop-resourcemanager-slavenode1.out
[hadoop@slavenode1 hadoop]$ jps
5667 DFSZKFailoverController
5562 JournalNode
4507 QuorumPeerMain
6059 ResourceManager
6127 Jps
5487 NameNode
Then open both web UIs:
http://masternode:8088
http://slavenode1:8088
Check the ResourceManager states:
[hadoop@slavenode1 hadoop]$ bin/yarn rmadmin -getServiceState rm1
18/04/20 16:58:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@slavenode1 hadoop]$ bin/yarn rmadmin -getServiceState rm2
18/04/20 16:58:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
In other words, slavenode1's ResourceManager (rm2) is active and masternode's (rm1) is standby, which matches the web pages. Stopping the active ResourceManager and starting it again swaps the two states, just as with the NameNodes; a sketch follows.
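A minimal sketch of that failover test (assuming the state above, with rm2 on slavenode1 currently active):
[hadoop@slavenode1 hadoop]$ sbin/yarn-daemon.sh stop resourcemanager
[hadoop@masternode hadoop]$ bin/yarn rmadmin -getServiceState rm1 //should now report active
[hadoop@slavenode1 hadoop]$ sbin/yarn-daemon.sh start resourcemanager //rm2 rejoins as standby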
With that, the three-node Hadoop cluster is complete: ZooKeeper coordinates the cluster, and both the NameNode and the ResourceManager have hot standbys.
That is the main content of this installment. It all comes from my own learning process, and I hope it offers some guidance. If it was useful, please leave a like; if not, please bear with me, and do point out any mistakes. Follow the blog to get updates as soon as they are posted. Thanks!