[Big Data Study Notes] 4. ZooKeeper: a handy coordinator for distributed services

 

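I originally meant to use this part to set up a distributed, highly available Hadoop environment, but halfway through I realized that ZooKeeper needs an introduction first.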

ZooKeeper concepts

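ZooKeeper provides coordination services for distributed applications, and it is itself distributed: it runs on at least three machines, which together form a quorum (ZooKeeper's documentation calls the group an ensemble), much like a theatre troupe. The troupe has a director, the leader; everyone else is a follower. Every member keeps the same data in their head (ZooKeeper holds its data in memory) and stays in sync with the leader, so a client can connect to any one server. Each member has a number, its myid. If the leader goes down, the remaining members elect a new leader so that the service keeps running.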
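Why at least three machines? ZooKeeper keeps working only while a strict majority of the configured servers can reach each other, so the arithmetic below may help. It is a minimal sketch; the function names quorum_size and tolerated_failures are made up for this illustration.

```shell
# Smallest number of servers that forms a majority in an ensemble of $1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# How many servers may fail while a majority is still reachable.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "ensemble=$n quorum=$(quorum_size "$n") can_lose=$(tolerated_failures "$n")"
done
```

Three servers survive one failure and so do four, while five survive two. A fourth server therefore adds no fault tolerance over three, which is why odd sizes are the usual recommendation.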

1. Installing ZooKeeper

Installing ZooKeeper is simple: unpack the archive and set the environment variables.

[root@hadoop100 bin]# tar -zxvf /opt/software/zookeeper-3.4.10.tar.gz -C /opt/modules/

 

Open /etc/profile and add the following environment variables:

#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_121
export PATH=$PATH:$JAVA_HOME/bin

#HADOOP_HOME
export HADOOP_HOME=/opt/modules/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

#ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/modules/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin

 

 

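Then run source /etc/profile so the change takes effect. Remember to use the xsync and xcall helper scripts to push the change to the whole cluster. Note that source only affects the shell it runs in; new login sessions on the other machines read the updated /etc/profile on their own.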

[root@hadoop100 bin]# xsync /etc/profile
[root@hadoop100 bin]# xcall source /etc/profile
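A typo in these export lines, such as a missing `$` in front of ZOOKEEPER_HOME, fails silently: the shell simply adds a literal, non-existent directory to PATH. The following sketch checks whether the ZooKeeper bin directory really ended up on PATH; path_contains and check_zk_path are hypothetical helper names, not part of ZooKeeper or of the author's scripts.

```shell
# Succeeds if directory $1 is one of the colon-separated entries of PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the ZooKeeper bin directory made it into PATH.
check_zk_path() {
    if [ -n "$ZOOKEEPER_HOME" ] && path_contains "$ZOOKEEPER_HOME/bin"; then
        echo "ok: $ZOOKEEPER_HOME/bin is on PATH"
    else
        echo "missing: ZOOKEEPER_HOME/bin is not on PATH"
        return 1
    fi
}
```

Run check_zk_path in a fresh login shell after editing /etc/profile.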

2. Configuring ZooKeeper

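1. ZooKeeper needs a data directory in which it stores snapshots of its in-memory database and its transaction logs. Once that directory exists, edit zoo.cfg. The unpacked distribution ships /opt/modules/zookeeper-3.4.10/conf/zoo_sample.cfg; copy or rename it to zoo.cfg and point dataDir at the data directory.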

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/modules/zookeeper-3.4.10/zkData
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
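The preparation described before the listing (create a data directory, copy zoo_sample.cfg to zoo.cfg, point dataDir at the new directory) can be sketched as a small function. The name prepare_zk_conf is an assumption for illustration, and the sketch assumes the sample file contains a dataDir= line, as the one shown here does.

```shell
# Create the data directory and derive zoo.cfg from the shipped sample.
# $1 = ZooKeeper install directory, $2 = data directory to use.
prepare_zk_conf() {
    zk_home=$1
    data_dir=$2
    mkdir -p "$data_dir" || return 1
    # Rewrite only the dataDir line; every other line is copied unchanged.
    sed "s|^dataDir=.*|dataDir=$data_dir|" \
        "$zk_home/conf/zoo_sample.cfg" > "$zk_home/conf/zoo.cfg"
}

# Usage on the layout from this post (not run here):
# prepare_zk_conf /opt/modules/zookeeper-3.4.10 /opt/modules/zookeeper-3.4.10/zkData
```

Writing to a new zoo.cfg leaves zoo_sample.cfg untouched, so the original stays available for comparison.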

 

A ZooKeeper cluster should have an odd number of servers, and at least three. Add the cluster settings and one server entry per machine to zoo.cfg:

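# Ticks allowed for a server to connect to the leader; past this limit the server is considered dead
initLimit = 5
# Maximum number of ticks a server may lag behind while syncing with the leader; past this limit it has fallen out of sync
syncLimit = 2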
server.100=hadoop100:2888:3888
server.101=hadoop101:2888:3888
server.102=hadoop102:2888:3888
server.103=hadoop103:2888:3888
server.104=hadoop104:2888:3888

The server entries have this format:

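server.A=B:C:D
A is a number identifying the server;
B is the server's IP address (or, as in the entries above, its hostname);
C is the port this server uses to exchange information with the cluster's leader;
D is the port the servers use to talk to each other during leader election, for example when the current leader goes down and a new one has to be chosen.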
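To make the four fields concrete, here is a minimal sketch that splits one entry with plain shell parameter expansion. parse_server_line and the labels peer_port and election_port are names chosen for this illustration, not ZooKeeper terminology.

```shell
# Split one "server.A=B:C:D" line into its four fields.
parse_server_line() {
    line=$1
    id=${line#server.}      # drop the "server." prefix -> "A=B:C:D"
    id=${id%%=*}            # keep what precedes "="    -> "A"
    rest=${line#*=}         # keep what follows "="     -> "B:C:D"
    host=${rest%%:*}        # first field               -> "B"
    ports=${rest#*:}        # remaining fields          -> "C:D"
    echo "id=$id host=$host peer_port=${ports%%:*} election_port=${ports#*:}"
}

parse_server_line "server.101=hadoop101:2888:3888"
# prints: id=101 host=hadoop101 peer_port=2888 election_port=3888
```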

 

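Once the changes are done, don't forget to distribute them: ZooKeeper itself is distributed, so the files have to be copied to the other machines in the cluster first.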

[root@hadoop100 modules]# xsync zookeeper-3.4.10/

 

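Then give each machine its own identity: create a myid file in the data directory whose content is that machine's number from the configuration file above.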

[root@hadoop100 zookeeper-3.4.10]# cd zkData/
[root@hadoop100 zkData]# echo 100 > myid

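On the other machines, set myid to the number that matches the configuration file. This part is a bit tedious: log in to each machine in turn and create the myid file in its data directory.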

[root@hadoop100 zkData]# ssh hadoop101
Last login: Thu Sep 19 14:10:35 2019 from gateway
[root@hadoop101 ~]# echo 101 > /opt/modules/zookeeper-3.4.10/zkData/myid
[root@hadoop101 ~]# exit
logout
Connection to hadoop101 closed.
[root@hadoop100 zkData]# ssh hadoop102
Last login: Tue Sep 17 13:26:48 2019 from hadoop100
[root@hadoop102 ~]# echo 102 > /opt/modules/zookeeper-3.4.10/zkData/myid
[root@hadoop102 ~]# exit
logout
Connection to hadoop102 closed.
[root@hadoop100 zkData]# ssh hadoop103
Last login: Tue Sep 17 13:17:00 2019 from hadoop100
[root@hadoop103 ~]# echo 103 > /opt/modules/zookeeper-3.4.10/zkData/myid
[root@hadoop103 ~]# exit
logout
Connection to hadoop103 closed.
[root@hadoop100 zkData]# ssh hadoop104
Last login: Tue Sep 17 11:04:38 2019 from hadoop100
[root@hadoop104 ~]# echo 104 > /opt/modules/zookeeper-3.4.10/zkData/myid
[root@hadoop104 ~]# exit
logout
Connection to hadoop104 closed.
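The manual logins above could probably be replaced by a loop that reads the server entries straight from zoo.cfg. This is a sketch under assumptions: passwordless SSH from this machine (which the transcript suggests is already in place), the same data directory path on every host, and helper names (list_servers, write_myids) invented for the example.

```shell
# Print "id host" for every server.N=host:port:port line in a zoo.cfg file.
list_servers() {
    sed -n 's/^server\.\([0-9][0-9]*\)=\([^:]*\):.*/\1 \2/p' "$1"
}

# Write each server's id into <data_dir>/myid on that server.
# $1 = path to zoo.cfg, $2 = data directory (same path on every host).
write_myids() {
    cfg=$1
    data_dir=$2
    list_servers "$cfg" | while read -r id host; do
        # -n keeps ssh from swallowing the rest of the list on stdin
        ssh -n "$host" "echo $id > $data_dir/myid"
    done
}

# Usage (not run here):
# write_myids /opt/modules/zookeeper-3.4.10/conf/zoo.cfg /opt/modules/zookeeper-3.4.10/zkData
```

Deriving the ids from zoo.cfg rather than typing them keeps myid and the server.N entries from drifting apart.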

Check that everything is in place:

[root@hadoop100 bin]# xcall cat /opt/modules/zookeeper-3.4.10/zkData/myid
---------running at localhost--------
100
---------running at hadoop101-------
101
---------running at hadoop102-------
102
---------running at hadoop103-------
103
---------running at hadoop104-------
104
[root@hadoop100 bin]#

 

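That's the basic configuration done, so it is time to start up. The ZooKeeper service has to be started on every machine in the cluster. I used the xcall helper script introduced earlier. (I later found this approach unreliable: it reported the servers as started when they actually weren't.)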

[root@hadoop100 zkData]# xcall /opt/modules/zookeeper-3.4.10/bin/zkServer.sh start
---------running at localhost--------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
---------running at hadoop101-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
---------running at hadoop102-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
---------running at hadoop103-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
---------running at hadoop104-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop100 zkData]#

 

Troubleshooting: Error contacting service. It is probably not running.

Checking the status: uh-oh, why isn't anything running?

[root@hadoop100 bin]# xcall /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
---------running at localhost--------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
---------running at hadoop101-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
---------running at hadoop102-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
---------running at hadoop103-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
---------running at hadoop104-------
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
[root@hadoop100 bin]#
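Two explanations seem plausible here, though neither is confirmed in this post. First, as far as I know zkServer.sh start reports STARTED once it has launched the JVM in the background, which does not prove the JVM stayed up. Second, a command run through non-interactive ssh does not necessarily read /etc/profile, so java may not be on PATH there. The sketch below tests the second guess; tool_visible is a hypothetical helper.

```shell
# Run "command -v <tool>" through the given runner and report success.
# $1 = tool name; remaining arguments = command prefix that executes a
# command string, e.g. "sh -c" locally or "ssh -n hadoop101" remotely.
tool_visible() {
    tool=$1
    shift
    "$@" "command -v $tool" > /dev/null 2>&1
}

# Usage (not run here):
# tool_visible java ssh -n hadoop101 || echo "java is not on PATH for ssh commands on hadoop101"
```

If java turns out to be invisible, the conf/java.env file that ZooKeeper's zkEnv.sh sources is one place to set JAVA_HOME for the server scripts.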

 

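It turned out that logging in to each machine over ssh and starting the server there works; perhaps xcall is just not always reliable. One tip: zkServer.sh start-foreground shows the startup process in detail, which makes errors easier to track down.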

[root@hadoop101 ~]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh start-foreground
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
2019-09-19 14:52:29,093 [myid:] - INFO  [main:QuorumPeerConfig@134] - Reading configuration from: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
2019-09-19 14:52:29,122 [myid:] - INFO  [main:QuorumPeer$QuorumServer@167] - Resolved hostname: hadoop104 to address: hadoop104/192.168.56.104
2019-09-19 14:52:29,123 [myid:] - INFO  [main:QuorumPeer$QuorumServer@167] - Resolved hostname: hadoop103 to address: hadoop103/192.168.56.103
2019-09-19 14:52:29,123 [myid:] - INFO  [main:QuorumPeer$QuorumServer@167] - Resolved hostname: hadoop102 to address: hadoop102/192.168.56.102
2019-09-19 14:52:29,124 [myid:] - INFO  [main:QuorumPeer$QuorumServer@167] - Resolved hostname: hadoop101 to address: hadoop101/192.168.56.101
2019-09-19 14:52:29,124 [myid:] - INFO  [main:QuorumPeer$QuorumServer@167] - Resolved hostname: hadoop100 to address: hadoop100/192.168.56.100
2019-09-19 14:52:29,124 [myid:] - INFO  [main:QuorumPeerConfig@396] - Defaulting to majority quorums
2019-09-19 14:52:29,134 [myid:101] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2019-09-19 14:52:29,135 [myid:101] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2019-09-19 14:52:29,135 [myid:101] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2019-09-19 14:52:29,150 [myid:101] - INFO  [main:QuorumPeerMain@127] - Starting quorum peer
2019-09-19 14:52:29,171 [myid:101] - INFO  [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181
2019-09-19 14:52:29,172 [myid:101] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:90)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:130)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
[root@hadoop101 ~]#
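The log above ends with java.net.BindException: Address already in use, raised while binding client port 2181. Some process already holds that port, quite possibly a ZooKeeper instance started earlier. A sketch for finding out who holds it follows; it assumes the ss utility is installed, and the helper names are made up for the example.

```shell
# Print the clientPort value from a zoo.cfg file.
client_port() {
    sed -n 's/^clientPort=\([0-9][0-9]*\).*/\1/p' "$1"
}

# Show listening TCP sockets on port $1, with the owning process
# (the process column may require root).
who_holds_port() {
    ss -ltnp | grep ":$1 "
}

# Usage (not run here):
# who_holds_port "$(client_port /opt/modules/zookeeper-3.4.10/conf/zoo.cfg)"
```

If the owner turns out to be a QuorumPeerMain process, the server is already running and a second start is unnecessary.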

 

If jps lists QuorumPeerMain, the server process is up.

[root@hadoop100 bin]# jps
1885 QuorumPeerMain
2029 Jps

 

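After logging in to each server over SSH, starting ZooKeeper there, and checking the status, I can see that in my cluster hadoop102 is currently the leader and the others are followers: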

[root@hadoop100 bin]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop100 bin]# ssh hadoop101
Last login: Thu Sep 19 15:04:12 2019 from hadoop100
[root@hadoop101 ~]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop101 ~]# exit
logout
Connection to hadoop101 closed.
[root@hadoop100 bin]# ssh hadoop102
Last login: Thu Sep 19 15:04:48 2019 from hadoop100
[root@hadoop102 ~]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader
[root@hadoop102 ~]# exit
logout
Connection to hadoop102 closed.
[root@hadoop100 bin]# ssh hadoop103
Last login: Thu Sep 19 15:05:07 2019 from hadoop100
[root@hadoop103 ~]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop103 ~]# exit
logout
Connection to hadoop103 closed.
[root@hadoop100 bin]# ssh hadoop104
Last login: Thu Sep 19 15:05:51 2019 from hadoop100
[root@hadoop104 ~]# /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/modules/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop104 ~]# exit
logout
Connection to hadoop104 closed.
[root@hadoop100 bin]#
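The round of logins above could be collapsed into one loop that prints each server's role. This is a sketch, assuming passwordless SSH and the same install path on every host; extract_mode and cluster_modes are names invented for this example.

```shell
# Pull the role out of "zkServer.sh status" output read from stdin.
extract_mode() {
    sed -n 's/^Mode: *//p'
}

# Print "host: mode" for each host given as an argument.
cluster_modes() {
    for host in "$@"; do
        mode=$(ssh -n "$host" /opt/modules/zookeeper-3.4.10/bin/zkServer.sh status 2>/dev/null | extract_mode)
        # No "Mode:" line means the status call did not reach a serving instance.
        echo "$host: ${mode:-unknown}"
    done
}

# Usage (not run here):
# cluster_modes hadoop100 hadoop101 hadoop102 hadoop103 hadoop104
```

A healthy cluster should show exactly one leader.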

 

And that's it: my ZooKeeper cluster is now up and running.

 

 

Aside

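Virtual machines are fine for study and experiments, but for serious work you'll want the cloud, for example Alibaba Cloud. If you need it, you can use my link below, which comes with a discount and cashback.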

https://promotion.aliyun.com/ntms/yunparter/invite.html?userCode=vltv9frd
