Deploying a Hadoop Cluster on Alibaba Cloud ECS (Part 2): Fully Distributed HBase Cluster Setup (Using an External ZooKeeper)

Date: 2020-04-23


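This post builds on "Deploying a Hadoop Cluster on Alibaba Cloud ECS (Part 1): Fully Distributed Hadoop Cluster Environment Setup", with one additional DataNode added to the cluster.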

1 Node environment

1.1 Environment

Servers: three Alibaba Cloud ECS servers: master, slave1, slave2
Operating system: CentOS 7.3
Hadoop: hadoop-2.7.3.tar.gz
Java: jdk-8u77-linux-x64.tar.gz
HBase: hbase-1.2.6-bin.tar.gz
ZooKeeper: zookeeper-3.4.14.tar.gz

1.2 Role assignment per node

master: NameNode, SecondaryNameNode, HMaster, QuorumPeerMain
slave1: DataNode, HMaster (backup), HRegionServer, QuorumPeerMain
slave2: DataNode, HRegionServer, QuorumPeerMain

2 Downloading HBase

Download hbase-1.2.6-bin.tar.gz and extract it to a suitable location. I extracted it under:

/usr/local

Rename the extracted directory to hbase:

cd /usr/local
mv hbase-1.2.6/ hbase/
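The extract-and-rename step can be wrapped in a small function so it is repeatable and refuses to clobber an existing install. This is only a sketch: install_hbase is a hypothetical helper name, and the name of the top-level directory inside the tarball is passed in rather than guessed.

```shell
#!/usr/bin/env bash
# Sketch: unpack an HBase binary tarball and rename the result to "hbase".
# install_hbase is a hypothetical helper, not part of HBase itself.
install_hbase() {
  local tarball="$1"      # e.g. hbase-1.2.6-bin.tar.gz
  local dest="$2"         # e.g. /usr/local
  local version_dir="$3"  # top-level directory in the tarball, e.g. hbase-1.2.6

  # Refuse to overwrite an existing installation.
  if [ -e "$dest/hbase" ]; then
    echo "error: $dest/hbase already exists" >&2
    return 1
  fi

  tar -xzf "$tarball" -C "$dest" || return 1
  mv "$dest/$version_dir" "$dest/hbase"
}
```

With the paths used in this post, the call would be: install_hbase hbase-1.2.6-bin.tar.gz /usr/local hbase-1.2.6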

3 Adding the HBase environment variables

Add the following to "/etc/profile":

export HBASE_HOME=/usr/local/hbase
export PATH=$PATH:$HBASE_HOME/bin

Reload the environment:

source /etc/profile
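If this step is run more than once, /etc/profile ends up with duplicate export lines. A minimal sketch of an idempotent append follows; add_line_once is a hypothetical helper name introduced here for illustration.

```shell
#!/usr/bin/env bash
# add_line_once (hypothetical helper): append a line to a file only if that
# exact line is not already present.
add_line_once() {
  local line="$1" file="$2"
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root), mirroring the two exports above:
#   add_line_once 'export HBASE_HOME=/usr/local/hbase' /etc/profile
#   add_line_once 'export PATH=$PATH:$HBASE_HOME/bin' /etc/profile
#   source /etc/profile
```

The single quotes in the usage lines matter: they keep $PATH and $HBASE_HOME unexpanded so the literal text lands in the profile.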

4 Editing the HBase configuration

4.1 Editing the HBase environment settings (hbase-env.sh)

Edit the file:

vim $HBASE_HOME/conf/hbase-env.sh

Add the following:

export JAVA_HOME=/usr/local/jdk1.8
export HBASE_CLASSPATH=/usr/local/hadoop/etc/hadoop
export HBASE_MANAGES_ZK=false

Regarding HBASE_CLASSPATH, the official documentation explains: Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as configuration directives for HDFS clients, as opposed to server-side configurations, you must use one of the following methods to enable HBase to see and use these configuration changes:

Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh.
Add a copy of hdfs-site.xml (or hadoop-site.xml) or, better, symlinks, under ${HBASE_HOME}/conf, or
if only a small set of HDFS client configurations, add them to hbase-site.xml.

An example of such an HDFS client configuration is dfs.replication. If for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 unless you do the above to make the configuration available to HBase.

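HBASE_MANAGES_ZK controls whether HBase uses its built-in ZooKeeper. The default is true, meaning the built-in ZooKeeper is used; here I use an external ZooKeeper instead. (An external ZooKeeper is recommended for production because it is easier to maintain; see the article "Should you use the ZooKeeper bundled with HBase or not".)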

4.2 Editing the HBase default configuration (hbase-site.xml)

Edit the file:

vim $HBASE_HOME/conf/hbase-site.xml

The configuration can follow the example below:

<configuration>
  <!-- HBase data is stored under this directory in HDFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <!-- Whether this is a distributed deployment -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper addresses; all three nodes run ZooKeeper -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
  <!-- Data directory for the built-in ZooKeeper (used only when HBase manages ZooKeeper itself) -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/hbase/zookeeper</value>
  </property>
</configuration>
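A typo in hbase.rootdir or hbase.zookeeper.quorum only shows up later as a confusing startup error, so it can help to read the values back before starting anything. The sketch below uses a hypothetical helper, get_hbase_property, and assumes the simple layout shown above, where each <value> sits on the line directly after its <name>. It is a text match, not a real XML parser.

```shell
#!/usr/bin/env bash
# get_hbase_property (hypothetical helper): print the <value> that follows a
# given <name> in an hbase-site.xml laid out one element per line.
get_hbase_property() {
  local name="$1" file="$2"
  # -F: match the name literally; -A1: also emit the following line.
  grep -F -A1 "<name>${name}</name>" "$file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage: list the ZooKeeper hosts HBase will try to reach, one per line.
#   quorum=$(get_hbase_property hbase.zookeeper.quorum "$HBASE_HOME/conf/hbase-site.xml")
#   printf '%s\n' ${quorum//,/ }
```

Each host printed by the usage lines should resolve through /etc/hosts on every node.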

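4.3 Specifying the region servers (regionservers)

Edit the file:

vim $HBASE_HOME/conf/regionservers

Add the following:

slave1
slave2

4.4 Specifying the backup master (backup-masters)

This file does not exist by default; you have to create it yourself.

Edit the file:

vim $HBASE_HOME/conf/backup-masters

Add the following:

slave1

To keep the HBase cluster highly available, HBase supports multiple backup masters. When the active master goes down, a backup master automatically takes over the whole HBase cluster.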

5 Distributing hbase and profile to slave1 and slave2 (compressing hbase before distributing is recommended)

scp -r /usr/local/hbase slave1:/usr/local
scp -r /usr/local/hbase slave2:/usr/local

scp /etc/profile slave1:/etc/
scp /etc/profile slave2:/etc/

After distributing, reload the environment on each node and test it; hbase version works as a quick test.
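The heading recommends compressing hbase before copying it, which is much faster than scp -r over many small files. A minimal sketch of that variant follows. It assumes passwordless ssh from master to the slaves; distribute_hbase, run, and the /tmp/hbase.tar.gz staging path are hypothetical names chosen for illustration. With DRY_RUN=1 the commands are only printed, so they can be reviewed first.

```shell
#!/usr/bin/env bash
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# distribute_hbase (hypothetical helper): pack /usr/local/hbase once, then
# copy and unpack it on every host given as an argument.
distribute_hbase() {
  local archive=/tmp/hbase.tar.gz   # assumed staging path
  local host
  run tar -czf "$archive" -C /usr/local hbase || return 1
  for host in "$@"; do
    run scp "$archive" "$host:/tmp/" || return 1
    run ssh "$host" "tar -xzf $archive -C /usr/local" || return 1
    # Same as the original step: this overwrites /etc/profile on the slave.
    run scp /etc/profile "$host:/etc/" || return 1
  done
}

# Usage:
#   DRY_RUN=1 distribute_hbase slave1 slave2   # print only
#   distribute_hbase slave1 slave2             # actually copy
```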

6 Installing ZooKeeper

See "Deploying a Hadoop Cluster on Alibaba Cloud ECS (Part 3): Fully Distributed ZooKeeper Cluster Setup".

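7 Opening the required ports (a pitfall!)

Note: if all ports on your servers are already open, you can skip this step. If you would rather not read my rambling, jump straight to the end of this section.

Probably because I never learned computer networking properly, more than half of my time since I started setting up Hadoop was wasted on this port problem. All sorts of errors came down to it. 😭😭😭

I hereby swear to go back and study computer networking properly!

Back to the point: by default, an Alibaba Cloud server opens only three ports.

So every port the Hadoop cluster needs has to be opened manually.

I had read about the security risks of exposed ports, for example attackers abusing ports such as the YARN web UI port 8088 for cryptocurrency mining. So I tried opening ports one at a time by going through the configuration files, and ran into all sorts of problems, for example:

The web UI would not open
The web UI displayed incorrect content
Communication problems between nodes
HBase errors

While configuring Hadoop itself this was still manageable, because the problems were fairly easy to spot. But as the stack grew upward from the base layer into the Hadoop components, the problems became mysterious. For example, these are errors I ran into with HBase: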

ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing;
ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet;
master.ServerManager: Waiting for region servers count to settle;
The Load Balancer is not enabled which will eventually cause performance degradation in HBase as Regions will not be distributed across all RegionServers. The balancer is only expected to be disabled during rolling upgrade scenarios.
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.SocketException: Network is unreachable;
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 5643550422109494702 number_of_rows: 100 close_scanner: false next_call_seq: 0
master.SplitLogManager: finished splitting (more than or equal to) 0 bytes
hbase:meta,,1.1588230740 state=PENDING_OPEN, ts=Tue Nov 24 08:26:45 UTC 2015 (1098s ago), server=amb2.service.consul,16020,1448353564099
17/11/12 22:44:10 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as 192.168.0.101:50010
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1456)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1357)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:587)

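Note: I did not record the errors this time, so the ones above were recovered from my browser history; where an entry was incomplete, I substituted a similar error found online.

The fixes I found online for these problems boil down to the following:

Turn off the firewall
Disable SELinux
Leave HDFS safe mode
Configure time synchronization
Configure the internal and external IPs of each node in the hosts file
Reformat HDFS
Repair HBase
Some node was not started or was shut down abnormally

However:

The firewall on Alibaba Cloud servers is off by default
SELinux on Alibaba Cloud servers is disabled by default
HDFS was not in safe mode
Alibaba Cloud servers synchronize time automatically
The internal and external IPs were already configured in "Deploying a Hadoop Cluster on Alibaba Cloud ECS (Part 1): Fully Distributed Hadoop Cluster Environment Setup"
Deleting tmp and reformatting HDFS did not help
Repairing HBase did not help
All nodes started normally and none had shut down abnormally

So after trying all of these, not one of my many errors was solved.

I read the official documentation carefully, tried all kinds of configurations, and even tried several HBase versions. On my teacher's advice I first moved HMaster to a different node, which failed, and then tried pseudo-distributed mode, which also failed. By this point, though, the errors were no longer few and cryptic as they had been at the start. Fewer search results matched them, the scope kept shrinking, and the errors became clearer. I then realized the problem was the connection between HDFS nodes, but being a novice I still could not pin it down, so I switched HDFS to pseudo-distributed mode, and it finally worked. Next I configured a two-node HDFS cluster, and this time the log finally confirmed the cause: ports 50010 and 50020 were not open on the nodes, so the nodes could not talk to each other. In a fit of frustration I opened every port on the servers, reconfigured everything, and it worked.

Takeaway: read the logs carefully and make sure you understand them. When there are few error messages and none of the usual fixes work, change your approach: simplify the current configuration, narrow the scope, and get new, more informative errors. Never give up. As long as it is a mistake, it can be fixed, unless it is an Error by nature... Learn to attack a problem from different angles; there are always more approaches to try.

So there are two ways to deal with the port problem:

Open the ports required by the default configuration one by one, following the official documentation.
Open the range 1/65535 directly, which opens every port and may cause security problems.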
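For the first approach, it helps to verify from one node that a given port on another node is actually reachable, instead of discovering it through HBase errors. The sketch below uses bash's built-in /dev/tcp redirection. port_open and check_ports are hypothetical helper names. Ports 50010 and 50020 are the ones named above, 9000 and 16010 come from the configuration and URL used in this post, and 2181 and 16020 are the ZooKeeper client and RegionServer defaults, assumed unchanged. A port reports as open only if the security group allows it and a process is listening on it.

```shell
#!/usr/bin/env bash
# port_open (hypothetical helper): succeed if a TCP connection to host:port
# can be established within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# check_ports (hypothetical helper): report each port on a host and return
# non-zero if any of them is unreachable.
check_ports() {
  local host="$1"; shift
  local port failed=0
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port CLOSED"
      failed=1
    fi
  done
  return "$failed"
}

# Usage, run from master while the cluster daemons are up:
#   check_ports slave1 50010 50020 16020 2181
#   check_ports slave2 50010 50020 16020 2181
```

Run it in both directions (master to slaves and slaves to master), since a security-group rule may allow one direction and not the other.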

8 Starting HBase with the external ZooKeeper

8.1 Starting the components

Start order: hdfs --> zookeeper --> hbase

# master
start-dfs.sh
zkServer.sh start
# slave1, slave2
zkServer.sh start
# master
start-hbase.sh
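The start order above can be captured in one script run from master, so the sequence is never accidentally reversed. This is a sketch under assumptions: passwordless ssh to every ZooKeeper host, and /etc/profile on each host putting zkServer.sh on the PATH. start_cluster and run are hypothetical names. DRY_RUN=1 prints the commands without executing them.

```shell
#!/usr/bin/env bash
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# start_cluster (hypothetical wrapper): start HDFS, then ZooKeeper on every
# host passed as an argument, then HBase. Stops at the first failure.
start_cluster() {
  local host
  run start-dfs.sh || return 1                 # 1. HDFS
  for host in "$@"; do                         # 2. ZooKeeper quorum members
    # Non-interactive ssh does not read /etc/profile, so source it explicitly.
    run ssh "$host" "source /etc/profile; zkServer.sh start" || return 1
  done
  run start-hbase.sh                           # 3. HBase
}

# Usage on master:
#   DRY_RUN=1 start_cluster master slave1 slave2   # review the commands
#   start_cluster master slave1 slave2
```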

8.2 Checking that all processes have started on each node

[root@master ~]# jps
17136 HMaster
14532 QuorumPeerMain
16885 SecondaryNameNode
16695 NameNode
17324 Jps

[root@slave1 ~]# jps
11138 HRegionServer
11475 Jps
9479 QuorumPeerMain
11015 DataNode
11225 HMaster

[root@slave2 ~]# jps
5923 DataNode
6216 Jps
5288 QuorumPeerMain
6040 HRegionServer
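Comparing jps output against the role table in section 1.2 by eye is error-prone, especially since NameNode is a substring of SecondaryNameNode. A minimal sketch of an automated check follows; missing_daemons is a hypothetical helper that reads jps output on standard input and prints every expected daemon that is absent.

```shell
#!/usr/bin/env bash
# missing_daemons (hypothetical helper): read `jps` output on stdin and print
# each expected daemon name (given as arguments) that does not appear in it.
missing_daemons() {
  local jps_output expected
  jps_output=$(cat)
  for expected in "$@"; do
    # jps prints "<pid> <main class>"; compare the second column exactly so
    # that "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF -- "$expected"; then
      echo "$expected"
    fi
  done
}

# Usage, following the role assignment in section 1.2:
#   master: jps | missing_daemons NameNode SecondaryNameNode HMaster QuorumPeerMain
#   slave1: jps | missing_daemons DataNode HMaster HRegionServer QuorumPeerMain
#   slave2: jps | missing_daemons DataNode HRegionServer QuorumPeerMain
```

Empty output means every expected daemon is running on that node.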

9 Accessing HBase

9.1 Through the HBase shell

Enter the HBase shell:

hbase shell

Typing list in the HBase shell gives the following result:

hbase(main):001:0> list
TABLE
0 row(s) in 0.3050 seconds

=> []

9.2 Through the web UI

Open http://master:16010 in a browser (replace the host with the actual IP as needed) to see the HBase Master status page.

10 Setup complete

With that, the fully distributed HBase cluster on three Alibaba Cloud servers is up and running!
