Hadoop HA

What is HA?

HA is short for High Availability. An HA (dual-machine) cluster is a high-availability cluster, an effective way to guarantee business continuity. It usually has two or more nodes, divided into active and standby nodes. The node currently serving the workload is called the active node, and its backup is called the standby node. When the active node fails and the running workload can no longer proceed, the standby node detects this and immediately takes over, so that service continues with no interruption, or only a brief one.
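The takeover logic described above can be sketched in a few lines of Python (a toy illustration, not Hadoop code; the `Node` class and the heartbeat callback are invented for the example):

```python
# Toy sketch of active/standby failover: the standby watches the
# active node's heartbeat and takes over the moment it stops.
class Node:
    def __init__(self, name):
        self.name = name
        self.state = "standby"
        self.alive = True

def monitor_and_failover(active, standby, heartbeat_ok):
    """Standby checks the active's heartbeat; on failure it takes over."""
    if not heartbeat_ok(active):
        active.state = "standby"   # demote (fence) the failed node
        standby.state = "active"   # standby takes over the workload
    return standby.state

nn1, nn2 = Node("nn1"), Node("nn2")
nn1.state = "active"
nn1.alive = False                  # simulate a crash of the active node
monitor_and_failover(nn1, nn2, heartbeat_ok=lambda n: n.alive)
print(nn2.state)                   # → active
```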

Introduction to the Hadoop HA mechanism

The HA mechanism in Hadoop 2.0 runs two NameNodes: an active NameNode (state active) and a standby NameNode (state standby). The two can swap states, but they can never both be active at the same time; at most one is active. Only the active NameNode serves clients; the standby NameNode does not. The active and standby NameNodes synchronize data through NFS or through JournalNodes (JN, the QJM approach).
The active NameNode writes recent operations to a local edits file and also ships them to NFS or the JournalNodes. The standby NameNode periodically checks, reads the latest edits from NFS or the JournalNodes, and merges the edits file with the fsimage file into a new fsimage. When the merge finishes, it notifies the active NameNode to fetch the new fsimage, and the active NameNode replaces its old fsimage with the new one.
This keeps the data of the active and standby NameNodes in near-real-time sync, so the standby NameNode can take over as active at any time (for example, when the active NameNode dies). The standby also performs the job that the SecondaryNameNode, CheckpointNode, and BackupNode did in Hadoop 1.0: merging the edits file into the fsimage so the fsimage stays current. Once the Hadoop 2.0 HA mechanism is enabled, the SecondaryNameNode, CheckpointNode, and BackupNode are therefore no longer needed.
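The checkpoint step described above, replaying the edits file onto the last fsimage to produce a fresh fsimage, can be illustrated with a toy Python sketch (the dict-based "fsimage" and the operation tuples are invented for illustration; real fsimage and edits files are binary):

```python
# Toy model of the standby's checkpoint: replay logged namespace
# operations (the "edits") on top of the last fsimage snapshot.
def apply_edits(fsimage, edits):
    """Return a new fsimage with each edit-log operation applied in order."""
    image = dict(fsimage)
    for op, path in edits:
        if op == "create":
            image[path] = {}
        elif op == "delete":
            image.pop(path, None)
    return image

fsimage = {"/data": {}}
edits = [("create", "/data/a.txt"), ("create", "/tmp"), ("delete", "/tmp")]
new_fsimage = apply_edits(fsimage, edits)
print(sorted(new_fsimage))  # → ['/data', '/data/a.txt']
```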

Building a Hadoop HA Cluster

Environment

linux: CentOS-7.5_x64
hadoop: hadoop-3.2.0
zookeeper: zookeeper-3.4.10

Machine plan

Hostname IP Installed software Running processes
node-1 192.168.91.11 hadoop NameNode,ResourceManager,DFSZKFailoverController
node-2 192.168.91.12 hadoop,zookeeper NameNode,ResourceManager,QuorumPeerMain,DFSZKFailoverController
node-3 192.168.91.13 hadoop,zookeeper QuorumPeerMain,DataNode,NodeManager,JournalNode
node-4 192.168.91.14 hadoop,zookeeper QuorumPeerMain,DataNode,NodeManager,JournalNode

Prerequisites

All four machines need passwordless SSH login to each other; node-2, node-3, and node-4 need ZooKeeper and a Java environment installed.

Cluster setup

# Download
$ wget http://mirrors.shu.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz

# Extract
$ tar -zxvf hadoop-3.2.0.tar.gz

# Configure system environment variables
$ vim /etc/profile

export JAVA_HOME=/usr/local/jdk1.8.0_191
export PATH=$PATH:$JAVA_HOME/bin

export HADOOP_HOME=/export/servers/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin


# Enter the extracted directory to edit the configuration files
$ cd $HADOOP_HOME

# Edit hadoop-env.sh and add the settings below (startup fails without them)
$ vim etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/local/jdk1.8.0_191

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_ZKFC_USER=root

export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root



# Configure core-site.xml
$ vim etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <!-- HA: set the HDFS nameservice to ns -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns</value>
    </property>

    <!-- HA: specify the ZooKeeper quorum -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>node-2:2181,node-3:2181,node-4:2181</value>
    </property>

    <!-- Base directory for Hadoop temporary files -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/export/data/hadoop/temp</value>
    </property>

    <!-- Allow the root user to proxy requests from any host and any group -->
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>

    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>

</configuration>



# Configure hdfs-site.xml
$ vim etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <!-- Set the HDFS nameservice to ns; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>

    <!-- The ns nameservice has two NameNodes, nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn1,nn2</value>
    </property>

    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn1</name>
        <value>node-1:9000</value>
    </property>

    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>node-1:50070</value>
    </property>

    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>node-2:9000</value>
    </property>

    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn2</name>
        <value>node-2:50070</value>
    </property>

    <!-- Where the NameNode's shared edits metadata is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node-3:8485;node-4:8485/ns</value>
    </property>

    <!-- Where the JournalNodes store their data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/export/data/hadoop/journaldata</value>
    </property>

    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- Failover proxy class used by clients; the property suffix must match the nameservice -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing methods; list multiple methods one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>

    <!-- sshfence requires passwordless SSH login -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <!-- Connect timeout for the sshfence mechanism -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>

</configuration>
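The failover proxy provider configured above tells HDFS clients how to find the active NameNode behind the logical URI hdfs://ns: roughly, the client tries the configured NameNodes in turn and retries against the next one when the current one is not active. A toy Python sketch of that idea (the function names and the `StandbyError` type are invented; this is not the real client code):

```python
# Toy model of client-side NameNode failover: try each configured
# NameNode until one accepts the call as the active node.
class StandbyError(Exception):
    """Raised when the contacted NameNode is standby or unreachable."""

def call_with_failover(namenodes, rpc):
    """Invoke rpc() against each NameNode in order until one succeeds."""
    last = None
    for nn in namenodes:
        try:
            return rpc(nn)
        except StandbyError as e:
            last = e          # this NameNode is standby/down; try the next
    raise last

def fake_rpc(nn):
    if nn != "node-2:9000":   # pretend only nn2 is currently active
        raise StandbyError(nn)
    return "ok from " + nn

print(call_with_failover(["node-1:9000", "node-2:9000"], fake_rpc))
# → ok from node-2:9000
```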

# Configure mapred-site.xml
$ vim etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>

</configuration>


# Configure yarn-site.xml
$ vim etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->

    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- Cluster id of the RM pair -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-ha</value>
    </property>

    <!-- Logical ids of the RMs -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- Host of each RM -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node-1</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node-2</value>
    </property>

    <!-- ZooKeeper quorum address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node-2:2181,node-3:2181,node-4:2181</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

</configuration>


# Configure the workers file (DataNode hosts)
$ vim $HADOOP_HOME/etc/hadoop/workers

node-3
node-4

# Copy hadoop to the other nodes (node-2, node-3, node-4)
$ scp -r hadoop-3.2.0 root@node-2:/xxx/xxx/

HDFS HA setup

# Start the ZooKeeper cluster (on node-2, node-3, node-4)
$ $ZOOKEEPER_HOME/bin/zkServer.sh start

# Check ZooKeeper status
$ $ZOOKEEPER_HOME/bin/zkServer.sh status

# Start the JournalNode daemons (run on node-3 and node-4)
$ hdfs --daemon start journalnode

# Format the HA state in ZooKeeper
$ hdfs zkfc -formatZK

# Format the cluster's NameNode (run on node-1)
$ hdfs namenode -format

# Start the freshly formatted NameNode (run on node-1)
$ hdfs --daemon start namenode

# Sync NameNode1's metadata to NameNode2 (run on node-2)
$ hdfs namenode -bootstrapStandby

# Start NameNode2 (run on node-2)
$ hdfs --daemon start namenode

# Start all DataNodes in the cluster (run on node-1)
$ sbin/start-dfs.sh

# Start the ZKFC processes (run on both node-1 and node-2)
$ hdfs --daemon start zkfc

# Verify HA (stop the namenode process on node-1; the standby should become active)
$ hdfs --daemon stop namenode
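Behind the automatic failover verified above, the two ZKFC processes compete for an ephemeral lock znode in ZooKeeper: whichever creates it first makes its NameNode active, and when that ZKFC's session dies the znode vanishes and the other ZKFC wins the next election. A toy Python sketch of that election (a plain dict stands in for ZooKeeper; the function names are invented, not the real ZKFC API):

```python
# Toy model of the ZKFC election: creating the ephemeral lock znode
# wins; losing the session deletes the znode and frees the lock.
zk = {}  # stands in for the ZooKeeper namespace

def try_become_active(zk, lock_path, who):
    """Attempt to create the lock znode; winner's NameNode goes active."""
    if lock_path in zk:
        return "standby"      # someone else already holds the lock
    zk[lock_path] = who       # "created" the ephemeral znode
    return "active"

def session_expired(zk, lock_path):
    """An ephemeral znode disappears when its owner's session dies."""
    zk.pop(lock_path, None)

print(try_become_active(zk, "/hadoop-ha/ns/lock", "nn1"))  # → active
print(try_become_active(zk, "/hadoop-ha/ns/lock", "nn2"))  # → standby
session_expired(zk, "/hadoop-ha/ns/lock")                  # nn1 crashes
print(try_become_active(zk, "/hadoop-ha/ns/lock", "nn2"))  # → active
```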

ResourceManager HA setup

# Start YARN on RM1 (run on node-1)
$ yarn --daemon start resourcemanager

# Start YARN on RM2 (run on node-2)
$ yarn --daemon start resourcemanager

# From any node, query the ResourceManager state (expect: active)
$ yarn rmadmin -getServiceState rm1

# From any node, query the ResourceManager state (expect: standby)
$ yarn rmadmin -getServiceState rm2

# Verify YARN HA (run on node-1); the standby resourcemanager should become active
$ yarn --daemon stop resourcemanager

# From any node, query the ResourceManager state again (expect: active)
$ yarn rmadmin -getServiceState rm2

Summary

I ran into all kinds of problems while building the Hadoop HA cluster; the steps above have all been verified. If you hit problems during installation, feel free to leave a comment. Thanks!
