QJM HA Configuration Based on Cloudera CDH5 Beta2

Prerequisites:
JDK 1.7.0_45 or later
Software download: http://archive.cloudera.com/cdh5/cdh/5/
A 64-bit OS is required; I used CentOS 6.5.
Environment setup
Edit /etc/profile as root:
export JAVA_HOME=/usr/java/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/usr/hadoop/chd5
export HADOOP_PID_DIR=/usr/hadoop/hadoop_pid_dir
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=/usr/hadoop/chd5
export ZOOKEEPER_HOME=/usr/hadoop/zookeeper
export PATH=${JAVA_HOME}/bin:${ZOOKEEPER_HOME}/bin:$PATH
Run source /etc/profile to reload the variables and make them take effect.
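A quick sanity check that the variables took effect (a minimal sketch, assuming the JDK and the CDH5 tarball were unpacked to the paths above):

source /etc/profile
echo $HADOOP_HOME               # should print /usr/hadoop/chd5
java -version                   # should report 1.7.0_45
$HADOOP_HOME/bin/hadoop version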
Role assignment:
master:  192.168.1.10   roles: namenode, JournalNode
master2: 192.168.1.9    roles: namenode, JournalNode
slave1:  192.168.1.11   roles: datanode, JournalNode
slave2:  192.168.1.12   roles: datanode
slave3:  192.168.1.13   roles: datanode
ZooKeeper runs on all five nodes (see zoo.cfg below).
core-site.xml configuration》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
<configuration>
        <property>  
                <name>fs.defaultFS</name>  
                <value>hdfs://cluster</value> 
<!-- fs.defaultFS is the logical name of the HDFS path. Since we start two NameNodes at different
addresses, users would otherwise have to change their paths after a failover. Using one logical
name means clients do not need to worry about path changes caused by NameNode failover. -->

        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/usr/hadoop/tmp</value> <!-- create the tmp directory under /usr/hadoop -->
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>dfs.name.dir</name>
                <value>/usr/hadoop/hdfs/name</value> <!-- create the hdfs directory under /usr/hadoop, with name and data subdirectories -->
        </property>
        <property>  
                <name>fs.trash.interval</name>  
                <value>10080</value>  
         </property>
         <property>  
                <name>fs.trash.checkpoint.interval</name>  
                <value>10080</value>  
         </property> 
         <property>  
                <name>topology.script.file.name</name>  
                <value>/usr/hadoop/chd5/etc/hadoop/rack.py</value> <!-- rack.py is the rack-awareness script -->
         </property>
         <property>  
                <name>topology.script.number.args</name>  
                <value>6</value>  
         </property>
         <property>   
                <name>hadoop.native.lib</name>   
                <value>false</value>   
                <description>Should native hadoop libraries, if present, be used.</description>   
         </property> 
         <property>  
                <name>hadoop.security.group.mapping</name>  
                <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value> 
         </property>      

         <property>  
                <name>hadoop.proxyuser.hadoop.hosts</name>  
                <value>*</value>  
         </property>  

         <property>  
                <name>hadoop.proxyuser.hadoop.groups</name>  
                <value>*</value>  
         </property>  
</configuration>
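As the fs.defaultFS comment above explains, clients only ever address the logical name cluster, not a particular NameNode, so commands keep working after a failover. For example, once HDFS is running:

hdfs dfs -mkdir -p hdfs://cluster/tmp/ha-test
hdfs dfs -ls hdfs://cluster/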
hdfs-site.xml configuration》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
<configuration>
        <property>  
               <name>dfs.replication</name>  
               <value>3</value>  
        </property>
        <property>  
                <name>dfs.blocksize</name>  
                <value>16m</value> 
         </property>

         <property>
                <name>dfs.data.dir</name>
                <value>/usr/hadoop/hdfs/data</value>
        </property>
        <property>
                <name>dfs.nameservices</name>
                <value>cluster</value>
        </property>
        <property>
                <name>dfs.ha.namenodes.cluster</name>
                <value>master,master2</value>
        </property>
        <property>
                <name>dfs.namenode.rpc-address.cluster.master</name>
                <value>master:9000</value>
         </property>
         <property>
                <name>dfs.namenode.rpc-address.cluster.master2</name>
                <value>master2:9000</value>
         </property>
         <property>  
                <name>dfs.namenode.http-address.cluster.master</name>  
                <value>master:50070</value>  
          </property>
         <property>
                <name>dfs.namenode.http-address.cluster.master2</name>
                <value>master2:50070</value>
          </property>

          <property>  
                <name>dfs.namenode.secondary.http-address.cluster.master</name>  
                <value>master:50090</value>  
          </property>
          <property>
                <name>dfs.namenode.secondary.http-address.cluster.master2</name>
                <value>master2:50090</value>
           </property>
           <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://master:8485;master2:8485;slave1:8485/cluster</value>
        </property>
        <property>
                <name>dfs.client.failover.proxy.provider.cluster</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
         <property>
                 <name>ha.zookeeper.quorum</name>
                 <value>master:2181,slave1:2181,slave2:2181,slave3:2181,master2:2181</value>
         </property>
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>sshfence</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/hadoop/.ssh/id_rsa</value>
        </property>
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/usr/hadoop/tmp/journal</value>
        </property>
        <property>
           <name>dfs.ha.automatic-failover.enabled</name>
              <value>true</value>
        </property>
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
        <property>  
                <name>dfs.webhdfs.enabled</name>  
                <value>true</value>  
        </property>

        <property>  
                <name>dfs.datanode.max.xcievers</name>  
                <value>1000000</value>  
        </property> 

        <property>  
                <name>dfs.balance.bandwidthPerSec</name>  
                <value>104857600</value>  
                <description>  
                      Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose, in bytes per second.
                </description>  
       </property> 
       <property>  
                <name>dfs.hosts.exclude</name>  
                <value>/usr/hadoop/chd5/etc/hadoop/excludes</value>  
                <description>
                        Names a file that contains a list of hosts that are
                        not permitted to connect to the namenode. The full pathname of the
                        file must be specified. If the value is empty, no hosts are excluded.
                </description>
        </property>
</configuration>
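Because dfs.ha.fencing.methods is sshfence with the hadoop user's private key, each NameNode must be able to SSH to the other without a password. A minimal setup sketch, run as the hadoop user on both master and master2 (hostnames follow the role table above):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id hadoop@master
ssh-copy-id hadoop@master2
ssh master2 hostname        # should answer without a password prompt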
mapred-site.xml configuration》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
<configuration>
        <property>
                <name>mapreduce.framework.name</name>  
                <value>yarn</value>  
        </property> 
        <property>  
                <name>mapreduce.jobhistory.address</name>  
                <value>master:10020</value>  
        </property>
        <property>  
                <name>mapreduce.jobhistory.webapp.address</name>  
                <value>master:19888</value>  
        </property>
        <property>  
                <name>mapreduce.output.fileoutputformat.compress</name>  
                <value>true</value>  
        </property>  
        <property>  
                <name>mapreduce.output.fileoutputformat.compress.type</name>  
                <value>BLOCK</value>  
        </property>  
        <property>  
                <name>mapreduce.output.fileoutputformat.compress.codec</name>  
                <value>org.apache.hadoop.io.compress.SnappyCodec</value>  
        </property>  
        <property>  
                <name>mapreduce.map.output.compress</name>  
                <value>true</value>  
        </property>  
        <property>  
                 <name>mapreduce.map.output.compress.codec</name>  
                 <value>org.apache.hadoop.io.compress.SnappyCodec</value>  
        </property>
</configuration>
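Both map output and final job output are compressed with SnappyCodec above, so the native snappy library must be loadable on every node, otherwise jobs fail at compression time. A quick check (checknative is available in recent Hadoop/CDH builds; the path below assumes the default lib/native layout):

hadoop checknative -a                        # snappy should be reported as true
ls $HADOOP_HOME/lib/native | grep -i snappy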
yarn-site.xml configuration》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
<configuration>
        <property>
                <name>mapreduce.framework.name</name>  
                <value>yarn</value>  
        </property> 
        <property>  
                <name>mapreduce.jobhistory.address</name>  
                <value>master:10020</value>  
        </property>
        <property>  
                <name>mapreduce.jobhistory.webapp.address</name>  
                <value>master:19888</value>  
        </property>
        <property>  
                <name>mapreduce.output.fileoutputformat.compress</name>  
                <value>true</value>  
        </property>  
        <property>  
                <name>mapreduce.output.fileoutputformat.compress.type</name>  
                <value>BLOCK</value>  
        </property>  
        <property>  
                <name>mapreduce.output.fileoutputformat.compress.codec</name>  
                <value>org.apache.hadoop.io.compress.SnappyCodec</value>  
        </property>  
        <property>  
                <name>mapreduce.map.output.compress</name>  
                <value>true</value>  
        </property>  
        <property>  
                 <name>mapreduce.map.output.compress.codec</name>  
                 <value>org.apache.hadoop.io.compress.SnappyCodec</value>  
        </property>
</configuration>
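The block above repeats the MapReduce properties from mapred-site.xml. For YARN itself, a minimal yarn-site.xml for this layout would typically at least point NodeManagers at the ResourceManager and enable the MapReduce shuffle service. The following is only a sketch with assumed values (ResourceManager on master), not the original cluster's file:

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>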
rack.py》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
#!/usr/bin/env python

import sys,os,time

# rack.data sits next to this script; each line lists one or more hosts/IPs
# followed by the rack they belong to.
pwd = os.path.realpath( __file__ )
rack_file = os.path.dirname(pwd) + "/rack.data"

# Build a host -> rack map from rack.data, skipping lines with fewer than two fields.
rack_list = [ l.strip().split() for l in open(rack_file).readlines() if len(l.strip().split()) > 1 ]
rack_map = {}
for item in rack_list:
        for host in item[:-1]:
                rack_map[host] = item[-1]
# Fall back to /default/rack for hosts that rack.data does not mention.
rack_map['default'] = 'default' in rack_map and rack_map['default'] or '/default/rack'
# The NameNode passes hostnames/IPs as arguments; answer with one rack per argument.
rack_result = [av in rack_map and rack_map[av] or rack_map['default'] for av in sys.argv[1:]]
#print rack_map, rack_result
print ' '.join( rack_result )

# Log each invocation for debugging.
f = open('/tmp/rack.log','a+')
f.writelines( "[%s] %s\n" % (time.strftime("%F %T"),str(sys.argv)))
f.close()
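rack.py reads a rack.data file in the same directory; each line lists one or more hostnames/IPs followed by the rack path, and unknown hosts fall back to /default/rack. A possible rack.data for this cluster (the rack split here is an assumption, adjust to your actual racks):

master  192.168.1.10  /dc1/rack1
master2 192.168.1.9   /dc1/rack1
slave1  192.168.1.11  /dc1/rack2
slave2  192.168.1.12  /dc1/rack2
slave3  192.168.1.13  /dc1/rack2
default /default/rack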
zookeeper zoo.cfg configuration》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》》
Create a data directory under the zookeeper folder to hold ZooKeeper's data, and in the data directory create a file named myid containing 1, 2, 3, 4, or 5 according to the server id, matching server.1, server.2, ... in the config (see the example after the config below).
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/hadoop/zookeeper/data
# the port at which the clients will connect
clientPort=2181
server.1=master:2888:3888
server.2=master2:2888:3888   
server.3=slave1:2888:3888 
server.4=slave2:2888:3888
server.5=slave3:2888:3888      
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
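For example, on master (server.1 in the list above) the data directory and myid file can be created like this; repeat on every node with its own id:

mkdir -p /usr/hadoop/zookeeper/data
echo 1 > /usr/hadoop/zookeeper/data/myid    # use 2 on master2, 3 on slave1, 4 on slave2, 5 on slave3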

Execute the following commands in strict order; the sequence must not be reversed!

1. On master, master2, and slave1, run: hadoop-daemon.sh start journalnode

2. Start ZooKeeper on all 5 nodes by running: zkServer.sh start

3. On master, run: hdfs namenode -format

   Then on master, run: hadoop-daemon.sh start namenode

4. On master2, run: hdfs namenode -bootstrapStandby

   Then on master2, run: hadoop-daemon.sh start namenode

At this point both 192.168.1.9:50070 and 192.168.1.10:50070 show the standby state.

5. Format the ZooKeeper HA znode by running:

hdfs zkfc -formatZK

6. On either NameNode node, run start-dfs.sh; one of the NameNodes will change from standby to active while the other remains standby.
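To confirm which NameNode became active, query the state with haadmin (the service ids master and master2 come from dfs.ha.namenodes.cluster above):

hdfs haadmin -getServiceState master
hdfs haadmin -getServiceState master2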

If you have different opinions or suggestions, please leave your valuable comments so I can correct the article in time and we can improve together.
