CDH5 supports quite a few new features, so I decided to upgrade the current CDH 4.5 cluster to CDH5. The software layout is still based on the existing CDH 4.5 cluster.
IP             Host   Role        Packages
192.168.1.10   U-1    (Active)    hadoop-yarn-resourcemanager hadoop-hdfs-namenode hadoop-mapreduce-historyserver hadoop-yarn-proxyserver hadoop-hdfs-zkfc
192.168.1.20   U-2                hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce journalnode zookeeper zookeeper-server
192.168.1.30   U-3                hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce journalnode zookeeper zookeeper-server
192.168.1.40   U-4                hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce journalnode zookeeper zookeeper-server
192.168.1.50   U-5                hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
192.168.1.70   U-7    (Standby)   hadoop-yarn-resourcemanager hadoop-hdfs-namenode hadoop-hdfs-zkfc
Note: because this is an upgrade from CDH 4.5 to CDH5, the table above does not list every package on each host. Some were already installed under CDH 4.5, so the table only lists the packages that have to be reinstalled during the upgrade.
The procedure is as follows:
1 Back Up Configuration Data and Stop Services
1 Put the NameNode into safe mode and save the fsimage
su - hdfs
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
2 Stop all Hadoop services in the cluster
for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done
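The loop above stops everything that matches hadoop-* under /etc/init.d. As a minimal sketch, the same loop can be wrapped in a function that supports a dry run, so the commands can be previewed before anything is stopped. The function name stop_hadoop_services and the DRY_RUN variable are made up for this example and are not part of CDH.

```shell
#!/bin/sh
# Hypothetical helper: stop every init script named hadoop-* in a directory.
# With DRY_RUN=1 it only prints the commands it would run.
stop_hadoop_services() {
  initdir="${1:-/etc/init.d}"
  for path in "$initdir"/hadoop-*; do
    [ -e "$path" ] || continue          # the glob matched nothing
    svc=$(basename "$path")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "service $svc stop"
    else
      sudo service "$svc" stop
    fi
  done
}
```

Running DRY_RUN=1 stop_hadoop_services first shows which services would be stopped on the current host; calling it without DRY_RUN actually stops them.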
2 Back up the HDFS Metadata
1 Find dfs.namenode.name.dir
grep -C1 name.dir /etc/hadoop/conf/hdfs-site.xml
2 Back up the directory that dfs.namenode.name.dir points to
tar czvf dfs.namenode.name.dir.tgz /data
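The two steps above (look up the directory, then archive it) can be sketched as two small functions. This is only an illustration: the names get_hadoop_prop and backup_dir are hypothetical, and the sketch assumes dfs.namenode.name.dir holds a single local path (as it does in this cluster, /data) rather than a comma-separated list or a file:// URI.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of one property in a Hadoop XML file.
# Works whether <name> and <value> are on the same line or on separate lines.
get_hadoop_prop() {
  file="$1"
  name_re=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for the regex
  { tr '\n' ' ' < "$file"; echo; } |
    sed -n "s:.*<name>${name_re}</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p"
}

# Hypothetical helper: archive a directory, then list the archive to make
# sure it is readable before the upgrade continues.
backup_dir() {
  src="$1"
  archive="$2"
  tar czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    tar tzf "$archive" > /dev/null
}
```

For example, name_dir=$(get_hadoop_prop /etc/hadoop/conf/hdfs-site.xml dfs.namenode.name.dir) followed by backup_dir "$name_dir" /root/dfs.namenode.name.dir.tgz produces the same kind of archive as the tar command above, with a readability check added. The output path is just an example.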
3 Uninstall the CDH 4 Version of Hadoop
1 Uninstall the Hadoop components
apt-get remove bigtop-utils bigtop-jsvc bigtop-tomcat sqoop2-client hue-common
2 Remove the CDH4 repository files
mv /etc/apt/sources.list.d/cloudera-cdh4.list /root/
4 Download the Latest Version of CDH 5
1 Download the CDH5 repository package
wget 'http://archive.cloudera.com/cdh5/one-click-install/precise/amd64/cdh5-repository_1.0_all.deb'
2 Install the CDH5 repository
dpkg -i cdh5-repository_1.0_all.deb
curl -s http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | apt-key add -
5 Install CDH 5 with YARN
1 Install ZooKeeper
2 Install the relevant components on each host
1 Resource Manager host
apt-get install hadoop-yarn-resourcemanager
2 NameNode host(s)
apt-get install hadoop-hdfs-namenode
3 All cluster hosts except the Resource Manager
apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
4 One host in the cluster (Active NameNode)
apt-get install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
5 All client hosts
apt-get install hadoop-client
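Since each host gets a different package set, it is easy to install the wrong role on a machine. A minimal sketch, assuming the host names and roles from the table at the top of this post: the function packages_for_host is hypothetical and only covers the YARN and HDFS packages installed in this step (ZKFC, JournalNode and ZooKeeper are left out).

```shell
#!/bin/sh
# Hypothetical helper: map a host name from the layout table to the
# packages this step installs on it.
packages_for_host() {
  case "$1" in
    U-1)
      echo "hadoop-yarn-resourcemanager hadoop-hdfs-namenode hadoop-mapreduce-historyserver hadoop-yarn-proxyserver"
      ;;
    U-7)
      echo "hadoop-yarn-resourcemanager hadoop-hdfs-namenode"
      ;;
    U-2|U-3|U-4|U-5)
      echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce"
      ;;
    *)
      echo "unknown host: $1" >&2
      return 1
      ;;
  esac
}
```

On a given machine this could be used as apt-get install $(packages_for_host "$(hostname)"), assuming the output of hostname matches the names in the table.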
6 Install CDH 5 with MRv1
CDH5 already promotes YARN as the primary framework, so we no longer use MRv1 and do not install it.
7 In an HA Deployment, Upgrade and Start the Journal Nodes
1 Install the JournalNodes
apt-get install hadoop-hdfs-journalnode
2 Start the JournalNodes
service hadoop-hdfs-journalnode start
8 Upgrade the HDFS Metadata
The upgrade procedure differs between HA and non-HA deployments. Our CDH 4.5 cluster ran in HA mode, so we follow the HA procedure.
1 Run on the active NameNode
service hadoop-hdfs-namenode upgrade
2 Restart the standby NameNode
su - hdfs
hdfs namenode -bootstrapStandby
exit
service hadoop-hdfs-namenode start
3 Start the DataNodes
service hadoop-hdfs-datanode start
4 Check the version
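The version check step has no command attached. One way to do it, as a sketch: hadoop version prints a first line of the form "Hadoop <version>", and on a CDH5 node that version string contains "-cdh5." (for example "Hadoop 2.3.0-cdh5.0.0"; the exact number depends on the release you installed). The function name is_cdh5_version is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop version` output on stdin and succeed
# only if the first line names a CDH5 build.
is_cdh5_version() {
  first_line=$(head -n 1)
  case "$first_line" in
    Hadoop\ *-cdh5.*)
      echo "$first_line"
      return 0
      ;;
    *)
      echo "not a CDH5 build: $first_line" >&2
      return 1
      ;;
  esac
}
```

Usage on each node: hadoop version | is_cdh5_version. Running hdfs dfsadmin -report as the hdfs user is a complementary check that the DataNodes have registered with the upgraded NameNode.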
9 Start YARN
1 Create the required directories
su - hdfs
hadoop fs -mkdir -p /user/history
hadoop fs -chmod -R 1777 /user/history
hadoop fs -chown yarn /user/history
hadoop fs -mkdir -p /var/log/hadoop-yarn
hadoop fs -chown yarn:mapred /var/log/hadoop-yarn
hadoop fs -ls -R /
2 Start the relevant services on the hosts of the Hadoop cluster
service hadoop-yarn-resourcemanager start
service hadoop-yarn-nodemanager start
service hadoop-mapreduce-historyserver start
10 Configure NameNode HA
1 NameNode HA is deployed the same way as under CDH 4.5. The only change is in yarn-site.xml, where mapreduce.shuffle has to be changed to mapreduce_shuffle.
2 Verify
11 Configure YARN HA
1 Stop all YARN daemons
service hadoop-yarn-nodemanager stop
service hadoop-yarn-resourcemanager stop
service hadoop-mapreduce-historyserver stop
2 Update the configuration used by the ResourceManagers, NodeManagers and clients
Below is the configuration on U-1. The three files core-site.xml, hdfs-site.xml and mapred-site.xml need no changes; the only file that has to be modified is yarn-site.xml.
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster/</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>U-2:2181,U-3:2181,U-4:2181</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data01,/data02</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- HA Config -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>U-1,U-7</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.U-1</name>
    <value>U-1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.U-7</name>
    <value>U-7:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.U-1</name>
    <value>U-1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.U-7</name>
    <value>U-7:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://U-2:8485;U-3:8485;U-4:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/jdata</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>U-1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>U-1:19888</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Resource Manager Configs -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-rm-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>U-1,U-7</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>U-1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>U-2:2181,U-3:2181,U-4:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>U-1:2181</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <!-- RM1 configs -->
  <property>
    <name>yarn.resourcemanager.address.U-1</name>
    <value>U-1:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.U-1</name>
    <value>U-1:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.U-1</name>
    <value>U-1:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.U-1</name>
    <value>U-1:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.U-1</name>
    <value>U-1:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.U-1</name>
    <value>U-1:23141</value>
  </property>
  <!-- RM2 configs -->
  <property>
    <name>yarn.resourcemanager.address.U-7</name>
    <value>U-7:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.U-7</name>
    <value>U-7:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.U-7</name>
    <value>U-7:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.U-7</name>
    <value>U-7:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.U-7</name>
    <value>U-7:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.U-7</name>
    <value>U-7:23141</value>
  </property>
  <!-- Node Manager Configs -->
  <property>
    <description>Address where the localizer IPC is.</description>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:23344</value>
  </property>
  <property>
    <description>NM Webapp address.</description>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:23999</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/yarn/log</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>23080</value>
  </property>
</configuration>
Note: after copying yarn-site.xml to U-7, change the value of yarn.resourcemanager.ha.id in U-7's copy to U-7; otherwise the ResourceManager on U-7 will not start.
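The per-host edit described in the note is easy to forget when the file is copied around. As a rough sketch, the function below prints a copy of yarn-site.xml with yarn.resourcemanager.ha.id set to a given id and leaves every other property alone. The name set_rm_ha_id is hypothetical, and the id is assumed to contain no characters that are special to sed (U-1 and U-7 are fine).

```shell
#!/bin/sh
# Hypothetical helper: print yarn-site.xml with yarn.resourcemanager.ha.id
# replaced by the given id. The input file is not modified.
set_rm_ha_id() {
  file="$1"
  id="$2"
  # The first five expressions read the whole file into the pattern space,
  # so the match works whether <name> and <value> share a line or not.
  sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
    -e "s#\(<name>yarn\.resourcemanager\.ha\.id</name>[[:space:]]*<value>\)[^<]*#\1${id}#" \
    "$file"
}
```

For example, on U-1: set_rm_ha_id /etc/hadoop/conf/yarn-site.xml U-7 > /tmp/yarn-site.U-7.xml, then copy the result to U-7 as /etc/hadoop/conf/yarn-site.xml. The /tmp path is only an example.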
3 Start all YARN daemons
service hadoop-yarn-resourcemanager start
service hadoop-yarn-nodemanager start
4 Verify
What on earth is this? It cannot find the corresponding ZKFC address?
Today I experimented with the YARN HA mechanism again and found the following explanation on the official mailing list:
Right now, RM HA does not use ZKFC. So, we can not use this command "yarn rmadmin -failover rm1 rm2" now. If you use the default HA configuration, you set up a Automatic RM HA. In order to failover manually, you have two options:
set up manual RM HA by set the configuration "yarn.resourcemanager.ha.automatic-failover.enable" as false. Then you can use command "yarn rmadmin -transitionToActive rm1", "yarn rmadmin -transitionToStandby rm2" to control which rm goes to active by yourself.
If you really want to experiment the manual failover when automatic failover enabled, you can use command "yarn rmadmin -transitionToActive --forcemanual rm2"
Thanks
So it turns out I was simply doing it wrong....
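Following the mailing-list explanation, a quick way to see which ResourceManager is currently active is yarn rmadmin -getServiceState with each RM id. The sketch below wraps that in a function. The name active_rm and the YARN_BIN override are hypothetical and exist only so the function can be tried against a stub. In this cluster the RM ids are U-1 and U-7 (from yarn.resourcemanager.ha.rm-ids), not rm1 and rm2 as in the quoted mail.

```shell
#!/bin/sh
# Hypothetical helper: print the first ResourceManager id that reports
# "active". YARN_BIN can point at a different yarn command for testing.
active_rm() {
  for id in "$@"; do
    state=$("${YARN_BIN:-yarn}" rmadmin -getServiceState "$id" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}
```

Usage: active_rm U-1 U-7. To test failover with automatic failover left enabled, the command from the mail becomes yarn rmadmin -transitionToActive --forcemanual U-7, after which active_rm U-1 U-7 should print U-7. Note that the yarn-site.xml shown earlier spells the property yarn.resourcemanager.ha.automatic-failover.enabled, with a trailing "d", unlike the quoted mail.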
References: https://issues.apache.org/jira/browse/YARN-3006
https://issues.apache.org/jira/browse/YARN-1177