Hadoop 2.7.1: Configuring NameNode + ResourceManager High Availability, with an Analysis of How It Works
For NameNode high availability, the files to configure are core-site.xml and hdfs-site.xml.
For ResourceManager high availability, the file to configure is yarn-site.xml.
Logical structure:
How NameNode HA works:
In a typical HA cluster, two separate machines are configured as NameNodes. At any moment only one NameNode is in the Active state while the other is Standby, and the Active NameNode handles all client operations in the cluster. The reason for this arrangement lies in how HDFS works underneath: at any given time a file may have only one writer, and if several writers appeared, file offsets would become inconsistent and the data format would be unusable. The Standby NameNode meanwhile plays only a slave role, so that whenever the Active NameNode dies it can take over as the primary immediately, giving a hot-standby effect. In the HA architecture the SecondaryNameNode cold-backup role no longer exists. To keep the standby's metadata consistent with the primary's at all times, the two NameNodes communicate through a set of lightweight daemon processes called JournalNodes: whenever a modification is executed on the primary NameNode, the edit log record is also written to at least a majority of the JournalNodes. When the Standby NameNode detects that the shared edit log in the JournalNodes has changed, it reads the new edits and applies them to its own namespace image. On a failure, after the Active NameNode dies, the Standby first reads all remaining edits from the JournalNodes before becoming the Active NameNode, which reliably guarantees that its namespace image matches that of the failed NameNode; it then seamlessly takes over its duties and continues serving client requests, achieving high availability.
To fail over quickly with a full picture of the cluster, the Standby also receives block reports from the DataNodes. The above covers only the fault-tolerance mechanics of the NameNode itself; next comes why, once ZooKeeper is introduced, NameNode HA can fail over automatically with no operator intervention.
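Once the cluster is running, a quick way to check which NameNode currently holds which role is the hdfs haadmin tool. A minimal check, assuming the NameNode IDs are nn1 and nn2 (use whatever dfs.ha.namenodes.<nameservice> declares in hdfs-site.xml):

hdfs haadmin -getServiceState nn1    # prints "active" or "standby"
hdfs haadmin -getServiceState nn2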
What ZooKeeper contributes to active/standby switching:
(1) Failure detection: when each NameNode starts, a node is registered for it in ZooKeeper (an ephemeral node, so it lives only as long as the session). When that NameNode goes down, its ZooKeeper session expires; ZooKeeper notices and notifies the standby NameNode that it is time to take over.
(2) Election: ZooKeeper provides a simple exclusive lock for electing a master. The NameNode that finds itself holding this lock knows it will be activated into the Active state.
In practice, Hadoop provides the ZKFailoverController role, zkfc for short, running on each NameNode host. Its main responsibilities are:
(1) Health monitoring: zkfc periodically sends health-check commands to the NameNode it watches to determine whether that NameNode is healthy; if the machine is down and heartbeats fail, zkfc marks it as unhealthy.
(2) Session management: as long as its NameNode is healthy, zkfc keeps a session open in ZooKeeper; if the NameNode is also the Active one, zkfc additionally holds an ephemeral znode there. When that NameNode dies, the znode is deleted, and the standby NameNode acquires the lock and is promoted to primary, with its state marked Active. When the failed NameNode is restarted, it registers with ZooKeeper again, finds the lock znode already taken, and automatically settles into the Standby state. This cycle repeats, preserving availability; note that at most two NameNodes per nameservice are currently supported.
(3) Master election: as described above, holding an ephemeral znode in ZooKeeper implements a preemptive lock, which decides which NameNode is Active.
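The lock itself can be inspected with the ZooKeeper CLI. A sketch, assuming the default znode parent /hadoop-ha and the nameservice ns1 used in the configuration below:

# connect to any quorum member
zkCli.sh -server h1:2181

# ephemeral znode held by the zkfc of the current active NameNode;
# it disappears when that zkfc's session ends
get /hadoop-ha/ns1/ActiveStandbyElectorLock

# persistent "breadcrumb" recording the last active, consulted during fencing
get /hadoop-ha/ns1/ActiveBreadCrumb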
core-site.xml:

<configuration>
  <!-- Logical HDFS entry point; the value is the nameservice ID, not a host:port.
       Note that fs.defaultFS is the preferred, non-deprecated name for this key in Hadoop 2.x -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/ROOT/server/data-hadoop/hadooptmp</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>0</value>
    <description>Number of minutes between trash checkpoints.
    If zero, the trash feature is disabled.
    </description>
  </property>

  <!-- ZooKeeper quorum used for HA -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>h1:2181,h2:2181,h3:2181</value>
  </property>
</configuration>
hdfs-site.xml:
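The HA-related part of hdfs-site.xml ties the pieces above together. A minimal sketch for this topology, assuming nameservice ns1, NameNode IDs nn1/nn2 on hosts h1/h2, JournalNodes on h1/h2/h3, and illustrative local paths (the journal directory and SSH key path here are placeholders; adapt them to your machines):

<configuration>
  <!-- Logical name of the nameservice; must match fs.default.name in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <!-- The two NameNode IDs under this nameservice -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>h1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>h2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>h1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>h2:50070</value>
  </property>
  <!-- Where the NameNodes write and read the shared edit log (the JournalNode quorum) -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://h1:8485;h2:8485;h3:8485/ns1</value>
  </property>
  <!-- Local disk path where each JournalNode stores edits (placeholder path) -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/ROOT/server/data-hadoop/journal</value>
  </property>
  <!-- How clients find the currently active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fence the old active during failover; sshfence requires passphraseless SSH -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <!-- Let zkfc perform automatic failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>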
yarn-site.xml:
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>

  <!-- RM cluster identifier -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>ns1</value>
  </property>

  <!-- Logical IDs for the two RMs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>h1,h2</value>
  </property>
  <!-- Automatic RM failover (the standard key name in Hadoop 2.x is
       yarn.resourcemanager.ha.automatic-failover.enabled) -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Recover RM state automatically after a failover or restart -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>

  <!-- RM host 1 -->
  <property>
    <name>yarn.resourcemanager.hostname.h1</name>
    <value>h1</value>
  </property>

  <!-- RM host 2 -->
  <property>
    <name>yarn.resourcemanager.hostname.h2</name>
    <value>h2</value>
  </property>

  <!-- RM state store implementation: either in-memory (MemoryRMStateStore)
       or ZooKeeper-based (ZKRMStateStore); HA requires the ZK store -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>

  <!-- ZK quorum used to store RM state -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>h1:2181,h2:2181,h3:2181</value>
  </property>

  <!-- Scheduler address of each RM; ApplicationMasters request resources here -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.h1</name>
    <value>h1:8030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address.h2</name>
    <value>h2:8030</value>
  </property>

  <!-- NodeManagers exchange information with the RM at this address -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.h1</name>
    <value>h1:8031</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address.h2</name>
    <value>h2:8031</value>
  </property>

  <!-- Clients submit and manage applications through this address -->
  <property>
    <name>yarn.resourcemanager.address.h1</name>
    <value>h1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.h2</name>
    <value>h2:8032</value>
  </property>

  <!-- Administrators send management commands to the RM at this address -->
  <property>
    <name>yarn.resourcemanager.admin.address.h1</name>
    <value>h1:8033</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address.h2</name>
    <value>h2:8033</value>
  </property>

  <!-- RM HTTP address, for viewing cluster information -->
  <property>
    <name>yarn.resourcemanager.webapp.address.h1</name>
    <value>h1:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address.h2</name>
    <value>h2:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <!-- The key must match the aux service name declared above (mapreduce_shuffle) -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <description>Classpath for typical applications.</description>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CONF_DIR
      ,$HADOOP_COMMON_HOME/share/hadoop/common/*
      ,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*
      ,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*
      ,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*
      ,$YARN_HOME/share/hadoop/yarn/*</value>
  </property>

  <!-- Configurations for NodeManager -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>5632</value>
  </property>

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1408</value>
  </property>

  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>5632</value>
  </property>

</configuration>
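With both ResourceManagers up, their roles can be verified with yarn rmadmin, using the rm-ids declared above (h1 and h2):

yarn rmadmin -getServiceState h1    # prints "active" or "standby"
yarn rmadmin -getServiceState h2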
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- MR1-era JobTracker address; not used when the framework is yarn -->
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>h1:8021</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>h1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>h1:19888</value>
  </property>
  <!-- Legacy per-node task limits; they have no effect under YARN -->
  <property>
    <name>mapred.max.maps.per.node</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.max.reduces.per.node</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1408</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1126M</value>
  </property>

  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2816</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2252M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>

</configuration>
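The memory figures in these two files fit together: a common rule of thumb sets each task's JVM heap to roughly 80% of its container size, and these values follow it. The map container is one scheduler minimum allocation (1408 MB) with -Xmx1126M ≈ 0.8 × 1408; the reduce container is two allocations (2816 MB = 2 × 1408) with -Xmx2252M ≈ 0.8 × 2816; and each NodeManager offers 5632 MB = 4 × 1408, i.e. room for four map-sized containers.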
Startup procedure (this assumes a brand-new cluster; if not, see the official link at the end of this post). A consolidated sketch of the whole sequence follows the list.
1. Start a quorum (N/2+1) of JournalNode processes across the cluster, e.g. via an ssh script running: hadoop-daemon.sh start journalnode
2. On the first NameNode, format the cluster with hdfs namenode -format, then start that NameNode (hadoop-daemon.sh start namenode) so the next step has a running primary to copy from
3. On the second NameNode, run hdfs namenode -bootstrapStandby to sync the first NameNode's metadata
4. On the first NameNode, run hdfs zkfc -formatZK to initialize the HA state in ZooKeeper
5. On the first NameNode, start zkfc: hadoop-daemon.sh start zkfc
6. On the second NameNode, start zkfc: hadoop-daemon.sh start zkfc
7. Run start-dfs.sh to start all NameNode, DataNode and JournalNode daemons (any that are already running are skipped)
8. Visit port 50070 on each of the two NameNode machines to check NameNode status; one Active and one Standby means all is well
9. Test failover: find the pid of the Active NameNode process and kill it, then watch whether the Standby is automatically promoted to Active. If everything was installed correctly it will switch over; if it does not, check the zkfc and namenode logs
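The same sequence, collected into the sketch referenced above (hostnames h1/h2/h3 and NameNode IDs nn1/nn2 as assumed throughout; run each command on the host named in the comment). The standby ResourceManager start is included because start-yarn.sh only starts an RM on the machine where it is invoked:

# 1. on h1, h2 and h3: start the JournalNode quorum
hadoop-daemon.sh start journalnode

# 2. on h1: format HDFS and start the first NameNode
hdfs namenode -format
hadoop-daemon.sh start namenode

# 3. on h2: copy the freshly formatted metadata from h1
hdfs namenode -bootstrapStandby

# 4. on h1: initialize the HA state znode in ZooKeeper
hdfs zkfc -formatZK

# 5/6. on h1 and on h2: start the failover controllers
hadoop-daemon.sh start zkfc

# 7. on h1: start the remaining HDFS daemons (already-running ones are skipped)
start-dfs.sh

# start YARN; the standby RM on h2 must be started by hand
start-yarn.sh                           # on h1
yarn-daemon.sh start resourcemanager    # on h2

# 8/9. verify: both NameNode web UIs on port 50070, then the states via haadmin/rmadmin
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState h1
yarn rmadmin -getServiceState h2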
Referenced articles, with thanks:
http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
http://lizhenliang.blog.51cto.com/7876557/1661354
http://www.cnblogs.com/781811964-Fighter/p/4930067.html