I won't go into the specifics here; please refer to my other article: http://my.oschina.net/lanzp/blog/309078
Before we begin, one point up front: everything in this article builds on the pseudo-distributed configuration from my previous article. If you haven't read that article yet, I suggest you go read it first.
There are still few configuration tutorials online for the newer versions, so below I'll share my own hands-on experience. If anything is incorrect, corrections are welcome :)
First, our node information is as follows:
master  192.168.8.184
slave1  192.168.8.183
slave2  192.168.8.178
slave3  192.168.8.190
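Note that the configuration below refers to the nodes by hostname (master, slave1, and so on), so every machine must be able to resolve these names. A minimal sketch of the mapping, assuming no internal DNS and an Ubuntu-style system like the one in my previous article:

# Append to /etc/hosts on every node (and set each machine's own name
# in /etc/hostname, e.g. "slave1" on slave1):
192.168.8.184  master
192.168.8.183  slave1
192.168.8.178  slave2
192.168.8.190  slave3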
I did this with virtual machines. After the master node was configured, it served as a template, and the other three slave nodes (slave1, slave2, slave3) were made as direct VM copies of it; slave1 also doubles as the secondary NameNode. Because the master's configuration is exactly the same as the slaves', only two things need to be done after copying: change the hostname, and regenerate the key for passwordless SSH login. If the network misbehaves after the copies start up, a quick web search will sort it out. For the hostnames, simply rename each machine to match the IPs above, e.g. 192.168.8.184 becomes master. The username and password don't need to change at all; you can change them if you want, but with several nodes that gets hard to keep track of.

First, let's go over the configuration. There are six files to configure in total: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, masters, and slaves. The masters file may not exist in the configuration directory; just create it yourself. The details are as follows:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.8.184:9000</value>
    <description>same as fs.default.name</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/mywind/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
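Once this file is in place, you can sanity-check that Hadoop actually picks it up; hdfs getconf reads the value straight from the loaded configuration:

hdfs getconf -confKey fs.defaultFS
# should print: hdfs://192.168.8.184:9000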
hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/mywind/name</value>
    <description>same as dfs.name.dir</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/mywind/data</value>
    <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
    <description>HDFS blocksize of 256MB for large file-systems.</description>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>master:50070</value>
    <description>The address and the base port where the dfs namenode web ui will listen on.</description>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>slave1:50090</value>
    <description>The secondary namenode http server address and port.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>same as the old framework; recommended to set this to the number of DataNode hosts in the cluster!</description>
  </property>
</configuration>
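One thing to watch: the /usr/mywind/name, /usr/mywind/data and /usr/mywind/tmp directories referenced above must exist and be writable by the Hadoop user before the NameNode is formatted. A minimal sketch, run on every node (a01513 is the username used later in this article; substitute your own):

sudo mkdir -p /usr/mywind/tmp /usr/mywind/name /usr/mywind/data
sudo chown -R a01513 /usr/mywind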
mapred-site.xml:
<configuration>
  <!-- Configurations for MapReduce Applications -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Execution framework set to Hadoop YARN.</description>
  </property>
  <!-- Configurations for MapReduce JobHistory Server -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
    <description>MapReduce JobHistory Server host:port. Default port is 10020.</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port. Default port is 19888.</description>
  </property>
</configuration>
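By the way, in the Hadoop 2.x tarball this file may not exist out of the box; the configuration directory only ships a mapred-site.xml.template. If that's what you see, create the real file from the template first (the $HADOOP_HOME path here is whatever your installation uses):

cd $HADOOP_HOME/etc/hadoop
cp mapred-site.xml.template mapred-site.xml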
yarn-site.xml:
<configuration>
  <!-- Configurations for ResourceManager -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <!-- Configurations for NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Shuffle service that needs to be set for Map Reduce applications.</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
masters:
master
slaves:
slave1
slave2
slave3
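Since every node must carry identical configuration, any later change to these six files has to be pushed out to all the slaves. A small sketch with scp; the installation path /usr/mywind/hadoop is only an assumption here, substitute wherever your Hadoop actually lives:

for host in slave1 slave2 slave3; do
  scp /usr/mywind/hadoop/etc/hadoop/{core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml,masters,slaves} \
      $host:/usr/mywind/hadoop/etc/hadoop/
done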
On each slave node, regenerate the key for passwordless SSH login:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
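If sshd later refuses a key that looks correct, overly loose file permissions are the most common cause; it doesn't hurt to tighten them explicitly:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys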
After the slave nodes (slave1, slave2, slave3) have run the commands above and regenerated their keys, the master node needs to establish passwordless login to the slaves. That means the master's key must be added to each slave's authorized list, so that the master can connect to the slaves without entering a password. The steps are as follows:
1. Copy the master node's public key to the slave node.
2. On the slave node, append the master's public key to the authorized list.
First, on the master node, switch to the /home/a01513 directory (note: a01513 is my OS username; replace it with the username on your own system):
cd /home/a01513
Then copy the key over; I'll use the slave1 node as an example here:
scp ~/.ssh/id_dsa.pub a01513@192.168.8.183:/home/a01513/
You may be asked for the host password of slave1 (192.168.8.183); just enter it. Once that's done, open a terminal on the slave1 node and run the following command to append the master's public key to the authorized list:
cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
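As an aside, if your system has OpenSSH's ssh-copy-id helper, it combines both steps (the copy and the append) into one command run from the master; something like this should be equivalent:

ssh-copy-id -i ~/.ssh/id_dsa.pub a01513@slave1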
Back in a terminal on the master node, type:
ssh slave1
If the connection succeeds without asking for a password, your configuration is correct. Congratulations! (Repeat the same steps for slave2 and slave3.)
Once the steps above are complete, format the filesystem:
hdfs namenode -format
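A pitfall worth knowing about: if you ever re-run the format on a cluster that has already stored data, the DataNodes will fail to start with an "Incompatible clusterIDs" error, because the old data directories still carry the previous cluster ID. In that case, clear the directories on every node before formatting again (this destroys all HDFS data, obviously):

rm -rf /usr/mywind/tmp/* /usr/mywind/name/* /usr/mywind/data/*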
Then start the HDFS, YARN, and JobHistory processes:
start-dfs.sh
start-yarn.sh
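To confirm everything came up, jps on each node should show the daemons matching its role; with the layout in this article, roughly:

# master:           NameNode, ResourceManager
# slave1:           DataNode, NodeManager, SecondaryNameNode
# slave2 / slave3:  DataNode, NodeManager
jps

# From the master, check that all three DataNodes have registered:
hdfs dfsadmin -report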
Note that the old start-all.sh has been superseded by start-dfs.sh and start-yarn.sh. It seems the old command may still work, but I suggest sticking with the two scripts above.
mr-jobhistory-daemon.sh start historyserver
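With everything running, the web UIs configured earlier should be reachable in a browser: the NameNode at http://master:50070, the ResourceManager at http://master:8088, and the JobHistory Server at http://master:19888. A quick check from the shell, assuming curl is installed:

curl -s -o /dev/null -w '%{http_code}\n' http://master:50070   # expect 200
curl -s -o /dev/null -w '%{http_code}\n' http://master:8088    # expect 200
curl -s -o /dev/null -w '%{http_code}\n' http://master:19888   # expect 200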
To shut the cluster down, stop the processes in the same fashion:

stop-dfs.sh
stop-yarn.sh
mr-jobhistory-daemon.sh stop historyserver
At this point, our fully distributed configuration is complete!