Hadoop 2.7.2 Installation

The Hadoop environment in this article is Linux CentOS 6.5 64-bit, with 4 nodes in total. The roles are distributed as follows:
NameNode (NN):                      node01, node02
DataNode (DN):                      node02, node03, node04
ZooKeeper (ZK):                     node01, node02, node03
ZooKeeperFailoverController (ZKFC): node01, node02
JournalNode (JN):                   node02, node03, node04
ResourceManager (RM):               node01
NodeManager (NM):                   node02, node03, node04

        NN   DN   ZK   ZKFC   JN   RM   NM
node01  ✓         ✓    ✓           ✓
node02  ✓    ✓    ✓    ✓      ✓         ✓
node03       ✓    ✓           ✓         ✓
node04       ✓                ✓         ✓

1. Prepare the system environment: 4 Linux machines (virtual machines).

Configure the Java environment variables; see: http://my.oschina.net/u/574036/blog/719977

Set up passwordless SSH login among the 4 machines (optional, but without it you cannot start and stop the whole Hadoop cluster from one node); see: http://my.oschina.net/u/574036/blog/730485
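
As a minimal sketch of that setup, assuming a hadoop user exists on every node (the user name is illustrative; the linked post has the full walkthrough), generate a passphrase-less key pair and install the public key on every node, the local one included:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id hadoop@node01
ssh-copy-id hadoop@node02
ssh-copy-id hadoop@node03
ssh-copy-id hadoop@node04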

2. Download hadoop-2.7.2 and zookeeper-3.4.8.

hadoop: http://apache.fayea.com/hadoop/common/

zookeeper: http://apache.fayea.com/zookeeper/

3. Extract and install ZooKeeper; see: http://www.javashuo.com/article/p-draivztk-bp.html
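
For reference, a minimal conf/zoo.cfg for this three-node ensemble might look like the sketch below (dataDir is an assumed path; adjust it to your installation). Each ZooKeeper node also needs a myid file inside dataDir containing its own id (1 on node01, 2 on node02, 3 on node03):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/zookeeper/data
clientPort=2181
server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888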

4. Extract and install Hadoop (note: the directory layout differs between hadoop-2.x and hadoop-1.x):

(1) On the first node (node01), extract Hadoop into the installation directory:

# tar -zxvf hadoop-2.7.2.tar.gz -C ../hadoop-home/

(2) Set the Java environment variable in Hadoop's hadoop-env.sh:

# vi etc/hadoop/hadoop-env.sh

The default is export JAVA_HOME=${JAVA_HOME}; comment that line out and set your actual JDK path instead:

export JAVA_HOME=/home/java/jdk1.7.0_80

Optionally, add Hadoop to the shell environment as well (e.g. in ~/.bashrc):

# set hadoop environment
export HADOOP_HOME=/home/hadoop/hadoop-home/hadoop-2.7.2/
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH

(3) Configure HA by editing hdfs-site.xml:

a. Configure the nameservice, named myhadoop:

<property>
    <name>dfs.nameservices</name>
    <value>myhadoop</value>    
</property>

b. Configure the IDs of all NameNodes under the nameservice. Note the property suffix must match the nameservice configured above (myhadoop); nn1 and nn2 are the NameNode IDs:

<property>
    <name>dfs.ha.namenodes.myhadoop</name>
    <value>nn1,nn2</value>    
</property>

c. Configure each NameNode's RPC address and port. node01 and node02 are the hostnames of the NameNode machines; IP addresses also work:

<property>
    <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
    <value>node01:8020</value>    
</property>

<property>
    <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
    <value>node02:8020</value>    
</property>

d. Configure each NameNode's HTTP address and port; again, hostnames or IPs both work:

<property>
    <name>dfs.namenode.http-address.myhadoop.nn1</name>
    <value>node01:50070</value>    
</property>

<property>
    <name>dfs.namenode.http-address.myhadoop.nn2</name>
    <value>node02:50070</value>    
</property>

e. Configure the shared-edits URL of the JournalNode quorum:

<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node02:8485;node03:8485;node04:8485/myhadoop</value>    
</property>

f. Configure the proxy provider class that clients use to find the active NameNode:

<property>
    <name>dfs.client.failover.proxy.provider.myhadoop</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>    
</property>

g. Configure SSH fencing (/home/hadoop/.ssh/id_rsa is the private key generated earlier when setting up passwordless SSH):

<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>

<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>    
</property>

h. Configure the JournalNode working directory:

<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/hadoop-home/journalnode/data</value>    
</property>

i. Enable automatic failover (optional, but recommended):

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>    
</property>

 

Also disable HDFS permission checking for convenience during testing (dfs.permissions is the legacy property name; in 2.x it is deprecated in favor of dfs.permissions.enabled):

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

j. Configure the ZooKeeper ensemble; this goes in core-site.xml (next step).

(4) Configure core-site.xml

a. Configure the NameNode entry point (the default filesystem); myhadoop is the nameservice configured above:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://myhadoop</value>
</property>

b. Configure the ZooKeeper ensemble, i.e. which nodes run the ZooKeeper service:

<property>
    <name>ha.zookeeper.quorum</name>
    <value>node01:2181,node02:2181,node03:2181</value>
</property>

c. Configure the Hadoop temporary directory:

<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-home/temp</value>
</property>

5. Deploy Hadoop to the remaining nodes following the steps above; here I simply copy the installation directory:

scp -r ./* hadoop@node02:/home/hadoop/hadoop-home/
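
And likewise for the other two nodes (assuming the same directory layout everywhere):

scp -r ./* hadoop@node03:/home/hadoop/hadoop-home/
scp -r ./* hadoop@node04:/home/hadoop/hadoop-home/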

6. MapReduce configuration

a. Edit mapred-site.xml:
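
Note that a stock hadoop-2.7.2 distribution ships only a template for this file; create mapred-site.xml from it first:

# cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml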

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

<!--
  Configure this when DataNodes need to access the MapReduce JobHistory Server
  (default: 0.0.0.0:10020); the value should point at the host running the
  history server, which is started with:
      sbin/mr-jobhistory-daemon.sh start historyserver
-->

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>127.0.0.1:10020</value>
</property>

b. Edit yarn-site.xml to set the ResourceManager host and the shuffle service:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node01</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

7. Edit the slaves file (etc/hadoop/slaves), listing the DataNode/NodeManager hosts:

node02
node03
node04

8. Test run (mind the startup order)

(1) Start ZooKeeper (on node01, node02, node03):
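
Assuming ZooKeeper is installed under /home/hadoop/zookeeper-3.4.8 (adjust the path to your installation), run on each of the three nodes:

/home/hadoop/zookeeper-3.4.8/bin/zkServer.sh start
/home/hadoop/zookeeper-3.4.8/bin/zkServer.sh status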

(2) Start the JournalNodes (from the sbin directory, on node02, node03, node04):

./hadoop-daemon.sh start journalnode

(3) Format the NameNode (from the bin directory)

a. Run the format command on one of the NameNode machines:

./hdfs namenode -format

b. Start the NameNode on the node that was just formatted (sbin directory):

./hadoop-daemon.sh start namenode

c. Copy the metadata generated on the formatted node: run the following on the NameNode that was NOT formatted (bin directory):

./hdfs namenode -bootstrapStandby

(This command may fail; if so, copy the metadata over directly from the formatted node instead:

sudo scp -r temp/* hadoop@node02:/home/hadoop/hadoop-home/temp/)

(Stop HDFS: stop-dfs.sh)

(Start HDFS: start-dfs.sh)

d. Initialize the ZKFC state in ZooKeeper

On one of the NameNode machines, run (bin directory):

bin/hdfs zkfc -formatZK

e. Start everything:

sbin/start-dfs.sh

Once startup succeeds, visit the NameNode web UIs: http://node01:50070 and http://node02:50070
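
As a quick sanity check, jps on each node should list the daemons expected for this layout, e.g. NameNode, DFSZKFailoverController and QuorumPeerMain on node01, and on node02 those three plus DataNode and JournalNode:

# jps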

f. File operations:

Create a directory: bin/hdfs dfs -mkdir -p /home/hadoop/hadoop-files
Upload a file: bin/hdfs dfs -put /home/hadoop/Downloads/jdk-7u80-linux-x64.tar.gz /home/hadoop/hadoop-files
Delete a file: bin/hdfs dfs -rm -r /home/hadoop/test-files/jdk-8u131-linux-x64.tar.gz
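
To confirm the upload, list the target directory:

bin/hdfs dfs -ls /home/hadoop/hadoop-files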

g. Start/stop everything:

sbin/start-all.sh

sbin/stop-all.sh

View the ResourceManager web UI:

http://node01:8088
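
To check which NameNode is currently active (and to verify that failover works), query each NameNode by the ID configured in hdfs-site.xml:

bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2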

That completes the basic Hadoop installation and configuration.
