Hadoop Cluster Setup: Fully Distributed Mode (Part 3)

Environment: Hadoop 2.8.5, CentOS 7, JDK 1.8

Part 1: Steps

1) Set up four CentOS virtual machines

2) Change the Hadoop configuration to fully distributed mode

3) Start the fully distributed cluster

4) Run the wordcount test program on the fully distributed cluster

Part 2: Configuring the Four CentOS VMs

Four VMs: node-001, node-002, node-003, node-004

Clone the four VMs → generate new MAC addresses → change the hostnames → change node-001's IP address → delete the 70-persistent-net.rules file → reboot the VMs for the changes to take effect (see the sketch below).
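A minimal sketch of those per-clone steps, assuming CentOS 7, a sudo-capable lims account, and a NIC script named ifcfg-ens33 (the interface name and all IP addresses here are hypothetical; adjust to your network):

# set the hostname (repeat on each clone with its own name)
sudo hostnamectl set-hostname node-002
# set this clone's static IP: edit the IPADDR= line
sudo vi /etc/sysconfig/network-scripts/ifcfg-ens33
# drop the stale MAC-to-interface binding inherited from the template VM, if present
sudo rm -f /etc/udev/rules.d/70-persistent-net.rules
# map every hostname in /etc/hosts (same entries on all four nodes)
echo "192.168.1.101 node-001" | sudo tee -a /etc/hosts
# reboot for the changes to take effect
sudo reboot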

Part 3: Modifying the Hadoop Configuration for Fully Distributed Mode

The configuration files under $HADOOP_HOME/etc/hadoop that need to be modified are hadoop-env.sh, core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and slaves.

Configure the Hadoop environment variables:

export HADOOP_PREFIX=/home/lims/bd/hadoop-2.8.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
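These lines typically go in the lims user's ~/.bash_profile (an assumption; any login-shell profile works) so they survive a reboot:

# append the Hadoop environment variables and reload the profile
cat >> ~/.bash_profile <<'EOF'
export HADOOP_PREFIX=/home/lims/bd/hadoop-2.8.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
EOF
source ~/.bash_profile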

1. Enter the $HADOOP_HOME/etc/hadoop directory and set JAVA_HOME in hadoop-env.sh

vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0

2. Modify core-site.xml

vi core-site.xml
<configuration>
  <!-- Namespace (default filesystem URI) of the HDFS filesystem -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node-001:9000</value>
  </property>

  <!-- Buffer size for HDFS I/O operations -->
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>

  <!-- Directory for temporary data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lims/bd/tmp</value>
  </property>
</configuration>

3. Modify hdfs-site.xml

[lims@node-001 hadoop]$ vi hdfs-site.xml
<configuration>
  <!-- Set the replication factor to 3; it must be at most the number of DataNodes -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Run the SecondaryNameNode on node-002 -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node-002:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>

 

4. Modify yarn-site.xml

 
 
[lims@node-001 hadoop]$ vi yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Set the yarn.resourcemanager.hostname property -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node-001</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Set the handler class for the mapreduce_shuffle auxiliary service -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

 

5. Configure mapred-site.xml
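In the stock Hadoop 2.8.5 tarball this file only ships as a template, so create it first:

cp mapred-site.xml.template mapred-site.xml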

<configuration>

<!-- MR YARN Application properties -->

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn. </description>
</property>

<!-- jobhistory properties -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>node-002:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node-002:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>

</configuration>

 

6. Configure the slaves file

node-002
node-003
node-004

7. Distribute the configuration under hadoop/ to every node, along with the hosts configuration

scp hadoop/* lims@node-002:/home/lims/bd/hadoop-2.8.5/etc/hadoop
scp hadoop/* lims@node-003:/home/lims/bd/hadoop-2.8.5/etc/hadoop
scp hadoop/* lims@node-004:/home/lims/bd/hadoop-2.8.5/etc/hadoop
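start-dfs.sh and start-yarn.sh launch the remote daemons over SSH, so passwordless SSH from node-001 to every node (including itself) should be in place before starting the cluster; a minimal sketch, assuming the lims account on all nodes:

# generate a key pair on node-001 (no passphrase) and push it to each node
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for h in node-001 node-002 node-003 node-004; do ssh-copy-id lims@$h; done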

Part 4: Starting the Fully Distributed Cluster

1) Format the NameNode on node-001

hdfs namenode -format

2) Start HDFS on node-001

start-dfs.sh

3) Start YARN on node-001

start-yarn.sh
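Since mapred-site.xml places the JobHistory server on node-002, it can optionally be started there as well (assuming the same install path and environment on node-002):

[lims@node-002 ~]$ mr-jobhistory-daemon.sh start historyserver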

4) Check the processes on each node

[lims@node-001 hadoop]$ jps
11602 ResourceManager
14499 Jps
11325 NameNode
[lims@node-002 ~]$ jps
2449 NodeManager
2377 SecondaryNameNode
2316 DataNode
5564 Jps
[lims@node-003 ~]$ jps
4112 Jps
2425 NodeManager
2316 DataNode
[lims@node-004 ~]$ jps
2433 NodeManager
2324 DataNode
4009 Jps
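As a further sanity check, the web UIs should respond on the Hadoop 2.x default ports (an assumption; adjust if you changed them):

# NameNode web UI (default port 50070 in Hadoop 2.x)
curl -s -o /dev/null -w "%{http_code}\n" http://node-001:50070
# ResourceManager web UI (default port 8088)
curl -s -o /dev/null -w "%{http_code}\n" http://node-001:8088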

Part 5: Running wordcount on the Fully Distributed Cluster

1) On node-001, enter the $HADOOP_HOME/share/hadoop/mapreduce/ directory

2) Upload the test.txt file to the target directory

hadoop fs -put test.txt /user/lims/
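If the put fails because the HDFS home directory does not exist yet, create it first (a one-off step):

hadoop fs -mkdir -p /user/lims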

3) Run the wordcount test program, writing its output to /output

hadoop jar hadoop-mapreduce-examples-2.8.5.jar wordcount /user/lims/test.txt /output
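When re-running the job, delete the output directory first; MapReduce refuses to start if /output already exists:

hadoop fs -rm -r /output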

4) View the MapReduce results

hadoop fs -text /output/part-*
hadoop fs -cat /output/part-*
[lims@node-001 hadoop]$ hadoop fs -cat /output/part-*
a    2
aa    2
bb    2
cc    1
dd    1
file    2
is    2
test    2
this    2
tmp    1