Hadoop Installation and Configuration

1. System and Software Environment

1. Operating system

CentOS release 6.5 (Final)

Kernel version: 2.6.32-431.el6.x86_64

master.fansik.com  192.168.83.118

node1.fansik.com   192.168.83.119

node2.fansik.com   192.168.83.120

2. JDK version: 1.7.0_75

3. Hadoop version: 2.7.2

2. Pre-installation Preparation

1. Disable the firewall and SELinux

# setenforce 0

# service iptables stop

2. Configure the hosts file

192.168.83.118 master.fansik.com

192.168.83.119 node1.fansik.com

192.168.83.120 node2.fansik.com

3. Generate the SSH key

On master.fansik.com, run # ssh-keygen and press Enter at every prompt, then copy the public key to the two nodes:

# scp ~/.ssh/id_rsa.pub node1.fansik.com:/root/.ssh/authorized_keys

# scp ~/.ssh/id_rsa.pub node2.fansik.com:/root/.ssh/authorized_keys

Then restrict the permissions of the key file on node1 and node2:

# chmod 600 /root/.ssh/authorized_keys
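
A quick way to verify that passwordless login works is to run a command on each node from master; both should print the node's hostname without asking for a password:

# ssh node1.fansik.com hostname

# ssh node2.fansik.com hostname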

4. Install the JDK

# tar xf jdk-7u75-linux-x64.tar.gz

# mv jdk1.7.0_75 /usr/local/jdk1.7

# vim /etc/profile.d/java.sh   (add the following content:)

export JAVA_HOME=/usr/local/jdk1.7

export JRE_HOME=/usr/local/jdk1.7/jre

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export PATH=$PATH:$JAVA_HOME/bin

# source /etc/profile
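
To confirm the JDK from the new profile script is the one being picked up, the following should report version 1.7.0_75:

# java -version

# echo $JAVA_HOME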

5. Synchronize the time (otherwise analyzing files later may run into problems)

# ntpdate 202.120.2.101   (an NTP server at Shanghai Jiao Tong University)
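
A one-off ntpdate only corrects the clock at that moment; if you want the three machines to stay in sync, one optional approach is a root cron entry (the 30-minute interval here is only an example):

# crontab -e

*/30 * * * * /usr/sbin/ntpdate 202.120.2.101 > /dev/null 2>&1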

3. Install Hadoop

Download the appropriate version from the official Hadoop release page: http://hadoop.apache.org/releases.html

Perform the following operations on each of the three machines:

# tar xf hadoop-2.7.2.tar.gz

# mv hadoop-2.7.2 /usr/local/hadoop

# cd /usr/local/hadoop/

# mkdir tmp dfs dfs/data dfs/name

4. Configure Hadoop

Configuration on master.fansik.com

# vim /usr/local/hadoop/etc/hadoop/core-site.xml

<configuration>

  <property>

    <name>fs.defaultFS</name>

    <value>hdfs://192.168.83.118:9000</value>

  </property>

  <property>

    <name>hadoop.tmp.dir</name>

    <value>file:/usr/local/hadoop/tmp</value>

  </property>

  <property>

    <name>io.file.buffer.size</name>

    <value>131072</value>

  </property>

</configuration>

# vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<configuration>

  <property>

    <name>dfs.namenode.name.dir</name>

    <value>file:/usr/local/hadoop/dfs/name</value>

  </property>

  <property>

    <name>dfs.datanode.data.dir</name>

    <value>file:/usr/local/hadoop/dfs/data</value>

  </property>

  <property>

    <name>dfs.replication</name>

    <value>2</value>

  </property>

  <property>

    <name>dfs.namenode.secondary.http-address</name>

    <value>192.168.83.118:9001</value>

  </property>

  <property>

    <name>dfs.webhdfs.enabled</name>

    <value>true</value>

  </property>

</configuration>
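
The values written above can be read back straight from the XML files, which is a quick sanity check for typos. This uses the standard hdfs getconf command; run it once the hadoop environment variables configured later in this guide are in place, or call /usr/local/hadoop/bin/hdfs directly:

# hdfs getconf -confKey fs.defaultFS

# hdfs getconf -confKey dfs.replication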

# cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

# vim !$   (that is, /usr/local/hadoop/etc/hadoop/mapred-site.xml)

<configuration>

  <property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

  </property>

  <property>

    <name>mapreduce.jobhistory.address</name>

    <value>192.168.83.118:10020</value>

  </property>

  <property>

    <name>mapreduce.jobhistory.webapp.address</name>

    <value>192.168.83.118:19888</value>

  </property>

</configuration>
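
Note that start-all.sh does not start the MapReduce JobHistory server that the two jobhistory addresses above refer to; if you want ports 10020 and 19888 to actually respond after the cluster is up, it can be started separately on master with the script shipped in Hadoop's sbin directory:

# mr-jobhistory-daemon.sh start historyserver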

# vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

<configuration>

  <property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

  </property>

  <property>

    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

    <value>org.apache.hadoop.mapred.ShuffleHandler</value>

  </property>

  <property>

    <name>yarn.resourcemanager.address</name>

    <value>192.168.83.118:8032</value>

  </property>

  <property>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>192.168.83.118:8030</value>

  </property>

  <property>

    <name>yarn.resourcemanager.resource-tracker.address</name>

    <value>192.168.83.118:8031</value>

  </property>

  <property>

    <name>yarn.resourcemanager.admin.address</name>

    <value>192.168.83.118:8033</value>

  </property>

  <property>

    <name>yarn.resourcemanager.webapp.address</name>

    <value>192.168.83.118:8088</value>

  </property>

  <property>

    <name>yarn.nodemanager.resource.memory-mb</name>

    <value>2048</value>

  </property>

</configuration>
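
After the cluster has been started (see the start-up step below), a simple way to check that both NodeManagers have registered with the ResourceManager is:

# yarn node -list

It should list node1 and node2 in RUNNING state.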

# vim /usr/local/hadoop/etc/hadoop/slaves

192.168.83.119

192.168.83.120

master上的etc目錄同步至node1node2

# rsync -av /usr/local/hadoop/etc/ node1.fansik.com:/usr/local/hadoop/etc/

# rsync -av /usr/local/hadoop/etc/ node2.fansik.com:/usr/local/hadoop/etc/
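
rsync's dry-run flag can be used afterwards to confirm each node matches master; if no files are listed for transfer, nothing differs:

# rsync -avn /usr/local/hadoop/etc/ node1.fansik.com:/usr/local/hadoop/etc/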

master.fansik.com上操做便可,兩個node會自動啓動

配置Hadoop的環境變量

# vim /etc/profile.d/hadoop.sh

export PATH=/usr/local/hadoop/bin:/usr/local/hadoop/sbin:$PATH

# source /etc/profile
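
A quick check that the new PATH entries work:

# hadoop version

It should report Hadoop 2.7.2.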

Initialize (format) the NameNode

# hdfs namenode -format

Check whether the format command reported an error (0 means success)

# echo $?

Start the services

# start-all.sh
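
jps (shipped with the JDK) shows which daemons actually came up. With the configuration above, master should show NameNode, SecondaryNameNode and ResourceManager, and each node should show DataNode and NodeManager:

# jps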

Stop the services

# stop-all.sh

After the services are started, the cluster can be reached at the following addresses (8088 is the YARN ResourceManager web UI, 50070 is the HDFS NameNode web UI):

http://192.168.83.118:8088

http://192.168.83.118:50070

5. Test Hadoop

Perform the following on master.fansik.com

# hdfs dfs -mkdir /fansik

If the following warning appears when creating the directory, it can be ignored:

16/07/29 17:38:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Fix:

Download the matching native library build from the following site:

http://dl.bintray.com/sequenceiq/sequenceiq-bin/

# tar -xvf hadoop-native-64-2.7.0.tar -C /usr/local/hadoop/lib/native/

If you see: copyFromLocal: Cannot create directory /123/. Name node is in safe mode

it means HDFS is in safe mode. Fix:

hdfs dfsadmin -safemode leave
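
The current safe mode state can also be checked before and after leaving it:

# hdfs dfsadmin -safemode get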

myservicce.sh複製到fansik目錄下

# hdfs dfs -copyFromLocal ./myservicce.sh /fansik

Check whether the myservicce.sh file is now under /fansik

# hdfs dfs -ls /fansik

Use wordcount to analyze the file

# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /fansik/myservicce.sh /zhangshan/

List the output files:

# hdfs dfs -ls /zhangshan/

Found 2 items

-rw-r--r--   2 root supergroup          0 2016-08-02 15:19 /zhangshan/_SUCCESS

-rw-r--r--   2 root supergroup        415 2016-08-02 15:19 /zhangshan/part-r-00000

View the analysis result:

# hdfs dfs -cat /zhangshan/part-r-00000
