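Hadoop Distributed Cluster Deployment

Hadoop 2.x deployment
* Local Mode
* Distributed Mode
  * Pseudo-distributed
    A single machine runs all of the daemons,
    including the worker daemons DataNode and NodeManager
  * Fully distributed
    There are multiple worker nodes
    DataNodes
    NodeManagers
    Configuration file
    $HADOOP_HOME/etc/hadoop/slaves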

================================================================
Three machines
IP        192.168.217.131   192.168.217.132   192.168.217.133
Host      hadoop-senior     hadoop-senior02   hadoop-senior03
Memory    1.5G              1G                1G
CPU       1 CPU             1 CPU             1 CPU

Configure the hostname mapping
/etc/hosts
192.168.217.131 hadoop-senior.ibeifeng.com hadoop-senior
192.168.217.132 hadoop-senior02.ibeifeng.com hadoop-senior02
192.168.217.133 hadoop-senior03.ibeifeng.com hadoop-senior03
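
The same three lines have to be present in /etc/hosts on every node. A minimal sketch of adding them idempotently is shown here; the function name `add_host_entry` and the `HOSTS_FILE` variable are hypothetical names for this example, and the default target is a scratch file rather than the real /etc/hosts so that it can be tried safely.

```shell
#!/bin/sh
# Append "ip fqdn shortname" to a hosts file unless the FQDN is already mapped.
# HOSTS_FILE is an assumption of this sketch: it defaults to a temporary file;
# set HOSTS_FILE=/etc/hosts (as root) to change a real node.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    ip="$1"; fqdn="$2"; short="$3"
    touch "$HOSTS_FILE"
    # -F fixed string, -w whole word, -q quiet: skip if the name is already there
    if ! grep -qwF "$fqdn" "$HOSTS_FILE"; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$HOSTS_FILE"
    fi
}

add_host_entry 192.168.217.131 hadoop-senior.ibeifeng.com   hadoop-senior
add_host_entry 192.168.217.132 hadoop-senior02.ibeifeng.com hadoop-senior02
add_host_entry 192.168.217.133 hadoop-senior03.ibeifeng.com hadoop-senior03

cat "$HOSTS_FILE"
```

Running the script a second time against the same file leaves it unchanged, which makes it safe to repeat on each machine.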

=====================================================================
            hadoop-senior      hadoop-senior02     hadoop-senior03
HDFS
            NameNode
            DataNode           DataNode            DataNode
                                                   SecondaryNameNode
YARN
                               ResourceManager
            NodeManager        NodeManager         NodeManager

MapReduce
            JobHistoryServer

Configuration
* hdfs
  * hadoop-env.sh
  * core-site.xml
  * hdfs-site.xml
  * slaves
* yarn
  * yarn-env.sh
  * yarn-site.xml
  * slaves
* mapreduce
  * mapred-env.sh
  * mapred-site.xml
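
The slaves file is the simplest of these: one worker hostname per line. Since the role layout puts a DataNode and a NodeManager on all three machines, it would list all three. A minimal sketch follows, assuming the hostnames from the /etc/hosts mapping (they must match whatever names the cluster's site files actually use); `CONF_DIR` is a hypothetical variable that defaults to a scratch directory instead of `$HADOOP_HOME/etc/hadoop`.

```shell
#!/bin/sh
# Write the slaves file: one worker hostname per line.
# CONF_DIR is an assumption of this sketch; on a real node it would be
# $HADOOP_HOME/etc/hadoop.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/slaves" <<'EOF'
hadoop-senior.ibeifeng.com
hadoop-senior02.ibeifeng.com
hadoop-senior03.ibeifeng.com
EOF

cat "$CONF_DIR/slaves"
```

HDFS and YARN read the same file, which is why slaves appears under both entries in the list.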

core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-senior1.jason.com:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/app/hadoop-2.5.0/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>420</value>
    </property>
</configuration>

hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop-senior3.jason.com:50090</value>
    </property>
</configuration>

mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop-senior1.jason.com:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop-senior1.jason.com:19888</value>
    </property>
</configuration>

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop-senior2.jason.com</value>
    </property>
</configuration>
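
Once the site files are in place on every node, each daemon can be started on the host that the role table assigns to it. The sketch below only prints the commands for a given host and does not run anything; the function name `start_commands_for` is hypothetical, the short hostnames come from the role table, and the `sbin/` scripts are the per-daemon launchers shipped with Hadoop 2.x, meant to be run from `$HADOOP_HOME`.

```shell
#!/bin/sh
# Print the daemon start commands for one host, following the role table.
# Nothing is executed; run the printed lines from $HADOOP_HOME on that host.
start_commands_for() {
    case "$1" in
        hadoop-senior)   echo "sbin/hadoop-daemon.sh start namenode" ;;
        hadoop-senior02) echo "sbin/yarn-daemon.sh start resourcemanager" ;;
        hadoop-senior03) echo "sbin/hadoop-daemon.sh start secondarynamenode" ;;
        *) echo "unknown host: $1" >&2; return 1 ;;
    esac
    # Every machine is also a worker node
    echo "sbin/hadoop-daemon.sh start datanode"
    echo "sbin/yarn-daemon.sh start nodemanager"
    # The JobHistoryServer lives on the first machine
    if [ "$1" = "hadoop-senior" ]; then
        echo "sbin/mr-jobhistory-daemon.sh start historyserver"
    fi
}

start_commands_for hadoop-senior
```

Before the NameNode is started for the first time, it has to be formatted once on hadoop-senior with `bin/hdfs namenode -format`.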

 

======================================================================
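After the cluster is set up
* Basic tests
  Start the services, check that they are available, and run simple applications
  * hdfs
    Read and write operations
    bin/hdfs dfs -mkdir -p /user/beifeng/tmp/conf
    bin/hdfs dfs -put etc/hadoop/*-site.xml /user/beifeng/tmp/conf
    bin/hdfs dfs -text /user/beifeng/tmp/conf/core-site.xml
  * yarn
    run jar
  * mapreduce
    bin/yarn jar share/hadoop/mapreduce/hadoop*example*.jar wordcount /user/beifeng/mapreduce/wordcount/input /user/beifeng/mapreduce/wordcount/output
* Benchmark tests
  Measure the performance of the cluster
  * hdfs
    Write data
    Read data
* Cluster monitoring
  Cloudera
  Cloudera Manager
  * Deploy and install the cluster
  * Monitor the cluster
  * Synchronize configuration across the cluster
  * Alerting, and so on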
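
For the HDFS write and read benchmark in the list above, one commonly used tool is TestDFSIO, which ships in the MapReduce jobclient tests jar. The sketch below only prints the commands for a write pass, a read pass, and a cleanup. The function name `dfsio_commands`, the file count, and the file size are illustrative assumptions, and the jar is matched with a glob because its exact file name depends on the Hadoop version.

```shell
#!/bin/sh
# Print TestDFSIO commands for a write pass, a read pass, and cleanup.
# Nothing is executed here; run the printed lines from $HADOOP_HOME.
TESTS_JAR='share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar'

dfsio_commands() {
    nr_files="$1"    # number of files to write and read
    file_size="$2"   # size of each file, for example 64MB
    for mode in write read; do
        echo "bin/yarn jar $TESTS_JAR TestDFSIO -$mode -nrFiles $nr_files -fileSize $file_size"
    done
    # Remove the benchmark data from HDFS afterwards
    echo "bin/yarn jar $TESTS_JAR TestDFSIO -clean"
}

dfsio_commands 3 64MB
```

On machines as small as the ones described here (1 to 1.5 GB of memory), small file counts and sizes keep the run short.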

=============================================================
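Cluster time must be synchronized
* Pick one machine
  to act as the time server
* All other machines synchronize their clocks with that machine on a schedule
  For example, once every ten minutes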

# rpm -qa|grep ntp

# vi /etc/ntp.conf

# vi /etc/sysconfig/ntpd
# Drop root to id 'ntp:ntp' by default.
SYNC_HWCLOCK=yes
OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"

[root@hadoop-senior hadoop-2.5.0]# service ntpd status
ntpd is stopped
[root@hadoop-senior hadoop-2.5.0]# service ntpd start
Starting ntpd: [ OK ]
[root@hadoop-senior hadoop-2.5.0]# chkconfig ntpd on
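
With ntpd running on the time server, the other machines can pull the time from it every ten minutes through cron. A minimal sketch of the crontab line is given below, assuming `ntpdate` is installed at `/usr/sbin/ntpdate` and that hadoop-senior is the time server (as the shell prompt above suggests); the function name `ntp_cron_line` is hypothetical.

```shell
#!/bin/sh
# Build a crontab line that syncs against the time server every ten minutes.
# The ntpdate path is an assumption; check it with `which ntpdate`.
ntp_cron_line() {
    server="$1"
    echo "*/10 * * * * /usr/sbin/ntpdate $server"
}

# Print the line for the time server used in these notes
ntp_cron_line hadoop-senior.ibeifeng.com
```

The printed line can be added to root's crontab with `crontab -e` on hadoop-senior02 and hadoop-senior03.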
