Hadoop Cluster Setup

1. Cluster Environment Plan
IP         Hostname  NN1  NN2  DN  ResourceManager  NodeManager  ZK
172.*.*.6  master    Y    Y    N   Y                N            Y
172.*.*.7  slave1    N    N    Y   N                Y            Y
172.*.*.8  slave2    N    N    Y   N                Y            Y
172.*.*.9  slave3    N    N    Y   N                Y            Y
2. Create the User and Group
adduser hadoop
passwd hadoop
#Add the user to the hadoop group
usermod -a -G hadoop hadoop
#Grant sudo privileges (safer to edit with visudo)
vi /etc/sudoers
hadoop  ALL=(ALL)       ALL
3. Set the Hostname on master (172.*.*.6)
vi /etc/sysconfig/network
HOSTNAME=master
#Reboot for this to take effect, or apply it immediately with:
hostname master
#Likewise, run the corresponding command on slave1, slave2, and slave3:
hostname slave1
hostname slave2
hostname slave3
4. Map IPs to Hostnames
vi /etc/hosts
172.*.*.6 master
172.*.*.7 slave1
172.*.*.8 slave2
172.*.*.9 slave3
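The four entries above can be staged in a temp file and sanity-checked before touching /etc/hosts; a minimal sketch (the masked IPs from the plan are kept as placeholders):

```shell
#!/bin/sh
# Sketch: stage the IP-to-hostname mappings in a temp file, verify them,
# then append to /etc/hosts by hand. The masked IPs are placeholders.
HOSTS_TMP=$(mktemp)
cat > "$HOSTS_TMP" <<'EOF'
172.*.*.6 master
172.*.*.7 slave1
172.*.*.8 slave2
172.*.*.9 slave3
EOF
wc -l < "$HOSTS_TMP"           # one entry per node, 4 in total
grep -c ' slave' "$HOSTS_TMP"  # 3 worker entries
# After review: sudo sh -c "cat $HOSTS_TMP >> /etc/hosts"
```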
5. Configure Passwordless SSH
#Generate a key pair on each node
ssh-keygen -t rsa
#Append each node's public key to authorized_keys on master
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
#Restrict authorized_keys to mode 600
chmod 600 ~/.ssh/authorized_keys
#The final authorized_keys should look like this:
[root@localhost .ssh]# cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtEvxRj/3xPCtnO38Gy4Y/Y4gj6XX5s+G2hwG5xx19PiDQEKeW3BYUDE616OVdecStBo3X+0Plr2ioirI/3WGlUkm0todr/irpksy0MTpvsjCNUnCWGUHGFMUmrcw1LSiNLhoOSS02AcIq+hw3QJO0w0Wo0EN8xcOhrYwuAByoVv3CvqWd/2Vce2rNOXxLNSmc9tR0Dl3ZqOAq+2a55GM7cETj+eiexDeF5zEVJ2vykQdH3+sZ2XLrQu4WXOMn70xFosk7E1lwJ14QLy6lpfRcWnB1JVKJx9mglze6v3U35g59Vu/LP7t3ebW+dJIOD3/Attb5HcvN8MNfQVOX3JD4w== root@master

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAuU9KJmlmYCx7D+vfMCl2Fj/kz1mfWBrChco0jmZtbygpYY8MUSjmfnsC/wefWKMnFtEruJb+RrgBLxVY6lNzvVKXh+iVPhrjubzj54FoZjepR+1EEznIvwkKa+Y4fkcSJjmcSq/Wvjvz34j3/wVoa1qZtbQing+GzC8Xt0y5rQ6fD1gzD4Oniu43fHAeQDxpo2cVNnTdO2HEe56ZfhIctVRP63rc2CoEuD7d0Ea2WhV0Uruqri/ZKFHVAQQqQ7z/jdCgzTdTXJ5t5hpyeaK8+mYhUKEyOF3xrACW1Is6grUjhbjUxTLt2y2Ytw1d5voFxCUJ6MQcy91KFE/9Lfefyw== root@slave1

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArucVUJdZBYXJD0r9WiX6VnR5S3F7BhoR7hB8UTkXs+WRJGEX9E44yjH+BjIJAPn2v/XwOCdqzSZrGPzLL/BG+XRhGN5NGmdplv8xI3C93hC5kZewRHrHlcAG5Kv4mcHlU+ugcWiyQbIaQvLaFXaq48ZVQHYrzXrz3ZT6QDpsaZtSeW4Z4KWeFmL+AwNyAqxK0nxYXR1zNQJ1r0IdApKmP1WNvbcblB2UKx5G7VMxOs62WY0R9LGdJK6Mmmr5QPlWlpn/g5vXlBvgD80pM6iixFAyz8q19aMQjErTWuULNvX8tdcm+StJV52N8EsiuNMOs+xLVO7L00yxZRtwrXKGgQ== root@slave2
#Copy master's authorized_keys out to each slave
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
scp ~/.ssh/authorized_keys root@slave3:~/.ssh/

#Check passwordless login to each node
ssh slave1
ssh slave2
ssh slave3
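Wrong permissions on ~/.ssh or authorized_keys are the most common reason key-based login silently falls back to passwords. A sketch of the check, run here against a throwaway directory (apply the same stat calls to the real ~/.ssh on each node):

```shell
#!/bin/sh
# Sketch: sshd refuses key auth when ~/.ssh or authorized_keys are too open.
# Demonstrated on a throwaway dir; run the same checks on the real ~/.ssh.
DEMO=$(mktemp -d)/.ssh
mkdir -p "$DEMO"
touch "$DEMO/authorized_keys"
chmod 700 "$DEMO"
chmod 600 "$DEMO/authorized_keys"
stat -c '%a' "$DEMO"                  # 700
stat -c '%a' "$DEMO/authorized_keys"  # 600
```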
6. Extract Hadoop and Configure Environment Variables
tar -zxvf hadoop-2.7.7.tar.gz
vi /etc/profile
export HADOOP_HOME=/opt/middleware/hadoop-2.7.7
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
source /etc/profile
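After `source /etc/profile`, it is worth confirming the variables actually took effect. A sketch using a scratch copy of the profile lines (sbin is checked because the start-*.sh scripts live there, not in bin):

```shell
#!/bin/sh
# Sketch: source a scratch copy of the profile exports and verify PATH.
PROFILE=$(mktemp)
cat > "$PROFILE" <<'EOF'
export HADOOP_HOME=/opt/middleware/hadoop-2.7.7
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
EOF
. "$PROFILE"
echo "$HADOOP_HOME"
case ":$PATH:" in
  *:"$HADOOP_HOME/sbin":*) echo "sbin is on PATH" ;;
  *)                       echo "sbin missing from PATH" ;;
esac
```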
7. Hadoop Configuration
#Create the data directories on the master node
mkdir -p /opt/middleware/hadoop-2.7.7/dfs/{name,data}
mkdir -p /opt/middleware/hadoop-2.7.7/temp

#List the worker nodes in the slaves file
vi /opt/middleware/hadoop-2.7.7/etc/hadoop/slaves
slave1
slave2
slave3
#Edit hadoop-env.sh: replace the default
export JAVA_HOME=${JAVA_HOME}
#with the actual JDK install path, e.g.
export JAVA_HOME=/usr/local/jdk
#Configure core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/middleware/hadoop-2.7.7/temp</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hduser.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hduser.groups</name>
        <value>*</value>
    </property>
</configuration>
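A quick sanity check that the fs.defaultFS value actually in the file is the address the DataNodes will dial. A sketch using sed on the one-property-per-line layout above (xmllint would be more robust if installed):

```shell
#!/bin/sh
# Sketch: extract fs.defaultFS from a core-site.xml fragment with sed.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
</configuration>
EOF
FS=$(sed -n '/<name>fs.defaultFS<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$CONF")
echo "$FS"   # hdfs://master:9000
```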
#Configure hdfs-site.xml
<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>rsmshadoop</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/middleware/hadoop-2.7.7/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/middleware/hadoop-2.7.7/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
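Note that dfs.replication is set to 1 even though the plan has three DataNodes, so each block will exist on a single node with no redundancy. A small check comparing the two values (a sketch, with the numbers inlined from the config and slaves file above):

```shell
#!/bin/sh
# Sketch: dfs.replication must not exceed the DataNode count, or writes
# under-replicate. Values inlined from the config above.
REPLICATION=1
DATANODES=$(printf 'slave1\nslave2\nslave3\n' | wc -l)
echo "$DATANODES"   # 3
if [ "$REPLICATION" -le "$DATANODES" ]; then
  echo "replication factor ok"
else
  echo "replication exceeds datanode count"
fi
```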
#Configure mapred-site.xml (copy mapred-site.xml.template to mapred-site.xml first if it does not exist)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
#Configure yarn-site.xml (yarn.nodemanager.aux-services is required for the MapReduce shuffle to run on YARN)
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>
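The ResourceManager endpoints configured above can be pulled out for a quick port inventory, e.g. before opening the firewall. A grep-based sketch with the values inlined from yarn-site.xml:

```shell
#!/bin/sh
# Sketch: list every master:port endpoint configured in yarn-site.xml.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<value>master:8032</value>
<value>master:8030</value>
<value>master:8031</value>
<value>master:8033</value>
<value>master:8088</value>
EOF
grep -o 'master:[0-9]*' "$CONF"
grep -o 'master:[0-9]*' "$CONF" | wc -l   # 5 endpoints
```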
8. Run Hadoop
#Format the NameNode (run this once only, on master)
/opt/middleware/hadoop-2.7.7/bin/hdfs namenode -format
Start the cluster
/opt/middleware/hadoop-2.7.7/sbin/start-all.sh
[root@localhost sbin]# sh start-dfs.sh
which: no start-dfs.sh in (/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/jdk/bin:/opt/middleware/mongodb/bin:/opt/middleware/hadoop-2.7.7/bin:/root/bin:/usr/local/jdk/bin:/opt/middleware/mongodb/bin:/opt/middleware/hadoop-2.7.7/bin)
19/01/17 18:38:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
The authenticity of host 'master (172.*.*.6)' can't be established.
RSA key fingerprint is a0:47:1b:35:a9:f1:e7:0d:81:6d:8b:f4:47:95:f9:96.
Are you sure you want to continue connecting (yes/no)? yes
master: Warning: Permanently added 'master,172.*.*.6' (RSA) to the list of known hosts.
master: starting namenode, logging to /opt/middleware/hadoop-2.7.7/logs/hadoop-root-namenode-master.out
slave2: starting datanode, logging to /opt/middleware/hadoop-2.7.7/logs/hadoop-root-datanode-slave2.out
slave1: starting datanode, logging to /opt/middleware/hadoop-2.7.7/logs/hadoop-root-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/middleware/hadoop-2.7.7/logs/hadoop-root-secondarynamenode-master.out
19/01/17 18:38:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Verify the cluster
hadoop jar /opt/middleware/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar pi 10 10
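Besides running the pi job, `hdfs dfsadmin -report` shows how many DataNodes actually registered with the NameNode. A sketch that counts live nodes from assumed sample output (the real command needs the running cluster):

```shell
#!/bin/sh
# Sketch: count registered DataNodes from dfsadmin -report output.
# The report below is an assumed sample; on the live cluster run:
#   hdfs dfsadmin -report | grep -c '^Name:'
REPORT=$(mktemp)
cat > "$REPORT" <<'EOF'
Live datanodes (3):
Name: 172.*.*.7:50010 (slave1)
Name: 172.*.*.8:50010 (slave2)
Name: 172.*.*.9:50010 (slave3)
EOF
LIVE=$(grep -c '^Name:' "$REPORT")
echo "$LIVE"   # 3, one per worker in the slaves file
```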
Monitoring page

http://172.*.*.6:50070

Problems Encountered

1. Some DataNodes fail to start
Cause: the format command was run more than once, leaving the cluster IDs inconsistent
/opt/middleware/hadoop-2.7.7/bin/hdfs namenode -format
Solution:
Copy the clusterID from the NameNode (master) to each DataNode
#Open the NameNode's VERSION file
vi /opt/middleware/hadoop-2.7.7/dfs/name/current/VERSION
...
clusterID=CID-45f7aaaf-424a-472c-9cb5-827a9d18906e
#Open the DataNode's VERSION file and set clusterID to the same value
vi /opt/middleware/hadoop-2.7.7/dfs/data/current/VERSION
...
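The manual edit above can be scripted: read clusterID from the NameNode's VERSION and rewrite it in each DataNode's copy. A sketch against scratch files (on the real nodes the files live under the dfs/name and dfs/data directories shown above):

```shell
#!/bin/sh
# Sketch: copy the NameNode clusterID into a DataNode VERSION file.
# Demonstrated on scratch files; the real paths are
# /opt/middleware/hadoop-2.7.7/dfs/{name,data}/current/VERSION.
NN_VER=$(mktemp); DN_VER=$(mktemp)
echo 'clusterID=CID-45f7aaaf-424a-472c-9cb5-827a9d18906e' > "$NN_VER"
echo 'clusterID=CID-stale-00000000' > "$DN_VER"
CID=$(sed -n 's/^clusterID=//p' "$NN_VER")
sed -i "s/^clusterID=.*/clusterID=$CID/" "$DN_VER"
cat "$DN_VER"   # clusterID=CID-45f7aaaf-424a-472c-9cb5-827a9d18906e
```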
2. The NameNode fails to start
There appears to be a gap in the edit log.  We expected txid 1, but got txid 37309
Cause: the edit-log metadata is corrupted
Solution: hadoop namenode -recover