Hadoop 2.7.7 Installation and Cluster Setup (also applies to Hadoop 3.1.2 and Docker containers)

Prepare three machines (nodes): hadoop2 (master), hadoop3, and hadoop4.

  1. vi /etc/profile.d/hadoop.sh

    export JAVA_HOME=/usr/local/src/jdk1.8.0_92
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:$CLASSPATH
    export JAVA_PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin
    export PATH=$PATH:${JAVA_PATH}
    
    
    export HADOOP_HOME=/usr/local/src/hadoop-2.7.7
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    export HDFS_DATANODE_USER=root
    export HDFS_DATANODE_SECURE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
    export HDFS_NAMENODE_USER=root
    export YARN_RESOURCEMANAGER_USER=root
    export YARN_NODEMANAGER_USER=root

    At least one of mapred-env.sh, hadoop-env.sh, and yarn-env.sh must set JAVA_HOME.
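
    To verify the environment, a quick check (it assumes the JDK and Hadoop tarballs are already unpacked at the paths above):

    source /etc/profile.d/hadoop.sh
    java -version         # expect java version "1.8.0_92"
    hadoop version        # expect Hadoop 2.7.7 (or 3.1.2)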

  2. core-site.xml: configure the HDFS address and port, and the directory for temporary files.

    For more options, see the core-site.xml reference.

    <configuration>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://hadoop2:9091</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/data/docker/hadoop/tmp</value>
        </property>
    </configuration>
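
    Note that fs.default.name is the deprecated name of fs.defaultFS; both still work, but fs.defaultFS is preferred in current releases. A quick check of the effective value (reads the config files, no daemons needed):

    hdfs getconf -confKey fs.defaultFS    # expect hdfs://hadoop2:9091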
  3. hdfs-site.xml: configure HDFS component properties, the replication factor, and the data storage paths.

    For more options, see the hdfs-site.xml reference.

    dfs.namenode.name.dir and dfs.datanode.data.dir no longer need to be set separately (by default they live under hadoop.tmp.dir); the values suggested on the official site are a higher-spec configuration aimed at larger clusters.

    <font color=red>Note: these directories are local to each machine; do not place them on the shared volume mounted with volumes-from data_docker.</font>

    Run on all three machines:

    mkdir -p /opt/hadoop/tmp && mkdir -p /opt/hadoop/dfs/data && mkdir -p /opt/hadoop/dfs/name

    <configuration>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>hadoop2:9092</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/opt/hadoop/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/opt/hadoop/dfs/data</value>
        </property>
        <property>
            <name>dfs.namenode.handler.count</name>
            <value>100</value>
        </property>
    </configuration>
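
    A quick sanity check of the values Hadoop will actually use:

    hdfs getconf -confKey dfs.replication           # expect 2
    hdfs getconf -confKey dfs.namenode.name.dir     # expect file:/opt/hadoop/dfs/name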
  4. mapred-site.xml: configure MapReduce jobs to run on the YARN framework.

    For more options, see the mapred-site.xml reference.

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>hadoop2:9094</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>hadoop2:9095</value>
        </property>
        <property>
            <name>mapreduce.application.classpath</name>
            <value>
                /usr/local/src/hadoop-3.1.2/etc/hadoop,
                /usr/local/src/hadoop-3.1.2/share/hadoop/common/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/common/lib/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/hdfs/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/hdfs/lib/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/mapreduce/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/mapreduce/lib/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/yarn/*,
                /usr/local/src/hadoop-3.1.2/share/hadoop/yarn/lib/*
            </value>
        </property>
    </configuration>
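
    The directory list above is for Hadoop 3.1.2, which requires mapreduce.application.classpath for jobs to find their classes. Rather than hard-coding every path, you can generate the value from the installation itself (a sketch):

    hadoop classpath    # prints the classpath in use; its output can be pasted in as the property value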
  5. yarn-site.xml

    For more options, see the yarn-site.xml reference.

    <configuration>
      <property>
          <name>yarn.resourcemanager.hostname</name>
          <value>hadoop2</value>    <!-- the master's hostname (a raw container ID here only works before the hostname is set) -->
      </property>
      <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
      </property>
      <property>
          <name>yarn.resourcemanager.webapp.address</name>
          <value>hadoop2:9093</value>
      </property>
      <property>
          <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
          <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
    </configuration>
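
    The ShuffleHandler class configured above ships in the MapReduce jars, so it must be visible on the NodeManager classpath; a quick way to check (a sketch):

    yarn classpath | tr ':' '\n' | grep mapreduce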
  6. Configure passwordless SSH login

    yum -y install openssh-server openssh-clients
    
    # generate host keys (quiet, empty passphrase)
    ssh-keygen -q -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
    ssh-keygen -q -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ''
    ssh-keygen -q -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ''
    # generate the user key pair; -P '' makes it non-interactive
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    
    # in ~/.ssh:
    cp id_rsa.pub authorized_keys
    cp authorized_keys /data/docker/hadoop/    # copy to the shared volume
    
    # in the other containers:
    # 1. repeat steps 1-4 above
    # 2. then on hadoop3 and hadoop4:
    cp /data/docker/hadoop/authorized_keys ~/.ssh
    cat id_rsa.pub >> authorized_keys
    cp authorized_keys /data/docker/hadoop/authorized_keys   # overwrite
    
    # back in the hadoop2 container:
    cp /data/docker/hadoop/authorized_keys authorized_keys   # overwrite, so every node trusts every key
    
    # test
    # start sshd on hadoop3 and hadoop4
    /usr/sbin/sshd
    
    ssh root@hadoop3
    ssh root@hadoop4
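
    As a quick check (a sketch; it assumes sshd is running on all three nodes), every host should be reachable without a password prompt:

    for h in hadoop2 hadoop3 hadoop4; do
        ssh -o StrictHostKeyChecking=no root@$h hostname    # auto-accepts the host key on first contact
    done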
  7. Configure /etc/hosts; the IPs must match the containers' actual addresses (compare the --add-host flags in the docker run commands below)

    172.17.0.9    hadoop2
    172.17.0.10    hadoop3
    172.17.0.11    hadoop4
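
    A sketch for appending these entries inside a container (note that Docker manages /etc/hosts, so the --add-host flags shown below are the more durable way to do this):

    cat >> /etc/hosts <<'EOF'
    172.17.0.9    hadoop2
    172.17.0.10   hadoop3
    172.17.0.11   hadoop4
    EOF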
  8. Configure workers to define the worker nodes

    vi /usr/local/src/hadoop-3.1.2/etc/hadoop/workers (in version 2.7 the file is named slaves)

    hadoop2                     # this node is both a namenode and a datanode; no need to waste a machine
    hadoop3                     # datanode only
    hadoop4                     # datanode only
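
    A one-liner to write the file (a sketch; use etc/hadoop/slaves for Hadoop 2.7, etc/hadoop/workers for 3.x):

    printf 'hadoop2\nhadoop3\nhadoop4\n' > /usr/local/src/hadoop-3.1.2/etc/hadoop/workers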
  9. Stop the Docker container and commit it as an image

    172.17.0.0/24: usable IPs 172.17.0.1-172.17.0.254 (254 hosts out of 256 addresses), netmask 255.255.255.0

    172.17.0.0/16: usable IPs 172.17.0.1-172.17.255.254, 65536 addresses in total, netmask 255.255.0.0

docker commit hadoop2 image_c

docker run --privileged -tdi --volumes-from data_docker --name hadoop2 --hostname hadoop2 --add-host hadoop2:172.17.0.8 --add-host hadoop3:172.17.0.9 --add-host hadoop4:172.17.0.10 --link mysqlcontainer:mysqlcontainer  -p 5002:22 -p 8088:8088 -p 9090:9090 -p 9091:9091  -p 9092:9092  -p 9093:9093  -p 9094:9094  -p 9095:9095  -p 9096:9096  -p 9097:9097  -p 9098:9098  -p 9099:9099 centos:hadoop /bin/bash 

docker run --privileged -tdi --volumes-from data_docker --name hadoop3 --hostname hadoop3 --add-host hadoop2:172.17.0.8 --add-host hadoop3:172.17.0.9 --add-host hadoop4:172.17.0.10 --link mysqlcontainer:mysqlcontainer  -p 5003:22  centos:hadoop /bin/bash 

docker run --privileged -tdi --volumes-from data_docker --name hadoop4 --hostname hadoop4 --add-host hadoop2:172.17.0.8 --add-host hadoop3:172.17.0.9 --add-host hadoop4:172.17.0.10 --link mysqlcontainer:mysqlcontainer  -p 5004:22  centos:hadoop /bin/bash
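
After the three containers are up, a quick connectivity check (a sketch; it assumes the containers were started with the --add-host entries above):

    docker exec hadoop2 ping -c 1 hadoop3
    docker exec hadoop2 ping -c 1 hadoop4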
  10. Start the cluster

    On the first start only, format the NameNode: hdfs namenode -format

    Near the end of the output you should see: util.ExitUtil: Exiting with status 0

    start-all.sh warns "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh", so run those two scripts instead.

#start-dfs.sh----------------------
# jps on the master shows the following processes:
5252 DataNode
5126 NameNode
5547 Jps
5423 SecondaryNameNode

# jps on a slave shows:
1131 Jps
1052 DataNode
# start-yarn.sh------------------
# jps on the master shows the following processes:
5890 NodeManager
5252 DataNode
5126 NameNode
6009 Jps
5423 SecondaryNameNode
5615 ResourceManager

# jps on a slave shows:
1177 NodeManager
1052 DataNode
1309 Jps
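
To confirm that the NodeManagers registered with the ResourceManager, run on the master (it should report all three nodes as RUNNING):

    yarn node -list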

Access the web UIs:

http://hadoop2:9093 (YARN ResourceManager, per yarn.resourcemanager.webapp.address)

http://hadoop2:9092 (HDFS NameNode, per dfs.namenode.http-address)

Trying out Hadoop

Prepare a test file:

cat test.txt 

hadoop mapreduce hive
hbase spark storm
sqoop hadoop hive
spark hadoop


#hdfs dfs            # shows the help
# create a directory in HDFS
hadoop fs -mkdir /input
hadoop fs -ls /
# upload the file
hadoop fs -put test.txt /input
hadoop fs -ls /input
# run the wordcount program that ships with Hadoop
# hadoop-mapreduce-examples-2.7.7.jar contains many small example programs
yarn jar /usr/local/src/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /input/test.txt /output
hadoop fs -ls /output
-rw-r--r--   2 root supergroup          0 2019-06-03 01:28 /output/_SUCCESS
-rw-r--r--   2 root supergroup         60 2019-06-03 01:28 /output/part-r-00000
# view the result
hadoop fs -cat /output/part-r-00000

# list the other built-in example programs
hadoop jar /usr/local/src/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar
# shows the usage of grep
hadoop jar /usr/local/src/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar grep
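
For example, the grep job takes an input directory, an output directory, and a regex, and counts the matching words (a sketch; /grep-output is an arbitrary name):

hadoop jar /usr/local/src/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar grep /input /grep-output 'hadoop'
hadoop fs -cat /grep-output/part-r-00000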

Job information is visible at http://hadoop2:9093.

Other Hadoop commands

# check capacity
hadoop fs -df -h
Filesystem              Size   Used  Available  Use%
hdfs://hadoop2:9091  150.1 G  412 K    129.9 G    0%
# check the status of each node
hdfs dfsadmin -report

This article was originally produced by 海暢智慧 (http://www.hichannel.net); please credit the source when reposting.
