Building a Hadoop 3.2.0 Cluster with Docker

1. Pull the CentOS image with Docker
Images are fetched from a Docker registry with docker pull. Its format is:
docker pull [options] [Docker Registry address[:port]/]repository[:tag]
You can pull the image directly with docker pull centos:7.
Once the download finishes, list your local images with docker image ls:
[hadoop@localhost ~]$ docker image ls
REPOSITORY              TAG      IMAGE ID       CREATED        SIZE
docker.io/centos        latest   2d194b392dd1   3 weeks ago    195 MB
docker.io/hello-world   latest   f2a91732366c   4 months ago   1.85 kB
One is the centos image; the other is the hello-world image we downloaded earlier with docker run hello-world.
An image and a container relate to each other like a class and an instance in object-oriented programming: the image is the static definition, and the container is a running instance of it. Containers can be created, started, stopped, deleted, paused, and so on.
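As a quick illustration of that lifecycle (the container name my-centos is arbitrary):
docker create -it --name my-centos centos:7 bash   # create a container from the image without starting it
docker start my-centos                             # start it
docker stop my-centos                              # stop it
docker rm my-centos                                # delete it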
2. Run a container
With the image in place, we can start a container based on it.
[hadoop@localhost ~]$ docker run -it --rm centos bash
[root@58f67e873eb9 /]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
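Note that --rm deletes the container as soon as the shell exits, which is fine for a quick test like the one above. The cluster build below needs containers that survive so they can be committed to images, so omit --rm; the name hadoop-base here is just an example:
docker run -it --name hadoop-base centos:7 bash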
 
3. Install Java
Before installing, check whether the system already ships with OpenJDK:
rpm -qa | grep java
rpm -qa | grep jdk
rpm -qa | grep gcj
If these print nothing, no JDK is installed.
If one is installed, you can remove every Java-related package in one pass with rpm -qa | grep java | xargs rpm -e --nodeps (the keyword filtered on is java).
First list the available packages whose names start with java:
yum list java*
Narrow the list to 1.8:
yum list java-1.8*
Install all 1.8.0 packages:
yum install java-1.8.0-openjdk* -y
Check that the installation succeeded:
java -version
Alternatively, download the Oracle JDK with wget:
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u201-b09/42970487e3af4f5aa5bca3f542482c60/jdk-8u201-linux-x64.tar.gz"
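If you take the Oracle tarball route, unpack it where JAVA_HOME will point later; the target directory below is an assumption chosen to match the JAVA_HOME set in the next section:
mkdir -p /usr/java
tar -zxvf jdk-8u201-linux-x64.tar.gz -C /usr/java
# the archive unpacks to /usr/java/jdk1.8.0_201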
4. Download Hadoop 3.2.0
mkdir /usr/hadoop/
cd /usr/hadoop/
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
tar -zxvf hadoop-3.2.0.tar.gz
5. Configure environment variables
vim /etc/profile
Append the following:
#JAVA VARIABLES START
export JAVA_HOME=/usr/java/jdk1.8.0_201
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#JAVA VARIABLES END
 
#HADOOP VARIABLES START
export HADOOP_HOME=/usr/hadoop/hadoop-3.2.0
#export HADOOP_INSTALL=$HADOOP_HOME
#export HADOOP_MAPRED_HOME=$HADOOP_HOME
#export HADOOP_COMMON_HOME=$HADOOP_HOME
#export HADOOP_HDFS_HOME=$HADOOP_HOME
#export YARN_HOME=$HADOOP_HOME
#export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
#export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
#HADOOP VARIABLES END 
Then apply it: source /etc/profile
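A quick sanity check (not in the original) that the variables took effect:
echo $JAVA_HOME
hadoop version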
 
Save the container's current state as an image: docker commit 6ebd4423e2de hadoop-master
6. Configure Hadoop
Edit the Hadoop configuration files under /usr/hadoop/hadoop-3.2.0/etc/hadoop/.
1) core-site.xml (the old fs.default.name key is deprecated in Hadoop 3; fs.defaultFS replaces it)
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/hadoop-3.2.0/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
  </property>
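Each <property> snippet in this section belongs inside the file's <configuration> root element, so the complete file has this shape:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- the <property> blocks shown above go here -->
</configuration>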
 
2) hdfs-site.xml
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/hadoop/datanode</value>
  </property>
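The format step in 5) creates the NameNode directory itself, but it does no harm to create both storage directories up front so their paths verifiably match the values above:
mkdir -p /usr/hadoop/namenode /usr/hadoop/datanode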
 
3) mapred-site.xml (Hadoop 3 has no JobTracker, so the old mapred.job.tracker key is ignored; what a Hadoop 3 cluster needs instead is MapReduce on YARN)
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
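The original stops at mapred-site.xml, but since the ResourceManager and NodeManagers are started later, a minimal yarn-site.xml pointing the NodeManagers at the master is usually needed as well; a sketch:
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>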
 
4) Set JAVA_HOME in hadoop-env.sh
vim /usr/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh
Set export JAVA_HOME=/usr/java/jdk1.8.0_201
 
5) Format the NameNode
cd /usr/hadoop/hadoop-3.2.0/bin
hdfs namenode -format
(hadoop namenode -format still works but is deprecated in favor of hdfs namenode -format.)
 
Install SSH
Check whether it is installed: rpm -qa | grep ssh
Install it: yum install openssh*
Set up passwordless SSH login on CentOS 7:
1. Generate a key pair: ssh-keygen -t rsa
2. Append the public key to the authorized-keys file:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
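sshd is strict about key-file permissions, so if passwordless login still prompts for a password, tighten them and test (a standard check, not from the original):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh localhost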
 
Commit the container to a new image:
docker commit b243b3926f0a hadoop-basic
Now create master, slave1 and slave2, all from the hadoop-basic image committed above.
Run the following commands (Hadoop 3 moved the NameNode web UI from port 50070 to 9870, so that is the port worth publishing):
docker run -p 9870:9870 -p 19888:19888 -p 8088:8088 --name master -ti -h master hadoop-basic
docker run -it -h slave1 --name slave1 hadoop-basic /bin/bash
docker run -it -h slave2 --name slave2 hadoop-basic /bin/bash
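For the daemons to find one another by name, each container needs all three hosts in its /etc/hosts, and the master must list the slaves in the workers file (Hadoop 3's replacement for the old slaves file). The IPs below are examples; check the real ones with docker inspect:
On every container:
echo "172.17.0.2 master" >> /etc/hosts
echo "172.17.0.3 slave1" >> /etc/hosts
echo "172.17.0.4 slave2" >> /etc/hosts
On the master:
vim /usr/hadoop/hadoop-3.2.0/etc/hadoop/workers
slave1
slave2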
 
Use hdfs dfsadmin -report to check that the DataNodes started correctly.
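The report only shows live DataNodes once the daemons are running; start them from the master first (standard start scripts, assuming the Problem 1 edits below):
start-dfs.sh
start-yarn.sh
hdfs dfsadmin -report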
 
Troubleshooting
Problem 1:
    Starting namenodes on [localhost]
    ERROR: Attempting to operate on hdfs namenode as root
    ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
    Starting datanodes
    ERROR: Attempting to operate on hdfs datanode as root
    ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
    Starting secondary namenodes [bogon]
    ERROR: Attempting to operate on hdfs secondarynamenode as root
    ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Fix 1:
    $ vim sbin/start-dfs.sh
    $ vim sbin/stop-dfs.sh
Add the following to both files:
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Fix 2:
    $ vim sbin/start-yarn.sh
    $ vim sbin/stop-yarn.sh
Add the following to both files:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
 
Problem 2:
localhost: ssh: connect to host localhost port 22: Cannot assign requested address
cd /etc/ssh
vim sshd_config
Add the line Port 22
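Then restart sshd so the change takes effect; inside a container without systemd, launching the daemon directly also works:
systemctl restart sshd    # or: /usr/sbin/sshd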
Problem 3:
Failed to get D-Bus connection: Operation not permitted
Fix: docker run --privileged -ti -e "container=docker" -v /sys/fs/cgroup:/sys/fs/cgroup hadoop-master /usr/sbin/init
 
Problem 4:
sshd re-exec requires execution with an absolute path
This error appears when starting the sshd service. Starting it via the absolute path instead fails with:
Could not load host key: /etc/ssh/ssh_host_key
Could not load host key: /etc/ssh/ssh_host_rsa_key
Could not load host key: /etc/ssh/ssh_host_dsa_key
Disabling protocol version 1. Could not load host key
Disabling protocol version 2. Could not load host key
sshd: no hostkeys available -- exiting
解決過程:
#ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
#ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key
#/usr/sbin/sshd
sshd then still complains:
Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Could not load host key: /etc/ssh/ssh_host_ed25519_key
Generate those key types as well:
# ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key
# ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key
# /usr/sbin/sshd
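On CentOS 7 a single command generates every missing host key type in one step, which avoids this whole back-and-forth:
ssh-keygen -A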
 
Hadoop cluster setup
 
 
Problem 5:
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [master]
master: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting datanodes
Last login: Mon Jan 28 08:32:32 UTC 2019 on pts/0
localhost: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting secondary namenodes [b982e2adc393]
Last login: Mon Jan 28 08:32:33 UTC 2019 on pts/0
b982e2adc393: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting resourcemanager
Last login: Mon Jan 28 08:32:35 UTC 2019 on pts/0
Starting nodemanagers
Last login: Mon Jan 28 08:32:42 UTC 2019 on pts/0
localhost: /usr/hadoop/hadoop-3.2.0/libexec/hadoop-functions.sh: line 982: ssh: command not found
 
Fix:
    $ vim sbin/start-yarn.sh
    $ vim sbin/stop-yarn.sh
In both files, replace HADOOP_SECURE_DN_USER=yarn (added under Problem 1) with HDFS_DATANODE_SECURE_USER=yarn, the variable named in the warning.
The ssh: command not found errors occur because the image has the ssh server installed but not the client.
Check what is installed:
# rpm -qa | grep openssh
openssh-5.3p1-123.el6_9.x86_64
openssh-server-5.3p1-123.el6_9.x86_64
openssh-clients is missing; install it with yum:
yum -y install openssh-clients
 
 
Problem 6: Failed to get D-Bus connection: Operation not permitted
Problem 7: docker: Error response from daemon: cgroups: cannot find cgroup mount destination: unknown.
I did not find a specific fix for these; after a restart everything was accessible again.
 
Problem 8: Datanode denied communication with namenode because hostname cannot be resolved
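The original leaves this one open. A common cause, noted here as a suggestion rather than the author's fix, is that the NameNode cannot resolve the DataNodes' hostnames; adding every node to /etc/hosts as shown earlier usually clears it, and as a last resort the check can be disabled in hdfs-site.xml:
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>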