1. Server Overview
hostname | ip | description
---|---|---
nn01 | 192.168.56.101 | name node
nn02 | 192.168.56.102 | name node
dn01 | 192.168.56.103 | data node
dn02 | 192.168.56.104 | data node
dn03 | 192.168.56.105 | data node
 | nn01 | nn02 | dn01 | dn02 | dn03
---|---|---|---|---|---
NameNode | √ | √ | | |
DataNode | | | √ | √ | √
ResourceManager | √ | √ | | |
NodeManager | √ | √ | √ | √ | √
Zookeeper | √ | √ | √ | √ | √
journalnode | √ | √ | √ | √ | √
zkfc | √ | √ | | |
Run the following commands on each of the five servers:
```
# Add host entries
[root@nn01 ~]# vim /etc/hosts
192.168.56.101 nn01
192.168.56.102 nn02
192.168.56.103 dn01
192.168.56.104 dn02
192.168.56.105 dn03

# Stop and disable the firewall
[root@nn01 ~]# systemctl stop firewalld && systemctl disable firewalld
[root@nn01 ~]# setenforce 0

# Set SELINUX to disabled
[root@nn01 ~]# vim /etc/selinux/config
SELINUX=disabled

# Reboot the server
[root@nn01 ~]# reboot
```
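A quick sanity check, assuming the /etc/hosts entries above are in place on every machine; each hostname should resolve, the firewall should be inactive, and SELinux should be off:

```
# Hypothetical check; hostnames are the ones from the table in section 1
for h in nn01 nn02 dn01 dn02 dn03; do
  getent hosts "$h" || echo "$h does not resolve"
done
systemctl is-active firewalld   # should print "inactive"
getenforce                      # "Permissive" before the reboot, "Disabled" after
```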
2. JDK Installation
```
# Configure environment variables
[root@nn01 ~]# vim /etc/profile

# Append at the end of the file
# Java Environment Path
export JAVA_HOME=/opt/java/jdk1.8.0_172
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

# Reload the profile
source /etc/profile
```
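To confirm the JDK and the environment variables took effect, a minimal check, assuming the JDK was unpacked to /opt/java/jdk1.8.0_172 as above:

```
source /etc/profile
echo $JAVA_HOME    # expect /opt/java/jdk1.8.0_172
java -version      # expect java version "1.8.0_172"
which java         # expect /opt/java/jdk1.8.0_172/bin/java
```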
3. Configure Passwordless SSH Login
```
# Run the following on nn01
# Generate a key pair; just press Enter at every prompt. The keys are
# written to ~/.ssh
[root@nn01 ~]# ssh-keygen -t rsa
# Copy the public key out (repeat for nn02, dn01, dn02, dn03)
[root@nn01 .ssh]# scp /root/.ssh/id_rsa.pub root@nn01:~
[root@nn01 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys

# Run the following on nn02
[root@nn02 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys

# Run the following on nn02, dn01, dn02, dn03
[root@nn02 ~]# mkdir -p ~/.ssh
[root@nn02 ~]# cd .ssh/
[root@nn02 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys
[root@nn02 .ssh]# vim /etc/ssh/sshd_config
# Enable these if you log in as the root user
PermitRootLogin yes
PubkeyAuthentication yes
```
Passwordless login must work using both IPs and hostnames:
1) each NameNode can log in to every DataNode without a password;
2) each NameNode can log in to itself without a password;
3) the NameNodes can log in to each other without a password;
4) each DataNode can log in to itself without a password;
5) DataNodes do not need passwordless login to the NameNodes or to the other DataNodes.

A loop such as the sketch below verifies all of this quickly.
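This loop is a sketch assuming the key distribution just described; `-o BatchMode=yes` makes ssh fail instead of prompting, so any leftover password prompt shows up as a FAILED line:

```
# Run on nn01 (and again on nn02 after configuring it below)
for host in nn01 nn02 dn01 dn02 dn03 \
            192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104 192.168.56.105; do
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" hostname \
    || echo "passwordless login to $host FAILED"
done
```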
Likewise, configure nn02 for passwordless login to nn01, dn01, dn02, and dn03.
4. ZooKeeper Installation

```
mkdir -p /opt/zookeeper/
cd /opt/zookeeper/
tar -zxvf zookeeper-3.4.13.tar.gz
cd zookeeper-3.4.13/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
```
zoo.cfg
```
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=nn01:2888:3888
server.2=nn02:2888:3888
server.3=dn01:2888:3888
server.4=dn02:2888:3888
server.5=dn03:2888:3888
```
Basic configuration:

tickTime
The basic heartbeat time unit, in milliseconds. Nearly every ZooKeeper time setting is an integer multiple of this value.

initLimit
A count of ticks: the time allowed for the followers to sync with the leader after a leader election finishes. If there are many followers, or the leader holds a great deal of data, the sync can take longer and this value should be raised accordingly. It is also the maximum wait time (setSoTimeout) for followers and observers when they begin syncing the leader's data.

syncLimit
Also a count of ticks, easily confused with the one above. It is likewise the maximum wait time for followers and observers interacting with the leader, but after the initial sync has completed, i.e. the timeout for normal request forwarding, pings, and other message exchanges.

dataDir
Where the in-memory database snapshots are stored. If no separate transaction-log directory (dataLogDir) is specified, transaction logs are stored here as well; keeping the two on different devices is recommended.

clientPort
The port on which ZooKeeper listens for client connections.

server.serverid=host:tickport:electionport

server: fixed literal
serverid: the ID assigned to each server (must be between 1 and 255 and unique across machines)
host: hostname
tickport: heartbeat/communication port
electionport: leader-election port
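With the values in the zoo.cfg above, the effective time windows work out as follows:

```
# Effective windows with tickTime=2000, initLimit=10, syncLimit=5:
#   initLimit: 10 ticks × 2000 ms = 20 s for the initial follower sync
#   syncLimit:  5 ticks × 2000 ms = 10 s maximum follower lag in normal operation
```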
```
# Create directories
mkdir -p /opt/data/zookeeper
mkdir -p /opt/data/logs/zookeeper
touch /opt/data/zookeeper/myid

# Copy to the other hosts (repeat for dn01, dn02, dn03)
scp -r /opt/zookeeper root@nn02:/opt/
scp -r /opt/data/zookeeper root@nn02:/opt/data/
scp -r /opt/data/logs/zookeeper root@nn02:/opt/data/logs/

# On nn01
echo 1 > /opt/data/zookeeper/myid
# On nn02
echo 2 > /opt/data/zookeeper/myid
# On dn01
echo 3 > /opt/data/zookeeper/myid
# On dn02
echo 4 > /opt/data/zookeeper/myid
# On dn03
echo 5 > /opt/data/zookeeper/myid
```
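Equivalently, the per-host myid values can be written in one loop from nn01; a sketch that assumes the passwordless SSH from section 3 and that /opt/data/zookeeper already exists on every host:

```
i=1
for h in nn01 nn02 dn01 dn02 dn03; do
  ssh "$h" "echo $i > /opt/data/zookeeper/myid"   # $i expands locally
  i=$((i + 1))
done
```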
```
# Add environment variables (in /etc/profile)
export ZOOKEEPER_HOME=/opt/zookeeper/zookeeper-3.4.13
export PATH=$ZOOKEEPER_HOME/bin:$PATH
source /etc/profile
```
5. Hadoop Installation

1. Download Hadoop
```
mkdir -p /opt/hadoop/
cd /opt/hadoop
tar -xf hadoop-3.1.1.tar.gz

## Set environment variables (in /etc/profile)
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.1
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

# Create directories
mkdir -p /opt/data/logs/hadoop
mkdir -p /opt/data/hadoop/hdfs/nn
mkdir -p /opt/data/hadoop/hdfs/dn
mkdir -p /opt/data/hadoop/hdfs/jn
```
Edit the configuration file /opt/hadoop/hadoop-3.1.1/etc/hadoop/hadoop-env.sh:
```
## Add at the beginning of the file; size the JVM heap to fit your servers
export JAVA_HOME=/opt/java/jdk1.8.0_172
export HADOOP_NAMENODE_OPTS=" -Xms1024m -Xmx1024m -XX:+UseParallelGC"
export HADOOP_DATANODE_OPTS=" -Xms512m -Xmx512m"
export HADOOP_LOG_DIR=/opt/data/logs/hadoop
```
/opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- 指定hdfs的nameservice爲mycluster --> <property> <name>fs.defaultFS</name> <value>hdfs://mycluster</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/opt/data/hadoop/tmp</value> </property> <!-- 指定zookeeper地址 --> <property> <name>ha.zookeeper.quorum</name> <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value> </property> <!-- hadoop連接zookeeper的超時時長設置 --> <property> <name>ha.zookeeper.session-timeout.ms</name> <value>30000</value> <description>ms</description> </property> <property> <name>fs.trash.interval</name> <value>1440</value> </property> </configuration>
/opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- journalnode集羣之間通訊的超時時間 --> <property> <name>dfs.qjournal.start-segment.timeout.ms</name> <value>60000</value> </property> <!--指定hdfs的nameservice爲mycluster,須要和core-site.xml中的保持一致 dfs.ha.namenodes.[nameservice id]爲在nameservice中的每個NameNode設置惟一標示符。 配置一個逗號分隔的NameNode ID列表。這將是被DataNode識別爲全部的NameNode。 例如,若是使用"mycluster"做爲nameservice ID,而且使用"nn01"和"nn02"做爲NameNodes標示符 --> <property> <name>dfs.nameservices</name> <value>mycluster</value> </property> <!-- mycluster下面有兩個NameNode,分別是nn01,nn02 --> <property> <name>dfs.ha.namenodes.mycluster</name> <value>nn01,nn02</value> </property> <!-- nn01的RPC通訊地址 --> <property> <name>dfs.namenode.rpc-address.mycluster.nn01</name> <value>nn01:8020</value> </property> <!-- nn02的RPC通訊地址 --> <property> <name>dfs.namenode.rpc-address.mycluster.nn02</name> <value>nn02:8020</value> </property> <!-- nn01的http通訊地址 --> <property> <name>dfs.namenode.http-address.mycluster.nn01</name> <value>nn01:50070</value> </property> <!-- nn02的http通訊地址 --> <property> <name>dfs.namenode.http-address.mycluster.nn02</name> <value>nn02:50070</value> </property> <!-- 指定NameNode的edits元數據的共享存儲位置。也就是JournalNode列表 該url的配置格式:qjournal://host1:port1;host2:port2;host3:port3/journalId journalId推薦使用nameservice,默認端口號是:8485 --> <property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://nn01:8485;nn02:8485;dn01:8485;dn02:8485;dn03:8485/mycluster</value> </property> <!-- 配置失敗自動切換實現方式 --> <property> <name>dfs.client.failover.proxy.provider.mycluster</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> <!-- 配置隔離機制方法,多個機制用換行分割,即每一個機制暫用一行 --> <property> <name>dfs.ha.fencing.methods</name> <value> sshfence shell(/bin/true) </value> </property> <property> <name>dfs.permissions.enabled</name> <value>false</value> </property> <property> <name>dfs.support.append</name> <value>true</value> </property> <!-- 使用sshfence隔離機制時須要ssh免登錄 --> <property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/root/.ssh/id_rsa</value> </property> <!-- 指定副本數 --> <property> <name>dfs.replication</name> <value>2</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>/opt/data/hadoop/hdfs/nn</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>/opt/data/hadoop/hdfs/dn</value> </property> <!-- 指定JournalNode在本地磁盤存放數據的位置 --> <property> <name>dfs.journalnode.edits.dir</name> <value>/opt/data/hadoop/hdfs/jn</value> </property> <!-- 開啓NameNode失敗自動切換 --> <property> <name>dfs.ha.automatic-failover.enabled</name> <value>true</value> </property> <!-- 啓用webhdfs --> <property> <name>dfs.webhdfs.enabled</name> <value>true</value> </property> <!-- 配置sshfence隔離機制超時時間 --> <property> <name>dfs.ha.fencing.ssh.connect-timeout</name> <value>30000</value> </property> <property> 
<name>ha.failover-controller.cli-check.rpc-timeout.ms</name> <value>60000</value> </property> </configuration>
/opt/hadoop/hadoop-3.1.1/etc/hadoop/mapred-site.xml
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- 指定mr框架爲yarn方式 --> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> <!-- 指定mapreduce jobhistory地址 --> <property> <name>mapreduce.jobhistory.address</name> <value>nn01:10020</value> </property> <!-- 任務歷史服務器的web地址 --> <property> <name>mapreduce.jobhistory.webapp.address</name> <value>nn01:19888</value> </property> <property> <name>mapreduce.application.classpath</name> <value> /opt/hadoop/hadoop-3.1.1/etc/hadoop, /opt/hadoop/hadoop-3.1.1/share/hadoop/common/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/common/lib/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/lib/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/lib/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/*, /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/lib/* </value> </property> </configuration>
/opt/hadoop/hadoop-3.1.1/etc/hadoop/yarn-site.xml
<?xml version="1.0"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <configuration> <!-- Site specific YARN configuration properties --> <!-- 開啓RM高可用 --> <property> <name>yarn.resourcemanager.ha.enabled</name> <value>true</value> </property> <!-- 指定RM的cluster id --> <property> <name>yarn.resourcemanager.cluster-id</name> <value>yrc</value> </property> <!-- 指定RM的名字 --> <property> <name>yarn.resourcemanager.ha.rm-ids</name> <value>rm1,rm2</value> </property> <!-- 分別指定RM的地址 --> <property> <name>yarn.resourcemanager.hostname.rm1</name> <value>nn01</value> </property> <property> <name>yarn.resourcemanager.hostname.rm2</name> <value>nn02</value> </property> <!-- 指定zk集羣地址 --> <property> <name>yarn.resourcemanager.zk-address</name> <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.log-aggregation-enable</name> <value>true</value> </property> <property> <name>yarn.log-aggregation.retain-seconds</name> <value>86400</value> </property> <!-- 啓用自動恢復 --> <property> <name>yarn.resourcemanager.recovery.enabled</name> <value>true</value> </property> <!-- 制定resourcemanager的狀態信息存儲在zookeeper集羣上 --> <property> <name>yarn.resourcemanager.store.class</name> <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value> </property> </configuration>
/opt/hadoop/hadoop-3.1.1/etc/hadoop/workers
```
dn01
dn02
dn03
```
Add the following to /opt/hadoop/hadoop-3.1.1/sbin/start-dfs.sh and sbin/stop-dfs.sh:

```
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
```
Add the following to /opt/hadoop/hadoop-3.1.1/sbin/start-yarn.sh and sbin/stop-yarn.sh:

```
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
```
Copy everything to the other machines:

```
scp -r /opt/data root@nn02:/opt/
scp -r /opt/data root@dn01:/opt/
scp -r /opt/data root@dn02:/opt/
scp -r /opt/data root@dn03:/opt/
scp -r /opt/hadoop/hadoop-3.1.1 root@nn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn01:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn03:/opt/hadoop/
```
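The same distribution can be expressed as a loop; a sketch assuming the passwordless SSH set up earlier:

```
for h in nn02 dn01 dn02 dn03; do
  ssh "root@$h" mkdir -p /opt/hadoop      # make sure the target directory exists
  scp -r /opt/data "root@$h:/opt/"
  scp -r /opt/hadoop/hadoop-3.1.1 "root@$h:/opt/hadoop/"
done
```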
Startup order: ZooKeeper -> JournalNode -> format the NameNode -> create the namespace in ZooKeeper (zkfc) -> NameNode -> DataNode -> ResourceManager -> NodeManager.
1. Start ZooKeeper

On nn01, nn02, dn01, dn02, dn03:

```
zkServer.sh start
```
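Once all five nodes are up, each should report its role; a quick check, assuming zkServer.sh is on the PATH as configured earlier:

```
# Run on each host
zkServer.sh status
# Expect "Mode: leader" on exactly one node and "Mode: follower" on the other four
```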
2. Start the JournalNodes

On nn01, nn02, dn01, dn02, dn03:

```
hadoop-daemon.sh start journalnode
```
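Verify with jps that a JournalNode process is running on every host; a sketch assuming passwordless SSH from nn01:

```
for h in nn01 nn02 dn01 dn02 dn03; do
  echo -n "$h: "
  # non-interactive shells may not source /etc/profile; use the full path
  # to jps (e.g. /opt/java/jdk1.8.0_172/bin/jps) if this prints 0
  ssh "$h" jps | grep -c JournalNode   # expect 1 on every host
done
```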
3. Format the NameNode

On nn01:

```
hadoop namenode -format
```
Copy the metadata generated on nn01 to the other nodes:
```
scp -r /opt/data/hadoop/hdfs/nn/* root@nn02:/opt/data/hadoop/hdfs/nn/
scp -r /opt/data/hadoop/hdfs/nn/* root@dn01:/opt/data/hadoop/hdfs/nn/
scp -r /opt/data/hadoop/hdfs/nn/* root@dn02:/opt/data/hadoop/hdfs/nn/
scp -r /opt/data/hadoop/hdfs/nn/* root@dn03:/opt/data/hadoop/hdfs/nn/
```
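Copying the nn directory to the DataNodes does no harm, but only the standby NameNode (nn02) actually needs it. A standard alternative is to let nn02 pull the metadata itself once nn01's NameNode is running:

```
# Run on nn02 instead of the scp above (requires nn01's NameNode to be up)
hdfs namenode -bootstrapStandby
```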
4. Format ZKFC

Important: run this only on a NameNode, i.e. nn01.

```
hdfs zkfc -formatZK
```
5. Start HDFS

Important: run this only on a NameNode, i.e. nn01.

```
start-dfs.sh
```
6. Start YARN

Pick either of the two ResourceManager nodes to start from; here, nn02:

```
start-yarn.sh
```

If the ResourceManager on the other node does not come up, start it manually:

```
yarn-daemon.sh start resourcemanager
```
7. Start the MapReduce JobHistory server

```
mr-jobhistory-daemon.sh start historyserver
```
8. Check Status

Check the state of each master node:
```
hdfs haadmin -getServiceState nn01
hdfs haadmin -getServiceState nn02

[root@nn01 hadoop]# hdfs haadmin -getServiceState nn01
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:06:58,892 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[root@nn01 hadoop]# hdfs haadmin -getServiceState nn02
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:02,217 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]# yarn rmadmin -getServiceState rm1
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:45,112 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]# yarn rmadmin -getServiceState rm2
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:48,350 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
```
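To verify that automatic failover actually works, kill the active NameNode and watch the standby take over; a sketch (PIDs will differ on your machines):

```
# On the active NameNode (nn01 in the output above)
jps                        # note the NameNode PID
kill -9 <namenode-pid>     # simulate a crash
# zkfc should promote nn02 within the configured session timeout
hdfs haadmin -getServiceState nn02   # expect: active
# Bring the killed NameNode back; it rejoins as standby
hadoop-daemon.sh start namenode
```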
Check via the web UI:
```
# HDFS
http://192.168.56.101:50070/
http://192.168.56.102:50070/

# YARN
http://192.168.56.102:8088/cluster
```
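The same state can also be read without a browser from the NameNode's JMX servlet; a hedged example, assuming the NameNodeStatus bean exposed at /jmx by this Hadoop version:

```
curl -s 'http://192.168.56.101:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
# The JSON response includes "State" : "active" (or "standby")
```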