1.1 Kernel Parameter Tuning
Edit /etc/sysctl.conf with the settings below, then run sysctl -p to apply them immediately.
|
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
|
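A quick sanity check after editing the file; the parameter names are the ones configured above:
|
# Apply the settings immediately, then spot-check a couple of values
$ sysctl -p
$ sysctl kernel.shmmax
$ sysctl net.ipv4.ip_local_port_range
|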
1.2 Raising Linux Resource Limits
Append the following to /etc/security/limits.conf.
|
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
|
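Note that limits.conf only applies to new login sessions. A quick check after logging in again (the values should match those configured above):
|
# Maximum open files and processes allowed for the current session
$ ulimit -n
$ ulimit -u
|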
1.3 Disk I/O Scheduler Tuning
The Linux disk I/O scheduler supports several policies for disk access; the default is CFQ, and deadline is commonly recommended for data-intensive workloads such as Hadoop.
My data disk here is sda, so I change the I/O scheduler for sda directly (adjust the device name to match your own disks):
|
$ echo deadline > /sys/block/sda/queue/scheduler
|
To make the change persist across reboots, add the same command to /etc/rc.local, for example as sketched below.
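A minimal sketch of the rc.local approach, assuming the data disk is sda as above (on CentOS 7, rc.local must also be made executable):
|
$ echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local
$ chmod +x /etc/rc.d/rc.local
|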
PS: once all of the above is configured, reboot for everything to take effect.
2.1 Test Environment
| Role | Hostname | Address | OS | Hardware |
| --- | --- | --- | --- | --- |
| namenode, resourcemanager, datanode, nodemanager, secondarynamenode | hadoop-nn | 10.10.0.186 | CentOS 7.2 | 8 cores / 8 GB |
| datanode, nodemanager | hadoop-snn | 10.10.0.187 | CentOS 7.2 | 8 cores / 8 GB |
| datanode, nodemanager | hadoop-dn-01 | 10.10.0.188 | CentOS 7.2 | 8 cores / 8 GB |
2.2 Setting Hostnames
|
# 10.10.0.186;
$ hostname hadoop-nn
$ echo "hostname hadoop-nn" >> /etc/rc.local
# 10.10.0.187;
$ hostname hadoop-snn
$ echo "hostname hadoop-snn" >> /etc/rc.local
# 10.10.0.188;
$ hostname hadoop-dn-01
$ echo "hostname hadoop-dn-01" >> /etc/rc.local
|
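On CentOS 7 the same result can also be achieved with hostnamectl, which persists the hostname without touching rc.local; for example, on 10.10.0.186:
|
$ hostnamectl set-hostname hadoop-nn
|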
2.3 Disabling the Firewall
If you would rather keep the firewall enabled, you need to know all of the ports Hadoop uses and open them explicitly.
|
$ systemctl stop firewalld
$ systemctl disable firewalld
|
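If you do keep firewalld running, the ports have to be opened one by one. A partial sketch covering the HDFS/YARN ports that appear later in this article (8020, 50070, 50010, 50075, 50020, 8088); a real deployment may need more:
|
$ firewall-cmd --permanent --add-port=8020/tcp --add-port=50070/tcp
$ firewall-cmd --permanent --add-port=50010/tcp --add-port=50075/tcp --add-port=50020/tcp
$ firewall-cmd --permanent --add-port=8088/tcp
$ firewall-cmd --reload
|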
2.4 Disabling SELinux
|
$ setenforce 0
$ sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
|
2.5 Adding All Nodes to /etc/hosts
|
10.10.0.186 hadoop-nn master
10.10.0.187 hadoop-snn slave01
10.10.0.188 hadoop-dn-01 slave02
|
2.6 NTP Time Synchronization
Install an NTP server on the Hadoop namenode, then have every other node synchronize its time against it.
|
$ yum install ntp
$ systemctl start ntpd
$ systemctl enable ntpd
|
Then synchronize the time on the other nodes:
|
$ ntpdate hadoop-nn
|
Also add a cron job for this; Hadoop requires the clocks on all nodes to be consistent, so do not skip it. A sample entry is sketched below.
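A minimal cron entry for the other nodes, assuming a 30-minute interval is acceptable for your environment (the file name is just an example):
|
# /etc/cron.d/ntpdate-hadoop: re-sync against hadoop-nn every 30 minutes
*/30 * * * * root /usr/sbin/ntpdate hadoop-nn > /dev/null 2>&1
|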
3.1 Installing Java on All Hosts
|
$ yum install java java-devel -y
|
Check the Java version and make sure the command works:
|
$ java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
|
Note that installing OpenJDK does not set the JAVA_HOME environment variable by default. To find the installation directory, use:
|
$ update-alternatives --config jre_openjdk
There is 1 program that provides 'jre_openjdk'.
Selection Command
-----------------------------------------------
*+ 1 java-1.8.0-openjdk.x86_64 (/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre)
|
The default JRE directory is /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre.
Set the environment variables by editing /etc/profile.d/java.sh:
|
#!/bin/bash
#
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre
export CLASSPATH=$JAVA_HOME/lib/rt.jar:$JAVA_HOME/../lib/dt.jar:$JAVA_HOME/../lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
|
After this, log in again or source the profile file so the environment variables take effect; you can also run it by hand to apply them immediately, as shown below.
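A quick check that the variable is picked up (the path echoed back is the one exported in the script above):
|
$ source /etc/profile.d/java.sh
$ echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre
|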
3.2 Creating a Dedicated hadoop User on All Hosts
|
$ useradd hadoop
$ passwd hadoop
|
Set a password; for simplicity it is best to use the same hadoop password on all three machines, for example 123456. For convenience, add hadoop to the root group:
|
$ usermod -g root hadoop
|
After this, hadoop belongs to the root group. Verify with id hadoop; you should see output similar to the following:
|
$ id hadoop
uid=1002(hadoop) gid=0(root) groups=0(root)
|
3.3 Creating SSH Keys on the NameNode
Create an RSA key pair:
|
$ su - hadoop
$ ssh-keygen
|
From the NameNode, copy the public key to the hadoop user's home directory on every node, including the NameNode itself:
|
$ ssh-copy-id hadoop@10.10.0.186
$ ssh-copy-id hadoop@10.10.0.187
$ ssh-copy-id hadoop@10.10.0.188
|
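A quick way to confirm passwordless login works from the NameNode to every node; each command should print the remote hostname without prompting for a password:
|
$ ssh hadoop@hadoop-nn hostname
$ ssh hadoop@hadoop-snn hostname
$ ssh hadoop@hadoop-dn-01 hostname
|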
3.4 Extracting the Hadoop Binary Package and Setting Environment Variables on All Hosts
Download the Hadoop binary package yourself from a local mirror or the official site.
|
$ tar xvf hadoop-2.8.0.tar.gz -C /usr/local/
$ ln -sv /usr/local/hadoop-2.8.0/ /usr/local/hadoop
|
Edit the environment file /etc/profile.d/hadoop.sh and define environment variables similar to the following to set up the Hadoop runtime environment.
|
#!/bin/bash
#
export HADOOP_PREFIX="/usr/local/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
|
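After sourcing the new profile script, a quick check that the Hadoop binaries are on the PATH:
|
$ source /etc/profile.d/hadoop.sh
$ hadoop version
|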
Create the data and log directories:
|
$ mkdir -pv /data/hadoop/hdfs/{nn,snn,dn}
$ chown -R hadoop:hadoop /data/hadoop/hdfs
$ mkdir -pv /var/log/hadoop/yarn
$ chown -R hadoop:hadoop /var/log/hadoop/yarn
|
Then create a logs directory inside the Hadoop installation directory and change the owner and group of all Hadoop files:
|
$ cd /usr/local/hadoop
$ mkdir logs
$ chmod g+w logs
$ chown -R hadoop:hadoop ./*
|
4.1 The hadoop-nn Node
The following files need to be configured.
core-site.xml
core-site.xml contains the NameNode host address and the RPC port it listens on (the NameNode uses port 8020 by default). In a distributed deployment every node needs the NameNode address configured. A minimal configuration looks like this:
|
$ su - hadoop
$ vim /usr/local/hadoop/etc/hadoop/core-site.xml
|
|
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
<final>true</final>
</property>
</configuration>
|
hdfs-site.xml
hdfs-site.xml configures HDFS-related properties, such as the replication factor (the number of copies of each data block) and the directories the NameNode and DataNodes use to store data. The replication factor for a distributed Hadoop deployment would normally be 3; I set it to 2 here to reduce disk usage. The NameNode and DataNode storage directories are the paths created for them in the earlier steps. Directories were also created for the SecondaryNameNode earlier, so they are configured and enabled here as well.
|
$ su - hadoop
$ vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
|
|
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop/hdfs/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop/hdfs/dn</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>file:///data/hadoop/hdfs/snn</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>file:///data/hadoop/hdfs/snn</value>
</property>
</configuration>
|
Note: if other users need write access to HDFS, add one more property to hdfs-site.xml:
|
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
|
mapred-site.xml
mapred-site.xml selects the cluster's MapReduce framework; here it should be set to yarn (other possible values are local and classic). mapred-site.xml does not exist by default, but a template file, mapred-site.xml.template, does; simply copy it to mapred-site.xml.
|
$ su - hadoop
$ cp -fr /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
$ vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
|
|
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
|
yarn-site.xml
yarn-site.xml configures the YARN daemons and related properties. First, specify the host and ports the ResourceManager daemon listens on (here the ResourceManager will run on the NameNode host); then specify the scheduler the ResourceManager uses and the NodeManager auxiliary services. A minimal example looks like this:
|
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
</configuration>
|
hadoop-env.sh and yarn-env.sh
Each Hadoop daemon depends on the JAVA_HOME environment variable. If JAVA_HOME is already defined globally, for example via /etc/profile.d/java.sh as in the earlier step, the daemons will pick it up and work normally. If you want Hadoop to depend on a specific Java environment instead, edit these two scripts, uncomment JAVA_HOME, and set it to the appropriate value. In addition, most Hadoop daemons use a default heap size of 1 GB; in real deployments you may need to adjust the heap of the various daemons by editing the related variables in these files, such as HADOOP_HEAPSIZE, YARN_HEAPSIZE, and JAVA_HEAP_MAX.
The slaves file
The slaves file lists all slave nodes of the current cluster; its default content is localhost. Since I plan to run a DataNode on all three nodes, all of them are added here.
|
$ su - hadoop
$ cat /usr/local/hadoop/etc/hadoop/slaves
hadoop-nn
hadoop-snn
hadoop-dn-01
|
At this point the first node (the master) is fully configured. In a Hadoop cluster every node should carry the same configuration, and the earlier steps already created the hadoop user, data directories, and log directories on the slave nodes.
Next, simply sync the master's configuration files to all slaves:
|
$ su - hadoop
$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop-snn:/usr/local/hadoop/etc/hadoop/
$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop-dn-01:/usr/local/hadoop/etc/hadoop/
|
Before the HDFS NameNode is started for the first time, the directory it uses to store data must be initialized. If the directory specified by dfs.namenode.name.dir in hdfs-site.xml does not exist, the format command creates it; if it already exists, make sure its permissions are correct, and note that formatting will wipe all data inside it and build a new file system. Run the following command as the hadoop user:
|
[hadoop@hadoop-nn ~]$ hdfs namenode -format
|
The command prints a large amount of output; a line such as "17/06/13 05:56:18 INFO common.Storage: Storage directory /data/hadoop/hdfs/nn has been successfully formatted." indicates the format completed successfully.
There are two ways to start the Hadoop cluster: start the required services on each node individually, or start the whole cluster from the NameNode (the recommended approach).
Option 1: starting services individually
The master node needs the HDFS NameNode, SecondaryNameNode, and DataNode services, plus the YARN ResourceManager:
|
$ sudo -u hadoop hadoop-daemon.sh start namenode
$ sudo -u hadoop hadoop-daemon.sh start secondarynamenode
$ sudo -u hadoop hadoop-daemon.sh start datanode
$ sudo -u hadoop yarn-daemon.sh start resourcemanager
|
Each slave node needs the HDFS DataNode service and the YARN NodeManager service:
|
$ sudo -u hadoop hadoop-daemon.sh start datanode
$ sudo -u hadoop yarn-daemon.sh start nodemanager
|
Option 2: starting the whole cluster
When the cluster is large, starting each service on every node individually becomes tedious and inefficient, so Hadoop provides start-dfs.sh and stop-dfs.sh to start and stop the entire HDFS cluster, and start-yarn.sh and stop-yarn.sh to start and stop the entire YARN cluster.
|
$ sudo -u hadoop start-dfs.sh
$ sudo -u hadoop start-yarn.sh
|
Earlier Hadoop versions also provided start-all.sh and stop-all.sh to control HDFS and MapReduce together, but this is no longer recommended in Hadoop 2.0 and later.
I use the cluster-wide startup method throughout.
6.1 Starting the HDFS Cluster
|
[hadoop@hadoop-nn ~]$ start-dfs.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-hadoop-nn.out
hadoop-nn: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-hadoop-nn.out
hadoop-snn: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-hadoop-snn.out
hadoop-dn-01: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-hadoop-dn-01.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-hadoop-nn.out
|
Once the HDFS cluster is up, verify that the processes are running on each node with jps, or check the cluster state through the Web UI.
Processes started on the NameNode:
|
# hadoop-nn;
[hadoop@hadoop-nn ~]$ jps
14576 NameNode
14887 SecondaryNameNode
14714 DataNode
15018 Jps
[hadoop@hadoop-nn ~]$ netstat -anplt | grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 16468/java
tcp 0 0 127.0.0.1:58545 0.0.0.0:* LISTEN 16290/java
tcp 0 0 10.10.0.186:8020 0.0.0.0:* LISTEN 16146/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 16146/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 16290/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 16290/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 16290/java
tcp 0 0 10.10.0.186:32565 10.10.0.186:8020 ESTABLISHED 16290/java
tcp 0 0 10.10.0.186:8020 10.10.0.186:32565 ESTABLISHED 16146/java
tcp 0 0 10.10.0.186:8020 10.10.0.188:11681 ESTABLISHED 16146/java
tcp 0 0 10.10.0.186:8020 10.10.0.187:57112 ESTABLISHED 16146/java
|
Processes started on the DataNodes:
|
# hadoop-snn;
[hadoop@hadoop-snn ~]$ jps
741 DataNode
862 Jps
[hadoop@hadoop-snn ~]$ netstat -anplt | grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 1042/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 1042/java
tcp 0 0 127.0.0.1:18975 0.0.0.0:* LISTEN 1042/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 1042/java
tcp 0 0 10.10.0.187:57112 10.10.0.186:8020 ESTABLISHED 1042/java
|
|
# hadoop-dn-01;
[hadoop@hadoop-dn-01 ~]$ jps
410 DataNode
539 Jps
|
The jps output and the listening ports show the NameNode, SecondaryNameNode, and DataNode processes and the ports each of them has opened. You can also see that the DataNodes are connected to the NameNode on port 8020. If a node fails to come up, the usual causes are wrong permissions or a missing directory; check that node's logs under /usr/local/hadoop/logs/*.log.
Access the Web UI through the NameNode at http://hadoop-nn:50070:
The HDFS cluster is now usable and can store data. A quick demonstration with the HDFS commands:
|
# Create a directory on the HDFS cluster;
[hadoop@hadoop-nn ~]$ hdfs dfs -mkdir /test
# Upload files to the HDFS cluster;
[hadoop@hadoop-nn ~]$ hdfs dfs -put /etc/fstab /test/fstab
[hadoop@hadoop-nn ~]$ hdfs dfs -put /etc/init.d/functions /test/functions
# List files on the HDFS cluster;
[hadoop@hadoop-nn ~]$ hdfs dfs -ls /test/
Found 2 items
-rw-r--r-- 2 hadoop supergroup 524 2017-06-14 01:49 /test/fstab
-rw-r--r-- 2 hadoop supergroup 13948 2017-06-14 01:50 /test/functions
|
Now look at the Hadoop Web UI again:
In the Blocks column you can see that the hadoop-dn and hadoop-nn nodes each hold one block (the default HDFS block size is 128 MB in Hadoop 2.x). Because the uploaded files are tiny they were not split, and since we set the replication factor to 2 in hdfs-site.xml, each block is stored twice. The same information can be checked from the command line, as shown below.
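The block and replica layout of a file can be inspected with fsck, a standard HDFS utility; the exact output will depend on your cluster:
|
$ hdfs fsck /test/fstab -files -blocks -locations
|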
HDFS cluster management commands
|
[hadoop@hadoop-nn ~]$ hdfs
Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop.
classpath prints the classpath
namenode -format format the DFS filesystem
secondarynamenode run the DFS secondary namenode
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
debug run a Debug Admin to execute HDFS debug commands
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility
balancer run a cluster balancing utility
jmxget get JMX exported values from NameNode or DataNode.
mover run a utility to move block replicas across
storage types
oiv apply the offline fsimage viewer to an fsimage
oiv_legacy apply the offline fsimage viewer to an legacy fsimage
oev apply the offline edits viewer to an edits file
fetchdt fetch a delegation token from the NameNode
getconf get config values from configuration
groups get the groups which users belong to
snapshotDiff diff two snapshots of a directory or diff the
current directory contents with a snapshot
lsSnapshottableDir list all snapshottable dirs owned by the current user
Use -help to see options
portmap run a portmap service
nfs3 run an NFS version 3 gateway
cacheadmin configure the HDFS cache
crypto configure HDFS encryption zones
storagepolicies list/get/set block storage policies
version print the version
|
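Of these, dfsadmin is one of the most frequently used for day-to-day administration; for example:
|
# Capacity summary and the state of every DataNode
$ hdfs dfsadmin -report
# Current NameNode safe mode state
$ hdfs dfsadmin -safemode get
|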
6.2 Starting the YARN Cluster
|
[hadoop@hadoop-nn ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-hadoop-nn.out
hadoop-nn: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-hadoop-nn.out
hadoop-dn-01: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-hadoop-dn-01.out
hadoop-snn: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-hadoop-snn.out
|
Once the YARN cluster is up, verify that the processes are running on each node with jps.
|
# hadoop-nn;
[hadoop@hadoop-nn ~]$ jps
10674 SecondaryNameNode
10342 NameNode
10487 DataNode
15323 ResourceManager
15453 NodeManager
15775 Jps
# hadoop-snn;
[hadoop@hadoop-snn ~]$ jps
10415 NodeManager
11251 Jps
9984 DataNode
# hadoop-dn-01;
10626 NodeManager
10020 DataNode
11423 Jps
|
The jps output and the listening ports show that the ResourceManager and NodeManager processes have started. Note that a NodeManager runs on every DataNode host.
Access the Web UI through the ResourceManager at http://hadoop-nn:8088:
YARN cluster management commands
The yarn command has many subcommands, which broadly fall into user commands and administration commands. Running yarn with no arguments prints its usage and a short description of each subcommand:
|
[hadoop@hadoop-nn ~]$ yarn
Usage: yarn [--config confdir] [COMMAND | CLASSNAME]
CLASSNAME run the class named CLASSNAME
or
where COMMAND is one of:
resourcemanager run the ResourceManager
Use -format-state-store for deleting the RMStateStore.
Use -remove-application-from-state-store <appId> for
removing application from RMStateStore.
nodemanager run a nodemanager on each slave
timelineserver run the timeline server
rmadmin admin tools
sharedcachemanager run the SharedCacheManager daemon
scmadmin SharedCacheManager admin tools
version print the version
jar <jar> run a jar file
application prints application(s)
report/kill application
applicationattempt prints applicationattempt(s)
report
container prints container(s) report
node prints node report(s)
queue prints queue information
logs dump container logs
classpath prints the class path needed to
get the Hadoop jar and the
required libraries
cluster prints cluster information
daemonlog get/set the log level for each
daemon
top run cluster usage tool
|
Among these, jar, application, node, logs, classpath, and version are the most commonly used user commands, while resourcemanager, nodemanager, proxyserver, rmadmin, and daemonlog are the most common administration commands. A couple of user-command examples follow below.
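Two of the user commands in action (both are standard yarn subcommands):
|
# List the NodeManagers registered with the ResourceManager
$ yarn node -list
# List running applications (add -appStates ALL to include finished ones)
$ yarn application -list
|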
A YARN application can be a simple shell script, a MapReduce job, or any other kind of job. To run an application, the client prepares an ApplicationMaster and submits the application context to the ResourceManager, after which the RM allocates memory and containers in which the application runs. Broadly speaking, this process has six stages.
Now let's use the Hadoop platform we just built to run a job and see what this flow looks like. The Hadoop distribution ships with a set of example programs:
|
[hadoop@hadoop-nn ~]$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
|
Let's test with the easy-to-understand wordcount example. Recall that at the beginning we uploaded the functions file (along with fstab) to the HDFS cluster; now we run a word count over those files, as follows:
|
[hadoop@hadoop-nn ~]$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount /test/fstab /test/functions /test/wc
17/06/14 04:26:02 INFO client.RMProxy: Connecting to ResourceManager at master/10.10.0.186:8032
17/06/14 04:26:03 INFO input.FileInputFormat: Total input files to process : 2
17/06/14 04:26:03 INFO mapreduce.JobSubmitter: number of splits:2
17/06/14 04:26:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1497424827481_0002
17/06/14 04:26:03 INFO impl.YarnClientImpl: Submitted application application_1497424827481_0002
17/06/14 04:26:03 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1497424827481_0002/
17/06/14 04:26:03 INFO mapreduce.Job: Running job: job_1497424827481_0002
17/06/14 04:26:09 INFO mapreduce.Job: Job job_1497424827481_0002 running in uber mode : false
17/06/14 04:26:09 INFO mapreduce.Job: map 0% reduce 0%
17/06/14 04:26:14 INFO mapreduce.Job: map 100% reduce 0%
17/06/14 04:26:19 INFO mapreduce.Job: map 100% reduce 100%
17/06/14 04:26:19 INFO mapreduce.Job: Job job_1497424827481_0002 completed successfully
17/06/14 04:26:19 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=1272
FILE: Number of bytes written=411346
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1183
HDFS: Number of bytes written=470
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=4582
Total time spent by all reduces in occupied slots (ms)=2651
Total time spent by all map tasks (ms)=4582
Total time spent by all reduce tasks (ms)=2651
Total vcore-milliseconds taken by all map tasks=4582
Total vcore-milliseconds taken by all reduce tasks=2651
Total megabyte-milliseconds taken by all map tasks=4691968
Total megabyte-milliseconds taken by all reduce tasks=2714624
Map-Reduce Framework
Map input records=53
Map output records=142
Map output bytes=1452
Map output materialized bytes=1278
Input split bytes=206
Combine input records=142
Combine output records=86
Reduce input groups=45
Reduce shuffle bytes=1278
Reduce input records=86
Reduce output records=45
Spilled Records=172
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=169
CPU time spent (ms)=1040
Physical memory (bytes) snapshot=701403136
Virtual memory (bytes) snapshot=6417162240
Total committed heap usage (bytes)=529530880
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=977
File Output Format Counters
Bytes Written=470
|
The results are written to the /test/wc directory on the HDFS cluster. Note that the job will fail if the output directory already exists; remove it first when re-running, as sketched below.
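To re-run the job with the same output path, delete the old output directory first:
|
$ hdfs dfs -rm -r /test/wc
|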
While the job is running, the Hadoop management UI (port 8088) shows information such as the application name, submitting user, job name, application type, execution time, state, and progress.
Now let's see what ended up in /test/wc:
|
[hadoop@hadoop-nn ~]$ hdfs dfs -ls /test/wc
Found 2 items
-rw-r--r-- 2 hadoop supergroup 0 2017-06-14 04:26 /test/wc/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 470 2017-06-14 04:26 /test/wc/part-r-00000
|
And the word-count result itself:
|
[hadoop@hadoop-nn ~]$ hdfs dfs -cat /test/wc/part-r-00000
# 8
'/dev/disk' 2
/ 2
/boot 2
/data 2
/dev/mapper/centos-root 2
/dev/mapper/centos-swap 2
/dev/sdb 2
/etc/fstab 2
......
|
After a YARN job has run, its status can be viewed in the Web UI, but once the ResourceManager restarts those jobs are no longer visible. To keep this history, enable the Hadoop history service.
With the history service enabled, the details of jobs executed on YARN can be viewed on a web page, including finished MapReduce jobs: how many map and reduce tasks were used, when the job was submitted, started, and finished, and so on. The daemon is started as shown below.
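The history server is started with the mr-jobhistory-daemon.sh script shipped with Hadoop (the same script is used again further down when it is restarted); afterwards jps should show a JobHistoryServer process, as in the output below:
|
[hadoop@hadoop-nn ~]$ mr-jobhistory-daemon.sh start historyserver
|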
|
[root@hadoop-nn ~]# jps
23347 DataNode
23188 NameNode
23892 NodeManager
20597 QuorumPeerMain
24631 Jps
24264 JobHistoryServer
23534 SecondaryNameNode
|
Once the JobHistoryServer is running, it can be reached through its web page:
The history server's web port defaults to 19888. Run a few more YARN jobs, then click History to jump to the history page and inspect the details of each job.
However, at the bottom of an individual job page, clicking through the Map and Reduce counts into the task detail pages will not show the detailed logs of any map or reduce task, because log aggregation has not been enabled yet.
MapReduce tasks run on many machines, and the logs they produce live on those machines. To view the logs from all machines in one place, they are collected and stored centrally on HDFS; this process is called log aggregation.
Log aggregation is disabled by default in Hadoop; enable it in yarn-site.xml:
|
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>106800</value>
</property>
|
yarn.log-aggregation-enable: whether log aggregation is enabled.
yarn.log-aggregation.retain-seconds: how long aggregated logs are retained, in seconds.
Distribute the configuration files to the other nodes:
|
[hadoop@hadoop-nn ~]$ su - hadoop
[hadoop@hadoop-nn ~]$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop-snn:/usr/local/hadoop/etc/hadoop/
[hadoop@hadoop-nn ~]$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop-dn-01:/usr/local/hadoop/etc/hadoop/
|
Restart the YARN daemons:
|
[hadoop@hadoop-nn ~]$ stop-yarn.sh
[hadoop@hadoop-nn ~]$ start-yarn.sh
|
Restart the JobHistoryServer:
|
[hadoop@hadoop-nn ~]$ mr-jobhistory-daemon.sh stop historyserver
[hadoop@hadoop-nn ~]$ mr-jobhistory-daemon.sh start historyserver
|
To test log aggregation, run a demo MapReduce job so that it produces logs:
|
[hadoop@hadoop-nn ~]$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount /test/fstab /test/wc1
|
After the job runs, the logs of each map and reduce task can be viewed on the history server web pages, or fetched from the command line as shown below.
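With aggregation enabled, the same logs can be pulled with the yarn logs command; the application id is the one printed when the job was submitted (for example application_1497424827481_0002 above):
|
$ yarn logs -applicationId application_1497424827481_0002
|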
As a final tuning note, I also raised some heap-related settings in hadoop-env.sh and yarn-env.sh (from their original 512 MB / 2048 MB values to 4096 MB):
|
# hadoop-env.sh: raise the client/portmap heap options from 512m to 4096m
$ vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export HADOOP_PORTMAP_OPTS="-Xmx4096m $HADOOP_PORTMAP_OPTS"
export HADOOP_CLIENT_OPTS="-Xmx4096m $HADOOP_CLIENT_OPTS"
# yarn-env.sh: raise JAVA_HEAP_MAX from 2048m to 4096m
$ vim /usr/local/hadoop/etc/hadoop/yarn-env.sh
JAVA_HEAP_MAX=-Xmx4096m
|
Ambari, a handy tool for building big-data platforms: https://www.ibm.com/developerworks/cn/opensource/os-cn-bigdata-ambari/index.html