Three virtual machines are used:
Node name | IP address | Memory | Disk | Roles |
---|---|---|---|---|
node1 | 192.168.1.6 | 2GB | 10GB | NameNode, ResourceManager |
node2 | 192.168.1.7 | 2GB | 10GB | DataNode, NodeManager, SecondaryNameNode |
node3 | 192.168.1.8 | 2GB | 10GB | DataNode, NodeManager |
Software | Version |
---|---|
JDK | jdk-8u271 |
Hadoop | hadoop-3.2.1 |
Set the hostname on each node:

```
[root@node1 ~]# vi /etc/hostname
[root@node1 ~]# reboot
[root@node1 ~]# cat /etc/hostname
node1
[root@node1 ~]# hostname
node1
[root@node2 ~]# hostname    # same procedure on the other nodes
node2
[root@node3 ~]# hostname
node3
```
Perform the following steps on node1:
```
[root@node1 ~]# vi /etc/hosts    # add the three "IP address  hostname" lines below
[root@node1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.6 node1
192.168.1.7 node2
192.168.1.8 node3
# Copy the hosts file to node2 and node3
[root@node1 ~]# scp /etc/hosts node2:/etc/
The authenticity of host 'node2 (192.168.1.7)' can't be established.
ECDSA key fingerprint is SHA256:8MU51OTPEjoMAEsg3eOMgAJBy3L4nuSMX1RGWN8ew/w.
ECDSA key fingerprint is MD5:00:2a:ce:9a:66:9b:42:af:a6:8e:74:07:a9:01:52:dc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2,192.168.1.7' (ECDSA) to the list of known hosts.
hosts
```
```
[root@node1 ~]# ping node2
PING node2 (192.168.1.7) 56(84) bytes of data.
64 bytes from node2 (192.168.1.7): icmp_seq=1 ttl=64 time=0.404 ms
64 bytes from node2 (192.168.1.7): icmp_seq=2 ttl=64 time=0.617 ms
64 bytes from node2 (192.168.1.7): icmp_seq=3 ttl=64 time=0.828 ms
^C
--- node2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2016ms
rtt min/avg/max/mdev = 0.404/0.616/0.828/0.174 ms
[root@node1 ~]# ping node3
PING node3 (192.168.1.8) 56(84) bytes of data.
64 bytes from node3 (192.168.1.8): icmp_seq=1 ttl=64 time=1.59 ms
64 bytes from node3 (192.168.1.8): icmp_seq=2 ttl=64 time=0.496 ms
64 bytes from node3 (192.168.1.8): icmp_seq=3 ttl=64 time=0.443 ms
^C
--- node3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2015ms
rtt min/avg/max/mdev = 0.443/0.843/1.592/0.530 ms
[root@node2 ~]# ping node1
PING node1 (192.168.1.6) 56(84) bytes of data.
64 bytes from node1 (192.168.1.6): icmp_seq=1 ttl=64 time=0.325 ms
64 bytes from node1 (192.168.1.6): icmp_seq=2 ttl=64 time=0.864 ms
^C
--- node1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.325/0.594/0.864/0.270 ms
[root@node2 ~]# ping node3
PING node3 (192.168.1.8) 56(84) bytes of data.
64 bytes from node3 (192.168.1.8): icmp_seq=1 ttl=64 time=1.58 ms
64 bytes from node3 (192.168.1.8): icmp_seq=2 ttl=64 time=0.728 ms
^C
--- node3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1012ms
rtt min/avg/max/mdev = 0.728/1.158/1.589/0.431 ms
[root@node3 ~]# ping node1
PING node1 (192.168.1.6) 56(84) bytes of data.
64 bytes from node1 (192.168.1.6): icmp_seq=1 ttl=64 time=0.372 ms
64 bytes from node1 (192.168.1.6): icmp_seq=2 ttl=64 time=0.395 ms
^C
--- node1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.372/0.383/0.395/0.022 ms
[root@node3 ~]# ping node2
PING node2 (192.168.1.7) 56(84) bytes of data.
64 bytes from node2 (192.168.1.7): icmp_seq=1 ttl=64 time=0.874 ms
64 bytes from node2 (192.168.1.7): icmp_seq=2 ttl=64 time=1.03 ms
^C
--- node2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1006ms
rtt min/avg/max/mdev = 0.874/0.955/1.036/0.081 ms
```
```
[root@node1 ~]# systemctl stop firewalld.service
[root@node1 ~]# firewall-cmd --state
not running
[root@node1 ~]# systemctl disable firewalld.service    # prevent firewalld from starting at boot
```
To let the Windows host and the virtual machines ping each other by name, add the following entries to `C:\Windows\System32\drivers\etc\hosts` on the host:
```
192.168.1.6 node1
192.168.1.7 node2
192.168.1.8 node3
```
```
C:\Users\zgg>ping node1

正在 Ping node1 [192.168.1.6] 具備 32 字節的數據:
來自 192.168.1.6 的回覆: 字節=32 時間<1ms TTL=64
來自 192.168.1.6 的回覆: 字節=32 時間<1ms TTL=64

192.168.1.6 的 Ping 統計信息:
    數據包: 已發送 = 2,已接收 = 2,丟失 = 0 (0% 丟失),
往返行程的估計時間(以毫秒爲單位):
    最短 = 0ms,最長 = 0ms,平均 = 0ms
Control-C
^C
C:\Users\zgg>ping node2

正在 Ping node2 [192.168.1.7] 具備 32 字節的數據:
來自 192.168.1.7 的回覆: 字節=32 時間<1ms TTL=64
來自 192.168.1.7 的回覆: 字節=32 時間<1ms TTL=64

192.168.1.7 的 Ping 統計信息:
    數據包: 已發送 = 2,已接收 = 2,丟失 = 0 (0% 丟失),
往返行程的估計時間(以毫秒爲單位):
    最短 = 0ms,最長 = 0ms,平均 = 0ms
Control-C
^C
C:\Users\zgg>ping node3

正在 Ping node3 [192.168.1.8] 具備 32 字節的數據:
來自 192.168.1.8 的回覆: 字節=32 時間<1ms TTL=64
來自 192.168.1.8 的回覆: 字節=32 時間<1ms TTL=64

192.168.1.8 的 Ping 統計信息:
    數據包: 已發送 = 2,已接收 = 2,丟失 = 0 (0% 丟失),
往返行程的估計時間(以毫秒爲單位):
    最短 = 0ms,最長 = 0ms,平均 = 0ms
Control-C
^C
```
Passwordless login: from node1, `ssh node2` or `ssh node3` should log in to the other machine without prompting for a password.
In the /root directory of each of the three virtual machines, run:
```
ssh-keygen -t rsa
```
This generates the SSH key pair; the keys are stored under `~/.ssh`.
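The command above prompts interactively for a file location and a passphrase. If you prefer to script it, a minimal non-interactive sketch (default path, empty passphrase) looks like this:

```
# Generate the RSA key pair without any prompts.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
```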
Change into the `.ssh` directory and run the following command to put the public key into authorized_keys:
```
cp id_rsa.pub authorized_keys
```
Copy the authorized_keys file from node1 into the `~/.ssh` directory of the other virtual machines:
```
scp authorized_keys node2:~/.ssh/
scp authorized_keys node3:~/.ssh/
```
```
[root@node1 ~]# ssh node2
Last login: Thu Nov 12 15:38:28 2020 from node1
[root@node2 ~]# exit
登出
Connection to node2 closed.
```
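As an alternative to copying authorized_keys by hand, a hedged sketch using ssh-copy-id achieves the same result; it assumes password-based SSH login to node2 and node3 is still allowed at this point:

```
# Append node1's public key to the remote authorized_keys files.
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node3
```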
On node1, download and unpack the JDK, then configure the environment variables:
```
[root@node1 opt]# tar -zxvf jdk-8u271-linux-x64.tar.gz
...
[root@node1 opt]# vi /etc/profile
[root@node1 opt]# source /etc/profile
[root@node1 opt]# java -version
java version "1.8.0_271"
Java(TM) SE Runtime Environment (build 1.8.0_271-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.271-b09, mixed mode)
[root@node1 opt]# cat /etc/profile
# /etc/profile
...
export JAVA_HOME=/opt/jdk1.8.0_271
export PATH=.:$JAVA_HOME/bin:$PATH
...
```
Copy jdk1.8.0_271 to node2 and node3:
```
[root@node1 opt]# scp -r jdk1.8.0_271/ node2:/opt/
[root@node1 opt]# scp -r jdk1.8.0_271/ node3:/opt/
```
Copy /etc/profile to node2 and node3, then source it on each node:
```
[root@node1 opt]# scp /etc/profile node2:/etc/
profile                              100% 1890     1.4MB/s   00:00
[root@node1 opt]# scp /etc/profile node3:/etc/
profile                              100% 1890     1.7MB/s   00:00
[root@node2 opt]# source /etc/profile
[root@node3 opt]# source /etc/profile
```
On node1, download and unpack Hadoop, then configure the environment variables:
```
[root@node1 opt]# tar -zxvf hadoop-3.2.1.tar.gz
...
[root@node1 opt]# vi /etc/profile
[root@node1 opt]# source /etc/profile
[root@node1 opt]# cat /etc/profile
# /etc/profile
...
export JAVA_HOME=/opt/jdk1.8.0_271
export HADOOP_HOME=/opt/hadoop-3.2.1
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
```
Copy /etc/profile to node2 and node3, then source it on each node:
```
[root@node1 opt]# scp /etc/profile node2:/etc/
profile                              100% 1945     1.6MB/s   00:00
[root@node1 opt]# scp /etc/profile node3:/etc/
profile                              100% 1945     1.5MB/s   00:00
[root@node2 opt]# source /etc/profile
[root@node3 opt]# source /etc/profile
```
After the configuration files have been edited (see the configuration files section below), copy hadoop-3.2.1 to node2 and node3:
```
[root@node1 opt]# scp -r hadoop-3.2.1/ node2:/opt/
[root@node1 opt]# scp -r hadoop-3.2.1/ node3:/opt/
```
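A quick hedged sanity check, assuming /etc/profile has already been sourced on every node, is to confirm that the hadoop binary resolves and reports the expected version everywhere:

```
# Run on node1, node2 and node3; all three should report version 3.2.1.
hadoop version
```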
On node1:
```
[root@node1 hadoop-3.2.1]# hdfs namenode -format    # format the NameNode
2020-11-12 21:43:16,999 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = node1/192.168.1.6
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.2.1
STARTUP_MSG: classpath = /opt/hadoop-3.2.1/etc/ ...
...
2020-11-12 21:43:20,696 INFO common.Storage: Storage directory /opt/hadoop-3.2.1/dfs/namenode has been successfully formatted.
2020-11-12 21:43:20,762 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hadoop-3.2.1/dfs/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
2020-11-12 21:43:20,859 INFO namenode.FSImageFormatProtobuf: Image file /opt/hadoop-3.2.1/dfs/namenode/current/fsimage.ckpt_0000000000000000000 of size 399 bytes saved in 0 seconds .
2020-11-12 21:43:20,866 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2020-11-12 21:43:20,874 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2020-11-12 21:43:20,874 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node1/192.168.1.6
************************************************************/
```
If you need to format again, first delete the `dfs/namenode` directory on the NameNode and the `dfs/datanode` directories on the DataNodes.
You can start everything at once, or start the daemons individually (see the sketch after the startup output below).
```
[root@node1 hadoop-3.2.1]# sbin/start-all.sh
Starting namenodes on [node1]
上一次登陸:六 11月 14 20:28:04 CST 2020pts/0 上
Starting datanodes
上一次登陸:六 11月 14 20:30:51 CST 2020pts/0 上
Starting secondary namenodes [node2]
上一次登陸:六 11月 14 20:30:54 CST 2020pts/0 上
Starting resourcemanager
上一次登陸:六 11月 14 20:30:59 CST 2020pts/0 上
Starting nodemanagers
上一次登陸:六 11月 14 20:31:05 CST 2020pts/0 上
[root@node1 hadoop-3.2.1]# mapred --daemon start historyserver
[root@node1 hadoop]# jps
11524 ResourceManager
11927 Jps
11899 JobHistoryServer
11100 NameNode
[root@node2 hadoop-3.2.1]# jps
8210 DataNode
8312 SecondaryNameNode
8393 NodeManager
8507 Jps
[root@node3 hadoop-3.2.1]# jps
17760 DataNode
17981 Jps
17870 NodeManager
```
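To start the daemons individually instead of using start-all.sh, a hedged sketch using the Hadoop 3.x per-daemon commands looks like the following; run each command on the node that hosts that role, as laid out in the table at the top:

```
# On node1:
hdfs --daemon start namenode
yarn --daemon start resourcemanager
mapred --daemon start historyserver

# On node2:
hdfs --daemon start datanode
hdfs --daemon start secondarynamenode
yarn --daemon start nodemanager

# On node3:
hdfs --daemon start datanode
yarn --daemon start nodemanager
```

The matching `stop` subcommand shuts each daemon down again.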
```
[root@node1 hadoop-3.2.1]# hadoop jar /opt/hadoop-3.2.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /in/wc.txt /out
2020-11-15 00:43:37,940 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.1.6:8032
2020-11-15 00:43:38,763 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1605372113315_0001
2020-11-15 00:43:38,945 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-11-15 00:43:39,647 INFO input.FileInputFormat: Total input files to process : 1
2020-11-15 00:43:39,695 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-11-15 00:43:39,731 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-11-15 00:43:39,770 INFO mapreduce.JobSubmitter: number of splits:1
2020-11-15 00:43:39,960 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-11-15 00:43:39,999 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1605372113315_0001
2020-11-15 00:43:39,999 INFO mapreduce.JobSubmitter: Executing with tokens: []
2020-11-15 00:43:40,196 INFO conf.Configuration: resource-types.xml not found
2020-11-15 00:43:40,196 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2020-11-15 00:43:40,664 INFO impl.YarnClientImpl: Submitted application application_1605372113315_0001
2020-11-15 00:43:40,808 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1605372113315_0001/
2020-11-15 00:43:40,809 INFO mapreduce.Job: Running job: job_1605372113315_0001
2020-11-15 00:43:52,004 INFO mapreduce.Job: Job job_1605372113315_0001 running in uber mode : false
2020-11-15 00:43:52,005 INFO mapreduce.Job: map 0% reduce 0%
2020-11-15 00:43:59,092 INFO mapreduce.Job: map 100% reduce 0%
2020-11-15 00:44:05,137 INFO mapreduce.Job: map 100% reduce 100%
2020-11-15 00:44:06,168 INFO mapreduce.Job: Job job_1605372113315_0001 completed successfully
2020-11-15 00:44:06,284 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=67
        FILE: Number of bytes written=452639
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=149
        HDFS: Number of bytes written=41
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
        HDFS: Number of bytes read erasure-coded=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4936
        Total time spent by all reduces in occupied slots (ms)=3495
        Total time spent by all map tasks (ms)=4936
        Total time spent by all reduce tasks (ms)=3495
        Total vcore-milliseconds taken by all map tasks=4936
        Total vcore-milliseconds taken by all reduce tasks=3495
        Total megabyte-milliseconds taken by all map tasks=5054464
        Total megabyte-milliseconds taken by all reduce tasks=3578880
    Map-Reduce Framework
        Map input records=3
        Map output records=9
        Map output bytes=93
        Map output materialized bytes=67
        Input split bytes=92
        Combine input records=9
        Combine output records=5
        Reduce input groups=5
        Reduce shuffle bytes=67
        Reduce input records=5
        Reduce output records=5
        Spilled Records=10
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=162
        CPU time spent (ms)=1180
        Physical memory (bytes) snapshot=322658304
        Virtual memory (bytes) snapshot=5471531008
        Total committed heap usage (bytes)=170004480
        Peak Map Physical memory (bytes)=210452480
        Peak Map Virtual memory (bytes)=2732470272
        Peak Reduce Physical memory (bytes)=112205824
        Peak Reduce Virtual memory (bytes)=2739060736
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=57
    File Output Format Counters
        Bytes Written=41
```
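For reference, a hedged sketch of how the input could have been staged and how to inspect the result; the file name wc.txt and its contents are assumptions, and only the /in and /out paths come from the command above:

```
# Stage a small text file as the job input (run on node1).
hdfs dfs -mkdir -p /in
hdfs dfs -put wc.txt /in/

# After the job finishes, list the output directory and print the word counts.
hdfs dfs -ls /out
hdfs dfs -cat /out/part-r-00000
```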
The following problems came up:
(1) Submitting a job through YARN produced `Failed while trying to construct the redirect url to the log server. Log Server url may not be configured`. The cause was that the history server had not been configured. Configure the following properties:
```
<!-- mapred-site.xml -->
<property>
  <!-- MapReduce JobHistory Server IPC host:port -->
  <name>mapreduce.jobhistory.address</name>
  <value>node1:10020</value>
</property>
<property>
  <!-- MapReduce JobHistory Server Web UI host:port -->
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node1:19888</value>
</property>

<!-- yarn-site.xml -->
<property>
  <name>yarn.log.server.url</name>
  <value>http://node1:19888/jobhistory/logs</value>
</property>
```
(2) Running a job failed with `錯誤: 找不到或沒法加載主類 org.apache.hadoop.mapreduce.v2.app.MRAppMaster` (Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster). The fix is to feed the output of `hadoop classpath` into YARN's classpath property. First print the classpath:
```
[root@node1 hadoop-3.2.1]# hadoop classpath
/opt/hadoop-3.2.1/etc/hadoop:/opt/hadoop-3.2.1/share/hadoop/common/lib/*:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs:/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/*:/opt/hadoop-3.2.1/share/hadoop/yarn:/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*:/opt/hadoop-3.2.1/share/hadoop/yarn/*
```
Then add that value to the following property in yarn-site.xml:
```
<property>
  <name>yarn.application.classpath</name>
  <value>/opt/hadoop-3.2.1/etc/hadoop:/opt/hadoop-3.2.1/share/hadoop/common/lib/*:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs:/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/*:/opt/hadoop-3.2.1/share/hadoop/yarn:/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*:/opt/hadoop-3.2.1/share/hadoop/yarn/*</value>
</property>
```
(3) Running a job produced the error `The auxService:mapreduce_shuffle does not exist`.
This happened because the yarn.nodemanager.aux-services property was lost when yarn-site.xml was copied.
(4) The first time the job was run, the output log stayed stuck at `INFO mapreduce.Job: Running job: job_1605371813670_0001`. When this happens, first check that the configuration files are correct, and then look at YARN's resource allocation; a quick check is sketched below.
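For the resource-allocation side of item (4), a hedged sketch using standard YARN CLI commands; which properties to tune afterwards depends on the cluster and is not prescribed here:

```
# Show every NodeManager together with its used and available memory/vcores.
yarn node -list -all

# Show applications that were accepted but are still waiting for containers.
yarn application -list -appStates ACCEPTED
```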
Additional notes:

(1) If a daemon fails to start, check whether a configuration file is wrong, or whether the previous cluster ID was not cleaned up before re-formatting.
(2) If startup fails with `ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.`, the corresponding user variable has not been set in hadoop-env.sh. See the configuration files below for the exact settings.
(3) In Hadoop 3.x, the NameNode web UI port changed to 9870; see the endpoint sketch after this list.
(4) When writing the configuration files, also consult the official documentation: the cluster setup guide and the core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml default references.
(5) When running jobs, pay attention to resource allocation.
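A hedged sketch of the web endpoints implied by this setup; the ports come from the configuration files below and the Hadoop 3.x NameNode default, and curl is only used to confirm that each page responds:

```
# NameNode web UI (Hadoop 3.x default port 9870)
curl -s -o /dev/null -w "%{http_code}\n" http://node1:9870/
# ResourceManager web UI (yarn.resourcemanager.webapp.address)
curl -s -o /dev/null -w "%{http_code}\n" http://node1:8088/
# JobHistory Server web UI (mapreduce.jobhistory.webapp.address)
curl -s -o /dev/null -w "%{http_code}\n" http://node1:19888/
```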
```
[root@node1 hadoop]# pwd
/opt/hadoop-3.2.1/etc/hadoop
[root@node1 hadoop]# ls
capacity-scheduler.xml      hadoop-user-functions.sh.example  kms-log4j.properties        ssl-client.xml.example
configuration.xsl           hdfs-site.xml                     kms-site.xml                ssl-server.xml.example
container-executor.cfg      httpfs-env.sh                     log4j.properties            user_ec_policies.xml.template
core-site.xml               httpfs-log4j.properties           mapred-env.cmd              workers
hadoop-env.cmd              httpfs-signature.secret           mapred-env.sh               yarn-env.cmd
hadoop-env.sh               httpfs-site.xml                   mapred-queues.xml.template  yarn-env.sh
hadoop-metrics2.properties  kms-acls.xml                      mapred-site.xml             yarnservice-log4j.properties
hadoop-policy.xml           kms-env.sh                        shellprofile.d              yarn-site.xml
```
Administrators should customize the Hadoop daemons' environment by editing etc/hadoop/hadoop-env.sh and, optionally, etc/hadoop/mapred-env.sh and etc/hadoop/yarn-env.sh, for example to set how much heap memory the NameNode uses.
At a minimum, JAVA_HOME must be specified on every remote node.
```
# On node1, node2 and node3:
[root@node1 hadoop]# vi hadoop-env.sh
...
###
# Generic settings for HADOOP
###

# Technically, the only required environment variable is JAVA_HOME.
# All others are optional.  However, the defaults are probably not
# preferred.  Many sites configure these options outside of Hadoop,
# such as in /etc/profile.d

# The java implementation to use.
...
# export JAVA_HOME=
export JAVA_HOME=/opt/jdk1.8.0_271

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```
```
# On node1, node2 and node3:
[root@node1 hadoop]# cat core-site.xml
...
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- URI of the NameNode's HDFS filesystem -->
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <!-- File I/O buffer size (bytes) -->
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <!-- Hadoop temporary directory -->
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.2.1/tmp</value>
  </property>
</configuration>
```
```
# On node1, node2 and node3:
[root@node1 hadoop]# cat hdfs-site.xml
...
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- Path on the local filesystem where the NameNode persistently stores the namespace and transaction logs -->
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-3.2.1/dfs/namenode</value>
  </property>
  <property>
    <!-- List of permitted DataNodes. -->
    <name>dfs.hosts</name>
    <value>/opt/hadoop-3.2.1/etc/hadoop/workers</value>
  </property>
  <property>
    <!-- Run the secondary NameNode on node2 -->
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:9868</value>
  </property>
  <property>
    <!-- Comma-separated list of local filesystem directories where DataNodes store their blocks (multiple directories may be configured) -->
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-3.2.1/dfs/datanode</value>
  </property>
  <property>
    <!-- Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. -->
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/opt/hadoop-3.2.1/dfs/namesecondary</value>
  </property>
</configuration>
```
```
# On node1, node2 and node3:
[root@node1 hadoop]# cat mapred-site.xml
...
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- Run MapReduce on YARN -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <!-- Directory where history files are written by MapReduce jobs -->
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <!-- Directory where history files are managed by the MR JobHistory Server -->
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <!-- MapReduce JobHistory Server IPC host:port -->
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <property>
    <!-- MapReduce JobHistory Server Web UI host:port -->
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
</configuration>
```
```
# On node1, node2 and node3:
[root@node1 hadoop]# cat yarn-site.xml
...
<configuration>
  <property>
    <!-- Configuration to enable or disable log aggregation -->
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
  <property>
    <!-- ResourceManager host:port for clients to submit jobs. -->
    <name>yarn.resourcemanager.address</name>
    <value>node1:8032</value>
  </property>
  <property>
    <!-- ResourceManager host:port for ApplicationMasters to talk to Scheduler to obtain resources. -->
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8030</value>
  </property>
  <property>
    <!-- ResourceManager host:port for NodeManagers. -->
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8031</value>
  </property>
  <property>
    <!-- ResourceManager host:port for administrative commands. -->
    <name>yarn.resourcemanager.admin.address</name>
    <value>node1:8033</value>
  </property>
  <property>
    <!-- ResourceManager web-ui host:port. -->
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node1:8088</value>
  </property>
  <property>
    <!-- List of permitted NodeManagers. -->
    <name>yarn.resourcemanager.nodes.include-path</name>
    <value>/opt/hadoop-3.2.1/etc/hadoop/workers</value>
  </property>
  <property>
    <!-- Comma-separated list of paths on the local filesystem where intermediate data is written. -->
    <name>yarn.nodemanager.local-dirs</name>
    <value>/opt/hadoop-3.2.1/tmp</value>
  </property>
  <property>
    <!-- Comma-separated list of paths on the local filesystem where logs are written. -->
    <name>yarn.nodemanager.log-dirs</name>
    <value>/opt/hadoop-3.2.1/logs</value>
  </property>
  <property>
    <!-- Auxiliary service run by the NodeManagers -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <!-- URL for log aggregation server -->
    <name>yarn.log.server.url</name>
    <value>http://node1:19888/jobhistory/logs</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/opt/hadoop-3.2.1/etc/hadoop:/opt/hadoop-3.2.1/share/hadoop/common/lib/*:/opt/hadoop-3.2.1/share/hadoop/common/*:/opt/hadoop-3.2.1/share/hadoop/hdfs:/opt/hadoop-3.2.1/share/hadoop/hdfs/lib/*:/opt/hadoop-3.2.1/share/hadoop/hdfs/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-3.2.1/share/hadoop/mapreduce/*:/opt/hadoop-3.2.1/share/hadoop/yarn:/opt/hadoop-3.2.1/share/hadoop/yarn/lib/*:/opt/hadoop-3.2.1/share/hadoop/yarn/*</value>
  </property>
</configuration>
```
```
# On node1, node2 and node3:
[root@node1 hadoop]# cat workers
node2
node3
```
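If any of these files are changed again later, a hedged sketch for pushing the whole configuration directory from node1 back out to the other nodes (paths match the layout used above):

```
# Re-sync the Hadoop configuration directory from node1 to node2 and node3.
for h in node2 node3; do
  scp -r /opt/hadoop-3.2.1/etc/hadoop/ "$h":/opt/hadoop-3.2.1/etc/
done
```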
A note on `dfs.datanode.data.dir`: its default value is `file://${hadoop.tmp.dir}/dfs/data`.
Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if local filesystem permission allows.
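As an illustration of the storage-type tags mentioned above, a hedged sketch of a multi-directory value; the /data1 and /data2 paths are hypothetical, and this cluster actually uses the single directory configured earlier:

```
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Hypothetical example: one directory on a spinning disk, one on an SSD, tagged for HDFS storage policies -->
  <value>[DISK]file:///data1/hdfs/datanode,[SSD]file:///data2/hdfs/datanode</value>
</property>
```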