Testing is important for verifying a system's correctness and analyzing its performance, but it is often overlooked. To get a fuller picture of the system, find its bottlenecks, and improve its performance, I plan to start with testing and work through Hadoop's main testing tools.
TestDFSIO
TestDFSIO tests the I/O performance of HDFS. It uses a MapReduce job to perform reads and writes concurrently: each map task reads or writes one file, the map output collects statistics about the files processed, and the reduce task accumulates those statistics and produces a summary.
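A typical TestDFSIO session has three phases: write first (to generate the data), then read, then clean up. A minimal sketch using the flags exercised later in this post (the jar name matches the CDH 5.16.1 build used here):

  # write phase: each of 10 map tasks writes one 1000 MB file under /benchmarks/TestDFSIO
  hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
  # read phase: reads back the files created by the write phase
  hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
  # clean phase: deletes /benchmarks/TestDFSIO
  hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -clean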
The NameNode address is 10.*.*.131:7180.
Running hadoop version prints the path where the Hadoop jar files are located.
Change into that directory and run hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar, which returns the following:
An example program must be given as the first argument. Valid program names are:
  DFSCIOTest: Distributed i/o benchmark of libhdfs.
  DistributedFSCheck: Distributed checkup of the file system consistency.
  MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
  TestDFSIO: Distributed i/o benchmark.
  dfsthroughput: measure hdfs throughput
  filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
  loadgen: Generic map/reduce load generator
  mapredtest: A map/reduce test check.
  minicluster: Single process HDFS and MR cluster.
  mrbench: A map/reduce benchmark that can create many small jobs
  nnbench: A benchmark that stresses the namenode.
  testarrayfile: A test for flat files of binary key/value pairs.
  testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
  testfilesystem: A test for FileSystem read/write.
  testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
  testrpc: A test for rpc.
  testsequencefile: A test for flat files of binary key value pairs.
  testsequencefileinputformat: A test for sequence file input format.
  testsetfile: A test for flat files of binary key/value pairs.
  testtextinputformat: A test for text input format.
  threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
Run hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000.
It returns the following:
19/04/02 16:22:30 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/02 16:22:30 INFO fs.TestDFSIO: nrFiles = 10
19/04/02 16:22:30 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/02 16:22:30 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/02 16:22:30 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/02 16:22:31 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
java.io.IOException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
This fails with: java.io.IOException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
Run su hdfs to switch to the hdfs user.
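Alternatively, if sudo is configured for the hdfs account, each command can be prefixed instead of switching shells (an equivalent workaround, not what was done here):

  sudo -u hdfs hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000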
Then run hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000 again as the hdfs user.
It returns the following:
bash-4.2$ hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
19/04/02 16:26:39 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/02 16:26:39 INFO fs.TestDFSIO: nrFiles = 10
19/04/02 16:26:39 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/02 16:26:39 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/02 16:26:39 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/02 16:26:40 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/02 16:26:40 INFO fs.TestDFSIO: created control files for: 10 files
19/04/02 16:26:40 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/02 16:26:40 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/02 16:26:41 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/02 16:26:41 INFO mapreduce.JobSubmitter: number of splits:10
19/04/02 16:26:41 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/02 16:26:41 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/02 16:26:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0002
19/04/02 16:26:41 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0002
19/04/02 16:26:41 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0002/
19/04/02 16:26:41 INFO mapreduce.Job: Running job: job_1552358721447_0002
19/04/02 16:26:48 INFO mapreduce.Job: Job job_1552358721447_0002 running in uber mode : false
19/04/02 16:26:48 INFO mapreduce.Job:  map 0% reduce 0%
19/04/02 16:27:02 INFO mapreduce.Job:  map 30% reduce 0%
19/04/02 16:27:03 INFO mapreduce.Job:  map 100% reduce 0%
19/04/02 16:27:08 INFO mapreduce.Job:  map 100% reduce 100%
19/04/02 16:27:08 INFO mapreduce.Job: Job job_1552358721447_0002 completed successfully
19/04/02 16:27:08 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=379
    FILE: Number of bytes written=1653843
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=2310
    HDFS: Number of bytes written=10485760082
    HDFS: Number of read operations=43
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=12
  Job Counters
    Launched map tasks=10
    Launched reduce tasks=1
    Data-local map tasks=10
    Total time spent by all maps in occupied slots (ms)=128477
    Total time spent by all reduces in occupied slots (ms)=2621
    Total time spent by all map tasks (ms)=128477
    Total time spent by all reduce tasks (ms)=2621
    Total vcore-milliseconds taken by all map tasks=128477
    Total vcore-milliseconds taken by all reduce tasks=2621
    Total megabyte-milliseconds taken by all map tasks=131560448
    Total megabyte-milliseconds taken by all reduce tasks=2683904
  Map-Reduce Framework
    Map input records=10
    Map output records=50
    Map output bytes=784
    Map output materialized bytes=1033
    Input split bytes=1190
    Combine input records=0
    Combine output records=0
    Reduce input groups=5
    Reduce shuffle bytes=1033
    Reduce input records=50
    Reduce output records=5
    Spilled Records=100
    Shuffled Maps =10
    Failed Shuffles=0
    Merged Map outputs=10
    GC time elapsed (ms)=2657
    CPU time spent (ms)=94700
    Physical memory (bytes) snapshot=7229349888
    Virtual memory (bytes) snapshot=32021716992
    Total committed heap usage (bytes)=6717702144
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=1120
  File Output Format Counters
    Bytes Written=82
java.io.FileNotFoundException: TestDFSIO_results.log (Permission denied)
This also fails: java.io.FileNotFoundException: TestDFSIO_results.log (Permission denied)
This is because the hdfs user does not have sufficient permissions on the current working directory (TestDFSIO appends its summary to TestDFSIO_results.log in the local directory it is run from); see the comments under https://blog.csdn.net/qq_15547319/article/details/53543587 for details.
Solution: create a new directory ** (mkdir **), grant the hdfs user access to it (sudo chmod -R 777 **), cd into **, and run hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000 from there. It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
19/04/03 10:26:32 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 10:26:32 INFO fs.TestDFSIO: nrFiles = 10
19/04/03 10:26:32 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/03 10:26:32 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 10:26:32 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 10:26:32 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/03 10:26:33 INFO fs.TestDFSIO: created control files for: 10 files
19/04/03 10:26:33 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:26:33 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:26:33 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/03 10:26:33 INFO mapreduce.JobSubmitter: number of splits:10
19/04/03 10:26:33 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 10:26:33 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/03 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0006
19/04/03 10:26:34 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0006
19/04/03 10:26:34 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0006/
19/04/03 10:26:34 INFO mapreduce.Job: Running job: job_1552358721447_0006
19/04/03 10:26:39 INFO mapreduce.Job: Job job_1552358721447_0006 running in uber mode : false
19/04/03 10:26:39 INFO mapreduce.Job:  map 0% reduce 0%
19/04/03 10:26:53 INFO mapreduce.Job:  map 30% reduce 0%
19/04/03 10:26:54 INFO mapreduce.Job:  map 90% reduce 0%
19/04/03 10:26:55 INFO mapreduce.Job:  map 100% reduce 0%
19/04/03 10:27:00 INFO mapreduce.Job:  map 100% reduce 100%
19/04/03 10:27:00 INFO mapreduce.Job: Job job_1552358721447_0006 completed successfully
19/04/03 10:27:00 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=392
    FILE: Number of bytes written=1653853
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=2310
    HDFS: Number of bytes written=10485760082
    HDFS: Number of read operations=43
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=12
  Job Counters
    Launched map tasks=10
    Launched reduce tasks=1
    Data-local map tasks=10
    Total time spent by all maps in occupied slots (ms)=125653
    Total time spent by all reduces in occupied slots (ms)=2636
    Total time spent by all map tasks (ms)=125653
    Total time spent by all reduce tasks (ms)=2636
    Total vcore-milliseconds taken by all map tasks=125653
    Total vcore-milliseconds taken by all reduce tasks=2636
    Total megabyte-milliseconds taken by all map tasks=128668672
    Total megabyte-milliseconds taken by all reduce tasks=2699264
  Map-Reduce Framework
    Map input records=10
    Map output records=50
    Map output bytes=783
    Map output materialized bytes=1030
    Input split bytes=1190
    Combine input records=0
    Combine output records=0
    Reduce input groups=5
    Reduce shuffle bytes=1030
    Reduce input records=50
    Reduce output records=5
    Spilled Records=100
    Shuffled Maps =10
    Failed Shuffles=0
    Merged Map outputs=10
    GC time elapsed (ms)=1881
    CPU time spent (ms)=78110
    Physical memory (bytes) snapshot=6980759552
    Virtual memory (bytes) snapshot=31983017984
    Total committed heap usage (bytes)=6693060608
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=1120
  File Output Format Counters
    Bytes Written=82
19/04/03 10:27:00 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
19/04/03 10:27:00 INFO fs.TestDFSIO:            Date & time: Wed Apr 03 10:27:00 CST 2019
19/04/03 10:27:00 INFO fs.TestDFSIO:        Number of files: 10
19/04/03 10:27:00 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
19/04/03 10:27:00 INFO fs.TestDFSIO:      Throughput mb/sec: 114.77630098937172
19/04/03 10:27:00 INFO fs.TestDFSIO: Average IO rate mb/sec: 115.29634094238281
19/04/03 10:27:00 INFO fs.TestDFSIO:  IO rate std deviation: 7.880011777295818
19/04/03 10:27:00 INFO fs.TestDFSIO:     Test exec time sec: 27.05
19/04/03 10:27:00 INFO fs.TestDFSIO:
bash-4.2$
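As a reading aid, my understanding of the TestDFSIO summary is: Throughput mb/sec is the total data volume divided by the summed I/O time of all map tasks, while Average IO rate mb/sec is the mean of the per-file rates (and IO rate std deviation measures their spread). On that reading, 10000 MB at about 114.8 MB/s implies roughly 87 seconds of aggregate map I/O time, even though the wall-clock Test exec time was only 27.05 seconds, because the 10 maps ran in parallel.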
After the test command completes successfully, a directory is created in the Hadoop file system (/benchmarks/TestDFSIO) to hold the generated test files, along with a set of small files. Downloading one of the small files to the local machine shows it is 1 KB in size; opening it in Notepad++ shows that the content is binary, not human-readable.
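The generated files can also be inspected in place with the standard HDFS shell instead of downloading them (<control file> below is a placeholder for any of the small files listed):

  hdfs dfs -ls -R /benchmarks/TestDFSIO    # list everything the benchmark created
  hdfs dfs -du -h /benchmarks/TestDFSIO    # sizes per subdirectory
  hdfs dfs -cat /benchmarks/TestDFSIO/<control file> | od -c | head    # confirms the content is binary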
Run: hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
19/04/03 10:51:05 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 10:51:05 INFO fs.TestDFSIO: nrFiles = 10
19/04/03 10:51:05 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/03 10:51:05 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 10:51:05 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 10:51:05 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/03 10:51:06 INFO fs.TestDFSIO: created control files for: 10 files
19/04/03 10:51:06 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:51:06 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:51:06 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/03 10:51:06 INFO mapreduce.JobSubmitter: number of splits:10
19/04/03 10:51:06 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 10:51:06 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/03 10:51:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0007
19/04/03 10:51:07 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0007
19/04/03 10:51:07 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0007/
19/04/03 10:51:07 INFO mapreduce.Job: Running job: job_1552358721447_0007
19/04/03 10:51:12 INFO mapreduce.Job: Job job_1552358721447_0007 running in uber mode : false
19/04/03 10:51:12 INFO mapreduce.Job:  map 0% reduce 0%
19/04/03 10:51:19 INFO mapreduce.Job:  map 100% reduce 0%
19/04/03 10:51:25 INFO mapreduce.Job:  map 100% reduce 100%
19/04/03 10:51:25 INFO mapreduce.Job: Job job_1552358721447_0007 completed successfully
19/04/03 10:51:25 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=345
    FILE: Number of bytes written=1653774
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=10485762310
    HDFS: Number of bytes written=81
    HDFS: Number of read operations=53
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=10
    Launched reduce tasks=1
    Data-local map tasks=10
    Total time spent by all maps in occupied slots (ms)=50265
    Total time spent by all reduces in occupied slots (ms)=2630
    Total time spent by all map tasks (ms)=50265
    Total time spent by all reduce tasks (ms)=2630
    Total vcore-milliseconds taken by all map tasks=50265
    Total vcore-milliseconds taken by all reduce tasks=2630
    Total megabyte-milliseconds taken by all map tasks=51471360
    Total megabyte-milliseconds taken by all reduce tasks=2693120
  Map-Reduce Framework
    Map input records=10
    Map output records=50
    Map output bytes=774
    Map output materialized bytes=1020
    Input split bytes=1190
    Combine input records=0
    Combine output records=0
    Reduce input groups=5
    Reduce shuffle bytes=1020
    Reduce input records=50
    Reduce output records=5
    Spilled Records=100
    Shuffled Maps =10
    Failed Shuffles=0
    Merged Map outputs=10
    GC time elapsed (ms)=1310
    CPU time spent (ms)=35780
    Physical memory (bytes) snapshot=6365962240
    Virtual memory (bytes) snapshot=31838441472
    Total committed heap usage (bytes)=6873415680
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=1120
  File Output Format Counters
    Bytes Written=81
19/04/03 10:51:25 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
19/04/03 10:51:25 INFO fs.TestDFSIO:            Date & time: Wed Apr 03 10:51:25 CST 2019
19/04/03 10:51:25 INFO fs.TestDFSIO:        Number of files: 10
19/04/03 10:51:25 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
19/04/03 10:51:25 INFO fs.TestDFSIO:      Throughput mb/sec: 897.4243919949744
19/04/03 10:51:25 INFO fs.TestDFSIO: Average IO rate mb/sec: 898.6844482421875
19/04/03 10:51:25 INFO fs.TestDFSIO:  IO rate std deviation: 33.68623587810037
19/04/03 10:51:25 INFO fs.TestDFSIO:     Test exec time sec: 19.035
19/04/03 10:51:25 INFO fs.TestDFSIO:
bash-4.2$
Run: hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -clean
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -clean
19/04/03 11:17:25 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 11:17:25 INFO fs.TestDFSIO: nrFiles = 1
19/04/03 11:17:25 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
19/04/03 11:17:25 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 11:17:25 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 11:17:26 INFO fs.TestDFSIO: Cleaning up test files
bash-4.2$
The TestDFSIO directory is removed from the Hadoop file system at the same time.
nnbench
nnbench stress-tests the NameNode: it generates a large number of HDFS-related requests, placing heavy load on the NameNode. It can simulate creating, reading, renaming, and deleting files on HDFS.
The nnbench options are described below:
NameNode Benchmark 0.4
Usage: nnbench <options>
Options:
  -operation <Available operations are create_write open_read rename delete. This option is mandatory>
    * NOTE: The open_read, rename and delete operations assume that the files they operate on, are already available. The create_write operation must be run before running the other operations.
  -maps <number of maps. default is 1. This is not mandatory>
  -reduces <number of reduces. default is 1. This is not mandatory>
  -startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time>. default is launch time + 2 mins. This is not mandatory
  -blockSize <Block size in bytes. default is 1. This is not mandatory>
  -bytesToWrite <Bytes to write. default is 0. This is not mandatory>
  -bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
  -numberOfFiles <number of files to create. default is 1. This is not mandatory>
  -replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
  -baseDir <base DFS path. default is /becnhmarks/NNBench. This is not mandatory>
  -readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
  -help: Display the help statement
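Note the ordering constraint in the usage text: create_write must run before open_read, rename, and delete, since the latter operate on files that must already exist. A minimal sketch of a full pass over all four operations (same jar and base directory as the run below; other flags left at their defaults):

  hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation create_write -numberOfFiles 1000 -baseDir /benchmarks/NNBench
  hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation open_read -numberOfFiles 1000 -baseDir /benchmarks/NNBench
  hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation rename -numberOfFiles 1000 -baseDir /benchmarks/NNBench
  hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation delete -numberOfFiles 1000 -baseDir /benchmarks/NNBench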
To create 1000 files using 12 mappers and 6 reducers, run: hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation create_write -maps 12 -reduces 6 -blockSize 1 -bytesToWrite 0 -numberOfFiles 1000 -replicationFactorPerFile 3 -readFileAfterOpen true -baseDir /benchmarks/NNBench
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation create_write -maps 12 -reduces 6 -blockSize 1 -bytesToWrite 0 -numberOfFiles 1000 -replicationFactorPerFile 3 -readFileAfterOpen true -baseDir /benchmarks/NNBench
NameNode Benchmark 0.4
19/04/03 16:11:22 INFO hdfs.NNBench: Test Inputs:
19/04/03 16:11:22 INFO hdfs.NNBench:            Test Operation: create_write
19/04/03 16:11:22 INFO hdfs.NNBench:                Start time: 2019-04-03 16:13:22,755
19/04/03 16:11:22 INFO hdfs.NNBench:            Number of maps: 12
19/04/03 16:11:22 INFO hdfs.NNBench:         Number of reduces: 6
19/04/03 16:11:22 INFO hdfs.NNBench:                Block Size: 1
19/04/03 16:11:22 INFO hdfs.NNBench:            Bytes to write: 0
19/04/03 16:11:22 INFO hdfs.NNBench:        Bytes per checksum: 1
19/04/03 16:11:22 INFO hdfs.NNBench:           Number of files: 1000
19/04/03 16:11:22 INFO hdfs.NNBench:        Replication factor: 3
19/04/03 16:11:22 INFO hdfs.NNBench:                  Base dir: /benchmarks/NNBench
19/04/03 16:11:22 INFO hdfs.NNBench:      Read file after open: true
19/04/03 16:11:23 INFO hdfs.NNBench: Deleting data directory
19/04/03 16:11:23 INFO hdfs.NNBench: Creating 12 control files
19/04/03 16:11:24 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 16:11:24 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 16:11:24 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 16:11:24 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
19/04/03 16:11:24 INFO mapred.FileInputFormat: Total input paths to process : 12
19/04/03 16:11:24 INFO mapreduce.JobSubmitter: number of splits:12
19/04/03 16:11:24 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 16:11:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0009
19/04/03 16:11:24 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0009
19/04/03 16:11:24 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0009/
19/04/03 16:11:24 INFO mapreduce.Job: Running job: job_1552358721447_0009
19/04/03 16:11:31 INFO mapreduce.Job: Job job_1552358721447_0009 running in uber mode : false
19/04/03 16:11:31 INFO mapreduce.Job:  map 0% reduce 0%
19/04/03 16:11:48 INFO mapreduce.Job:  map 50% reduce 0%
19/04/03 16:11:49 INFO mapreduce.Job:  map 67% reduce 0%
19/04/03 16:13:26 INFO mapreduce.Job:  map 100% reduce 0%
19/04/03 16:13:31 INFO mapreduce.Job:  map 100% reduce 17%
19/04/03 16:13:32 INFO mapreduce.Job:  map 100% reduce 100%
19/04/03 16:13:32 INFO mapreduce.Job: Job job_1552358721447_0009 completed successfully
19/04/03 16:13:32 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=519
    FILE: Number of bytes written=2736365
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=2908
    HDFS: Number of bytes written=170
    HDFS: Number of read operations=66
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=12012
  Job Counters
    Launched map tasks=12
    Launched reduce tasks=6
    Data-local map tasks=12
    Total time spent by all maps in occupied slots (ms)=1363711
    Total time spent by all reduces in occupied slots (ms)=18780
    Total time spent by all map tasks (ms)=1363711
    Total time spent by all reduce tasks (ms)=18780
    Total vcore-milliseconds taken by all map tasks=1363711
    Total vcore-milliseconds taken by all reduce tasks=18780
    Total megabyte-milliseconds taken by all map tasks=1396440064
    Total megabyte-milliseconds taken by all reduce tasks=19230720
  Map-Reduce Framework
    Map input records=12
    Map output records=84
    Map output bytes=2016
    Map output materialized bytes=3276
    Input split bytes=1418
    Combine input records=0
    Combine output records=0
    Reduce input groups=7
    Reduce shuffle bytes=3276
    Reduce input records=84
    Reduce output records=7
    Spilled Records=168
    Shuffled Maps =72
    Failed Shuffles=0
    Merged Map outputs=72
    GC time elapsed (ms)=2335
    CPU time spent (ms)=35880
    Physical memory (bytes) snapshot=9088864256
    Virtual memory (bytes) snapshot=52095377408
    Total committed heap usage (bytes)=11191975936
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=1490
  File Output Format Counters
    Bytes Written=170
19/04/03 16:13:32 INFO hdfs.NNBench: -------------- NNBench -------------- :
19/04/03 16:13:32 INFO hdfs.NNBench:                                Version: NameNode Benchmark 0.4
19/04/03 16:13:32 INFO hdfs.NNBench:                            Date & time: 2019-04-03 16:13:32,475
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                         Test Operation: create_write
19/04/03 16:13:32 INFO hdfs.NNBench:                             Start time: 2019-04-03 16:13:22,755
19/04/03 16:13:32 INFO hdfs.NNBench:                            Maps to run: 12
19/04/03 16:13:32 INFO hdfs.NNBench:                         Reduces to run: 6
19/04/03 16:13:32 INFO hdfs.NNBench:                     Block Size (bytes): 1
19/04/03 16:13:32 INFO hdfs.NNBench:                         Bytes to write: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                     Bytes per checksum: 1
19/04/03 16:13:32 INFO hdfs.NNBench:                        Number of files: 1000
19/04/03 16:13:32 INFO hdfs.NNBench:                     Replication factor: 3
19/04/03 16:13:32 INFO hdfs.NNBench:             Successful file operations: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:         # maps that missed the barrier: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                           # exceptions: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                TPS: Create/Write/Close: 0
19/04/03 16:13:32 INFO hdfs.NNBench: Avg exec time (ms): Create/Write/Close: 0.0
19/04/03 16:13:32 INFO hdfs.NNBench:             Avg Lat (ms): Create/Write: NaN
19/04/03 16:13:32 INFO hdfs.NNBench:                    Avg Lat (ms): Close: NaN
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                  RAW DATA: AL Total #1: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                  RAW DATA: AL Total #2: 0
19/04/03 16:13:32 INFO hdfs.NNBench:               RAW DATA: TPS Total (ms): 0
19/04/03 16:13:32 INFO hdfs.NNBench:        RAW DATA: Longest Map Time (ms): 0.0
19/04/03 16:13:32 INFO hdfs.NNBench:                    RAW DATA: Late maps: 0
19/04/03 16:13:32 INFO hdfs.NNBench:              RAW DATA: # of exceptions: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
bash-4.2$
Once the job has finished, its details can be viewed at http://*.*.*.*:19888/jobhistory/job/job_1552358721447_0009.
An NNBench directory is also created in the Hadoop file system to store the files produced by the job.
Navigating to /benchmarks/NNBench/control and viewing the metadata of one of the files, NNBench_Controlfile_0, shows that the file is stored on three nodes.
Downloading it and opening it in Notepad++ shows that the content is unreadable binary data.
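The replica placement can also be confirmed from the command line with the standard fsck tool, using the path shown above:

  hdfs fsck /benchmarks/NNBench/control/NNBench_Controlfile_0 -files -blocks -locations
  # -files prints the file status, -blocks lists its blocks, and -locations shows
  # the DataNodes holding each replica (three here, matching the metadata view)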
mrbench
mrbench runs a small job repeatedly, checking whether small jobs run reproducibly and efficiently on the cluster. Its usage is as follows:
Usage: mrbench
  [-baseDir <base DFS path for output/input, default is /benchmarks/MRBench>]
  [-jar <local path to job jar file containing Mapper and Reducer implementations, default is current jar file>]
  [-numRuns <number of times to run the job, default is 1>]
  [-maps <number of maps for each run, default is 2>]
  [-reduces <number of reduces for each run, default is 1>]
  [-inputLines <number of input lines to generate, default is 1>]
  [-inputType <type of input to generate, one of ascending (default), descending, random>]
  [-verbose]
Run: hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar mrbench -numRuns 50
It returns the following:
……
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=3
  File Output Format Counters
    Bytes Written=3
19/04/03 17:10:15 INFO mapred.MRBench: Running job 49: input=hdfs://node1:8020/benchmarks/MRBench/mr_input output=hdfs://node1:8020/benchmarks/MRBench/mr_output/output_299739316
19/04/03 17:10:15 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 17:10:15 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 17:10:15 INFO mapred.FileInputFormat: Total input paths to process : 1
19/04/03 17:10:15 INFO mapreduce.JobSubmitter: number of splits:2
19/04/03 17:10:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0059
19/04/03 17:10:15 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0059
19/04/03 17:10:15 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0059/
19/04/03 17:10:15 INFO mapreduce.Job: Running job: job_1552358721447_0059
19/04/03 17:10:21 INFO mapreduce.Job: Job job_1552358721447_0059 running in uber mode : false
19/04/03 17:10:21 INFO mapreduce.Job:  map 0% reduce 0%
19/04/03 17:10:25 INFO mapreduce.Job:  map 100% reduce 0%
19/04/03 17:10:30 INFO mapreduce.Job:  map 100% reduce 100%
19/04/03 17:10:30 INFO mapreduce.Job: Job job_1552358721447_0059 completed successfully
19/04/03 17:10:30 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=27
    FILE: Number of bytes written=450422
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=239
    HDFS: Number of bytes written=3
    HDFS: Number of read operations=9
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=2
    Launched reduce tasks=1
    Data-local map tasks=2
    Total time spent by all maps in occupied slots (ms)=5134
    Total time spent by all reduces in occupied slots (ms)=2562
    Total time spent by all map tasks (ms)=5134
    Total time spent by all reduce tasks (ms)=2562
    Total vcore-milliseconds taken by all map tasks=5134
    Total vcore-milliseconds taken by all reduce tasks=2562
    Total megabyte-milliseconds taken by all map tasks=5257216
    Total megabyte-milliseconds taken by all reduce tasks=2623488
  Map-Reduce Framework
    Map input records=1
    Map output records=1
    Map output bytes=5
    Map output materialized bytes=39
    Input split bytes=236
    Combine input records=0
    Combine output records=0
    Reduce input groups=1
    Reduce shuffle bytes=39
    Reduce input records=1
    Reduce output records=1
    Spilled Records=2
    Shuffled Maps =2
    Failed Shuffles=0
    Merged Map outputs=2
    GC time elapsed (ms)=196
    CPU time spent (ms)=2550
    Physical memory (bytes) snapshot=1503531008
    Virtual memory (bytes) snapshot=8690847744
    Total committed heap usage (bytes)=1791492096
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=3
  File Output Format Counters
    Bytes Written=3
DataLines       Maps    Reduces AvgTime (milliseconds)
1               2       1       15357
bash-4.2$
The AvgTime column shows that the average job completion time was 15357 ms, i.e. about 15 seconds per run.
Opening http://*.*.*.*:8088/cluster shows information about the jobs that were executed.
Corresponding directories were also created in the Hadoop file system, but their contents are empty.
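mrbench has no clean option (see the usage above), so the leftover directories under /benchmarks/MRBench have to be removed by hand if desired, e.g. with the standard HDFS shell:

  hdfs dfs -rm -r /benchmarks/MRBench    # removes the empty mr_input/mr_output directories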
References: https://blog.51cto.com/7543154/1243883 ; http://www.javashuo.com/article/p-ftetlymu-ha.html ; https://blog.csdn.net/flygoa/article/details/52127382