In the previous post, I walked you through installing Hadoop and starting the cluster. In this post we will get a feel for how Hadoop's filesystem commands differ from ordinary Linux commands.
1. First, start the Hadoop cluster by running the scripts sh start-dfs.sh and sh start-yarn.sh.
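A quick way to confirm the daemons actually came up is jps on each machine. The exact process list depends on how you laid out the cluster in the previous post; here I assume the namenode and resourcemanager both live on centos-aaron-h1:

```bash
# On the namenode machine, expect processes such as:
#   NameNode  SecondaryNameNode  ResourceManager
jps

# On each datanode machine, expect:
#   DataNode  NodeManager
jps
```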
2. View the DFS file tree in the browser (it is empty at this point). It corresponds to the path /home/hadoop/hdpdata/dfs/data/current/BP-302498708-192.168.29.144-1540943832361/current/finalized on the datanodes, where /home/hadoop/hdpdata is the DFS data directory set in the configuration file.
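If you want to see the block files themselves, log in to a datanode and look under that directory. A quick sketch; note the BP-* block-pool directory name is cluster-specific, so substitute your own:

```bash
# Run on a datanode. Block files are named blk_<id>, each paired with a
# blk_<id>_<genstamp>.meta checksum file.
find /home/hadoop/hdpdata/dfs/data/current -name 'blk_*' | head
```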
3. Trying out the command line
```bash
# List the HDFS root directory
hadoop fs -ls hdfs://centos-aaron-h1:9000/
# or, abbreviated:
hadoop fs -ls /

# Upload test.avi (one or more files) to the HDFS root; files larger than
# the block size (128 MB by default) are split into separate blocks for storage
hadoop fs -put test.avi /
hadoop fs -put test.avi test1.avi /

# If a file was split into blocks and you know which block files they are,
# you can reassemble the file directly from the blocks
cat blk_2034655 >> hadoop.file
cat blk_2034656 >> hadoop.file
tar -zxvf hadoop.file -C /opt

# View the contents of test.avi
hadoop fs -cat /test.avi

# Download test.avi from HDFS
hadoop fs -get /test.avi

# Delete a file or directory
hadoop fs -rmr /wordcount/output
# or (the non-deprecated form):
hadoop fs -rm -r /wordcount/output

# Create a directory, including parent directories
hadoop fs -mkdir -p /wordcount/input

# Run a jar: wordcount is the program name, /wordcount/input is the program's
# input directory, /wordcount/output/ its output directory; the job fails if
# "/wordcount/output/" already exists
hadoop jar hadoop-mapreduce-examples-2.9.1.jar wordcount /wordcount/input/ /wordcount/output/
```
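One way to convince yourself that the manual block reassembly above is byte-exact is to fetch the same file through HDFS and compare checksums. A sketch, assuming the two block files belonged to the file stored at /test.avi (substitute whichever file they actually came from):

```bash
# Download the original through HDFS and compare it with the copy
# stitched together from the raw block files (hadoop.file)
hadoop fs -get /test.avi ./from-hdfs.avi
md5sum from-hdfs.avi hadoop.file   # identical sums mean the reassembly worked
```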
Screenshots: the uploaded files appear in the HDFS web UI, where they can also be downloaded.
4. Running your first MapReduce program
First, the Hadoop distribution itself ships with some example MapReduce jars for testing; they live under /home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce:
This test uses hadoop-mapreduce-examples-2.9.1.jar (described in the official documentation). Among other things, the jar contains a word-count MapReduce program whose program name is wordcount.
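A handy detail: running the examples jar with no arguments makes it print the list of bundled example programs, so you can see wordcount and the others before choosing one:

```bash
cd /home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce
# Prints "Valid program names are: ..." including wordcount, grep, pi, ...
hadoop jar hadoop-mapreduce-examples-2.9.1.jar
```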
Test steps:
a. Prepare the input directory and the result output directory for the word count test
```bash
# Create the input directory, including parent directories
hadoop fs -mkdir -p /wordcount/input
```
b. Prepare the files of words to run the word count against
```bash
# Create test.txt and type in a few words, then save and quit
# (in vi: enter some words, then save and exit with ZZ, i.e. Shift+Z twice)
vi test.txt
# Make a copy
cp test.txt test1.txt
# Upload both files to the input directory created in the previous step
hadoop fs -put test.txt test1.txt /wordcount/input
```
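If you would rather not open vi, the same three steps can be done non-interactively (the words themselves do not matter; any text will do):

```bash
# Equivalent preparation without an editor
echo "hello hadoop hello" > test.txt
cp test.txt test1.txt
hadoop fs -put test.txt test1.txt /wordcount/input
```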
c. Run hadoop-mapreduce-examples-2.9.1.jar
```bash
# Run the jar: wordcount is the program name, /wordcount/input is the program's
# input directory, /wordcount/output/ its output directory; the job fails if
# "/wordcount/output/" already exists
hadoop jar hadoop-mapreduce-examples-2.9.1.jar wordcount /wordcount/input/ /wordcount/output/
```
Sample run:

```
[hadoop@centos-aaron-h3 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.9.1.jar wordcount /wordcount/input/ /wordcount/output/
18/11/04 04:59:13 INFO client.RMProxy: Connecting to ResourceManager at centos-aaron-h1/192.168.29.144:8032
18/11/04 04:59:14 INFO input.FileInputFormat: Total input files to process : 2
18/11/04 04:59:15 INFO mapreduce.JobSubmitter: number of splits:2
18/11/04 04:59:15 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/11/04 04:59:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1541270195271_0001
18/11/04 04:59:17 INFO impl.YarnClientImpl: Submitted application application_1541270195271_0001
18/11/04 04:59:17 INFO mapreduce.Job: The url to track the job: http://centos-aaron-h1:8088/proxy/application_1541270195271_0001/
18/11/04 04:59:17 INFO mapreduce.Job: Running job: job_1541270195271_0001
18/11/04 04:59:29 INFO mapreduce.Job: Job job_1541270195271_0001 running in uber mode : false
18/11/04 04:59:29 INFO mapreduce.Job:  map 0% reduce 0%
18/11/04 04:59:46 INFO mapreduce.Job:  map 100% reduce 0%
18/11/04 04:59:56 INFO mapreduce.Job:  map 100% reduce 100%
18/11/04 04:59:57 INFO mapreduce.Job: Job job_1541270195271_0001 completed successfully
18/11/04 04:59:57 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=40
		FILE: Number of bytes written=592624
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=257
		HDFS: Number of bytes written=13
		HDFS: Number of read operations=9
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=2
		Launched reduce tasks=1
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=29268
		Total time spent by all reduces in occupied slots (ms)=7337
		Total time spent by all map tasks (ms)=29268
		Total time spent by all reduce tasks (ms)=7337
		Total vcore-milliseconds taken by all map tasks=29268
		Total vcore-milliseconds taken by all reduce tasks=7337
		Total megabyte-milliseconds taken by all map tasks=29970432
		Total megabyte-milliseconds taken by all reduce tasks=7513088
	Map-Reduce Framework
		Map input records=2
		Map output records=2
		Map output bytes=30
		Map output materialized bytes=46
		Input split bytes=235
		Combine input records=2
		Combine output records=2
		Reduce input groups=1
		Reduce shuffle bytes=46
		Reduce input records=2
		Reduce output records=1
		Spilled Records=4
		Shuffled Maps =2
		Failed Shuffles=0
		Merged Map outputs=2
		GC time elapsed (ms)=798
		CPU time spent (ms)=4970
		Physical memory (bytes) snapshot=547995648
		Virtual memory (bytes) snapshot=2537877504
		Total committed heap usage (bytes)=259543040
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=22
	File Output Format Counters
		Bytes Written=13
```
d. Check the results: _SUCCESS is an empty marker file indicating the job succeeded, and part-r-00000 holds the output
```bash
hadoop fs -ls /wordcount/output
hadoop fs -cat /wordcount/output/part-r-00000
```
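The output file holds one tab-separated word/count pair per line. From the counters above (Map output records=2, Reduce output records=1) we can tell both test files contained the same single word, so the output is one line; if that word happened to be hello, it would look like:

```
hello	2
```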
5. Hadoop fs commands in detail
```bash
# 1. Get the list of commands
hadoop fs

# 2. -help: print the manual for the fs command parameters
hadoop fs -help

# 3. Page through output with a pipe
hadoop fs -cat /wordcount/output/part-r-00000 | more

# 4. -ls: list a directory
hadoop fs -ls hdfs://hadoop-server01:9000/
# Note: hadoop-server01 is the namenode's hostname or domain name; every HDFS
# path can be abbreviated, so the following is equivalent:
hadoop fs -ls /

# 5. -mkdir: create a directory in HDFS
hadoop fs -mkdir -p /aaa/bbb/cc/dd

# 6. -moveFromLocal: cut and paste from the local filesystem into HDFS
hadoop fs -moveFromLocal /home/hadoop/a.txt /aaa/bbb/cc/dd

# 7. -moveToLocal: cut and paste from HDFS to the local filesystem
hadoop fs -moveToLocal /aaa/bbb/cc/dd /home/hadoop/a.txt

# 8. -appendToFile: append a local file to the end of an existing HDFS file
hadoop fs -appendToFile ./hello.txt hdfs://hadoop-server01:9000/hello.txt
# abbreviated form:
hadoop fs -appendToFile ./hello.txt /hello.txt

# 9. -cat: print a file's contents
hadoop fs -cat /hello.txt

# 10. -tail: show the end of a file
hadoop fs -tail /weblog/access_log.1

# 11. -text: print a file's contents as text
hadoop fs -text /weblog/access_log.1

# 13. -chgrp / -chmod / -chown: same usage as on a Linux filesystem,
# for changing file permissions and ownership
hadoop fs -chmod 666 /hello.txt
hadoop fs -chown someuser:somegrp /hello.txt

# 14. -copyFromLocal: copy a file from the local filesystem into HDFS
hadoop fs -copyFromLocal ./jdk.tar.gz /aaa/

# 15. -copyToLocal: copy from HDFS to the local filesystem
hadoop fs -copyToLocal /aaa/jdk.tar.gz

# 16. -cp: copy from one HDFS path to another HDFS path
hadoop fs -cp /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2

# 17. -mv: move a file within HDFS
hadoop fs -mv /aaa/jdk.tar.gz /

# 18. -get: same as copyToLocal, i.e. download a file from HDFS
hadoop fs -get /aaa/jdk.tar.gz

# 19. -getmerge: download and merge multiple files,
# e.g. when the HDFS directory /aaa/ contains log.1, log.2, log.3, ...
hadoop fs -getmerge /aaa/log.* ./log.sum

# 20. -put: same as copyFromLocal
hadoop fs -put /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2

# 21. -rm: delete files or directories
hadoop fs -rm -r /aaa/bbb/

# 22. -rmdir: delete an empty directory
hadoop fs -rmdir /aaa/bbb/ccc

# 23. -df: report the filesystem's available space
hadoop fs -df -h /

# 24. -du: report directory sizes
hadoop fs -du -s -h /aaa/*
# for the root directory, write the full URI (centos-aaron-h1 is the namenode):
hadoop fs -du -s -h hdfs://centos-aaron-h1:9000/

# 25. -count: count the file nodes under a given directory
hadoop fs -count /aaa/

# 26. -setrep: set the replication factor of a file in HDFS
hadoop fs -setrep 3 /aaa/jdk.tar.gz
# Note: the replication factor set here is only recorded in the namenode's
# metadata; whether that many replicas actually exist depends on the number
# of datanodes.
```
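To see how these compose, here is a small sketch of a housekeeping script that strings several of them together (the /weblog path is the illustrative one used above; adjust to your own layout):

```bash
#!/bin/bash
# Merge the HDFS access logs into one local file, then report space usage.
set -e
hadoop fs -getmerge /weblog/access_log.* ./access_log.merged
hadoop fs -du -s -h /weblog     # total size of the log directory
hadoop fs -df -h /              # free space across the whole filesystem
```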
Final notes (a few of these are demonstrated in the sketch after the list):
a. When a file larger than the configured block size is uploaded to HDFS, it is split into blocks. Externally it still appears as a single file, but internally the blocks may live on different nodes, and there can be many of them;

b. HDFS does not strictly enforce users and groups: you can set any owner and group you like, and the change succeeds even if that user or group does not exist;

c. How splitting works on upload: the client reads the input stream in a loop, and once the bytes read exceed the configured block size (128 MB by default), it opens a new socket and output stream and writes to another block file. Blocks of the same file can therefore end up on different cluster nodes, and each block is replicated as many times as the configured replication factor (3 by default);

d. While a MapReduce program runs, a temporary tmp folder holding intermediate data appears under the HDFS root;

e. If the configured replication factor exceeds the number of datanodes, at most as many replicas as there are datanodes are stored; a given block never keeps more than one replica on the same datanode;

f. The replica count shown in the namenode web console is the value from the metadata; because of note e, it may deviate from the real number of replicas;
g. The namenode listens for HDFS client connections on port 9000 (the port configured in fs.defaultFS for this cluster).
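Several of these notes are easy to verify from the shell. A sketch, where the file paths and the replication value are illustrative:

```bash
# (a) Show how /test.avi was split into blocks and where each replica lives
hdfs fsck /test.avi -files -blocks -locations

# (b) chown succeeds even for a user/group that exists on no node
hadoop fs -chown nosuchuser:nosuchgrp /hello.txt
hadoop fs -ls /hello.txt

# (e) Ask for more replicas than there are datanodes: the file is simply
# reported as under-replicated until enough datanodes join
hadoop fs -setrep 10 /test.avi
hdfs fsck /test.avi

# (g) Confirm the namenode address (port 9000 in this cluster's config)
hdfs getconf -confKey fs.defaultFS
```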
That is everything for this post. If you found it useful, please give it a like; if you are interested in my other server-side topics or in the blog itself, follow along, and feel free to reach out to me anytime.