Hadoop Notes (1) — HDFS command-line access

 

The earlier CDH 5.8.3 installation and deployment post was just the appetizer. Today I'm starting the first proper article of this series, to give myself a complete, organized set of notes.

Pick up almost any Hadoop book today and its HDFS chapter will list the hadoop dfs commands one by one: viewing, uploading and downloading, creating, deleting, copying, moving and so on. Yet even the latest editions published this year, which already cover Hadoop 2.6 and above, still don't use the command-line interaction that Hadoop now officially recommends; if you type the commands as printed, you get a pile of Deprecated warnings. In this post I group the commands by category and put the old (deprecated) commands side by side with their recommended replacements.

(Source: https://my.oschina.net/happyBKs/blog/811739)

To make this easy to look up later and immediately usable for readers, I give every command as a worked example instead of repeating the textual descriptions copied from blog to blog. Where a little extra explanation helps, I'll still add a few words.

 

To set things up, let's first create some HDFS directories. We'll use the old command here and list the new commands later. We create two directories.

[root@localhost hadoop]#  hadoop fs -mkdir /myhome
[root@localhost hadoop]#  hadoop fs -mkdir /myhome/happyBKs

 

1. Pushing and pulling files:

Push: upload a file from the local file system to HDFS.

Pull: download a file from HDFS into a local file system directory.

(1) The "push" command:

hadoop dfs -put <local file> <HDFS path: either a directory, in which case the file is uploaded into it, or a file name, in which case the uploaded file is renamed accordingly>

The command above has existed since Hadoop 1 and is the form most books introduce; some also use hadoop fs ... and so on.

Let's use it to push the text file hadooptext.txt to the HDFS root directory /.

[root@localhost log]# ll
total 32
-rw-r--r--. 1 root root 31894 Dec  9 06:45 hadooptext.txt

[root@localhost log]# hadoop dfs -put hadooptext.txt /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
16/12/09 06:48:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost log]# hadoop dfs -ls /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/12/09 06:49:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome
[root@localhost log]# hadoop dfs -lsr /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.
16/12/09 06:50:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome/happyBKs

Note that along the way, CDH 5.8's HDFS warns that this command is deprecated.

So what is the recommended way?

The "push" command (the officially recommended new form):

hdfs dfs -put <local file> <HDFS path: either a directory, in which case the file is uploaded into it, or a file name, in which case the uploaded file is renamed accordingly>

Let's use this form to push another local file, googledev.txt, to the /myhome/happyBKs directory on HDFS.

Then push the earlier hadooptext.txt to /myhome/happyBKs as well, renaming it hadooptext2.txt.

This way the Deprecated warning no longer appears. In my runs hdfs dfs also started up noticeably faster than hadoop dfs; comparing the two back to back, the difference is fairly obvious.

[root@localhost log]# nano
[root@localhost log]# ll
total 36
-rw-r--r--. 1 root root  2185 Dec  9 06:57 googledev.txt
-rw-r--r--. 1 root root 31894 Dec  9 06:45 hadooptext.txt
[root@localhost log]# hdfs dfs -put googledev.txt /myhome/happyBKs
16/12/09 06:57:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost log]# hdfs dfs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
16/12/09 06:58:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-09 06:57 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
[root@localhost log]# hdfs dfs -put hadooptext.txt /myhome/happyBKs/hadooptext2.txt
16/12/09 06:59:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost log]# hdfs dfs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
16/12/09 06:59:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-09 06:59 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
[root@localhost log]#

You can also upload an entire local directory to HDFS:

[root@localhost mqbag]# ll
total 8
-rw-r--r--. 1 root root 85 Dec 11 05:32 tangshi1.txt
-rw-r--r--. 1 root root 82 Dec 11 05:33 tangshi2.txt
[root@localhost mqbag]# cd ..
[root@localhost log]# 
[root@localhost log]# 
[root@localhost log]# 
[root@localhost log]# hdfs dfs put mqbag/ /myhome/happyBKs
put: Unknown command
Did you mean -put?  This command begins with a dash.
[root@localhost log]# hdfs dfs -put mqbag/ /myhome/happyBKs
16/12/11 05:50:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[root@localhost log]# hdfs dfs -ls -R /
16/12/11 05:51:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-04 06:46 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs/mqbag
-rw-r--r--   1 root supergroup         85 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi2.txt
[root@localhost log]#

(2) There is another "push" command similar to put: -copyFromLocal

Its usage is exactly the same:

[root@localhost mqbag]# hadoop dfs -copyFromLocal *.txt /myhome
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/12/11 05:56:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[root@localhost mqbag]# hdfs dfs -ls /myhome
16/12/11 05:57:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt

The recommended form:

[root@localhost mqbag]# hdfs dfs -copyFromLocal tangshi1.txt /myhome/tangshi1_1.txt
16/12/11 06:14:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost mqbag]# hdfs dfs -ls /myhome
16/12/11 06:14:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
[root@localhost mqbag]#

 

(3) The "pull" command

The old command:

[root@localhost mqbag]# hadoop dfs -get /myhome/tangshi1_1.txt ./
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/12/11 06:42:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost mqbag]# ll
total 12
-rw-r--r--. 1 root root 85 Dec 11 06:42 tangshi1_1.txt
-rw-r--r--. 1 root root 85 Dec 11 05:32 tangshi1.txt
-rw-r--r--. 1 root root 82 Dec 11 05:33 tangshi2.txt

The recommended form:

[root@localhost mqbag]# rm tangshi1_1.txt 
rm: remove regular file ‘tangshi1_1.txt’? y

[root@localhost mqbag]# hdfs dfs -get /myhome/tangshi1_1.txt tangshi1_1_local.txt
16/12/11 07:03:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost mqbag]# ll
total 12
-rw-r--r--. 1 root root 85 Dec 11 07:03 tangshi1_1_local.txt
-rw-r--r--. 1 root root 85 Dec 11 05:32 tangshi1.txt
-rw-r--r--. 1 root root 82 Dec 11 05:33 tangshi2.txt
[root@localhost mqbag]#

When pulling a file, you can choose whether to fetch files that fail CRC verification, using -ignoreCrc.

You can also use -crc to copy the file together with its CRC information.

hdfs dfs -get [-ignoreCrc] [-crc] <src> <localdst>
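
For example, a minimal sketch reusing the tangshi1.txt file from above (the local destination names are made up so the two downloads don't collide):

hdfs dfs -get -crc /myhome/tangshi1.txt ./tangshi1_with_crc.txt        # also writes a local checksum file (.<name>.crc) alongside it
hdfs dfs -get -ignoreCrc /myhome/tangshi1.txt ./tangshi1_nocrc.txt     # skip checksum verification on download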

 

(4) Moving a local file to HDFS without keeping the local copy.

 

[root@localhost mqbag]# hdfs dfs -moveFromLocal tangshi1_1_local.txt /myhome
16/12/11 07:07:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost mqbag]# ll
total 8
-rw-r--r--. 1 root root 85 Dec 11 05:32 tangshi1.txt
-rw-r--r--. 1 root root 82 Dec 11 05:33 tangshi2.txt
[root@localhost mqbag]# hdfs dfs -ls /myhome
16/12/11 07:07:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 5 items
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 07:07 /myhome/tangshi1_1_local.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
[root@localhost mqbag]#

 

 

2. File-viewing commands:

(1) Listing an HDFS directory:

[root@localhost hadoop]# hdfs dfs -ls /
16/12/13 06:55:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 07:07 /myhome

 

[root@localhost hadoop]# hadoop dfs -lsr /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

lsr: DEPRECATED: Please use 'ls -R' instead.
16/12/13 07:00:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 07:07 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs/mqbag
-rw-r--r--   1 root supergroup         85 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi2.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 07:07 /myhome/tangshi1_1_local.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -ls -R /
16/12/13 07:01:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 07:07 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs/mqbag
-rw-r--r--   1 root supergroup         85 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi2.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 07:07 /myhome/tangshi1_1_local.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
[root@localhost hadoop]#

 

(2) Commands for checking the size and count of HDFS files and directories

There are four commands to know: three for sizes and one for counts.

a. Total capacity, used space, free space, and usage percentage for a path (the most recommended way to check capacity).

hdfs dfs -df <path>

Shows the capacity and usage of the file system behind the given path: the total size, how many bytes are used, how many bytes remain available, and the usage percentage. It does not list the files or subdirectories under the path or their sizes.

[root@localhost hadoop]# hdfs dfs -df /
16/12/13 07:10:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Filesystem                 Size    Used    Available  Use%
hdfs://master:9000  19001245696  143360  12976656384    0%
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -df /myhome/happyBKs
16/12/15 06:44:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Filesystem                 Size    Used    Available  Use%
hdfs://master:9000  19001245696  143360  12976279552    0%
[root@localhost hadoop]#

 

b. Showing the sizes of the files and subdirectories inside a directory (the children) versus the total size of the directory itself (only the directory).

hdfs dfs -du /

Lists the files and immediate subdirectories under the directory with their sizes in bytes, but does not recurse into grandchild directories and does not show the total size of the directory itself.

hdfs dfs -du -s /

Shows only the total size of the directory itself, without listing any of its files or subdirectories.

The examples below show the difference between the two commands:

[root@localhost hadoop]# hdfs dfs -du /
16/12/13 07:12:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
31894  31894  /hadooptext.txt
34583  34583  /myhome
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -du /myhome/happyBKs
16/12/15 06:30:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2185   2185   /myhome/happyBKs/googledev.txt
31894  31894  /myhome/happyBKs/hadooptext2.txt
167    167    /myhome/happyBKs/mqbag
[root@localhost hadoop]#

If you only care about the total size of a directory:

[root@localhost hadoop]# hdfs dfs -du -s /
16/12/15 06:27:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
66477  66477  /
[root@localhost hadoop]# hdfs dfs -du -s /myhome
16/12/15 06:28:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
34583  34583  /myhome
[root@localhost hadoop]# hdfs dfs -du -s /myhome/happyBKs
16/12/15 06:28:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
34246  34246  /myhome/happyBKs
[root@localhost hadoop]#

Of course -du -s can also be written as -dus, but that form is DEPRECATED.

[root@localhost hadoop]# hdfs dfs -dus /
dus: DEPRECATED: Please use 'du -s' instead.
16/12/13 07:13:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
66477  66477  /

 

c. Counting

Check how many directories and files are under a given path.

Shows the number of directories and files under <path>. The output columns are:

directory count   file count   size (bytes)   path name

hdfs dfs -count [-q] /

With -q, quota information is shown as well (name quota, remaining name quota, space quota, remaining space quota), prepended to the columns above.

[root@localhost hadoop]# hdfs dfs -count -q /
16/12/13 07:15:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
9223372036854775807 9223372036854775794            none             inf            4            9              66477 /
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -count /
16/12/13 07:16:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
           4            9              66477 /
[root@localhost hadoop]#

 

Of course, you can also check HDFS usage and browse its files through a web browser. Just make sure the corresponding HDFS web port on the Hadoop server has been opened in the firewall, preferably permanently, otherwise you won't be able to reach it on port 50070 (if you configured a different port for HDFS, open that one instead).
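
A minimal sketch of opening the port permanently, assuming CentOS 7 with firewalld (if your machine uses iptables instead, the commands will differ):

firewall-cmd --zone=public --add-port=50070/tcp --permanent   # open the NameNode web UI port for good
firewall-cmd --reload                                         # apply the rule without restarting anything

After that, http://<namenode-host>:50070 shows the NameNode web UI with the cluster summary and a file-system browser.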

 

 

 

 

3. Viewing file contents

a. Two commands can be used to view a file's contents:

hdfs dfs -cat <src>

Displays the contents of the file at HDFS path <src>.

-cat [-ignoreCrc] <src> ... :
  Fetch all files that match the file pattern <src> and display their content on
  stdout.

 

hdfs dfs -text <src>

Outputs the file at HDFS path <src> in text format.

-text [-ignoreCrc] <src> ... :
  Takes a source file and outputs the file in text format.
  The allowed formats are zip and TextRecordInputStream and Avro.
 

Can't spot any difference? Indeed, when viewing plain text content the output is identical:

[root@localhost hadoop]# hdfs dfs -cat /myhome/tangshi1.txt
16/12/18 05:53:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
八陣圖

功蓋三分國,名成八陣圖。
江流石不轉,遺恨失吞吳。
[root@localhost hadoop]# hdfs dfs -text /myhome/tangshi1.txt
16/12/18 05:53:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
八陣圖

功蓋三分國,名成八陣圖。
江流石不轉,遺恨失吞吳。
[root@localhost hadoop]#
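
Where -text does make a difference is with files that are not plain text: it decodes compressed files and sequence files into readable output, while -cat only dumps the raw bytes. A small sketch under that assumption (the gzipped file name below is made up for illustration):

hdfs dfs -put access_log.gz /myhome/happyBKs/     # suppose a gzip-compressed log has been uploaded
hdfs dfs -cat /myhome/happyBKs/access_log.gz      # prints the raw, unreadable compressed bytes
hdfs dfs -text /myhome/happyBKs/access_log.gz     # picks a codec by file extension and prints readable text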

 

b. Viewing the last 1 KB of a file, and monitoring log updates live

hdfs dfs -tail <src>

Shows the last 1 KB of the file, which is very handy for huge log files.

[root@localhost logs]# hdfs dfs -tail /myhome/hadoop-hadoop-datanode-localhost.localdomain.log
16/12/18 06:34:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
127.0.0.1:38881, dest: /127.0.0.1:50010, bytes: 15181, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_586277825_1, offset: 0, srvID: 54611752-a966-4db1-aabe-8d4525b73d78, blockid: BP-1682033911-127.0.0.1-1480341812291:blk_1073741834_1010, duration: 848268122
2016-12-18 06:07:49,911 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1682033911-127.0.0.1-1480341812291:blk_1073741834_1010, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2016-12-18 06:10:41,876 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741834_1010 file /opt/hdfs/data/current/BP-1682033911-127.0.0.1-1480341812291/current/finalized/subdir0/subdir0/blk_1073741834 for deletion
2016-12-18 06:10:41,914 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1682033911-127.0.0.1-1480341812291 blk_1073741834_1010 file /opt/hdfs/data/current/BP-1682033911-127.0.0.1-1480341812291/current/finalized/subdir0/subdir0/blk_1073741834
[root@localhost logs]#

You could say it is tailor-made for log files, because it also handles the case where the file keeps changing.

[root@localhost logs]# hdfs dfs -tail -f /myhome/hadoop-hadoop-datanode-localhost.localdomain.log
16/12/18 06:53:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
127.0.0.1:38881, dest: /127.0.0.1:50010, bytes: 15181, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_586277825_1, offset: 0, srvID: 54611752-a966-4db1-aabe-8d4525b73d78, blockid: BP-1682033911-127.0.0.1-1480341812291:blk_1073741834_1010, duration: 848268122
2016-12-18 06:07:49,911 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1682033911-127.0.0.1-1480341812291:blk_1073741834_1010, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2016-12-18 06:10:41,876 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741834_1010 file /opt/hdfs/data/current/BP-1682033911-127.0.0.1-1480341812291/current/finalized/subdir0/subdir0/blk_1073741834 for deletion
2016-12-18 06:10:41,914 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1682033911-127.0.0.1-1480341812291 blk_1073741834_1010 file /opt/hdfs/data/current/BP-1682033911-127.0.0.1-1480341812291/current/finalized/subdir0/subdir0/blk_1073741834
^Ctail: Filesystem closed

Adding -f switches to a monitoring mode: after printing the last 1 KB, the command does not return to the prompt but waits for new content, and the output keeps updating as data is appended to the file. This is ideal for watching log files. To stop monitoring, just press Ctrl+C.

c. Getting statistics for a file or directory on HDFS

The command below prints statistics for the file or directory at HDFS path <path>. The format specifiers are:

%b      file size
%n      file name
%r      replication factor (number of replicas)
%y, %Y  modification time (%y human readable, %Y milliseconds since the epoch)
%o      block size

Here's an example.

[root@localhost logs]# hadoop fs -stat '%b %n %r %y %Y %o' /myhome/tangshi1.txt
16/12/18 07:22:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
85 tangshi1.txt 1 2016-12-11 13:56:32 1481464592248 134217728
[root@localhost logs]# hadoop fs -stat '%b %n %r %y %Y %o' /myhome/
16/12/18 07:24:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0 myhome 0 2016-12-18 14:26:04 1482071164232 0
[root@localhost logs]#

 

 

4. Creating, deleting, moving, and copying HDFS directories

(1) Creating an HDFS directory:

Note: if the path being created contains multiple levels of directories that don't exist yet, you must add -p, otherwise the command fails because the parent directory is missing. This works just like the Linux mkdir command.

[root@localhost logs]# hdfs dfs -mkdir /yourhome/home1
16/12/18 07:27:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: '/yourhome/home1': No such file or directory
[root@localhost logs]# 
[root@localhost logs]# hdfs dfs -mkdir -p /yourhome/home1
16/12/18 07:28:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost logs]# hdfs dfs -ls /
16/12/18 07:28:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-18 07:28 /yourhome
[root@localhost logs]# hdfs dfs -ls /yourhome
16/12/18 07:28:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - root supergroup          0 2016-12-18 07:28 /yourhome/home1
[root@localhost logs]#

 

(2) Renaming and moving HDFS directories:

If the target of mv does not exist, the source directory is simply renamed; if the target directory does exist, the source directory is moved (cut) into it.

[root@localhost sbin]# hdfs dfs mkdir -p /yourhome/home2
mkdir: Unknown command
Did you mean -mkdir?  This command begins with a dash.
[root@localhost sbin]# hdfs dfs -mkdir -p /yourhome/home2
16/12/22 07:03:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost sbin]# 
[root@localhost sbin]# 
[root@localhost sbin]# hdfs dfs -ls /yourhome
16/12/22 07:03:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x   - root supergroup          0 2016-12-18 07:28 /yourhome/home1
drwxr-xr-x   - root supergroup          0 2016-12-22 07:03 /yourhome/home2
[root@localhost sbin]# hdfs dfs -mv /yourhome/home1 /yourhome/home3
16/12/22 07:04:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost sbin]# hdfs dfs -ls /yourhome
16/12/22 07:05:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x   - root supergroup          0 2016-12-22 07:03 /yourhome/home2
drwxr-xr-x   - root supergroup          0 2016-12-18 07:28 /yourhome/home3

[root@localhost sbin]# hdfs dfs -ls -R /yourhome
16/12/22 07:05:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - root supergroup          0 2016-12-22 07:03 /yourhome/home2
drwxr-xr-x   - root supergroup          0 2016-12-18 07:28 /yourhome/home3
[root@localhost sbin]# 
[root@localhost sbin]# hdfs dfs -mv /yourhome/home2 /yourhome/home3
16/12/22 07:05:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost sbin]# hdfs dfs -ls -R /yourhome
16/12/22 07:05:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - root supergroup          0 2016-12-22 07:05 /yourhome/home3
drwxr-xr-x   - root supergroup          0 2016-12-22 07:03 /yourhome/home3/home2
[root@localhost sbin]#

 

(3) Deleting an HDFS directory

Don't forget -r when deleting a directory.

[root@localhost sbin]# hdfs dfs -rm -r /yourhome/home3
16/12/22 07:14:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /yourhome/home3

Note that with the HDFS trash feature enabled, rm only marks things as deleted by moving them into the trash, where they still occupy space. To reclaim the space completely you have to empty the trash, with the following command:

[root@localhost sbin]# hdfs dfs -expunge
16/12/22 07:17:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost sbin]#

Alternatively, add -skipTrash to rm to delete things outright, a bit like pressing Shift+Del on Windows.

[root@localhost sbin]# hdfs dfs -rm -r -skipTrash /yourhome/home3
16/12/22 07:19:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /yourhome/home3
[root@localhost sbin]#

So if you are the kind of person who keeps highly confidential files on your own VM's HDFS, don't let an Edison Chen style photo-leak mistake happen while you're learning HDFS.

 

(4) Copying an HDFS directory:

When copying a directory you don't actually need -r; cp copies directories recursively by itself. The -f used here means force, i.e. overwrite the destination if it already exists.

[root@localhost hadoop]# hdfs dfs -cp -f /yourhome /yourhome2
16/12/24 07:06:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# hdfs dfs -ls /your*
16/12/24 07:06:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /yourhome/hadooparticle.txt
Found 1 items
-rw-r--r--   1 root supergroup      31894 2016-12-24 07:06 /yourhome2/hadooparticle.txt
[root@localhost hadoop]# hdfs dfs -rm -r -skipTrash /yourhome
16/12/24 07:09:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /yourhome
[root@localhost hadoop]#

 

 

 

5. Creating, deleting, and moving files

(1) Creating an empty file on HDFS:

[root@localhost hadoop]# hdfs dfs -touchz /blacnk.txt
16/12/24 06:33:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# hdfs dfs -ls /
16/12/24 06:33:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
-rw-r--r--   1 root supergroup          0 2016-12-24 06:33 /blacnk.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-22 07:19 /yourhome

[root@localhost hadoop]# hdfs dfs -cat /blacnk.txt
16/12/24 06:33:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]#

(2) Deleting a file on HDFS

[root@localhost hadoop]# hdfs dfs -rm -skipTrash /blacnk.txt
16/12/24 06:40:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /blacnk.txt
[root@localhost hadoop]#

(3) Moving (cutting) a file on HDFS:

[root@localhost hadoop]# hdfs dfs -ls -R /
16/12/24 06:28:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
-rw-r--r--   1 root supergroup     171919 2016-12-18 06:26 /myhome/hadoop-hadoop-datanode-localhost.localdomain.log
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs/mqbag
-rw-r--r--   1 root supergroup         85 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi2.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 07:07 /myhome/tangshi1_1_local.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
drwxr-xr-x   - root supergroup          0 2016-12-22 07:19 /yourhome

[root@localhost hadoop]# hdfs dfs -mv /hadooptext.txt /yourhome
16/12/24 06:44:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# hdfs dfs -ls -R /
16/12/24 06:44:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
-rw-r--r--   1 root supergroup     171919 2016-12-18 06:26 /myhome/hadoop-hadoop-datanode-localhost.localdomain.log
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs
-rw-r--r--   1 root supergroup       2185 2016-12-09 06:57 /myhome/happyBKs/googledev.txt
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:59 /myhome/happyBKs/hadooptext2.txt
drwxr-xr-x   - root supergroup          0 2016-12-11 05:50 /myhome/happyBKs/mqbag
-rw-r--r--   1 root supergroup         85 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi1.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:50 /myhome/happyBKs/mqbag/tangshi2.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 05:56 /myhome/tangshi1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 06:14 /myhome/tangshi1_1.txt
-rw-r--r--   1 root supergroup         85 2016-12-11 07:07 /myhome/tangshi1_1_local.txt
-rw-r--r--   1 root supergroup         82 2016-12-11 05:56 /myhome/tangshi2.txt
drwxr-xr-x   - root supergroup          0 2016-12-24 06:44 /yourhome
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /yourhome/hadooptext.txt
[root@localhost hadoop]#

(4) Renaming an HDFS file

We saw above that mv can move HDFS directories and can also rename them; it works the same way for HDFS files.

[root@localhost hadoop]# hdfs dfs -mv /yourhome/hadooptext.txt /yourhome/hadooparticle.txt
16/12/24 06:58:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# hdfs dfs -ls /yourhome
16/12/24 06:58:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /yourhome/hadooparticle.txt
[root@localhost hadoop]#

(5) Copying HDFS files

When copying you can give just the destination directory, or a complete destination file name, which renames the file as it is copied.

[root@localhost hadoop]# hdfs dfs -cp /yourhome/hadooparticle.txt /
16/12/24 07:00:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -ls / /yourhome
16/12/24 07:00:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r--   1 root supergroup      31894 2016-12-24 07:00 /hadooparticle.txt
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-24 06:58 /yourhome
Found 1 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /yourhome/hadooparticle.txt
[root@localhost hadoop]# 
[root@localhost hadoop]# hdfs dfs -cp /hadooparticle.txt /hadooptext.txt
16/12/24 07:01:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@localhost hadoop]# hdfs dfs -ls / /yourhome
16/12/24 07:01:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
-rw-r--r--   1 root supergroup      31894 2016-12-24 07:00 /hadooparticle.txt
-rw-r--r--   1 root supergroup      31894 2016-12-24 07:01 /hadooptext.txt
drwxr-xr-x   - root supergroup          0 2016-12-18 06:26 /myhome
drwxr-xr-x   - root supergroup          0 2016-12-24 06:58 /yourhome
Found 1 items
-rw-r--r--   1 root supergroup      31894 2016-12-09 06:49 /yourhome/hadooparticle.txt
[root@localhost hadoop]#

This example also shows a handy use of ls: to list several directories at once, just name them one after another after -ls. Each directory is shown separately, together with a "Found x items" line giving the number of files or subdirectories it contains.

As for editing a file's contents directly on HDFS: forget about it. Just get the file you want to change, edit it locally, and put it back.
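
A minimal round-trip sketch, reusing the tangshi1.txt file from the earlier examples (use whatever local editor you like):

hdfs dfs -get /myhome/tangshi1.txt ./tangshi1.txt       # pull the file down to the local file system
nano tangshi1.txt                                       # edit it locally
hdfs dfs -put -f tangshi1.txt /myhome/tangshi1.txt      # push it back; -f overwrites the existing HDFS file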

 

6. A very useful file test command

test
Usage: hadoop fs -test -[ezd] URI

Options:
-e  returns 0 if the path exists.
-z  returns 0 if the file is zero bytes long.
-d  returns 0 if the path is a directory, non-zero otherwise.

Note that test prints nothing on its own, so don't just type it and stare at the screen; check the shell exit code afterwards.
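
A quick sketch of how to use it, reading the result from the shell exit code $? (the paths reuse files created above):

hdfs dfs -test -e /myhome/tangshi1.txt
echo $?        # 0: the file exists
hdfs dfs -test -z /myhome/tangshi1.txt
echo $?        # 1: the file is not zero bytes
hdfs dfs -test -d /myhome
echo $?        # 0: /myhome is a directory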

 

 

Finally, here is a fairly authoritative explanation of the differences between the three command forms (quoted in its original English):

 

Following are the three commands which appear the same but have minute differences

  1. hadoop fs {args}
  2. hadoop dfs {args}
  3. hdfs dfs {args}

    hadoop fs <args>

FS relates to a generic file system which can point to any file systems like local, HDFS etc. So this can be used when you are dealing with different file systems such as Local FS, HFTP FS, S3 FS, and others

hadoop dfs <args>

dfs is very specific to HDFS; it works for operations that relate to HDFS. This form has been deprecated and we should use hdfs dfs instead.

hdfs dfs <args>

Same as the 2nd, i.e. it works for all operations related to HDFS, and it is the recommended command to use instead of hadoop dfs.

Below is the list of commands categorized as HDFS commands.

**#hdfs commands**
  namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups

So even if you use hadoop dfs, it will locate hdfs and delegate that command to hdfs dfs.
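
To see the "generic file system" point in practice, the same shell can target the local file system or HDFS just by changing the URI scheme (a small sketch; the hdfs://master:9000 address is the one shown in the -df output earlier):

hadoop fs -ls file:///opt                 # the generic FS shell listing the local file system
hadoop fs -ls hdfs://master:9000/         # the same shell pointed explicitly at HDFS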

 

And last of all, the most useful hdfs command of them all: --help. Whenever you're unsure, just look it up!

[root@localhost mqbag]# hdfs --help
Usage: hdfs [--config confdir] COMMAND
       where COMMAND is one of:
  dfs                  run a filesystem command on the file systems supported in Hadoop.
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  journalnode          run the DFS journalnode
  zkfc                 run the ZK Failover Controller daemon
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  diskbalancer         Distributes data evenly among disks on a given node
  haadmin              run a DFS HA admin client
  fsck                 run a DFS filesystem checking utility
  balancer             run a cluster balancing utility
  jmxget               get JMX exported values from NameNode or DataNode.
  mover                run a utility to move block replicas across
                       storage types
  oiv                  apply the offline fsimage viewer to an fsimage
  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage
  oev                  apply the offline edits viewer to an edits file
  fetchdt              fetch a delegation token from the NameNode
  getconf              get config values from configuration
  groups               get the groups which users belong to
  snapshotDiff         diff two snapshots of a directory or diff the
                       current directory contents with a snapshot
  lsSnapshottableDir   list all snapshottable dirs owned by the current user
						Use -help to see options
  portmap              run a portmap service
  nfs3                 run an NFS version 3 gateway
  cacheadmin           configure the HDFS cache
  crypto               configure HDFS encryption zones
  storagepolicies      list/get/set block storage policies
  version              print the version

Most commands print help when invoked w/o parameters.
[root@localhost mqbag]# hdfs version
Hadoop 2.6.0-cdh5.8.3
Subversion http://github.com/cloudera/hadoop -r 992be3bac6b145248d32c45b16f8fce5a984b158
Compiled by jenkins on 2016-10-13T03:23Z
Compiled with protoc 2.5.0
From source with checksum ef7968b8b98491d54f83cb3bd7a87ea
This command was run using /opt/hadoop-2.6.0-cdh5.8.3/share/hadoop/common/hadoop-common-2.6.0-cdh5.8.3.jar
[root@localhost mqbag]#