Hadoop: hdfs commands explained

Date: 2019-11-10


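This article covers the hadoop and hdfs commands; the yarn commands will be covered in a later article.

The hadoop fs command is not limited to HDFS: it works with any file system Hadoop supports, such as HDFS and the local file system (Local FS). The hdfs dfs command, by contrast, works only with HDFS.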
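The difference can be sketched with a small wrapper. This is only an illustration: `list_any_fs` and the `HADOOP_BIN` override are hypothetical names introduced here (they are not Hadoop settings), and the paths are placeholders.

```shell
# Hypothetical helper: list a path on any file system Hadoop supports.
# HADOOP_BIN is an illustrative override so the call can be dry-run
# with HADOOP_BIN=echo; by default the real hadoop client is used.
list_any_fs() {
    "${HADOOP_BIN:-hadoop}" fs -ls "$1"
}

# Usage (needs a working Hadoop client):
#   list_any_fs file:///tmp     # local file system
#   list_any_fs hdfs:///tmp     # HDFS
#   hdfs dfs -ls /tmp           # HDFS only
```

The scheme in the URI selects the file system; without a scheme, the default file system from the configuration is used.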

I. hadoop commands

Usage: hadoop [--config confdir] COMMAND  # --config overrides the default configuration directory

## COMMAND  # subcommands
fs                   run a generic filesystem user client
version              print the version
jar <jar>            run a jar file
checknative [-a|-h]  check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath            prints the class path needed to get the Hadoop jar and the required libraries
credential           interact with credential providers
daemonlog            get/set the log level for each daemon
s3guard              manage data on S3
trace                view and modify Hadoop tracing settings

1. archive

Creates a Hadoop archive. For details, see http://hadoop.apache.org/docs/r2.7.0/hadoop-archives/HadoopArchives.html

Usage: hadoop archive -archiveName NAME -p <parent path> <src>* <dest>  # -p sets the parent path; several <src> paths, relative to the parent, can be given at once

Example:

[hive@mwpl003 ~]$ hadoop fs -touchz /tmp/test/a.txt
[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/
Found 1 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:50 /tmp/test/a.txt
[hive@mwpl003 ~]$ hadoop archive -archiveName test.har -p  /tmp/test/a.txt -r 3 /tmp/test
19/09/18 13:52:58 INFO mapreduce.JobSubmitter: number of splits:1
19/09/18 13:52:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1565571819971_6988
19/09/18 13:52:58 INFO impl.YarnClientImpl: Submitted application application_1565571819971_6988
19/09/18 13:52:58 INFO mapreduce.Job: The url to track the job: http://ip_address:8088/proxy/application_1565571819971_6988/
19/09/18 13:52:58 INFO mapreduce.Job: Running job: job_1565571819971_6988
19/09/18 13:53:04 INFO mapreduce.Job: Job job_1565571819971_6988 running in uber mode : false
19/09/18 13:53:04 INFO mapreduce.Job:  map 0% reduce 0%
19/09/18 13:53:08 INFO mapreduce.Job:  map 100% reduce 0%
19/09/18 13:53:13 INFO mapreduce.Job:  map 100% reduce 100%
19/09/18 13:53:13 INFO mapreduce.Job: Job job_1565571819971_6988 completed successfully
19/09/18 13:53:13 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=80
                FILE: Number of bytes written=313823
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=264
                HDFS: Number of bytes written=69
                HDFS: Number of read operations=14
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters 
                Launched map tasks=1
                Launched reduce tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=7977
                Total time spent by all reduces in occupied slots (ms)=12015
                Total time spent by all map tasks (ms)=2659
                Total time spent by all reduce tasks (ms)=2403
                Total vcore-milliseconds taken by all map tasks=2659
                Total vcore-milliseconds taken by all reduce tasks=2403
                Total megabyte-milliseconds taken by all map tasks=8168448
                Total megabyte-milliseconds taken by all reduce tasks=12303360
        Map-Reduce Framework
                Map input records=1
                Map output records=1
                Map output bytes=59
                Map output materialized bytes=76
                Input split bytes=97
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=76
                Reduce input records=1
                Reduce output records=0
                Spilled Records=2
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=91
                CPU time spent (ms)=2320
                Physical memory (bytes) snapshot=1189855232
                Virtual memory (bytes) snapshot=11135381504
                Total committed heap usage (bytes)=3043491840
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=167
        File Output Format Counters 
                Bytes Written=0
[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/
Found 2 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:50 /tmp/test/a.txt
drwxr-xr-x   - hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har

[hive@mwpl003 ~]$ hadoop fs -ls /tmp/test/test.har/
Found 4 items
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har/_SUCCESS
-rw-r--r--   3 hive supergroup         55 2019-09-18 13:53 /tmp/test/test.har/_index
-rw-r--r--   3 hive supergroup         14 2019-09-18 13:53 /tmp/test/test.har/_masterindex
-rw-r--r--   3 hive supergroup          0 2019-09-18 13:53 /tmp/test/test.har/part-0

Unarchive:
hadoop distcp har:///tmp/test/test.har /tmp/test1
hdfs dfs -cp har:///tmp/test/test.har /tmp/test1

2. checknative

Checks the availability of Hadoop's native libraries; most users rarely need it.

Usage: hadoop checknative [-a] [-h]
-a  check all libraries
-h  print help

3. classpath

Prints the class path needed to get the Hadoop jar and the required libraries.

Usage: hadoop classpath [--glob |--jar <path> |-h |--help]

4. credential

Manages credentials, passwords, and secrets within credential providers.

Usage: hadoop credential <subcommand> [options]

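5. distcp (frequently used)

Short for distributed copy, as the name suggests. It is mainly used to copy files within a cluster or between clusters, and it runs as a MapReduce job.

Usage: hadoop distcp [-option] hdfs://source hdfs://dest
Details: http://hadoop.apache.org/docs/r2.7.0/hadoop-distcp/DistCp.html

Commonly used options:
-m <num_maps>    # maximum number of maps used for the copy; note that more maps does not necessarily mean higher throughput
-i               # ignore failures
-log <logdir>    # write logs to <logdir>
-update          # copy only when the file is missing on the target or differs from the source
-overwrite       # overwrite files on the target
-filters <file>  # exclude paths matching the patterns listed in <file>
-delete          # delete files that exist on the target but not in the source (valid only with -update or -overwrite)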
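As a rough sketch of how these options combine, the function below wraps an incremental copy. `sync_dirs` and `HADOOP_BIN` are hypothetical names, the default of 10 maps is an arbitrary illustrative value, and the cluster URIs in the usage comment are placeholders.

```shell
# Hypothetical wrapper around distcp for an incremental copy.
# Arguments: <source URI> <destination URI> [max number of maps]
# HADOOP_BIN is an illustrative override (e.g. HADOOP_BIN=echo for a dry run).
sync_dirs() {
    # -update copies only files that are missing or different on the target;
    # -m caps the number of map tasks used for the copy.
    "${HADOOP_BIN:-hadoop}" distcp -update -m "${3:-10}" "$1" "$2"
}

# Usage (placeholder NameNode addresses):
#   sync_dirs hdfs://nn1.example.com/data hdfs://nn2.example.com/data 20
```

Running it once with HADOOP_BIN=echo prints the command that would be submitted, which is a cheap way to review the options before starting a MapReduce job.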

6. fs

Equivalent to hdfs dfs when the target is HDFS.

Help: hadoop fs -help

Details: http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/FileSystemShell.html

It includes the following subcommands:

appendToFile, cat, checksum, chgrp, chmod, chown, copyFromLocal, copyToLocal, count, cp, createSnapshot, deleteSnapshot, df, du, expunge, find, get, getfacl, getfattr, getmerge, help, ls, mkdir, moveFromLocal, moveToLocal, mv, put, renameSnapshot, rm, rmdir, setfacl, setfattr, setrep, stat, tail, test, text, touchz

Readers familiar with basic Linux commands will find these subcommands easy to use.

6.1 appendToFile

appendToFile  # append one or more local files to a file on the distributed file system
Usage: hadoop fs -appendToFile <localsrc> ... <dst>
Example:
hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile  # read from stdin and append to hadoopfile; press Ctrl+D to end input

6.2 cat

cat   # print file contents to stdout
Usage: hadoop fs -cat URI [URI ...]
Example:
hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hadoop fs -cat file:///file3 /user/hadoop/file4

6.3 checksum

checksum  # return the checksum information of a file
Usage: hadoop fs -checksum URI
Example:
[hive@mwpl003 ~]$  hadoop fs -checksum /tmp/test/test.txt
/tmp/test/test.txt      MD5-of-0MD5-of-512CRC32C        000002000000000000000000fde199c1517b7b26b0565ff6b0f46acc

6.4 chgrp

chgrp   # change the group of files or directories
Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]

6.5 chmod

chmod  # change the permissions of files or directories
Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]

6.6 chown

chown  # change the owner and group of files or directories
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]

6.7 copyFromLocal

copyFromLocal  # copy local files or directories to HDFS; similar to put
Usage: hadoop fs -copyFromLocal [-f] <localsrc> URI  # -f overwrites the destination if it already exists
Example:
hadoop fs -copyFromLocal start-hadoop.sh  /tmp

6.8 copyToLocal

copyToLocal  # copy files from HDFS to the local file system; similar to get
Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

6.9 count

count  # count the directories, files, and bytes under the given paths
Usage: hadoop fs -count [-q] [-h] [-v] <paths>

6.10 cp

cp     # copy files from source to destination; with multiple sources the destination must be a directory
Usage: hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>
Example:
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir

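6.11 Snapshot commands

createSnapshot  # create a snapshot
deleteSnapshot  # delete a snapshot
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
HDFS snapshots are read-only point-in-time copies of the file system. A snapshot can be taken on a subtree of the file system or on the entire file system. Common use cases are data backup, protection against user errors, and disaster recovery.
Before a snapshot can be created, the directory must be marked snapshottable (administrator privileges required), which allows snapshots to be created in it.
hdfs dfsadmin -allowSnapshot <path>  # allow snapshots on path
hdfs dfsadmin -disallowSnapshot <path>  # disallow snapshots on path
hdfs dfs -ls /foo/.snapshot  # list all snapshots under a snapshottable directory
hdfs dfs -createSnapshot <path> [<snapshotName>]  # create a snapshot; the default name is a timestamp
hdfs dfs -deleteSnapshot <path> <snapshotName>  # delete a snapshot
hdfs dfs -renameSnapshot <path> <oldName> <newName>  # rename a snapshot
hdfs lsSnapshottableDir  # list snapshottable directories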
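The snapshot commands above are typically run in sequence. A minimal sketch, assuming administrator privileges; `snapshot_dir` and `HDFS_BIN` are hypothetical names, and the directory and snapshot name are placeholders.

```shell
# Hypothetical helper: make a directory snapshottable, take a named
# snapshot, and list the snapshots that now exist.
# Arguments: <directory> <snapshot name>
# HDFS_BIN is an illustrative override (e.g. HDFS_BIN=echo for a dry run).
snapshot_dir() {
    # Mark the directory snapshottable (administrator only).
    "${HDFS_BIN:-hdfs}" dfsadmin -allowSnapshot "$1" || return 1
    # Take the snapshot under the given name.
    "${HDFS_BIN:-hdfs}" dfs -createSnapshot "$1" "$2" || return 1
    # Snapshots appear under the special .snapshot directory.
    "${HDFS_BIN:-hdfs}" dfs -ls "$1/.snapshot"
}

# Usage:
#   snapshot_dir /foo before-cleanup
```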

6.12 df

df  # show file system capacity and free space
Usage: hadoop fs -df [-h] URI [URI ...]

6.13 du

du  # show the sizes of the files and directories under the given path
Usage: hadoop fs -du [-s] [-h] URI [URI ...]
Example:
hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1

6.14 expunge

expunge  # empty the trash (use with care)
Usage: hadoop fs -expunge

6.15 find

find   # find files matching an expression
Usage: hadoop fs -find <path> ... <expression> ...
-name pattern
-iname pattern  # case-insensitive
-print
-print0  # like -print, but appends an ASCII NUL character
Example:
hadoop fs -find / -name test -print

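6.16 get

get  # copy files to the local file system; copyToLocal behaves the same except that its destination must be a local file reference. -ignorecrc copies files that fail the CRC check; -crc also copies the CRC files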
Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>
Example:
hadoop fs -get /tmp/input/hadoop/*.xml /home/hadoop/testdir/

6.17 getfacl

getfacl  # show the ACLs of files and directories
Usage: hadoop fs -getfacl [-R] <path>
[hive@mwpl003 ~]$ hadoop fs -getfacl -R  /tmp/test
# file: /tmp/test
# owner: hive
# group: supergroup
getfacl: The ACL operation has been rejected.  Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.

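6.18 getfattr

getfattr  # show the extended attribute names and values of a file or directory
Usage: hadoop fs -getfattr [-R] -n name | -d [-e en] <path>
-n name and -d are mutually exclusive.
-d dumps all extended attributes.
-R lists attributes recursively.
-e en encodes the retrieved values; valid encodings are "text", "hex", and "base64".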
Examples:
hadoop fs -getfattr -d /file
hadoop fs -getfattr -R -n user.myAttr /dir

6.19 getmerge

getmerge  # merge the files under a source into a single local file
Usage: hadoop fs -getmerge <src> <localdst> [addnl]
hadoop fs -getmerge   /src  /opt/output.txt
hadoop fs -getmerge  /src/file1.txt /src/file2.txt  /output.txt

6.20 ls

ls   # list files
Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

6.21 mkdir

mkdir  # create directories
Usage: hadoop fs -mkdir [-p] <paths>
Example:
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir

6.22 moveFromLocal

moveFromLocal  # move local files to HDFS (the local source is deleted after the copy)
Usage: hadoop fs -moveFromLocal <localsrc> <dst>

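6.23 moveToLocal

moveToLocal   # intended to move HDFS files to the local file system; in Hadoop 2.x it only prints a "Not implemented yet" message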
Usage: hadoop fs -moveToLocal [-crc] <src> <dst>

6.24 mv

mv   # move files; several sources can be moved at once, in which case the destination must be a directory
Usage: hadoop fs -mv URI [URI ...] <dest>
Example:
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1

6.25 put

put  # copy local files to HDFS
Usage: hadoop fs -put <localsrc> ... <dst>
hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile  #Reads the input from stdin.

6.26 rm

rm  # delete files
Usage: hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]

6.27 rmdir

rmdir  # delete a directory
Usage: hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]

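6.28 setfacl

setfacl  # set ACLs
Usage: hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]
-b remove all but the base ACL entries; the entries for user, group, and others are retained
-k remove the default ACL
-R apply the operation recursively
-m modify the ACL: new entries are added and existing entries are retained
-x remove the specified ACL entries
--set fully replace the existing ACL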
Examples:
hadoop fs -setfacl -m user:hadoop:rw- /file
hadoop fs -setfacl -x user:hadoop /file
hadoop fs -setfacl -b /file
hadoop fs -setfacl -k /dir
hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file
hadoop fs -setfacl -R -m user:hadoop:r-x /dir
hadoop fs -setfacl -m default:user:hadoop:r-x /dir

6.29 setfattr

setfattr  # set an extended attribute name and value
Usage: hadoop fs -setfattr -n name [-v value] | -x name <path>
-n extended attribute name
-v extended attribute value
-x name remove the extended attribute
Examples:
hadoop fs -setfattr -n user.myAttr -v myValue /file
hadoop fs -setfattr -n user.noValue /file
hadoop fs -setfattr -x user.myAttr /file

6.30 setrep

setrep  # change the replication factor of a file (the number of replicas); -w waits for the replication to complete
Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Example:
hadoop fs -setrep -w 3 /user/hadoop/dir1

6.31 stat

stat  # print statistics about a file or directory in the given format (by default the modification time)
Usage: hadoop fs -stat [format] <path> ...
Example:
hadoop fs -stat "%F %u:%g %b %y %n" /file

6.32 tail

tail  # print the last kilobyte of a file to stdout
Usage: hadoop fs -tail [-f] URI

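6.33 test

test  # test a path; the result is reported through the exit code (0 means true)
Usage: hadoop fs -test -[defsz] URI
-d whether the path is a directory
-e whether the path exists
-f whether the path is a file
-s whether the path is not empty
-z whether the file is zero length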
Example:
hadoop fs -test -e filename
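Because the result comes back as an exit code, -test fits naturally into shell conditionals. A minimal sketch; `hdfs_path_exists`, `ensure_dir`, and `HADOOP_BIN` are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: succeeds (exit code 0) when the path exists.
# HADOOP_BIN is an illustrative override of the hadoop client binary.
hdfs_path_exists() {
    "${HADOOP_BIN:-hadoop}" fs -test -e "$1"
}

# Hypothetical guard: create a directory only when it is missing.
ensure_dir() {
    if hdfs_path_exists "$1"; then
        echo "exists: $1"
    else
        "${HADOOP_BIN:-hadoop}" fs -mkdir -p "$1" && echo "created: $1"
    fi
}

# Usage:
#   ensure_dir /user/hadoop/dir1
```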

6.34 text

text  # output a file in text format; can be used to view compressed files
Usage: hadoop fs -text <src>

6.35 touchz

touchz  # create a zero-length file
Usage: hadoop fs -touchz URI [URI ...]

7. jar

jar  # run a jar file
Usage: hadoop jar <jar> [mainClass] args...
Example:
hadoop jar ./test/wordcount/wordcount.jar org.codetree.hadoop.v1.WordCount /test/chqz/input /test/chqz/output
The parts of this command are:
(1) hadoop: the shell script under ${HADOOP_HOME}/bin.
(2) jar: the COMMAND argument required by the hadoop script.
(3) ./test/wordcount/wordcount.jar: the path on the local file system of the jar to run, passed to the RunJar class.
(4) org.codetree.hadoop.v1.WordCount: the class containing the main method, passed to the RunJar class.
(5) /test/chqz/input: passed to the WordCount class as a DFS path that indicates the input data source.
(6) /test/chqz/output: passed to the WordCount class as a DFS path that indicates where the output is written.
Hadoop recommends yarn jar over hadoop jar. Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar

8. key

key  # manage keys; rarely needed

9. trace

trace  # view and modify tracing settings
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/Tracing.html

II. hdfs commands

The hdfs command has the following subcommands:

User Commands: classpath, dfs, fetchdt, fsck, getconf, groups, lsSnapshottableDir, jmxget, oev, oiv, oiv_legacy, snapshotDiff, version
Administration Commands: balancer, cacheadmin, crypto, datanode, dfsadmin, haadmin, journalnode, mover, namenode, nfs3, portmap, secondarynamenode, storagepolicies, zkfc
Debug Commands: verifyMeta, computeMeta, recoverLease

Not every subcommand is covered in detail here. Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html

1. classpath

classpath  # print the class path needed to get the Hadoop jar and the required libraries
Usage: hdfs classpath [--glob |--jar <path> |-h |--help]

2. dfs

dfs  # same as the hadoop fs command described in the previous part

3. fetchdt

fetchdt  # get a delegation token from the NameNode
Usage: hdfs fetchdt <opts> <token_file_path>
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#fetchdt

4. fsck (important)

hdfs fsck <path>
          [-list-corruptfileblocks |
          [-move | -delete | -openforwrite]
          [-files [-blocks [-locations | -racks | -replicaDetails]]]
          [-includeSnapshots]
          [-storagepolicies] [-blockId <blk_Id>]

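-delete    Delete corrupted files.
-files    Print out the files being checked.
-files -blocks    Print out the block report.
-files -blocks -locations    Print out locations for every block.
-files -blocks -racks    Print out the network topology for DataNode locations.
-files -blocks -replicaDetails    Print out details of each replica.
-includeSnapshots    Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.
-list-corruptfileblocks    Print out the list of missing blocks and the files they belong to.
-move    Move corrupted files to /lost+found.
-openforwrite    Print out files opened for write.
-storagepolicies    Print out a storage policy summary for the blocks.
-blockId    Print out information about the block.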
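Two combinations of these options cover most routine checks. A minimal sketch; `fsck_report`, `fsck_corrupt`, and `HDFS_BIN` are hypothetical names, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helpers around hdfs fsck.
# HDFS_BIN is an illustrative override (e.g. HDFS_BIN=echo for a dry run).

# Print every file under a path with its blocks and block locations.
fsck_report() {
    "${HDFS_BIN:-hdfs}" fsck "$1" -files -blocks -locations
}

# Print only the missing blocks and the files they belong to.
fsck_corrupt() {
    "${HDFS_BIN:-hdfs}" fsck "$1" -list-corruptfileblocks
}

# Usage:
#   fsck_corrupt /
#   fsck_report /tmp/test
```

Both calls are read-only; -move and -delete are deliberately left out of the sketch because they change the file system.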

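5. getconf (important)

hdfs getconf -namenodes  # get the list of NameNodes in the cluster
hdfs getconf -secondaryNameNodes  # get the list of secondary NameNodes in the cluster
hdfs getconf -backupNodes  # get the list of backup nodes in the cluster
hdfs getconf -includeFile  # get the path of the include file that defines the DataNodes allowed to join the cluster
hdfs getconf -excludeFile  # get the path of the exclude file that defines the DataNodes to be decommissioned
hdfs getconf -nnRpcAddresses  # get the NameNode RPC addresses
hdfs getconf -confKey [key]  # get the value of a specific key from the configuration; useful for checking the effective value of a Hadoop setting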
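For instance, -confKey can be looped over several keys to print the effective settings. A minimal sketch; `show_conf` and `HDFS_BIN` are hypothetical names, while dfs.replication and dfs.blocksize are standard HDFS configuration keys.

```shell
# Hypothetical helper: print "key=value" for each configuration key given.
# HDFS_BIN is an illustrative override (e.g. HDFS_BIN=echo for a dry run).
show_conf() {
    for key in "$@"; do
        printf '%s=' "$key"
        "${HDFS_BIN:-hdfs}" getconf -confKey "$key"
    done
}

# Usage:
#   show_conf dfs.replication dfs.blocksize
```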

6. groups

groups  # return the groups of the given users
Usage: hdfs groups [username ...]

7. lsSnapshottableDir

lsSnapshottableDir  # list snapshottable directories
Usage: hdfs lsSnapshottableDir [-help]

8. jmxget

jmxget  # dump JMX information from a service
Usage: hdfs jmxget [-localVM ConnectorURL | -port port | -server mbeanserver | -service service]

9. oev

oev  # Offline Edits Viewer
Usage: hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE

10. oiv

oiv  # Offline Image Viewer
Usage: hdfs oiv [OPTIONS] -i INPUT_FILE

11. snapshotDiff

snapshotDiff  # report the differences between two snapshots
Usage: hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html#Get_Snapshots_Difference_Report

12. balancer (important)

balancer
 hdfs balancer
          [-threshold <threshold>]
          [-policy <policy>]
          [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
          [-include [-f <hosts-file> | <comma-separated list of hosts>]]
          [-source [-f <hosts-file> | <comma-separated list of hosts>]]
          [-blockpools <comma-separated list of blockpool ids>]
          [-idleiterations <idleiterations>]
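-policy <policy>    datanode (default): the cluster is balanced if each DataNode is balanced.
blockpool: the cluster is balanced if each block pool in each DataNode is balanced.
-threshold <threshold>    Percentage of disk capacity. This overrides the default threshold.
-exclude -f <hosts-file> | <comma-separated list of hosts>    Excludes the specified DataNodes from being balanced by the balancer.
-include -f <hosts-file> | <comma-separated list of hosts>    Includes only the specified DataNodes to be balanced by the balancer.
-source -f <hosts-file> | <comma-separated list of hosts>    Picks only the specified DataNodes as source nodes.
-blockpools <comma-separated list of blockpool ids>    The balancer runs only on the block pools in this list.
-idleiterations <iterations>    Maximum number of idle iterations before exit. This overrides the default idleiterations (5).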

13. cacheadmin

cacheadmin  # manage centralized cache directives and pools
Usage: hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html

crypto  # manage encryption zones
Usage:
  hdfs crypto -createZone -keyName <keyName> -path <path>
  hdfs crypto -listZones
  hdfs crypto -provisionTrash -path <path>
  hdfs crypto -help <command-name>

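14. datanode

datanode  # run a DataNode
Usage: hdfs datanode [-regular | -rollback | -rollingupgrade rollback]
-regular    Normal DataNode startup (default).
-rollback    Roll the DataNode back to the previous version. Use this after stopping the DataNode and distributing the old Hadoop version.
-rollingupgrade rollback    Roll back a rolling upgrade operation.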

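15. dfsadmin (important)

hdfs dfsadmin [GENERIC_OPTIONS]
          [-report [-live] [-dead] [-decommissioning]]   # report basic file system information and statistics, including the raw space used by replication, checksums, snapshots, etc. on all DataNodes (DNs)
          [-safemode enter | leave | get | wait | forceExit] # safe mode maintenance command
           # The NameNode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks satisfies the minimum replication condition. If the NameNode detects any anomaly,
           # it lingers in safe mode until the issue is resolved. If the anomaly is the result of a deliberate action, the administrator can use -safemode forceExit to leave safe mode
          [-saveNamespace] # save the current namespace into the storage directories and reset the edits log; requires safe mode
          [-rollEdits] # roll the edit log on the active NameNode
          [-restoreFailedStorage true |false |check] # turn automatic attempts to restore failed storage replicas on or off. If a failed storage becomes available again,
          # the system attempts to restore edits and/or fsimage during a checkpoint. The check option returns the current setting
          [-refreshNodes] # re-read the hosts and exclude files to update the set of DataNodes allowed to connect to the NameNode and those that should be decommissioned or recommissioned
          [-setQuota <quota> <dirname>...<dirname>]
          [-clrQuota <dirname>...<dirname>]
          [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
          [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
          [-finalizeUpgrade] # finalize an HDFS upgrade. DataNodes delete their previous-version working directories, and then the NameNode does the same. This completes the upgrade process
          [-rollingUpgrade [<query> |<prepare> |<finalize>]]
          [-metasave filename] # save the NameNode's primary data structures to filename in the directory specified by the hadoop.log.dir property. filename is overwritten if it exists.
          # The file lists the DataNodes heartbeating with the NameNode, blocks waiting to be replicated, blocks currently being replicated, and blocks waiting to be deleted
          [-refreshServiceAcl] # reload the service-level authorization policy file
          [-refreshUserToGroupsMappings] # refresh the user-to-groups mappings
          [-refreshSuperUserGroupsConfiguration] # refresh the superuser proxy groups mappings
          [-refreshCallQueue] # reload the call queue from the configuration
          [-refresh <host:ipc_port> <key> [arg1..argn]] # trigger a runtime refresh of the resource specified by <key> on <host:ipc_port>; all other arguments after it are sent to the host
          [-reconfig <datanode |...> <host:ipc_port> <start |status>] # start a reconfiguration or get the status of an ongoing one. The second parameter specifies the node type. Currently only reloading a DataNode's configuration is supported
          [-printTopology] # print the tree of racks and their nodes as reported by the NameNode
          [-refreshNamenodes datanodehost:port] # for the given DataNode, reload the configuration files, stop serving the removed block pools, and start serving new block pools
          [-deleteBlockPool datanode-host:port blockpoolId [force]] # if force is passed, the block pool directory for the given block pool id on the given DataNode is deleted along with its contents; otherwise the directory is deleted only if it is empty.
          # The command fails if the DataNode is still serving the block pool
          [-setBalancerBandwidth <bandwidth in bytes per second>] # change the network bandwidth used by each DataNode during HDFS block balancing. <bandwidth> is the maximum number of bytes per second used by each DataNode.
          # This value overrides the dfs.balance.bandwidthpersec parameter. Note: the new value is not persistent on the DataNode
          [-getBalancerBandwidth <datanode_host:ipc_port>] # get the network bandwidth (in bytes per second) for the given DataNode. This is the maximum network bandwidth the DataNode uses during HDFS block balancing
          [-allowSnapshot <snapshotDir>] # allow snapshots of the directory to be created
          [-disallowSnapshot <snapshotDir>] # disallow snapshots of the directory
          [-fetchImage <local directory>] # download the most recent fsimage from the NameNode and save it in the specified local directory
          [-shutdownDatanode <datanode_host:ipc_port> [upgrade]] # submit a shutdown request for the given DataNode
          [-getDatanodeInfo <datanode_host:ipc_port>] # get information about the given DataNode
          [-evictWriters <datanode_host:ipc_port>]  # make the DataNode evict all clients that are writing a block. This is useful if decommissioning is hung because of slow writers
          [-triggerBlockReport [-incremental] <datanode_host:ipc_port>] # trigger a block report for the given DataNode: an incremental report if -incremental is specified, otherwise a full block report
          [-help [cmd]]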
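Since -saveNamespace requires safe mode, the two options are usually combined. A minimal sketch, not a recommended maintenance procedure: `save_namespace` and `HDFS_BIN` are hypothetical names, and entering safe mode makes the file system read-only for clients while it lasts.

```shell
# Hypothetical helper: enter safe mode, save the namespace, then leave
# safe mode again even if the save failed.
# HDFS_BIN is an illustrative override (e.g. HDFS_BIN=echo for a dry run).
save_namespace() {
    "${HDFS_BIN:-hdfs}" dfsadmin -safemode enter || return 1
    "${HDFS_BIN:-hdfs}" dfsadmin -saveNamespace
    rc=$?
    # Always try to leave safe mode so the cluster does not stay read-only.
    "${HDFS_BIN:-hdfs}" dfsadmin -safemode leave
    return "$rc"
}

# Usage (administrator only):
#   save_namespace
```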

16. haadmin (important)

hdfs haadmin -checkHealth <serviceId>  # check the health of the given NameNode
hdfs haadmin -failover [--forcefence] [--forceactive] <serviceId> <serviceId>  # initiate a failover between two NameNodes
hdfs haadmin -getServiceState <serviceId>  # determine whether the given NameNode is active or standby
hdfs haadmin -help <command>
hdfs haadmin -transitionToActive <serviceId> [--forceactive]  # transition the given NameNode to active
hdfs haadmin -transitionToStandby <serviceId>  # transition the given NameNode to standby
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html
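A quick way to see which NameNode is active is to loop -getServiceState over the service IDs. A minimal sketch; `print_ha_states` and `HDFS_BIN` are hypothetical names, and nn1 and nn2 stand in for whatever service IDs the cluster configuration defines.

```shell
# Hypothetical helper: print "<serviceId>: <state>" for each service ID given.
# HDFS_BIN is an illustrative override (e.g. HDFS_BIN=echo for a dry run).
print_ha_states() {
    for id in "$@"; do
        printf '%s: ' "$id"
        "${HDFS_BIN:-hdfs}" haadmin -getServiceState "$id"
    done
}

# Usage (nn1 and nn2 are placeholder service IDs):
#   print_ha_states nn1 nn2
```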

17. journalnode

journalnode  # start a JournalNode for HDFS high availability with QJM
Usage: hdfs journalnode

18. mover

Usage: hdfs mover [-p <files/dirs> | -f <local file name>]
-f specifies a local file containing a list of HDFS files/directories to migrate
-p specifies a space-separated list of HDFS files/directories to migrate
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

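19. namenode

namenode
hdfs namenode [-backup] |  # start a backup node
         [-checkpoint] | # start a checkpoint node
         [-format [-clusterid cid ] [-force] [-nonInteractive] ] |  # format the specified NameNode: it starts the NameNode,
         # formats it, and then shuts it down. -force formats even if the name directory exists. -nonInteractive aborts if the name directory exists, unless -force is specified
         [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] | # start the NameNode with the upgrade option after a new Hadoop version has been distributed
         [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] | # upgrade the specified NameNode and then shut it down
         [-rollback] | # roll the NameNode back to the previous version. Use this after stopping the cluster and distributing the old Hadoop version
         [-rollingUpgrade <rollback |started> ] | # rolling upgrade. Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
         [-finalize] |  # no longer supported; use dfsadmin -finalizeUpgrade instead
         [-importCheckpoint] | # load the image from a checkpoint directory and save it into the current one. The checkpoint directory is read from the dfs.namenode.checkpoint.dir property
         [-initializeSharedEdits] | # format a new shared edits directory and copy in enough edit log segments so that the standby NameNode can start up
         [-bootstrapStandby [-force] [-nonInteractive] [-skipSharedEditsCheck] ] | # bootstrap the standby NameNode's storage directories by copying the latest namespace snapshot from the active NameNode
         [-recover [-force] ] | # recover lost metadata on a corrupt file system
         [-metadataVersion ] # verify that the configured directories exist, then print the metadata versions of the software and the image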

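20. secondarynamenode

Usage: hdfs secondarynamenode [-checkpoint [force]] | [-format] | [-geteditsize]
-checkpoint [force]    Checkpoints the SecondaryNameNode if EditLog size >= fs.checkpoint.size. If force is used, the checkpoint is taken regardless of the EditLog size.
-format    Format the local storage during startup.
-geteditsize    Print the number of uncheckpointed transactions on the NameNode.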

21. storagepolicies

storagepolicies  # list all storage policies
Usage: hdfs storagepolicies
Details: http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

22. zkfc

Usage: hdfs zkfc [-formatZK [-force] [-nonInteractive]]
-formatZK    Format the ZooKeeper instance.
-force    Format the znode even if it already exists.
-nonInteractive    Abort the format if the znode exists, unless -force is specified.
-h    Display help.

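23. verifyMeta

verifyMeta  # verify HDFS metadata and block files. If a block file is specified, the checksums in the metadata file are verified against the block file
Usage: hdfs debug verifyMeta -meta <metadata-file> [-block <block-file>]
-block block-file    Absolute path of the block file on the local file system of the DataNode
-meta metadata-file    Absolute path of the metadata file on the local file system of the DataNode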

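24. computeMeta

computeMeta  # compute HDFS metadata from block files. If a block file is specified, the checksums are computed from the block file and saved to the specified output metadata file
Usage: hdfs debug computeMeta -block <block-file> -out <output-metadata-file>
-block block-file    Absolute path of the block file on the local file system of the DataNode
-out output-metadata-file    Absolute path of the output metadata file that stores the checksums computed from the block file.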

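25. recoverLease

recoverLease  # recover the lease on the specified path. The path must reside on an HDFS file system. The default number of retries is 1
Usage: hdfs debug recoverLease -path <path> [-retries <num-retries>]
[-path path]    HDFS path for which to recover the lease
[-retries num-retries]    Number of times the client retries calling recoverLease. The default number of retries is 1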

For more articles on the Hadoop ecosystem, see the Hadoop ecosystem series.
