Accessing HDFS with Hadoop WebHDFS

                                   Author: Yin Zhengjie (尹正傑)

Copyright notice: this is an original work. Reproduction is not permitted, and violators will be held legally liable.

 

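  WebHDFS and HttpFS are both HTTP/HTTPS REST interfaces to Hadoop. They let us read and write HDFS data and run several HDFS-related administrative commands. You can call them from programs and scripts, or from command-line tools such as curl or wget.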

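  A plain HTTP client that talks to WebHDFS has to address one specific NameNode, so on its own it gets no failover in a high-availability NameNode architecture. HttpFS, which runs as a separate gateway service, does support HA.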

 

 

I. WebHDFS Overview

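  When applications running inside a Hadoop cluster need HDFS data, they use Hadoop's native client to work with HDFS. However, you may also need to reach HDFS from outside the cluster in order to process, store, and retrieve HDFS data.

  If an application has to use the native HDFS protocol, Hadoop must be installed on the server where the application runs, and the Java dependencies must be made available to the application.

  Hadoop's WebHDFS provides a powerful set of HTTP REST APIs. REST is an architectural style for building large-scale web services, and it lets applications access and use HDFS remotely. Besides making access from outside the cluster easier, WebHDFS is also useful when you work with two Hadoop clusters that each run a different Hadoop version.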

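  Because WebHDFS uses a REST API, it does not depend on the MapReduce or HDFS version, so it can be used against both clusters. For example, you can use it when copying data between two clusters with the DistCp utility.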
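
  As a rough sketch of the DistCp case, the helper below only prints the command it would run, with the source cluster addressed through webhdfs:// and the destination through hdfs://. The function name is made up for illustration, and the destination host hadoop201.example.com is hypothetical; this is not a tested cross-version setup.

```shell
#!/usr/bin/env bash
# Print (do not run) a DistCp command that reads the source cluster over WebHDFS.
# Arguments: source NameNode HTTP address, destination NameNode RPC address, path.
# distcp_over_webhdfs is a hypothetical helper name, not part of Hadoop.
distcp_over_webhdfs() {
  local src_http="$1" dst_rpc="$2" path="$3"
  printf 'hadoop distcp webhdfs://%s%s hdfs://%s%s\n' \
    "$src_http" "$path" "$dst_rpc" "$path"
}

# hadoop201.example.com is a placeholder for a second cluster.
distcp_over_webhdfs "hadoop101.yinzhengjie.com:50070" "hadoop201.example.com:9000" "/yinzhengjie"
```

  Printing the command first makes it easy to review the two URIs before launching a copy job.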

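  When you access HDFS data remotely through WebHDFS, Hadoop does not need to be installed on the client. Well-known tools such as curl and wget are enough. WebHDFS connects directly to the Hadoop cluster and covers the full set of HDFS file system operations.

  WebHDFS uses the basic HTTP operations GET, PUT, POST, and DELETE to operate on the HDFS file system remotely.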
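
  To make the mapping concrete, here is a small sketch that pairs a few WebHDFS operation names with the HTTP method they are sent with. The function name is hypothetical and the list is deliberately short; the official WebHDFS documentation has the full table.

```shell
#!/usr/bin/env bash
# Map a WebHDFS operation name to the HTTP method it is sent with.
# Only a handful of operations are listed; anything else is reported as unknown.
# webhdfs_method is a hypothetical helper name.
webhdfs_method() {
  case "$1" in
    OPEN|GETFILESTATUS|LISTSTATUS|GETCONTENTSUMMARY) echo GET ;;
    CREATE|MKDIRS|RENAME|SETPERMISSION)              echo PUT ;;
    APPEND)                                          echo POST ;;
    DELETE)                                          echo DELETE ;;
    *) echo "unknown operation: $1" >&2; return 1 ;;
  esac
}

# Show the method for one operation of each kind.
for op in OPEN MKDIRS APPEND DELETE; do
  printf '%-8s -> %s\n' "$op" "$(webhdfs_method "$op")"
done
```

  The method then goes to curl through -X, for example -X PUT for MKDIRS; GET is curl's default and needs no -X.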

  Recommended reading:
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html

  Tip:
    If your HDFS cluster has Kerberos authentication enabled, you will need to pay attention to the following parameters (set in hdfs-site.xml):
      dfs.web.authentication.kerberos.principal
      dfs.web.authentication.kerberos.keytab
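
  On the client side, the WebHDFS documentation describes SPNEGO authentication through curl's --negotiate -u : options when security is on, in place of the user.name query parameter used with simple authentication. The sketch below only prints the command for each mode and executes nothing; the function name is hypothetical, and it assumes a curl build with SPNEGO support and a valid ticket obtained with kinit.

```shell
#!/usr/bin/env bash
# Print the curl command for a WebHDFS GET under one of two authentication modes.
#   simple   : append user.name=<user> to the query string
#   kerberos : let curl do SPNEGO with --negotiate -u : (needs a ticket from kinit)
# Nothing is executed; the command is only echoed.
# webhdfs_curl_cmd is a hypothetical helper name.
webhdfs_curl_cmd() {
  local mode="$1" url="$2" user="${3:-}"
  case "$mode" in
    simple)   printf 'curl -i -L "%s&user.name=%s"\n' "$url" "$user" ;;
    kerberos) printf 'curl -i -L --negotiate -u : "%s"\n' "$url" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

url="http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=LISTSTATUS"
webhdfs_curl_cmd simple "$url" root
webhdfs_curl_cmd kerberos "$url"
```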

 

II. Hands-On: Accessing HDFS Through the WebHDFS REST API with the HDFS Command-Line Tool

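  Using WebHDFS is simple: replace the HDFS file system URI with a WebHDFS URI, that is, the webhdfs:// scheme plus the NameNode's HTTP address. Let's look at a few examples.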
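
  The rewrite can be sketched as a small string transformation. The helper name is hypothetical, and the ports are the ones used in this post (9000 for NameNode RPC, 50070 for NameNode HTTP); other clusters may use different values.

```shell
#!/usr/bin/env bash
# Rewrite an hdfs:// URI into a webhdfs:// URI.
# The RPC port in the original URI is swapped for the NameNode's HTTP port,
# which is passed as the second argument.
# to_webhdfs_uri is a hypothetical helper name.
to_webhdfs_uri() {
  local uri="$1" http_port="$2"
  case "$uri" in
    hdfs://*) ;;
    *) echo "not an hdfs:// URI: $uri" >&2; return 1 ;;
  esac
  local rest="${uri#hdfs://}"      # host:port/path
  local authority="${rest%%/*}"    # host:port
  local host="${authority%%:*}"    # host
  local path=""
  # Keep the path only if there is one after the authority.
  case "$rest" in
    */*) path="/${rest#*/}" ;;
  esac
  printf 'webhdfs://%s:%s%s\n' "$host" "$http_port" "$path"
}

to_webhdfs_uri "hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie" 50070
```

  The resulting URI can be passed to hdfs dfs in place of the hdfs:// one.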

1>. List all files and directories under the HDFS directory "/yinzhengjie"

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /        #Note: when no file system is named on the command line, the default file system defined by the "fs.defaultFS" property in "core-site.xml" is used.
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-21 16:40 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-20 19:26 /system
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-21 18:42 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie
Found 3 items
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 hdfs://hadoop101.yinzhengjie.com:9000/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/            
Found 3 items
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie        #access HDFS through the webhdfs scheme
Found 3 items
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# 

2>. Upload a local file to the HDFS cluster

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -put /etc/fstab webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/fstab        #upload the local file "/etc/fstab" to the HDFS directory "/yinzhengjie/"
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 

3>. Download a file or directory from the HDFS file system

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d      #download a directory
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# ll yum.repos.d/
total 40
-rw-r--r-- 1 root root 1664 Aug 31 14:32 CentOS-Base.repo
-rw-r--r-- 1 root root 1309 Aug 31 14:32 CentOS-CR.repo
-rw-r--r-- 1 root root  649 Aug 31 14:32 CentOS-Debuginfo.repo
-rw-r--r-- 1 root root  314 Aug 31 14:32 CentOS-fasttrack.repo
-rw-r--r-- 1 root root  630 Aug 31 14:32 CentOS-Media.repo
-rw-r--r-- 1 root root 1331 Aug 31 14:32 CentOS-Sources.repo
-rw-r--r-- 1 root root 5701 Aug 31 14:32 CentOS-Vault.repo
-rw-r--r-- 1 root root  951 Aug 31 14:32 epel.repo
-rw-r--r-- 1 root root 1050 Aug 31 14:32 epel-testing.repo
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# ll
total 0
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -get webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz       #download a file
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# ll
total 4
-rw-r--r-- 1 root root  69 Aug 31 14:33 wc.txt.gz
drwxr-xr-x 2 root root 229 Aug 31 14:32 yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 

4>. Delete a file or directory in the HDFS file system

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 4 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
drwxr-xr-x   - root admingroup          0 2020-08-14 23:13 /yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm -r webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d        #delete a directory
20/08/31 14:38:12 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/yum.repos.d' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/yum.repos.d
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
-rw-r--r--   3 root admingroup         69 2020-08-14 23:22 /yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -rm webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz            #delete a file
20/08/31 14:38:28 INFO fs.TrashPolicyDefault: Moved: 'webhdfs://hadoop101.yinzhengjie.com:50070/yinzhengjie/wc.txt.gz' to trash at: webhdfs://hadoop101.yinzhengjie.com:50070/user/root/.Trash/Current/yinzhengjie/wc.txt.gz
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# 

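5>. Other operations

  With the four examples above as a foundation, you should have little trouble exploring the remaining commands on your own. They work essentially the same way as the hdfs dfs usage I shared in an earlier post; the only change is swapping the hdfs scheme for the webhdfs scheme.

  Recommended reading: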
    https://www.cnblogs.com/yinzhengjie2020/p/13296680.html

 

III. Hands-On: Accessing HDFS Through the WebHDFS REST API with curl

  WebHDFS is a fairly comprehensive tool, with many operations for accessing and working with HDFS data. Next, let's see how to use curl to access HDFS through the WebHDFS REST API.

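  I won't go into curl itself in detail here. Interested readers can consult other blog posts, and my own notes cover its basic usage. Commonly used curl options are listed below:
    -A/--user-agent <string>:
      Set the User-Agent string sent to the server

    -e/--referer <URL>:
      Set the referrer URL

    --cacert <file>:
      CA certificate (SSL)

    -k/--insecure:
      Allow SSL connections without verifying the certificate

    --compressed:
      Request a compressed response

    -H/--header <line>:
      Pass a custom header to the server

    -i:
      Show the response body together with the response headers

    -I/--head:
      Show only the response headers

    -D/--dump-header <file>:
      Write the response headers to the given file

    --basic:
      Use HTTP Basic authentication

    -u/--user <user[:password]>:
      Set the user name and password for the server

    -L:
      On a 3xx response, resend the request to the new location
  
    -O:
      Save the download locally under the file name taken from the URL

    -o <file>:
      Save the download to the given file

    --limit-rate <rate>:
      Limit the transfer rate

    -0/--http1.0:
      The digit 0; use HTTP 1.0

    -v/--verbose:
      More detailed output

    -C:
      Resume an interrupted transfer

    -c/--cookie-jar <file name>:
      Write the cookies received to the given file

    -x/--proxy <proxyhost[:port]>:
      Set the proxy server address

    -X/--request <command>:
      Send the given request method to the server

    -U/--proxy-user <user:password>:
      User name and password for the proxy server

    -T:
      Upload the given local file (for example to an FTP server, or as the body of an HTTP PUT)

    --data/-d:
      Send the given data in a POST request

    -b name=data:
      Send cookies to the server, such as values previously received through Set-Cookie
 
  Recommended reading: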
    https://www.cnblogs.com/yinzhengjie/p/7719804.html

1>. Read a file in HDFS (this example reads "/yinzhengjie/hosts")

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie"      #op names the operation; user.name names the user accessing the URI
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:39:16 GMT
Date: Mon, 31 Aug 2020 07:39:16 GMT
Pragma: no-cache
Content-Type: application/octet-stream
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=yinzhengjie&p=yinzhengjie&t=simple&e=1598895556829&s=ak8QrD/3I7HowelGDzH9uvnDeAGBihJhCbCm0wVqS2M="; Path=/; HttpOnly
Location: http://hadoop104.yinzhengjie.com:50075/webhdfs/v1/yinzhengjie/hosts?op=OPEN&user.name=yinzhengjie&namenoderpcaddress=hadoop101.yinzhengjie.com:9000&offset=0
Content-Length: 0

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 371

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

#Hadoop 2.x
172.200.6.101 hadoop101.yinzhengjie.com
172.200.6.102 hadoop102.yinzhengjie.com
172.200.6.103 hadoop103.yinzhengjie.com
172.200.6.104 hadoop104.yinzhengjie.com
172.200.6.105 hadoop105.yinzhengjie.com
[root@hadoop105.yinzhengjie.com ~]# 

2>. Check the status of an HDFS directory

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-21 16:40 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-20 19:26 /system
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-31 14:38 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=LISTSTATUS"        #show the status of the "/yinzhengjie" directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:51:31 GMT
Date: Mon, 31 Aug 2020 07:51:31 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatuses":{"FileStatus":[
{"accessTime":1598855175268,"blockSize":536870912,"childrenNum":0,"fileId":16489,"group":"admingroup","length":490,"modificationTime":1598855175823,"owner":"root","pathSuffix":"fstab","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"hosts","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}]}}
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# 
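
  The JSON above is easier to read once it is reduced to one line per entry. The sketch below does that with plain grep and sed text matching; it is not a JSON parser, the function name is hypothetical, and it assumes each FileStatus object lists "pathSuffix" before "type", as in the response above. A tool such as jq would be the more robust choice where it is available.

```shell
#!/usr/bin/env bash
# Print "<type> <name>" for every entry in a LISTSTATUS response read from stdin.
# Plain text matching only; assumes "pathSuffix" comes before "type" in each object.
# list_entries is a hypothetical helper name.
list_entries() {
  grep -o '"pathSuffix":"[^"]*"[^}]*"type":"[A-Z]*"' |
    sed -E 's/^"pathSuffix":"([^"]*)".*"type":"([A-Z]*)"$/\2 \1/'
}

# Trimmed-down sample in the shape of a LISTSTATUS response.
sample='{"FileStatuses":{"FileStatus":[
{"length":490,"pathSuffix":"fstab","permission":"644","type":"FILE"},
{"length":371,"pathSuffix":"hosts","permission":"644","type":"FILE"}
]}}'

printf '%s\n' "$sample" | list_entries
```

  Against a live cluster the input would come from curl -s (without -i, so that no headers are mixed into the JSON) piped into list_entries.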

3>. Check the status of an HDFS file

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i -L "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/hosts?op=GETFILESTATUS" ;echo       #show the status of the "/yinzhengjie/hosts" file
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 07:58:53 GMT
Date: Mon, 31 Aug 2020 07:58:53 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"FileStatus":{"accessTime":1598859477240,"blockSize":536870912,"childrenNum":0,"fileId":16484,"group":"admingroup","length":371,"modificationTime":1597999554986,"owner":"root","pathSuffix":"","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}}
[root@hadoop105.yinzhengjie.com ~]# 

4>. Create a directory

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root admingroup          0 2020-08-21 16:40 /bigdata
drwxr-xr-x   - root admingroup          0 2020-08-20 19:26 /system
drwx------   - root admingroup          0 2020-08-14 19:19 /user
drwxr-xr-x   - root admingroup          0 2020-08-31 16:17 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
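[root@hadoop105.yinzhengjie.com ~]# curl -i -X PUT "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?user.name=root&op=MKDIRS&permissions=751" ;echo     #create the "/yinzhengjie/webHDFS" directory; note that the documented parameter name is "permission", and the listing below shows the default drwxr-xr-x, so "permissions=751" appears to have had no effect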
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 08:14:10 GMT
Date: Mon, 31 Aug 2020 08:14:10 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598897650918&s=rp1JdtIpaV59fm8TFisjCUMH3ARerDWzI4oL+jCezrs="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-21 16:45 /yinzhengjie/hosts
drwxr-xr-x   - root admingroup          0 2020-08-31 16:14 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]# 
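
  The WebHDFS documentation names the MKDIRS mode parameter permission (singular). In the run above the new directory is listed as drwxr-xr-x rather than 751, which suggests the permissions spelling was not applied. The sketch below builds the URL with the documented name and rejects anything that is not a three- or four-digit octal mode; the function name is hypothetical, and the check is deliberately narrow.

```shell
#!/usr/bin/env bash
# Build a MKDIRS URL using the documented "permission" parameter (singular).
# The mode is checked first so a typo is caught before any request is sent.
# Arguments: NameNode HTTP address, HDFS path, user, octal mode.
# mkdirs_url is a hypothetical helper name.
mkdirs_url() {
  local namenode="$1" path="$2" user="$3" mode="$4"
  case "$mode" in
    [0-7][0-7][0-7]|[01][0-7][0-7][0-7]) ;;
    *) echo "invalid octal permission: $mode" >&2; return 1 ;;
  esac
  printf 'http://%s/webhdfs/v1%s?user.name=%s&op=MKDIRS&permission=%s\n' \
    "$namenode" "$path" "$user" "$mode"
}

mkdirs_url "hadoop101.yinzhengjie.com:50070" "/yinzhengjie/webHDFS" root 751
```

  The printed URL can then be passed, in double quotes, to curl -i -X PUT.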

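5>. Create a file and write data to it

  I am using Hadoop 2.10.0. My attempts to create a file, or to append to an existing one, with the methods from the official WebHDFS documentation all failed. Both official methods require sending two HTTP requests, and after many tries I could not get either to work. If you have succeeded, I would be grateful if you shared how.

  Reference links: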
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Append_to_a_File
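
  For reference, the two requests described in the documentation can be sketched as follows: the first PUT goes to the NameNode without a body and is answered with a 307 redirect, and the second PUT sends the file content to the DataNode URL from the Location header. This sketch has not been run against the cluster in this post and is not claimed to resolve the failure described above. The function names and the file name demo.txt are hypothetical, and the DataNode host in the redirect must be resolvable and reachable from the client.

```shell
#!/usr/bin/env bash
# Two-step file creation as described in the WebHDFS documentation (sketch).
# extract_location and webhdfs_create are hypothetical helper names.

# Pull the Location header value out of raw HTTP response headers on stdin.
extract_location() {
  tr -d '\r' | sed -n 's/^[Ll]ocation: *//p' | head -n 1
}

# Step 1: PUT to the NameNode without a body; it answers 307 with a DataNode URL.
# Step 2: PUT the file content to that DataNode URL.
# Arguments: NameNode HTTP address, HDFS path, user, local file.
webhdfs_create() {
  local namenode="$1" path="$2" user="$3" local_file="$4"
  local location
  location="$(curl -s -i -X PUT \
    "http://${namenode}/webhdfs/v1${path}?op=CREATE&user.name=${user}&overwrite=false" \
    | extract_location)"
  if [ -z "$location" ]; then
    echo "no redirect received from NameNode" >&2
    return 1
  fi
  curl -s -i -X PUT -T "$local_file" "$location"
}

# Offline check of the header parsing only; no request is sent here.
printf 'HTTP/1.1 307 TEMPORARY_REDIRECT\r\nLocation: http://hadoop104.yinzhengjie.com:50075/webhdfs/v1/yinzhengjie/demo.txt?op=CREATE\r\nContent-Length: 0\r\n\r\n' \
  | extract_location
```

  A call would look like webhdfs_create hadoop101.yinzhengjie.com:50070 /yinzhengjie/demo.txt root /etc/fstab, with a 201 Created response expected from the second request according to the documentation.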

6>. Delete a directory or file

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 3 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-31 18:07 /yinzhengjie/hosts
drwxr-xr-x   - root admingroup          0 2020-08-31 18:07 /yinzhengjie/webHDFS
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE  "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/webHDFS?op=DELETE&user.name=root";echo     #delete a directory
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:07:56 GMT
Date: Mon, 31 Aug 2020 10:07:56 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904476157&s=4aHgz6EwyJfdmjlwOtkXs+8Je94BybNxDUYoon7FIWE="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 2 items
-rw-r--r--   3 root admingroup        490 2020-08-31 14:26 /yinzhengjie/fstab
-rw-r--r--   3 root admingroup        371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i -X DELETE  "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie/fstab?op=DELETE&user.name=root";echo       #delete a file
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:08:52 GMT
Date: Mon, 31 Aug 2020 10:08:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Set-Cookie: hadoop.auth="u=root&p=root&t=simple&e=1598904532486&s=MCjvGp705lVZcZx7hc5UCeERNoRDGC5rsW5E/USXi6c="; Path=/; HttpOnly
Transfer-Encoding: chunked

{"boolean":true}
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -ls /yinzhengjie/
Found 1 items
-rw-r--r--   3 root admingroup        371 2020-08-31 18:07 /yinzhengjie/hosts
[root@hadoop105.yinzhengjie.com ~]# 

7>. Check directory quotas

[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
        none             inf            none             inf            1            2                742 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i   "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo 
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:13 GMT
Date: Mon, 31 Aug 2020 10:30:13 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":-1,"spaceConsumed":29631,"spaceQuota":-1,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setSpaceQuota 10g /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]# hdfs dfsadmin -setQuota 50 /yinzhengjie/
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# hdfs dfs -count -h -v -q /yinzhengjie
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
          50              47            10 G          10.0 G            1            2                742 /yinzhengjie
[root@hadoop105.yinzhengjie.com ~]# 
[root@hadoop105.yinzhengjie.com ~]# curl -i   "http://hadoop101.yinzhengjie.com:50070/webhdfs/v1/yinzhengjie?op=GETCONTENTSUMMARY" ;echo 
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Expires: Mon, 31 Aug 2020 10:30:52 GMT
Date: Mon, 31 Aug 2020 10:30:52 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"ContentSummary":{"directoryCount":1,"fileCount":2,"length":742,"quota":50,"spaceConsumed":29631,"spaceQuota":10737418240,"typeQuota":{}}}
[root@hadoop105.yinzhengjie.com ~]# 
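
  The two views above line up as follows: a quota of -1 in ContentSummary corresponds to "none" in hdfs dfs -count (with "inf" remaining), and the remaining name quota is the quota minus the directory and file counts. The sketch below reproduces that arithmetic from the ContentSummary fields; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Derive REM_QUOTA / REM_SPACE_QUOTA style values from ContentSummary fields.
# A quota of -1 means no quota is set, reported here as "inf" remaining.
# remaining_name_quota and remaining_space_quota are hypothetical helper names.

# Remaining name quota: the quota counts directories and files together.
remaining_name_quota() {
  local quota="$1" dir_count="$2" file_count="$3"
  if [ "$quota" -lt 0 ]; then
    echo inf
  else
    echo $(( quota - dir_count - file_count ))
  fi
}

# Remaining space quota in bytes: spaceQuota minus spaceConsumed.
remaining_space_quota() {
  local space_quota="$1" space_consumed="$2"
  if [ "$space_quota" -lt 0 ]; then
    echo inf
  else
    echo $(( space_quota - space_consumed ))
  fi
}

# Values from the second GETCONTENTSUMMARY response above.
remaining_name_quota 50 1 2              # matches REM_QUOTA 47
remaining_space_quota 10737418240 29631  # bytes left of the 10 G space quota
```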

8>. Other operations

  Recommended reading:
    https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
