HFTP Guide

Introduction

HFTP is a Hadoop filesystem implementation that lets you read data from a remote Hadoop HDFS cluster. Reads are done over HTTP, and the data is served by the DataNodes. HFTP is a read-only filesystem and throws exceptions if you try to use it to write data or modify the filesystem state.

HFTP is primarily useful if you have multiple HDFS clusters running different versions and you need to move data from one to another. HFTP is wire-compatible even between different versions of HDFS. For example, you can run:

    hadoop distcp -i hftp://sourceFS:50070/src hdfs://destFS:8020/dest

Note that HFTP is read-only, so the destination must be an HDFS filesystem. (Also, in this example, distcp should be run using the configuration of the new filesystem.)
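The distcp invocation above can be assembled programmatically. The following is a minimal sketch, not part of Hadoop: the class DistcpCommandSketch and the method buildDistcpArgs are hypothetical names, and the check on the destination scheme only mirrors the read-only rule stated above (it assumes hsftp is the URI scheme used for HSFTP).

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the distcp command line from the guide.
class DistcpCommandSketch {

    // Returns the argument list for: hadoop distcp -i <source> <destination>.
    static List<String> buildDistcpArgs(URI source, URI destination) {
        String destScheme = destination.getScheme();
        // HFTP (and its HTTPS variant) is read-only, so it cannot be the destination.
        if ("hftp".equalsIgnoreCase(destScheme) || "hsftp".equalsIgnoreCase(destScheme)) {
            throw new IllegalArgumentException(
                    "destination must be a writable filesystem, got: " + destination);
        }
        return Arrays.asList("hadoop", "distcp", "-i",
                source.toString(), destination.toString());
    }

    public static void main(String[] args) {
        // Host names, ports, and paths are the example values from the guide.
        URI source = URI.create("hftp://sourceFS:50070/src");
        URI destination = URI.create("hdfs://destFS:8020/dest");
        System.out.println(String.join(" ", buildDistcpArgs(source, destination)));
    }
}
```

The resulting list could be handed to a process launcher on a machine that carries the destination cluster's configuration, in line with the note above.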

An extension, HSFTP, uses HTTPS by default. This means that data is encrypted in transit.

Implementation

The code for HFTP lives in the Java class org.apache.hadoop.hdfs.HftpFileSystem. Likewise, HSFTP is implemented in org.apache.hadoop.hdfs.HsftpFileSystem.
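As a rough illustration of how the two filesystems relate, the sketch below maps a URI scheme to the implementing class named above. It is a plain lookup table, not how Hadoop itself resolves filesystems. The class HftpSchemeSketch and its methods are hypothetical, and the scheme name hsftp for HSFTP is an assumption not spelled out in this guide.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a hypothetical lookup table, not Hadoop's own resolution logic.
class HftpSchemeSketch {

    private static final Map<String, String> IMPLEMENTATIONS = new HashMap<>();

    static {
        IMPLEMENTATIONS.put("hftp", "org.apache.hadoop.hdfs.HftpFileSystem");
        // Assumption: HSFTP is addressed with the "hsftp" scheme.
        IMPLEMENTATIONS.put("hsftp", "org.apache.hadoop.hdfs.HsftpFileSystem");
    }

    // Picks hsftp when data should be encrypted in transit, hftp otherwise.
    static String schemeFor(boolean encryptInTransit) {
        return encryptInTransit ? "hsftp" : "hftp";
    }

    // Returns the implementing class name for a scheme, or null if the scheme is unknown.
    static String implementationFor(String scheme) {
        return IMPLEMENTATIONS.get(scheme);
    }

    public static void main(String[] args) {
        String scheme = schemeFor(true);
        System.out.println(scheme + " -> " + implementationFor(scheme));
    }
}
```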

Configuration Options

Name | Description
--- | ---
dfs.hftp.https.port | The HTTPS port on the remote cluster. If not set, HFTP will fall back on dfs.https.port.
hdfs.service.host_ip:port | Specifies the service name (for the security subsystem) associated with the HFTP filesystem running at ip:port.
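The fallback described for dfs.hftp.https.port can be pictured as a two-step lookup. This is a minimal sketch over a plain map, assuming nothing about Hadoop's real Configuration class or its defaults. The class HftpPortSketch, the method resolveHttpsPort, and the port numbers in main are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Illustration only: mimics the documented fallback with a plain map, not Hadoop's Configuration.
class HftpPortSketch {

    // Prefers dfs.hftp.https.port; falls back on dfs.https.port; empty if neither is set.
    static OptionalInt resolveHttpsPort(Map<String, String> conf) {
        String value = conf.get("dfs.hftp.https.port");
        if (value == null) {
            value = conf.get("dfs.https.port");
        }
        return value == null
                ? OptionalInt.empty()
                : OptionalInt.of(Integer.parseInt(value.trim()));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.https.port", "50470"); // hypothetical example value
        System.out.println(resolveHttpsPort(conf)); // falls back on dfs.https.port

        conf.put("dfs.hftp.https.port", "50471"); // hypothetical example value
        System.out.println(resolveHttpsPort(conf)); // the HFTP-specific key now wins
    }
}
```

Returning an empty OptionalInt when neither key is set keeps the sketch from inventing a default port that the guide does not state.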
