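In the example given in Hive's official documentation, the field delimiter is \001, but the API documentation for the --hive-delims-replacement and --hive-drop-import-delims parameters says the character they handle is \0x01. One is written as octal 1 and the other as hexadecimal 1. To confirm that the two refer to the same character, I ran the experiment below.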
CREATE TABLE page_view(viewTime INT, userid BIGINT,
                page_url STRING, referrer_url STRING,
                ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
ROW FORMAT DELIMITED
        FIELDS TERMINATED BY '\001'
STORED AS SEQUENCEFILE;
The above statement lets you create the same table as the previous table.
In the previous examples the data is stored in <hive.metastore.warehouse.dir>/page_view. Specify a value for the key hive.metastore.warehouse.dir in the Hive config file hive-site.xml.
--hive-delims-replacement <arg>   Replace Hive record \0x01
                                  and row delimiters (\n\r)
                                  from imported string fields
                                  with user-defined string
--hive-drop-import-delims         Drop Hive record \0x01 and
                                  row delimiters (\n\r) from
                                  imported string fields
These two parameters cannot be used together.
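The octal-versus-hex question can also be checked without running an import. The following is a minimal Python sketch that compares the two escapes and imitates what the help text describes. It assumes that the help text's \0x01 means the byte with hex value 01. The helpers replace_delims and drop_delims are hypothetical names written for illustration and are not Sqoop's implementation.

```python
# Octal escape \001 and hex escape \x01 both denote code point 1.
OCTAL = "\001"
HEX = "\x01"

# Characters the help text says are handled: \n, \r and \x01.
HIVE_DELIMS = ("\n", "\r", "\x01")


def replace_delims(value: str, replacement: str) -> str:
    """Illustrative stand-in for --hive-delims-replacement (not Sqoop's code)."""
    for delim in HIVE_DELIMS:
        value = value.replace(delim, replacement)
    return value


def drop_delims(value: str) -> str:
    """Illustrative stand-in for --hive-drop-import-delims (not Sqoop's code)."""
    return replace_delims(value, "")


if __name__ == "__main__":
    # Compare the two spellings of the delimiter.
    print(OCTAL == HEX, ord(OCTAL), ord(HEX))
    # Apply each behavior to a string with an embedded \001.
    print(replace_delims("QQ\001jyyh", "|"))
    print(drop_delims("QQ\001jyyh"))
```

If the two escapes compare equal here, any difference between \001 and \0x01 in the documentation is one of notation only, which is what the import experiment is meant to confirm.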
Source data (the box character in the middle is \001, generated with Java code):
11 QQjyyh qwqwqw 1 1111 2017/10/15 23:27:48
15 javajyyh 中文 2 1212 2017/10/15 23:39:57
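Why an embedded \001 matters can be sketched in a few lines of Python. The example assumes the \001 sits between "QQ" and "jyyh" in the second field of the first row (inferred from the imported result, since the character is invisible in the listing) and that fields are joined with \001, the same delimiter the import command in this post passes to --fields-terminated-by. The helpers serialize and parse are hypothetical and only mimic delimiter-based splitting.

```python
FIELD_SEP = "\001"  # field delimiter, as in --fields-terminated-by "\001"

# First sample row; the position of the embedded \001 is an assumption.
row = ["11", "QQ\001jyyh", "qwqwqw", "1", "1111", "2017/10/15 23:27:48"]


def serialize(fields, sep=FIELD_SEP):
    """Join fields into one delimited line."""
    return sep.join(fields)


def parse(line, sep=FIELD_SEP):
    """Split a delimited line back into fields."""
    return line.split(sep)


# Without any cleanup, the embedded \001 is read as an extra field boundary.
raw = parse(serialize(row))

# Replacing the embedded delimiter first keeps the field count intact.
cleaned = parse(serialize([f.replace(FIELD_SEP, "|") for f in row]))

if __name__ == "__main__":
    print(len(row), len(raw), len(cleaned))
    print(raw[1:3], cleaned[1])
```

In this sketch the uncleaned row splits into one field too many, which shifts every later column, while the cleaned row keeps its original six fields.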
sqoop import --connect jdbc:oracle:thin:@MSI:1521/study --username luo --password Sys_20170929 --table TB_NEWS --fields-terminated-by "\001" --lines-terminated-by "\n" --hive-import --hive-overwrite --null-string "" --null-non-string "" --fetch-size 1000 -m 3 --create-hive-table --hive-table luoqi_test.TB_NEWS --delete-target-dir
After the import, the data in Hive looks like this:
11.0 QQ|jyyh qwqwqw 1 1111 2017-10-15 23:27:48.0
15.0 java|jyyh 中文 2 1212 2017-10-15 23:39:57.0
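Result: '\001' was replaced with the expected character ('|' in the rows above).
Conclusion: \001 can be replaced via the --hive-delims-replacement parameter, so \001 and \0x01 refer to the same character.
Would an export restore this character?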
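One observation that bears on this question, offered as a string-level sketch and not as a statement about how Sqoop export behaves: the replacement by itself is not reversible, because a value that originally contained "|" becomes indistinguishable from one that contained \001. The helpers replace_delims and naive_restore below are hypothetical illustrations.

```python
def replace_delims(value, replacement="|"):
    """Illustrative replacement of \\n, \\r and \\x01 (not Sqoop's code)."""
    for delim in ("\n", "\r", "\x01"):
        value = value.replace(delim, replacement)
    return value


def naive_restore(value, replacement="|"):
    """Hypothetical reverse step: turn every replacement back into \\001."""
    return value.replace(replacement, "\001")


with_delim = "QQ\001jyyh"  # source value containing the delimiter
with_pipe = "QQ|jyyh"      # different source value containing a literal pipe

# Both source values look identical after the replacement.
imported_a = replace_delims(with_delim)
imported_b = replace_delims(with_pipe)

if __name__ == "__main__":
    print(imported_a == imported_b)
    # The naive reverse step recovers the first value but corrupts the second.
    print(naive_restore(imported_a) == with_delim)
    print(naive_restore(imported_b) == with_pipe)
```

Under these assumptions, restoring the character on export would need a replacement string that never occurs in the real data.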