(I) Getting started

1. Create a table

    create table if not exists ljh_emp(
        name string,
        salary float,
        gender string)
    comment 'basic information of an employee'
    row format delimited fields terminated by ',';

3. Load the data into the table

    load data local inpath '/home/ljhn1829/test' into table ljh_emp;

4. Query the table

    select * from ljh_emp;
    OK
    ljh      25000.0  male
    jediael  25000.0  male
    llq      15000.0  female
    Time taken: 0.159 seconds, Fetched: 3 row(s)
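
(II) About delimiters

1. Default delimiters

In Hive the default line delimiter is \n and the default field delimiter is Ctrl+A (octal \001). Ctrl+B (\002) and Ctrl+C (\003) are also used, to separate the elements of array, struct, and map values; see Programming Hive (《Hive編程指南》), p. 44.

So if a table is created without row format delimited fields terminated by ',', Hive assumes the field delimiter is Ctrl+A. There are two ways to handle this:
one is to specify the delimiter when creating the table, as in the example above;
the other is to use Ctrl+A in the data file itself, as in part 2 below.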
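
To make the defaults described above concrete, here is a minimal sketch that spells them out explicitly. The table name ljh_emp_default is hypothetical; the columns are copied from ljh_emp. Under the assumption that the table uses Hive's default text SerDe, this definition should behave the same as one with no row format clause at all.

```sql
-- Hypothetical table name; same columns as ljh_emp.
create table if not exists ljh_emp_default(
    name string,
    salary float,
    gender string)
row format delimited
    fields terminated by '\001'   -- Ctrl+A, the default field delimiter
    lines terminated by '\n';     -- the default line delimiter

-- Inspect the table's storage settings, including any delimiter set explicitly.
describe formatted ljh_emp_default;
```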
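
2. Using Ctrl+A as the delimiter in the data file

(1) Create the table

    create table ljh_test_emp(name string, salary float, gender string);

(2) Prepare the data file

Create a directory named test2 that contains a single file with the following content:

    ljh^A25000^Amale
    jediael^A25000^Amale
    llq^A15000^Afemale

The ^A character is visible only in vi; cat does not display it.
To type ^A in vi, press Ctrl+V and then Ctrl+A while in insert mode.

(3) Load the data into the table

The path below assumes that test2 was created under /home/ljhn1829, like the test file used earlier.

    load data local inpath '/home/ljhn1829/test2' into table ljh_test_emp;

(4) Query the data

    hive> select * from ljh_test_emp;
    OK
    ljh      25000.0  male
    jediael  25000.0  male
    llq      15000.0  female
    Time taken: 0.2 seconds, Fetched: 3 row(s)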
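
If typing ^A by hand in vi is inconvenient, one alternative is to let Hive write a Ctrl+A-delimited file itself. The sketch below assumes the comma-delimited table ljh_emp from part (I) is already populated; the directory /tmp/ljh_emp_ctrl_a is a hypothetical path chosen for illustration. Note that insert overwrite replaces whatever is already in the target directory.

```sql
-- Export ljh_emp to local text files. Without a row format clause,
-- Hive writes the fields separated by Ctrl+A (\001).
-- The target directory is hypothetical; its existing contents are replaced.
insert overwrite local directory '/tmp/ljh_emp_ctrl_a'
select name, salary, gender from ljh_emp;

-- Load the exported files into the table that relies on the default delimiter.
load data local inpath '/tmp/ljh_emp_ctrl_a' into table ljh_test_emp;
```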

3. No delimiter specified, and the file does not use Ctrl+A either: the following problem occurs

(1) Create the table

    create table if not exists ljh_emp_test(
        name string,
        salary float,
        gender string)
    comment 'basic information of an employee';

(2) Prepare the data

    ljh,25000,male
    jediael,25000,male
    llq,15000,female

(3) Load the data into the table

    load data local inpath '/home/ljhn1829/test' into table ljh_emp_test;

(4) View the table data

    select * from ljh_emp_test;
    OK
    ljh,25000,male      NULL  NULL
    jediael,25000,male  NULL  NULL
    llq,15000,female    NULL  NULL
    Time taken: 0.185 seconds, Fetched: 3 row(s)

Because the table's field delimiter is Ctrl+A and the file contains none, each whole line is read as the first field, which leaves the other two fields NULL.
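
One way to repair a table in this state without recreating it is to change the delimiter stored in its SerDe properties. This is a sketch under the assumption that ljh_emp_test uses Hive's default text SerDe, which is what a plain create table produces; the data file itself is left untouched.

```sql
-- Tell the table's SerDe that fields are separated by commas.
alter table ljh_emp_test set serdeproperties ('field.delim' = ',');

-- Re-run the query; the three columns should now be populated separately.
select * from ljh_emp_test;
```

The other option is the one shown in part (I): drop the table, recreate it with row format delimited fields terminated by ',', and load the file again.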

2. Prepare the data

    John Doe^A100001.1^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BStateTaxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600
    Mary Smith^A80000.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601
    Todd Jones^A70000.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700
    Bill King^A60001.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100

Note:
^A separates fields
^B separates the elements of an array, struct, or map
^C separates the key from the value in a map entry
See Programming Hive (《Hive編程指南》), p. 44.
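
For reference, a table that accepts a file laid out like this could be declared roughly as follows. The original DDL is not shown in this post, so this is a reconstruction and should be read as an assumption: the column names come from the insert statement and query output elsewhere in the post (including the struct field spelled stree, which presumably was meant to be street), and the delimiter clauses simply spell out the ^A/^B/^C defaults listed above.

```sql
-- Reconstructed sketch, not the author's original statement.
create table if not exists employees(
    name string,
    salary float,
    subordinates array<string>,
    deductions map<string, float>,
    -- "stree" is spelled as it appears in the query output of this post
    address struct<stree:string, city:string, state:string, zip:int>)
partitioned by (country string, state string)
row format delimited
    fields terminated by '\001'            -- ^A between columns
    collection items terminated by '\002'  -- ^B between array/struct/map elements
    map keys terminated by '\003'          -- ^C between a map key and its value
    lines terminated by '\n'
stored as textfile;
```

Because these are all default values, omitting the whole row format clause should give the same result.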

3. Load the data into the table

    load data local inpath '/home/ljhn1829/phd' into table employees partition(country='us',state='ca');

4. View the table data

    hive> select * from employees;
    OK
    John Doe    100001.1  ["Mary Smith","Todd Jones"]  {"Federal Taxes":0.2,"StateTaxes":0.05,"Insurance":0.1}    {"stree":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}    us  ca
    Mary Smith  80000.0   ["Bill King"]                {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}   {"stree":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601}    us  ca
    Todd Jones  70000.0   []                           {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}  {"stree":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700}  us  ca
    Bill King   60001.0   []                           {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}  {"stree":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100}   us  ca
    Time taken: 0.312 seconds, Fetched: 4 row(s)
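
The array, map, and struct columns in this output can also be addressed element by element. The query below is a small sketch of that syntax; it assumes the column names subordinates, deductions, and address (taken from the insert statement later in this post), and the map key and struct field names visible in the output above.

```sql
select
    name,
    subordinates[0]             as first_subordinate,  -- array index, NULL if the array is empty
    deductions['Federal Taxes'] as federal_tax_rate,   -- map lookup by key
    address.city                as city                -- struct field access
from employees
where country = 'us' and state = 'ca';                 -- restrict to the loaded partition
```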

2. Insert the data

First switch dynamic partitioning to non-strict mode:

    hive> set hive.exec.dynamic.partition.mode=nonstrict;

Otherwise the following exception is raised:

    FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
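
Then run the insert. The last two columns of the select list supply the values of the dynamic partition columns, in the order given in the partition clause:

    insert into table employees2 partition (country,state)
    select name, salary, subordinates, deductions, address, e.country, e.state
    from employees e;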
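
The insert presumes that employees2 already exists with the same columns and partition keys as employees; the post does not show how it was created. One plausible way, offered here only as an assumption, is to copy the definition of employees, and afterwards to list the partitions that the dynamic insert produced.

```sql
-- Run before the insert: create employees2 with the same schema and
-- partition columns as employees (assumed; the original DDL is not shown).
create table if not exists employees2 like employees;

-- Dynamic partitioning must be enabled; this is the default in recent Hive versions.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Run after the insert: list the partitions that were created.
show partitions employees2;
```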