1. Overview
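Lateral view is used together with UDTFs such as explode (often fed by an array-producing function like split). It splits one row of data into multiple rows, and the split rows can then be aggregated. Lateral view first calls the UDTF for each row of the source table; the UDTF turns that row into one or more rows; lateral view then combines the results into a virtual table that can be given a table alias.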
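2. Explode syntax
The previous article, "Hive in Practice: Hive Complex Data Types Explained", covered Hive's complex data types. This section shows how to use explode on those types.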
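1). Applied to the array data type, querying the student1 table from the previous article directly
Syntax:
    select explode(arraycol) as newcol from tablename;
explode(): the argument is the name of a column of array type.
newcol: the new name given to the column produced by the conversion.
tablename: the table name.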
Data in table student1:
    hive> select * from student1;
    OK
    100 "student1" [80.0,82.0,84.0]
    101 "student2" [70.0,72.0,74.0]
    102 "student3" [60.0,62.0,64.0]
    Time taken: 0.162 seconds, Fetched: 3 row(s)
Applying explode to the array-typed column score:
    hive> select explode(score) as new_score from student1;
    OK
    80.0
    82.0
    84.0
    70.0
    72.0
    74.0
    60.0
    62.0
    64.0
    Time taken: 0.121 seconds, Fetched: 9 row(s)
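To make the row count above easier to see, here is a small Python sketch of what explode does to an array column. It only models the semantics and is not how Hive implements it; the list student1 is a hand-copied stand-in for the table.

```python
# Stand-in for the student1 table: (id, name, score) with score as an array.
student1 = [
    (100, "student1", [80.0, 82.0, 84.0]),
    (101, "student2", [70.0, 72.0, 74.0]),
    (102, "student3", [60.0, 62.0, 64.0]),
]

# explode(score): every array element becomes its own output row.
# Note that id and name are not carried along; only the exploded column survives.
new_score = [value for (_id, _name, score) in student1 for value in score]

for value in new_score:
    print(value)
print("rows:", len(new_score))  # 3 input rows x 3 elements each = 9
```

Three input rows with three elements each give the nine output rows shown in the Hive session.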
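2). Applied to the map data type, querying the student2 table from the previous article directly
Syntax:
    select explode(mapcol) as (keyname,valuename) from tablename;
explode(): the argument is the name of a column of map type.
Because a map is a key-value structure, it is converted into two columns: one built from the keys and one built from the values.
keyname: the name of the column produced from the keys.
valuename: the name of the column produced from the values.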
Data in table student2:
    hive> select * from student2;
    OK
    100 "student1" {"\"yuwen\"":80.0,"\"shuxue\"":82.0,"\"yingyu\"":84.0}
    101 "student2" {"\"yuwen\"":70.0,"\"shuxue\"":72.0,"\"yingyu\"":74.0}
    102 "student3" {"\"yuwen\"":60.0,"\"shuxue\"":62.0,"\"yingyu\"":64.0}
    Time taken: 0.262 seconds, Fetched: 3 row(s)
Applying explode to the map-typed column score:
    hive> select explode(score) as (score_key,score_value) from student2;
    OK
    "yuwen" 80.0
    "shuxue" 82.0
    "yingyu" 84.0
    "yuwen" 70.0
    "shuxue" 72.0
    "yingyu" 74.0
    "yuwen" 60.0
    "shuxue" 62.0
    "yingyu" 64.0
    Time taken: 0.153 seconds, Fetched: 9 row(s)
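The map case can be modelled the same way. The Python sketch below is only an illustration of the semantics: a dict stands in for the Hive map, and the keys are written without the literal quote characters that appear in the real table.

```python
# Stand-in for the student2 table: (id, name, score) with score as a map.
student2 = [
    (100, "student1", {"yuwen": 80.0, "shuxue": 82.0, "yingyu": 84.0}),
    (101, "student2", {"yuwen": 70.0, "shuxue": 72.0, "yingyu": 74.0}),
    (102, "student3", {"yuwen": 60.0, "shuxue": 62.0, "yingyu": 64.0}),
]

# explode(score) as (score_key, score_value):
# every map entry becomes one output row with two columns.
pairs = [
    (score_key, score_value)
    for (_id, _name, score) in student2
    for score_key, score_value in score.items()
]

for score_key, score_value in pairs:
    print(score_key, score_value)
```

Each map entry yields one two-column row, which is why the alias list in the Hive query needs exactly two names.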
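3). Limitations of explode:
a. It cannot be selected alongside other columns of the source table.
b. It cannot be used together with group by, cluster by, distribute by, or sort by.
c. UDTFs cannot be nested.
d. No other expressions are allowed in the same select.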
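3. Lateral View syntax
Because of these limitations, explode is usually combined with Lateral View. Paired with explode (or another UDTF), a single statement can produce the result set in which each source row has been split into multiple rows. Lateral view first places the rows generated by the UDTF into a virtual table, then joins that virtual table with the input rows, so the generated columns can be selected alongside the original ones.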
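The two steps just described (generate rows with the UDTF, then join them back to the input row) can be modelled outside Hive. The Python sketch below only illustrates the semantics and is not Hive's implementation; the names lateral_view and explode_array are hypothetical helpers, and the sample rows mirror student1.

```python
def explode_array(values):
    """Model of explode() on an array: one single-column row per element."""
    for value in values:
        yield (value,)


def lateral_view(rows, column, udtf, aliases):
    """For each input row, run the UDTF on row[column] and join every
    generated row back to the input row under the given column aliases."""
    for row in rows:
        for generated in udtf(row[column]):
            yield {**row, **dict(zip(aliases, generated))}


students = [
    {"id": 100, "name": "student1", "score": [80.0, 82.0, 84.0]},
    {"id": 101, "name": "student2", "score": [70.0, 72.0, 74.0]},
]

# Rough counterpart of:
#   select id, name, new_score from students
#   lateral view explode(score) t as new_score;
virtual_table = list(lateral_view(students, "score", explode_array, ["new_score"]))

for r in virtual_table:
    print(r["id"], r["name"], r["new_score"])
```

Unlike a bare explode, every generated row keeps id and name from the row it came from, which is what makes the other columns selectable.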
1). Using a single Lateral View
Again take student2 from the previous article as the example. It has the columns id, name, score, where score has type Map<subject, score>. Lateral View can be used to query each student's scores.
Selecting explode together with other columns fails:
    hive> select id,name,explode(score) as new_score from student1;
    FAILED: SemanticException [Error 10081]: UDTF's are not supported outside the SELECT clause, nor nested in expressions
Explode combined with Lateral View, selected together with other columns:
    hive> select id,name,course,value from student2 lateral view explode(score) scoretable as course,value;
    OK
    100 "student1" "yuwen" 80.0
    100 "student1" "shuxue" 82.0
    100 "student1" "yingyu" 84.0
    101 "student2" "yuwen" 70.0
    101 "student2" "shuxue" 72.0
    101 "student2" "yingyu" 74.0
    102 "student3" "yuwen" 60.0
    102 "student3" "shuxue" 62.0
    102 "student3" "yingyu" 64.0
    Time taken: 0.148 seconds, Fetched: 9 row(s)
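Once the rows are flattened like this, they can be aggregated, which a bare explode does not allow. The Python sketch below models a per-course average over the nine rows above. It is an illustration only; the variable names are hypothetical, and the HiveQL in the comment is a query of roughly the same shape rather than something taken from the original session.

```python
from collections import defaultdict

# Stand-in for student2, with map keys written without the literal quotes.
student2 = [
    (100, "student1", {"yuwen": 80.0, "shuxue": 82.0, "yingyu": 84.0}),
    (101, "student2", {"yuwen": 70.0, "shuxue": 72.0, "yingyu": 74.0}),
    (102, "student3", {"yuwen": 60.0, "shuxue": 62.0, "yingyu": 64.0}),
]

# lateral view explode(score) scoretable as course, value
flattened = [
    (sid, name, course, value)
    for sid, name, score in student2
    for course, value in score.items()
]

# Roughly: select course, avg(value) from student2
#          lateral view explode(score) scoretable as course, value
#          group by course;
values_by_course = defaultdict(list)
for _sid, _name, course, value in flattened:
    values_by_course[course].append(value)

avg_by_course = {c: sum(v) / len(v) for c, v in values_by_course.items()}
print(avg_by_course)  # {'yuwen': 70.0, 'shuxue': 72.0, 'yingyu': 74.0}
```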
2). Combining multiple Lateral Views
a. Create a new table student4 with four columns: id, name, score, score2. Here score holds the core required courses with type Map<subject, score>, and score2 holds the electives, also with type Map<subject, score>. The create-table statement:
    hive> create table student4(id int,name string,score map<string,double>,score2 map<string,double>)ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' COLLECTION ITEMS TERMINATED BY '|' MAP KEYS TERMINATED BY ':';
    OK
    Time taken: 0.221 seconds
b. Prepare the data file student4.txt:
    [root@salver158 ~]# cat student4.txt
    100,"student1","yuwen":80|"shuxue":82|"yingyu":84,"tiyu":86|"meishu":88
    101,"student2","yuwen":70|"shuxue":72|"yingyu":74,"tiyu":76|"meishu":76
    102,"student3","yuwen":60|"shuxue":62|"yingyu":64,"tiyu":68|"meishu":68
c. Load the data:
    hive> load data local inpath "/root/student4.txt" into table student4;
    Loading data to table lujs.student4
d. The data loaded successfully:
    hive> select * from student4;
    OK
    100 "student1" {"\"yuwen\"":80.0,"\"shuxue\"":82.0,"\"yingyu\"":84.0} {"\"tiyu\"":86.0,"\"meishu\"":88.0}
    101 "student2" {"\"yuwen\"":70.0,"\"shuxue\"":72.0,"\"yingyu\"":74.0} {"\"tiyu\"":76.0,"\"meishu\"":76.0}
    102 "student3" {"\"yuwen\"":60.0,"\"shuxue\"":62.0,"\"yingyu\"":64.0} {"\"tiyu\"":68.0,"\"meishu\"":68.0}
    Time taken: 0.16 seconds, Fetched: 3 row(s)
e. Query with multiple lateral view clauses:
    hive> select id,name,score_course,score_value,score2_course,score2_value from student4
        > lateral view explode(score) scoreTable1 as score_course,score_value
        > lateral view explode(score2) scoreTable2 as score2_course,score2_value;
    OK
    100 "student1" "yuwen" 80.0 "tiyu" 86.0
    100 "student1" "yuwen" 80.0 "meishu" 88.0
    100 "student1" "shuxue" 82.0 "tiyu" 86.0
    100 "student1" "shuxue" 82.0 "meishu" 88.0
    100 "student1" "yingyu" 84.0 "tiyu" 86.0
    100 "student1" "yingyu" 84.0 "meishu" 88.0
    101 "student2" "yuwen" 70.0 "tiyu" 76.0
    101 "student2" "yuwen" 70.0 "meishu" 76.0
    101 "student2" "shuxue" 72.0 "tiyu" 76.0
    101 "student2" "shuxue" 72.0 "meishu" 76.0
    101 "student2" "yingyu" 74.0 "tiyu" 76.0
    101 "student2" "yingyu" 74.0 "meishu" 76.0
    102 "student3" "yuwen" 60.0 "tiyu" 68.0
    102 "student3" "yuwen" 60.0 "meishu" 68.0
    102 "student3" "shuxue" 62.0 "tiyu" 68.0
    102 "student3" "shuxue" 62.0 "meishu" 68.0
    102 "student3" "yingyu" 64.0 "tiyu" 68.0
    102 "student3" "yingyu" 64.0 "meishu" 68.0
    Time taken: 0.188 seconds, Fetched: 18 row(s)
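The 18 rows come from a per-row cross product: each student has 3 entries in score and 2 in score2, so each student contributes 3 x 2 = 6 rows. The Python sketch below models this with itertools.product. It is only an illustration of the semantics, with dicts standing in for the Hive maps and the keys written without literal quotes.

```python
from itertools import product

# Stand-in for student4: (id, name, score, score2).
student4 = [
    (100, "student1", {"yuwen": 80.0, "shuxue": 82.0, "yingyu": 84.0},
     {"tiyu": 86.0, "meishu": 88.0}),
    (101, "student2", {"yuwen": 70.0, "shuxue": 72.0, "yingyu": 74.0},
     {"tiyu": 76.0, "meishu": 76.0}),
    (102, "student3", {"yuwen": 60.0, "shuxue": 62.0, "yingyu": 64.0},
     {"tiyu": 68.0, "meishu": 68.0}),
]

# Two chained lateral views: within each input row, every entry of score
# is paired with every entry of score2.
rows = [
    (sid, name, c1, v1, c2, v2)
    for sid, name, score, score2 in student4
    for (c1, v1), (c2, v2) in product(score.items(), score2.items())
]

for row in rows[:6]:
    print(*row)
print("rows:", len(rows))  # 3 students x 3 x 2 = 18
```

Because the row count multiplies with every additional lateral view, chaining several of them over large collections can make the result grow quickly.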
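3). Parsing JSON with lateral view. Suppose the score column in the table above holds a JSON string. We can of course handle it with get_json_object, but lateral view works here too:
a. Create a new table student5 with three columns: id, name, score, where score is a string holding a JSON document. The create-table statement:
    hive> create table student5(id int,name string,score string)ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';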
b. Prepare the data file student5.txt, with one '|'-delimited record per line to match the table definition:
    [root@salver158 ~]# cat student5.txt
    100|student1|{"yuwen":80,"shuxue":81,"yingyu":82}
    101|student2|{"yuwen":70,"shuxue":71,"yingyu":72}
    102|student3|{"yuwen":60,"shuxue":61,"yingyu":62}
c. Load the data into the student5 table:
    hive> load data local inpath '/root/student5.txt' into table student5;
d. The data loaded successfully:
    hive> select * from student5;
    OK
    100 student1 {"yuwen":80,"shuxue":81,"yingyu":82}
    101 student2 {"yuwen":70,"shuxue":71,"yingyu":72}
    102 student3 {"yuwen":60,"shuxue":61,"yingyu":62}
    Time taken: 0.092 seconds, Fetched: 3 row(s)
e. First, parse the JSON with get_json_object:
    hive> select id,name,get_json_object(score, '$.yuwen'),get_json_object(score, '$.shuxue'),get_json_object(score, '$.yingyu') from student5;
    OK
    100 student1 80 81 82
    101 student2 70 71 72
    102 student3 60 61 62
    Time taken: 0.161 seconds, Fetched: 3 row(s)
f. Parse the JSON with lateral view:
    hive> select * from student5
        > lateral view json_tuple(score,'yuwen','shuxue','yingyu') state_json as yuwen,shuxue,yingyu;
    OK
    100 student1 {"yuwen":80,"shuxue":81,"yingyu":82} 80 81 82
    101 student2 {"yuwen":70,"shuxue":71,"yingyu":72} 70 71 72
    102 student3 {"yuwen":60,"shuxue":61,"yingyu":62} 60 61 62
    Time taken: 0.111 seconds, Fetched: 3 row(s)
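Unlike explode, json_tuple emits exactly one row per input row, so the lateral view here adds columns rather than rows. The Python sketch below models that behaviour. It is a simplified stand-in and not Hive's implementation: the Python function json_tuple is a hypothetical helper, it assumes a missing key yields a null (None here), and it does not handle malformed JSON.

```python
import json


def json_tuple(json_text, *keys):
    """Simplified model: parse once, return the requested top-level keys
    as strings, with None for keys that are absent (an assumption)."""
    obj = json.loads(json_text)
    return tuple(None if obj.get(k) is None else str(obj[k]) for k in keys)


# Stand-in for student5: (id, name, score) with score as a JSON string.
student5 = [
    (100, "student1", '{"yuwen":80,"shuxue":81,"yingyu":82}'),
    (101, "student2", '{"yuwen":70,"shuxue":71,"yingyu":72}'),
    (102, "student3", '{"yuwen":60,"shuxue":61,"yingyu":62}'),
]

# select * from student5
# lateral view json_tuple(score,'yuwen','shuxue','yingyu') state_json as yuwen,shuxue,yingyu
result = [
    (sid, name, score) + json_tuple(score, "yuwen", "shuxue", "yingyu")
    for sid, name, score in student5
]

for row in result:
    print(*row)
```

Compared with calling get_json_object three times, this form names all the keys in a single call.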
That concludes the introduction to Hive's explode and lateral view syntax and how to use them. If you're interested, try typing the examples in yourself!