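For routine JSON handling in Hive, get_json_object(col, '$.key') (col and key being placeholders) is enough for simple JSON. Compound JSON, such as the [{...},{...}] strings handled below, has to be preprocessed before it can be parsed.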
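As a minimal sketch of the simple case, the statements below read fields from a flat JSON object. The JSON literal is made up for illustration, and the statements assume a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- Illustrative literal, not data from the original post.
SELECT get_json_object('{"source":"web","monthSales":120}', '$.source');      -- web
SELECT get_json_object('{"source":"web","monthSales":120}', '$.monthSales');  -- 120 (returned as a string)
-- A key that does not exist yields NULL.
SELECT get_json_object('{"source":"web","monthSales":120}', '$.missing');     -- NULL
```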
explode: splitting one field into multiple rows
select explode(split(col, ',')) as abc from explode_lateral_view;
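To make the effect concrete, here is a small sketch on a literal string instead of a table column (the literal is an assumption for illustration). Note that when a UDTF such as explode is used directly in the SELECT list, Hive does not allow other expressions next to it, which is what motivates LATERAL VIEW.

```sql
-- split() turns the string into an array; explode() emits one row per element.
SELECT explode(split('a,b,c', ',')) AS abc;
-- abc
-- a
-- b
-- c
```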
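LATERAL VIEW: splitting a single row into multiple rows
A lateral view is meant to be used together with explode (or another UDTF): a single statement produces the result set in which each input row has been split into multiple rows.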
select get_json_object(concat('{',sale_info_r,'}'),'$.monthSales') as monthSales from explode_lateral_view LATERAL VIEW explode(split(regexp_replace(regexp_replace(sale_info,'\\[\\{',''),'}]',''),'},\\{'))sale_info as sale_info_r;
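The following sketch runs the same pattern on one made-up row so the intermediate pieces are visible. The CTE stands in for the explode_lateral_view table, and the sample JSON and the alias s are assumptions for illustration.

```sql
WITH explode_lateral_view AS (
  -- Hypothetical sample row: a JSON array holding two flat objects.
  SELECT 1 AS id,
         '[{"source":"web","monthSales":120},{"source":"app","monthSales":80}]' AS sale_info
)
SELECT id,
       sale_info_r,  -- one fragment per array element, without the outer braces
       get_json_object(concat('{', sale_info_r, '}'), '$.monthSales') AS monthSales
FROM explode_lateral_view
LATERAL VIEW explode(
  split(
    -- strip the leading "[{" and the trailing "}]", then cut on "},{"
    regexp_replace(regexp_replace(sale_info, '\\[\\{', ''), '}]', ''),
    '},\\{'
  )
) s AS sale_info_r;
-- id  sale_info_r                        monthSales
-- 1   "source":"web","monthSales":120    120
-- 1   "source":"app","monthSales":80     80
```

Because the braces are stripped by the split, concat('{', ..., '}') is needed to turn each fragment back into a JSON object before get_json_object can read it.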
Unified version
The statement below converts a row holding this kind of JSON into a fully two-dimensional table.
select t1.id,
       get_json_object(col, '$.key1') as value1,
       get_json_object(col, '$.key2') as value2
from (
  select id, s.col as col
  from table_a
  lateral view explode(split(regexp_replace(regexp_extract(json, '^\\[(.+)\\]$', 1), '\\},\\s*\\{', '\\}\\|\\|\\{'), '\\|\\|')) s as col
) t1;
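A self-contained sketch of this delimiter approach follows. The sample row, the key names source and score, and the separator pattern '\\},\\s*\\{' (a "}" and "{" joined by a comma and optional whitespace) are assumptions for illustration. The approach only suits flat objects: a nested object, or a string value containing "},{", would be cut in the wrong place.

```sql
WITH table_a AS (
  -- Hypothetical sample row; note the space after the comma between objects.
  SELECT 1 AS id,
         '[{"source":"web","score":4.5}, {"source":"app","score":4.8}]' AS json
)
SELECT t1.id,
       get_json_object(col, '$.source') AS source,
       get_json_object(col, '$.score')  AS score
FROM (
  SELECT id, s.col AS col
  FROM table_a
  LATERAL VIEW explode(
    split(
      regexp_replace(
        regexp_extract(json, '^\\[(.+)\\]$', 1),  -- drop the outer [ ]
        '\\},\\s*\\{',                            -- boundary between two objects
        '\\}\\|\\|\\{'),                          -- becomes }||{
      '\\|\\|')                                   -- split on the || marker
  ) s AS col
) t1;
-- id  source  score
-- 1   web     4.5
-- 1   app     4.8
```

Each exploded element keeps its own braces here, so no concat is needed before calling get_json_object.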
Or, another version:
select get_json_object(concat('{',sale_info_1,'}'),'$.source') as source,
       get_json_object(concat('{',sale_info_1,'}'),'$.monthSales') as monthSales,
       get_json_object(concat('{',sale_info_1,'}'),'$.userCount') as userCount,
       get_json_object(concat('{',sale_info_1,'}'),'$.score') as score
from explode_lateral_view
LATERAL VIEW explode(split(regexp_replace(regexp_replace(sale_info,'\\[\\{',''),'}]',''),'},\\{')) sale_info as sale_info_1;
Converting Hive data into JSON
concat('{\"name\":\"',name,'\",\"cus_nam\":\"',NVL(t2.cus_nam, ''), '\",\"orderNo\":\"', NVL(orderNo, ''), '\",\"ord_no\":\"', NVL(t1.ord_no, ''), '\",\"trigger\":\"', NVL(trigger, ''), '\",\"assignmentOfClaims\":\"', NVL(assignmentOfClaims, ''), '\"}') as value
The result was parsed back with get_json_object and tested without errors.
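A reduced round trip might look like the sketch below. It uses two of the fields with made-up values (the CTE t and its contents are assumptions) and wraps every column in NVL, since concat returns NULL as soon as any argument is NULL. Values containing double quotes or backslashes would need escaping, which this sketch does not handle.

```sql
WITH t AS (
  -- Hypothetical sample row with one NULL column.
  SELECT 'alice' AS name, CAST(NULL AS STRING) AS orderNo
)
SELECT value,
       get_json_object(value, '$.name') AS name_from_json
FROM (
  SELECT concat('{"name":"', NVL(name, ''),
                '","orderNo":"', NVL(orderNo, ''), '"}') AS value
  FROM t
) j;
-- value                              name_from_json
-- {"name":"alice","orderNo":""}      alice
```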
Hive regex matching
regexp_extract(col, pattern, group_index)
Matching examples
select regexp_extract('honey123moon', 'honey([0-9]+)(moon)', 0);
select regexp_extract('x=a3&x=18abc&x=2&y=3&x=4', 'x=([0-9]+)([a-z]+)', 1);
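The role of the third argument is easiest to see by varying it on a fixed input. The sketch below uses the 'foothebar' example from the Hive UDF documentation: index 0 returns the whole match, and index n returns the n-th capturing group.

```sql
SELECT regexp_extract('foothebar', 'foo(.*?)(bar)', 0);  -- foothebar (entire match)
SELECT regexp_extract('foothebar', 'foo(.*?)(bar)', 1);  -- the       (first group)
SELECT regexp_extract('foothebar', 'foo(.*?)(bar)', 2);  -- bar       (second group)
```

Only the first match in the input is considered, so a pattern that can match several times returns the groups of the leftmost occurrence.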