Suppose we have the following JSON string:
{"viewdata":[{"city_id":"59","position_id":0,"qd_title":"網紅打卡地","list_id":35},{"city_id":"59","position_id":1,"qd_title":"看青山遊綠水","list_id":37}]}
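We need to extract every qd_title from the JSON array and turn the values into a Hive array. Two methods are described below. The first relies on get_json_object followed by string cleanup.
1. First, use the get_json_object function to extract the values. Note that what it returns is a string that looks like an array, not an actual array.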
select get_json_object('{"viewdata":[{"city_id":"39","position_id":0,"qd_title":"網紅打卡地","list_id":135}, {"city_id":"39","position_id":1,"qd_title":"看青山遊綠水","list_id":327}]}',
'$.viewdata[*].qd_title')
-- Returns the following. Note that this is not an array, only a string:
["網紅打卡地","看青山遊綠水"]
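To make the "string, not array" point concrete, here is a minimal Python sketch that mimics the behaviour outside Hive. The helper name `extract_titles_as_string` is hypothetical, and it only approximates what get_json_object returns for the `$.viewdata[*].qd_title` path; it is not Hive's implementation.

```python
import json

RAW = '{"viewdata":[{"city_id":"39","position_id":0,"qd_title":"網紅打卡地","list_id":135}, {"city_id":"39","position_id":1,"qd_title":"看青山遊綠水","list_id":327}]}'


def extract_titles_as_string(raw):
    """Hypothetical stand-in for get_json_object(raw, '$.viewdata[*].qd_title')."""
    titles = [item["qd_title"] for item in json.loads(raw)["viewdata"]]
    # Serialize back to text: the caller receives one string, not a list.
    return json.dumps(titles, ensure_ascii=False, separators=(",", ":"))


result = extract_titles_as_string(RAW)
print(type(result).__name__)  # the value is a str
print(result)
```

Because the value is a single string, functions that expect an array (explode, size, indexing) cannot be applied to it directly, which is why the next steps are needed.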
2. Remove the [ ] and " characters from the string, leaving a comma-separated string:
regexp_replace('${the string obtained above}','(\\[|\\]|")','')
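The same character-stripping step can be tried locally. The sketch below uses Python's `re.sub` with the same pattern; a Python raw string needs only a single backslash where the Hive string literal above uses `\\`. The variable names are illustrative only.

```python
import re

# The array-like string produced by the previous step.
array_like = '["網紅打卡地","看青山遊綠水"]'

# Same alternation as the Hive call: drop every [, ] and " character.
cleaned = re.sub(r'(\[|\]|")', "", array_like)
print(cleaned)
```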
3. Split the cleaned string into an array with the split function:
select
split(
regexp_replace(
get_json_object('{"viewdata":[{"city_id":"39","position_id":0,"qd_title":"網紅打卡地","list_id":135}, {"city_id":"39","position_id":1,"qd_title":"看青山遊綠水","list_id":327}]}',
'$.viewdata[*].qd_title'),
'(\\[|\\]|")',''),
",")
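The strip-then-split chain can be sketched in Python as follows. The function name `titles_via_split` is hypothetical. The second call illustrates a limitation of this approach: a title that itself contains a comma is cut into separate elements.

```python
import re


def titles_via_split(array_like):
    """Strip [, ] and " and then split on commas, as in the Hive expression."""
    cleaned = re.sub(r'(\[|\]|")', "", array_like)
    return cleaned.split(",")


titles = titles_via_split('["網紅打卡地","看青山遊綠水"]')
print(titles)

# A made-up input whose first title contains a comma: it is split in two.
broken = titles_via_split('["a,b","c"]')
print(broken)
```

If titles in the real data can contain commas or double quotes, this method will miscount them.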
4. Putting it all together, use LATERAL VIEW to flatten the array and compute the statistics:
SELECT qdtitle,COUNT(DISTINCT uuid) uv
FROM ba_travel.bas_log_sdk_mt_mv a LATERAL VIEW explode(split(regexp_replace(get_json_object(a.event_attribute['custom'],'$.viewdata[*].qd_title'),'(\\[|\\]|")',''),",")) b AS qdtitle
GROUP BY qdtitle
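For readers who want to check the aggregation logic without a Hive cluster, here is a rough Python equivalent. The rows in `ROWS` are made-up sample data, and `titles_of` and `uv_by_title` are hypothetical helpers that approximate the query, not a reproduction of Hive's execution.

```python
import json
import re
from collections import defaultdict

# Hypothetical log rows: (uuid, event_attribute['custom']). Values are invented.
ROWS = [
    ("u1", '{"viewdata":[{"qd_title":"網紅打卡地"},{"qd_title":"看青山遊綠水"}]}'),
    ("u2", '{"viewdata":[{"qd_title":"網紅打卡地"}]}'),
    ("u1", '{"viewdata":[{"qd_title":"網紅打卡地"}]}'),
]


def titles_of(custom):
    """Approximates get_json_object + regexp_replace + split for one row."""
    array_like = json.dumps(
        [item["qd_title"] for item in json.loads(custom)["viewdata"]],
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return re.sub(r'(\[|\]|")', "", array_like).split(",")


def uv_by_title(rows):
    users = defaultdict(set)
    for uuid, custom in rows:
        for title in titles_of(custom):  # explode: one output row per title
            users[title].add(uuid)       # a set per title gives COUNT(DISTINCT uuid)
    return {title: len(ids) for title, ids in users.items()}


uv = uv_by_title(ROWS)
print(uv)
```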
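The second method works on the raw JSON string directly.
1. Notice that each element of the JSON array is enclosed in {} and that elements are separated by commas, so the string can be split on `},`: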
-- event_attribute['custom'] holds the JSON string shown above.
-- split takes a regular expression, so the brace is escaped.
split(event_attribute['custom'],'\\},')
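A quick Python sketch of this split on the sample string from the top of the article. `re.split` is used because Hive's split also treats its delimiter as a regular expression. The second split is included only to show that a delimiter of `"}` finds nothing in this sample, since each element ends with the numeric list_id.

```python
import re

RAW = '{"viewdata":[{"city_id":"59","position_id":0,"qd_title":"網紅打卡地","list_id":35},{"city_id":"59","position_id":1,"qd_title":"看青山遊綠水","list_id":37}]}'

# Split on the literal text "}," (the brace is escaped for the regex engine).
pieces = re.split(r"\},", RAW)
for piece in pieces:
    print(piece)

# The text '"}' never occurs in this sample, so the string stays in one piece.
pieces_quote_brace = re.split(r'"\}', RAW)
print(len(pieces_quote_brace))
```

Each piece is a fragment rather than valid JSON, which is fine here because the next step only searches it with a regular expression.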
2. Apply a regular expression to each element to extract the value of qd_title:
-- qd_titles is one element of the array produced by the split above
regexp_extract(qd_titles,'qd_title...([^"]+)',1)
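The pattern can be checked in Python with the sketch below. The three dots after qd_title consume the `":"` that separates the key from its value, and the group captures everything up to the next double quote. `extract_title` is a hypothetical helper, and returning an empty string when nothing matches is an assumption made for this sketch.

```python
import re

PATTERN = re.compile(r'qd_title...([^"]+)')


def extract_title(piece):
    """Return capture group 1, or '' when the pattern does not match (assumed)."""
    match = PATTERN.search(piece)
    return match.group(1) if match else ""


piece = '{"city_id":"59","position_id":1,"qd_title":"看青山遊綠水","list_id":37}]}'
title = extract_title(piece)
print(title)

# The pattern assumes exactly three characters between the key and the value,
# so a space after the colon (made-up input) prevents a match.
spaced = extract_title('{"qd_title": "x"}')
print(repr(spaced))
```

The pattern is therefore tied to the compact JSON formatting shown in the sample.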
3. Putting it all together, use LATERAL VIEW to flatten the array:
SELECT regexp_extract(qd_titles,'qd_title...([^"]+)',1) title,
COUNT(DISTINCT uuid) uv
FROM ba_travel.bas_log_sdk_mt_mv a LATERAL VIEW explode(split(event_attribute['custom'],'\\},')) b AS qd_titles
GROUP BY regexp_extract(qd_titles,'qd_title...([^"]+)',1)