Hive Basics

Types of Hive joins:

 1. join

 2. left outer join

 3. right outer join

 4. full outer join

 5. Left Semi Join

 6. Map Side Join

 7. Common Join

 8. Skew Join
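
The list above mixes join syntax (items 1 to 5) with execution strategies (items 6 to 8). The following is a minimal sketch of how each might look in HiveQL, assuming two hypothetical tables a(key, value) and b(key, value). The configuration properties shown are standard Hive settings, but their defaults differ between versions, and depending on version and configuration the MAPJOIN hint may be ignored in favor of automatic conversion.

```sql
-- Assumed (hypothetical) tables: a(key, value) and b(key, value).

-- 1. join: only keys present in both tables.
SELECT a.key, a.value, b.value
FROM a JOIN b ON (a.key = b.key);

-- 2. left outer join: every row of a; b columns are NULL when unmatched.
SELECT a.key, a.value, b.value
FROM a LEFT OUTER JOIN b ON (a.key = b.key);

-- 3. right outer join: every row of b; a columns are NULL when unmatched.
SELECT a.key, a.value, b.value
FROM a RIGHT OUTER JOIN b ON (a.key = b.key);

-- 4. full outer join: every row of both tables.
SELECT a.key, a.value, b.value
FROM a FULL OUTER JOIN b ON (a.key = b.key);

-- 6. Map Side Join: the small table b is loaded into memory by each mapper,
--    so no reduce phase is needed for the join.
SELECT /*+ MAPJOIN(b) */ a.key, a.value, b.value
FROM a JOIN b ON (a.key = b.key);

-- Alternatively, let Hive convert eligible joins to map joins by itself.
SET hive.auto.convert.join=true;

-- 7. Common Join: the ordinary shuffle (reduce-side) join that Hive uses
--    when a map join does not apply. No special syntax is required.

-- 8. Skew Join: ask Hive to handle heavily skewed join keys separately.
SET hive.optimize.skewjoin=true;
```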


Left Semi Join

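Older versions of Hive did not support IN/EXISTS subqueries (they are supported now!). LEFT SEMI JOIN is a more efficient implementation of IN/EXISTS.

In Hive versions without IN/EXISTS clauses, such subqueries have to be rewritten as a LEFT SEMI JOIN. A LEFT SEMI JOIN passes only the table's join key to the map phase. If the keys are small enough, it still runs as a map join; otherwise it runs as a common join.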

eg:

SELECT a.key, a.value FROM a WHERE a.key IN (SELECT b.key FROM b);

can be replaced by the following statement:

SELECT a.key, a.value FROM a LEFT SEMI JOIN b ON (a.key = b.key);
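
Two properties of LEFT SEMI JOIN are easier to see side by side with an ordinary join: each row of a is returned at most once no matter how many rows of b match, and columns of b may only be referenced in the ON clause. A minimal sketch, reusing the tables a and b from the example (the columns key and value are assumed):

```sql
-- Semi join: each matching row of a appears once,
-- even if b contains the same key several times.
SELECT a.key, a.value
FROM a LEFT SEMI JOIN b ON (a.key = b.key);

-- Ordinary join: a row of a is repeated once per matching row of b.
SELECT a.key, a.value
FROM a JOIN b ON (a.key = b.key);

-- Equivalent EXISTS form, for Hive versions that support
-- subqueries in the WHERE clause.
SELECT a.key, a.value
FROM a
WHERE EXISTS (SELECT 1 FROM b WHERE a.key = b.key);

-- Not allowed: the right-hand table is referenced outside the ON clause.
-- SELECT a.key, b.value FROM a LEFT SEMI JOIN b ON (a.key = b.key);
```

If columns of b are needed in the result, a regular join is the appropriate choice instead.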


case-when:

eg:

SELECT platform, CASE WHEN req_interface = '/api/check.do' THEN 'check' ELSE 'not check' END FROM test_nginx_log WHERE p_hour = 2014070118 LIMIT 10;
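
A CASE expression is often combined with aggregation, for instance to count how many requests fall into each label. The sketch below reuses the table and columns from the example; the aliases check_type and requests are hypothetical names chosen for illustration. Because Hive generally does not accept a SELECT alias in GROUP BY, the CASE expression is repeated there.

```sql
-- Count requests per platform and per CASE label for one hour partition.
SELECT platform,
       CASE WHEN req_interface = '/api/check.do' THEN 'check'
            ELSE 'not check'
       END AS check_type,          -- hypothetical alias
       COUNT(*) AS requests        -- hypothetical alias
FROM test_nginx_log
WHERE p_hour = 2014070118
GROUP BY platform,
         CASE WHEN req_interface = '/api/check.do' THEN 'check'
              ELSE 'not check'
         END;
```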
