Hive Join Basics

LEFT JOIN is shorthand for LEFT OUTER JOIN; a left join is an outer join by default.

Inner Join

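The Inner Join logical operator returns every row that satisfies the join of the first (top) input with the second (bottom) input. This has the same effect as a SELECT over multiple tables with the join condition in the WHERE clause, so the explicit form is not used very often.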
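As a rough illustration of that equivalence, the two queries below should return the same rows. This is only a sketch: it assumes the t1 and t2 tables created later in this post, and a Hive version that accepts comma-separated tables in FROM (implicit join notation, available from Hive 0.13.0).

```sql
-- Explicit inner join on partition and name.
SELECT t1.*, t2.*
FROM t1
INNER JOIN t2
  ON (t1.hour = t2.hour AND t1.name = t2.name);

-- Multi-table SELECT: the join condition moves into WHERE.
-- Assumes implicit join notation is available (Hive 0.13.0+).
SELECT t1.*, t2.*
FROM t1, t2
WHERE t1.hour = t2.hour
  AND t1.name = t2.name;
```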

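An outer join (here, a left outer join) also returns every row that satisfies the join of the first (top) input with the second (bottom) input. In addition, it returns any rows from the first input that have no matching row in the second input.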

The key is that last sentence: an outer join returns more rows. So what is usually called a left join is a left outer join.


CREATE TABLE t1(
  name string,
  age   int
)
PARTITIONED BY( hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ':'
LINES TERMINATED BY '\n'
STORED AS textFILE;


CREATE TABLE t2(
  name string,
  sex   int
)
PARTITIONED BY( hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ':'
LINES TERMINATED BY '\n'
STORED AS textFILE;


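Because the second column is declared as int but the value we load into it is a character, Hive displays NULL for it.

Row in the data file:

b       c

Row as shown by Hive:

b       NULL    2011111111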
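The NULL comes from Hive applying the table schema when the text file is read, not when it is loaded: a field that cannot be parsed as int is returned as NULL instead of failing the load. A minimal sketch of how one might confirm this, assuming the t2 table above and a Hive version that allows SELECT without a FROM clause (0.13.0 or later):

```sql
-- Casting a non-numeric string to INT yields NULL rather than an error.
SELECT CAST('c' AS INT);

-- List the loaded rows whose int column could not be parsed.
SELECT name, sex, hour
FROM t2
WHERE sex IS NULL;
```

Note that the second query cannot distinguish an unparseable value from one that was genuinely missing in the file.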


LOAD DATA LOCAL INPATH '/opt/smc/xuanli/data/t1.txt' OVERWRITE INTO TABLE test.t1 partition (hour='2011111112');

LOAD DATA LOCAL INPATH '/opt/smc/xuanli/data/t2.txt' OVERWRITE INTO TABLE test.t2 partition (hour='2011111111');


select t1.*,t2.* from t1 left join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 right join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 INNER  join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 full join t2  on (t1.hour=t2.hour and t1.name=t2.name);

SELECT t1.name  FROM t1 LEFT SEMI JOIN t2  on (t1.hour=t2.hour and t1.name=t2.name);
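LEFT SEMI JOIN returns each t1 row at most once when a match exists in t2, and columns of t2 may be referenced only in the ON clause, not in the SELECT list. The same intent can be sketched with a correlated EXISTS subquery, assuming a Hive version that supports subqueries in WHERE (0.13.0 or later):

```sql
-- Same intent as the LEFT SEMI JOIN above: names in t1 that
-- have a match in t2 on both partition and name.
SELECT t1.name
FROM t1
WHERE EXISTS (
  SELECT 1
  FROM t2
  WHERE t1.hour = t2.hour
    AND t1.name = t2.name
);
```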


Rows that exist in t1 but not in t2:

select t1.*,t2.* from t1 left join t2  on (t1.hour=t2.hour and t1.name=t2.name) where t1.name is not null and t2.name is null;

select t1.*,t3.* from t1 left join (select name,hour from test.t2) t3  on (t1.hour=t3.hour and t1.name=t3.name) where t1.name is not null and t3.name is null;
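The two queries above use the "left join, then keep rows where the right side is NULL" pattern. As an alternative sketch, and assuming a Hive version with NOT EXISTS support in WHERE (0.13.0 or later), the same anti-join can be written as:

```sql
-- t1 rows with no t2 row sharing the same hour and name.
SELECT t1.*
FROM t1
WHERE t1.name IS NOT NULL   -- mirrors the filter in the queries above
  AND NOT EXISTS (
    SELECT 1
    FROM t2
    WHERE t1.hour = t2.hour
      AND t1.name = t2.name
  );
```

This form returns only the t1 columns; the left-join versions also return the t2 (or t3) columns, which are all NULL for the rows that survive the filter.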
