好程序員 Big Data Learning Path: Hive Built-in Functions

Date: 2020-09-06

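This installment of the 好程序員 big data learning path covers Hive's built-in functions. The series is updated regularly, and I hope it helps anyone who is currently learning big data.
1. Random number function: rand()
Syntax: rand(), rand(int seed) Return type: double Description: returns a random number in the range 0 to 1. If a seed is specified, the function produces a stable, repeatable sequence of random numbers.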
select rand();
select rand(10);
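2. String split function: split(str, pat)
Syntax: split(string str, string pat) Return type: array Description: splits str around pat and returns the resulting array of strings. pat is a regular expression, so special separator characters must be escaped.
-- "." is a regex metacharacter that matches any character, so it is escaped here
select split(5.0,"\\.")[0];
select split(rand(10)*100,"\\.")[0];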
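The escaping rule above is easiest to see side by side. The following is a minimal sketch with made-up literal inputs (the IP address and letter lists are illustrative only), assuming a Hive version that allows select without a from clause, as the other examples in this article do:

```sql
-- '.' and '|' are regex metacharacters, so each is escaped with a double backslash
select split('192.168.1.10', '\\.');   -- ["192","168","1","10"]
select split('a|b|c', '\\|')[1];       -- array indexes start at 0, so this picks 'b'
-- ',' has no special meaning in a regex, so no escaping is needed
select split('a,b,c', ',');            -- ["a","b","c"]
```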
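3. Substring functions: substr, substring
Syntax: substr(string A, int start), substring(string A, int start) Return type: string Description: returns the part of string A from position start to the end of the string
Syntax: substr(string A, int start, int len), substring(string A, int start, int len) Return type: string Description: returns the part of string A that begins at position start and is len characters long
-- positions are 1-based; Hive also accepts a start of 0 and treats it as 1
select substr(rand()*100,1,2);
select substring(rand()*100,1,2);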
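Because the input to the queries above is random, their output changes on every run. As a small sketch on a fixed, made-up string, the two signatures behave as follows; the negative start position in the last query counts from the end of the string:

```sql
select substr('hello world', 7);      -- 'world' (from position 7 to the end)
select substr('hello world', 1, 5);   -- 'hello' (5 characters starting at position 1)
select substr('hello world', -5);     -- 'world' (a negative start counts from the end)
```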
4. Conditional function: if
Syntax: if(boolean testCondition, T valueTrue, T valueFalseOrNull) Return type: T Description: returns valueTrue when testCondition is TRUE; otherwise returns valueFalseOrNull
select if(100>10,"this is true","this is false");
select if(2=1,"male","female");
select if(1=1,"male",(if(1=2,"female","unknown")));
select if(3=1,"male",(if(3=2,"female","unknown")));
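5. Conditional expression: CASE
First form:
Syntax: CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END Return type: T Description: if a is TRUE, returns b; if c is TRUE, returns d; otherwise returns e
Second form:
Syntax: CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END Return type: T Description: if a equals b, returns c; if a equals d, returns e; otherwise returns f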
select
case 6
when 1 then "100"
when 2 then "200"
when 3 then "300"
when 4 then "400"
else "others"
end
;

Create the table

create table if not exists cw(
flag int
)
;
load data local inpath '/home/flag' into table cw;

Second form (CASE a WHEN b THEN c)

select
case c.flag
when 1 then "100"
when 2 then "200"
when 3 then "300"
when 4 then "400"
else "others"
end
from cw c
;

First form (CASE WHEN a THEN b)

select
case
when 1=c.flag then "100"
when 2=c.flag then "200"
when 3=c.flag then "300"
when 4=c.flag then "400"
else "others"
end
from cw c
;
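6. Regular expression replace function: regexp_replace
Syntax: regexp_replace(string A, string B, string C) Return type: string Description: replaces the parts of string A that match the Java regular expression B with C. Note that some cases require escape characters. This is similar to the regexp_replace function in Oracle.
-- the "." is escaped so that it matches a literal dot rather than any character
select regexp_replace("1.jsp","\\.jsp",".html");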
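A few more calls may help show how the pattern argument is interpreted. This is a minimal sketch on made-up literals; the first query uses alternation, the second a plain literal, and the third an escaped dot anchored to the end of the string:

```sql
select regexp_replace('foobar', 'oo|ar', '');             -- 'fb'
select regexp_replace('2020-09-06', '-', '/');            -- '2020/09/06'
select regexp_replace('index.jsp', '\\.jsp$', '.html');   -- 'index.html'
```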
7. Type conversion function: cast
Syntax: cast(expr as <type>) Return type: <type> Description: returns expr converted to the given data type
select 1;
select cast(1 as double);
select cast("12" as int);
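It is also worth seeing what happens when a conversion cannot succeed. In the sketch below (made-up literals), the last query returns NULL instead of raising an error, which is how Hive reports a failed cast:

```sql
select cast('12.7' as double);   -- 12.7
select cast(12.7 as int);        -- 12 (the fractional part is dropped)
select cast('abc' as int);       -- NULL, because 'abc' is not a number
```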
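8. String concatenation function: concat; concatenation with a separator: concat_ws
Syntax: concat(string A, string B…) Return type: string Description: returns the result of concatenating the input strings; any number of input strings is supported
Syntax: concat_ws(string SEP, string A, string B…) Return type: string Description: returns the result of concatenating the input strings, with SEP as the separator between them
-- "+" is arithmetic in Hive, not concatenation: the non-numeric strings become NULL, so this returns NULL
select "千峯" + 1603 + "class";
select concat("千峯",1603,"class");
select concat_ws("|","千峯","1603","class");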
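Two details of these functions are easy to miss: concat returns NULL as soon as any argument is NULL, and concat_ws also accepts an array of strings. A minimal sketch with made-up literals:

```sql
select concat('a', null, 'b');                 -- NULL: one NULL argument makes the whole result NULL
select concat_ws('|', array('a', 'b', 'c'));   -- 'a|b|c'
select concat_ws('-', '2020', '09', '06');     -- '2020-09-06'
```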
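9. Ranking functions:
row_number(): no ties; every row gets its own position
rank(): ties share a position, and the following positions are skipped
dense_rank(): ties share a position, and no positions are skipped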

Data

id class score
1 1 90
2 1 85
3 1 87
4 1 60
5 2 82
6 2 70
7 2 67
8 2 88
9 2 93

Desired result: the top three in each class (id, class, score, rank)

1 1 90 1
3 1 87 2
2 1 85 3
9 2 93 1
8 2 88 2
5 2 82 3

create table if not exists uscore(
uid int,
classid int,
score double
)
row format delimited fields terminated by '\t'
;
load data local inpath '/home/uscore' into table uscore;
select
u.uid,
u.classid,
u.score
from uscore u
group by u.classid,u.uid,u.score
limit 3
;
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn
from uscore u
;
Take the top three in each class
select
t.uid,
t.classid,
t.score
from
(
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn
from uscore u
) t
where t.rn < 4
;
Compare the three ranking functions
select
u.uid,
u.classid,
u.score,
row_number() over(distribute by u.classid sort by u.score desc) rn,
rank() over(distribute by u.classid sort by u.score desc) rank,
dense_rank() over(distribute by u.classid sort by u.score desc) dr
from uscore u
;
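The sample data contains no tied scores, so the three functions return identical numbers on it. The sketch below repeats the comparison using the standard partition by / order by spelling of the window clause, assuming the uscore table defined above, and the closing comment shows what each function would return for a hypothetical class with scores 90, 90 and 85:

```sql
-- Assumes the uscore table created earlier in this article
select
u.uid,
u.classid,
u.score,
row_number() over (partition by u.classid order by u.score desc) rn,
rank()       over (partition by u.classid order by u.score desc) rk,
dense_rank() over (partition by u.classid order by u.score desc) dr
from uscore u
;
-- For hypothetical tied scores 90, 90, 85 within one class:
--   rn: 1, 2, 3   (ties are broken arbitrarily)
--   rk: 1, 1, 3   (position 2 is skipped)
--   dr: 1, 1, 2   (no position is skipped)
```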
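10. Aggregate functions:
min() max() count() count(distinct ) sum() avg()
count(1): counts every row, regardless of whether its columns hold values
count(*): also counts every row, including rows in which every column is NULL
count(col): counts the rows in which col is not NULL
count(distinct col): counts the distinct non-NULL values of col
11. NULL handling
Almost any operation that involves NULL returns NULL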
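The count variants can be compared in a single query. This sketch assumes the uscore table from the ranking section; the column aliases are made up for illustration. If the nine sample rows load as shown, all four counts should come out the same, since no score is NULL or repeated, and they only start to differ once NULL or duplicate scores appear:

```sql
-- Assumes the uscore table created in the ranking section
select
count(*) all_rows,
count(1) all_rows_too,
count(u.score) non_null_scores,
count(distinct u.score) distinct_scores
from uscore u
;
```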
select 1+null;
select 1/0;
select null%2;
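When NULL propagation is not wanted, a common approach is to substitute a default value before the operation. The following is a minimal sketch using coalesce and is null; the typed cast(null as int) is used here only to make the NULL's type explicit:

```sql
select 1 + coalesce(cast(null as int), 0);                         -- 1: the NULL is replaced by 0 first
select cast(null as int) is null;                                  -- true
select if(cast(null as int) is null, 'missing', 'present');        -- 'missing'
```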
12. Equality operations
select null=null;   -- NULL
select null<=>null; -- true