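1. over(partition by ...) is mainly used together with aggregate functions such as sum(), count(), and avg() to perform grouped aggregation.
Example: aggregate by the date day_id and the machine code mac_id to get each machine's total sales for each day, sum_num. Hive SQL statement: select day_id,mac_id,mac_color,day_num,sum(day_num)over(partition by day_id,mac_id order by day_id) sum_num from test_temp_mac_id;
Note: day_id, mac_id, mac_color, and day_num are the original columns returned by the query; sum_num is the computed result.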
day_id | mac_id | mac_color | day_num | sum_num |
20171011 | 1292 | 金色 | 1 | 89 |
20171011 | 1292 | 金色 | 14 | 89 |
20171011 | 1292 | 金色 | 2 | 89 |
20171011 | 1292 | 金色 | 11 | 89 |
20171011 | 1292 | 黑色 | 2 | 89 |
20171011 | 1292 | 粉金 | 58 | 89 |
20171011 | 1292 | 金色 | 1 | 89 |
20171011 | 2013 | 金色 | 10 | 22 |
20171011 | 2013 | 金色 | 9 | 22 |
20171011 | 2013 | 金色 | 2 | 22 |
20171011 | 2013 | 金色 | 1 | 22 |
20171012 | 1292 | 金色 | 5 | 18 |
20171012 | 1292 | 金色 | 7 | 18 |
20171012 | 1292 | 金色 | 5 | 18 |
20171012 | 1292 | 粉金 | 1 | 18 |
20171012 | 2013 | 粉金 | 1 | 7 |
20171012 | 2013 | 金色 | 6 | 7 |
20171013 | 1292 | 黑色 | 1 | 1 |
20171013 | 2013 | 粉金 | 2 | 2 |
20171011 | 12460 | 茶花金 | 1 | 1 |
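The behaviour in the table above can be reproduced on a small scale. The following is a minimal sketch, not the original Hive setup: it assumes SQLite (version 3.25 or newer, for window function support) accessed through Python's sqlite3 module as a stand-in for Hive, and it loads only the six 20171012 rows from the table above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Hive.
# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_temp_mac_id "
    "(day_id TEXT, mac_id TEXT, mac_color TEXT, day_num INTEGER)"
)

# The 20171012 rows copied from the result table above.
rows = [
    ("20171012", "1292", "金色", 5),
    ("20171012", "1292", "金色", 7),
    ("20171012", "1292", "金色", 5),
    ("20171012", "1292", "粉金", 1),
    ("20171012", "2013", "粉金", 1),
    ("20171012", "2013", "金色", 6),
]
conn.executemany("INSERT INTO test_temp_mac_id VALUES (?, ?, ?, ?)", rows)

# Same statement as in the text: the per-partition total is attached to
# every input row, and no rows are collapsed.
window_result = conn.execute(
    """
    SELECT day_id, mac_id, mac_color, day_num,
           SUM(day_num) OVER (PARTITION BY day_id, mac_id ORDER BY day_id) AS sum_num
    FROM test_temp_mac_id
    """
).fetchall()

# Every 1292 row carries sum_num 18, every 2013 row carries sum_num 7.
for row in window_result:
    print(row)
```

Because every row inside a partition shares the same day_id, the order by day_id clause treats them all as peers, so each row receives the full partition total rather than a running total.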
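2. Difference between over(partition by ...) and group by
If group by is used to implement the aggregation from section 1 (each machine's total sales per day, sum_num, grouped by the date day_id and the machine code mac_id),
the Hive SQL statement is: select day_id,mac_id,sum(day_num) sum_num from test_temp_mac_id group by day_id,mac_id order by day_id; The result is shown in the table below.
Note: group by achieves the same grouped aggregation, but the select list cannot contain columns that are neither grouping columns nor aggregates; otherwise the statement fails with an error. The main difference is therefore that a Hive SQL statement with group by can only return the grouping columns and the aggregates, one row per group, whereas a statement with over(partition by ...) can return all columns and keeps every input row.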
day_id | mac_id | sum_num |
20171011 | 124609 | 1 |
20171011 | 20130 | 22 |
20171011 | 12922 | 89 |
20171012 | 12922 | 18 |
20171012 | 20130 | 7 |
20171013 | 12922 | 1 |
20171013 | 20130 | 2 |
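To make the difference concrete, the two approaches can be run side by side on the same data. This is again a minimal sketch under assumptions: SQLite 3.25 or newer (via Python's sqlite3) stands in for Hive, and the data is the six 20171012 rows from the first table in section 1.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Hive.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_temp_mac_id "
    "(day_id TEXT, mac_id TEXT, mac_color TEXT, day_num INTEGER)"
)
conn.executemany(
    "INSERT INTO test_temp_mac_id VALUES (?, ?, ?, ?)",
    [
        ("20171012", "1292", "金色", 5),
        ("20171012", "1292", "金色", 7),
        ("20171012", "1292", "金色", 5),
        ("20171012", "1292", "粉金", 1),
        ("20171012", "2013", "粉金", 1),
        ("20171012", "2013", "金色", 6),
    ],
)

# group by: one output row per (day_id, mac_id) group.
group_result = conn.execute(
    """
    SELECT day_id, mac_id, SUM(day_num) AS sum_num
    FROM test_temp_mac_id
    GROUP BY day_id, mac_id
    ORDER BY day_id, mac_id
    """
).fetchall()

# over(partition by ...): one output row per input row, with all columns.
window_result = conn.execute(
    """
    SELECT day_id, mac_id, mac_color, day_num,
           SUM(day_num) OVER (PARTITION BY day_id, mac_id) AS sum_num
    FROM test_temp_mac_id
    """
).fetchall()

print("group by rows:", len(group_result))   # 2 groups
print("window rows:  ", len(window_result))  # 6 input rows preserved
print(group_result)
```

One caveat about this stand-in: SQLite is more lenient than Hive and accepts unaggregated, ungrouped columns in a group by query, so the sketch does not reproduce the Hive error described in the note above.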
https://blog.csdn.net/qq_37325859/article/details/78222712