Presto Functions

Date: 2019-11-12

Tags: presto functions

What is Presto, and what are its strengths? The official documentation tells us:

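Presto is a distributed SQL query engine for querying large data sets spread across one or more heterogeneous data sources.
Do not assume that because Presto can parse SQL, it is a standard database.
Presto is designed for data warehousing and analytics: data analysis, large-scale aggregation, and report generation. These workloads are usually classified as online analytical processing (OLAP).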

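So when a company needs cross-database analysis (typically because the business databases are spread across departments) and some data has to be joined with data owned by other departments, Presto is worth considering. At present, however, statistical queries against MySQL hit a performance bottleneck. One option is to archive the data into HDFS by time period to make those statistical queries more efficient.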
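As a rough sketch of the cross-database case, a single Presto query can join tables that live in different catalogs. Everything named below is hypothetical: it assumes a catalog called `mysql` backed by the MySQL connector and a catalog called `hive` pointing at data archived to HDFS, and the schema, table, and column names are made up for illustration.

```sql
-- Hypothetical catalogs: `mysql` (live business data) and `hive` (archive on HDFS).
-- All schema, table, and column names are illustrative assumptions.
SELECT c.department,
       count(*)      AS order_count,
       sum(o.amount) AS total_amount
FROM hive.archive.orders AS o        -- data archived by time period
JOIN mysql.crm.customers AS c        -- table owned by another department
  ON o.customer_id = c.id
WHERE o.order_date >= DATE '2019-01-01'
GROUP BY c.department;
```

Tables are addressed as catalog.schema.table, which is what lets one statement span both sources.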

Presto functions: https://www.alibabacloud.com/help/zh/doc-detail/64038.htm
(1) Bitwise functions
Presto provides the following bitwise functions:

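Function	Syntax	Description
bit_count	bit_count(x, bits) → bigint	Returns the number of bits set to 1 in the bits-wide two's complement representation of x
bitwise_and	bitwise_and(x, y) → bigint	Bitwise AND
bitwise_not	bitwise_not(x) → bigint	Bitwise NOT
bitwise_or	bitwise_or(x, y) → bigint	Bitwise OR
bitwise_xor	bitwise_xor(x, y) → bigint	Bitwise XOR
bitwise_and_agg	bitwise_and_agg(x) → bigint	Aggregate function: returns the bitwise AND of all input values of x
bitwise_or_agg	bitwise_or_agg(x) → bigint	Aggregate function: returns the bitwise OR of all input values of x
Examples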

SELECT bit_count(9, 64); -- 2
SELECT bit_count(9, 8); -- 2
SELECT bit_count(-7, 64); -- 62
SELECT bit_count(-7, 8); -- 6
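
The negative cases follow from two's complement: -7 is ...11111001, so exactly two bits are zero, which leaves 62 set bits at width 64 and 6 at width 8. The remaining functions in the table can be tried the same way. The operands below are arbitrary illustrative picks (19 is 10011 and 25 is 11001 in binary), and the last query assumes a Presto version that ships the two aggregate functions.

```sql
SELECT bitwise_and(19, 25); -- 17  (10011 AND 11001 = 10001)
SELECT bitwise_or(19, 25);  -- 27  (10011 OR  11001 = 11011)
SELECT bitwise_xor(19, 25); -- 10  (10011 XOR 11001 = 01010)
SELECT bitwise_not(-12);    -- 11  (NOT x equals -x - 1)

-- The *_agg variants fold over rows rather than taking two arguments.
SELECT bitwise_and_agg(x) AS all_and,  -- 17
       bitwise_or_agg(x)  AS all_or    -- 27
FROM (VALUES (19), (25)) AS t (x);
```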

(2) JSON handling compared (reposted from: https://www.cnblogs.com/cssdongl/p/8394000.html)
Hive
select get_json_object(json, '$.book');

Presto
select json_extract_scalar(json, '$.book');

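Note that json_extract_scalar in Presto returns a string. Presto also has json_extract, which returns the value as JSON, so you need to know what type of value you are extracting before choosing between them.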
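
A minimal sketch of the difference, using an inline sample document instead of a real table. The column name `doc` and the JSON content are assumptions made for illustration.

```sql
-- Inline sample document; in practice this would be a column of a real table.
WITH t AS (
    SELECT '{"book": {"title": "Presto", "price": 10}}' AS doc
)
SELECT json_extract(doc, '$.book')              AS book_json,   -- JSON value holding the nested object
       json_extract_scalar(doc, '$.book.title') AS title,       -- varchar: Presto
       json_extract_scalar(doc, '$.book')       AS not_scalar   -- expected NULL: the target is an object, not a scalar
FROM t;
```

If the path points at an object or array, json_extract is the one to use; json_extract_scalar is meant for strings, numbers, and booleans.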

(3) Column-to-row conversion compared
Hive
select student, score from tests lateral view explode(split(scores, ',')) t as score;

Presto
select student, score from tests cross join unnest(split(scores, ',')) as t (score);

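Put simply, this takes the comma-separated scores stored in the scores column, for example
80,90,99,80
and turns that single-column value into multiple rows, each paired with the same student value (a one-to-many mapping).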
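
The same conversion can be tried without a real table. This sketch feeds one inline row through the Presto query; the student name 'alice' is an assumption, and the cast to integer is an optional extra since split returns strings.

```sql
SELECT student, CAST(score AS integer) AS score
FROM (VALUES ('alice', '80,90,99,80')) AS tests (student, scores)
CROSS JOIN UNNEST(split(scores, ',')) AS t (score);
-- Produces one row per list element:
-- (alice, 80), (alice, 90), (alice, 99), (alice, 80)
-- Row order is not guaranteed without an ORDER BY.
```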

(4) Complex grouping compared
Hive
select origin_state, origin_zip, sum(package_weight) from shipping group by origin_state,origin_zip with rollup;

Presto
select origin_state, origin_zip, sum(package_weight) from shipping group by rollup (origin_state, origin_zip);

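Anyone who has used rollup knows that it produces multi-level aggregates by dropping grouping columns one at a time from right to left. It is equivalent to the following (Presto syntax):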
select origin_state, origin_zip, sum(package_weight) from shipping group by grouping sets ((origin_state, origin_zip), (origin_state), ());
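
Subtotal rows come back with NULL in the columns that were rolled up, which can be ambiguous if the data itself contains NULLs. One way to tell the levels apart, assuming a Presto version that supports the grouping function, is sketched below; the alias names are illustrative.

```sql
SELECT origin_state,
       origin_zip,
       sum(package_weight)                AS total_weight,
       grouping(origin_state, origin_zip) AS grouping_id
FROM shipping
GROUP BY ROLLUP (origin_state, origin_zip);
-- grouping_id is a bit mask, rightmost argument = lowest bit, 1 = column rolled up:
--   0  row grouped by (origin_state, origin_zip)
--   1  subtotal per origin_state (origin_zip rolled up)
--   3  grand total (both columns rolled up)
```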

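Other syntax differs in small ways that you can pick up over time. Because Hive and Presto are built on different underlying architectures, Presto runs queries much faster than Hive, and adding the open-source Alluxio cache on top speeds it up further.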
