[Translation] ORCFILE IN HDP 2: Better Compression, Better Performance

Original article:

https://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/

ORCFILE IN HDP 2: BETTER COMPRESSION, BETTER PERFORMANCE

by
Carter Shanklin


The upcoming Hive 0.12 is set to bring some great new advancements in the storage layer in the forms of higher compression and better query performance.

HIGHER COMPRESSION

ORCFile was introduced in Hive 0.11 and offered excellent compression, delivered through a number of techniques including run-length encoding, dictionary encoding for strings and bitmap encoding.

This focus on efficiency leads to some impressive compression ratios. This picture shows the sizes of the TPC-DS dataset at Scale 500 in various encodings. This dataset contains randomly generated data including strings, floating point and integer data.

[Figure: Columnar format arranges columns adjacent within the file for compression & fast access]

We’ve already seen customers whose clusters are maxed out from a storage perspective moving to ORCFile as a way to free up space while being 100% compatible with existing jobs.

Data stored in ORCFile can be read or written through HCatalog, so any Pig or Map/Reduce process can play along seamlessly. Hive 12 builds on these impressive compression ratios and delivers deep integration at the Hive and execution layers to accelerate queries, both from the point of view of dealing with larger datasets and lower latencies.

PREDICATE PUSHDOWN

SQL queries will generally have some number of WHERE conditions which can be used to easily eliminate rows from consideration. In older versions of Hive, rows are read out of the storage layer before being later eliminated by SQL processing. There’s a lot of wasteful overhead and Hive 12 optimizes this by allowing predicates to be pushed down and evaluated in the storage layer itself. It’s controlled by the setting hive.optimize.ppd=true.
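The effect can be pictured with a small sketch (plain Python, not the actual Hive/ORC API; the function names here are illustrative only): with pushdown, the predicate is evaluated while scanning, so non-matching rows are never marshaled out of the storage layer at all.

```python
# Hypothetical sketch: contrast filtering after a full read with
# evaluating the predicate inside the storage layer itself.

def read_then_filter(rows, predicate):
    """Old behavior: every row is materialized, then SQL filters it."""
    materialized = [dict(r) for r in rows]      # cost: all rows marshaled
    return [r for r in materialized if predicate(r)], len(materialized)

def read_with_pushdown(rows, predicate):
    """Pushdown: the reader evaluates the predicate and skips non-matches."""
    out, materialized = [], 0
    for r in rows:
        if predicate(r):                        # evaluated at the storage layer
            out.append(dict(r))
            materialized += 1
    return out, materialized

rows = [{"state": s} for s in ["CA", "NY", "CA", "TX"]]
pred = lambda r: r["state"] == "CA"

matches_a, cost_a = read_then_filter(rows, pred)
matches_b, cost_b = read_with_pushdown(rows, pred)
assert matches_a == matches_b   # same answer
assert cost_b < cost_a          # far fewer rows marshaled
```

Both paths return identical results; the difference is purely in how many rows cross the storage boundary before the filter applies.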

This requires a reader that is smart enough to understand the predicates. Fortunately ORC has had the corresponding improvements to allow predicates to be pushed into it, and takes advantage of its inline indexes to deliver performance benefits.

For example if you have a SQL query like:

SELECT COUNT(*) FROM CUSTOMER WHERE CUSTOMER.state = 'CA';

The ORCFile reader will now only return rows that actually match the WHERE predicates and skip customers residing in any other state. The more columns you read from the table, the more data marshaling you avoid and the greater the speedup.

A WORD ON ORCFILE INLINE INDEXES

Before we move to the next section we need to spend a moment talking about how ORCFile breaks rows into row groups and applies columnar compression and indexing within these row groups.

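As a rough illustration (toy Python, with an assumed stride of 4 instead of ORC's actual default of 10,000), splitting a column into row groups and recording each group's min/max alongside the data is what makes the inline index useful for skipping later:

```python
# A toy sketch of ORC-style inline indexing: split a column into row
# groups and record the min/max of each group alongside the data.

STRIDE = 4  # illustration only; ORC's default orc.row.index.stride is 10,000

def build_inline_index(column):
    """Return (groups, index) where index[i] = (min, max) of groups[i]."""
    groups = [column[i:i + STRIDE] for i in range(0, len(column), STRIDE)]
    index = [(min(g), max(g)) for g in groups]
    return groups, index

column = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
groups, index = build_inline_index(column)
# index is [(1, 4), (2, 9), (3, 5)] -- one (min, max) pair per row group
```

The real format stores considerably more per-group metadata, but min/max is the part predicate pushdown relies on.
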

TURNING PREDICATE PUSHDOWN TO 11

ORC’s Predicate Pushdown will consult the Inline Indexes to try to identify when entire blocks can be skipped all at once. Sometimes your dataset will naturally facilitate this. For instance if your data comes as a time series with a monotonically increasing timestamp, when you put a where condition on this timestamp, ORC will be able to skip a lot of row groups.

In other instances you may need to give things a kick by sorting data. If a column is sorted, relevant records will get confined to one area on disk and the other pieces will be skipped very quickly.

Skipping works for number types and for string types. In both instances it’s done by recording a min and max value inside the inline index and determining if the lookup value falls outside that range.

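The skip test itself is simple, as a sketch (plain Python, not the ORC reader's API): a row group can be skipped whenever the lookup value falls outside its recorded [min, max] range, and because both numbers and strings are ordered, the same comparison works for both.

```python
# Sketch of the min/max skip test described above.

def can_skip(group_min, group_max, value):
    """True if no row in the group can possibly equal `value`."""
    return value < group_min or value > group_max

# (min, max) pairs as an inline index might record them for a string column
index = [("AK", "CA"), ("CO", "IL"), ("IN", "NY"), ("OH", "WY")]
lookup = "CA"
skipped = [i for i, (lo, hi) in enumerate(index) if can_skip(lo, hi, lookup)]
# only the first group can contain "CA"; the other three are skipped
```
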

Sorting can lead to very nice speedups. There is a trade-off in that you need to decide what columns to sort on in advance. The decision making process is somewhat similar to deciding what columns to index in traditional SQL systems. The best payback is when you have a column that is frequently used and accessed with very specific conditions and is used in a lot of queries. Remember that you can force Hive to sort on a column by using the SORT BY keyword when creating the table and setting hive.enforce.sorting to true before inserting into the table.

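Why sorting helps can be shown with a toy sketch (illustrative Python, again with an assumed row-group size of 4): with sorted data, rows matching a point predicate cluster into few row groups, so more groups fall outside the [min, max] range and are skippable.

```python
# Sketch: count how many row groups must be read for a point lookup,
# on unsorted vs sorted data.

STRIDE = 4  # toy row-group size

def groups_to_read(column, value):
    """Count row groups whose [min, max] range could contain `value`."""
    groups = [column[i:i + STRIDE] for i in range(0, len(column), STRIDE)]
    return sum(1 for g in groups if min(g) <= value <= max(g))

data = [7, 2, 9, 2, 5, 2, 8, 1, 2, 6, 3, 2]
unsorted_reads = groups_to_read(data, 2)          # 2 is scattered everywhere
sorted_reads = groups_to_read(sorted(data), 2)    # the 2s cluster together
assert sorted_reads <= unsorted_reads
```

On the scattered data every group's range covers 2; after sorting, the matching rows sit in a contiguous run and the remaining groups are skipped.
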

ORCFile is an important piece of our Stinger Initiative to improve Hive performance 100x. To show the impact we ran a modified TPC-DS Query 27 query with a modified data schema. Query 27 does a star schema join on a large fact table, accessing 4 separate dimension tables. In the modified schema, the state in which the sale is made is denormalized into the fact table and the resulting table is sorted by state. In this way, when the query scans the fact table, it can skip entire blocks of rows because the query filters based on the state. This results in some incremental speedup as you can see from the chart below.

This feature gives you the best bang for the buck when:

  1. You frequently filter a large fact table in a precise way on a column with moderate to large cardinality.
  2. You select a large number of columns, or wide columns. The more data marshaling you save, the greater your speedup will be.

[Translator's note on fact tables: fact tables and dimension tables are two concepts from data warehousing, i.e., two kinds of warehouse tables. In terms of how data is stored there is no essential difference, both are just tables; the difference is in what they hold. A fact table stores fact data: measurable, additive values such as quantities and amounts. A dimension table stores descriptive data that characterizes the facts, such as region, sales representative, or product. A star schema is one fact table joined to multiple dimension tables. The dimension tables represent the analysis dimensions, while the fact table holds the computed results across those dimensions. Because the star schema stores precomputed results along the different dimensions, multidimensional analysis is fast; however, if the analysis goes beyond the predefined dimensions, adding a new dimension is difficult.]

USING ORCFILE

Using ORCFile or converting existing data to ORCFile is simple. To use it just add STORED AS orc to the end of your create table statements like this:

CREATE TABLE mytable (
...
) STORED AS orc;

To convert existing data to ORCFile create a table with the same schema as the source table plus stored as orc, then you can issue a query like:

INSERT INTO TABLE orctable SELECT * FROM oldtable;

Hive will handle all the details of conversion to ORCFile and you are free to delete the old table to free up loads of space.

When you create an ORC table there are a number of table properties you can use to further tune the way ORC works.

Key                    Default                    Notes
orc.compress           ZLIB                       Compression to use in addition to columnar compression (one of NONE, ZLIB, SNAPPY)
orc.compress.size      262,144 (= 256 KiB)        Number of bytes in each compression chunk
orc.stripe.size        268,435,456 (= 256 MiB)    Number of bytes in each stripe
orc.row.index.stride   10,000                     Number of rows between index entries (must be >= 1,000)
orc.create.index       true                       Whether to create inline indexes

For example let’s say you wanted to use snappy compression instead of zlib compression. Here’s how:

CREATE TABLE mytable (
...
) STORED AS orc tblproperties ("orc.compress"="SNAPPY");

TRY IT OUT

All these features are available in our HDP 2 Beta and we encourage you to download, try them out and give us your feedback.
