[Translation] Cassandra Data Reads, Writes, and Compaction

Date: 2019-11-12

Tags: cassandra, data reads and writes, compaction


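This translation is drawn mainly from the DataStax Cassandra 1.2 documentation (http://www.datastax.com/documentation/cassandra/1.2/index.html), with some additional material from the related official blog posts.

It is part of the study material of Laud, the big data group at the ISE lab, and is intended for readers who already have some familiarity with Cassandra.

Please do not repost without my permission.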

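A brief overview of the data model

1. Not SQL (no transactions, no joins), but more than just a key-value store.

2. Inspired by Google BigTable.

3. Column-family based.

Example:

There are also secondary indexes, distributed counters, composite columns, and so on.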
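As a rough illustration of the column-family model, a row can be pictured as a key that maps to a set of named columns, each carrying a value and a write timestamp. The following Python sketch is only a mental model, not how Cassandra stores data; the `users` column family, its row keys, and its column names are invented for illustration.

```python
# A toy picture of a column family: row key -> {column name -> (value, timestamp)}.
# The names "users", "alice", "bob" and the columns are hypothetical.
users = {
    "alice": {
        "email": ("alice@example.com", 1),
        "city": ("Hangzhou", 1),
    },
    "bob": {
        # Rows in the same column family need not share the same columns.
        "email": ("bob@example.com", 2),
    },
}


def get_column(column_family, row_key, column_name):
    """Return the value of one column, or None if the row or column is absent."""
    row = column_family.get(row_key, {})
    cell = row.get(column_name)
    return None if cell is None else cell[0]


print(get_column(users, "alice", "city"))  # Hangzhou
print(get_column(users, "bob", "city"))    # None
```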

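Cassandra Storage Engine

Goal: minimize random I/O.

The flow of a write:

Characteristics of writes:

No reads, no seeks

Only sequential I/O

sstables never change once written, which makes backups easy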
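The write characteristics above can be sketched with a toy storage engine. This is a simplified model under stated assumptions, not Cassandra's implementation: the class name `ToyStore`, the `flush_threshold` parameter, and the in-memory lists standing in for the commit log and the sstable files are all hypothetical.

```python
class ToyStore:
    """Toy write path: append to a log, update a memtable, flush sorted runs."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []      # stands in for an append-only file
        self.memtable = {}        # in-memory map of key -> (value, timestamp)
        self.sstables = []        # each flushed sstable is an immutable tuple
        self.flush_threshold = flush_threshold
        self.clock = 0

    def write(self, key, value):
        # A write never reads existing data: it appends and updates memory only.
        self.clock += 1
        self.commit_log.append((key, value, self.clock))
        self.memtable[key] = (value, self.clock)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out once, sorted by key, and never modify it again.
        run = tuple(sorted((k, v, ts) for k, (v, ts) in self.memtable.items()))
        self.sstables.append(run)
        self.memtable = {}
        self.commit_log = []      # flushed data no longer needs the log


store = ToyStore(flush_threshold=2)
store.write("b", "1")
store.write("a", "2")   # second distinct key triggers a flush
store.write("a", "3")   # the new version stays in the memtable
print(store.sstables)   # [(('a', '2', 2), ('b', '1', 1))]
print(store.memtable)   # {'a': ('3', 3)}
```

Note that the newer value of `a` does not touch the flushed sstable; the two versions coexist until a later merge reconciles them.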

The flow of a read:

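Compaction

Purpose: reduce the number of sstables

Merges multiple sstables, preserving their sorted order

Sequential I/O

What an sstable looks like:

More on compaction:

In Cassandra, newly written columns go into new sstables, so compaction exists to merge multiple sstables into one.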

Figure 1: adding sstables with size tiered compaction

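As a result, over time many versions of a row may exist across multiple different sstables. Each of these versions may have a different set of columns. If sstables were simply allowed to accumulate, reading a row would require locating data in many files.

Hence the need to merge. Merging is also efficient and requires no random I/O, because rows are stored in sorted order (by primary key) within each sstable.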
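Because each sstable is already sorted, merging amounts to a k-way merge of sorted streams. Below is a minimal sketch using Python's standard `heapq.merge`, assuming each sstable is a list of `(row_key, column, timestamp, value)` tuples and that the version with the highest timestamp wins. The tuple layout and the function name `compact` are illustrative assumptions.

```python
import heapq
from itertools import groupby


def compact(sstables):
    """Merge sorted sstables into one, keeping the newest version of each column.

    Each sstable is a list of (row_key, column, timestamp, value) tuples,
    sorted by (row_key, column). Inputs are read front to back only.
    """
    merged = heapq.merge(*sstables, key=lambda cell: (cell[0], cell[1]))
    result = []
    for _, versions in groupby(merged, key=lambda cell: (cell[0], cell[1])):
        # Among all versions of the same column, keep the latest timestamp.
        result.append(max(versions, key=lambda cell: cell[2]))
    return result


old = [("row1", "age", 1, "30"), ("row1", "name", 1, "Ann"), ("row2", "name", 1, "Bob")]
new = [("row1", "age", 5, "31"), ("row3", "name", 4, "Cy")]

for cell in compact([old, new]):
    print(cell)
```

Each input is consumed strictly in order, which is why the merge needs only sequential reads and one sequential write.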

Figure 2: sstables under size-tiered compaction after many inserts

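Cassandra's size-tiered compaction strategy is very similar to the one described in the Bigtable paper: when enough similar-sized sstables have accumulated (four by default), they are merged.

In Figure 1, each green box represents an sstable, and the arrow represents compaction. Once there are four sstables, they are merged together. Figure 2 shows the hierarchy after some time has passed: first-tier sstables are merged into the second tier, second-tier sstables into the third, and so on.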
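The tiering behavior in Figures 1 and 2 can be imitated with a small simulation. This sketch assumes every flushed sstable has size 1, that sstables are grouped into tiers by exact size, and that merged data never shrinks (no overwrites or deletes). `min_threshold` mirrors the default of four mentioned above, while the function name is hypothetical.

```python
from collections import Counter


def size_tiered(num_flushes, min_threshold=4):
    """Return {sstable_size: count} after num_flushes unit-sized flushes."""
    tiers = Counter()
    for _ in range(num_flushes):
        size = 1
        tiers[size] += 1
        # Whenever a tier holds min_threshold sstables, merge them into one
        # sstable of the combined size, which may cascade into the next tier.
        while tiers[size] >= min_threshold:
            tiers[size] -= min_threshold
            size *= min_threshold
            tiers[size] += 1
    return {s: c for s, c in sorted(tiers.items()) if c > 0}


print(size_tiered(3))    # {1: 3}
print(size_tiered(4))    # {4: 1}
print(size_tiered(23))   # {1: 3, 4: 1, 16: 1}
```

Under these assumptions the tiers behave like the digits of a base-4 counter, which is why the number of sstables a read may have to consult varies over time.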

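Under update-heavy workloads, three problems arise:

1. Performance can be inconsistent, because there is no guarantee as to how many sstables a row may be spread across. In the worst case, we could have some columns of a given row in every sstable.

2. A substantial amount of space can be wasted, since there is no guarantee as to how quickly obsolete columns will be merged away. This is especially noticeable when there are many deletes.

3. Space can also be a problem as sstables grow larger from repeated compactions, since an obsolete sstable cannot be removed until the merged sstable is completely written. In the worst case of a single set of large sstables with no obsolete rows to remove, Cassandra would need 100% as much free space as is used by the sstables being compacted, into which to write the merged one.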

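Cassandra 1.0 introduced the leveled compaction strategy, which is based on LevelDB from the Chromium team.

Leveled Compaction (Translator's note: I did not fully understand this part while translating.)

Leveled compaction creates sstables of a fixed size (5 MB by default) that are grouped into "levels". Within each level, sstables are guaranteed not to overlap. Each level is ten times as large as the previous one.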
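One consequence of the non-overlap guarantee is that, within a level, at most one sstable can contain a given key, so it can be located with a binary search over the key ranges. A minimal sketch, assuming each sstable is described by a hypothetical `(first_key, last_key)` pair and that a level is kept sorted by `first_key`:

```python
import bisect


def find_sstable(level, key):
    """Return the index of the only sstable in `level` that may hold `key`.

    `level` is a list of non-overlapping (first_key, last_key) ranges,
    sorted by first_key. Returns None if no range covers the key.
    """
    first_keys = [first for first, _ in level]
    i = bisect.bisect_right(first_keys, key) - 1
    if i >= 0 and level[i][0] <= key <= level[i][1]:
        return i
    return None


level1 = [("a", "f"), ("g", "m"), ("n", "z")]
print(find_sstable(level1, "h"))   # 1
print(find_sstable(level1, "0"))   # None
```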

Figure 3: adding sstables under leveled compaction

In Figure 3, new sstables are first added to the first level, L0, and are immediately compacted with the sstables in L1 (blue). When L1 fills up, the extra sstables are promoted to L2 (violet). Subsequent sstables generated in L1 will be compacted with the sstables in L2 with which they overlap. As more data is added, leveled compaction results in a situation like the one shown in Figure 4.
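Picking the sstables "with which they overlap" comes down to an interval-overlap test on key ranges. A sketch, assuming an sstable is summarized by a hypothetical `(first_key, last_key)` pair; the function name is likewise illustrative:

```python
def overlapping(candidate, next_level):
    """Return the ranges in next_level that overlap the candidate's key range."""
    lo, hi = candidate
    return [(first, last) for first, last in next_level
            if first <= hi and lo <= last]


l2 = [("a", "f"), ("g", "m"), ("n", "z")]
print(overlapping(("e", "h"), l2))   # [('a', 'f'), ('g', 'm')]
```

Only the overlapping sstables need to be rewritten, which keeps each compaction small regardless of how much data the level holds.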

Figure 4: sstables under leveled compaction after many inserts

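This approach solves the problems described above:

1. Leveled compaction guarantees that 90% of all reads will be satisfied from a single sstable (assuming nearly uniform row size). The worst case is bounded by the total number of levels, for example 7 for 10 TB of data.

2. At most 10% of space will be wasted by obsolete rows.

3. Only 10 times the sstable size needs to be available as temporary space during compaction.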
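The figure of seven levels for 10 TB can be checked with a little arithmetic. This sketch assumes that level n holds 10^n sstables of 5 MB each (so L1 holds ten sstables, an assumption not spelled out above), that 1 TB = 10^6 MB, and it ignores L0. The function name is hypothetical.

```python
def levels_needed(total_mb, sstable_mb=5, fanout=10):
    """Return how many levels are needed to hold total_mb of data."""
    level, capacity = 0, 0
    while capacity < total_mb:
        level += 1
        capacity += sstable_mb * fanout ** level   # L1 = 50 MB, L2 = 500 MB, ...
    return level


print(levels_needed(10 * 10**6))   # 7
# Temporary space for one compaction, per point 3 above: 10 * 5 MB = 50 MB.
print(10 * 5)                      # 50
```

Under these assumptions levels L1 through L6 together hold roughly 5.6 TB, so 10 TB spills into a seventh level.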

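Usage: when creating or updating a column family, set the compaction_strategy option to LeveledCompactionStrategy. (The change is also applied in the background, so for an existing table, switching the compaction strategy does not interrupt reads and writes.)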
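In CQL 3 (available in Cassandra 1.2), the same setting is expressed through the table's `compaction` map rather than a `compaction_strategy` option. The helper below only builds the statement text; the keyspace and table names are placeholders, and executing the statement against a cluster is deliberately left out.

```python
def leveled_compaction_cql(keyspace, table):
    """Build a CQL 3 statement that switches a table to leveled compaction."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH compaction = { 'class' : 'LeveledCompactionStrategy' };"
    )


# "demo" and "users" are placeholder names for illustration.
print(leveled_compaction_cql("demo", "users"))
```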

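Because leveled compaction has to uphold the guarantees above, it performs roughly twice as much I/O as size-tiered compaction. For write-dominated workloads, this extra I/O does not pay off in terms of the benefits above, since few old versions of rows are involved.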

Some configuration details: Leveled compaction ignores the concurrent_compactors setting. Concurrent compaction is designed to avoid tiered compaction’s problem of a backlog of small compaction sets becoming blocked temporarily while the compaction system is busy with a large set. Leveled compaction does not have this problem, since all compaction sets are roughly the same size. Leveled compaction does honor the multithreaded_compaction setting, which allows using one thread per sstable to speed up compaction. However, most compaction tuning will still involve using compaction_throughput_mb_per_sec (default: 16) to throttle compaction back.

When to use leveled compaction: English version, Chinese version

Data management

To manage and access data, you need to understand how Cassandra reads and writes data, the hinted handoff feature, and where Cassandra conforms to ACID and where it does not. In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replicas.

To be continued…