Using the Cassandra repair tool

Date: 2019-11-17

Tags: cassandra, repair, tool usage


Introduction

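Cassandra is a decentralized distributed database. A piece of data is spread across multiple peer nodes, that is, it has multiple replicas. We need to check those replicas regularly for inconsistencies. For example, a damaged disk can cause a replica to be lost, leaving the replicas of the same data inconsistent.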

nodetool repair

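The nodetool utility that ships with Cassandra provides the repair command, which can be used for routine checks of data consistency. It also needs to be run after a keyspace's replication settings have been changed.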

Run nodetool help 'repair' to view the command help, shown below:

NAME
        nodetool repair - Repair one or more tables

SYNOPSIS
        nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
                [(-pw <password> | --password <password>)]
                [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
                [(-u <username> | --username <username>)] repair
                [(-dc <specific_dc> | --in-dc <specific_dc>)...]
                [(-dcpar | --dc-parallel)] [(-et <end_token> | --end-token <end_token>)]
                [(-full | --full)]
                [(-hosts <specific_host> | --in-hosts <specific_host>)...]
                [(-j <job_threads> | --job-threads <job_threads>)]
                [(-local | --in-local-dc)] [(-pl | --pull)]
                [(-pr | --partitioner-range)] [(-seq | --sequential)]
                [(-st <start_token> | --start-token <start_token>)] [(-tr | --trace)]
                [--] [<keyspace> <tables>...]

OPTIONS
        -dc <specific_dc>, --in-dc <specific_dc>
            Use -dc to repair specific datacenters

        -dcpar, --dc-parallel
            Use -dcpar to repair data centers in parallel.

        -et <end_token>, --end-token <end_token>
            Use -et to specify a token at which repair range ends

        -full, --full
            Use -full to issue a full repair.

        -h <host>, --host <host>
            Node hostname or ip address

        -hosts <specific_host>, --in-hosts <specific_host>
            Use -hosts to repair specific hosts

        -j <job_threads>, --job-threads <job_threads>
            Number of threads to run repair jobs. Usually this means number of
            CFs to repair concurrently. WARNING: increasing this puts more load
            on repairing nodes, so be careful. (default: 1, max: 4)

        -local, --in-local-dc
            Use -local to only repair against nodes in the same datacenter

        -p <port>, --port <port>
            Remote jmx agent port number

        -pl, --pull
            Use --pull to perform a one way repair where data is only streamed
            from a remote node to this node.

        -pr, --partitioner-range
            Use -pr to repair only the first range returned by the partitioner

        -pw <password>, --password <password>
            Remote jmx agent password

        -pwf <passwordFilePath>, --password-file <passwordFilePath>
            Path to the JMX password file

        -seq, --sequential
            Use -seq to carry out a sequential repair

        -st <start_token>, --start-token <start_token>
            Use -st to specify a token at which the repair range starts

        -tr, --trace
            Use -tr to trace the repair. Traces are logged to
            system_traces.events.

        -u <username>, --username <username>
            Remote jmx agent username

        --
            This option can be used to separate command-line options from the
            list of argument, (useful when arguments might be mistaken for
            command-line options

        [<keyspace> <tables>...]
            The keyspace followed by one or many tables

Main usage

nodetool repair mykeyspace mytable checks and repairs a specific table.

Commonly used options:

-j <job_threads>, --job-threads <job_threads>

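The number of threads that run repair jobs in the background. According to the help text above, this usually means the number of tables repaired concurrently (default 1, max 4). The jobs run inside a RepairSession, and one RepairSession corresponds to a group of nodes and the range those nodes jointly maintain. Adjust this with care, because raising it increases the load on the cluster.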

-full, --full

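Performs a full check and repair. Since version 2.2, incremental repair is the default. Incremental repair separates data that has already been repaired out of an sstable, splitting it into two sstables: one holding repaired data and one holding unrepaired data (this process is called AntiCompaction). The next repair run then only checks the unrepaired sstable, which reduces disk bandwidth and the cost of building Merkle trees and avoids affecting online traffic. (A repair reads the data and builds a Merkle tree on each replica node, and then the Merkle trees that the different nodes built for their own replicas are compared on one node.)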
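The Merkle tree comparison mentioned above can be illustrated with a toy sketch. This is not Cassandra's implementation: the fixed number of equal-width buckets, the SHA-256 hashing, and the names bucket_hashes, merkle_root and differing_buckets are all assumptions made for illustration. Each replica hashes its rows per token bucket, the roots are compared first, and only the buckets whose hashes differ would need to be streamed.

```python
import hashlib


def bucket_hashes(rows, lo, hi, buckets):
    """Hash the rows of each equal-width token bucket in (lo, hi].

    `rows` maps token -> value for one replica (a hypothetical toy model).
    """
    width = (hi - lo) // buckets
    leaves = []
    for i in range(buckets):
        start = lo + i * width
        end = hi if i == buckets - 1 else start + width
        h = hashlib.sha256()
        for token in sorted(t for t in rows if start < t <= end):
            h.update(f"{token}={rows[token]};".encode())
        leaves.append(h.digest())
    return leaves


def merkle_root(leaves):
    """Combine leaf hashes pairwise until a single root hash remains."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def differing_buckets(leaves_a, leaves_b):
    """Return the indexes of buckets whose hashes differ between replicas."""
    return [i for i, (a, b) in enumerate(zip(leaves_a, leaves_b)) if a != b]


# Two replicas of the range (100, 1000]; the row at token 420 is stale on B.
replica_a = {150: "x", 420: "y", 900: "z"}
replica_b = {150: "x", 420: "stale", 900: "z"}

leaves_a = bucket_hashes(replica_a, 100, 1000, 4)
leaves_b = bucket_hashes(replica_b, 100, 1000, 4)

print(merkle_root(leaves_a) == merkle_root(leaves_b))  # False: replicas differ
print(differing_buckets(leaves_a, leaves_b))           # [1]
```

Only bucket 1, which covers tokens in (325, 550], is out of sync here, so only that slice of data would have to be exchanged.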

-st <start_token>, --start-token <start_token>

-et <end_token>, --end-token <end_token>

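Specifies a custom token range, that is, the partition range to repair. For example, (100,1000] means that only the data falling in the segment from 100 to 1000 on the consistent-hash ring is checked. These options are not needed by default, in which case all tokens held by the node on which the repair command runs are repaired. Specifying them amounts to a subrange repair, which skips AntiCompaction. If you want to avoid the impact of AntiCompaction, you can compute the token ranges yourself and run multiple subrange repairs.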
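Computing the subranges by hand can be sketched as follows. This is a minimal example under stated assumptions: the helper names split_range and repair_commands are hypothetical, the range does not wrap around the ring, and the commands are only printed, not executed. The -st and -et flags are the ones listed in the help output above.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges.

    Assumes start < end (no wrap-around) and at least one token per part.
    """
    if parts < 1 or end - start < parts:
        raise ValueError("need 1 <= parts <= end - start")
    step, extra = divmod(end - start, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # Spread the remainder over the first `extra` subranges.
        hi = lo + step + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def repair_commands(keyspace, table, start, end, parts):
    """Build one subrange repair command line per subrange."""
    return [
        f"nodetool repair -st {lo} -et {hi} {keyspace} {table}"
        for lo, hi in split_range(start, end, parts)
    ]


for command in repair_commands("mykeyspace", "mytable", 100, 1000, 3):
    print(command)
# nodetool repair -st 100 -et 400 mykeyspace mytable
# nodetool repair -st 400 -et 700 mykeyspace mytable
# nodetool repair -st 700 -et 1000 mykeyspace mytable
```

Each subrange starts where the previous one ends, so together they cover (100,1000] without gaps or overlap.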

-pr, --partitioner-range

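Repairs only the primary range. What is the primary range? Suppose a row is hashed into some range, which corresponds to a token (assume node A is responsible for this token). Because the keyspace keeps multiple replicas, additional tokens, maintained by other nodes, are then chosen according to the keyspace's ReplicationStrategy to store the replicas. For node A, that range is its primary range.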

This option only takes effect when you are not doing a subrange repair.
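The idea of a primary range can be made concrete with a toy ring. The sketch below is a simplification, not Cassandra's placement code: it assumes one token per node (no vnodes), a hypothetical four-node ring, and a SimpleStrategy-like rule where the replicas are the owner of the range followed by the next nodes clockwise. The names RING, replicas and ranges_held are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical ring: token -> node, one token per node.
RING = {0: "A", 100: "B", 200: "C", 300: "D"}


def replicas(token, ring, rf):
    """Return the nodes storing `token`: the range owner, then the next
    rf - 1 nodes clockwise."""
    tokens = sorted(ring)
    # The first ring token >= the data token owns it; wrap past the end.
    i = bisect_left(tokens, token) % len(tokens)
    return [ring[tokens[(i + k) % len(tokens)]] for k in range(rf)]


def ranges_held(node, ring, rf):
    """Return (primary_ranges, all_ranges) for `node`.

    A range is written (previous_token, token], so (300, 0) wraps around.
    """
    tokens = sorted(ring)
    primary, held = [], []
    for i, token in enumerate(tokens):
        rng = (tokens[i - 1], token)
        owners = [ring[tokens[(i + k) % len(tokens)]] for k in range(rf)]
        if owners[0] == node:
            primary.append(rng)
        if node in owners:
            held.append(rng)
    return primary, held


print(replicas(150, RING, 3))       # ['C', 'D', 'A']
print(ranges_held("A", RING, 3))
# ([(300, 0)], [(300, 0), (100, 200), (200, 300)])
```

In this toy ring, node A stores three ranges but only (300, 0] is its primary range, and every range is the primary range of exactly one node.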

-dc <specific_dc>, --in-dc <specific_dc>

The repair only involves nodes in the specified datacenters.

-hosts <specific_host>, --in-hosts <specific_host>

The repair only involves nodes in the specified list of hosts.
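To tie the options above together, here is a small sketch that assembles a nodetool repair command line from them. The helper build_repair_command and the datacenter name dc1 are hypothetical. Only flags shown in the help output are used, and the 1 to 4 bound on -j is taken from that help text. The command is returned as a string, not executed.

```python
import shlex


def build_repair_command(keyspace, tables=(), full=False, primary_range=False,
                         job_threads=None, datacenters=(), hosts=()):
    """Assemble a `nodetool repair` command line as a string."""
    if job_threads is not None and not 1 <= job_threads <= 4:
        raise ValueError("job_threads must be between 1 and 4")
    args = ["nodetool", "repair"]
    if full:
        args.append("-full")
    if primary_range:
        args.append("-pr")
    if job_threads is not None:
        args += ["-j", str(job_threads)]
    for dc in datacenters:   # -dc may be repeated, one per datacenter
        args += ["-dc", dc]
    for host in hosts:       # -hosts may be repeated, one per host
        args += ["-hosts", host]
    # "--" separates the options from the keyspace and table arguments.
    args += ["--", keyspace, *tables]
    return " ".join(shlex.quote(a) for a in args)


print(build_repair_command("mykeyspace", ["mytable"], full=True,
                           job_threads=2, datacenters=["dc1"]))
# nodetool repair -full -j 2 -dc dc1 -- mykeyspace mytable
```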