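Cassandra is a decentralized, distributed database. Each piece of data is stored on multiple peer nodes, that is, it has multiple replicas. These replicas need to be checked periodically for inconsistencies. For example, a damaged disk can cause a replica to be lost, leaving the replicas of the same data inconsistent.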
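The nodetool utility that ships with Cassandra provides the repair command, which can be used for routine consistency checks. It also needs to be run after a keyspace's replication settings have been changed.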
Run nodetool help repair to see the command's help, shown below:
NAME
        nodetool repair - Repair one or more tables

SYNOPSIS
        nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
                [(-pw <password> | --password <password>)]
                [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
                [(-u <username> | --username <username>)] repair
                [(-dc <specific_dc> | --in-dc <specific_dc>)...]
                [(-dcpar | --dc-parallel)]
                [(-et <end_token> | --end-token <end_token>)]
                [(-full | --full)]
                [(-hosts <specific_host> | --in-hosts <specific_host>)...]
                [(-j <job_threads> | --job-threads <job_threads>)]
                [(-local | --in-local-dc)] [(-pl | --pull)]
                [(-pr | --partitioner-range)] [(-seq | --sequential)]
                [(-st <start_token> | --start-token <start_token>)]
                [(-tr | --trace)] [--] [<keyspace> <tables>...]

OPTIONS
        -dc <specific_dc>, --in-dc <specific_dc>
            Use -dc to repair specific datacenters

        -dcpar, --dc-parallel
            Use -dcpar to repair data centers in parallel.

        -et <end_token>, --end-token <end_token>
            Use -et to specify a token at which repair range ends

        -full, --full
            Use -full to issue a full repair.

        -h <host>, --host <host>
            Node hostname or ip address

        -hosts <specific_host>, --in-hosts <specific_host>
            Use -hosts to repair specific hosts

        -j <job_threads>, --job-threads <job_threads>
            Number of threads to run repair jobs. Usually this means number of
            CFs to repair concurrently. WARNING: increasing this puts more load
            on repairing nodes, so be careful. (default: 1, max: 4)

        -local, --in-local-dc
            Use -local to only repair against nodes in the same datacenter

        -p <port>, --port <port>
            Remote jmx agent port number

        -pl, --pull
            Use --pull to perform a one way repair where data is only streamed
            from a remote node to this node.

        -pr, --partitioner-range
            Use -pr to repair only the first range returned by the partitioner

        -pw <password>, --password <password>
            Remote jmx agent password

        -pwf <passwordFilePath>, --password-file <passwordFilePath>
            Path to the JMX password file

        -seq, --sequential
            Use -seq to carry out a sequential repair

        -st <start_token>, --start-token <start_token>
            Use -st to specify a token at which the repair range starts

        -tr, --trace
            Use -tr to trace the repair. Traces are logged to
            system_traces.events.

        -u <username>, --username <username>
            Remote jmx agent username

        --
            This option can be used to separate command-line options from the
            list of argument, (useful when arguments might be mistaken for
            command-line options

        [<keyspace> <tables>...]
            The keyspace followed by one or many tables
nodetool repair mykeyspace mytable checks and repairs a specific table.
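Commonly used options:
-j <job_threads>, --job-threads <job_threads>
The number of threads that run repair jobs in parallel in the background. According to the help text, this usually means the number of tables repaired concurrently (default 1, maximum 4). Each RepairSession covers a set of nodes and the partition range those nodes jointly maintain. Raise this value with care, because it increases the load on the cluster.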
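-full, --full
Run a full check and repair. Since version 2.2, repairs are incremental by default. Incremental repair separates the data that has already been repaired out of each SSTable, splitting it into two SSTables: one with repaired data and one with unrepaired data (this process is called anticompaction). The next repair then only checks the SSTable that has not been repaired, which reduces disk bandwidth and the cost of building Merkle trees and avoids affecting online traffic. (A repair reads the data and builds a Merkle tree on each replica, then compares, on one node, the Merkle trees of the replicas held by the different nodes.)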
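The Merkle tree comparison described above can be sketched in a few lines of Python. This is a toy illustration only, not Cassandra's implementation: the function names (`leaf_hashes`, `build_tree`, `differing_leaves`) are hypothetical, rows are bucketed by a hash of the key rather than by token range, and `num_leaves` is assumed to be a power of two.

```python
import hashlib


def leaf_hashes(rows, num_leaves):
    """Bucket rows by key into num_leaves leaves and hash each bucket."""
    buckets = [[] for _ in range(num_leaves)]
    for key, value in sorted(rows.items()):
        # Toy bucketing by a stable hash of the key (assumption for this sketch).
        index = int(hashlib.md5(key.encode()).hexdigest(), 16) % num_leaves
        buckets[index].append(f"{key}={value}")
    return [hashlib.sha256("|".join(b).encode()).digest() for b in buckets]


def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, root last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hashlib.sha256(prev[i] + prev[i + 1]).digest()
            for i in range(0, len(prev), 2)
        ])
    return levels


def differing_leaves(tree_a, tree_b):
    """Indices of the leaves whose hashes differ between two replicas."""
    if tree_a[-1] == tree_b[-1]:
        return []  # identical roots: the replicas agree, nothing to stream
    return [i for i, (a, b) in enumerate(zip(tree_a[0], tree_b[0])) if a != b]


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = dict(replica_a, k2="stale")  # one row is out of date on replica B

tree_a = build_tree(leaf_hashes(replica_a, 4))
tree_b = build_tree(leaf_hashes(replica_b, 4))
print(differing_leaves(tree_a, tree_b))  # index of the single leaf that needs repair
```

Only the buckets whose hashes differ need to be exchanged between replicas, which is why comparing trees is cheaper than comparing the data row by row.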
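-st <start_token>, --start-token <start_token>
-et <end_token>, --end-token <end_token>
Define a custom token range, that is, the partition range to repair. For example, (100, 1000] means that only the data whose tokens fall between 100 and 1000 on the consistent-hash ring is checked. These options are not needed by default, in which case every token range held by the node where the repair command runs is repaired. Specifying them amounts to a subrange repair, which skips anticompaction. To avoid the impact of anticompaction, you can work out the token ranges yourself and run several subrange repairs.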
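As a rough sketch of running several subrange repairs, the snippet below splits a token range into contiguous pieces and prints one nodetool command per piece. The helper names `split_range` and `subrange_commands` are hypothetical, and the even split is an assumption made for illustration; in practice the boundaries would come from the ranges the node actually owns.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous (st, et] pieces."""
    if parts < 1 or end <= start or parts > end - start:
        raise ValueError("need start < end and 1 <= parts <= end - start")
    width = end - start
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_commands(keyspace, table, start, end, parts):
    """Build one `nodetool repair -st ... -et ...` command line per subrange."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace} {table}"
        for st, et in split_range(start, end, parts)
    ]


# The (100, 1000] range from the text, split into three subrange repairs.
for command in subrange_commands("mykeyspace", "mytable", 100, 1000, 3):
    print(command)
```

Each piece ends where the next one starts, so the subranges together cover the original range without gaps or overlap.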
-pr, --partitioner-range
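Repair only the primary range. What is a primary range? Suppose a row hashes into some range, which corresponds to a token (say that token is owned by node A). Because the keyspace keeps multiple replicas, the keyspace's ReplicationStrategy then selects additional tokens, owned by different nodes, to store the replicas. For node A, that range is its primary range.
This option only takes effect when no subrange repair is being done.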
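The difference between a node's primary range and all the ranges it holds can be illustrated with a toy ring. This sketch assumes one token per node and a simple "next nodes clockwise" replica placement; the ring contents and the function names are hypothetical, and vnodes, racks, and datacenters are ignored.

```python
from bisect import bisect_left

# Hypothetical ring: token -> node that owns the range ending at that token.
RING = {0: "A", 100: "B", 200: "C", 300: "D"}


def replicas_for(token, ring, rf):
    """Nodes storing a row whose key hashes to `token`, primary owner first."""
    tokens = sorted(ring)
    # The first node token >= the row token owns it; wrap around past the end.
    start = bisect_left(tokens, token) % len(tokens)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(rf)]


def primary_range(node, ring):
    """The (previous_token, own_token] range for which `node` is the primary."""
    tokens = sorted(ring)
    idx = next(i for i, t in enumerate(tokens) if ring[t] == node)
    return (tokens[idx - 1], tokens[idx])


def ranges_held(node, ring, rf):
    """Every range for which `node` stores a replica, primary or not."""
    tokens = sorted(ring)
    return [
        (tokens[i - 1], t)
        for i, t in enumerate(tokens)
        if node in replicas_for(t, ring, rf)
    ]


print(replicas_for(150, RING, 3))  # ['C', 'D', 'A']
print(primary_range("A", RING))    # (300, 0), wrapping around the ring
print(ranges_held("A", RING, 3))   # [(300, 0), (100, 200), (200, 300)]
```

In this toy ring with three replicas, node A holds three ranges but is the primary for only one of them, so a repair restricted to primary ranges on every node touches each range once instead of three times.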
-dc <specific_dc>, --in-dc <specific_dc>
The repair involves only the nodes in the specified datacenters.
-hosts <specific_host>, --in-hosts <specific_host>
The repair involves only the nodes in the specified host list.
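To tie the options together, here is a minimal sketch of a helper that assembles a nodetool repair command line from the flags listed in the help output. The function `repair_command` and its parameter names are hypothetical, and the datacenter and host values are placeholders. It only builds the argument list and does not run anything.

```python
import shlex


def repair_command(keyspace, tables=(), full=False, datacenters=(),
                   hosts=(), job_threads=None):
    """Assemble an argv list for `nodetool repair` using flags from the help text."""
    argv = ["nodetool", "repair"]
    if full:
        argv.append("-full")
    if job_threads is not None:
        # The help text documents a default of 1 and a maximum of 4.
        if not 1 <= job_threads <= 4:
            raise ValueError("job_threads must be between 1 and 4")
        argv += ["-j", str(job_threads)]
    for dc in datacenters:
        argv += ["-dc", dc]        # the flag is repeated once per datacenter
    for host in hosts:
        argv += ["-hosts", host]   # the flag is repeated once per host
    argv.append("--")              # separates options from keyspace and tables
    argv += [keyspace, *tables]
    return argv


argv = repair_command("mykeyspace", ["mytable"], full=True, datacenters=["dc1"])
print(shlex.join(argv))  # nodetool repair -full -dc dc1 -- mykeyspace mytable
```

Returning a list rather than a string keeps each argument separate, so the result could be handed to `subprocess.run` without shell quoting problems.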
This article is original content from the Yunqi Community (雲棲社區) and may not be reproduced without permission.