New in Kafka 1.1: Moving Data Between Log Directories

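  People often ask why disk usage on a production Kafka machine is so uneven across its disks, with one disk filling up while the others stay nearly empty. The reason is that Kafka only guarantees that the number of partitions is spread evenly across the disks. It has no idea how much space each partition actually occupies, so a few partitions holding a huge number of messages can easily consume most of one disk. Before version 1.1 users could do nothing about this, because Kafka only supported reassigning partition data between different brokers, not between different disks on the same broker. Version 1.1 officially supports moving replicas between log directories; see KIP-113 for the implementation details. This post gives a short demonstration of the new feature.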
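To make the cause concrete, here is a minimal Python sketch of count-based placement. It is not Kafka's code: the function names `place_partitions` and `bytes_per_dir` and all partition sizes are made up for illustration. The only assumption taken from the text is that a new partition goes to the directory holding the fewest partitions, with no regard for bytes.

```python
def place_partitions(log_dirs, partition_sizes):
    """Assign each partition to the log dir that holds the fewest partitions.

    partition_sizes maps partition id -> size in bytes (hypothetical numbers).
    Returns {log_dir: [partition ids]}.
    """
    layout = {d: [] for d in log_dirs}
    for partition in partition_sizes:
        # Only the partition count is consulted; the size is ignored.
        target = min(log_dirs, key=lambda d: len(layout[d]))
        layout[target].append(partition)
    return layout


def bytes_per_dir(layout, partition_sizes):
    """Sum the (hypothetical) partition sizes placed in each log dir."""
    return {d: sum(partition_sizes[p] for p in ps) for d, ps in layout.items()}


if __name__ == "__main__":
    sizes = {0: 900, 1: 10, 2: 10, 3: 800, 4: 10, 5: 10}  # made-up sizes
    layout = place_partitions(["kafka_1", "kafka_2", "kafka_3"], sizes)
    print(layout)
    print(bytes_per_dir(layout, sizes))
```

Every directory ends up with two partitions, yet almost all of the bytes land in the first one, which is exactly the lopsided picture described above.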

  Suppose I have configured several paths (representing several disks) in the Kafka broker's server.properties file, as shown below:

...

############################# Log Basics #############################

# A comma separated list of directories under which to store log files

log.dirs=/Users/huxi/SourceCode/newenv/datalogs/kafka_1,/Users/huxi/SourceCode/newenv/datalogs/kafka_2,/Users/huxi/SourceCode/newenv/datalogs/kafka_3

...

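  I then created a topic with 9 partitions and sent 9 million messages to it. Listing these directories shows that Kafka spread the 9 partitions evenly over the three paths, as shown below: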

ll kafka_1/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-3

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-4

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-5

ll kafka_2/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-0

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-1

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-2

ll kafka_3/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-6

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-7

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-8
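The listings above show where each partition lives but not how much space it takes. A small Python sketch like the following could report bytes per partition directory. The helper names `dir_size` and `partition_usage` are my own, and the script only relies on partition directories being named <topic>-<partition>, as seen in the listings.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def partition_usage(log_dirs, topic):
    """Return {log_dir: {partition_dir_name: bytes}} for one topic."""
    usage = {}
    for log_dir in log_dirs:
        per_partition = {}
        for entry in sorted(os.listdir(log_dir)):
            full = os.path.join(log_dir, entry)
            # Partition directories are named <topic>-<partition>.
            name, _, suffix = entry.rpartition("-")
            if name == topic and suffix.isdigit() and os.path.isdir(full):
                per_partition[entry] = dir_size(full)
        usage[log_dir] = per_partition
    return usage


if __name__ == "__main__":
    base = "/Users/huxi/SourceCode/newenv/datalogs"
    dirs = [os.path.join(base, d) for d in ("kafka_1", "kafka_2", "kafka_3")]
    for log_dir, parts in partition_usage(dirs, "test-topic").items():
        print(log_dir, sum(parts.values()), parts)
```

Matching on the exact topic name, rather than a substring as grep does, keeps directories of a similarly named topic such as test out of the totals.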

  Now suppose we want to move partitions 6, 7 and 8 of test-topic to the kafka_2 path and partition 1 of test-topic to kafka_1. To do this, we first need to write a JSON file, here named migrate-replica.json:

{"partitions":[{"topic": "test-topic","partition": 1,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_1"]},{"topic": "test-topic","partition": 6,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]},{"topic": "test-topic","partition": 7,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]},{"topic": "test-topic","partition": 8,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]}],"version":1}

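Here, the 0 in replicas is the broker ID. This post runs only a single broker with broker.id = 0, so writing 0 is enough. You can in fact list several brokers to move replicas on multiple brokers at the same time. Also, version is currently always 1.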
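Writing this JSON by hand gets error-prone with many partitions, so here is a minimal sketch that generates it. The function name `build_reassignment` and its parameters are hypothetical, and it covers only the single-replica case used in this post, with one log_dirs entry for the one replica.

```python
import json


def build_reassignment(topic, moves, broker_id=0):
    """Build the reassignment document for single-replica partitions.

    moves maps partition number -> target log directory on broker_id.
    """
    partitions = [
        {
            "topic": topic,
            "partition": partition,
            "replicas": [broker_id],
            "log_dirs": [log_dir],
        }
        for partition, log_dir in sorted(moves.items())
    ]
    return {"partitions": partitions, "version": 1}


if __name__ == "__main__":
    base = "/Users/huxi/SourceCode/newenv/datalogs"
    plan = build_reassignment(
        "test-topic",
        {
            1: base + "/kafka_1",
            6: base + "/kafka_2",
            7: base + "/kafka_2",
            8: base + "/kafka_2",
        },
    )
    # Write the same content as the migrate-replica.json file shown above.
    with open("migrate-replica.json", "w") as f:
        json.dump(plan, f)
    print(json.dumps(plan))
```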

After saving this JSON file, run the following command to carry out the replica move:

bin/kafka-reassign-partitions.sh  --zookeeper localhost:2181 --bootstrap-server localhost:9092 --reassignment-json-file ../migrate-replica.json --execute

Current partition replica assignment

 

{"version":1,"partitions":[{"topic":"test-topic","partition":8,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":4,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":5,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":2,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":6,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":3,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":7,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":0,"replicas":[0],"log_dirs":["any"]}]}

 

Save this to use as the --reassignment-json-file option during rollback

Successfully started reassignment of partitions.

Check the replica distribution across the paths again:

ll kafka_1/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-1

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-3

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-4

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-5

ll kafka_2/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-0

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-2

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-6

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-7

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-8

ll kafka_3/ |grep test-topic

<empty>

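Clearly, partitions 6, 7 and 8 have been moved to kafka_2 and partition 1 has been moved to kafka_1. It is worth noting that not only are all the log segments and index files moved, the checkpoint files that sit alongside the partition directories are updated as well. For example, the replication-offset-checkpoint file under kafka_2 now contains the offsets for partitions 6, 7 and 8, as shown below: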

cat replication-offset-checkpoint 

0

7

test-topic 8 1000000

test-topic 2 1000000

test 0 1285714

test-topic 6 1000000

test-topic 7 1000000

test-topic 0 1000000

test 2 1285714
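Judging from the output above, the file consists of a version line, an entry count, and then one "topic partition offset" line per entry. A minimal parsing sketch under that assumption follows; the function name `parse_offset_checkpoint` is my own, not a Kafka API.

```python
def parse_offset_checkpoint(text):
    """Parse checkpoint text into (version, {(topic, partition): offset}).

    Assumes the layout seen above: version, entry count, then one
    'topic partition offset' line per entry. Blank lines are ignored.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    version = int(lines[0])
    expected = int(lines[1])
    entries = {}
    for line in lines[2:]:
        topic, partition, offset = line.rsplit(None, 2)
        entries[(topic, int(partition))] = int(offset)
    if len(entries) != expected:
        raise ValueError(
            "expected %d entries, found %d" % (expected, len(entries))
        )
    return version, entries


if __name__ == "__main__":
    with open("replication-offset-checkpoint") as f:
        version, entries = parse_offset_checkpoint(f.read())
    for (topic, partition), offset in sorted(entries.items()):
        print(topic, partition, offset)
```

Running it against each log directory before and after the move gives a quick way to confirm which partitions a directory's checkpoint now covers.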

 

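That concludes this quick look at the 1.1 feature for moving replicas across log directories. I hope it helps anyone who has run into this problem.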
