Kafka in Practice with PHP

Kafka

Introduction

Kafka is a high-throughput distributed publish-subscribe messaging system.

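Kafka roles you need to know

producer: publishes messages.
consumer: consumes messages.
topic: messages are recorded by category. Kafka classifies its message feeds, and each category of messages is called a topic.
broker: Kafka runs as a cluster made up of one or more servers, each of which is called a broker. Consumers subscribe to one or more topics and pull data from the brokers to consume the published messages.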

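Classic model

1. Within a consumer group, the number of consumers of a topic should not exceed the topic's number of partitions; any extra consumers sit idle and are wasted.
2. A partition of a topic can be consumed at the same time by different consumer groups, by exactly one consumer in each group.
3. A partition of a topic can be consumed by only one consumer within the same consumer group.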
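The three rules above can be illustrated with a toy model. The following is a minimal sketch in plain PHP, not Kafka's actual assignor: the function name `assignPartitions` and the round-robin strategy are assumptions chosen only for illustration. It shows that with 2 partitions and 3 consumers in one group, one consumer receives nothing, while a second group gets its own independent view of the same partitions.

```php
<?php
/**
 * Toy round-robin assignment of partitions to the consumers of ONE group.
 * Hypothetical helper for illustration; real assignment is done by Kafka.
 *
 * @param int      $partitionCount number of partitions in the topic
 * @param string[] $consumers      consumer names within a single group
 * @return array<string, int[]>    consumer name => assigned partition ids
 */
function assignPartitions(int $partitionCount, array $consumers): array
{
    $assignment = array_fill_keys($consumers, []);
    if ($consumers === []) {
        return $assignment;
    }
    $consumers = array_values($consumers);
    for ($p = 0; $p < $partitionCount; $p++) {
        // Each partition goes to exactly one consumer of the group.
        $owner = $consumers[$p % count($consumers)];
        $assignment[$owner][] = $p;
    }
    return $assignment;
}

// 2 partitions, 3 consumers in the same group: c3 stays idle.
$groupA = assignPartitions(2, ['c1', 'c2', 'c3']);
// A second group gets its own independent assignment of the same partitions.
$groupB = assignPartitions(2, ['d1']);

print_r($groupA);
print_r($groupB);
```

In this model `c3` ends up with an empty list, which is the "wasted" consumer from rule 1, and `d1` in the other group reads both partitions.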


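Common parameters

request.required.acks

A Kafka producer has three acknowledgement modes. They are selected by setting request.required.acks to different values in the producer configuration when the producer is initialized.

0: the producer does not wait for any acknowledgement from the broker before sending the next message (or batch). This gives the lowest latency but the weakest durability guarantee: some data is lost when a server fails, for example when the leader has died without the producer knowing, so the broker never receives what was sent.

1: the producer sends the next message only after the leader has received the data and acknowledged it. This gives better durability because the client waits for the server to confirm the request, but a message written to a leader that dies before the message has been replicated is lost.

-1: a send counts as complete only after all in-sync follower replicas have confirmed that they received the data. This gives the best durability: no message is lost as long as at least one in-sync replica stays alive.

Across the three modes, performance (producer throughput) decreases in that order, while data robustness increases.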
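To make the effect of the acknowledgement mode visible from PHP, one option is to inspect the delivery report of each message. The sketch below assumes the php-rdkafka extension used later in this article; `describeDelivery` is a hypothetical helper name, and the wiring at the bottom only runs when the extension is loaded.

```php
<?php
/**
 * Turn a delivery report into a one-line summary.
 * Hypothetical helper; $message is expected to be an RdKafka\Message
 * (typed loosely so it can be exercised with a stub).
 */
function describeDelivery(object $message): string
{
    if ($message->err !== RD_KAFKA_RESP_ERR_NO_ERROR) {
        // The broker (or the client) reported a failure for this message.
        return sprintf('FAILED partition=%d: %s', $message->partition, $message->errstr());
    }
    return sprintf('OK partition=%d offset=%d', $message->partition, $message->offset);
}

if (extension_loaded('rdkafka')) {
    $conf = new RdKafka\Conf();
    // Log the outcome of every produced message.
    $conf->setDrMsgCb(function ($kafka, $message) {
        error_log(describeDelivery($message));
    });

    $topicConf = new RdKafka\TopicConf();
    // Wait for all in-sync replicas before a send counts as complete.
    $topicConf->set('request.required.acks', '-1');
}
```

With acks set to 1 or -1 the summary carries the offset assigned by the broker; with 0 there is no broker confirmation to report on.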

auto.offset.reset

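1. earliest: automatically reset the offset to the earliest offset.
2. latest: automatically reset the offset to the latest offset (default).
3. none: throw an exception to the consumer if no previous offset is found for the consumer group.
4. any other value: throw an exception to the consumer (invalid parameter).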
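The policy only matters when the group has no usable committed offset. As a rough illustration, here is a toy model in plain PHP; `resolveStartOffset` is a hypothetical name and the real decision is made inside the Kafka client, not in application code.

```php
<?php
/**
 * Toy model of how a consumer picks its starting offset.
 * Hypothetical helper; the real logic lives in the Kafka client.
 *
 * @param int|null $committed committed offset of the group, or null if none
 * @param int      $earliest  oldest offset still available in the partition
 * @param int      $latest    offset that the next produced message will get
 * @param string   $policy    value of auto.offset.reset
 */
function resolveStartOffset(?int $committed, int $earliest, int $latest, string $policy): int
{
    // A valid committed offset always wins; the policy is only a fallback.
    if ($committed !== null && $committed >= $earliest && $committed <= $latest) {
        return $committed;
    }
    switch ($policy) {
        case 'earliest':
            return $earliest;
        case 'latest':
            return $latest;
        case 'none':
            throw new RuntimeException('No valid committed offset for this group');
        default:
            throw new InvalidArgumentException("Invalid auto.offset.reset: $policy");
    }
}

echo resolveStartOffset(null, 5, 42, 'earliest'), PHP_EOL; // 5
echo resolveStartOffset(null, 5, 42, 'latest'), PHP_EOL;   // 42
echo resolveStartOffset(17, 5, 42, 'latest'), PHP_EOL;     // 17
```

The third call shows why changing auto.offset.reset has no visible effect for a group that has already committed offsets.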

Installing Kafka and a quick test

Install Kafka (no installation needed, just unpack the archive)

# Official download page: http://kafka.apache.org/downloads
# wget https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.12-1.1.1.tgz
tar -xzf kafka_2.12-1.1.1.tgz
cd kafka_2.12-1.1.1

Start the Kafka server

# ZooKeeper must be started first
# Pass -daemon to run in the background as a daemon
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

Test with the Kafka command-line clients

# Create a topic named test with 2 partitions
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test
Created topic "test".

# List all topics
bin/kafka-topics.sh --list --zookeeper localhost:2181
test

# Show the details of a topic
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Topic:test    PartitionCount:2    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
    Topic: test    Partition: 1    Leader: 0    Replicas: 0    Isr: 0


# Start a producer (type messages)
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
[wait for the > prompt, then type your own content]
>i am a new msg !
>i am a good msg ?

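# Start a consumer (waits for messages)
# Note --from-beginning: with it, reading starts from the beginning every time; try with and without it to see the difference
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
[waiting for messages]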
i am a new msg !
i am a good msg ?

Installing the Kafka PHP extension

# Install the librdkafka library first
git clone https://github.com/edenhill/librdkafka.git
cd librdkafka/
./configure 
make
sudo make install

git clone https://github.com/arnaud-lb/php-rdkafka.git
cd php-rdkafka
phpize
./configure
make all -j 5
sudo make install

vim [php]/php.ini
extension=rdkafka.so

PHP code in practice

Producer

<?php
$conf = new RdKafka\Conf();
$conf->setDrMsgCb(function ($kafka, $message) {
    file_put_contents("./dr_cb.log", var_export($message, true).PHP_EOL, FILE_APPEND);
});
$conf->setErrorCb(function ($kafka, $err, $reason) {
    file_put_contents("./err_cb.log", sprintf("Kafka error: %s (reason: %s)", rd_kafka_err2str($err), $reason).PHP_EOL, FILE_APPEND);
});

$rk = new RdKafka\Producer($conf);
$rk->setLogLevel(LOG_DEBUG);
$rk->addBrokers("127.0.0.1");

$cf = new RdKafka\TopicConf();
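// -1: wait until all in-sync replicas have acknowledged; 1: only the leader acknowledges; 0: no acknowledgement.
// With 0 the delivery callback receives no offset; with 1 and -1 the offset is returned.
// This can be used to confirm that a message was produced, though not with 100% certainty, because the Kafka server may go down midway.
$cf->set('request.required.acks', '0');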
$topic = $rk->newTopic("test", $cf);

$option = 'qkl';
for ($i = 0; $i < 20; $i++) {
    // RD_KAFKA_PARTITION_UA lets the partitioner choose the partition automatically
    // $option (the message key) is optional
    $topic->produce(RD_KAFKA_PARTITION_UA, 0, "qkl . $i", $option);
}


$len = $rk->getOutQLen();
while ($len > 0) {
    $len = $rk->getOutQLen();
    var_dump($len);
    $rk->poll(50);
}
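The polling loop above waits until the outgoing queue is empty. To my knowledge, php-rdkafka 4.0 and later also provide `flush()` on the producer, which does the same wait in a single call. The following is a sketch under that assumption; `flushOrFail` is a hypothetical helper name, and the producer argument is typed loosely so the helper can be tried with a stub.

```php
<?php
/**
 * Flush a producer's outgoing queue, retrying a few times.
 * Assumes php-rdkafka >= 4.0, where the producer has flush(int $timeoutMs).
 * Hypothetical helper; $producer is expected to be an RdKafka\Producer.
 */
function flushOrFail(object $producer, int $timeoutMs = 10000, int $maxRetries = 10): void
{
    for ($attempt = 0; $attempt < $maxRetries; $attempt++) {
        // flush() returns RD_KAFKA_RESP_ERR_NO_ERROR once the queue is empty.
        if ($producer->flush($timeoutMs) === RD_KAFKA_RESP_ERR_NO_ERROR) {
            return;
        }
    }
    throw new RuntimeException('Unable to flush; queued messages might be lost');
}
```

With the producer from the listing, the call would be `flushOrFail($rk);` placed after the `produce()` loop, in place of the `getOutQLen()` polling.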

Running the producer

php producer.php
# output

int(20)
int(20)
int(20)
int(20)
int(0)

# The consumer shell you started earlier should now print the messages
qkl . 0
qkl . 1
qkl . 2
qkl . 3
qkl . 4
qkl . 5
qkl . 6
qkl . 7
qkl . 8
qkl . 9
qkl . 10
qkl . 11
qkl . 12
qkl . 13
qkl . 14
qkl . 15
qkl . 16
qkl . 17
qkl . 18
qkl . 19

Low-level consumer

<?php
$conf = new RdKafka\Conf();
$conf->setDrMsgCb(function ($kafka, $message) {
    file_put_contents("./c_dr_cb.log", var_export($message, true), FILE_APPEND);
});
$conf->setErrorCb(function ($kafka, $err, $reason) {
    file_put_contents("./err_cb.log", sprintf("Kafka error: %s (reason: %s)", rd_kafka_err2str($err), $reason).PHP_EOL, FILE_APPEND);
});

// Set the consumer group
$conf->set('group.id', 'myConsumerGroup');

$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('request.required.acks', '1');
// Automatically commit offsets every auto.commit.interval.ms; enabling this is not recommended
//$topicConf->set('auto.commit.enable', '1');
$topicConf->set('auto.commit.enable', '0');
$topicConf->set('auto.commit.interval.ms', '100');

// Store offsets in a local file
//$topicConf->set('offset.store.method', 'file');
// Store offsets on the broker
$topicConf->set('offset.store.method', 'broker');
//$topicConf->set('offset.store.path', __DIR__);

// smallest: roughly "consume from the beginning"; equivalent to earliest above
// largest: roughly "consume from the newest message"; equivalent to latest above
//$topicConf->set('auto.offset.reset', 'smallest');

$topic = $rk->newTopic("test", $topicConf);

// First argument: consume partition 0
// RD_KAFKA_OFFSET_BEGINNING: start from the beginning of the partition
// RD_KAFKA_OFFSET_STORED: start from the last stored (consumed) offset
// RD_KAFKA_OFFSET_END: start from the end of the partition
$topic->consumeStart(0, RD_KAFKA_OFFSET_BEGINNING);
//$topic->consumeStart(0, RD_KAFKA_OFFSET_END);
//$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);

while (true) {
    // First argument: the partition to consume, here partition 0
    // Second argument: how long to block, in milliseconds
    $message = $topic->consume(0, 12 * 1000);
    if (is_null($message)) {
        sleep(1);
        echo "No more messages\n";
        continue;
    }
    switch ($message->err) {
        case RD_KAFKA_RESP_ERR_NO_ERROR:
            var_dump($message);
            break;
        case RD_KAFKA_RESP_ERR__PARTITION_EOF:
            echo "No more messages; will wait for more\n";
            break;
        case RD_KAFKA_RESP_ERR__TIMED_OUT:
            echo "Timed out\n";
            break;
        default:
            throw new \Exception($message->errstr(), $message->err);
            break;
    }
}

High-level consumer

<?php
/**
 * Created by PhpStorm.
 * User: qkl
 * Date: 2018/8/22
 * Time: 17:58
 */
$conf = new \RdKafka\Conf();

function rebalance(\RdKafka\KafkaConsumer $kafka, $err, array $partitions = null) {
    switch ($err) {
        case RD_KAFKA_RESP_ERR__ASSIGN_PARTITIONS:
            echo "Assign: ";
            var_dump($partitions);
            // Accept the partitions handed to this consumer by the group
            $kafka->assign($partitions);
//            $kafka->assign([new RdKafka\TopicPartition("qkl01", 0, 0)]);
            break;

        case RD_KAFKA_RESP_ERR__REVOKE_PARTITIONS:
            echo "Revoke: ";
            var_dump($partitions);
            $kafka->assign(NULL);
            break;

        default:
            throw new \Exception(rd_kafka_err2str($err), $err);
    }
}

// Set a rebalance callback to log partition assignments (optional)
$conf->setRebalanceCb(function(\RdKafka\KafkaConsumer $kafka, $err, array $partitions = null) {
    rebalance($kafka, $err, $partitions);
});

// Configure the group.id. All consumer with the same group.id will consume
// different partitions.
$conf->set('group.id', 'test-110-g100');

// Initial list of Kafka brokers
$conf->set('metadata.broker.list', '192.168.216.122');

$topicConf = new \RdKafka\TopicConf();

$topicConf->set('request.required.acks', '-1');
// Automatically commit offsets every auto.commit.interval.ms; enabling this is not recommended
$topicConf->set('auto.commit.enable', '0');
$topicConf->set('auto.commit.interval.ms', '100');

// Store offsets in a local file
$topicConf->set('offset.store.method', 'file');
$topicConf->set('offset.store.path', __DIR__);
// Store offsets on the broker
// $topicConf->set('offset.store.method', 'broker');

// Set where to start consuming messages when there is no initial offset in
// offset store or the desired offset is out of range.
// 'smallest': start from the beginning
$topicConf->set('auto.offset.reset', 'smallest');

// Set the configuration to use for subscribed/assigned topics
$conf->setDefaultTopicConf($topicConf);

$consumer = new \RdKafka\KafkaConsumer($conf);

//$KafkaConsumerTopic = $consumer->newTopic('qkl01', $topicConf);

// Subscribe to topic 'qkl01'
$consumer->subscribe(['qkl01']);

echo "Waiting for partition assignment... (may take some time when\n";
echo "quickly re-joining the group after leaving it.)\n";

while (true) {
    $message = $consumer->consume(120*1000);
    switch ($message->err) {
        case RD_KAFKA_RESP_ERR_NO_ERROR:
            var_dump($message);
//            $consumer->commit($message);
//            $KafkaConsumerTopic->offsetStore(0, 20);
            break;
        case RD_KAFKA_RESP_ERR__PARTITION_EOF:
            echo "No more messages; will wait for more\n";
            break;
        case RD_KAFKA_RESP_ERR__TIMED_OUT:
            echo "Timed out\n";
            break;
        default:
            throw new \Exception($message->errstr(), $message->err);
            break;
    }
}
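The commented-out `$consumer->commit($message)` in the loop above hints at manual commits. One way to use it is to commit only after the application has handled the message successfully. The sketch below assumes auto commit is disabled; `processAndCommit` and the `$handler` callback are hypothetical names introduced for illustration.

```php
<?php
/**
 * Process one consumed message and commit its offset only on success.
 * Hypothetical helper; $consumer is expected to be an RdKafka\KafkaConsumer
 * and $message an RdKafka\Message (both typed loosely for testing).
 * $handler is an application callback that returns true on success.
 */
function processAndCommit(object $consumer, object $message, callable $handler): bool
{
    if ($message->err !== RD_KAFKA_RESP_ERR_NO_ERROR) {
        return false; // EOF, timeout or a real error: nothing to commit
    }
    if (!$handler($message)) {
        return false; // offset stays uncommitted, so the message can be delivered again later
    }
    $consumer->commit($message); // synchronous commit of this message's offset
    return true;
}
```

Inside the `while (true)` loop this could replace the `var_dump($message)` branch, for example `processAndCommit($consumer, $message, function ($m) { var_dump($m->payload); return true; });`. Committing per message is simple but slow; batching commits is a common trade-off.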

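A note on consumer groups

Note in particular that the Kafka server records a consumer group only when it is set by a high-level consumer; a group set by a low-level consumer is not recorded by the server.

For how to inspect consumer group information, see this article.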

Inspecting server metadata (topic/partition/broker)

<?php

$conf = new RdKafka\Conf();
$conf->setDrMsgCb(function ($kafka, $message) {
    file_put_contents("./xx.log", var_export($message, true), FILE_APPEND);
});
$conf->setErrorCb(function ($kafka, $err, $reason) {
    printf("Kafka error: %s (reason: %s)\n", rd_kafka_err2str($err), $reason);
});

$conf->set('group.id', 'myConsumerGroup');

$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

// Fetch metadata for all topics; the timeout is an integer number of milliseconds
$allInfo = $rk->metadata(true, NULL, 60 * 1000);

$topics = $allInfo->getTopics();

echo rd_kafka_offset_tail(100);
echo "--";

echo count($topics);
echo "--";


foreach ($topics as $topic) {

    $topicName = $topic->getTopic();
    if ($topicName == "__consumer_offsets") {
        continue ;
    }

    $partitions = $topic->getPartitions();
    foreach ($partitions as $partition) {
//        $rf = new ReflectionClass(get_class($partition));
//        foreach ($rf->getMethods() as $f) {
//            var_dump($f);
//        }
//        die();
        $topPartition = new RdKafka\TopicPartition($topicName, $partition->getId());
        echo  "Current topic: " . ($topPartition->getTopic()) . " - " . $partition->getId() . " - ";
        echo  "offset:" . ($topPartition->getOffset()) . PHP_EOL;
    }
}
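The offset printed by the listing above comes from a newly constructed `TopicPartition`, so it does not appear to reflect anything stored on the broker. To see the actual range of offsets in a partition, php-rdkafka offers `queryWatermarkOffsets()`. The helper below is a sketch; `partitionWatermarks` is a hypothetical name, and the 1000 ms default timeout is an arbitrary choice.

```php
<?php
/**
 * Return [low, high] watermark offsets of one partition.
 * Hypothetical helper; $rk is expected to be an RdKafka\Consumer or
 * RdKafka\Producer (typed loosely so a stub can be used in tests).
 */
function partitionWatermarks(object $rk, string $topic, int $partition, int $timeoutMs = 1000): array
{
    $low = 0;
    $high = 0;
    // queryWatermarkOffsets() fills $low and $high by reference.
    $rk->queryWatermarkOffsets($topic, $partition, $low, $high, $timeoutMs);
    return [$low, $high];
}
```

Inside the inner `foreach` of the listing, a call such as `[$low, $high] = partitionWatermarks($rk, $topicName, $partition->getId());` would give the oldest available offset and the offset the next message will receive.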

Producing and consuming from a remote host

vim config/server.properties
advertised.listeners=PLAINTEXT://ip:9092
# ip is the public (external) IP of your Kafka server

A packaged php-rdkafka class library to share

https://github.com/qkl9527/php-rdkafka-class

