Sematext Monitoring is one of the most comprehensive Kafka monitoring solutions, capturing some 200 Kafka metrics, including Kafka Broker, Producer, and Consumer metrics. While lots of those metrics are useful, there is one particular metric everyone wants to monitor – Consumer Lag.
Kafka Consumer Lag is the indicator of how much lag there is between Kafka producers and consumers. When people talk about Kafka they are typically referring to Kafka Brokers. You can think of a Kafka Broker as a Kafka server. A Broker is what actually stores and serves Kafka messages. Kafka Producers are applications that write messages into Kafka (Brokers). Kafka Consumers are applications that read messages from Kafka (Brokers).
Inside Brokers data is stored in one or more Topics, and each Topic consists of one or more Partitions. When writing data a Broker actually writes it into a specific Partition. As it writes data it keeps track of the last 「write position」 in each Partition. This is called Latest Offset also known as Log End Offset. Each Partition has its own independent Latest Offset.
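To make the bookkeeping concrete, here is a toy Python model (not the real Kafka client or Broker code) of a Topic whose Partitions each track their own Latest Offset, which advances as messages are appended:

```python
# Toy model of a Topic and its Partitions. Appending a message to a
# Partition advances that Partition's Latest Offset (Log End Offset);
# each Partition's offset is independent of the others.

class Partition:
    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)

    @property
    def latest_offset(self):
        # Offset where the next message will be written.
        return len(self.messages)

class Topic:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

topic = Topic(num_partitions=2)
topic.partitions[0].append("m1")
topic.partitions[0].append("m2")
topic.partitions[1].append("m3")

print(topic.partitions[0].latest_offset)  # 2
print(topic.partitions[1].latest_offset)  # 1
```

The class and attribute names here are invented for illustration; real Brokers of course persist messages in segment files rather than in-memory lists.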
Just like Brokers keep track of their write position in each Partition, each Consumer keeps track of 「read position」 in each Partition whose data it is consuming. That is, it keeps track of which data it has read. This is known as Consumer Offset. This Consumer Offset is periodically persisted (to ZooKeeper or a special Topic in Kafka itself) so it can survive Consumer crashes or unclean shutdowns and avoid re-consuming too much old data.
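The read-position bookkeeping can be sketched the same way. This is an illustrative toy, assuming an in-memory log and a made-up `Consumer` class; in real Kafka the commit goes to the `__consumer_offsets` Topic (or ZooKeeper in older versions):

```python
# Illustrative sketch: a Consumer tracks a read position (Consumer Offset)
# per Partition and periodically commits it, so a restarted Consumer
# resumes from the last committed offset instead of re-reading everything.

class Consumer:
    def __init__(self, committed=None):
        # committed maps partition id -> last committed Consumer Offset
        self.position = dict(committed or {})

    def poll(self, partition_id, log):
        # Read everything from the current position to the end of the log.
        start = self.position.get(partition_id, 0)
        batch = log[start:]
        self.position[partition_id] = len(log)  # advance the read position
        return batch

    def commit(self):
        # Stand-in for persisting offsets to Kafka/ZooKeeper.
        return dict(self.position)

log = ["m0", "m1", "m2"]
c = Consumer()
print(c.poll(0, log))   # ['m0', 'm1', 'm2']
saved = c.commit()

# Simulate a crash, then restart from the committed offsets:
c2 = Consumer(committed=saved)
log.append("m3")
print(c2.poll(0, log))  # ['m3'] -- old data is not re-consumed
```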
Kafka Consumer Lag and Read/Write Rates
In our diagram above we can see yellow bars, which represent the rate at which Brokers are writing messages created by Producers. The orange bars represent the rate at which Consumers are consuming messages from Brokers. The rates look roughly equal – and they need to be, otherwise the Consumers will fall behind. However, there is always going to be some delay between the moment a message is written and the moment it is consumed. Reads are always going to be lagging behind writes, and that is what we call Consumer Lag. The Consumer Lag is simply the delta between the Latest Offset and the Consumer Offset.
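That delta is trivial to compute once you have both offsets. A minimal sketch (function name and sample offsets are invented for illustration):

```python
# Consumer Lag per Partition is simply Latest Offset minus Consumer Offset.

def consumer_lag(latest_offsets, consumer_offsets):
    # A Partition with no committed Consumer Offset is treated as lagging
    # by its entire Latest Offset.
    return {p: latest_offsets[p] - consumer_offsets.get(p, 0)
            for p in latest_offsets}

latest = {0: 1500, 1: 1200}    # per-Partition Latest (Log End) Offsets
consumed = {0: 1490, 1: 1200}  # per-Partition committed Consumer Offsets

print(consumer_lag(latest, consumed))  # {0: 10, 1: 0}
```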
Many applications today are based on being able to process (near) real-time data. Think of a performance monitoring system like Sematext Monitoring or a log management service like Sematext Logs. They continuously process infinite streams of near real-time data. If they were to show you metrics or logs with too much delay – if the Consumer Lag were too big – they’d be nearly useless. The Consumer Lag tells us how far behind each Consumer (Group) is in each Partition. The smaller the lag, the more real-time the data consumption.
Kafka Consumer Lag and Broker Offset Changes
As we just learned the delta between the Latest Offset and the Consumer Offset is what gives us the Consumer Lag. In the above chart from Sematext you may have noticed a few other metrics:
The rate metrics are derived metrics. If you look at Kafka’s metrics you won’t find them there. Under the hood the open source Sematext agent collects a few Kafka metrics with various offsets from which these rates are computed. In addition, it charts Broker Earliest Offset Changes, where the Earliest Offset is the earliest known offset in each Broker’s Partition. Put another way, this offset is the offset of the oldest message in a Partition. While this offset alone may not be super useful, knowing how it’s changing could be handy when things go awry. Data in Kafka has a certain TTL (Time To Live) to allow for easy purging of old data. This purging is performed by Kafka itself. Every time such purging kicks in, the offset of the oldest data changes. Sematext’s Broker Earliest Offset Change surfaces this information for your monitoring pleasure. This metric gives you an idea of how often purges are happening and how many messages they’ve removed each time they ran.
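The purge-and-advance behavior can be sketched in a few lines. This is a toy model of TTL-based retention (Kafka actually deletes whole log segments, not individual messages), with invented names and timestamps:

```python
# Toy retention purge: dropping messages older than the TTL advances the
# Partition's Earliest Offset -- the offset of the oldest surviving message.

def purge(messages, earliest_offset, now, ttl):
    # messages is a list of (timestamp, payload), oldest first. Expired
    # entries are removed from the front, and the Earliest Offset moves
    # forward by the number of messages removed.
    expired = 0
    while expired < len(messages) and now - messages[expired][0] > ttl:
        expired += 1
    return messages[expired:], earliest_offset + expired

msgs = [(100, "a"), (200, "b"), (900, "c")]
kept, earliest = purge(msgs, earliest_offset=40, now=1000, ttl=300)

print(kept)      # [(900, 'c')]
print(earliest)  # 42
```

Charting that returned Earliest Offset over time is what reveals how often purges run and how much each one removed.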
There are several Kafka monitoring tools out there, like LinkedIn’s Burrow, whose Kafka Offset monitoring and Consumer Lag monitoring approach is used in Sematext. We’ve written about various open source monitoring tools in Kafka Open Source Monitoring Tools. If you need a good Kafka monitoring solution, give Sematext a go. Ship your Kafka and other logs into Sematext Logs and you’ve got yourself a DevOps solution that will make troubleshooting easy instead of dreadful.