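Host monitoring tracks the performance of the machines that host the Brokers of a Kafka cluster. Common host-level metrics include:
(1) Machine load (Load)
(2) CPU utilization
(3) Memory utilization, including free memory (Free Memory) and used memory (Used Memory)
(4) Disk I/O utilization, including read utilization and write utilization
(5) Network I/O utilization
(6) Number of TCP connections
(7) Number of open files
(8) inode usage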
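As a rough illustration of how a few of these host metrics can be sampled, the sketch below uses only the Python standard library on a Unix-like host. The function name collect_host_metrics and the choice of the root filesystem are assumptions made for this example; a production setup would normally rely on an existing agent instead.

```python
import os
import shutil


def collect_host_metrics(path="/"):
    """Sample load, disk usage and inode usage for one filesystem (Unix only)."""
    load1, load5, load15 = os.getloadavg()   # 1-, 5- and 15-minute load averages
    disk = shutil.disk_usage(path)           # total / used / free bytes
    vfs = os.statvfs(path)                   # filesystem statistics, including inodes
    inode_total = vfs.f_files
    inode_used = inode_total - vfs.f_ffree
    return {
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_ratio": disk.used / disk.total if disk.total else 0.0,
        # Some filesystems report zero inodes; guard against division by zero.
        "inode_used_ratio": inode_used / inode_total if inode_total else 0.0,
    }


if __name__ == "__main__":
    for name, value in collect_host_metrics().items():
        print(f"{name}: {value:.3f}")
```

Running the script on a Broker host prints one line per metric; the ratios are fractions between 0 and 1.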
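A Kafka Broker is an ordinary Java process, so every JVM monitoring technique also applies to it. The items to watch are:
(1) Frequency and duration of Full GC, used to assess the impact of Full GC on the Broker process. Long pauses cause the Broker to throw various timeout exceptions.
(2) Size of live objects. This is an important basis for setting the heap size and helps with fine-grained tuning of the size of each JVM generation.
(3) Total number of application threads, which shows how the Broker process uses the CPU.
A sample line from the Broker GC log:
2019-07-30T09:13:03.809+0800: 552.982: [GC cleanup 827M->645M(1024M), 0.0019078 secs]
The Broker JVM uses the G1 collector by default. In the sample above, after the cleanup step ended, the size of live objects on the heap shrank from 827 MB to 645 MB. G1 has been the default collector since Kafka 0.9.0.0, and Full GC in G1 is executed by a single thread and is very slow. The Broker GC logs, that is, the files whose names start with kafkaServer-gc.log, therefore need to be monitored. If the Broker process performs Full GC frequently, turn on G1's -XX:+PrintAdaptiveSizePolicy switch so that the JVM reports what triggered the Full GC.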
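A minimal sketch of extracting heap sizes and pause time from a GC log line of the shape shown above. The regular expression is written against that single sample line; other GC log formats (for example unified logging in newer JDKs) are not covered, and the function name parse_gc_line is an assumption for this example.

```python
import re

# Written against lines such as:
# 2019-07-30T09:13:03.809+0800: 552.982: [GC cleanup 827M->645M(1024M), 0.0019078 secs]
GC_LINE = re.compile(
    r"\[(?P<event>[^\]]*?)\s+"
    r"(?P<before>\d+)(?P<bu>[KMG])->(?P<after>\d+)(?P<au>[KMG])"
    r"\((?P<total>\d+)(?P<tu>[KMG])\),\s+(?P<secs>[\d.]+) secs\]"
)

UNIT_TO_MB = {"K": 1 / 1024, "M": 1, "G": 1024}


def parse_gc_line(line):
    """Return event name, heap sizes in MB and pause seconds, or None if no match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    event = m.group("event")
    return {
        "event": event,
        "is_full_gc": event.startswith("Full GC"),
        "before_mb": int(m.group("before")) * UNIT_TO_MB[m.group("bu")],
        "after_mb": int(m.group("after")) * UNIT_TO_MB[m.group("au")],
        "total_mb": int(m.group("total")) * UNIT_TO_MB[m.group("tu")],
        "pause_secs": float(m.group("secs")),
    }


if __name__ == "__main__":
    sample = ("2019-07-30T09:13:03.809+0800: 552.982: "
              "[GC cleanup 827M->645M(1024M), 0.0019078 secs]")
    print(parse_gc_line(sample))
```

Counting the parsed records whose is_full_gc flag is set, per time window, gives the Full GC frequency; summing pause_secs gives the total pause time.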
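(1) Check whether the Broker process has started and whether its port is listening. In containerized Kafka environments where the Broker is started with Docker, the container may start successfully while a network misconfiguration leaves the process running without a listening port.
(2) Check the key Broker logs: the server log server.log, the controller log controller.log, and the topic-partition state change log state-change.log.
(3) Check the running state of the key Broker threads. A Kafka Broker process starts a dozen or even dozens of threads. In production, two kinds deserve particular attention: the Log Compaction threads, whose names start with kafka-log-cleaner-thread and which perform log compaction, and the replica fetcher threads, whose names usually start with ReplicaFetcherThread and which run the logic by which Follower replicas pull messages from the Leader replica.
(4) Check the key Broker JMX metrics.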
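BytesIn/BytesOut: bytes per second entering and leaving the Broker. If the value approaches the network bandwidth, packet loss becomes likely.
NetworkProcessorAvgIdlePercent: average idle ratio of the threads in the network thread pool. The value should stay above 30% over the long term. A value below 30% means the network thread pool is very busy, and the Broker needs to be relieved by adding network threads or moving load to other servers.
RequestHandlerAvgIdlePercent: average idle ratio of the threads in the I/O thread pool. If the value stays below 30% over the long term, adjust the size of the I/O thread pool or reduce the load on the Broker.
UnderReplicatedPartitions: number of under-replicated partitions, that is, partitions for which not all Follower replicas are in sync with the Leader replica.
ISRShrink/ISRExpand: how often the ISR shrinks and expands. If replicas frequently move in and out of the ISR in production, these values will be high. Diagnose why replicas keep entering and leaving the ISR and take appropriate measures.
ActiveControllerCount: number of currently active controllers. Normally the value is 1 on the Broker that hosts the Controller and 0 on every other Broker. If several Brokers report 1, the Kafka cluster has a split brain and must be dealt with as soon as possible, mainly by checking network connectivity. Split brain is a very serious distributed-system failure. Kafka currently relies on ZooKeeper to prevent it; once it occurs, Kafka cannot guarantee normal operation.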
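(5) Monitor the Kafka clients. The first thing to watch is the network round-trip time (RTT) between the client machine and the Kafka Broker machines. For producers, the thread whose name starts with kafka-producer-network-thread does the actual message sending; if it dies, the Producer can no longer work, yet the Producer process itself does not exit. For consumers, the heartbeat thread whose name starts with kafka-coordinator-heartbeat-thread is critical to Rebalance.
On the Producer side, the JMX metric to watch is request-latency, the latency of produce requests, which most directly reflects the TPS of the Producer program. On the Consumer side, records-lag and records-lead are the two important JMX metrics. When a Consumer Group is used, also watch the join rate and sync rate metrics, which indicate how frequently Rebalance occurs.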
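The thresholds described above can be turned into a simple rule check. The sketch below assumes the metric values have already been collected into a dictionary per Broker; that input shape, the function name evaluate_cluster, and the assumption that idle ratios are expressed as fractions between 0 and 1 are all choices made for this illustration.

```python
def evaluate_cluster(brokers, idle_threshold=0.30):
    """Return alert strings for one snapshot of Broker JMX values.

    `brokers` maps a broker id to a dict of metric name -> value
    (a hypothetical input shape used only in this sketch).
    """
    alerts = []

    # Exactly one Broker in the cluster should report ActiveControllerCount == 1.
    controllers = sum(m.get("ActiveControllerCount", 0) for m in brokers.values())
    if controllers != 1:
        alerts.append(f"cluster: expected 1 active controller, found {controllers}")

    for broker_id in sorted(brokers):
        m = brokers[broker_id]
        # Both thread pools should keep an idle ratio above the threshold.
        for name in ("NetworkProcessorAvgIdlePercent", "RequestHandlerAvgIdlePercent"):
            value = m.get(name)
            if value is not None and value < idle_threshold:
                alerts.append(
                    f"broker {broker_id}: {name}={value:.2f} is below {idle_threshold:.2f}"
                )
        urp = m.get("UnderReplicatedPartitions", 0)
        if urp > 0:
            alerts.append(f"broker {broker_id}: {urp} under-replicated partitions")
    return alerts


if __name__ == "__main__":
    snapshot = {
        1: {"ActiveControllerCount": 1, "NetworkProcessorAvgIdlePercent": 0.82,
            "RequestHandlerAvgIdlePercent": 0.75, "UnderReplicatedPartitions": 0},
        2: {"ActiveControllerCount": 0, "NetworkProcessorAvgIdlePercent": 0.12,
            "RequestHandlerAvgIdlePercent": 0.64, "UnderReplicatedPartitions": 3},
    }
    for alert in evaluate_cluster(snapshot):
        print(alert)
```

Because the text refers to values that stay low over the long term, a real check would evaluate several consecutive snapshots rather than a single one.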
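JMX (Java Management Extensions) can manage and monitor running Java programs. It is used to manage threads, memory, log levels, service restarts, the system environment, and so on.
There are three ways to open the JMX port:
(1) Set JMX_PORT when starting Kafka:
export JMX_PORT=9999
kafka-server-start.sh -daemon config/server.properties
(2) Modify kafka-run-class.sh
Add the following line at the beginning of kafka-run-class.sh:
JMX_PORT=9999
Restart the Kafka cluster after modifying kafka-run-class.sh.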
(3) Enable JMX for a Kafka Docker container service
Add the KAFKA_JMX_OPTS and JMX_PORT environment variables to the docker-compose.yml file of the Kafka container service:
KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=192.168.0.105 -Dcom.sun.management.jmxremote.rmi.port=9999"
JMX_PORT: 9999
Expose the corresponding JMX port:
ports:
  - "9999:9999" # expose the port externally
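Once the port is exposed, it is worth confirming that it actually accepts TCP connections, since a container can be running without its port being reachable. The sketch below is one way to do that with the standard library; the function name is_port_open is an assumption, and the host and port in the usage section are the ones used in this article.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


if __name__ == "__main__":
    host, port = "192.168.0.105", 9999
    state = "reachable" if is_port_open(host, port) else "NOT reachable"
    print(f"JMX port {host}:{port} is {state}")
```

A successful TCP connection only shows that something is listening; it does not prove that the JMX/RMI handshake will succeed, which also depends on the java.rmi.server.hostname setting.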
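Monitoring Broker and Topic data requires JMX_PORT to be set. The variable is often defined inside the kafka-run-class.sh script, but once it is defined there, the tool scripts in the bin directory fail when executed. The reason is that kafka-run-class.sh is called by the other scripts: every such call starts a JVM that tries to bind JMX_PORT, and the port is already in use.
The solution is to specify JMX_PORT only when starting Kafka.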
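(1) When Kafka is started by supervisor, add environment=JMX_PORT=9999 to the supervisor service configuration file.
(2) When Kafka is started with the kafka-server-start.sh script, run export JMX_PORT=9999 before starting, or set the variable inside kafka-server-start.sh.
(3) Modify the kafka-run-class.sh script
Modify the bin/kafka-run-class.sh file under the Kafka installation directory: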
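JmxTool is a tool from the Kafka community that displays Kafka JMX metrics in real time. It is invoked as:
kafka-run-class.sh kafka.tools.JmxTool
Its main options are:
--attributes: the JMX attribute names to query, as a comma-separated list.
--date-format: the date format used in the output.
--jmx-url: the JMX endpoint to connect to. The default format is service:jmx:rmi:///jndi/rmi://:<JMX port>/jmxrmi.
--object-name: the name of the JMX MBean to query.
--reporting-interval: the interval between queries, in milliseconds. The default is 2000 (2 s).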
The following command queries, once per second, the Broker's inbound byte rate (BytesInPerSec) averaged over the past minute:
kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec --jmx-url service:jmx:rmi:///jndi/rmi://:9999/jmxrmi --date-format "yyyy-MM-dd HH:mm:ss" --attributes OneMinuteRate --reporting-interval 1000
The following command displays the ActiveControllerCount JMX metric:
kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.controller:type=KafkaController,name=ActiveControllerCount --jmx-url service:jmx:rmi:///jndi/rmi://:9999/jmxrmi --date-format "yyyy-MM-dd HH:mm:ss" --reporting-interval 1000
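When the same kind of query is issued for many MBeans, it can help to assemble the JmxTool command line programmatically. The sketch below only builds the argument list from the options described above; the helper names jmx_service_url and build_jmxtool_command are assumptions for this example, and the script is expected to be on the PATH as in the commands shown.

```python
import shlex


def jmx_service_url(port, host=""):
    """Build the JMX service URL in the format used by JmxTool's --jmx-url option."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


def build_jmxtool_command(object_name, port, attributes=None, interval_ms=1000, host=""):
    """Return the JmxTool invocation as an argument list suitable for subprocess.run."""
    cmd = [
        "kafka-run-class.sh", "kafka.tools.JmxTool",
        "--object-name", object_name,
        "--jmx-url", jmx_service_url(port, host),
        "--date-format", "yyyy-MM-dd HH:mm:ss",
        "--reporting-interval", str(interval_ms),
    ]
    if attributes:
        # JmxTool expects the attribute names as one comma-separated value.
        cmd += ["--attributes", ",".join(attributes)]
    return cmd


if __name__ == "__main__":
    cmd = build_jmxtool_command(
        "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
        port=9999,
        attributes=["OneMinuteRate"],
    )
    print(shlex.join(cmd))  # shell-quoted form, for copy and paste
```

Passing the list directly to subprocess.run avoids shell quoting problems with the space inside the date format.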
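Kafka Manager is a Kafka monitoring framework open-sourced by Yahoo in 2015. It is written in Scala and is mainly used to manage and monitor Kafka clusters.
Kafka Manager has since been renamed CMAK (Cluster Manager for Apache Kafka).
GitHub address:
https://github.com/yahoo/CMAK
Kafka Manager Docker image: kafkamanager/kafka-manager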
To enable basic authentication for Kafka Manager, set the following environment variables:
KAFKA_MANAGER_AUTH_ENABLED: "true"
KAFKA_MANAGER_USERNAME: username
KAFKA_MANAGER_PASSWORD: password
The docker-compose.yml file for deploying the Kafka Manager service is as follows:
# define the kafka-manager service
kafka-manager-test:
  image: kafkamanager/kafka-manager # kafka-manager image
  restart: always
  container_name: kafka-manager-test
  hostname: kafka-manager-test
  ports:
    - "9000:9000" # exposed port for web access
  depends_on:
    - kafka-test # dependency
  environment:
    ZK_HOSTS: zookeeper-test:2181 # ZooKeeper host
    KAFKA_BROKERS: kafka-test:9090 # Kafka Broker
    KAFKA_MANAGER_AUTH_ENABLED: "true"
    KAFKA_MANAGER_USERNAME: admin
    KAFKA_MANAGER_PASSWORD: password
Start the Kafka Manager service and log in to the Kafka Manager web UI.
Web address: http://127.0.0.1:9000
Add the Kafka Broker nodes to be managed by Kafka Manager:
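A monitoring stack commonly combines JMXTrans + InfluxDB + Grafana. Because Grafana can chart the JMX metrics collected this way, it is easy to integrate the various Kafka JMX metrics. Companies that already use a JMXTrans + InfluxDB + Grafana monitoring solution can reuse the existing framework directly, which greatly reduces operations cost.
Control Center monitors Kafka clusters in real time and also helps operate and build real-time stream processing applications based on Kafka. Control Center is not free; it is available only with the enterprise edition of Confluent Platform.
JConsole (Java Monitoring and Management Console) is a JMX-based graphical monitoring and management tool. It provides Overview, Memory, Threads, Classes, VM Summary, and MBeans views.
Run jconsole in a Linux terminal. In the Remote Process field of the window that opens, enter service:jmx:rmi:///jndi/rmi://192.168.0.105:9999/jmxrmi or 192.168.0.105:9999.
Then select the MBeans tab.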
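JMXTrans is a collector that gathers data from Java applications through JMX. Any Java application that opens a JMX port can be collected from.
JMXTrans runs as a background daemon and collects data once every minute.
GitHub address: https://github.com/jmxtrans/jmxtrans
Pull the JMXTrans Docker image: docker pull jmxtrans/jmxtrans
By default JMXTrans reads every data source configuration file (JSON files) in the /var/lib/jmxtrans directory, fetches data from the data sources in real time, parses it, and stores it in InfluxDB.
A JMXTrans JSON configuration file looks like this: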
{
  "servers": [{
    "port": "9901",
    "host": "192.168.0.105",
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr": ["MeanRate", "OneMinuteRate", "FiveMinuteRate", "FifteenMinuteRate"],
      "resultAlias": "kafkaServer",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://192.168.0.105:8086/",
        "username": "admin",
        "password": "123456",
        "database": "jmx",
        "tags": { "application": "kafka_server" }
      }]
    }]
  }]
}
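servers: array, the data source configuration.
port: string, the JMX port of the monitored Java process.
host: string, the IP address of the monitored Java process.
queries: array, the metrics to monitor. Several metrics can be listed in JSON format; the metric names can be looked up with jconsole (a tool shipped with the JDK).
obj: string, the name of the monitored metric (the MBean ObjectName).
attr: array, the attributes of the metric to store; they become the field names in the target table.
resultAlias: string, the table (measurement) name in InfluxDB.
outputWriters: array, the data destinations.
@class: string, the class of the data destination.
url: string, the URL of the data destination (InfluxDB).
username: string, the InfluxDB login name.
password: string, the InfluxDB login password.
database: string, the InfluxDB database name (it must be created in advance).
tags: JSON object, used to avoid clashes between the fields that different metrics map to in the InfluxDB table.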
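The Kafka JMX metrics can be looked up with jconsole.
For the BytesInPerSec metric, find BytesInPerSec on the MBeans tab of jconsole.
The ObjectName value is the value to use for obj.
The attributes of the ObjectName are the values to use for "attr"; one or more can be selected.
resultAlias gives the metric name, which becomes the measurement name in InfluxDB.
"tags" maps to the tag feature of InfluxDB and distinguishes different metrics stored in the same measurement.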
{
  "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
  "attr": ["Count", "EventType", "RateUnit", "OneMinuteRate"],
  "resultAlias": "BytesInPerSec",
  "outputWriters": [{
    "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
    "url": "http://192.168.0.105:8086/",
    "username": "admin",
    "password": "123456",
    "database": "jmx",
    "tags": { "application": "BytesInPerSec" }
  }]
}
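For global monitoring, each metric maps to one InfluxDB measurement, and all Kafka nodes write the data for the same metric to the same measurement. For Topic metrics, all Kafka nodes write the data for the same Topic to the same measurement, which is named after the Topic.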
{ "servers" : [ { "port" : "9999", "host" : "192.168.0.105", "queries" : [ { "obj" : "java.lang:type=Memory", "attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ], "resultAlias":"jvmMemory", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : 
"http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"kafkaServer", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions", "attr" : [ "Value" ], "resultAlias":"underReplicated", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.controller:type=KafkaController,name=ActiveControllerCount", "attr" : [ "Value" ], "resultAlias":"activeController", "outputWriters" : [ { "@class" : 
"com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "java.lang:type=OperatingSystem", "attr" : [ "FreePhysicalMemorySize","SystemCpuLoad","ProcessCpuLoad","SystemLoadAverage" ], "resultAlias":"jvmMemory", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] } ,{ "obj" : "kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent", "attr" : [ "Value" ], "resultAlias":"network", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent", "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ], "resultAlias":"network", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] },{ "obj" : "java.lang:type=GarbageCollector,name=G1 Young Generation", "attr" : [ "CollectionCount","CollectionTime" ], "resultAlias":"gc", "outputWriters" : [ { "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url" : "http://192.168.0.105:8086/", "username" : "admin", "password" : "123456", "database" : "jmx", "tags" : {"application" : "kafka_server"} } ] }] } ] }
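JMX connects over the network, so JMXTrans can be deployed in two ways:
(1) Centralized. Deploy JMXTrans on one server, connect it to every Kafka Broker instance, and write the data to InfluxDB. To reduce network transfer, it is usually deployed on the server that hosts InfluxDB.
(2) Distributed. Deploy one JMXTrans per Kafka Broker instance.
JMXTrans configuration files are divided into global metrics (per Kafka node) and Topic metrics. Global metrics use one configuration file per node, named kafka-brokerxx.json. Topic metrics use one configuration file per Topic, named TopicName.json.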
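Since every query in a JMXTrans configuration carries the same outputWriters block, the per-node files can be generated rather than written by hand. The sketch below builds such a configuration as a Python dictionary and serializes it to JSON. The helper names are hypothetical, and reading "xx" in kafka-brokerxx.json as a two-digit broker number is an assumption made for this example.

```python
import json


def influx_writer(url, username, password, database, application):
    """Build one InfluxDB output writer block as used in the JMXTrans config."""
    return {
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": url,
        "username": username,
        "password": password,
        "database": database,
        "tags": {"application": application},
    }


def make_query(obj, attrs, alias, writer):
    """Build one entry of the "queries" array."""
    return {"obj": obj, "attr": list(attrs), "resultAlias": alias, "outputWriters": [writer]}


def make_server_config(host, port, queries):
    """Build the top-level document for a single monitored Broker."""
    return {"servers": [{"port": str(port), "host": host, "queries": list(queries)}]}


def config_filename(broker_number):
    """Assumed naming scheme: kafka-broker01.json, kafka-broker02.json, ..."""
    return f"kafka-broker{broker_number:02d}.json"


if __name__ == "__main__":
    writer = influx_writer("http://192.168.0.105:8086/", "admin", "123456", "jmx", "kafka_server")
    rates = ["MeanRate", "OneMinuteRate", "FiveMinuteRate", "FifteenMinuteRate"]
    names = ["MessagesInPerSec", "BytesInPerSec", "BytesOutPerSec"]
    queries = [
        make_query(f"kafka.server:type=BrokerTopicMetrics,name={n}", rates, "kafkaServer", writer)
        for n in names
    ]
    config = make_server_config("192.168.0.105", 9999, queries)
    print(config_filename(1))
    print(json.dumps(config, indent=2))
```

In a centralized deployment the same loop can run once per Broker host, writing each result to its own file in the directory that JMXTrans reads.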
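A monitoring system architecture usually has three parts: data collection, analysis and transformation, and data presentation (visualization).
(1) Data collection
Data collection usually means first developing a collection program and then using monitoring software such as Nagios or Zabbix to schedule it and report the collected data. For Java programs, JMXTrans can be used to collect the data.
(2) Analysis and transformation
Kafka is a Java application, and the performance metrics it provides are already very comprehensive: histograms, counts, maximum and minimum values, and standard deviations are all precomputed. The data therefore needs no further analysis or processing, and the MBeans data is stored directly in InfluxDB.
(3) Data visualization
Grafana is an open-source visualization dashboard that supports Graphite, Zabbix, InfluxDB, Prometheus, and OpenTSDB as data sources.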
InfluxDB is an open-source distributed time series, event, and metrics database written in Go. It has no external dependencies and is mainly used to store large volumes of timestamped data, such as DevOps monitoring data, application metrics, IoT sensor data, and real-time analytics data.
Pull the image: docker pull influxdb
The influxdb.yml file:
version: '2'
services:
  influxdb:
    image: influxdb
    container_name: influxdb
    volumes:
      - /data/influxdb/conf:/etc/influxdb
      - /data/influxdb/data:/var/lib/influxdb/data
      - /data/influxdb/meta:/var/lib/influxdb/meta
      - /data/influxdb/wal:/var/lib/influxdb/wal
    ports:
      - "8086:8086"
    restart: always
To view the results: docker exec -it influxdb influx
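To make the mapping between a JMX sample and an InfluxDB point more concrete, the sketch below formats one point in InfluxDB line protocol (measurement, tags, fields, timestamp). It is an illustration only: the exact points that JMXTrans writes are not shown in this article, and the function name to_line_protocol is an assumption for this example.

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Format a field value according to its type."""
    if isinstance(value, bool):          # bool must be tested before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"               # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'                # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Return one point as a line protocol string."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key in sorted(tags):             # tags sorted by key
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value) for key, value in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(to_line_protocol(
        "kafkaServer",
        {"application": "kafka_server"},
        {"OneMinuteRate": 12.5},
        1564449183000000000,
    ))
```

Here resultAlias plays the role of the measurement, the configured tags become tag pairs, and each entry of attr becomes a field.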
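JMXTrans is a collector that gathers data from Java applications through JMX. Any Java application that opens a JMX port can be collected from.
Pull the image: docker pull jmxtrans/jmxtrans
By default JMXTrans reads every data source configuration file (JSON files) in the /var/lib/jmxtrans directory, fetches data from the data sources in real time, parses it, and stores it in InfluxDB.
version: '2'
services:
  # JMXTrans service
  jmxtrans:
    image: jmxtrans/jmxtrans
    container_name: jmxtrans
    volumes:
      - ./jmxtrans:/var/lib/jmxtrans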
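Grafana is a visualization dashboard with attractive charts and layouts, a full-featured metrics dashboard, and a graph editor. It supports Graphite, Zabbix, InfluxDB, Prometheus, and OpenTSDB as data sources.
The main features of Grafana are:
(1) Presentation: fast, flexible client-side charts. Panel plugins offer many ways to visualize metrics and logs, and the official library has a rich set of dashboard plugins such as heatmaps, line charts, and graphs.
(2) Data sources: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, KairosDB, and others.
(3) Alerting: alert rules for the most important metrics are defined visually. Grafana evaluates them continuously and sends notifications through Slack, PagerDuty, and similar channels when the data reaches a threshold.
(4) Mixed display: different data sources can be mixed in the same chart. A data source can be specified per query, and custom data sources are supported.
(5) Annotations: charts can be annotated with rich events from different data sources. Hovering over an event shows its full metadata and tags.
(6) Filters: ad-hoc filters allow new key/value filters to be created dynamically; they are applied automatically to every query that uses the data source.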
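GitHub address: https://github.com/grafana/grafana
Pull the Grafana container image: docker pull grafana/grafana:6.5.0
Start the Grafana container: docker run -d --name=grafana -p 3000:3000 grafana/grafana:6.5.0
Web login: 192.168.0.105:3000
The first login uses admin/admin; after logging in you are required to change the password.
Add a data source:
Import the dashboard template:
The dashboard template JSON file is as follows: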
{ "__inputs": [ { "name": "DS_KAFKAMONITOR", "label": "KafkaMonitor", "description": "", "type": "datasource", "pluginId": "influxdb", "pluginName": "InfluxDB" } ], "__requires": [ { "type": "grafana", "id": "grafana", "name": "Grafana", "version": "6.7.3" }, { "type": "panel", "id": "graph", "name": "Graph", "version": "" }, { "type": "datasource", "id": "influxdb", "name": "InfluxDB", "version": "1.0.0" } ], "annotations": { "list": [ { "$$hashKey": "object:318", "builtIn": 1, "datasource": "-- Grafana --", "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] }, "editable": true, "gnetId": null, "graphTooltip": 0, "id": null, "links": [], "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "java.lang:type=OperatingSystem", "fill": 1, "fillGradient": 0, "gridPos": { "h": 12, "w": 8, "x": 0, "y": 0 }, "hiddenSeries": false, "id": 6, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "ProcessCpuLoad" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "進程CPU使用率" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka進程CPU使用率", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" 
}, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:1134", "format": "percentunit", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:1135", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "服務器CPU使用率", "fill": 1, "fillGradient": 0, "gridPos": { "h": 12, "w": 8, "x": 8, "y": 0 }, "hiddenSeries": false, "id": 2, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "SystemCpuLoad" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "CPU使用率" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "CPU使用率", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:369", "format": "percentunit", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:370", "format": "short", "label": null, "logBase": 1, "max": null, "min": 
null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "java.lang:type=OperatingSystem\nLinux系統負載", "fill": 1, "fillGradient": 0, "gridPos": { "h": 12, "w": 8, "x": 16, "y": 0 }, "hiddenSeries": false, "id": 4, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": true, "min": false, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "SystemLoadAverage" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "系統負載" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "系統負載", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:656", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:657", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "Kafka每一個broker每秒中的數據量,包括__consumer_offsets topic", "fill": 1, "fillGradient": 0, "gridPos": { "h": 12, "w": 8, "x": 0, 
"y": 12 }, "hiddenSeries": false, "id": 34, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" } ], "hide": false, "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "D", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "平均每秒" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=MessagesInPerSec" } ] }, { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" } ], "hide": false, "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "sum" }, { "params": [ "全部broker平均每秒" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=MessagesInPerSec" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka Topic 每秒數據量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:2118", "format": "none", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:2119", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, 
{ "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "java.lang:type=OperatingSystem\n服務器可用物理內存", "fill": 1, "fillGradient": 0, "gridPos": { "h": 12, "w": 8, "x": 8, "y": 12 }, "hiddenSeries": false, "id": 32, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": false, "min": false, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "FreePhysicalMemorySize" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "系統剩餘物理內存" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "可用物理內存", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:2324", "format": "decbytes", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:2325", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "cacheTimeout": null, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "kafka.controller:type=KafkaController,name=ActiveControllerCount\n\nKafka控制器數量,每一個集羣只有一臺機器爲1,爲1的機器是Kafka控制器Crontroller", "fill": 1, "fillGradient": 0, 
"gridPos": { "h": 12, "w": 8, "x": 16, "y": 12 }, "hiddenSeries": false, "id": 26, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": false, "min": false, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pluginVersion": "6.7.3", "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" } ], "measurement": "activeController", "orderByTime": "ASC", "policy": "default", "query": "SELECT sum(\"Value\") AS \"獲取控制器數量\" FROM \"activeController\" WHERE $timeFilter GROUP BY time($__interval), \"hostname\"", "rawQuery": false, "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "Value" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "獲取控制器數量" ], "type": "alias" } ] ], "tags": [], "tz": "" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka控制器數量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:4446", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:4447", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "監控 kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec 指標", "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 8, "x": 0, "y": 24 }, "hiddenSeries": false, "id": 16, "legend": { "alignAsTable": true, "avg": 
true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "FiveMinuteRate" ], "type": "field" }, { "params": [], "type": "mean" }, { "params": [ "每秒拉取字節數" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=BytesOutPerSec" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka每秒拉取流量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:77", "format": "decbytes", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:78", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "監控 kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec 指標", "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 8, "x": 8, "y": 24 }, "hiddenSeries": false, "id": 14, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, 
"percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "F", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "平均每秒進入字節數" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=BytesInPerSec" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka每秒進入流量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:77", "format": "decbytes", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:78", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "監控 kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec 和 kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec 指標", "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 8, "x": 16, "y": 24 }, "hiddenSeries": false, "id": 20, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], 
"spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "每秒Fetch(獲取)的請求數量" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec" } ] }, { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "D", "resultFormat": "time_series", "select": [ [ { "params": [ "MeanRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "每秒Producer發送的請求數量" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka生產、消費每秒請求數量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:77", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:78", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "java.lang:type=Memory", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 0, "y": 33 }, "hiddenSeries": 
false, "id": 8, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "E", "resultFormat": "time_series", "select": [ [ { "params": [ "HeapMemoryUsage_used" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "堆內存使用" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka使用堆內存", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:1850", "format": "decbytes", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:1851", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "java.lang:type=Memory", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 8, "y": 33 }, "hiddenSeries": false, "id": 30, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", 
"seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "jvmMemory", "orderByTime": "ASC", "policy": "default", "refId": "E", "resultFormat": "time_series", "select": [ [ { "params": [ "NonHeapMemoryUsage_used" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "對外內存使用" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka使用堆外內存", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:1850", "format": "decbytes", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:1851", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions\n不爲0則說明有的副本跟不上leader", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 16, "y": 33 }, "hiddenSeries": false, "id": 24, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pluginVersion": "6.7.3", "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" 
], "type": "fill" } ], "measurement": "underReplicated", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "Value" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "未充分備份的分區數" ], "type": "alias" } ] ], "tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "未充分備份的分區數監控", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:11235", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:11236", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "cacheTimeout": null, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 0, "y": 46 }, "hiddenSeries": false, "id": 12, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pluginVersion": "6.7.3", "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "5m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "network", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "Value" ], "type": "field" }, { "params": [], "type": "mean" }, { "params": [ "網絡線程池空閒比例" ], "type": "alias" } ] ], 
"tags": [] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka網絡線程池線程平均的空閒比例", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:13734", "format": "percentunit", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:13735", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "cacheTimeout": null, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 8, "y": 46 }, "hiddenSeries": false, "id": 22, "legend": { "alignAsTable": true, "avg": false, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "links": [], "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pluginVersion": "6.7.3", "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" } ], "measurement": "network", "orderByTime": "ASC", "policy": "default", "refId": "A", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "IO空閒比例" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": " I/O 線程池線程平均的空閒比例", 
"tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:13517", "format": "percentunit", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:13518", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "${DS_KAFKAMONITOR}", "description": "監控 kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec 和 kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec 指標", "fill": 1, "fillGradient": 0, "gridPos": { "h": 13, "w": 8, "x": 16, "y": 46 }, "hiddenSeries": false, "id": 18, "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true }, "lines": true, "linewidth": 1, "nullPointMode": "null", "options": { "dataLinks": [] }, "percentage": false, "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": "kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "H", "resultFormat": "time_series", "select": [ [ { "params": [ "OneMinuteRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "每秒Fetch(獲取)異常的請求" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec" } ] }, { "alias": "", "groupBy": [ { "params": [ "1m" ], "type": "time" }, { "params": [ "hostname" ], "type": "tag" }, { "params": [ "null" ], "type": "fill" } ], "measurement": 
"kafkaServer", "orderByTime": "ASC", "policy": "default", "refId": "J", "resultFormat": "time_series", "select": [ [ { "params": [ "MeanRate" ], "type": "field" }, { "params": [], "type": "last" }, { "params": [ "每秒Producer異常的請求" ], "type": "alias" } ] ], "tags": [ { "key": "typeName", "operator": "=", "value": "type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec" } ] } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Kafka生產、消費請求失敗數量", "tooltip": { "shared": true, "sort": 0, "value_type": "individual" }, "type": "graph", "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] }, "yaxes": [ { "$$hashKey": "object:77", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { "$$hashKey": "object:78", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ], "yaxis": { "align": false, "alignLevel": null } } ], "refresh": false, "schemaVersion": 22, "style": "dark", "tags": [], "templating": { "list": [] }, "time": { "from": "now-1h", "to": "now" }, "timepicker": { "refresh_intervals": [ "5s", "10s", "30s", "1m", "5m", "15m", "30m", "1h", "2h", "1d" ] }, "timezone": "", "title": "Kafka集羣監控模板", "uid": "PkULDneZkALL", "variables": { "list": [] }, "version": 27 }
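InfluxDB, JMXTrans, and Grafana are deployed together with Docker Compose. Create a KafkaMonitor directory; inside it, create an influxdb directory, a jmxtrans directory, and a docker-compose.yml file, then place the jmxtrans.json file in the jmxtrans directory.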
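The directory layout can also be created by a small script. This is only a sketch of the steps described above; the function name create_layout is hypothetical, and the demo runs in a temporary directory so that it does not touch the real working directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def create_layout(base: Path, jmxtrans_config: Optional[Path] = None) -> Path:
    """Create KafkaMonitor/ with influxdb/, jmxtrans/ and an empty docker-compose.yml."""
    root = base / "KafkaMonitor"
    (root / "influxdb").mkdir(parents=True, exist_ok=True)
    (root / "jmxtrans").mkdir(parents=True, exist_ok=True)
    (root / "docker-compose.yml").touch(exist_ok=True)
    if jmxtrans_config is not None:
        # Copy an existing jmxtrans.json into the jmxtrans directory.
        shutil.copy(jmxtrans_config, root / "jmxtrans" / "jmxtrans.json")
    return root


# Demo in a throwaway directory; pass Path.cwd() instead to create it for real.
with tempfile.TemporaryDirectory() as tmp:
    root = create_layout(Path(tmp))
    print(sorted(p.name for p in root.iterdir()))
```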
The docker-compose.yml file is as follows:
version: '2'
services:
  # JMXTrans service
  jmxtrans:
    image: jmxtrans/jmxtrans
    container_name: jmxtrans
    volumes:
      - ./jmxtrans:/var/lib/jmxtrans
  # InfluxDB service
  influxdb:
    image: influxdb
    container_name: influxdb
    volumes:
      - ./influxdb/conf:/etc/influxdb
      - ./influxdb/data:/var/lib/influxdb/data
      - ./influxdb/meta:/var/lib/influxdb/meta
      - ./influxdb/wal:/var/lib/influxdb/wal
    ports:
      - "8086:8086" # exposed so that Grafana can reach InfluxDB
    restart: always
  # Grafana service
  grafana:
    image: grafana/grafana:6.5.0 # newer versions may have bugs
    container_name: grafana
    ports:
      - "3000:3000" # exposed for web access
Start the monitoring stack: docker-compose -f docker-compose.yml up -d
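A container that has started does not guarantee that its port is actually listening, so it can be worth checking the two published ports after startup. The sketch below assumes the stack runs on the local machine with the port mappings from the docker-compose.yml above (8086 for InfluxDB, 3000 for Grafana); the helper names are illustrative.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_stack(host: str = "localhost", ports: dict = None) -> dict:
    """Check the published ports of the monitoring stack."""
    ports = ports or {"influxdb": 8086, "grafana": 3000}
    return {name: port_is_open(host, port) for name, port in ports.items()}


print(check_stack())
```

A value of False for either service suggests looking at docker-compose ps and the container logs before going further.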
Finally, log in to the Grafana web UI, configure the InfluxDB data source, and import the dashboard template.
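The same configuration can be scripted through Grafana's HTTP API instead of the web UI. The following is a hedged sketch rather than the procedure used in this article: it builds the request bodies for POST /api/datasources and POST /api/dashboards/db, and replaces the ${DS_KAFKAMONITOR} placeholder in the template with the data source name. The data source name "KafkaMonitor", the database name "kafka_monitor", and the template file name in the usage note are placeholders that must match your own jmxtrans.json and setup.

```python
import base64
import json
import urllib.request


def datasource_payload(name: str, influx_url: str, database: str) -> dict:
    """Body for POST /api/datasources describing an InfluxDB data source."""
    return {
        "name": name,
        "type": "influxdb",
        "access": "proxy",  # Grafana's backend queries InfluxDB on behalf of the browser
        "url": influx_url,
        "database": database,
    }


def dashboard_payload(template_text: str, datasource_name: str) -> dict:
    """Body for POST /api/dashboards/db built from the exported template."""
    resolved = template_text.replace("${DS_KAFKAMONITOR}", datasource_name)
    dashboard = json.loads(resolved)
    dashboard["id"] = None  # let Grafana assign a new internal id
    dashboard.pop("__inputs", None)  # export-only metadata, if present
    return {"dashboard": dashboard, "overwrite": True}


def post_json(base_url: str, path: str, payload: dict, user: str, password: str) -> dict:
    """POST a JSON body to Grafana using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": "Basic " + token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the payloads needs no running server; names here are placeholders.
print(datasource_payload("KafkaMonitor", "http://influxdb:8086", "kafka_monitor"))
```

With the stack running, a call such as post_json("http://localhost:3000", "/api/datasources", datasource_payload(...), "admin", "admin") would create the data source, assuming Grafana's default administrator credentials are still in place; the dashboard is then sent the same way with dashboard_payload applied to the contents of the saved template file. The URL http://influxdb:8086 relies on Grafana and InfluxDB sharing the Compose network, where the service name resolves as a host name.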