Kafka Quick Start (7): Kafka Monitoring

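1. Kafka Monitoring Metrics

1.1 Kafka Host Monitoring Metrics

Host monitoring tracks the performance of the machines on which the Kafka cluster's Brokers run. Common host metrics include:
(1) Machine load (Load)
(2) CPU usage
(3) Memory usage, including free memory (Free Memory) and used memory (Used Memory)
(4) Disk I/O usage, including read usage and write usage
(5) Network I/O usage
(6) Number of TCP connections
(7) Number of open files
(8) Inode usage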

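1.2 JVM Monitoring Metrics

The Kafka Broker process is an ordinary Java process, so every JVM monitoring technique applies to it.
(1) Frequency and duration of Full GC, used to assess the impact of Full GC on the Broker process. Long pauses cause the Broker to throw various timeout exceptions.
(2) Size of live objects. This is an important basis for setting the heap size and helps fine-tune the size of each generation in the JVM heap.
(3) Total number of application threads, which shows how the Broker process uses the CPU.
A sample line from the Broker GC log:
2019-07-30T09:13:03.809+0800: 552.982: [GC cleanup 827M->645M(1024M), 0.0019078 secs]
The Broker JVM process uses the G1 collector by default. In the line above, once the cleanup step finishes, the size of live objects on the heap shrinks from 827 MB to 645 MB. Since Kafka 0.9.0.0 the default collector has been G1, and a Full GC in G1 is executed by a single thread (on JDK 8; JDK 10 and later run it in parallel), which is very slow. The Broker GC logs, that is, the files whose names start with kafkaServer-gc.log, therefore need to be monitored. If the Broker process is found to run Full GC frequently, the G1 switch -XX:+PrintAdaptiveSizePolicy can be turned on so that the JVM reports what triggered the Full GC.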
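
As a rough illustration of how such GC log lines could be monitored automatically, the following sketch extracts the heap sizes and pause time from a G1 cleanup line. The function name parse_gc_cleanup and the regular expression are assumptions made for this example; the pattern only covers the JDK 8 style line shown above with sizes reported in M.

```python
import re

# Matches G1 lines such as "[GC cleanup 827M->645M(1024M), 0.0019078 secs]".
# Assumption: sizes are always reported in M, as in the sample line.
GC_CLEANUP = re.compile(
    r"\[GC cleanup (?P<before>\d+)M->(?P<after>\d+)M\((?P<total>\d+)M\), "
    r"(?P<secs>[\d.]+) secs\]"
)


def parse_gc_cleanup(line):
    """Return heap sizes in MB and pause time in seconds, or None if no match."""
    m = GC_CLEANUP.search(line)
    if m is None:
        return None
    return {
        "before_mb": int(m.group("before")),
        "after_mb": int(m.group("after")),
        "total_mb": int(m.group("total")),
        "pause_secs": float(m.group("secs")),
    }


sample = ("2019-07-30T09:13:03.809+0800: 552.982: "
          "[GC cleanup 827M->645M(1024M), 0.0019078 secs]")
result = parse_gc_cleanup(sample)
print(result)
```

Lines that do not describe a cleanup step return None, so the function can be applied to every line of a kafkaServer-gc.log file and the non-matching lines skipped.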

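1.3 Cluster Monitoring Metrics

(1) Check whether the Broker process has started and whether its port is listening. In a containerized Kafka environment, when a Kafka Broker is started with Docker, the container may start successfully while a misconfigured network leaves the process running without a listening port.
(2) Check the key Broker logs: the server log server.log, the controller log controller.log, and the topic partition state change log state-change.log.
(3) Check the running state of key Broker threads. A Kafka Broker process starts a dozen or even several dozen threads. In production, the Log Compaction threads have names starting with kafka-log-cleaner-thread and perform log compaction; the replica fetcher threads usually have names starting with ReplicaFetcherThread and carry out the logic by which Follower replicas fetch messages from the Leader replica.
(4) Check the key Broker JMX metrics.
BytesIn/BytesOut: the number of bytes entering and leaving the Broker per second. If the value approaches the network bandwidth, packet loss becomes likely.
NetworkProcessorAvgIdlePercent: the average idle ratio of the threads in the network thread pool. It should normally stay above 30% over the long term. A value below 30% means the network thread pool is very busy, and the Broker should be relieved by adding network threads or moving load to other servers.
RequestHandlerAvgIdlePercent: the average idle ratio of the threads in the I/O thread pool. If the value stays below 30% over the long term, increase the number of I/O threads or reduce the load on the Broker.
UnderReplicatedPartitions: the number of under-replicated partitions. A partition is under-replicated when not all of its Follower replicas are in sync with the Leader replica.
ISRShrink/ISRExpand: how often the ISR shrinks and expands. If replicas frequently enter and leave the ISR in production, these values are necessarily high. Diagnose why replicas keep entering and leaving the ISR and take appropriate action.
ActiveControllerCount: the number of currently active controllers. Normally the Broker that hosts the Controller reports 1 and every other Broker reports 0. If several Brokers report 1, the Kafka cluster has a split brain and must be dealt with as soon as possible, mainly by checking network connectivity. Split brain is a very serious distributed failure. Kafka currently relies on ZooKeeper to prevent it, and once it occurs Kafka cannot guarantee normal operation.
(5) Monitor the Kafka clients. Watch the network round-trip time (RTT) between the client machines and the Kafka Broker machines. On the producer side, the thread whose name starts with kafka-producer-network-thread does the actual sending; once it dies, the Producer can no longer work, yet the Producer process does not exit on its own. On the consumer side, the heartbeat thread whose name starts with kafka-coordinator-heartbeat-thread is central to Rebalance.
For the Producer, the JMX metric to watch is request-latency, the latency of produce requests, which most directly reflects the TPS of the Producer program. For the Consumer, records-lag and records-lead are the two important JMX metrics. When a Consumer Group is used, also watch the join rate and sync rate metrics, which indicate how frequently Rebalance occurs.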
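
The Broker-side rules of thumb above (idle ratios above 30%, no under-replicated partitions, exactly one active controller) can be turned into a simple check. The sketch below is only an illustration: the function check_broker_metrics and the shape of the input dictionary are assumptions, and the idle metrics are assumed to be reported as ratios between 0 and 1.

```python
IDLE_THRESHOLD = 0.30  # the 30% guideline from the text, as a 0..1 ratio


def check_broker_metrics(brokers):
    """Return a list of warning strings for a {broker_name: metrics} snapshot."""
    warnings = []
    for name in sorted(brokers):
        m = brokers[name]
        for pool in ("NetworkProcessorAvgIdlePercent",
                     "RequestHandlerAvgIdlePercent"):
            if m[pool] < IDLE_THRESHOLD:
                warnings.append(
                    f"{name}: {pool} {m[pool]:.2f} is below {IDLE_THRESHOLD:.2f}")
        if m["UnderReplicatedPartitions"] > 0:
            warnings.append(
                f"{name}: {m['UnderReplicatedPartitions']} under-replicated partitions")
    # Exactly one Broker in the cluster should report ActiveControllerCount = 1.
    controllers = sum(m["ActiveControllerCount"] for m in brokers.values())
    if controllers != 1:
        warnings.append(f"cluster: {controllers} active controllers, expected 1")
    return warnings


# Hypothetical snapshot of two Brokers, for illustration only.
snapshot = {
    "broker-1": {"NetworkProcessorAvgIdlePercent": 0.82,
                 "RequestHandlerAvgIdlePercent": 0.25,
                 "UnderReplicatedPartitions": 0,
                 "ActiveControllerCount": 1},
    "broker-2": {"NetworkProcessorAvgIdlePercent": 0.90,
                 "RequestHandlerAvgIdlePercent": 0.70,
                 "UnderReplicatedPartitions": 3,
                 "ActiveControllerCount": 1},
}
warnings_found = check_broker_metrics(snapshot)
for line in warnings_found:
    print(line)
```

In this made-up snapshot both Brokers claim to be the controller, so the last warning corresponds to the split-brain situation described for ActiveControllerCount.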

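2. Monitoring Kafka with JMX

2.1 Introduction to JMX

JMX (Java Management Extensions) can manage and monitor running Java programs. It is used to manage threads, memory, log levels, service restarts, the system environment, and so on.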

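2.2 Enabling JMX in Kafka

The JMX port can be enabled in the following ways:
(1) Set JMX_PORT when starting Kafka
export JMX_PORT=9999
kafka-server-start.sh -daemon config/server.properties
(2) Modify kafka-run-class.sh
Add the following line at the beginning of kafka-run-class.sh:
JMX_PORT=9999
Restart the Kafka cluster after modifying kafka-run-class.sh.
(3) Enable JMX for a Kafka Docker container service
Add the KAFKA_JMX_OPTS and JMX_PORT environment variables to the docker-compose.yml file of the Kafka container service.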

KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=192.168.0.105 -Dcom.sun.management.jmxremote.rmi.port=9999"
JMX_PORT: 9999

Expose the corresponding JMX port to the outside.

ports:
      - "9999:9999" # port exposed to the outside

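2.3 JMX_PORT Already in Use

When Kafka needs to monitor Broker and Topic data, JMX_PORT must be enabled. The JMX_PORT variable is often defined in the kafka-run-class.sh script, but once it is defined there, running the other script tools in the bin directory reports an error. The reason is that kafka-run-class.sh is invoked by those other scripts; each invocation starts a JVM that tries to bind JMX_PORT, and the port is already occupied by the running Broker.
The solution is to specify JMX_PORT only when starting Kafka.
(1) When starting Kafka with supervisor, add environment=JMX_PORT=9999 to the supervisor service configuration file.
(2) When starting Kafka with the kafka-server-start.sh script, run export JMX_PORT=9999 at startup or set it inside kafka-server-start.sh.
(3) Modify the kafka-run-class.sh script
Modify the bin/kafka-run-class.sh file under the Kafka installation directory.
[Figure: the modified kafka-run-class.sh]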
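
To confirm that the error really comes from an occupied JMX port, it can help to test whether something is already listening on that port before a tool is launched. The following is a minimal sketch using only the Python standard library; the helper name is_port_in_use is an assumption for illustration, and the check only tells whether a TCP connection to the port succeeds.

```python
import socket


def is_port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 9999 is the JMX_PORT value used throughout this article.
    print("JMX port 9999 in use:", is_port_in_use(9999))
```

If the Broker is running with JMX_PORT=9999, the check reports True, which is exactly the situation in which a second JVM started with the same JMX_PORT fails to bind.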

3. Kafka Monitoring Tools

3.1 JMXTool

JMXTool is a tool from the Kafka community that shows Kafka JMX metrics in real time.
kafka-run-class.sh kafka.tools.JmxTool
--attributes: the JMX attribute names to query, as a comma-separated list.
--date-format: the date format used in the output.
--jmx-url: the JMX endpoint to connect to; the default format is service:jmx:rmi:///jndi/rmi://:<JMX port>/jmxrmi
--object-name: the JMX MBean name to query.
--reporting-interval: the polling interval in milliseconds; the default is 2000 (2 s).
The following command queries, once per second, the Broker's inbound bytes per second (BytesInPerSec) averaged over the past minute:
kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec --jmx-url service:jmx:rmi:///jndi/rmi://:9999/jmxrmi --date-format "yyyy-MM-dd HH:mm:ss" --attributes OneMinuteRate --reporting-interval 1000
The following command shows the ActiveControllerCount JMX metric:
kafka-run-class.sh kafka.tools.JmxTool --object-name kafka.controller:type=KafkaController,name=ActiveControllerCount --jmx-url service:jmx:rmi:///jndi/rmi://:9999/jmxrmi --date-format "yyyy-MM-dd HH:mm:ss" --reporting-interval 1000
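
Since the JmxTool commands differ only in the MBean name and the attribute list, they can be assembled programmatically. The sketch below is one possible way to do so; the helper names jmx_service_url and jmxtool_command are made up for this example, and only the options listed above are used.

```python
import shlex


def jmx_service_url(port, host=""):
    """Build the JMX service URL; an empty host means the local machine."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


def jmxtool_command(object_name, port, attributes=None, interval_ms=1000, host=""):
    """Return the JmxTool invocation as an argument list."""
    cmd = [
        "kafka-run-class.sh", "kafka.tools.JmxTool",
        "--object-name", object_name,
        "--jmx-url", jmx_service_url(port, host),
        # In Java date patterns, lowercase yyyy is the calendar year.
        "--date-format", "yyyy-MM-dd HH:mm:ss",
        "--reporting-interval", str(interval_ms),
    ]
    if attributes:
        cmd += ["--attributes", ",".join(attributes)]
    return cmd


cmd = jmxtool_command(
    "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
    9999,
    attributes=["OneMinuteRate"],
)
print(shlex.join(cmd))
```

The argument list can be passed directly to subprocess.run on a machine where kafka-run-class.sh is on the PATH; shlex.join is used here only to print a copy-and-paste friendly command line.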

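3.2 Kafka Manager

Kafka Manager is a Kafka monitoring framework that Yahoo open-sourced in 2015. It is written in Scala and is mainly used to manage and monitor Kafka clusters.
Kafka Manager has since been renamed CMAK (Cluster Manager for Apache Kafka).
GitHub:
https://github.com/yahoo/CMAK
Kafka Manager Docker image: kafkamanager/kafka-manager
To enable basic authentication for Kafka Manager, set the following environment variables: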

KAFKA_MANAGER_AUTH_ENABLED: "true"
KAFKA_MANAGER_USERNAME: username
KAFKA_MANAGER_PASSWORD: password

The docker-compose.yml file for deploying the Kafka Manager service is as follows:

# Define the kafka-manager service
kafka-manager-test:
  image: kafkamanager/kafka-manager # kafka-manager image
  restart: always
  container_name: kafka-manager-test
  hostname: kafka-manager-test
  ports:
    - "9000:9000"  # exposed port for web access
  depends_on:
    - kafka-test # dependency
  environment:
    ZK_HOSTS: zookeeper-test:2181 # ZooKeeper host
    KAFKA_BROKERS: kafka-test:9090 # Kafka Broker
    KAFKA_MANAGER_AUTH_ENABLED: "true"
    KAFKA_MANAGER_USERNAME: admin
    KAFKA_MANAGER_PASSWORD: password

Start the Kafka Manager service and log in to the Kafka Manager web UI.
Web address: http://127.0.0.1:9000
Then add the Kafka Broker nodes to be managed in Kafka Manager.

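3.3 JMXTrans + InfluxDB + Grafana

A common monitoring stack is JMXTrans + InfluxDB + Grafana. JMXTrans collects the JMX metrics into InfluxDB and Grafana reads InfluxDB as a data source, so Kafka's JMX metrics are easy to integrate. Companies that already use JMXTrans + InfluxDB + Grafana can reuse their existing monitoring framework directly, which greatly reduces operations cost.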

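3.4 Confluent Control Center

Control Center monitors a Kafka cluster in real time and also helps operate and build Kafka-based real-time stream processing applications. Control Center is not free; it is only available with the enterprise edition of Confluent Platform.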

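3.5 jconsole

JConsole (Java Monitoring and Management Console) is a JMX-based graphical monitoring and management tool. It provides Overview, Memory, Threads, Classes, VM Summary, and MBeans views.
Run jconsole in a Linux terminal and, in the Remote Process field of the window that opens, enter service:jmx:rmi:///jndi/rmi://192.168.0.105:9999/jmxrmi or 192.168.0.105:9999.
After connecting, select the MBeans tab to browse the Kafka metrics.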

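4. JMXTrans

4.1 Introduction to JMXTrans

JMXTrans is a collector that gathers data from Java applications over JMX. Any Java application with its JMX port enabled can be collected from.
JMXTrans runs as a background daemon and collects data once every minute.
GitHub: https://github.com/jmxtrans/jmxtrans
Pull the JMXTrans Docker image:
docker pull jmxtrans/jmxtrans

4.2 JMXTrans Configuration File

By default JMXTrans reads all data source configuration files (JSON files) in the /var/lib/jmxtrans directory, pulls data from the data sources in real time, parses it, and stores it in InfluxDB.
A JMXTrans JSON configuration file looks like this: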

{
   "servers": [{
      "port": "9901",
      "host": "192.168.0.105",
      "queries": [{
         "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
         "attr": ["MeanRate", "OneMinuteRate", "FiveMinuteRate", "FifteenMinuteRate"],
         "resultAlias": "kafkaServer",
         "outputWriters": [{
            "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
            "url": "http://192.168.0.105:8086/",
            "username": "admin",
            "password": "123456",
            "database": "jmx",
            "tags": {
               "application": "kafka_server"
            }
         }]
      }]
   }]
}
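servers: array, the data source configuration.
port: string, the JMX port of the monitored Java process.
host: string, the IP address of the monitored Java process.
queries: array, the metrics to collect, listed as JSON objects; the metric names can be looked up with jconsole (a tool bundled with the JDK).
obj: string, the name of the metric (the MBean ObjectName).
attr: array, the metric attributes to store; they become the field names in the target table.
resultAlias: string, the table (measurement) name in InfluxDB.
outputWriters: array, the data destinations.
@class: string, the class of the data destination.
url: string, the URL of the data destination (InfluxDB).
username: string, the InfluxDB login name.
password: string, the InfluxDB login password.
database: string, the InfluxDB database name (must be created in advance).
tags: JSON object, used to avoid name clashes among the fields that metrics map to in the InfluxDB table.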

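4.3 Kafka JMX Monitoring Metrics

Kafka's JMX metrics can be looked up with jconsole.
For the BytesInPerSec metric, find BytesInPerSec on the MBeans tab in jconsole.
The ObjectName value is the value of obj for the metric.
The attributes of the ObjectName are the values for "attr"; one or more can be selected.
The metric name is the value of resultAlias, which becomes the MEASUREMENT name in InfluxDB.
"tags" corresponds to the tag feature of InfluxDB and distinguishes different metrics stored in the same MEASUREMENT.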

{
   "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
   "attr": ["Count", "EventType", "RateUnit", "OneMinuteRate"],
   "resultAlias": "BytesInPerSec",
   "outputWriters": [{
      "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
      "url": "http://192.168.0.105:8086/",
      "username": "admin",
      "password": "123456",
      "database": "jmx",
      "tags": {
         "application": "BytesInPerSec"
      }
   }]
}

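For global monitoring, each metric maps to one InfluxDB MEASUREMENT, and all Kafka nodes write the same metric to the same MEASUREMENT. For Topic metrics, all Kafka nodes write a given Topic's data to the same MEASUREMENT, which is named after the Topic.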

{
  "servers" : [ {
    "port" : "9999",
    "host" : "192.168.0.105",
    "queries" : [ {
      "obj" : "java.lang:type=Memory",
      "attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ],
      "resultAlias":"jvmMemory",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"kafkaServer",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
      "attr" : [ "Value" ],
      "resultAlias":"underReplicated",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.controller:type=KafkaController,name=ActiveControllerCount",
      "attr" : [ "Value" ],
      "resultAlias":"activeController",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "java.lang:type=OperatingSystem",
      "attr" : [ "FreePhysicalMemorySize","SystemCpuLoad","ProcessCpuLoad","SystemLoadAverage" ],
      "resultAlias":"jvmMemory",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    } ,{
      "obj" : "kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent",
      "attr" : [ "Value" ],
      "resultAlias":"network",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent",
      "attr" : [ "MeanRate","OneMinuteRate","FiveMinuteRate","FifteenMinuteRate" ],
      "resultAlias":"network",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    },{
      "obj" : "java.lang:type=GarbageCollector,name=G1 Young Generation",
      "attr" : [ "CollectionCount","CollectionTime" ],
      "resultAlias":"gc",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://192.168.0.105:8086/",
        "username" : "admin",
        "password" : "123456",
        "database" : "jmx",
        "tags"     : {"application" : "kafka_server"}
      } ]
    }]
  } ]
}
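
The configuration above repeats the same output writer for every query, so it may be easier to generate than to maintain by hand. The following is a minimal sketch of such a generator; the helper names influx_writer and build_config and the shortened metric list are assumptions for illustration, while the JSON keys and values are taken from the configuration shown above.

```python
import json


def influx_writer(url, username, password, database, application):
    """Return the InfluxDB output writer block shared by all queries."""
    return {
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": url,
        "username": username,
        "password": password,
        "database": database,
        "tags": {"application": application},
    }


def build_config(host, port, metrics, writer):
    """Build a JMXTrans config from (obj, attr, resultAlias) tuples."""
    queries = [
        {"obj": obj, "attr": attr, "resultAlias": alias, "outputWriters": [writer]}
        for obj, attr, alias in metrics
    ]
    return {"servers": [{"port": str(port), "host": host, "queries": queries}]}


RATES = ["MeanRate", "OneMinuteRate", "FiveMinuteRate", "FifteenMinuteRate"]
# A shortened metric list; extend it with the other MBeans as needed.
METRICS = [
    ("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec", RATES, "kafkaServer"),
    ("kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec", RATES, "kafkaServer"),
    ("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions", ["Value"], "underReplicated"),
]

writer = influx_writer("http://192.168.0.105:8086/", "admin", "123456", "jmx", "kafka_server")
config = build_config("192.168.0.105", 9999, METRICS, writer)
print(json.dumps(config, indent=2))
```

The printed JSON has the same structure as the hand-written file, so it can be saved into the directory that JMXTrans reads its configuration from.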

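4.4 JMXTrans Deployment

JMX connects over the network, so JMXTrans can be deployed in two ways:
(1) Centralized. Deploy JMXTrans on one server, connect it to every Kafka Broker instance, and write the data to InfluxDB. To reduce network traffic, it is usually deployed on the server that hosts InfluxDB.
(2) Distributed. Deploy one JMXTrans alongside each Kafka Broker instance.
JMXTrans configuration files are divided into global metrics (per Kafka node) and Topic metrics. Global metrics use one configuration file per node, named kafka-brokerxx.json; Topic metrics use one configuration file per Topic, named TopicName.json.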

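5. Kafka Monitoring Solution Example

5.1 Choosing a Kafka Monitoring Architecture

A monitoring system usually has three parts: data collection, analysis and transformation, and data presentation (visualization).
(1) Data collection
Data collection usually means first writing a collection program, then scheduling it with monitoring software such as Nagios or Zabbix and reporting the collected data. For Java programs, JMXTrans can collect the data.
(2) Analysis and transformation
Kafka is a Java application and already exposes very comprehensive performance metrics: histograms, counts, minimum and maximum values, and standard deviations are all precomputed. No further analysis or processing is needed, so the MBeans data is stored directly in InfluxDB.
(3) Data visualization
Grafana is an open-source visualization dashboard that supports Graphite, Zabbix, InfluxDB, Prometheus, and OpenTSDB as data sources.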

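5.2 InfluxDB Deployment

InfluxDB is an open-source distributed time-series, event, and metrics database written in Go with no external dependencies. It is mainly used to store large volumes of timestamped data, such as DevOps monitoring data, app metrics, IoT sensor data, and real-time analytics data.
docker pull influxdb
The influxdb.yml file: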

version: '2'
services:
  influxdb:
    image: influxdb
    container_name: influxdb
    volumes:
      - /data/influxdb/conf:/etc/influxdb
      - /data/influxdb/data:/var/lib/influxdb/data
      - /data/influxdb/meta:/var/lib/influxdb/meta
      - /data/influxdb/wal:/var/lib/influxdb/wal
    ports:
      - "8086:8086"
    restart: always

To inspect the stored results, open the InfluxDB shell:
docker exec -it influxdb influx
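
The same checks can also be made over HTTP. The sketch below only builds the request URLs and assumes an InfluxDB 1.x server, whose /query endpoint accepts the db and q parameters; the untagged influxdb image may resolve to a 2.x release with a different API, so treat this as an illustration. The helper name influx_query_url is made up for this example.

```python
from urllib.parse import urlencode


def influx_query_url(base_url, query, database=None):
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    params = {}
    if database is not None:
        params["db"] = database
    params["q"] = query
    return f"{base_url.rstrip('/')}/query?{urlencode(params)}"


BASE = "http://192.168.0.105:8086/"

# The jmx database must exist before JMXTrans writes to it.
# In InfluxDB 1.x this statement has to be sent with an HTTP POST.
create_url = influx_query_url(BASE, 'CREATE DATABASE "jmx"')

# Listing the measurements shows whether JMXTrans has written any data (HTTP GET).
show_url = influx_query_url(BASE, "SHOW MEASUREMENTS", database="jmx")

print(create_url)
print(show_url)
```

The URLs can be sent with curl or urllib.request, adding the u and p parameters when authentication is enabled on the server.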

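5.3 JMXTrans Deployment

JMXTrans is a collector that gathers data from Java applications over JMX. Any Java application with its JMX port enabled can be collected from.
docker pull jmxtrans/jmxtrans
By default JMXTrans reads all data source configuration files (JSON files) in the /var/lib/jmxtrans directory, pulls data from the data sources in real time, parses it, and stores it in InfluxDB.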

version: '2'
services:
  # JMXTrans service
  jmxtrans:
    image: jmxtrans/jmxtrans
    container_name: jmxtrans
    volumes:
      - ./jmxtrans:/var/lib/jmxtrans

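5.4 Grafana Deployment

Grafana is a visualization dashboard with attractive charts and layouts, a full-featured metrics dashboard and graph editor, and support for Graphite, Zabbix, InfluxDB, Prometheus, and OpenTSDB as data sources.
Its main features are:
(1) Presentation: fast and flexible client-side charts. Panel plugins offer many ways to visualize metrics and logs, and the official library has a rich set of dashboard plugins such as heatmaps, line charts, and graphs.
(2) Data sources: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, KairosDB, and others.
(3) Alerting: define alert rules for the most important metrics visually. Grafana evaluates them continuously and sends notifications through Slack, PagerDuty, and so on when data reaches a threshold.
(4) Mixed presentation: mix different data sources in the same chart. The data source can be specified per query, and custom data sources are supported.
(5) Annotations: annotate charts with rich events from different data sources. Hovering over an event shows its full metadata and tags.
(6) Filters: Ad-hoc filters allow new key/value filters to be created on the fly and are automatically applied to all queries that use that data source.
GitHub: https://github.com/grafana/grafana
Pull the Grafana container image:
docker pull grafana/grafana:6.5.0
Start the Grafana container:
docker run -d --name=grafana -p 3000:3000 grafana/grafana:6.5.0
Web login: 192.168.0.105:3000
The first login uses admin/admin by default; a password change is required after logging in.
Next, add the InfluxDB data source, then import the dashboard template.
The dashboard template JSON file is as follows: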

{
  "__inputs": [
    {
      "name": "DS_KAFKAMONITOR",
      "label": "KafkaMonitor",
      "description": "",
      "type": "datasource",
      "pluginId": "influxdb",
      "pluginName": "InfluxDB"
    }
  ],
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "6.7.3"
    },
    {
      "type": "panel",
      "id": "graph",
      "name": "Graph",
      "version": ""
    },
    {
      "type": "datasource",
      "id": "influxdb",
      "name": "InfluxDB",
      "version": "1.0.0"
    }
  ],
  "annotations": {
    "list": [
      {
        "$$hashKey": "object:318",
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "java.lang:type=OperatingSystem",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 0,
        "y": 0
      },
      "hiddenSeries": false,
      "id": 6,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "ProcessCpuLoad"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Process CPU usage"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka process CPU usage",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:1134",
          "format": "percentunit",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:1135",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Server CPU usage",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 8,
        "y": 0
      },
      "hiddenSeries": false,
      "id": 2,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "SystemCpuLoad"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "CPU usage"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "CPU usage",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:369",
          "format": "percentunit",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:370",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "java.lang:type=OperatingSystem\nLinux system load",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 16,
        "y": 0
      },
      "hiddenSeries": false,
      "id": 4,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": true,
        "min": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "SystemLoadAverage"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "System load"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "System load",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:656",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:657",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Per-second data volume of each Kafka broker, including the __consumer_offsets topic",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 0,
        "y": 12
      },
      "hiddenSeries": false,
      "id": 34,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            }
          ],
          "hide": false,
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "D",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Average per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=MessagesInPerSec"
            }
          ]
        },
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            }
          ],
          "hide": false,
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "sum"
              },
              {
                "params": [
                  "All brokers, average per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=MessagesInPerSec"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka Topic data volume per second",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:2118",
          "format": "none",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:2119",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "java.lang:type=OperatingSystem\nFree physical memory on the server",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 8,
        "y": 12
      },
      "hiddenSeries": false,
      "id": 32,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "FreePhysicalMemorySize"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Free system physical memory"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Free physical memory",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:2324",
          "format": "decbytes",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:2325",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "cacheTimeout": null,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "kafka.controller:type=KafkaController,name=ActiveControllerCount\n\nNumber of active Kafka controllers. Exactly one broker in each cluster should report 1, and that broker is the Kafka Controller.",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 12,
        "w": 8,
        "x": 16,
        "y": 12
      },
      "hiddenSeries": false,
      "id": 26,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pluginVersion": "6.7.3",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            }
          ],
          "measurement": "activeController",
          "orderByTime": "ASC",
          "policy": "default",
          "query": "SELECT sum(\"Value\") AS \"Active controller count\" FROM \"activeController\" WHERE $timeFilter GROUP BY time($__interval), \"hostname\"",
          "rawQuery": false,
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "Value"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Active controller count"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [],
          "tz": ""
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka active controller count",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:4446",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:4447",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Monitors the kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec metric",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 8,
        "x": 0,
        "y": 24
      },
      "hiddenSeries": false,
      "id": 16,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "FiveMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "mean"
              },
              {
                "params": [
                  "Bytes fetched per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=BytesOutPerSec"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka bytes out per second",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:77",
          "format": "decbytes",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:78",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Monitors the kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec metric",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 8,
        "x": 8,
        "y": 24
      },
      "hiddenSeries": false,
      "id": 14,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "F",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Average bytes in per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=BytesInPerSec"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka bytes in per second",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:77",
          "format": "decbytes",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:78",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Monitors the kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec and kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec metrics",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 9,
        "w": 8,
        "x": 16,
        "y": 24
      },
      "hiddenSeries": false,
      "id": 20,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Fetch requests per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec"
            }
          ]
        },
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "D",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "MeanRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Produce requests per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka produce and fetch requests per second",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:77",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:78",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "java.lang:type=Memory",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 0,
        "y": 33
      },
      "hiddenSeries": false,
      "id": 8,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "E",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "HeapMemoryUsage_used"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Heap memory used"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka heap memory used",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:1850",
          "format": "decbytes",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:1851",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "java.lang:type=Memory",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 8,
        "y": 33
      },
      "hiddenSeries": false,
      "id": 30,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "jvmMemory",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "E",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "NonHeapMemoryUsage_used"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Non-heap memory used"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka non-heap memory used",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:1850",
          "format": "decbytes",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:1851",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions\nA non-zero value means some follower replicas are not keeping up with the leader",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 16,
        "y": 33
      },
      "hiddenSeries": false,
      "id": 24,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pluginVersion": "6.7.3",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "underReplicated",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "Value"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Under-replicated partitions"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Under-replicated partitions",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:11235",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:11236",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "cacheTimeout": null,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 0,
        "y": 46
      },
      "hiddenSeries": false,
      "id": 12,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pluginVersion": "6.7.3",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "5m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "network",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "Value"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "mean"
              },
              {
                "params": [
                  "Network thread pool idle ratio"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": []
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka network thread pool average idle ratio",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:13734",
          "format": "percentunit",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:13735",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "cacheTimeout": null,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 8,
        "y": 46
      },
      "hiddenSeries": false,
      "id": 22,
      "legend": {
        "alignAsTable": true,
        "avg": false,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pluginVersion": "6.7.3",
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            }
          ],
          "measurement": "network",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "A",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "I/O thread idle ratio"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka I/O thread pool average idle ratio",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:13517",
          "format": "percentunit",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:13518",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_KAFKAMONITOR}",
      "description": "Monitors the kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec and kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec metrics",
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 13,
        "w": 8,
        "x": 16,
        "y": 46
      },
      "hiddenSeries": false,
      "id": 18,
      "legend": {
        "alignAsTable": true,
        "avg": true,
        "current": true,
        "max": true,
        "min": true,
        "show": true,
        "total": false,
        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "nullPointMode": "null",
      "options": {
        "dataLinks": []
      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "H",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "OneMinuteRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Failed fetch requests per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec"
            }
          ]
        },
        {
          "alias": "",
          "groupBy": [
            {
              "params": [
                "1m"
              ],
              "type": "time"
            },
            {
              "params": [
                "hostname"
              ],
              "type": "tag"
            },
            {
              "params": [
                "null"
              ],
              "type": "fill"
            }
          ],
          "measurement": "kafkaServer",
          "orderByTime": "ASC",
          "policy": "default",
          "refId": "J",
          "resultFormat": "time_series",
          "select": [
            [
              {
                "params": [
                  "MeanRate"
                ],
                "type": "field"
              },
              {
                "params": [],
                "type": "last"
              },
              {
                "params": [
                  "Failed produce requests per second"
                ],
                "type": "alias"
              }
            ]
          ],
          "tags": [
            {
              "key": "typeName",
              "operator": "=",
              "value": "type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec"
            }
          ]
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
      "title": "Kafka failed produce and fetch request rates",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "buckets": null,
        "mode": "time",
        "name": null,
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "$$hashKey": "object:77",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        },
        {
          "$$hashKey": "object:78",
          "format": "short",
          "label": null,
          "logBase": 1,
          "max": null,
          "min": null,
          "show": true
        }
      ],
      "yaxis": {
        "align": false,
        "alignLevel": null
      }
    }
  ],
  "refresh": false,
  "schemaVersion": 22,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-1h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ]
  },
  "timezone": "",
  "title": "Kafka Cluster Monitoring Template",
  "uid": "PkULDneZkALL",
  "variables": {
    "list": []
  },
  "version": 27
}
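
To make the "targets" entries in the template easier to read, the following is a minimal Python sketch of how one such target maps onto an InfluxQL query. It approximates what Grafana's InfluxDB query editor produces and is not Grafana's own code. The helper names quote_ident and target_to_influxql and the sample alias are hypothetical; $timeFilter is the Grafana macro that is replaced by the dashboard's time range.

```python
def quote_ident(name):
    """Wrap an InfluxQL identifier in double quotes."""
    return '"' + name + '"'


def target_to_influxql(target):
    """Render one dashboard target (a dict shaped like the JSON above) as InfluxQL."""
    # SELECT clause: apply the parts in order (field -> aggregation -> alias).
    expr = ""
    for part in target["select"][0]:
        kind, params = part["type"], part["params"]
        if kind == "field":
            expr = quote_ident(params[0])
        elif kind == "alias":
            expr += " AS " + quote_ident(params[0])
        else:  # an aggregation such as last or mean
            expr = kind + "(" + expr + ")"

    # FROM clause: assumption, the retention policy is omitted when it is "default".
    source = quote_ident(target["measurement"])
    if target.get("policy", "default") != "default":
        source = quote_ident(target["policy"]) + "." + source

    # WHERE clause: tag filters plus the dashboard time range.
    conditions = [
        quote_ident(t["key"]) + " " + t["operator"] + " '" + t["value"] + "'"
        for t in target.get("tags", [])
    ]
    where = "(" + " AND ".join(conditions) + ") AND $timeFilter" if conditions else "$timeFilter"

    # GROUP BY clause: time bucket, tags, and the fill mode.
    groups, fill = [], ""
    for part in target.get("groupBy", []):
        kind, value = part["type"], part["params"][0]
        if kind == "time":
            groups.append("time(" + value + ")")
        elif kind == "tag":
            groups.append(quote_ident(value))
        elif kind == "fill":
            fill = " fill(" + value + ")"

    query = "SELECT " + expr + " FROM " + source + " WHERE " + where
    if groups:
        query += " GROUP BY " + ", ".join(groups) + fill
    return query


# A target shaped like the FailedFetchRequestsPerSec entry in the template.
FAILED_FETCH_TARGET = {
    "measurement": "kafkaServer",
    "policy": "default",
    "select": [[
        {"type": "field", "params": ["OneMinuteRate"]},
        {"type": "last", "params": []},
        {"type": "alias", "params": ["failed_fetch"]},
    ]],
    "tags": [{
        "key": "typeName",
        "operator": "=",
        "value": "type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec",
    }],
    "groupBy": [
        {"type": "time", "params": ["1m"]},
        {"type": "tag", "params": ["hostname"]},
        {"type": "fill", "params": ["null"]},
    ],
}

if __name__ == "__main__":
    print(target_to_influxql(FAILED_FETCH_TARGET))
```

Running a query of this shape directly against InfluxDB (with $timeFilter replaced by a concrete time condition) is a quick way to check whether JMXTrans is actually writing the metric before debugging the panel itself.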

5. The docker-compose.yml file

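InfluxDB, JMXTrans and Grafana are deployed together with Docker Compose. Create a KafkaMonitor directory; inside it, create an influxdb directory, a jmxtrans directory and a docker-compose.yml file, then place the jmxtrans.json file in the jmxtrans directory.
The docker-compose.yml file is as follows: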

version: '2'
services:
  # JMXTrans service
  jmxtrans:
    image: jmxtrans/jmxtrans
    container_name: jmxtrans
    volumes:
      - ./jmxtrans:/var/lib/jmxtrans
  # InfluxDB service
  influxdb:
    image: influxdb:1.8  # assumption: pinned to 1.x, since the volume paths below follow the InfluxDB 1.x layout
    container_name: influxdb
    volumes:
      - ./influxdb/conf:/etc/influxdb
      - ./influxdb/data:/var/lib/influxdb/data
      - ./influxdb/meta:/var/lib/influxdb/meta
      - ./influxdb/wal:/var/lib/influxdb/wal
    ports:
      - "8086:8086" # exposed on the host so that Grafana can reach InfluxDB
    restart: always
  # Grafana service
  grafana:
    image: grafana/grafana:6.5.0  # newer versions may have bugs
    container_name: grafana
    ports:
      - "3000:3000"  # exposed on the host for web access

Start the monitoring services:
docker-compose -f docker-compose.yml up -d
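Then log in to the Grafana web UI (port 3000 in the mapping above), configure the InfluxDB data source and import the dashboard template.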
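
As an alternative to clicking through the web UI, the data source can also be registered through Grafana's HTTP API (POST /api/datasources). The following is a minimal Python sketch under several assumptions. The data source name "KafkaMonitor", the database name "kafka" and the admin/admin credentials are placeholders and must match your own jmxtrans.json and Grafana setup. The URL http://influxdb:8086 assumes Grafana reaches InfluxDB by its Compose service name.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, name, influx_url, database,
                             user="admin", password="admin"):
    """Build (but do not send) a POST /api/datasources request for Grafana."""
    payload = {
        "name": name,
        "type": "influxdb",
        "access": "proxy",  # Grafana's backend talks to InfluxDB, not the browser
        "url": influx_url,
        "database": database,
    }
    # HTTP basic authentication; the default credentials are an assumption.
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


def create_datasource(request, timeout=5):
    """Send the prepared request and return Grafana's decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would be create_datasource(build_datasource_request("http://localhost:3000", "KafkaMonitor", "http://influxdb:8086", "kafka")). When the dashboard template is imported afterwards, select this data source for the DS_KAFKAMONITOR input.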

6. Viewing the monitoring dashboard

[Figure: Kafka Quick Start (7): Kafka Monitoring]
