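I have recently been setting up monitoring for a Kafka cluster, and read a good deal of material online beforehand. I chose jmxtrans + InfluxDB + Grafana because the dashboards look good and can be customized. The drawback is that this stack cannot operate on the Kafka cluster itself, so you may need to use it together with Kafka Manager.
JMX (Java Management Extensions) is a framework for adding management capabilities to applications, devices, and systems. JMX works across heterogeneous operating systems, system architectures, and network transport protocols, which makes it possible to build well-integrated system, network, and service management applications. As a Java application, Kafka already defines a rich set of performance metrics (see the Kafka monitoring metrics documentation), and JMX makes them easy to monitor.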
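In ${KAFKA_HOME}/bin/, edit kafka-server-start.sh and set the JMX port near the top of the script, right after the shebang line. The variable is exported so that it reaches kafka-run-class.sh, which the script launches as a separate process:
export JMX_PORT=9999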
Restart Kafka:
./bin/kafka-server-stop.sh
./bin/kafka-server-start.sh -daemon ./config/server.properties
After the restart, check the Kafka process and the JMX port:
ps -ef | grep kafka
root 8273 1 99 02:32 pts/0 00:00:09 /opt/jdk1.8.0_201/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 ...... kafka.Kafka ./config/server.properties
netstat -anop | grep 9999
tcp6 0 0 :::9999 :::* LISTEN 8273/java off (0.00/0/0)
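The port check can also be scripted, for example as part of a health check. The sketch below is not from the original setup: it assumes the broker runs on 127.0.0.1 with the JMX port 9999 chosen earlier, and the helper name is_port_open is made up for illustration. A successful TCP connection only shows that something is listening on the port, not that it speaks JMX.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # 9999 is the JMX_PORT value used in this article (an assumption for other setups).
    print("JMX port open:", is_port_open("127.0.0.1", 9999))
```

If this prints False right after a restart, the broker may simply still be starting, so retrying after a few seconds is reasonable.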
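InfluxDB is a time series database built to handle high write and query loads. It is intended as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.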
Download the InfluxDB RPM package:
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.7.5.x86_64.rpm
--2019-04-10 02:52:30-- https://dl.influxdata.com/influxdb/releases/influxdb-1.7.5.x86_64.rpm
Resolving dl.influxdata.com (dl.influxdata.com)... 54.192.151.21, 54.192.151.81, 54.192.151.87, ...
Connecting to dl.influxdata.com (dl.influxdata.com)|54.192.151.21|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 46536692 (44M) [application/octet-stream]
Saving to: ‘influxdb-1.7.5.x86_64.rpm’
100%[========================================>] 46,536,692 440KB/s in 60s
2019-04-10 02:53:37 (756 KB/s) - ‘influxdb-1.7.5.x86_64.rpm’ saved [46536692/46536692]
Install the RPM package:
rpm -ivh influxdb-1.7.5.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:influxdb-1.7.5-1 ################################# [100%]
Created symlink from /etc/systemd/system/influxd.service to /usr/lib/systemd/system/influxdb.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/influxdb.service to /usr/lib/systemd/system/influxdb.service.
Start InfluxDB:
service influxdb start
Redirecting to /bin/systemctl start influxdb.service
Check the InfluxDB status:
ps -ef | grep influxdb
influxdb 8475 1 2 03:01 ? 00:00:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
root 8486 7007 0 03:02 pts/0 00:00:00 grep --color=auto influxdb
service influxdb status
Redirecting to /bin/systemctl status influxdb.service
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2019-04-10 03:01:48 EDT; 22s ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 8475 (influxd)
CGroup: /system.slice/influxdb.service
└─8475 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375804Z lvl=info msg="Starting precreation service" log_id=0EiWgWRl000 service=shard-precreation check_interval=10m advance_period=30m
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375810Z lvl=info msg="Starting snapshot service" log_id=0EiWgWRl000 service=snapshot
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375816Z lvl=info msg="Starting continuous query service" log_id=0EiWgWRl000 service=continuous_querier
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375826Z lvl=info msg="Starting HTTP service" log_id=0EiWgWRl000 service=httpd authentication=false
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375830Z lvl=info msg="opened HTTP access log" log_id=0EiWgWRl000 service=httpd path=stderr
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375936Z lvl=info msg="Listening on HTTP" log_id=0EiWgWRl000 service=httpd addr=[::]:8086 https=false
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375949Z lvl=info msg="Starting retention policy enforcement service" log_id=0EiWgWRl000 service=retention check_interval=30m
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376138Z lvl=info msg="Listening for signals" log_id=0EiWgWRl000
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376389Z lvl=info msg="Storing statistics" log_id=0EiWgWRl000 service=monitor db_instance=_internal db_rp=monitor interval=10s
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376534Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0EiWgWRl000
Open the InfluxDB client:
influx
Connected to http://localhost:8086 version 1.7.5
InfluxDB shell version: 1.7.5
Enter an InfluxQL query
>
Create a user and a database:
> CREATE USER "admin" WITH PASSWORD 'admin' WITH ALL PRIVILEGES
> create database "jmxDB"
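A user and a database are all InfluxDB needs for now. A few other basic operations that come up later:
# Create a database
create database "db_name"
# List all databases
show databases
# Drop a database
drop database "db_name"
# Switch to a database
use db_name
# List all measurements (tables) in the current database
show measurements
# Create a measurement: there is no separate create statement; name it when inserting data
insert test,host=127.0.0.1,monitor_name=test count=1
# Drop a measurement
drop measurement "measurement_name"
# Exit
quit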
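The insert statement above takes InfluxDB line protocol: a measurement name, comma-separated tags, a space, then one or more fields. As a rough illustration, the helper below builds such a line. The name to_line_protocol is hypothetical and not part of any InfluxDB client; the sketch covers only the basic escaping rules and omits the optional timestamp. A bare count=1 typed into the CLI is stored as a float, so the example passes 1.0; a Python int is written with a trailing i.

```python
def _escape(text, chars):
    """Backslash-escape every character of `chars` found in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render one field value using line protocol typing rules."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry a trailing "i"
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields):
    """Build 'measurement,tag=value,... field=value,...' without a timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    parts = [_escape(measurement, ", ")]
    for key in sorted(tags):  # tags are emitted in sorted key order
        parts.append(_escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= "))
    field_text = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    return ",".join(parts) + " " + field_text


if __name__ == "__main__":
    # Mirrors the CLI example: insert test,host=127.0.0.1,monitor_name=test count=1
    print(to_line_protocol("test", {"host": "127.0.0.1", "monitor_name": "test"}, {"count": 1.0}))
```

Tags become indexed string columns and fields hold the measured values, which is why the jmxtrans configurations later put labels such as application under tags.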
jmxtrans automatically pulls JMX data from a JVM and writes it to another application (InfluxDB in this example), in a format defined by JSON configuration files.
Download the jmxtrans RPM package:
wget http://central.maven.org/maven2/org/jmxtrans/jmxtrans/270/jmxtrans-270.rpm
--2019-04-10 03:18:14-- http://central.maven.org/maven2/org/jmxtrans/jmxtrans/270/jmxtrans-270.rpm
Resolving central.maven.org (central.maven.org)... 151.101.40.209
Connecting to central.maven.org (central.maven.org)|151.101.40.209|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18750744 (18M) [application/x-rpm]
Saving to: ‘jmxtrans-270.rpm’
100%[========================================>] 18,750,744 342KB/s in 43s
2019-04-10 03:18:59 (422 KB/s) - ‘jmxtrans-270.rpm’ saved [18750744/18750744]
Install the RPM package:
rpm -ivh jmxtrans-270.rpm
Preparing... ################################# [100%]
Updating / installing...
1:jmxtrans-270-1 ################################# [100%]
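jmxtrans paths:
Installation directory: /usr/share/jmxtrans
Default directory for JSON files: /var/lib/jmxtrans/
Log file: /var/log/jmxtrans/jmxtrans.log
Next, write the JSON configuration. The jmxtrans GitHub repository includes a sample configuration: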
{
  "servers" : [ {
    "port" : "1099",
    "host" : "w2",
    "queries" : [ {
      "obj" : "java.lang:type=Memory",
      "attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ],
      "resultAlias" : "jvmMemory",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://127.0.0.1:8086/",
        "username" : "admin",
        "password" : "admin",
        "database" : "jmxDB",
        "tags" : {"application" : "kafka"}
      } ]
    } ]
  } ]
}
Start jmxtrans:
service jmxtrans start
Starting JmxTrans...
If the log shows no errors, jmxtrans started successfully:
tail /var/log/jmxtrans/jmxtrans.log
INFO | jvm 1 | 2019/04/10 04:44:31 | Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
INFO | jvm 1 | 2019/04/10 04:44:31 | Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
INFO | jvm 1 | 2019/04/10 04:44:31 |
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'ServerScheduler' initialized from an externally opened InputStream.
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 1.8.6
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.core.QuartzScheduler - JobFactory set to: com.googlecode.jmxtrans.guice.GuiceJobFactory@23822296
2019-04-10 04:44:31 [WrapperSimpleAppMain] level com.googlecode.jmxtrans.JmxTransformer [JmxTransformer.java:177] - Starting Jmxtrans on : /var/lib/jmxtrans
2019-04-10 04:44:31 [WrapperSimpleAppMain] level org.quartz.core.QuartzScheduler [QuartzScheduler.java:519] - Scheduler ServerScheduler_$_node11554885871753 started.
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO c.googlecode.jmxtrans.JmxTransformer - Starting Jmxtrans on : /var/lib/jmxtrans
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.core.QuartzScheduler - Scheduler ServerScheduler_$_node11554885871753 started.
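Reading the log by eye works for a first start, but the "no errors" check can be roughly automated. The sketch below is an assumption-laden illustration rather than anything jmxtrans provides: it treats a line as a problem if it contains the word ERROR or a Java-style class name ending in Exception or Error. The function name find_problem_lines and the pattern are made up for this example and may need tuning for a real log.

```python
import re
from pathlib import Path

# Assumption: trouble shows up as an ERROR level or a Java exception/error class name.
PROBLEM = re.compile(r"\bERROR\b|\b\w+(?:Exception|Error)\b")


def find_problem_lines(lines):
    """Return the lines that look like errors, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if PROBLEM.search(line)]


if __name__ == "__main__":
    log_path = Path("/var/log/jmxtrans/jmxtrans.log")  # log path listed earlier
    if log_path.exists():
        with log_path.open(errors="replace") as handle:
            for hit in find_problem_lines(handle):
                print(hit)
```

An empty result only means that nothing matched the pattern; it is still worth confirming in InfluxDB that measurements are actually arriving.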
Here are two general-purpose JSON files:
base_127.0.0.1.json
{
  "servers": [{
    "port": "9999",
    "host": "127.0.0.1",
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "BytesInPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "BytesInPerSec" } }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "BytesOutPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "BytesOutPerSec" } }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "BytesRejectedPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "BytesRejectedPerSec" } }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "MessagesInPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "MessagesInPerSec" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchConsumer",
      "attr": ["Count"],
      "resultAlias": "RequestsPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "request": "FetchConsumer" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchFollower",
      "attr": ["Count"],
      "resultAlias": "RequestsPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "request": "FetchFollower" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=Produce",
      "attr": ["Count"],
      "resultAlias": "RequestsPerSec",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "request": "Produce" } }]
    }, {
      "obj": "java.lang:type=Memory",
      "attr": ["HeapMemoryUsage", "NonHeapMemoryUsage"],
      "resultAlias": "MemoryUsage",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "MemoryUsage" } }]
    }, {
      "obj": "java.lang:type=GarbageCollector,name=*",
      "attr": ["CollectionCount", "CollectionTime"],
      "resultAlias": "GC",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "GC" } }]
    }, {
      "obj": "java.lang:type=Threading",
      "attr": ["PeakThreadCount", "ThreadCount"],
      "resultAlias": "Thread",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "Thread" } }]
    }, {
      "obj": "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica",
      "attr": ["Value"],
      "resultAlias": "ReplicaFetcherManager",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "MaxLag" } }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=PartitionCount",
      "attr": ["Value"],
      "resultAlias": "ReplicaManager",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "PartitionCount" } }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
      "attr": ["Value"],
      "resultAlias": "ReplicaManager",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "UnderReplicatedPartitions" } }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=LeaderCount",
      "attr": ["Value"],
      "resultAlias": "ReplicaManager",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "LeaderCount" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer",
      "attr": ["Count", "Max"],
      "resultAlias": "TotalTimeMs",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "FetchConsumer" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower",
      "attr": ["Count", "Max"],
      "resultAlias": "TotalTimeMs",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "FetchFollower" } }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce",
      "attr": ["Count", "Max"],
      "resultAlias": "TotalTimeMs",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "Produce" } }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=IsrShrinksPerSec",
      "attr": ["Count"],
      "resultAlias": "ReplicaManager",
      "outputWriters": [{ "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory", "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxDB", "tags": { "application": "IsrShrinksPerSec" } }]
    }]
  }]
}
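A file as long as base_127.0.0.1.json is easy to break while editing, so it can help to parse it before restarting jmxtrans. The sketch below is one possible pre-flight check, not a jmxtrans feature: it loads the JSON and lists each query's target. The name summarize_config is hypothetical, and the three "expected" keys are simply the ones every query in this article uses, not a statement about what jmxtrans itself requires.

```python
import json

# Assumption: every query in this article's files carries these three keys.
EXPECTED_QUERY_KEYS = ("obj", "attr", "outputWriters")


def summarize_config(text):
    """Parse a jmxtrans-style JSON document and return one row per query.

    Raises json.JSONDecodeError for invalid JSON and ValueError when a
    query lacks one of the keys this article's configurations rely on.
    """
    config = json.loads(text)
    rows = []
    for server in config["servers"]:
        for query in server["queries"]:
            missing = [key for key in EXPECTED_QUERY_KEYS if key not in query]
            if missing:
                raise ValueError(f"{query.get('obj', '<no obj>')} is missing {missing}")
            rows.append((server["host"], server["port"], query["obj"], query["attr"]))
    return rows


if __name__ == "__main__":
    # A cut-down stand-in for the real file, with the writer details omitted.
    sample = """
    {"servers": [{"port": "9999", "host": "127.0.0.1", "queries": [
      {"obj": "java.lang:type=Threading",
       "attr": ["PeakThreadCount", "ThreadCount"],
       "resultAlias": "Thread",
       "outputWriters": []}
    ]}]}
    """
    for row in summarize_config(sample):
        print(row)
```

To check a real file, read its contents (for example from /var/lib/jmxtrans/) and pass the text to summarize_config.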
topicA_1.json
{
"servers": [{
"port": "9999",
"host": "127.0.0.1",
"queries": [{
"obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=topicA",
"attr": ["Count"],
"resultAlias": "topicA",
"outputWriters": [{
"@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url": "http://127.0.0.1:8086/",
"username": "admin",
"password": "admin",
"database": "jmxDB",
"tags": {
"application": "BytesInPerSec"
}
}]
}, {
"obj": "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=topicA",
"attr": ["Count"],
"resultAlias": "topicA",
"outputWriters": [{
"@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url": "http://127.0.0.1:8086/",
"username": "admin",
"password": "admin",
"database": "jmxDB",
"tags": {
"application": "BytesOutPerSec"
}
}]
}, {
"obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=topicA",
"attr": ["Count"],
"resultAlias": "topicA",
"outputWriters": [{
"@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url": "http://127.0.0.1:8086/",
"username": "admin",
"password": "admin",
"database": "jmxDB",
"tags": {
"application": "MessagesInPerSec"
}
}]
}, {
"obj": "kafka.log:type=Log,name=LogEndOffset,topic=topicA,partition=*",
"attr": ["Value"],
"resultAlias": "topicA",
"outputWriters": [{
"@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url": "http://127.0.0.1:8086/",
"username": "admin",
"password": "admin",
"database": "jmxDB",
"tags": {
"application": "LogEndOffset"
}
}]
}]
}]
}
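Since topicA_1.json differs from one topic to the next only in the topic name, the file can be generated instead of copied by hand. The sketch below is one way to do that, under the assumption that the InfluxDB URL, credentials, and database are the ones used throughout this article. The helper names influx_writer and topic_config are made up for illustration, and topic names that need quoting inside a JMX ObjectName are not handled.

```python
import json

# (ObjectName template, attribute, tag value) for each per-topic metric above.
TOPIC_METRICS = [
    ("kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic={topic}", "Count", "BytesInPerSec"),
    ("kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic={topic}", "Count", "BytesOutPerSec"),
    ("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic={topic}", "Count", "MessagesInPerSec"),
    ("kafka.log:type=Log,name=LogEndOffset,topic={topic},partition=*", "Value", "LogEndOffset"),
]


def influx_writer(tag_value):
    """Output writer block; connection values are this article's, not defaults."""
    return {
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/",
        "username": "admin",
        "password": "admin",
        "database": "jmxDB",
        "tags": {"application": tag_value},
    }


def topic_config(topic, host="127.0.0.1", port="9999"):
    """Build the per-topic configuration as a plain dict."""
    queries = [
        {
            "obj": template.format(topic=topic),
            "attr": [attr],
            "resultAlias": topic,
            "outputWriters": [influx_writer(tag)],
        }
        for template, attr, tag in TOPIC_METRICS
    ]
    return {"servers": [{"port": port, "host": host, "queries": queries}]}


if __name__ == "__main__":
    # Prints JSON equivalent to topicA_1.json; redirect it to a file to use it.
    print(json.dumps(topic_config("topicA"), indent=2))
```

Keeping the writer settings in one function also means a changed password or database name only has to be edited once.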
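Grafana is a cross-platform, open-source tool for metrics analysis and visualization. It queries the collected data, displays it visually, and can send timely alerts.
Download the Grafana RPM package: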
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-6.0.2-1.x86_64.rpm
--2019-04-10 04:53:15-- https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-6.0.2-1.x86_64.rpm
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.144.92
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.144.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56002012 (53M) [application/x-redhat-package-manager]
Saving to: ‘grafana-6.0.2-1.x86_64.rpm’
100%[========================================>] 56,002,012 177KB/s in 2m 52s
2019-04-10 04:56:08 (318 KB/s) - ‘grafana-6.0.2-1.x86_64.rpm’ saved [56002012/56002012]
Install the RPM package:
rpm -ivh grafana-6.0.2-1.x86_64.rpm
warning: grafana-6.0.2-1.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 24098cb6: NOKEY
error: Failed dependencies:
fontconfig is needed by grafana-6.0.2-1.x86_64
urw-fonts is needed by grafana-6.0.2-1.x86_64
Two dependencies are missing. Download and install them, then install Grafana again:
yum install --downloadonly --downloaddir=./ fontconfig
yum localinstall fontconfig-2.13.0-4.3.el7.x86_64.rpm
yum install --downloadonly --downloaddir=./ urw-fonts
yum localinstall urw-fonts-2.4-16.el7.noarch.rpm
rpm -ivh grafana-6.0.2-1.x86_64.rpm
warning: grafana-6.0.2-1.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 24098cb6: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:grafana-6.0.2-1 ################################# [100%]
### NOT starting on installation, please execute the following statements to configure grafana to start automatically using systemd
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable grafana-server.service
### You can start grafana-server by executing
sudo /bin/systemctl start grafana-server.service
POSTTRANS: Running script
Start Grafana:
service grafana-server start
Starting grafana-server (via systemctl): [ OK ]
Open a browser and go to:
http://127.0.0.1:3000
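Log in with the default username and password, admin/admin.
Set a new password.
Click Add data source.
Select InfluxDB.
Enter the connection details, then click Save & Test.
Once the test passes, click Back to return.
Use the + in the left sidebar to create or import a dashboard.
Queries resemble SQL statements; use them to select the metrics you want to display.
To get an average per-second value, take the current value, subtract the value from one minute earlier, and divide the difference by 60.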
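The per-second calculation is plain arithmetic on a counter that only grows, such as the Count attribute collected earlier. The sketch below works through it outside Grafana, purely as an illustration: the function name per_second_rates is hypothetical, the samples are assumed to be taken 60 seconds apart, and treating a negative difference as a counter reset (for example after a broker restart) is an assumption of this example.

```python
def per_second_rates(samples, interval_seconds=60):
    """Turn consecutive counter readings into average per-second rates.

    `samples` holds readings taken `interval_seconds` apart, oldest first.
    Each rate is (current - previous) / interval_seconds.
    """
    rates = []
    for previous, current in zip(samples, samples[1:]):
        delta = current - previous
        # Assumption: a negative delta means the counter was reset; report None.
        rates.append(delta / interval_seconds if delta >= 0 else None)
    return rates


if __name__ == "__main__":
    # Counter readings one minute apart: 600 and then 1200 new messages.
    print(per_second_rates([0, 600, 1800]))  # [10.0, 20.0]
```

With readings of 0, 600, and 1800 taken a minute apart, the differences are 600 and 1200, which divided by 60 give 10 and 20 per second.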
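How the dashboards end up looking is a matter of personal taste, so no screenshots are included here. That completes JMX metric monitoring for Kafka.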