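Building Flink Monitoring with Grafana + Prometheus

First, an architecture diagram:

Flink App: sends its metrics out through a metrics reporter.

Pushgateway: a component of the Prometheus ecosystem that accepts pushed metrics and exposes them for Prometheus to scrape.

Prometheus: an open-source system monitoring and alerting framework (see "Prometheus: Introduction and Practice").

Grafana: a cross-platform, open-source metrics analysis and visualization tool. It queries the collected data, visualizes it, and can send timely notifications (see "Visualization tool Grafana: introduction and installation").

Node_exporter: like Pushgateway, a Prometheus component. It collects host-level metrics such as CPU, memory, and disk usage.

The installation below mostly follows this blog post: http://www.javashuo.com/article/p-bydptqfy-hb.html

1. Pull the Docker images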

docker pull prom/node-exporter
docker pull prom/pushgateway
docker pull prom/prometheus
docker pull grafana/grafana

List the downloaded images:

$ docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
prom/prometheus      latest              d5b9d7ed160a        2 weeks ago         138MB
grafana/grafana      latest              a6e14b4109af        2 weeks ago         253MB
prom/pushgateway     latest              20e6dcae675f        4 weeks ago         19.2MB
prom/node-exporter   latest              e5a616e4b9cf        2 months ago        22.9MB

2. Edit prometheus.yml and create the Grafana data storage directory

$ mkdir /opt/grafana-storage  # Grafana data storage directory

$ cat /opt/prometheus/prometheus.yml # Prometheus configuration
global:
  scrape_interval:     60s
  evaluation_interval: 60s
 
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
 
  - job_name: linux
    static_configs:
      - targets: ['venn:9100']
        labels:
          instance: localhost
  - job_name: 'pushgateway'
    static_configs:
      - targets: ['venn:9091']
        labels:
          instance: 'pushgateway'

3. Start the components

docker run -d -p 3000:3000   --name=grafana   -v /opt/grafana-storage:/var/lib/grafana   grafana/grafana
docker run -d -p 9100:9100  -v "/proc:/host/proc:ro"  -v "/sys:/host/sys:ro"  -v "/:/rootfs:ro"  --net="host"  prom/node-exporter
docker run -d -p 9090:9090  -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml  prom/prometheus
docker run -d -p 9091:9091 prom/pushgateway

Check the running containers:

$ docker ps
CONTAINER ID        IMAGE                COMMAND                  CREATED             STATUS              PORTS                    NAMES
4a689cf48e10        prom/pushgateway     "/bin/pushgateway"       5 days ago          Up 5 days           0.0.0.0:9091->9091/tcp   infallible_goldstine
fcc40433bf75        grafana/grafana      "/run.sh"                5 days ago          Up 5 days           0.0.0.0:3000->3000/tcp   grafana
8ba942d0cf35        prom/prometheus      "/bin/prometheus --c…"   5 days ago          Up 5 days           0.0.0.0:9090->9090/tcp   quizzical_colden
b84b0f4be2b2        prom/node-exporter   "/bin/node_exporter"     5 days ago          Up 5 days                                    fervent_poitras

Check the ports:

$ netstat -apn | grep -E '9091|3000|9090|9100'
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 172.17.0.1:39028        172.17.0.4:9091         ESTABLISHED -                   
tcp6       0      0 :::9100                 :::*                    LISTEN      -                   
tcp6       0      0 :::3000                 :::*                    LISTEN      -                   
tcp6       0      0 :::9090                 :::*                    LISTEN      -                   
tcp6       0      0 :::9091                 :::*                    LISTEN      -                   
tcp6       0      0 192.168.229.129:45864   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45856   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45824   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45874   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45854   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45836   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45814   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.128:9100    192.168.229.1:13405     ESTABLISHED -                   
tcp6       0      0 192.168.229.129:45826   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45844   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.128:9091    172.17.0.2:53930        ESTABLISHED -                   
tcp6       0      0 192.168.229.129:45846   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.128:9100    172.17.0.2:54776        ESTABLISHED -                   
tcp6       0      0 192.168.229.129:45816   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45876   192.168.229.128:9091    ESTABLISHED 40846/java          
tcp6       0      0 192.168.229.129:45834   192.168.229.128:9091    TIME_WAIT   -                   
tcp6       0      0 192.168.229.129:45866   192.168.229.128:9091    TIME_WAIT   -   

4. Check the component pages

node_exporter: ip:9100/metrics

Prometheus: ip:9090/targets

If a target's state is not UP yet, wait a moment and it will come up.
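
The same check can be scripted instead of refreshing the targets page. The sketch below assumes Prometheus is reachable at http://localhost:9090 (adjust to your host) and reads the /api/v1/targets endpoint of the Prometheus HTTP API. The function names are my own, not part of any library.

```python
import json
import urllib.request


def unhealthy_targets(payload):
    """Return (job, scrape_url, health) for every active target that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            t.get("labels", {}).get("job", "?"),
            t.get("scrapeUrl", "?"),
            t.get("health", "unknown"),
        )
        for t in targets
        if t.get("health") != "up"
    ]


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from Prometheus (base_url is an assumption)."""
    with urllib.request.urlopen(base_url + "/api/v1/targets", timeout=5) as resp:
        return json.load(resp)
```

Calling unhealthy_targets(fetch_targets()) should return an empty list once all three jobs (prometheus, linux, pushgateway) are UP.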

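Grafana: ip:3000

  Default username/password: admin/admin

Configuring the data source and creating a system load dashboard are not repeated here; see the blog post: http://www.javashuo.com/article/p-bydptqfy-hb.html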

5. Configure the Flink metrics reporter

Add the following to the Flink configuration file flink-conf.yaml:

##metrics
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
metrics.reporter.promgateway.host: venn
metrics.reporter.promgateway.port: 9091
metrics.reporter.promgateway.jobName: myJob
metrics.reporter.promgateway.randomJobNameSuffix: true
metrics.reporter.promgateway.deleteOnShutdown: false
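
Before starting a Flink job, it can help to confirm that the Pushgateway accepts pushes at all. The sketch below builds a push URL of the form /metrics/job/<job> and a one-line body in the Prometheus text exposition format. The host venn and port 9091 come from the configuration above; the job name smoke_test and the metric name are made up for illustration, and the helper names are my own.

```python
import urllib.request
from urllib.parse import quote


def pushgateway_url(host, port, job, grouping=None):
    """Build a Pushgateway push URL: /metrics/job/<job>[/<label>/<value>...]."""
    path = "/metrics/job/" + quote(job, safe="")
    for name, value in (grouping or {}).items():
        path += "/" + quote(name, safe="") + "/" + quote(value, safe="")
    return "http://{}:{}{}".format(host, port, path)


def exposition_line(name, value, labels=None):
    """Format one sample in the text exposition format.

    Assumes label values contain no quotes, backslashes, or newlines,
    so no escaping is done here.
    """
    label_text = ""
    if labels:
        pairs = ",".join('{}="{}"'.format(k, v) for k, v in sorted(labels.items()))
        label_text = "{" + pairs + "}"
    # The body must end with a newline.
    return "{}{} {}\n".format(name, label_text, value)


def push(url, body):
    """POST the body to the Pushgateway and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/plain"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

For example, push(pushgateway_url("venn", 9091, "smoke_test"), exposition_line("smoke_test_value", 1)) should make smoke_test_value show up on the Pushgateway page at ip:9091.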

Start a job (the late-data processing example from the previous post):

flink run -m yarn-cluster -ynm LateDataProcess -yn 1 -c com.venn.stream.api.sideoutput.lateDataProcess.LateDataProcess jar/flinkDemo-1.0.jar

Check the job's web UI:

PS: the job has already been running for a while.

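6. Configure Flink monitoring in Grafana

Because the Flink reporter, Pushgateway, and Prometheus are already configured above, and the Prometheus data source has already been added in Grafana, Grafana picks up the Flink job's metrics automatically.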
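
If no Flink metrics show up in Grafana, one way to narrow it down is to ask Prometheus directly which metric names it knows. The sketch below uses the /api/v1/label/__name__/values endpoint of the Prometheus HTTP API. The base URL is an assumption, and so is the flink_ prefix, which is what Flink's Prometheus reporters produce in the versions I have seen. The function names are my own.

```python
import json
import urllib.request


def flink_metric_names(payload, prefix="flink_"):
    """Pick the metric names starting with `prefix` from a label-values response."""
    return sorted(name for name in payload.get("data", []) if name.startswith(prefix))


def fetch_metric_names(base_url="http://localhost:9090"):
    """Fetch all metric names known to Prometheus (base_url is an assumption)."""
    url = base_url + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

An empty result from flink_metric_names(fetch_metric_names()) suggests the problem is between Flink, Pushgateway, and Prometheus rather than in Grafana.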

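On the Grafana home page, click New dashboard to create a new dashboard.

Once a metric is selected, the corresponding graph appears.

At this point, the Flink metrics are displayed in Grafana.

Flink metric names are quite long. You can control what is displayed in the Legend field: replace key in {{key}} with the label you want to show, for example {{job_name}} or {{operator_name}}.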
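
To make the {{key}} substitution concrete, here is a small sketch of how such a legend template expands against a series' labels. It only illustrates the idea and is not Grafana's implementation. The label values are made up, and replacing a missing label with an empty string is my own choice; Grafana's exact handling may differ.

```python
import re

# Matches {{label_name}}, allowing optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_legend(template, labels):
    """Replace each {{label}} in the template with that label's value."""
    return _PLACEHOLDER.sub(lambda m: labels.get(m.group(1), ""), template)


# Hypothetical labels of one Flink operator series.
series_labels = {"job_name": "LateDataProcess", "operator_name": "Map"}
legend = render_legend("{{job_name}} / {{operator_name}}", series_labels)
```

With a template like this, each series is labelled by its job and operator instead of the full metric name and label set.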

The result looks like this:

Save, and you're done.
