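First, the architecture diagram:
Flink App: sends its metrics out through a metrics reporter
Pushgateway: an important tool in the Prometheus ecosystem; it accepts pushed metrics and exposes them for Prometheus to scrape
Prometheus: an open-source system monitoring and alerting framework (see "Prometheus: Introduction and Practice")
Grafana: a cross-platform, open-source metrics analysis and visualization tool that queries collected data, displays it visually, and sends timely notifications (see "Visualization Tool Grafana: Introduction and Installation")
Node_exporter: like Pushgateway, a component of the Prometheus ecosystem; it collects host metrics such as CPU, memory, and disk usage
The installation below mostly follows this blog post: http://www.javashuo.com/article/p-bydptqfy-hb.html
1. Pull the Docker images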
    docker pull prom/node-exporter
    docker pull prom/pushgateway
    docker pull prom/prometheus
    docker pull grafana/grafana
Check the downloaded images:
    $ docker images
    REPOSITORY           TAG       IMAGE ID       CREATED        SIZE
    prom/prometheus      latest    d5b9d7ed160a   2 weeks ago    138MB
    grafana/grafana      latest    a6e14b4109af   2 weeks ago    253MB
    prom/pushgateway     latest    20e6dcae675f   4 weeks ago    19.2MB
    prom/node-exporter   latest    e5a616e4b9cf   2 months ago   22.9MB
2. Edit prometheus.yml and create the Grafana data directory
    $ mkdir /opt/grafana-storage            # Grafana data directory
    $ cat /opt/prometheus/prometheus.yml    # Prometheus configuration
    global:
      scrape_interval: 60s
      evaluation_interval: 60s

    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets: ['localhost:9090']
            labels:
              instance: prometheus

      - job_name: linux
        static_configs:
          - targets: ['venn:9100']
            labels:
              instance: localhost

      - job_name: 'pushgateway'
        static_configs:
          - targets: ['venn:9091']
            labels:
              instance: 'pushgateway'
3. Start the components
    docker run -d -p 3000:3000 --name=grafana -v /opt/grafana-storage:/var/lib/grafana grafana/grafana
    docker run -d -p 9100:9100 -v "/proc:/host/proc:ro" -v "/sys:/host/sys:ro" -v "/:/rootfs:ro" --net="host" prom/node-exporter
    docker run -d -p 9090:9090 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
    docker run -d -p 9091:9091 prom/pushgateway
Note: because node-exporter runs with --net="host", Docker ignores its -p 9100:9100 mapping and port 9100 is bound directly on the host.
Check the Docker containers:
    $ docker ps
    CONTAINER ID   IMAGE                COMMAND                  CREATED      STATUS      PORTS                    NAMES
    4a689cf48e10   prom/pushgateway     "/bin/pushgateway"       5 days ago   Up 5 days   0.0.0.0:9091->9091/tcp   infallible_goldstine
    fcc40433bf75   grafana/grafana      "/run.sh"                5 days ago   Up 5 days   0.0.0.0:3000->3000/tcp   grafana
    8ba942d0cf35   prom/prometheus      "/bin/prometheus --c…"   5 days ago   Up 5 days   0.0.0.0:9090->9090/tcp   quizzical_colden
    b84b0f4be2b2   prom/node-exporter   "/bin/node_exporter"     5 days ago   Up 5 days                            fervent_poitras
Check the ports:
    $ netstat -apn | grep -E '9091|3000|9090|9100'
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    tcp    0  0  172.17.0.1:39028       172.17.0.4:9091        ESTABLISHED  -
    tcp6   0  0  :::9100                :::*                   LISTEN       -
    tcp6   0  0  :::3000                :::*                   LISTEN       -
    tcp6   0  0  :::9090                :::*                   LISTEN       -
    tcp6   0  0  :::9091                :::*                   LISTEN       -
    tcp6   0  0  192.168.229.129:45864  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45856  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45824  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45874  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45854  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45836  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45814  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.128:9100   192.168.229.1:13405    ESTABLISHED  -
    tcp6   0  0  192.168.229.129:45826  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45844  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.128:9091   172.17.0.2:53930       ESTABLISHED  -
    tcp6   0  0  192.168.229.129:45846  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.128:9100   172.17.0.2:54776       ESTABLISHED  -
    tcp6   0  0  192.168.229.129:45816  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45876  192.168.229.128:9091   ESTABLISHED  40846/java
    tcp6   0  0  192.168.229.129:45834  192.168.229.128:9091   TIME_WAIT    -
    tcp6   0  0  192.168.229.129:45866  192.168.229.128:9091   TIME_WAIT    -
4. Check each component's page
node_exporter: ip:9100/metrics
Prometheus: ip:9090/targets
If a target's state is not UP yet, wait a moment and it will come up.
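The same check can also be done from the command line through the Prometheus HTTP API, which is handy on a host without a browser. The following is a minimal sketch, assuming the host name venn and port 9090 from the setup above; the function names are hypothetical helpers, not part of Prometheus, and the counting relies on the API returning compact JSON.

```shell
#!/usr/bin/env bash
# Sketch only: "venn" and 9090 come from the setup above; the helper
# names are made up for illustration.

# Build the URL of the Prometheus targets API for a host and optional port.
targets_api_url() {
  printf 'http://%s:%s/api/v1/targets\n' "$1" "${2:-9090}"
}

# Read the targets API JSON on stdin and print how many targets report "up".
count_up_targets() {
  grep -o '"health":"up"' | grep -c ''
}

# Usage (needs a running Prometheus, so it is left commented out):
#   curl -s "$(targets_api_url venn)" | count_up_targets
```

With the three scrape jobs configured in prometheus.yml, the count should reach 3 once every target is up.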
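Grafana: ip:3000
Default username/password: admin/admin
Configuring the data source and creating a system-load dashboard are not repeated here; see the blog post: http://www.javashuo.com/article/p-bydptqfy-hb.html
5. Configure the Flink metrics reporter:
Add the following to the Flink configuration file flink-conf.yaml: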
    ##metrics
    metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
    metrics.reporter.promgateway.host: venn
    metrics.reporter.promgateway.port: 9091
    metrics.reporter.promgateway.jobName: myJob
    metrics.reporter.promgateway.randomJobNameSuffix: true
    metrics.reporter.promgateway.deleteOnShutdown: false
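Before starting a Flink job, it can be useful to confirm that Pushgateway at venn:9091 actually accepts pushes. The sketch below pushes one sample by hand using Pushgateway's /metrics/job/<job> endpoint; the job name smoke_test, the metric name, and the helper function names are made-up examples, not anything Flink itself sends.

```shell
#!/usr/bin/env bash
# Sketch only: "venn" and 9091 come from the configuration above;
# the job name "smoke_test" and the metric name are made-up examples.

# Build the Pushgateway push URL for a job: /metrics/job/<job>
pushgateway_url() {
  printf 'http://%s:%s/metrics/job/%s\n' "$1" "$2" "$3"
}

# Push one sample in the Prometheus text format.
# Requires curl and a reachable Pushgateway.
push_test_metric() {
  echo "smoke_test_metric 1" |
    curl --silent --show-error --data-binary @- "$(pushgateway_url "$1" "$2" "$3")"
}

# Usage (left commented out because it needs the running containers):
#   push_test_metric venn 9091 smoke_test
#   curl -s http://venn:9091/metrics | grep smoke_test_metric
#   curl -s -X DELETE "$(pushgateway_url venn 9091 smoke_test)"   # clean up
```

If the grep shows the sample, the push path works and any missing Flink metrics are more likely a reporter configuration problem than a network one.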
Start a job (the late-data-processing example from the previous post):
flink run -m yarn-cluster -ynm LateDataProcess -yn 1 -c com.venn.stream.api.sideoutput.lateDataProcess.LateDataProcess jar/flinkDemo-1.0.jar
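Check the job's web UI:
PS: the job has already been running for a while.
6. Configure Flink monitoring in Grafana
Because the Flink reporter, Pushgateway, and Prometheus were already configured above, and the Prometheus data source has already been added in Grafana, Grafana can read the Flink job's metrics without further setup.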
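If nothing shows up in Grafana, one way to check whether the Flink metrics reached Prometheus, and which label values they carry, is to query the Prometheus HTTP API directly. This is a rough sketch: the metric name flink_taskmanager_job_task_operator_numRecordsIn is an assumption about what the reporter exposes in this setup, label_values is a hypothetical helper, and the grep-based parsing only works for compact JSON without escaped quotes in label values.

```shell
#!/usr/bin/env bash
# Sketch only: the metric name below is an assumption; substitute any
# Flink metric that appears on the Pushgateway page at venn:9091.

# Read a Prometheus query response (JSON) on stdin and print the distinct
# values of the label named by the first argument, one per line.
label_values() {
  grep -o "\"$1\":\"[^\"]*\"" | cut -d'"' -f4 | sort -u
}

# Usage (needs the running stack, so it is left commented out):
#   curl -s -G http://venn:9090/api/v1/query \
#     --data-urlencode 'query=flink_taskmanager_job_task_operator_numRecordsIn' |
#     label_values operator_name
```

The label names found this way (for example job_name and operator_name) are the ones a dashboard legend can refer to.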
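On the Grafana home page, click New dashboard to create a new dashboard.
After you select a metric, the corresponding monitoring data appears.
At this point, the Flink metrics are displayed in Grafana.
Flink metric names are rather long. You can control what is shown in the Legend field by replacing key in {{key}} with the label you want to display, for example: {{job_name}}, {{operator_name}}
The result looks like this:
Save, and you are done.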