Docker Monitoring with Prometheus

Date: 2019-12-09


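Common Docker monitoring solutions

Prometheus

Prometheus architecture

Prometheus is a solid monitoring solution that covers the whole pipeline for monitoring data: collection, storage, processing, visualization, and alerting. Its architecture consists of the following components.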

Prometheus Server

Prometheus Server pulls monitoring data from Exporters and stores it, and provides a flexible query language (PromQL) for users.
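As a rough illustration of how a client reaches PromQL, the sketch below builds a URL for the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). The server address is the one used later in this article; the helper name `build_query_url` and the example expression `up` are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Return the URL for an instant PromQL query against the HTTP API."""
    # urlencode escapes characters such as '[' and '(' in the expression.
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


# Assumed server address (the Prometheus host used later in this article).
print(build_query_url("http://10.211.55.21:9090", "up"))
```

Opening the resulting URL in a browser or with an HTTP client should return a JSON document once the server described below is running.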

Exporter

An Exporter collects performance data from target objects (hosts, containers, ...) and exposes it over an HTTP interface for Prometheus Server to scrape.
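To make the HTTP interface more concrete: an Exporter serves plain text with one sample per line. The sketch below is a minimal parser for that text format, assuming simple `name{labels} value` lines without timestamps. The function name `parse_metrics` and the sample text are illustrative only, not output captured from a real host.

```python
import re

# Matches `name{label="value",...} value`; the label block is optional.
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        name, label_block, value = match.groups()
        labels = dict(_LABEL_RE.findall(label_block or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative sample, shaped like what an exporter serves at /metrics.
SAMPLE = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_free_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+09
"""

for sample in parse_metrics(SAMPLE):
    print(sample)
```

Prometheus Server does this parsing itself on every scrape; the sketch is only meant to show how little structure the format needs.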

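Alertmanager

Users can define alerting rules based on monitoring data; when a rule's condition is met, it triggers an alert. Once Alertmanager receives an alert, it sends a notification through a predefined channel. Supported channels include Email, PagerDuty, Webhook, and others.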
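For the Webhook channel, Alertmanager posts a JSON document to the configured URL. The sketch below shows one way a receiver might condense such a payload into readable lines. The payload is a hypothetical example trimmed to the fields the code reads, and the alert name `HostDown` and the function name `summarize_alerts` are assumptions.

```python
def summarize_alerts(payload):
    """Turn an Alertmanager webhook payload into one line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '?')} on {labels.get('instance', '?')}: "
            f"{annotations.get('summary', '')}"
        )
    return lines


# Hypothetical payload, trimmed to the fields used above.
SAMPLE_PAYLOAD = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HostDown", "instance": "10.211.55.17:9100"},
            "annotations": {"summary": "Node Exporter is unreachable"},
        }
    ],
}

for line in summarize_alerts(SAMPLE_PAYLOAD):
    print(line)
```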

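Advantages of Prometheus

(1) Data is described through dimensions, which attach richer business information to each sample and so serve the needs of different workloads. Dimensions can also be added dynamically: for example, adding a user dimension to the data makes it possible to report container memory usage per user.

(2) Prometheus's rich query language makes it possible to explore the data flexibly and thoroughly.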
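The per-user example from point (1) can be sketched in a few lines. The code below mimics what a PromQL aggregation such as `sum by (user) (...)` does: it groups samples by one dimension and adds up their values. The sample data is invented, and the `user` label is the hypothetical extra dimension from the text.

```python
from collections import defaultdict


def sum_by_label(samples, label):
    """Sum sample values grouped by one label, like `sum by (label)` in PromQL."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Samples that lack the label are grouped under the empty string.
        totals[labels.get(label, "")] += value
    return dict(totals)


# Hypothetical container memory usage samples (values in bytes).
SAMPLES = [
    ({"name": "web-1", "user": "alice"}, 100.0),
    ({"name": "web-2", "user": "alice"}, 50.0),
    ({"name": "db-1", "user": "bob"}, 200.0),
]

print(sum_by_label(SAMPLES, "user"))
```

Grouping by `name` instead of `user` would give per-container totals from the same data, which is the flexibility the dimension model is meant to provide.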

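Deploying Prometheus

Environment

We will use Prometheus to monitor two Docker hosts, 10.211.55.17 and 10.211.55.21, collecting data at both the host level and the container level. Following the architecture described above, we need to run the following components.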

Prometheus Server

Prometheus Server itself runs as a container on host 10.211.55.21.

Exporter

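Prometheus has many ready-made Exporters; the full list is available at https://prometheus.io/docs/instrumenting/exporters/

This setup uses:

(1) Node Exporter, which collects host hardware and operating system data. It runs as a container on all hosts.

(2) cAdvisor, which collects container data. It runs as a container on all hosts.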

Grafana

Grafana displays the multidimensional data. It also runs as a container on host 10.211.55.21.

Running Node Exporter

Run the following command on both hosts:

sudo docker run -d -p 9100:9100 -v "/proc:/host/proc" -v "/sys:/host/sys" -v "/:/rootfs" --net=host prom/node-exporter --path.procfs /host/proc --path.sysfs /host/sys --collector.filesystem.ignored-mount-points "/(sys|proc|dev|host|etc)($|/)"

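The --net=host option is used here so that Prometheus Server can communicate with Node Exporter directly. With host networking the container shares the host's network stack, so Docker ignores the -p mapping and port 9100 is reachable on the host as-is. Once Node Exporter starts, it serves the host's monitoring data on port 9100. To test it, open http://10.211.55.17:9100/metrics in a browser.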

Running cAdvisor

Run the following command on both hosts:

sudo docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker:/var/lib/docker:ro --publish=8080:8080 --detach=true --name=cadvisor --net=host google/cadvisor:latest

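The --net=host option is used here so that Prometheus Server can communicate with cAdvisor directly. Once cAdvisor starts, it serves container monitoring data on port 8080. To test it, open http://10.211.55.17:8080/metrics in a browser.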
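Checking four endpoints by hand in a browser gets tedious, so the same test can be scripted. The sketch below builds the `/metrics` URLs for both hosts and both exporters, and includes a small reachability check. The helper names are assumptions, and the check only confirms an HTTP 200 response, not that the metrics themselves are sensible.

```python
from urllib.request import urlopen


def exporter_urls(hosts, ports):
    """Build the /metrics URL for every host and exporter port."""
    return [f"http://{host}:{port}/metrics" for host in hosts for port in ports]


def is_reachable(url, timeout=3.0):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


def report(hosts, ports):
    """Print one OK/FAIL line per exporter endpoint."""
    for url in exporter_urls(hosts, ports):
        print("OK  " if is_reachable(url) else "FAIL", url)
```

Calling `report(["10.211.55.17", "10.211.55.21"], [9100, 8080])` from a machine that can reach both hosts would print one line per endpoint.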

Running Prometheus Server

First, create the prometheus.yml file on host 10.211.55.21 with the following content:

sudo vim prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configuration: Prometheus itself plus the Node Exporter and
# cAdvisor endpoints on both hosts.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090','localhost:8080','localhost:9100','10.211.55.17:9100','10.211.55.17:8080']

After writing the configuration file, run the following command to start Prometheus as a container:

sudo docker run -d -p 9090:9090 -v /home/chenjin/prometheus.yml:/etc/prometheus/prometheus.yml --name prometheus --net=host prom/prometheus

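The --net=host option is used here so that Prometheus Server can communicate with the Exporters and Grafana directly. The most important part of the configuration file above is the targets list, which specifies which exporters to scrape. Here it lists Node Exporter and cAdvisor on both hosts; localhost:9090 is Prometheus Server itself, so Prometheus also collects its own monitoring data. To test it, open http://10.211.55.21:9090/metrics.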
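The configuration above puts every endpoint under a single job. One possible alternative, sketched below, is to generate a `scrape_configs` section with one job per exporter type, so that the `job` label distinguishes host metrics from container metrics. The job names `node` and `cadvisor` and the function `render_scrape_configs` are assumptions, and the output is a fragment to paste into prometheus.yml rather than a complete file.

```python
def render_scrape_configs(jobs):
    """Render a scrape_configs section with one job per exporter type."""
    lines = ["scrape_configs:"]
    for job_name, targets in jobs.items():
        quoted = ", ".join(f"'{target}'" for target in targets)
        lines.append(f"  - job_name: '{job_name}'")
        lines.append("    static_configs:")
        lines.append(f"      - targets: [{quoted}]")
    return "\n".join(lines) + "\n"


# The two Docker hosts from this article.
HOSTS = ["10.211.55.21", "10.211.55.17"]

JOBS = {
    "prometheus": ["localhost:9090"],
    "node": [f"{host}:9100" for host in HOSTS],
    "cadvisor": [f"{host}:8080" for host in HOSTS],
}

print(render_scrape_configs(JOBS))
```

Adding a third host then only means appending its address to `HOSTS` and regenerating the fragment.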

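Open http://10.211.55.21:9090 in a browser and click Status -> Targets in the menu. The Targets page lists each endpoint together with its state.

All targets should be in the UP state, which indicates that Prometheus Server is retrieving monitoring data normally.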
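The same information shown on the Targets page is also available as JSON from the `/api/v1/targets` endpoint of the Prometheus HTTP API. The sketch below picks out any target that is not healthy. The response shown is a hypothetical example trimmed to the two fields the code reads, and `unhealthy_targets` is an illustrative name.

```python
def unhealthy_targets(payload):
    """Return scrape URLs of active targets whose health is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [t.get("scrapeUrl", "") for t in active if t.get("health") != "up"]


# Hypothetical response, trimmed to the fields used above.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
            {"scrapeUrl": "http://10.211.55.17:8080/metrics", "health": "down"},
        ]
    },
}

print(unhealthy_targets(SAMPLE_RESPONSE))
```

An empty list corresponds to the all-UP state described above.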

Running Grafana

Run the following command on host 10.211.55.21:

sudo docker run -d -i -p 3000:3000 -e "GF_SERVER_ROOT_URL=http://grafana.server.name" -e "GF_SECURITY_ADMIN_PASSWORD=secret" --net=host grafana/grafana

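The --net=host option is used here so that Grafana can communicate with Prometheus Server directly. -e "GF_SECURITY_ADMIN_PASSWORD=secret" sets the password of Grafana's admin user to secret.

Once Grafana starts, open http://10.211.55.21:3000 in a browser.

After logging in, Grafana guides us through configuring a Data Source:

Name: the name of the Data Source, for example prometheus

Type: select Prometheus

Url: enter the address of Prometheus Server, here http://10.211.55.21:9090

Leave the other settings at their defaults and click Save & Test at the bottom.

Once the configuration is complete, Grafana can access the monitoring data stored in Prometheus.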
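The Data Source can also be created without the UI, through Grafana's HTTP API (a POST to `/api/datasources`). The sketch below only builds the JSON body for such a request, using the values from the steps above. The `access` value `proxy` (Grafana's server-side access mode) is an assumption about the desired setup, and `datasource_payload` is an illustrative name.

```python
import json


def datasource_payload(name, prometheus_url):
    """Build the JSON body for creating a Prometheus data source in Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        # Assumption: let the Grafana server, not the browser, query Prometheus.
        "access": "proxy",
    }


payload = datasource_payload("prometheus", "http://10.211.55.21:9090")
print(json.dumps(payload, indent=2))
```

Sending this body to http://10.211.55.21:3000/api/datasources with the admin credentials configured earlier should have the same effect as the Save & Test steps.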