【Prometheus】Part 3: Configuring Alertmanager

 

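Alerting is one of the most important parts of a monitoring system: as soon as a fault occurs, the system has to send out the event and notify the people responsible. Prometheus provides Alertmanager for this purpose.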

Step 1: In the prometheus.yml configuration file, set the Alertmanager address.
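
A minimal sketch of this setting, assuming Alertmanager runs on the same host as Prometheus and listens on its default port 9093 (the target address is an assumption; replace it with your own):

```yaml
# prometheus.yml (fragment)
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # Assumed address: Alertmanager on the same host, default port 9093.
      - 'localhost:9093'
```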

Step 2: Write the alert rules (triggers), i.e. the conditions under which an alert is raised.

In prometheus.yml, specify the path to the rule file:
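
A minimal sketch of that entry, assuming alert_rule.yml is stored in the same directory as prometheus.yml (the location is an assumption; an absolute path works as well):

```yaml
# prometheus.yml (fragment)
rule_files:
  # Assumed location: alert_rule.yml sits next to prometheus.yml.
  - 'alert_rule.yml'
```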

Contents of alert_rule.yml:

groups:
- name: node
  rules:
  - alert: node_cpu>80%
    expr: (1-rate(node_cpu_seconds_total{mode="idle"}[1m]))*100 > 80
    labels:
      severity: 3
  - alert: node_mem_available<20%
    expr: node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes*100 < 20
    labels:
      severity: 3
  - alert: node_cpu_load>10
    expr: node_load1 > 10
    labels:
      severity: 3
  - alert: node_disk<20%
    expr: node_filesystem_avail_bytes{device!='nsfs'}/node_filesystem_size_bytes{device!='nsfs'}*100 < 20
    labels:
      severity: 3
- name: docker
  rules:
  - alert: docker_cpu>50%
    expr: rate(container_cpu_usage_seconds_total{image!=''}[1m])*100 > 50
    labels:
      severity: 3
  - alert: docker_restarted
    expr: changes(container_start_time_seconds[1m]) != 0
    labels:
      severity: 4

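Here expr is the alert condition: the alert is triggered when the expression evaluates to true. The labels below it are labels attached to the alert. This example adds one label, severity, a user-defined alert level from 1 to 5 that distinguishes alerts of different priority.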
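
The rules above have no `for` clause, so an alert fires as soon as its expression is true at a single evaluation. As a sketch only, the first rule could be given a hold duration and a readable annotation; the 5m duration and the summary text are illustrative assumptions, not values from the original setup:

```yaml
groups:
- name: node
  rules:
  - alert: node_cpu>80%
    expr: (1-rate(node_cpu_seconds_total{mode="idle"}[1m]))*100 > 80
    # Assumed value: the condition must hold for 5 minutes before the alert fires.
    for: 5m
    labels:
      severity: 3
    annotations:
      # Hypothetical summary text; {{ $labels.instance }} is filled in from the alert's labels.
      summary: "CPU usage above 80% on {{ $labels.instance }}"
```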

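Step 3: Decide how a triggered alert is handled. Should a message be sent? To whom? Through which channel? All of this is configured in the alertmanager.yml configuration file.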

Its contents are as follows:

global:
  resolve_timeout: 5m

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'wechat'
  routes:
    - match_re:
        severity: 1|2|3|4|5
      receiver: 'wechat'
      continue: true
    - match:
        severity: 5
      receiver: 'message'
      continue: true
    - match:
        severity: 5
      receiver: 'call'
      continue: true
receivers:
- name: 'wechat'
  webhook_configs:
  - url: 'http://localhost/alert_wechat'
- name: 'message'
  webhook_configs:
  - url: 'http://localhost/alert_message'
- name: 'call'
  webhook_configs:
  - url: 'http://localhost/alert_call'
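inhibit_rules:
  # Alerts with severity 'critical' suppress matching alerts with severity 'warning'.
  # The rules in alert_rule.yml use numeric severities (1-5), so this rule does not match them as written.
  - source_match: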
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

All three receivers here are of one type, the webhook: Alertmanager sends the alert content in an HTTP POST request to the specified URL.
