A Quick Introduction to Prometheus

Date: 2019-12-26


Overview

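Prometheus is a standalone open-source monitoring system. Its main components are a time-series database, data collection, data querying (the PromQL query language), and alerting.
First, look at the Prometheus architecture diagram.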

Prometheus Workflow

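First, deploy the application to be monitored: App.
App and Prometheus must be able to communicate for monitoring to work.
Associate App with Prometheus, that is, configure the location of the monitored App in Prometheus.
To collect data from App, App must expose an HTTP interface written according to the rules Prometheus defines.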

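There are two ways to collect data:
1. For a Java application, for example, add the relevant dependency to the application so that it exposes a pull interface from which Prometheus collects data.
2. Provide an exporter as an intermediate layer that adapts the data for collection.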

Prometheus pulls data through this interface at regular intervals, which is how data collection is achieved.
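To make the pull interface more concrete, here is a minimal Python sketch of what an application could return from its metrics endpoint, using the Prometheus text exposition format. The function name `render_metrics`, the tuple layout, and the metric name are hypothetical choices for illustration; a real application would normally use an official client library, and this sketch does not escape special characters in label values.

```python
def render_metrics(samples):
    """Render samples in the Prometheus text exposition format.

    `samples` is a list of (name, help_text, metric_type, labels, value)
    tuples. This layout is an assumption made for this sketch only.
    Samples of the same metric are expected to be adjacent in the list.
    """
    lines = []
    seen = set()
    for name, help_text, metric_type, labels, value in samples:
        if name not in seen:
            # HELP and TYPE comment lines are emitted once per metric name.
            lines.append(f"# HELP {name} {help_text}")
            lines.append(f"# TYPE {name} {metric_type}")
            seen.add(name)
        if labels:
            label_str = ",".join(
                f'{key}="{val}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics([
        ("app_requests_total", "Total requests.", "counter", {"method": "get"}, 3),
    ]), end="")
```

The application would serve this text over HTTP (conventionally at a path such as /metrics), and Prometheus would fetch it on each scrape.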
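However, a monitoring platform often monitors not one application but possibly hundreds or thousands, and the locations and other configuration of these applications change dynamically. To make deploying and managing applications easier, a cluster management system such as Kubernetes or Marathon is introduced.
Prometheus then only needs to connect to the cluster management system to obtain the configuration of all monitored applications.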
We can then use PromQL to query the data we want, or display it as graphs.
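As a small illustration of querying, the sketch below builds a URL for an instant query against the Prometheus HTTP API endpoint `/api/v1/query`. The helper name `build_query_url` and the base URL `http://localhost:9090` are assumptions for the example; the sketch only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql, time=None):
    """Build the URL for an instant PromQL query.

    `base_url` is wherever the Prometheus server is reachable
    (assumed here, not taken from the article).
    """
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (Unix seconds or RFC 3339).
        params["time"] = time
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("http://localhost:9090", "up"))
```

The resulting URL can be fetched with any HTTP client; the server answers with a JSON document containing the query result.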
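Prometheus generates alerts according to the configured rules and sends them to Alertmanager. Alertmanager processes them (grouping, deduplication, and so on) and only then sends the alert information to users by email, SMS, or phone.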
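The grouping and deduplication step can be pictured with the following simplified Python sketch. It is not Alertmanager's implementation: the alert dictionaries, the `fingerprint` helper, and `group_alerts` are hypothetical, and timing parameters such as group_wait are ignored.

```python
from collections import defaultdict


def fingerprint(labels):
    """Identify an alert by its full label set, independent of key order."""
    return tuple(sorted(labels.items()))


def group_alerts(alerts, group_by):
    """Deduplicate alerts by label set, then group them by `group_by` labels."""
    latest = {}
    for alert in alerts:
        # A later alert with identical labels replaces the earlier one.
        latest[fingerprint(alert["labels"])] = alert
    groups = defaultdict(list)
    for alert in latest.values():
        # Missing labels are treated as empty strings in the group key.
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"labels": {"alertname": "HighLatency", "cluster": "a", "instance": "1"}},
        {"labels": {"alertname": "HighLatency", "cluster": "a", "instance": "1"}},
        {"labels": {"alertname": "HighLatency", "cluster": "a", "instance": "2"}},
    ]
    for key, members in group_alerts(sample, ["cluster", "alertname"]).items():
        print(key, len(members))
```

Each resulting group would become one notification, so a burst of related alerts reaches the user as a single message.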

Alerting rules are configured on the Prometheus side, in rule files referenced from the Prometheus configuration file.

Alertmanager


Configuration for receiving alerts

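The configuration file defines inhibition rules, notification routing, and notification receivers.
To see all available command-line flags, run: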

alertmanager -h

Start Alertmanager with a specific configuration file:

./alertmanager --config.file=simple.yml

Route matching

Example

# The root route with all parameters, which are inherited by the child
# routes if they are not overwritten.
route:
  receiver: 'default-receiver'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [cluster, alertname]
  # All alerts that do not match the following child routes
  # will remain at the root node and be dispatched to 'default-receiver'.
  routes:
  # All alerts with service=mysql or service=cassandra
  # are dispatched to the database pager.
  - receiver: 'database-pager'
    group_wait: 10s
    match_re:
      service: mysql|cassandra
  # All alerts with the team=frontend label match this sub-route.
  # They are grouped by product and environment rather than cluster
  # and alertname.
  - receiver: 'frontend-pager'
    group_by: [product, environment]
    match:
      team: frontend

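Every alert enters the routing tree at the configured top-level route, which must match all alerts. It then traverses the child nodes. If continue is set to false, it stops after the first matching child. If continue is true on a matching node, the alert continues matching against subsequent siblings. If an alert does not match any children of a node (no matching child nodes, or none exist), the alert is handled according to the configuration parameters of the current node.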
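The traversal described above can be sketched in Python as follows. This is a simplified model, not Alertmanager's code: the `Route` class and its `resolve` method are hypothetical names, regular expressions are assumed to be fully anchored, and inheritance of parameters such as group_wait is left out. The tree mirrors the example configuration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Route:
    receiver: str
    match: dict = field(default_factory=dict)     # exact label matches
    match_re: dict = field(default_factory=dict)  # regular-expression matches
    continue_: bool = False                       # `continue` is a Python keyword
    routes: list = field(default_factory=list)    # child routes

    def matches(self, labels):
        for name, value in self.match.items():
            if labels.get(name, "") != value:
                return False
        for name, pattern in self.match_re.items():
            # Assumption: patterns must match the whole label value.
            if re.fullmatch(pattern, labels.get(name, "")) is None:
                return False
        return True

    def resolve(self, labels):
        """Return the routes that handle an alert with these labels."""
        if not self.matches(labels):
            return []
        found = []
        for child in self.routes:
            hits = child.resolve(labels)
            found.extend(hits)
            # Stop at the first matching child unless it sets continue.
            if hits and not child.continue_:
                break
        # No child matched: the current node handles the alert itself.
        return found or [self]


root = Route(
    receiver="default-receiver",
    routes=[
        Route(receiver="database-pager", match_re={"service": "mysql|cassandra"}),
        Route(receiver="frontend-pager", match={"team": "frontend"}),
    ],
)

if __name__ == "__main__":
    for labels in ({"service": "mysql"}, {"team": "frontend"}, {"service": "redis"}):
        print(labels, [r.receiver for r in root.resolve(labels)])
```

The root route has no matchers, so it matches every alert, which is what lets unmatched alerts fall back to 'default-receiver'.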

Data structures for received alerts

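A webhook is an API concept and one of the usage patterns of microservice APIs. It is also called a reverse API: the front end does not actively send requests, and everything is pushed by the back end. A common example: when a friend posts to their Moments feed, the back end pushes that post to the clients of all their other friends. That is a typical webhook scenario.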
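To illustrate the push model, here is a minimal sketch of a webhook receiver using only the Python standard library. The payload shape (an "alerts" list whose items carry "status" and "labels") is an assumption for this example, and `summarize_webhook` and `WebhookHandler` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler


def summarize_webhook(body):
    """Parse a pushed JSON body and return one summary line per alert.

    Assumes a payload with an "alerts" list; missing fields get placeholders.
    """
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        status = alert.get("status", "unknown")
        lines.append(f"[{status}] {labels.get('alertname', '?')}")
    return lines


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts POSTed notifications; the sender initiates every request."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for line in summarize_webhook(self.rfile.read(length)):
            print(line)
        self.send_response(200)
        self.end_headers()

# To run the receiver (port chosen arbitrarily for this sketch):
#   from http.server import HTTPServer
#   HTTPServer(("", 5001), WebhookHandler).serve_forever()
```

The receiver never polls; it simply waits for the sender to deliver each notification.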

Prometheus pushes alerts to Alertmanager via a webhook. The data structure is as follows:

The Alerts structure is as follows:

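The labels of each alert identify identical instances of an alert and are used for deduplication. The annotations are always set to those received most recently and do not identify an alert.
KV is a set of key/value string pairs used to represent labels and annotations.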

type KV map[string]string

For example:

{
  summary: "alert summary",
  description: "alert description",
}

Operations on KV:
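As a rough illustration of the kind of operations a key/value set like KV lends itself to, the sketch below models KV as a Python dict. The helper names `sorted_pairs`, `remove`, `names`, and `values` are hypothetical and chosen for this example; they are not presented as Alertmanager's API.

```python
def sorted_pairs(kv):
    """Return the (key, value) pairs sorted by key."""
    return sorted(kv.items())


def remove(kv, keys):
    """Return a copy of `kv` without the given keys."""
    return {k: v for k, v in kv.items() if k not in keys}


def names(kv):
    """Return the keys in sorted order."""
    return [k for k, _ in sorted_pairs(kv)]


def values(kv):
    """Return the values, ordered by their keys."""
    return [v for _, v in sorted_pairs(kv)]


if __name__ == "__main__":
    annotations = {"summary": "alert summary", "description": "alert description"}
    print(sorted_pairs(annotations))
    print(remove(annotations, ["summary"]))
```

Sorting by key gives a stable order, which is convenient when labels or annotations are rendered into a notification.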

Alertmanager management

Check Alertmanager health status

GET /-/healthy

Check whether Alertmanager is ready to serve traffic

GET /-/ready

Reload the configuration

POST /-/reload
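The three management endpoints can be called from a script. The following is a minimal sketch using the Python standard library; the helper names `management_request` and `call`, and the base URL `http://localhost:9093`, are assumptions for the example.

```python
from urllib.request import Request, urlopen

# HTTP method and path for each management action listed above.
ENDPOINTS = {
    "healthy": ("GET", "/-/healthy"),
    "ready": ("GET", "/-/ready"),
    "reload": ("POST", "/-/reload"),
}


def management_request(base_url, action):
    """Build (but do not send) the request for a management action."""
    method, path = ENDPOINTS[action]
    return Request(base_url.rstrip("/") + path, method=method)


def call(base_url, action, timeout=5):
    """Send the request and return the HTTP status code.

    urlopen raises an exception for error responses and unreachable hosts.
    """
    with urlopen(management_request(base_url, action), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    req = management_request("http://localhost:9093", "reload")
    print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the sketch easy to check without a running Alertmanager instance.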
