Getting Started with Prometheus

Date: 2019-11-24

Tags: prometheus, getting started

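What is a TSDB?

A TSDB (Time Series Database) can be understood simply as software optimized for handling time series data, in which the data points are indexed by time.

Characteristics of time series databases

Most operations are writes.
Writes are almost always sequential appends; most of the time, data arrives already ordered by time.
Writes rarely insert data from long ago and rarely update existing data. In most cases data is written to the database within seconds or minutes of being collected.
Deletes are usually done in blocks: pick a starting point in history and specify the blocks that follow. Deleting data at a single timestamp or at scattered random timestamps is rare.
The underlying data set is large and usually exceeds the size of memory. A query typically selects only a small, irregular part of it, so caching helps very little.
Reads are typically sequential, in ascending or descending time order.
Highly concurrent reads are very common.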
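
The access pattern described above can be sketched as a toy in-memory series. This is only an illustration under simplifying assumptions (one series, everything in memory); the class name `TinySeries` and its methods are made up for this example and do not come from any real TSDB.

```python
import bisect


class TinySeries:
    """Toy in-memory time series: append-mostly writes, range reads, block deletes."""

    def __init__(self):
        self.timestamps = []  # kept sorted, so the timestamp acts as the index
        self.values = []

    def append(self, ts, value):
        # Fast path: samples almost always arrive in time order.
        if not self.timestamps or ts >= self.timestamps[-1]:
            self.timestamps.append(ts)
            self.values.append(value)
            return
        # Rare path: a late sample is inserted at its sorted position.
        i = bisect.bisect_right(self.timestamps, ts)
        self.timestamps.insert(i, ts)
        self.values.insert(i, value)

    def read_range(self, start, end):
        # Sequential read of all samples with start <= ts <= end.
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def delete_before(self, cutoff):
        # Block delete: drop every sample older than the cutoff in one step.
        i = bisect.bisect_left(self.timestamps, cutoff)
        del self.timestamps[:i]
        del self.values[:i]


series = TinySeries()
for ts, value in [(10, 0.5), (20, 0.7), (30, 0.6), (25, 0.9)]:
    series.append(ts, value)
```

A real TSDB adds on-disk blocks, compression, and per-series indexes, but the same three operations (ordered append, range read, block delete) dominate its workload.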

Common time series databases

TSDB project	Website
influxDB	https://influxdata.com/
RRDtool	http://oss.oetiker.ch/rrdtool/
Graphite	http://graphiteapp.org/
OpenTSDB	http://opentsdb.net/
Kdb+	http://kx.com/
Druid	http://druid.io/
KairosDB	http://kairosdb.github.io/
Prometheus	https://prometheus.io/

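What is Prometheus?

Prometheus is an open-source monitoring and alerting system and time series database (TSDB) originally developed at SoundCloud. It is written in Go and is modeled on Google's internal Borgmon monitoring system.

In 2016 the Cloud Native Computing Foundation (CNCF), a Linux Foundation organization initiated by Google, accepted Prometheus as its second hosted project, after Kubernetes. Prometheus is currently very active in the open-source community.

Compared with Heapster (a Kubernetes subproject that collects cluster performance data), Prometheus is more complete and more comprehensive. Its performance is also sufficient for clusters of more than ten thousand machines.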

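Features of Prometheus

A multi-dimensional data model.
A flexible query language.
No dependency on distributed storage; each server node is autonomous.
Time series are collected with a pull model over HTTP.
Time series can also be pushed through an intermediary gateway.
Targets are discovered through service discovery or static configuration.
Many kinds of graphing and dashboarding are supported, for example Grafana.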
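
To make "multi-dimensional data model" concrete: a series is identified by a metric name plus a set of key/value labels, and queries select series by matching labels. The following is a rough sketch of that idea only; the helper names and the sample metric `http_requests_total` with its labels are illustrative assumptions, not Prometheus internals.

```python
def series_key(name, labels):
    # A series is identified by its metric name plus its full label set.
    return (name, frozenset(labels.items()))


def select(samples, name, **matchers):
    # Return the samples whose name matches and whose labels contain every matcher.
    return [
        (labels, value)
        for (metric, labels, value) in samples
        if metric == name and all(labels.get(k) == v for k, v in matchers.items())
    ]


# Made-up samples: one metric name, three different label combinations.
samples = [
    ("http_requests_total", {"method": "GET", "code": "200"}, 1027.0),
    ("http_requests_total", {"method": "POST", "code": "200"}, 3.0),
    ("http_requests_total", {"method": "GET", "code": "500"}, 12.0),
]
get_requests = select(samples, "http_requests_total", method="GET")
```

Each extra label is another dimension to filter or aggregate on, which is what the query language builds upon.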

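Prometheus components

The Prometheus ecosystem consists of several components, some of which are optional. Most Prometheus components are written in Go, which makes them easy to build and deploy.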

Prometheus Server

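Mainly responsible for collecting and storing data, and provides support for the PromQL query language.

Client SDKs

Official client libraries are available for Go, Java, Scala, Python, and Ruby. There are also many third-party libraries, covering Node.js, PHP, Erlang, and others.

Push Gateway

An intermediary gateway that lets short-lived jobs actively push their metrics.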
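
As a rough sketch of what such a push looks like, the snippet below builds (but does not send) an HTTP request carrying one metric in the plain-text format. It assumes a Pushgateway reachable at `http://localhost:9091` and its `/metrics/job/<job>` push path; the job name and metric name are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metrics):
    """Build (but do not send) a request that pushes the metrics of one job."""
    lines = [f"{name} {value}" for name, value in metrics.items()]
    body = ("\n".join(lines) + "\n").encode("utf-8")  # the body ends with a newline
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(url, data=body, method="PUT")


request = build_push_request(
    "http://localhost:9091",            # assumed gateway address
    "nightly_backup",                   # hypothetical job name
    {"backup_duration_seconds": 42.5},  # hypothetical metric
)
# To actually send it: urllib.request.urlopen(request)
```

In practice the official client libraries provide their own push helpers; this only shows the shape of the exchange.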

PromDash

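A visualization dashboard built with Rails for displaying metric data. (PromDash has since been deprecated in favor of Grafana.)

Exporter

Exporter is the general name for a class of Prometheus data collection components. An exporter gathers data from its target and converts it into a format that Prometheus supports. Unlike traditional collection agents, it does not send data to a central server; it waits for the central server to come and scrape it.

Prometheus offers many kinds of exporters for collecting the runtime state of different services. Currently supported categories include databases, hardware, message queues, storage systems, HTTP servers, and JMX.

alertmanager

The alert manager, which handles alert notifications.

prometheus_cli

A command-line tool.

Other supporting tools

A variety of special-purpose exporters expose data from tools such as HAProxy, StatsD, and Graphite in the format Prometheus expects.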

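Prometheus architecture

The following diagram shows the overall architecture of Prometheus and the role of some components in its ecosystem:

The basic principle of Prometheus is to periodically scrape the state of monitored components over HTTP. Any component can be monitored as long as it provides a suitable HTTP endpoint; no SDK or other integration work is required. This makes Prometheus well suited to monitoring virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a monitored component's information is called an exporter. Most components commonly used at internet companies already have an exporter that can be used directly, for example Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).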
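
To show how little is needed to be scrapeable, here is a minimal sketch of an HTTP endpoint that serves two gauges in the plain-text exposition format, using only the Python standard library. The metric names `demo_uptime_seconds` and `demo_threads` and the helper `make_server` are invented for this example; a real service would more likely use an official client library.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics():
    # One "# TYPE" line plus one sample line per metric, in the plain-text format.
    uptime = time.time() - START_TIME
    return (
        "# TYPE demo_uptime_seconds gauge\n"
        f"demo_uptime_seconds {uptime:.3f}\n"
        "# TYPE demo_threads gauge\n"
        f"demo_threads {threading.active_count()}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    # Port 0 asks the OS for any free port; a real exporter would use a fixed one.
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(9200).serve_forever()` (the port is an arbitrary choice) would start the endpoint; Prometheus could then scrape it once `localhost:9200` is listed as a target.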

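The Prometheus workflow is roughly as follows:

The Prometheus daemon periodically scrapes metrics from its targets; each target must expose an HTTP endpoint for it to scrape. Scrape targets can be specified through configuration files, text files, ZooKeeper, Consul, DNS SRV lookup, and other mechanisms. Prometheus monitors with a pull model: the server pulls data directly from targets, or indirectly from an intermediary gateway to which clients push their data.
Prometheus stores all scraped data locally, applies rules to clean up and aggregate that data, and stores the results in new time series.
Prometheus exposes the collected data for visualization through PromQL and other APIs. Many charting options are supported, such as Grafana, the bundled PromDash, and Prometheus's own template engine. Prometheus also provides an HTTP API for queries, so you can produce whatever output you need.
The PushGateway lets clients actively push metrics to it, while Prometheus simply scrapes the gateway on a schedule.
Alertmanager is a component separate from Prometheus. Alerting rules are written as Prometheus query expressions, and the alerts they produce are sent to Alertmanager, which offers very flexible notification options.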
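
The HTTP query API mentioned in the steps above can be sketched as follows. The snippet builds the URL for an instant query and parses a response of the usual JSON shape; it assumes a server at `http://localhost:9090` and the `/api/v1/query` endpoint, and the response shown is hand-written for illustration rather than captured from a real server.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    # Instant queries go to /api/v1/query with the expression in "query".
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def parse_instant_vector(payload):
    # Turn a vector result into a list of (labels, float value) pairs.
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"query failed: {doc.get('error')}")
    return [(item["metric"], float(item["value"][1])) for item in doc["data"]["result"]]


url = build_query_url("http://localhost:9090", 'up{job="prometheus"}')

# Hand-written sample response, shaped like the API's JSON but not real output.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus"},
             "value": [1495422816.0, "1"]}
        ],
    },
})
rows = parse_instant_vector(sample)
```

Fetching `url` with any HTTP client and passing the body to `parse_instant_vector` would give the current value of each matching series.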

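When Prometheus is a good fit

Prometheus works very well for recording purely numeric time series. It suits both machine-centric monitoring of servers and other hardware and the monitoring of highly dynamic service-oriented architectures. For today's popular microservices, its multi-dimensional data collection and its query language are particular strengths. Prometheus is designed for reliability: when services fail, it lets you quickly locate and diagnose the problem. Setting it up has no strong dependencies on particular hardware or services.

When Prometheus is not a good fit

The value of Prometheus lies in its reliability: even under very poor conditions you can still reach it and view statistics about your systems and services. If you need 100% accuracy in the data, it is not a good choice; for example, it is not suitable for real-time billing systems.

Prometheus website: https://prometheus.io/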

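Installing Prometheus

Prometheus officially offers several deployment options, such as Docker containers, Ansible, Chef, Puppet, and SaltStack.

Prometheus is implemented in Go, so it is naturally portable (Linux, Windows, macOS, and FreeBSD are supported). Here we deploy the precompiled binary directly, which works out of the box.

Prometheus installation

Using Linux as the example: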

$ wget https://github.com/prometheus/prometheus/releases/download/v1.6.3/prometheus-1.6.3.linux-amd64.tar.gz
$ tar xzvf prometheus-1.6.3.linux-amd64.tar.gz
$ mv prometheus-1.6.3.linux-amd64 /usr/local/prometheus

Builds for other systems can be downloaded here: https://prometheus.io/download/

Verifying the installation

$ cd /usr/local/prometheus
$ ./prometheus --version
prometheus, version 1.6.3 (branch: master, revision: c580b60c67f2c5f6b638c3322161bcdf6d68d7fc)
 build user: root@e54b06e0b22f
 build date: 20170519-08:00:43
 go version: go1.8.1

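Configuring Prometheus

The prometheus directory contains a main configuration file named prometheus.yml. It holds most of the standard settings as well as the configuration for Prometheus to monitor itself. The default configuration file looks like this: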

$ cat /usr/local/prometheus/prometheus.yml

# Global configuration
global:
  scrape_interval:     15s # Default scrape interval: scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # These labels are attached by default to every time series from this server; they are mainly used with federation, remote storage, and Alertmanager.
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# Scrape configuration: the targets to collect from.
# This one scrapes Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to every time series scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global scrape interval for this job, from 15 seconds to 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']
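
The override rule in this file (a job-level scrape_interval replaces the global one) can be sketched in a few lines. The dictionary below mirrors the relevant parts of the YAML by hand instead of parsing the file, and the second job, `example` on `localhost:8080`, is a hypothetical addition to show the fallback.

```python
# Hand-written mirror of the relevant parts of prometheus.yml.
config = {
    "global": {"scrape_interval": "15s"},
    "scrape_configs": [
        {"job_name": "prometheus", "scrape_interval": "5s",
         "static_configs": [{"targets": ["localhost:9090"]}]},
        {"job_name": "example",  # hypothetical job without its own interval
         "static_configs": [{"targets": ["localhost:8080"]}]},
    ],
}


def effective_intervals(cfg):
    # A job-level scrape_interval wins; otherwise the global value applies.
    default = cfg["global"]["scrape_interval"]
    return {job["job_name"]: job.get("scrape_interval", default)
            for job in cfg["scrape_configs"]}


intervals = effective_intervals(config)
```

Here the `prometheus` job resolves to 5s and the hypothetical `example` job falls back to the global 15s.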

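Creating a user

Here we create a dedicated user for running Prometheus; not running programs as root is a good habit. Its home directory is /var/lib/prometheus, which also serves as the Prometheus data directory.

$ groupadd prometheus
$ useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus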

Creating a systemd service

$ vim /etc/systemd/system/prometheus.service

[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus -config.file=/usr/local/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus
Restart=on-failure
[Install]
WantedBy=multi-user.target

Starting Prometheus

$ systemctl start prometheus

Verifying that Prometheus started successfully

$ systemctl status prometheus
● prometheus.service - prometheus
 Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: enabled)
 Active: active (running) since Mon 2017-05-22 11:13:36 CST; 18s ago
 Main PID: 9175 (prometheus)
 Tasks: 9
 Memory: 15.8M
 CPU: 207ms
 CGroup: /system.slice/prometheus.service
 └─9175 /usr/local/prometheus/prometheus -config.file=/usr/local/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus

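Accessing the built-in web UI

Prometheus ships with a fairly simple web UI where you can view expression query results, the alerting configuration, the Prometheus configuration, exporter status, and more. By default the web UI is served at http://ip:9090.

Prometheus also exposes metrics about itself, just like an exporter. Requesting http://ip:9090/metrics shows exactly what data can be scraped from it.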
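
The /metrics output is plain text with one sample per line, which makes it easy to inspect programmatically. Below is a simplified parser sketch; it handles the common `name{labels} value` form only (no timestamps, and escaped characters in label values are left as-is), and the input text is hand-written rather than taken from a real server.

```python
import re

SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse sample lines into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and "# HELP" / "# TYPE" comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hand-written input in the exposition format, not captured from a real server.
text = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200",handler="query"} 7
process_open_fds 12
"""
samples = parse_metrics(text)
```

Feeding it the body returned by http://ip:9090/metrics would list every series Prometheus exposes about itself.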

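Using Prometheus's own data as an example, the following briefly demonstrates querying an expression in the web UI and displaying the result as a graph.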

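Monitoring a server with Prometheus

The section above used Prometheus's own data to demonstrate querying metrics. Here we use server monitoring as a more concrete example.

To monitor a server's CPU, memory, disk, I/O, and similar information, you first need to install node_exporter, which collects machine-level system data.

Installing node_exporter

node_exporter is also implemented in Go, so we again deploy the precompiled binary directly, which works out of the box.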

$ wget https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz
$ tar -zxvf node_exporter-0.14.0.linux-amd64.tar.gz
$ mv node_exporter-0.14.0.linux-amd64 /usr/local/prometheus/node_exporter

Creating a systemd service

$ vim /etc/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target

Starting node_exporter

$ systemctl start node_exporter

Verifying that node_exporter started successfully

$ systemctl status node_exporter
● node_exporter.service - node_exporter
 Loaded: loaded (/etc/systemd/system/node_exporter.service; disabled; vendor preset: enabled)
 Active: active (running) since Mon 2017-05-22 12:13:43 CST; 6s ago
 Main PID: 11776 (node_exporter)
 Tasks: 4
 Memory: 1.5M
 CPU: 24ms
 CGroup: /system.slice/node_exporter.service
 └─11776 /usr/local/prometheus/node_exporter/node_exporter

Edit prometheus.yml and add the scrape target shown below.

By default, node_exporter serves its metrics at http://IP:9100/metrics

$ vim /usr/local/prometheus/prometheus.yml

  - job_name: 'linux'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: node1

prometheus.yml now defines two jobs: one monitors the Prometheus service itself, and the other monitors the Linux server. Here is a complete example:

scrape_configs:

  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'linux'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: node1
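
To see which labels each target ends up with, the sketch below expands the two jobs into one label set per target. It is a simplification of what Prometheus does (real relabeling has more steps): it assumes every target gets a `job` label from the job name and an `instance` label that defaults to the scrape address unless the config sets it explicitly, as the `linux` job does with `node1`.

```python
def target_label_sets(scrape_configs):
    """Expand static_configs into one (address, labels) pair per target."""
    result = []
    for job in scrape_configs:
        for group in job.get("static_configs", []):
            for address in group["targets"]:
                # Defaults: the job name, and the scrape address as the instance.
                labels = {"job": job["job_name"], "instance": address}
                # Labels set explicitly in the config take precedence.
                labels.update(group.get("labels", {}))
                result.append((address, labels))
    return result


# Hand-written mirror of the scrape_configs section of prometheus.yml.
scrape_configs = [
    {"job_name": "prometheus",
     "static_configs": [{"targets": ["localhost:9090"]}]},
    {"job_name": "linux",
     "static_configs": [{"targets": ["localhost:9100"],
                         "labels": {"instance": "node1"}}]},
]
targets = target_label_sets(scrape_configs)
```

With this config, series from the Linux server can be selected by `instance` equal to `node1` instead of by address.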

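Restarting Prometheus

$ systemctl restart prometheus

Viewing the targets in the Prometheus web UI

Open the Prometheus web UI. On the Status -> Targets page you can see the two targets we configured, both with State UP.

Use the Prometheus web UI to verify that the node_exporter data is being collected correctly.

a) View the current host's CPU usage

b) View the current host's CPU load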
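
The query expressions behind these two views are not given in the text. As a hedged illustration of the arithmetic behind a CPU usage figure: node_exporter reports CPU time as ever-increasing counters per mode, so usage is derived from how fast the idle counter grows. The sketch below uses made-up samples, ignores counter resets, and assumes the metric names of this node_exporter generation (`node_cpu` with a `mode` label, and `node_load1` for the 1-minute load).

```python
def counter_rate(earlier, later):
    # Per-second increase between two (timestamp, counter value) samples.
    (t1, v1), (t2, v2) = earlier, later
    if t2 <= t1:
        raise ValueError("samples must be in increasing time order")
    return (v2 - v1) / (t2 - t1)


def cpu_usage_percent(idle_earlier, idle_later):
    # Idle seconds gained per wall-clock second is the idle fraction of one CPU.
    idle_fraction = counter_rate(idle_earlier, idle_later)
    return 100.0 * (1.0 - idle_fraction)


# Made-up idle-time counter samples for one CPU: (unix timestamp, idle seconds).
# A roughly equivalent PromQL expression, under the assumed metric names, would be:
#   100 - avg by (instance) (irate(node_cpu{mode="idle"}[5m])) * 100
usage = cpu_usage_percent((1000.0, 5000.0), (1060.0, 5045.0))
```

In this example the CPU was idle for 45 of 60 seconds, so the computed usage is 25 percent.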

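The charts built into the Prometheus web UI are very basic and are best suited to testing. To build a powerful dashboard you need a more specialized tool. Next we use Grafana to visualize the data collected by Prometheus.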

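Adding a powerful dashboard to Prometheus

Grafana is an open-source program for visualizing large volumes of measurement data. It provides a powerful and elegant way to create, share, and explore data. A dashboard displays data from your various metric data sources.

Grafana is most commonly used for internet infrastructure and application analytics, but it is also used in other areas such as industrial sensors, home automation, and process control. Grafana supports pluggable panels and extensible data sources; it currently supports Graphite, InfluxDB, OpenTSDB, Elasticsearch, Prometheus, and others.

Installing Grafana

The package repository carries the older 2.6 release, which also needs a separate patch before it works properly with a Prometheus data source. Here we download and install the 4.2 package directly.

Using Ubuntu as the example:

$ wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana_4.2.0_amd64.deb
$ dpkg -i grafana_4.2.0_amd64.deb

Packages for other systems can be downloaded here: https://grafana.com/grafana/download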

Starting Grafana

$ systemctl start grafana-server

Checking that Grafana started successfully

$ systemctl status grafana-server
● grafana-server.service - Grafana instance
 Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; masked; vendor preset: enabled)
 Active: active (running) since Mon 2017-05-22 14:57:29 CST; 49min ago
 Docs: http://docs.grafana.org
 Main PID: 21735 (grafana-server)
 CGroup: /system.slice/grafana-server.service
 └─21735 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile= cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins

Accessing Grafana

Open the Grafana web UI at http://ip:3000 (the default username/password is admin/admin).

Adding the Prometheus data source in Grafana

Name:Prometheus
Type:Prometheus
Url:http://localhost:9090/
Access:proxy
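
The same four settings can also be expressed as a JSON document, which is convenient for keeping the data source definition in version control. The sketch below only builds that document; it assumes, without the article confirming it, that it would be sent to Grafana's data source HTTP API (a POST to `/api/datasources`), and the helper name is made up.

```python
import json


def datasource_payload(name, url, access="proxy"):
    # Mirrors the four form fields: Name, Type, Url, Access.
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": access,
    }


payload = datasource_payload("Prometheus", "http://localhost:9090/")
body = json.dumps(payload, sort_keys=True)
```

The `proxy` access mode means the Grafana server, not the browser, connects to Prometheus, which is why `localhost` works when both run on the same host.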

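On the Dashboards page, import the bundled Prometheus Status template

Importing the Node Exporter Server Metrics template

Visit https://grafana.com/dashboards/405 and download the JSON file for the Node Exporter Server Metrics template.

In Grafana, import this file under Dashboards and select Prometheus as the data source.

Viewing the dashboards

On the Dashboards page, select the Node Exporter Server Metrics template to see the monitored server's CPU, memory, disk, and other statistics.

You can also drill down into any single metric.

On the Dashboards page, select the Prometheus Status template to view Prometheus's own metrics.

Reference: https://www.hi-linux.com/posts/25047.html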
