Prometheus Monitoring

Date: 2019-11-12

Tags: prometheus, monitoring

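Introduction to Prometheus

(1) Origins of Prometheus

Prometheus was inspired by Google's Borgmon. It was originally developed as a research project by Matt T. Proud, a former Google employee. After joining SoundCloud, Proud teamed up with another engineer, Julius Volz, to develop Prometheus in earnest. Other developers joined the effort, and development continued inside SoundCloud until the project was publicly released in January 2015.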

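(2) Prometheus architecture

Prometheus Server: collects and stores time series data.
PushGateway: mainly used for short-lived jobs. Because such jobs exist only briefly, they may disappear before Prometheus gets to pull from them. These jobs can instead push their metrics to the PushGateway, which the Prometheus server then scrapes.
Exporters: "exporter" is the general term for Prometheus's data collection components. An exporter gathers data from its target and converts it into a format Prometheus supports. Unlike traditional collection agents, it does not send data to a central server; it waits for the central server to come and scrape it.
Alertmanager: after receiving alerts from the Prometheus server, it deduplicates and groups them, routes them to the matching receiver, and sends out the notification. Common receivers include email, PagerDuty, OpsGenie, and webhooks.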

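(3) How Prometheus works

Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be monitored as long as it provides a suitable HTTP endpoint; no SDK or other integration work is required. This makes Prometheus well suited to monitoring virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a monitored component's information is called an exporter. Exporters are already available for most components commonly used at internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).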
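
To make the scrape model concrete, here is a minimal sketch of what an exporter's HTTP response looks like and how a single value can be read from it. The sample lines are hand-written in the Prometheus text exposition format and the numbers are made up; the helper names sample_exposition and metric_value are hypothetical, chosen only for this illustration.

```shell
# Hand-written sample of the text an exporter returns from its /metrics endpoint.
# The values are invented for illustration.
sample_exposition() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemFree_bytes Memory information field MemFree_bytes.
# TYPE node_memory_MemFree_bytes gauge
node_memory_MemFree_bytes 1.024e+09
EOF
}

# Print the value of one label-free metric read from stdin.
# Comment lines never match because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Usage against the sample:
#   sample_exposition | metric_value node_load1
# Usage against a running exporter (assumes node_exporter on port 9100):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

Prometheus does the same thing on every scrape interval: it requests the endpoint, parses each line, and stores the values as time series.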

Installing Prometheus

Installing the Prometheus server

1. Download:
wget https://github.com/prometheus/prometheus/releases/download/v2.8.0/prometheus-2.8.0.linux-amd64.tar.gz
tar xf prometheus-2.8.0.linux-amd64.tar.gz -C /usr/local/
mv /usr/local/prometheus-2.8.0.linux-amd64 /usr/local/prometheus
mkdir /usr/local/prometheus/data     # data directory
2. Use screen to manage Prometheus
yum -y install screen
screen     # open a new window
cd /usr/local/prometheus     # prometheus.yml and the relative data/ path are resolved from the working directory
/usr/local/prometheus/prometheus --web.listen-address="0.0.0.0:9090" --web.read-timeout=5m --web.max-connections=10 --storage.tsdb.retention=15d  --storage.tsdb.path="data/"   --query.max-concurrency=20   --query.timeout=2m
# Press Ctrl-a d to detach from the window; screen -ls lists the background sessions
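3. Startup parameters
--web.read-timeout=5m # maximum time to wait when reading a request; keeps idle connections from tying up resources
--web.max-connections=512 # maximum number of simultaneous connections (the command above uses 10)
--storage.tsdb.retention=15d  # how long collected monitoring data is kept (newer releases name this flag --storage.tsdb.retention.time)
--storage.tsdb.path="data/"  # where data is stored; the wal subdirectory holds the write-ahead log of recent in-memory samples
--query.timeout=2m   # aborts any single query that runs longer than this, guarding against slow queries
--query.max-concurrency=20  # maximum number of queries executed at the same time
Note: Prometheus is very sensitive to system time. Keep the system clocks synchronized at all times, otherwise the graphs will be garbled.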
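
After starting the server inside screen, it can be useful to confirm that it is actually answering before moving on. The sketch below polls the server's /-/healthy endpoint with curl. The function name wait_until_healthy and the retry count are assumptions made for this example, not something Prometheus provides.

```shell
# Poll <base-url>/-/healthy until it answers with a success status,
# or give up after the given number of one-second attempts.
wait_until_healthy() {
  url=$1
  tries=$2
  while [ "$tries" -gt 0 ]; do
    # -f makes curl exit non-zero on HTTP error responses
    if curl -sf -o /dev/null "$url/-/healthy"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage (assumes the listen address from the command above):
#   wait_until_healthy http://localhost:9090 30 && echo "Prometheus is up"
```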

Installing node_exporter on the Prometheus client

1. Download:
wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz
tar xf node_exporter-0.17.0.linux-amd64.tar.gz -C /usr/local/
mv /usr/local/node_exporter-0.17.0.linux-amd64 /usr/local/node_exporter
2. Use screen to manage node_exporter
yum -y install screen
screen  # open a new window
/usr/local/node_exporter/node_exporter --collector.systemd

Prometheus configuration file explained

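# Global settings
global:
  scrape_interval:     15s   # how often to scrape targets
  evaluation_interval: 15s   # how often to evaluate alerting rules
  scrape_timeout:      10s   # timeout for each scrape
# Alerting settings
alerting:
  ...  # Prometheus's own alerting is not used here, so this can be ignored
# Alerting rules
rule_files:
  ...  # where the rule files live; Prometheus loads rules from these paths
# Which resources Prometheus monitors
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']   # monitor the health of Prometheus itself
  # Add client hosts to monitor
  - job_name: 'test'
    static_configs:
    - targets: ['jenkins:9100','gitlab:9100']   # these host names must be defined in /etc/hosts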

Note: after changing the configuration file, restart Prometheus, then open PrometheusIP:Port in a browser to view the page.
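
Before restarting, the edited file can be validated with promtool, which ships in the same release tarball as the server. As an alternative to a full restart, Prometheus also re-reads its configuration when it receives SIGHUP. The sketch below combines both steps. It assumes promtool is on the PATH (for example by adding /usr/local/prometheus to it) and that the server process is named prometheus; the function name check_and_reload is hypothetical.

```shell
# Validate the configuration, then ask the running server to reload it.
check_and_reload() {
  config=$1
  # "promtool check config" exits non-zero if the file is invalid
  promtool check config "$config" || return 1
  # Find the running server; fail if it is not running
  pid=$(pgrep -x prometheus) || return 1
  # SIGHUP makes Prometheus re-read its configuration file
  kill -HUP $pid
}

# Usage:
#   check_and_reload /usr/local/prometheus/prometheus.yml
```

Checking first means a typo in the YAML is reported immediately and the running server keeps its old configuration.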

Pushgateway

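(1) Introduction to Pushgateway

Pushgateway is an important tool in the Prometheus ecosystem. The main reasons for using it are:

Prometheus uses a pull model, so when a target sits in a different subnet or behind a firewall, Prometheus may be unable to pull its data directly.
When monitoring business data, data from different sources needs to be aggregated so that Prometheus can collect it in one place.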

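Drawbacks of Pushgateway:

The UP status that Prometheus reports covers only the Pushgateway itself, not each node behind it.
Because data from many nodes is funneled through the Pushgateway, an outage of the Pushgateway affects every node that pushes to it, a wider impact than losing individual targets.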

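Pushgateway clients push their data to the Pushgateway, and Prometheus only needs to scrape the Pushgateway. Pushgateway is a standalone component that can run on any node; it does not have to run on the monitored client.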

(2) Installing Pushgateway

wget http://github.com/prometheus/pushgateway/releases/download/v0.7.0/pushgateway-0.7.0.linux-amd64.tar.gz
tar xf pushgateway-0.7.0.linux-amd64.tar.gz -C /usr/local/
mv /usr/local/pushgateway-0.7.0.linux-amd64 /usr/local/pushgateway
screen
/usr/local/pushgateway/pushgateway

(3) Referencing Pushgateway in the Prometheus configuration file

[root@nagios ~]# tail -3 /usr/local/prometheus/prometheus.yml
  - job_name: 'pushgateway'
    static_configs:
    - targets: ['localhost:9091']
# Pushgateway is installed on the Prometheus machine, so the host name is localhost; the default port is 9091.
# Prometheus must be restarted afterwards.
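
One detail worth knowing about this job: by default Prometheus replaces the job and instance labels of scraped samples with those of the scrape target, and keeps the pushed values under exported_job and exported_instance. Setting honor_labels: true on the Pushgateway job keeps the labels sent by the client instead. The sketch below prints such a job definition so it can be reviewed and used in place of the three-line job above. The function name pushgateway_job_yaml is hypothetical, and whether to enable honor_labels is a choice for your setup, not something the original configuration does.

```shell
# Print a Pushgateway scrape job that preserves client-supplied labels.
# Indentation matches the scrape_configs entries in prometheus.yml.
pushgateway_job_yaml() {
  cat <<'EOF'
  - job_name: 'pushgateway'
    honor_labels: true   # keep the job/instance labels set by the pushing client
    static_configs:
    - targets: ['localhost:9091']
EOF
}

# Usage: print the fragment, then paste it over the existing pushgateway job
#   pushgateway_job_yaml
```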

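(4) Pushing data to Pushgateway from a custom client script

Let's write a script that monitors the number of users logged in to the client host and pushes the data to Pushgateway.
[root@jenkins_test ~]# cat user_login.sh 
#!/bin/bash
# who prints one line per login session; taking a fixed field from w's header
# breaks when the uptime format changes (for example "up 10 days, ...")
count=$(who | wc -l)
label="Count_login_users"
instance_name=$(hostname)
echo "$label $count" | curl --data-binary @- "http://192.168.18.213:9091/metrics/job/pushgateway/instance/$instance_name"

# job/pushgateway sets the job label of the pushed metrics; it is part of the grouping key in the URL.
# instance/$instance_name sets the instance label, i.e. the machine name shown for the pushed metrics.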
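
The push URL and payload can also be built by small helpers, which makes it easier to push several metrics and to remove a group when a host is retired. This is a sketch under assumptions: the helper names (group_url, gauge_payload, push_gauge, delete_group) are invented for this example, and the Pushgateway address is the one used in the script above. The "# TYPE" line declares the metric as a gauge, and an HTTP DELETE on the same URL removes all metrics in that group.

```shell
# Address of the Pushgateway, taken from the script above
PUSHGATEWAY="http://192.168.18.213:9091"

# Build the URL for one group: job and instance form the grouping key
group_url() {
  printf '%s/metrics/job/%s/instance/%s\n' "$PUSHGATEWAY" "$1" "$2"
}

# Build a two-line payload: a TYPE declaration followed by the sample
gauge_payload() {
  printf '# TYPE %s gauge\n%s %s\n' "$1" "$1" "$2"
}

# push_gauge <job> <instance> <metric> <value>
push_gauge() {
  gauge_payload "$3" "$4" | curl -s --data-binary @- "$(group_url "$1" "$2")"
}

# delete_group <job> <instance>: remove every metric pushed under this group
delete_group() {
  curl -s -X DELETE "$(group_url "$1" "$2")"
}

# Usage:
#   push_gauge pushgateway "$(hostname)" Count_login_users "$(who | wc -l)"
#   delete_group pushgateway "$(hostname)"
```

Deleting matters because the Pushgateway keeps the last pushed value indefinitely; a host that stops pushing will otherwise keep reporting its final value.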

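(5) Pushing data from the client on a schedule

The monitoring script above runs only once, but it needs to run repeatedly at a fixed interval, so it is naturally combined with crontab. However, the shortest interval crontab supports is one minute. For an interval shorter than one minute, such as 15s, the following approach can be used:

[root@jenkins_test ~]# cat user_login.sh 
#!/bin/bash
for((i=1;i<=4;i++));
  do 
  # who prints one line per login session
  count=$(who | wc -l)
  label="Count_login_users"
  instance_name=$(hostname)
  echo "$label $count" | curl --data-binary @- "http://192.168.18.213:9091/metrics/job/pushgateway/instance/$instance_name"
  sleep 15        # wait 15 seconds
done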

[root@jenkins_test ~]# crontab -l
* * * * * /bin/bash /root/user_login.sh  &>/dev/null
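
The loop above can be generalized to any interval that divides 60 evenly. The sketch below runs a command 60/interval times per cron invocation and skips the sleep after the last run, so the script finishes before cron starts the next one. The function name run_for_one_minute is hypothetical.

```shell
# run_for_one_minute <interval-seconds> <command> [args...]
# Runs the command 60/interval times, sleeping between runs.
run_for_one_minute() {
  interval=$1
  shift
  runs=$((60 / interval))
  i=1
  while [ "$i" -le "$runs" ]; do
    "$@"
    # No sleep after the final run, so the next cron run does not overlap
    if [ "$i" -lt "$runs" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Usage inside the cron-driven script, with push_metrics being your own function:
#   run_for_one_minute 15 push_metrics
```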

(6) Viewing the data in the Prometheus web UI
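
In the web UI, entering the metric name Count_login_users in the expression box shows the pushed values. The same expression can be sent to the Prometheus HTTP query API from the command line, as in the sketch below. The function name prom_query is hypothetical, and the server address is assumed to be the one configured earlier.

```shell
# prom_query <base-url> <expression>
# Sends an instant query to the Prometheus HTTP API and prints the JSON reply.
prom_query() {
  # -G sends the data as URL query parameters; --data-urlencode escapes the expression
  curl -s -G "$1/api/v1/query" --data-urlencode "query=$2"
}

# Usage:
#   prom_query http://localhost:9090 'Count_login_users'
```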

Grafana

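(1) Introduction to Grafana

Grafana is a cross-platform, open-source tool for metrics analysis and visualization. It queries collected data, presents it visually, and sends timely notifications. Its main features are:

Visualization: fast, flexible client-side charts; panel plugins offer many different ways to visualize metrics and logs, and the official library has a rich set of dashboard plugins such as heatmaps, line charts, and graphs.
Data sources: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, KairosDB, and others.
Alerting: added in version 4.0. Alert rules for the most important metrics can be defined visually; Grafana evaluates them continuously and sends notifications through Slack, PagerDuty, and similar channels when data reaches a threshold.
Mixed data sources: different data sources can be combined in the same chart, with the data source specified per query; custom data sources are supported as well.
Annotations: charts can be annotated with rich events from different data sources; hovering over an event shows its full metadata and tags.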

(2) Installing Grafana (very simple)

wget https://dl.grafana.com/oss/release/grafana-6.0.1-1.x86_64.rpm  # latest version at the time of writing
yum localinstall -y grafana-6.0.1-1.x86_64.rpm
# Install the pie chart plugin
cd /var/lib/grafana/plugins/
git clone https://github.com/grafana/piechart-panel.git
# Edit the configuration file
vim /etc/grafana/grafana.ini  
root_url = http://192.168.18.213:3000    # change localhost to the Grafana server address
# Start Grafana
systemctl start grafana-server.service
systemctl enable grafana-server.service

Note: Grafana listens on port 3000 by default. Open IP:Port in a browser to view the page; the initial username and password are admin/admin.

(3) Configuring Grafana to connect to the Prometheus data source
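
The data source is normally added through the Grafana web UI. As a scripted alternative, Grafana's HTTP API accepts a POST to /api/datasources. The sketch below assumes the initial admin/admin credentials are still in place and that Prometheus is reachable from the Grafana server at the URL you pass in; the function names datasource_json and add_datasource are hypothetical.

```shell
# Build the JSON body describing a Prometheus data source.
# "proxy" access means the Grafana server, not the browser, queries Prometheus.
datasource_json() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy"}\n' "$1" "$2"
}

# add_datasource <grafana-url> <name> <prometheus-url>
add_datasource() {
  datasource_json "$2" "$3" |
    curl -s -u admin:admin \
      -H 'Content-Type: application/json' \
      -X POST --data-binary @- "$1/api/datasources"
}

# Usage:
#   add_datasource http://192.168.18.213:3000 Prometheus http://localhost:9090
```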

(4) Importing a dashboard into Grafana

Edit the dashboard properties

(5) Viewing the result
