Prometheus: Quickly Build Your Business Monitoring Platform

Date: 2019-12-10

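What is Prometheus

Prometheus is an open-source monitoring system with a very cool name.

It supports a multidimensional metric data model. The server periodically pulls data over HTTP, and a flexible query language is then used on that data to do the actual monitoring.

In this architecture, clients record the relevant metrics and expose them through a query endpoint. The Prometheus server finds the clients through service discovery, scrapes them on a schedule, and stores the results as time series data. Finally, the data is displayed through charting tools such as Grafana.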
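The pull model described here can be sketched with nothing but the JDK. This is a minimal illustration, not the official client library: the class name `PullModelSketch`, the metric name `demo_orders_total`, and the use of a random local port are all assumptions made for the example. The "client" exposes a plain-text `/metrics` endpoint, and the "server" side performs one scrape, which is simply an HTTP GET.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: a hand-rolled endpoint, not the official Prometheus client.
class PullModelSketch {
    // Hypothetical metric: number of orders handled by this process.
    static final AtomicLong orders = new AtomicLong();

    // Client side: expose the current value at /metrics as plain text.
    static HttpServer startClient() throws IOException {
        // Port 0 asks the OS for any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = ("demo_orders_total " + orders.get() + "\n")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Server side: one scrape is just an HTTP GET of the client's endpoint.
    static String scrape(int port) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://127.0.0.1:" + port + "/metrics")).GET().build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer client = startClient();
        try {
            orders.incrementAndGet();
            System.out.print(scrape(client.getAddress().getPort()));
        } finally {
            client.stop(0);
        }
    }
}
```

A real Prometheus server repeats this scrape on a fixed interval and stores each sample with a timestamp, which is what turns the exposed values into time series.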

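What Prometheus can do

As an instrumentation system at the business layer
Prometheus supports all mainstream programming languages (official clients for Go, Java, Python, and Ruby; third-party open-source clients for other languages). With a client library we can conveniently instrument core business flows, such as the order placement flow or the add-to-cart flow.
As an application monitoring system at the application layer
For some mainstream applications, such as Redis and MySQL, official or third-party exporters can collect the core metrics.
As a system monitoring tool at the system layer
Beyond common software, Prometheus also has exporters for the system and network layers, used to monitor servers or networks.
Integrating other monitoring systems
Through various exporters, Prometheus can also integrate with other monitoring systems and collect their data, for example AWS CloudWatch, JMX, and Pingdom.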

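What not to use Prometheus for

Tools such as the Grok exporter make it possible to read logs into Prometheus, but Prometheus is a monitoring system, not a logging system. Application logs should still go through a tool stack such as ELK.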

Integrating Prometheus with Spring Boot

Configure service discovery in Prometheus

- job_name: 'consul'
  consul_sd_configs:
    - server: '192.168.1.248:8500'
  relabel_configs:
    - source_labels: [__meta_consul_service]
      regex: .*,prometheus.*
      target_label: job
  metrics_path: '/prometheus'
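To get a feel for which label values the `regex` in this job matches, the pattern can be tried out in isolation. Prometheus anchors relabeling regexes at both ends, and the sketch below imitates that with Java's full-string match. Java's regex engine is not the one Prometheus uses, but for a pattern this simple the two agree. The class name and the sample label values are hypothetical, chosen only to exercise the pattern.

```java
import java.util.List;
import java.util.regex.Pattern;

class RelabelRegexSketch {
    // The regex from the job above. matches() below is a full-string match,
    // which imitates the anchoring Prometheus applies to relabel regexes.
    static final Pattern REGEX = Pattern.compile(".*,prometheus.*");

    static boolean matches(String labelValue) {
        return REGEX.matcher(labelValue).matches();
    }

    public static void main(String[] args) {
        // Hypothetical label values, not taken from a real Consul catalog.
        for (String value : List.of(",web,prometheus,", "web,prometheus-app",
                                    "prometheus", "order-service")) {
            System.out.println(value + " -> " + matches(value));
        }
    }
}
```

Only the first two sample values match, because the pattern requires a comma immediately before `prometheus`.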

Add the dependencies in Maven

<!-- The client -->
  <dependency>
      <groupId>io.prometheus</groupId>
      <artifactId>simpleclient</artifactId> 
  </dependency> 
  <!-- Exposition servlet-->
  <dependency>
      <groupId>io.prometheus</groupId>
      <artifactId>simpleclient_servlet</artifactId> 
  </dependency>
  <dependency>
      <groupId>io.prometheus</groupId>
      <artifactId>simpleclient_spring_boot</artifactId> 
  </dependency>

Disable Spring Boot's native metrics
spring.metrics.servo.enabled: false
Add the annotations to the Application class

@EnablePrometheusEndpoint
@EnableSpringBootMetricsCollector
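Define the metric in the business class
// Requires: import io.prometheus.client.Counter;
static final Counter orderCount = Counter.build()
    .name("b2c_order_count").help("order count.").labelNames("shop","siteUid").register();
Record the metric in the business code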
orderCount.labels("shein","mus").inc();
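What the label values do is easier to see with a toy model. The following is a JDK-only sketch, not the real `io.prometheus.client.Counter`: the class `LabelledCounterSketch` and the second site value `"muk"` are made up for illustration, and label-value escaping is left out. It shows that each distinct combination of label values becomes its own series with its own running total, rendered roughly in the shape of the Prometheus text format.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.DoubleAdder;

// Toy stand-in for a labelled counter; the real client library is more elaborate.
class LabelledCounterSketch {
    private final String name;
    private final List<String> labelNames;
    // One running total per combination of label values, sorted for stable output.
    private final Map<String, DoubleAdder> children = new ConcurrentSkipListMap<>();

    LabelledCounterSketch(String name, String... labelNames) {
        this.name = name;
        this.labelNames = List.of(labelNames);
    }

    // Increment the series identified by the given label values.
    void inc(String... labelValues) {
        if (labelValues.length != labelNames.size()) {
            throw new IllegalArgumentException(
                    "expected " + labelNames.size() + " label values");
        }
        children.computeIfAbsent(seriesKey(labelValues), k -> new DoubleAdder()).add(1);
    }

    // Builds name{label="value",...}; escaping of special characters is omitted.
    private String seriesKey(String[] labelValues) {
        StringBuilder sb = new StringBuilder(name).append('{');
        for (int i = 0; i < labelValues.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(labelNames.get(i)).append("=\"").append(labelValues[i]).append('"');
        }
        return sb.append('}').toString();
    }

    // Render every series as one line: name{label="value",...} total
    String render() {
        StringBuilder sb = new StringBuilder();
        children.forEach((series, total) ->
                sb.append(series).append(' ').append(total.sum()).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        LabelledCounterSketch orderCount =
                new LabelledCounterSketch("b2c_order_count", "shop", "siteUid");
        orderCount.inc("shein", "mus");
        orderCount.inc("shein", "mus");
        orderCount.inc("shein", "muk"); // hypothetical second site
        System.out.print(orderCount.render());
    }
}
```

Because every new label combination creates another series, labels are best kept to values with a small, bounded set of possibilities, such as shop and site here.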

Monitoring nginx with Prometheus

Prometheus can instrument nginx through the nginx-lua-prometheus library.

It is also very simple to use:

lua_shared_dict prometheus_metrics 10M;
lua_package_path "/path/to/nginx-lua-prometheus/?.lua";
init_by_lua '
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_requests = prometheus:counter(
"nginx_http_requests_total", "Number of HTTP requests", {"host", "status"})
  metric_latency = prometheus:histogram(
"nginx_http_request_duration_seconds", "HTTP request latency", {"host"})
  metric_connections = prometheus:gauge(
"nginx_http_connections", "Number of HTTP connections", {"state"})
';
log_by_lua '
  local host = ngx.var.host:gsub("^www.", "")
  metric_requests:inc(1, {host, ngx.var.status})
  metric_latency:observe(ngx.now() - ngx.req.start_time(), {host})
';

However, benchmarking showed that throughput drops by roughly 5%-10% once histogram metrics are used.
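The benchmark result above comes without a stated cause. One thing worth keeping in mind is that a Prometheus histogram is a set of cumulative buckets plus a sum and a count, so a single observation updates more state than a single counter increment. The toy Java model below only illustrates that bookkeeping. It is not the nginx-lua-prometheus implementation, the bucket bounds are hypothetical, and whether this accounts for the measured drop is an open assumption.

```java
import java.util.Arrays;

// Toy cumulative-bucket histogram, for illustration only.
class HistogramSketch {
    private final double[] upperBounds; // sorted bucket bounds (the "le" values)
    private final long[] bucketCounts;  // cumulative: observations <= bound
    private long count;                 // total observations (the +Inf bucket)
    private double sum;                 // sum of all observed values

    HistogramSketch(double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        Arrays.sort(this.upperBounds);
        this.bucketCounts = new long[this.upperBounds.length];
    }

    // Records one value and returns how many stored values it modified.
    int observe(double value) {
        int writes = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            if (value <= upperBounds[i]) {
                bucketCounts[i]++;
                writes++;
            }
        }
        count++;
        sum += value;
        return writes + 2; // matching buckets, plus count and sum
    }

    long bucketCount(int i) { return bucketCounts[i]; }
    long count() { return count; }

    public static void main(String[] args) {
        // Hypothetical bucket bounds in seconds.
        HistogramSketch latency = new HistogramSketch(0.1, 0.5, 1.0);
        System.out.println(latency.observe(0.05)); // 3 buckets + count + sum
        System.out.println(latency.observe(0.7));  // 1 bucket + count + sum
    }
}
```

In this model a fast request touches every bucket, so keeping the bucket list short reduces the work done per observation.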