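Consul can be monitored with many tools. Here we use Prometheus.
We have a Consul server cluster plus agents. For cluster setup and configuration, see "Consul installation, backup, and upgrade".
The telemetry option must be set in the configuration file, as shown below.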
~]# cat /usr/local/consul/consul.d/consul.json
{
    "datacenter": "dc1",
    "client_addr": "0.0.0.0",
    "bind_addr": "{{ GetInterfaceIP \"eth0\" }}",
    "data_dir": "/usr/local/consul/data",
    "retry_interval": "20s",
    "retry_join": ["10.111.67.1","10.111.67.2","10.111.67.3","10.111.67.4","10.111.67.5"],
    "enable_local_script_checks": true,
    "log_file": "/usr/local/consul/logs/",
    "log_level": "debug",
    "enable_debug": true,
    "pid_file": "/var/run/consul.pid",
    "performance": {
        "raft_multiplier": 1
    },
    "telemetry": {
        "prometheus_retention_time": "120s",
        "disable_hostname": true
    }
}
After Consul starts successfully, test the metrics endpoint with the following command:
~]# curl 127.0.0.1:8500/v1/agent/metrics?format=prometheus
# HELP consul_fsm_register consul_fsm_register
# TYPE consul_fsm_register summary
consul_fsm_register{quantile="0.5"} NaN
consul_fsm_register{quantile="0.9"} NaN
consul_fsm_register{quantile="0.99"} NaN
consul_fsm_register_sum 3.396029010415077
consul_fsm_register_count 8
# HELP consul_http_GET_v1_agent_metrics consul_http_GET_v1_agent_metrics
# TYPE consul_http_GET_v1_agent_metrics summary
consul_http_GET_v1_agent_metrics{quantile="0.5"} 0.5403839945793152
consul_http_GET_v1_agent_metrics{quantile="0.9"} 0.5403839945793152
consul_http_GET_v1_agent_metrics{quantile="0.99"} 0.5403839945793152
consul_http_GET_v1_agent_metrics_sum 366820.44427236915
consul_http_GET_v1_agent_metrics_count 349523
# HELP consul_http_GET_v1_catalog_service__ consul_http_GET_v1_catalog_service__
# TYPE consul_http_GET_v1_catalog_service__ summary
consul_http_GET_v1_catalog_service__{quantile="0.5"} 31258.423828125
consul_http_GET_v1_catalog_service__{quantile="0.9"} 306137.71875
consul_http_GET_v1_catalog_service__{quantile="0.99"} 306137.71875
consul_http_GET_v1_catalog_service___sum 4.0220439955034314e+11
consul_http_GET_v1_catalog_service___count 2.388023e+06
…………………………
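To check this output in a script rather than by eye, the text can be parsed into samples. The following is a minimal sketch, not a full parser for the Prometheus exposition format: it assumes no escaped quotes inside label values and no trailing timestamps. The function name `parse_samples` and the `SAMPLE` excerpt are illustrative only.

```python
def parse_samples(text):
    """Parse Prometheus text output into (name, labels, value) tuples.

    Simplified: handles `name value` and `name{k="v",...} value` lines only.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, raw_value = line.rpartition(" ")
        name, _, label_part = series.partition("{")
        labels = {}
        for pair in label_part.rstrip("}").split(","):
            if pair:
                key, _, val = pair.partition("=")
                labels[key] = val.strip('"')
        samples.append((name, labels, float(raw_value)))
    return samples


# Excerpt of the output shown above.
SAMPLE = """\
# HELP consul_fsm_register consul_fsm_register
# TYPE consul_fsm_register summary
consul_fsm_register{quantile="0.5"} NaN
consul_fsm_register_sum 3.396029010415077
consul_fsm_register_count 8
"""

samples = parse_samples(SAMPLE)
counts = {name: value for name, _, value in samples if name.endswith("_count")}
print(counts)
```

If the endpoint returns no `consul_` series at all, the telemetry block is probably missing from the agent configuration.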
For server monitoring we use Prometheus file-based service discovery (file_sd_configs). Static configuration (static_configs) would also work.
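Because we want to alert on Consul, and alerts need the host name, we use file-based discovery (file_sd_configs) and attach a consul_node_name label to each host. With static configuration (static_configs), labels apply to a whole targets list rather than to an individual host, so per-host labels would need a separate entry for every host.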
The configuration file is shown below. It is a Kubernetes ConfigMap.
~]# cat prometheus-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config-consul
  namespace: prometheus
  labels:
    app: prometheus-consul
    environment: prod
    release: release
data:
  prometheus.yml: |
    global:
      external_labels:
        region: cn-hangzhou
        monitor: consul
        replica: A
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        - localhost:9090
    - job_name: consul-server
      # scrape interval
      scrape_interval: 60s
      # scrape timeout
      scrape_timeout: 10s
      # metrics path on the scrape target
      metrics_path: "/v1/agent/metrics"
      scheme: http
      params:
        format: ['prometheus']
      file_sd_configs:
      - files:
        - /etc/config/consul-server.json
        refresh_interval: 1m
  consul-server.json: |
    [
      {
        "targets": ["10.111.67.1:8500"],
        "labels": {"consul_node_name": "Consul-Server-1"}
      },
      {
        "targets": ["10.111.67.2:8500"],
        "labels": {"consul_node_name": "Consul-Server-2"}
      },
      {
        "targets": ["10.111.67.3:8500"],
        "labels": {"consul_node_name": "Consul-Server-3"}
      },
      {
        "targets": ["10.111.67.4:8500"],
        "labels": {"consul_node_name": "Consul-Server-4"}
      },
      {
        "targets": ["10.111.67.5:8500"],
        "labels": {"consul_node_name": "Consul-Server-5"}
      }
    ]
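The consul-server.json target list is repetitive, so it can be generated instead of written by hand. A minimal sketch, assuming a name-to-IP mapping is available from somewhere such as an inventory; the function name `build_file_sd` and the `servers` mapping are illustrative and not part of the original setup.

```python
import json


def build_file_sd(servers, port=8500):
    """Return file_sd target groups, one per host, each with its own label."""
    return [
        {
            "targets": [f"{ip}:{port}"],
            "labels": {"consul_node_name": node_name},
        }
        for node_name, ip in servers.items()
    ]


# Hypothetical inventory: node name -> IP address.
servers = {
    "Consul-Server-1": "10.111.67.1",
    "Consul-Server-2": "10.111.67.2",
}

print(json.dumps(build_file_sd(servers), indent=2))
```

Emitting one target group per host is what allows each host to carry its own consul_node_name label.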
At this point Prometheus can scrape metrics from the Consul servers, and the data can be queried in Prometheus's built-in UI.
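Consul clients are a different case: there are hundreds or even thousands of them. Labelling every host through file-based discovery (file_sd_configs) would make the file too costly to maintain, since hosts are added and removed all the time. So for client monitoring we use Consul-based service discovery (consul_sd_configs).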
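For Prometheus, or any other consumer, to discover a service, that service must be registered in Consul. We therefore use a script to generate a simple service registration: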
~]# cat create-consul-registration.sh
#!/bin/bash
# IPv4 address of the first ethN/emN interface line that carries an address
ADDR=$(ip addr show | awk -F '[ /]+' '/eth[0-9]|em[0-9]/ && /inet/ {print $3}')
CONSUL_CONF_DIR='/usr/local/consul/consul.d'
CONSUL_REGISTER_FILE="$CONSUL_CONF_DIR/consul-members-registration.json"

# Only write the file when an address was found and the config directory exists
if [[ -n "$ADDR" && -d "$CONSUL_CONF_DIR" ]]; then
    cat > "${CONSUL_REGISTER_FILE}" <<EOF
{
  "service": {
    "id": "consul-${ADDR}",
    "name": "consul-members",
    "tags": [
      "prometheus",
      "client",
      "consul-client"
    ],
    "address": "${ADDR}",
    "port": 8500,
    "check": {
      "http": "http://127.0.0.1:8500",
      "interval": "60s"
    }
  }
}
EOF
else
    echo "ip address is empty or the $CONSUL_CONF_DIR does not exist" >&2
    exit 1
fi
Running the script creates the service registration file consul-members-registration.json under /usr/local/consul/consul.d/:
~]# cat /usr/local/consul/consul.d/consul-members-registration.json
{
  "service": {
    "id": "consul-10.111.74.8",
    "name": "consul-members",
    "tags": [
      "prometheus",
      "client",
      "consul-client"
    ],
    "address": "10.111.74.8",
    "port": 8500,
    "check": {
      "http": "http://127.0.0.1:8500",
      "interval": "60s"
    }
  }
}
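The same registration document can also be built as a data structure and serialized, which avoids quoting problems in shell templates. This is a sketch of an alternative approach, not the approach used in this setup; the function name `build_registration` is illustrative, and the address is passed in rather than detected.

```python
import json


def build_registration(addr, port=8500):
    """Build the consul-members service registration for one agent address."""
    return {
        "service": {
            "id": f"consul-{addr}",
            "name": "consul-members",
            "tags": ["prometheus", "client", "consul-client"],
            "address": addr,
            "port": port,
            "check": {
                "http": f"http://127.0.0.1:{port}",
                "interval": "60s",
            },
        }
    }


registration = build_registration("10.111.74.8")
print(json.dumps(registration, indent=2))
```

Writing the result to the configuration directory and running consul reload would then proceed exactly as with the shell script.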
Then run consul reload to load the configuration:
~]# consul reload
The service is now registered in Consul, with service name consul-members and service ID consul-10.111.74.8. We can verify this with curl or a browser.
~]# curl -s 127.0.0.1:8500/v1/agent/services|python -m json.tool
{
    "consul-10.111.74.8": {
        "Address": "10.111.74.8",
        "EnableTagOverride": false,
        "ID": "consul-10.111.74.8",
        "Meta": {},
        "Port": 8500,
        "Service": "consul-members",
        "Tags": [
            "prometheus",
            "client",
            "consul-client"
        ],
        "Weights": {
            "Passing": 1,
            "Warning": 1
        }
    }
}
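The check above can be automated by filtering the /v1/agent/services response for the expected service name. A minimal sketch operating on the already-parsed JSON; the function name `find_services` is illustrative, and the sample dictionary is a trimmed copy of the response shown above.

```python
def find_services(agent_services, service_name):
    """Return the sorted IDs of services whose "Service" field matches."""
    return sorted(
        service_id
        for service_id, svc in agent_services.items()
        if svc.get("Service") == service_name
    )


# Trimmed copy of the /v1/agent/services response shown above.
agent_services = {
    "consul-10.111.74.8": {
        "Address": "10.111.74.8",
        "ID": "consul-10.111.74.8",
        "Port": 8500,
        "Service": "consul-members",
        "Tags": ["prometheus", "client", "consul-client"],
    }
}

print(find_services(agent_services, "consul-members"))
```

In practice the dictionary would come from parsing the HTTP response body with `json.loads`. An empty result means the registration file was not loaded.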
The Prometheus configuration is as follows:
~]# cat prometheus-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config-consul
  namespace: prometheus
  labels:
    app: prometheus-consul
    environment: prod
    release: release
data:
  prometheus.yml: |
    global:
      external_labels:
        region: cn-hangzhou
        monitor: consul
        replica: A
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        - localhost:9090
    - job_name: consul-client
      # scrape interval
      scrape_interval: 60s
      # scrape timeout
      scrape_timeout: 10s
      # metrics path on the scrape target
      metrics_path: "/v1/agent/metrics"
      scheme: http
      params:
        format: ['prometheus']
      consul_sd_configs:
      - server: "10.111.67.1:8500"
        services:
        - consul-members
      relabel_configs:
      - action: replace
        source_labels:
        - __meta_consul_dc
        target_label: consul_dc
      - action: replace
        source_labels:
        - __meta_consul_node
        target_label: consul_node_name
      - action: replace
        source_labels:
        - __meta_consul_service
        target_label: consul_service
      - action: replace
        source_labels:
        - __meta_consul_service_id
        target_label: consul_service_id
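To make the effect of the relabel_configs block concrete, here is a simplified simulation of `action: replace` with the default regex and replacement, which copies the source value to the target label unchanged. It is a sketch of the behaviour, not Prometheus code: it ignores custom regexes and the `job` and `instance` labels. The function name `apply_replace_rules` and the node name `node-a` are illustrative assumptions.

```python
def apply_replace_rules(discovered, rules):
    """Simulate `action: replace` with default regex and replacement.

    Each rule is (source_labels, target_label). Source values are joined
    with ';' (the default separator) and copied to the target label.
    Labels starting with '__' are dropped at the end, as happens after
    relabeling.
    """
    labels = dict(discovered)
    for source_labels, target_label in rules:
        value = ";".join(labels.get(name, "") for name in source_labels)
        if value:
            labels[target_label] = value
        else:
            labels.pop(target_label, None)  # an empty result removes the label
    return {k: v for k, v in labels.items() if not k.startswith("__")}


# The four rules from the configuration above.
rules = [
    (["__meta_consul_dc"], "consul_dc"),
    (["__meta_consul_node"], "consul_node_name"),
    (["__meta_consul_service"], "consul_service"),
    (["__meta_consul_service_id"], "consul_service_id"),
]

# Hypothetical discovered labels for one client; "node-a" is a made-up name.
discovered = {
    "__meta_consul_dc": "dc1",
    "__meta_consul_node": "node-a",
    "__meta_consul_service": "consul-members",
    "__meta_consul_service_id": "consul-10.111.74.8",
}

print(apply_replace_rules(discovered, rules))
```

The point of the rules is that `__meta_*` labels do not survive relabeling, so anything an alert needs has to be copied to a regular label first.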
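Because we want to alert on Consul, and alerts need the host name, service name, service ID, datacenter, and similar information, we have to relabel. The labels available for relabeling are: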
__meta_consul_address: the address of the target
__meta_consul_dc: the datacenter name for the target
__meta_consul_tagged_address_<key>: each node tagged address key value of the target
__meta_consul_metadata_<key>: each node metadata key value of the target
__meta_consul_node: the node name defined for the target
__meta_consul_service_address: the service address of the target
__meta_consul_service_id: the service ID of the target
__meta_consul_service_metadata_<key>: each service metadata key value of the target
__meta_consul_service_port: the service port of the target
__meta_consul_service: the name of the service the target belongs to
__meta_consul_tags: the list of tags of the target joined by the tag separator
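The `__meta_consul_tags` label deserves a short illustration, because matching on it depends on how the tags are joined. To my understanding, Prometheus joins the tags with the tag separator (a comma by default) and also places the separator at both ends, so a relabel regex can match a tag regardless of its position. The sketch below assumes that behaviour; the function name `meta_consul_tags` is illustrative.

```python
import re


def meta_consul_tags(tags, separator=","):
    """Build the tags label value: separator-joined and separator-wrapped."""
    return separator + separator.join(tags) + separator


tags_label = meta_consul_tags(["prometheus", "client", "consul-client"])
print(tags_label)  # ,prometheus,client,consul-client,

# Relabel regexes are anchored, so fullmatch mirrors their behaviour.
has_client_tag = re.fullmatch(r".*,client,.*", tags_label) is not None
has_server_tag = re.fullmatch(r".*,server,.*", tags_label) is not None
print(has_client_tag, has_server_tag)  # True False
```

The surrounding commas in the pattern keep `client` from also matching the longer tag `consul-client`.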