The Road to Learning Kubernetes (24): Prometheus Monitoring

1. Prometheus Overview

Beyond the resource metrics covered earlier (such as CPU and memory), users and administrators need many more kinds of metric data: Kubernetes metrics, container metrics, node resource metrics, application metrics, and so on. The custom metrics API allows arbitrary metrics to be requested, but an implementation of that API must be backed by a corresponding monitoring system. Prometheus was the first monitoring system for which such an adapter was developed: the Prometheus Kubernetes Custom Metrics Adapter, provided by the k8s-prometheus-adapter project on GitHub. Its architecture is shown below:

Note that Prometheus is itself a full monitoring system, split into a server side and an agent side. The server pulls data from the monitored hosts, while on the agent side a node_exporter must be deployed to collect and expose node data; likewise, gathering Pod-level metrics or metrics from applications such as MySQL requires deploying the corresponding exporters. Data can be queried with PromQL, but because Prometheus is a third-party solution, vanilla Kubernetes cannot interpret Prometheus's custom metrics natively. k8s-prometheus-adapter is needed to translate these metric query interfaces into the standard Kubernetes custom metrics API.
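To illustrate the shape of the converted interface, the sketch below parses an abridged, hypothetical `MetricValueList` response of the kind the adapter serves under `/apis/custom.metrics.k8s.io/v1beta1` (the metric name, Pod name, and values are invented for illustration):

```python
import json

# Abridged, hypothetical response body from a path like
# /apis/custom.metrics.k8s.io/v1beta1/namespaces/prom/pods/*/http_requests
response = json.loads("""
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {"kind": "Pod", "namespace": "prom", "name": "myapp-1"},
      "metricName": "http_requests",
      "value": "502m"
    }
  ]
}
""")

def parse_quantity(q: str) -> float:
    """Parse the milli-suffixed quantities the metrics APIs use ('502m' -> 0.502)."""
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)

for item in response["items"]:
    obj, val = item["describedObject"], parse_quantity(item["value"])
    print(f'{obj["namespace"]}/{obj["name"]} {item["metricName"]}={val}')
```

The adapter's job is exactly this translation: a PromQL query result on one side, quantity-valued `MetricValueList` objects on the other.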

Prometheus is an open-source service monitoring system and time-series database that provides a generic data model together with efficient interfaces for data collection, storage, and querying. Its core component, the Prometheus server, periodically pulls data from statically configured monitoring targets, or from targets discovered automatically via service discovery; when newly pulled data exceeds the configured in-memory buffer, it is persisted to the storage device. The Prometheus component architecture is shown below:

As the diagram shows, each monitored host can expose its monitoring data through a dedicated exporter program and wait for the Prometheus server to scrape it periodically. If alerting rules are defined, the scraped data is evaluated against them; when an alert condition is met, an alert is generated and sent to Alertmanager, which aggregates and routes the alerts. When a monitored target needs to push data actively, the Pushgateway component can receive and temporarily store that data until the Prometheus server collects it.
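In this pull model an exporter is, at its core, just an HTTP endpoint serving the Prometheus text exposition format. The following is a minimal, self-contained sketch (the `demo_*` metric names and values are invented; a real exporter such as node_exporter exposes far richer data):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading, urllib.request

def render_metrics(load1: float, mem_free: int) -> str:
    """Render two gauge samples in the Prometheus text exposition format."""
    return (
        "# HELP demo_load1 1-minute load average.\n"
        "# TYPE demo_load1 gauge\n"
        f"demo_load1 {load1}\n"
        "# HELP demo_memory_free_bytes Free memory in bytes.\n"
        "# TYPE demo_memory_free_bytes gauge\n"
        f"demo_memory_free_bytes {mem_free}\n"
    )

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes the path configured per target, /metrics by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(0.42, 1024 * 1024).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

if __name__ == "__main__":
    # Serve on an ephemeral port and scrape ourselves once, like Prometheus would.
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/metrics") as resp:
        print(resp.read().decode(), end="")
    server.shutdown()
```

The server side then only needs the target's address, path, and port to collect these samples on its scrape interval.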

Any target must first be registered with the monitoring system before its time-series data can be collected, stored, alerted on, and displayed. Targets can be specified statically through configuration, or managed dynamically by Prometheus via service discovery. The main components break down as follows:

  • Monitoring agents, e.g. node_exporter: collect host metrics across many dimensions, such as load average, CPU, memory, disk, and network.
  • kubelet (cAdvisor): collects container metrics and is also the source of the Kubernetes core metrics. Per-container metrics include CPU usage and limits, filesystem read/write usage and limits, memory usage and limits, and network packet send, receive, and drop rates.
  • API Server: collects API Server performance metrics, including control-queue performance, request rates, and latencies.
  • etcd: collects metrics for the etcd storage cluster.
  • kube-state-metrics: derives a variety of Kubernetes metrics, mainly counters and metadata for resource types, including total object counts per type, resource quotas, container status, and Pod resource label series.

Prometheus can use the Kubernetes API Server directly as a service-discovery system, dynamically discovering and monitoring every monitorable object in the cluster. Note in particular that a Pod must carry the following annotations before Prometheus will automatically discover it and scrape its built-in metrics:

  • 1) prometheus.io/scrape: whether metrics should be scraped from this target; a boolean, true or false.
  • 2) prometheus.io/path: the URL path to scrape metrics from, normally /metrics.
  • 3) prometheus.io/port: the socket port to scrape metrics from, e.g. 8080.
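Under those conventions, a scrape-enabled Pod manifest would look roughly like the sketch below (the Pod name, image, and port are illustrative; the annotation keys are the ones listed above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                    # illustrative name
  annotations:
    prometheus.io/scrape: "true"    # opt in to scraping
    prometheus.io/path: "/metrics"  # metrics URL path
    prometheus.io/port: "8080"      # metrics port
spec:
  containers:
  - name: demo-app
    image: demo/app:latest          # illustrative image
    ports:
    - containerPort: 8080
```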

Additionally, if Prometheus is only expected to generate custom metrics for a backend, deploying just the Prometheus server is enough; it does not even need data persistence. To build a fully featured monitoring system, however, administrators also need to deploy node_exporter on every host, other specialized exporters as required, and Alertmanager.

2. Deploying Prometheus

Because the official YAML deployment requires PVCs, this walkthrough uses the learning-oriented deployment provided by MageEdu; for production, follow the official recommendations instead. The YAML manifests used here come from the iKubernetes/k8s-prom repository cloned below.

2.1 Creating the prom namespace

[root@k8s-master ~]# git clone https://github.com/iKubernetes/k8s-prom.git && cd k8s-prom
[root@k8s-master k8s-prom]# kubectl apply -f namespace.yaml 
namespace/prom created

2.2 Deploying node_exporter

[root@k8s-master k8s-prom]# kubectl apply -f node_exporter/
daemonset.apps/prometheus-node-exporter created
service/prometheus-node-exporter created

[root@k8s-master k8s-prom]# kubectl get pods -n prom
NAME                             READY     STATUS    RESTARTS   AGE
prometheus-node-exporter-6srrq   1/1       Running   0          32s
prometheus-node-exporter-fftmc   1/1       Running   0          32s
prometheus-node-exporter-qlr8d   1/1       Running   0          32s

2.3 Deploying prometheus-server

[root@k8s-master k8s-prom]# kubectl apply -f prometheus/
configmap/prometheus-config unchanged
deployment.apps/prometheus-server configured
clusterrole.rbac.authorization.k8s.io/prometheus configured
serviceaccount/prometheus unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus configured
service/prometheus unchanged


[root@k8s-master k8s-prom]# kubectl get all -n prom
NAME                                    READY     STATUS    RESTARTS   AGE
pod/prometheus-node-exporter-6srrq      1/1       Running   0          11m
pod/prometheus-node-exporter-fftmc      1/1       Running   0          11m
pod/prometheus-node-exporter-qlr8d      1/1       Running   0          11m
pod/prometheus-server-66cbd4c6b-j9lqr   1/1       Running   0          4m

NAME                               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
service/prometheus                 NodePort    10.96.65.72   <none>        9090:30090/TCP   10m
service/prometheus-node-exporter   ClusterIP   None          <none>        9100/TCP         11m

NAME                                      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter   3         3         3         3            3           <none>          11m

NAME                                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-server   1         1         1            1           10m

NAME                                           DESIRED   CURRENT   READY     AGE
replicaset.apps/prometheus-server-65f5d59585   0         0         0         10m
replicaset.apps/prometheus-server-66cbd4c6b    1         1         1         4m

2.4 Deploying kube-state-metrics

[root@k8s-master k8s-prom]# kubectl apply -f kube-state-metrics/
deployment.apps/kube-state-metrics created
serviceaccount/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created

[root@k8s-master k8s-prom]# kubectl get pods -n prom -o wide
NAME                                  READY     STATUS    RESTARTS   AGE       IP              NODE
kube-state-metrics-78fc9fc745-g66p8   1/1       Running   0          11m       10.244.1.22     k8s-node01
prometheus-node-exporter-6srrq        1/1       Running   0          31m       192.168.56.11   k8s-master
prometheus-node-exporter-fftmc        1/1       Running   0          31m       192.168.56.12   k8s-node01
prometheus-node-exporter-qlr8d        1/1       Running   0          31m       192.168.56.13   k8s-node02
prometheus-server-66cbd4c6b-j9lqr     1/1       Running   0          24m       10.244.0.4      k8s-master

2.五、製做證書

[root@k8s-master pki]# (umask 077; openssl genrsa -out serving.key 2048)
Generating RSA private key, 2048 bit long modulus
......................+++
....+++
e is 65537 (0x10001)
[root@k8s-master pki]# openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
[root@k8s-master pki]# openssl x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
Signature ok
subject=/CN=serving
Getting CA Private Key

[root@k8s-master pki]# kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key -n prom
secret/cm-adapter-serving-certs created

[root@k8s-master pki]# kubectl get secret -n prom
NAME                             TYPE                                  DATA      AGE
cm-adapter-serving-certs         Opaque                                2         20s
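The signing flow above can be rehearsed end to end with a throwaway CA. Note the demo-ca generated below is a stand-in: in the actual walkthrough, the cluster CA key pair (ca.crt/ca.key under the kubeadm pki directory) does the signing.

```shell
# A scratch directory stands in for the cluster pki directory; the demo CA
# below replaces the cluster's ca.crt/ca.key used in the walkthrough.
mkdir -p /tmp/cm-adapter-demo && cd /tmp/cm-adapter-demo
openssl genrsa -out ca.key 2048
openssl req -x509 -new -key ca.key -days 3650 -subj "/CN=demo-ca" -out ca.crt

# Same three steps as above: private key, CSR, CA-signed certificate.
(umask 077; openssl genrsa -out serving.key 2048)
openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
openssl x509 -req -in serving.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out serving.crt -days 3650

# The signed certificate should verify against the CA.
openssl verify -CAfile ca.crt serving.crt
```

The resulting serving.crt/serving.key pair is what the `kubectl create secret generic cm-adapter-serving-certs` command above packages for the adapter to serve TLS with.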

2.6 Deploying k8s-prometheus-adapter

The custom-metrics-apiserver-deployment.yaml and custom-metrics-config-map.yaml bundled here are slightly broken, so download those two files from the k8s-prometheus-adapter project instead:

[root@k8s-master k8s-prometheus-adapter]# wget https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-apiserver-deployment.yaml

[root@k8s-master k8s-prometheus-adapter]# vim k8s-prometheus-adapter/custom-metrics-apiserver-deployment.yaml #change the namespace to prom

[root@k8s-master k8s-prometheus-adapter]# wget https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-config-map.yaml  #also change the namespace to prom

[root@k8s-master k8s-prom]# kubectl apply -f k8s-prometheus-adapter/
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/custom-metrics-auth-reader created
deployment.apps/custom-metrics-apiserver created
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-resource-reader created
serviceaccount/custom-metrics-apiserver created
service/custom-metrics-apiserver created
apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/custom-metrics-server-resources created
clusterrole.rbac.authorization.k8s.io/custom-metrics-resource-reader created
clusterrolebinding.rbac.authorization.k8s.io/hpa-controller-custom-metrics created
configmap/adapter-config created

[root@k8s-master k8s-prom]# kubectl get pods -n prom
NAME                                       READY     STATUS    RESTARTS   AGE
custom-metrics-apiserver-65f545496-l5md9   1/1       Running   0          7m
kube-state-metrics-78fc9fc745-g66p8        1/1       Running   0          40m
prometheus-node-exporter-6srrq             1/1       Running   0          1h
prometheus-node-exporter-fftmc             1/1       Running   0          1h
prometheus-node-exporter-qlr8d             1/1       Running   0          1h
prometheus-server-66cbd4c6b-j9lqr          1/1       Running   0          53m

[root@k8s-master k8s-prom]# kubectl api-versions |grep custom
custom.metrics.k8s.io/v1beta1

[root@k8s-master ~]# kubectl get svc -n  prom
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
custom-metrics-apiserver   ClusterIP   10.99.14.141    <none>        443/TCP          11h
kube-state-metrics         ClusterIP   10.107.23.237   <none>        8080/TCP         11h
prometheus                 NodePort    10.96.65.72     <none>        9090:30090/TCP   11h
prometheus-node-exporter   ClusterIP   None            <none>        9100/TCP         11h

Visit 192.168.56.11:30090, as shown below: choose the metric you want to inspect and click Execute.
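Queries entered in that box are written in PromQL. As a sketch of what a common function such as rate() computes, the arithmetic can be reproduced by hand (sample values are invented, and the real rate() additionally handles counter resets and window extrapolation):

```python
# Hand-computed counterpart of PromQL's rate(): the per-second increase of a
# counter over a window, given (timestamp_seconds, value) samples.
def simple_rate(samples):
    """Per-second rate between the first and last sample of a window.

    Ignores counter resets and extrapolation, which PromQL's rate() handles;
    this only shows the underlying arithmetic.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# A CPU-seconds counter scraped every 15s: 3s of CPU used over 30s of wall time.
samples = [(0, 100.0), (15, 101.5), (30, 103.0)]
print(simple_rate(samples))  # 0.1 -> the CPU was about 10% busy
```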

三、Grafana數據展現

[root@k8s-master k8s-prom]# cat grafana.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: prom    # change the namespace to prom
spec:
  replicas: 1
  selector:
    matchLabels:
      task: monitoring
      k8s-app: grafana
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v5.0.4
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certificates
          readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:    # this is the old Heapster Grafana config, so comment out the InfluxDB env variable
        #- name: INFLUXDB_HOST
        #  value: monitoring-influxdb
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          # value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          value: /
      volumes:
      - name: ca-certificates
        hostPath:
          path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: prom
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana

[root@k8s-master k8s-prom]# kubectl apply -f grafana.yaml 
deployment.apps/monitoring-grafana created
service/monitoring-grafana created

[root@k8s-master k8s-prom]# kubectl get pods -n prom
NAME                                       READY     STATUS    RESTARTS   AGE
custom-metrics-apiserver-65f545496-l5md9   1/1       Running   0          16m
kube-state-metrics-78fc9fc745-g66p8        1/1       Running   0          49m
monitoring-grafana-7c94886cd5-dhcqz        1/1       Running   0          36s
prometheus-node-exporter-6srrq             1/1       Running   0          1h
prometheus-node-exporter-fftmc             1/1       Running   0          1h
prometheus-node-exporter-qlr8d             1/1       Running   0          1h
prometheus-server-66cbd4c6b-j9lqr          1/1       Running   0          1h

[root@k8s-master k8s-prom]# kubectl get svc -n prom
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
custom-metrics-apiserver   ClusterIP   10.99.14.141    <none>        443/TCP          11h
kube-state-metrics         ClusterIP   10.107.23.237   <none>        8080/TCP         11h
monitoring-grafana         NodePort    10.98.174.125   <none>        80:30582/TCP     10h
prometheus                 NodePort    10.96.65.72     <none>        9090:30090/TCP   11h
prometheus-node-exporter   ClusterIP   None            <none>        9100/TCP         11h

Visit Grafana at 192.168.56.11:30582. No Kubernetes dashboards are included by default; suitable Kubernetes dashboard templates can be downloaded from grafana.com.
