Building Full-Stack Kubernetes Monitoring with the Elastic Stack

This article describes how to build a monitoring environment for Kubernetes with the Elastic Stack. The goal of observability is to give operations teams the tools to detect when a service becomes unavailable (outages, errors, slow responses, and so on) and to retain enough diagnostic information to help locate the root cause. Broadly, it covers three areas:

  • Metrics provide time-series data for each component of the system, such as CPU, memory, disk, and network usage; they are typically used to show the overall state of the system and to detect abnormal behavior at a given point in time
  • Logs give operators data for analyzing erroneous behavior of the system; logs from the system, services, and applications are usually collected centrally in one database
  • Tracing, or APM (Application Performance Monitoring), provides a much more detailed view of the application: every request and step a service executes can be recorded (HTTP calls, database queries, and so on). By tracing this data we can measure service performance and improve or fix the system accordingly.

img

In this article we will monitor the environment of a Kubernetes cluster with an Elastic Stack composed of ElasticSearch, Kibana, Filebeat, Metricbeat, and APM-Server. To better understand how these components are configured, we will install them from hand-written resource manifests; of course, tools such as Helm could also be used for a quick installation.

img

Let's now learn how to build a Kubernetes monitoring stack with the Elastic Stack. The test environment here is a Kubernetes v1.16.3 cluster (already set up). For easier management, all resource objects are deployed into a namespace named elastic:

$ kubectl create ns elastic
namespace/elastic created

1. A Sample Application Built with Spring Boot and MongoDB

First we deploy a sample application built with Spring Boot and MongoDB. Start with the MongoDB deployment; the corresponding manifest is shown below:

# mongo.yml
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
  namespace: elastic
  labels:
    app: mongo
spec:
  ports:
  - port: 27017
    protocol: TCP
  selector:
    app: mongo
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: elastic
  name: mongo
  labels:
    app: mongo
spec:
  serviceName: "mongo"
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: mongo
        image: mongo
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: data
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: rook-ceph-block  # use a StorageClass that supports RWO
      resources:
        requests:
          storage: 1Gi

Here a StorageClass named rook-ceph-block is used to provision the PV automatically; replace it with any StorageClass in your own cluster that supports RWO access (this environment uses a rook-ceph setup for storage).
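If you are not sure which class to pick, a quick optional check is to list the StorageClasses available in your cluster and, after applying, confirm that the generated PVC is Bound (the rook-ceph-block name above is specific to this environment):

$ kubectl get storageclass
$ kubectl get pvc -n elastic   # after applying, the data-mongo-0 claim should be Bound

Once a suitable class is set in the manifest, apply it: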

$ kubectl apply -f mongo.yml
service/mongo created
statefulset.apps/mongo created
$ kubectl get pods -n elastic -l app=mongo             
NAME      READY   STATUS    RESTARTS   AGE
mongo-0   1/1     Running   0          34m

Once the Pod reaches the Running state, MongoDB has been deployed successfully. Next, deploy the Spring Boot API application, exposed here through a NodePort-type Service; the corresponding manifest is shown below:

# spring-boot-simple.yml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: spring-boot-simple
  labels:
    app: spring-boot-simple
spec:
  type: NodePort
  ports:
  - port: 8080
    protocol: TCP
  selector:
    app: spring-boot-simple
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: spring-boot-simple
  labels:
    app: spring-boot-simple
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spring-boot-simple
  template:
    metadata:
      labels:
        app: spring-boot-simple
    spec:
      containers:
      - image: cnych/spring-boot-simple:0.0.1-SNAPSHOT
        name: spring-boot-simple
        env:
        - name: SPRING_DATA_MONGODB_HOST  # the MongoDB address
          value: mongo
        ports:
        - containerPort: 8080

Again, simply create the application from the manifest above:

$ kubectl apply -f spring-boot-simple.yml
service/spring-boot-simple created
deployment.apps/spring-boot-simple created
$ kubectl get pods -n elastic -l app=spring-boot-simple
NAME                                  READY   STATUS    RESTARTS   AGE
spring-boot-simple-64795494bf-hqpcj   1/1     Running   0          24m
$ kubectl get svc -n elastic -l app=spring-boot-simple
NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
spring-boot-simple   NodePort   10.109.55.134   <none>        8080:31847/TCP   84s
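
If you only need the assigned NodePort (31847 here is specific to this cluster), a jsonpath query returns it directly; this is just a convenience, not part of the original steps:

$ kubectl get svc -n elastic spring-boot-simple -o jsonpath='{.spec.ports[0].nodePort}'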

Once the application is deployed, it can be reached at http://<NodeIP>:31847 and tested with the following commands:

$ curl -X GET  http://k8s.qikqiak.com:31847/
Greetings from Spring Boot!

Send a POST request:

$ curl -X POST http://k8s.qikqiak.com:31847/message -d 'hello world'
{"id":"5ef55c130d53190001bf74d2","message":"hello+world=","postedAt":"2020-06-26T02:23:15.860+0000"}

Get all the messages:

$ curl -X GET http://k8s.qikqiak.com:31847/message
[{"id":"5ef55c130d53190001bf74d2","message":"hello+world=","postedAt":"2020-06-26T02:23:15.860+0000"}]

2. ElasticSearch Cluster

To build an Elastic monitoring stack we of course first need to deploy ElasticSearch, the database that stores all the metrics, logs, and traces. Here the cluster is made up of three scalable nodes with different roles.

2.1 Install the ElasticSearch Master Node

The first node of the cluster is the master node, responsible for controlling the whole cluster. First create a ConfigMap object describing the cluster configuration, which configures the ElasticSearch master node into the cluster and enables the security features. The corresponding manifest is shown below:

# elasticsearch-master.configmap.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: elasticsearch-master-config
  labels:
    app: elasticsearch
    role: master
data:
  elasticsearch.yml: |-
    cluster.name: ${CLUSTER_NAME}
    node.name: ${NODE_NAME}
    discovery.seed_hosts: ${NODE_LIST}
    cluster.initial_master_nodes: ${MASTER_NODES}

    network.host: 0.0.0.0

    node:
      master: true
      data: false
      ingest: false

    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
---

Then create a Service object. For the master node we only need port 9300, which is used for inter-node communication. The manifest is shown below:

# elasticsearch-master.service.yaml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  ports:
  - port: 9300
    name: transport
  selector:
    app: elasticsearch
    role: master
---

Finally, define the master node application with a Deployment object; the manifest is shown below:

# elasticsearch-master.deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      containers:
      - name: elasticsearch-master
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        env:
        - name: CLUSTER_NAME
          value: elasticsearch
        - name: NODE_NAME
          value: elasticsearch-master
        - name: NODE_LIST
          value: elasticsearch-master,elasticsearch-data,elasticsearch-client
        - name: MASTER_NODES
          value: elasticsearch-master
        - name: "ES_JAVA_OPTS"
          value: "-Xms512m -Xmx512m"
        ports:
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          readOnly: true
          subPath: elasticsearch.yml
        - name: storage
          mountPath: /data
      volumes:
      - name: config
        configMap:
          name: elasticsearch-master-config
      - name: "storage"
        emptyDir:
          medium: ""
---

Create the three resource objects above directly:

$ kubectl apply  -f elasticsearch-master.configmap.yaml \
                 -f elasticsearch-master.service.yaml \
                 -f elasticsearch-master.deployment.yaml

configmap/elasticsearch-master-config created
service/elasticsearch-master created
deployment.apps/elasticsearch-master created
$ kubectl get pods -n elastic -l app=elasticsearch
NAME                                    READY   STATUS    RESTARTS   AGE
elasticsearch-master-6f666cbbd-r9vtx    1/1     Running   0          111m

Once the Pod reaches the Running state, the master node has been installed successfully.

2.2 Install the ElasticSearch Data Node

Now we install the cluster's data node, which is responsible for hosting the cluster's data and executing queries. As with the master node, a ConfigMap object configures the data node:

# elasticsearch-data.configmap.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: elasticsearch-data-config
  labels:
    app: elasticsearch
    role: data
data:
  elasticsearch.yml: |-
    cluster.name: ${CLUSTER_NAME}
    node.name: ${NODE_NAME}
    discovery.seed_hosts: ${NODE_LIST}
    cluster.initial_master_nodes: ${MASTER_NODES}

    network.host: 0.0.0.0

    node:
      master: false
      data: true
      ingest: false

    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
---

This looks very similar to the master configuration above; note, however, that node.data is set to true here.

Again, only port 9300 is needed to communicate with the other nodes:

# elasticsearch-data.service.yaml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: elasticsearch-data
  labels:
    app: elasticsearch
    role: data
spec:
  ports:
  - port: 9300
    name: transport
  selector:
    app: elasticsearch
    role: data
---

Finally, create a StatefulSet controller. Since there may be several data nodes and each node stores different data, every node needs its own storage, so a volumeClaimTemplates entry is used to create a separate volume per node. The corresponding manifest is shown below:

# elasticsearch-data.statefulset.yaml
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: elastic
  name: elasticsearch-data
  labels:
    app: elasticsearch
    role: data
spec:
  serviceName: "elasticsearch-data"
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      containers:
      - name: elasticsearch-data
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        env:
        - name: CLUSTER_NAME
          value: elasticsearch
        - name: NODE_NAME
          value: elasticsearch-data
        - name: NODE_LIST
          value: elasticsearch-master,elasticsearch-data,elasticsearch-client
        - name: MASTER_NODES
          value: elasticsearch-master
        - name: "ES_JAVA_OPTS"
          value: "-Xms1024m -Xmx1024m"
        ports:
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          readOnly: true
          subPath: elasticsearch.yml
        - name: elasticsearch-data-persistent-storage
          mountPath: /data/db
      volumes:
      - name: config
        configMap:
          name: elasticsearch-data-config
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data-persistent-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: rook-ceph-block
      resources:
        requests:
          storage: 50Gi
---

Create the resource objects above directly:

$ kubectl apply -f elasticsearch-data.configmap.yaml \
                -f elasticsearch-data.service.yaml \
                -f elasticsearch-data.statefulset.yaml

configmap/elasticsearch-data-config created
service/elasticsearch-data created
statefulset.apps/elasticsearch-data created

Once the Pod reaches the Running state, the node has started successfully:

$ kubectl get pods -n elastic -l app=elasticsearch
NAME                                    READY   STATUS    RESTARTS   AGE
elasticsearch-data-0                    1/1     Running   0          90m
elasticsearch-master-6f666cbbd-r9vtx    1/1     Running   0          111m

2.3 Install the ElasticSearch Client Node

Finally, install and configure the ElasticSearch client node, which is mainly responsible for exposing an HTTP interface that passes queries on to the data nodes.

Again, a ConfigMap object configures the node:

# elasticsearch-client.configmap.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: elasticsearch-client-config
  labels:
    app: elasticsearch
    role: client
data:
  elasticsearch.yml: |-
    cluster.name: ${CLUSTER_NAME}
    node.name: ${NODE_NAME}
    discovery.seed_hosts: ${NODE_LIST}
    cluster.initial_master_nodes: ${MASTER_NODES}

    network.host: 0.0.0.0

    node:
      master: false
      data: false
      ingest: true

    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
---

The client node needs to expose two ports: port 9300 for communicating with the other nodes in the cluster, and port 9200 for the HTTP API. The corresponding Service object is shown below:

# elasticsearch-client.service.yaml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: elasticsearch-client
  labels:
    app: elasticsearch
    role: client
spec:
  ports:
  - port: 9200
    name: client
  - port: 9300
    name: transport
  selector:
    app: elasticsearch
    role: client
---

A Deployment object describes the client node:

# elasticsearch-client.deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: elasticsearch-client
  labels:
    app: elasticsearch
    role: client
spec:
  selector:
    matchLabels:
      app: elasticsearch
      role: client
  template:
    metadata:
      labels:
        app: elasticsearch
        role: client
    spec:
      containers:
      - name: elasticsearch-client
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        env:
        - name: CLUSTER_NAME
          value: elasticsearch
        - name: NODE_NAME
          value: elasticsearch-client
        - name: NODE_LIST
          value: elasticsearch-master,elasticsearch-data,elasticsearch-client
        - name: MASTER_NODES
          value: elasticsearch-master
        - name: "ES_JAVA_OPTS"
          value: "-Xms256m -Xmx256m"
        ports:
        - containerPort: 9200
          name: client
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          readOnly: true
          subPath: elasticsearch.yml
        - name: storage
          mountPath: /data
      volumes:
      - name: config
        configMap:
          name: elasticsearch-client-config
      - name: "storage"
        emptyDir:
          medium: ""
---

Again, create the resource objects above directly to deploy the client node:

$ kubectl apply  -f elasticsearch-client.configmap.yaml \
                 -f elasticsearch-client.service.yaml \
                 -f elasticsearch-client.deployment.yaml

configmap/elasticsearch-client-config created
service/elasticsearch-client created
deployment.apps/elasticsearch-client created

Once all the nodes are deployed successfully, the cluster installation is complete:

$ kubectl get pods -n elastic -l app=elasticsearch
NAME                                    READY   STATUS    RESTARTS   AGE
elasticsearch-client-788bffcc98-hh2s8   1/1     Running   0          83m
elasticsearch-data-0                    1/1     Running   0          91m
elasticsearch-master-6f666cbbd-r9vtx    1/1     Running   0          112m

The cluster's health-state changes can be followed with the command below:

$ kubectl logs -f -n elastic \
  $(kubectl get pods -n elastic | grep elasticsearch-master | sed -n 1p | awk '{print $1}') \
  | grep "Cluster health status changed from"

{"type": "server", "timestamp": "2020-06-26T03:31:21,353Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master", "message": "Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[.monitoring-es-7-2020.06.26][0]]]).", "cluster.uuid": "SS_nyhNiTDSCE6gG7z-J4w", "node.id": "BdVScO9oQByBHR5rfw-KDA"  }

2.4 Generate Passwords

We enabled the xpack security module to protect the cluster, so we need initial passwords. Run the bin/elasticsearch-setup-passwords command inside the client node container to generate default usernames and passwords, as shown below:

$ kubectl exec $(kubectl get pods -n elastic | grep elasticsearch-client | sed -n 1p | awk '{print $1}') \
    -n elastic \
    -- bin/elasticsearch-setup-passwords auto -b

Changed password for user apm_system
PASSWORD apm_system = 3Lhx61s6woNLvoL5Bb7t

Changed password for user kibana_system
PASSWORD kibana_system = NpZv9Cvhq4roFCMzpja3

Changed password for user kibana
PASSWORD kibana = NpZv9Cvhq4roFCMzpja3

Changed password for user logstash_system
PASSWORD logstash_system = nNnGnwxu08xxbsiRGk2C

Changed password for user beats_system
PASSWORD beats_system = fen759y5qxyeJmqj6UPp

Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = mCP77zjCATGmbcTFFgOX

Changed password for user elastic
PASSWORD elastic = wmxhvsJFeti2dSjbQEAH

Note that the password of the elastic user also needs to be stored in a Kubernetes Secret object (it will be referenced later):

$ kubectl create secret generic elasticsearch-pw-elastic \
    -n elastic \
    --from-literal password=wmxhvsJFeti2dSjbQEAH
secret/elasticsearch-pw-elastic created
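
As an optional sanity check, you can port-forward the client Service and query the cluster health API with the elastic password generated above (the value shown is from this example run); a green status means all three node roles have joined the cluster:

$ kubectl port-forward -n elastic svc/elasticsearch-client 9200:9200 &
$ curl -s -u "elastic:wmxhvsJFeti2dSjbQEAH" "http://localhost:9200/_cluster/health?pretty"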

3. Kibana

With the ElasticSearch cluster installed, we can now deploy Kibana, the data-visualization tool for ElasticSearch, which provides various features for managing an ElasticSearch cluster and visualizing its data.

As before, we first use a ConfigMap object to provide a configuration file containing the ElasticSearch access settings (host, username, and password), all supplied through environment variables. The corresponding manifest is shown below:

# kibana.configmap.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: kibana-config
  labels:
    app: kibana
data:
  kibana.yml: |-
    server.host: 0.0.0.0

    elasticsearch:
      hosts: ${ELASTICSEARCH_HOSTS}
      username: ${ELASTICSEARCH_USER}
      password: ${ELASTICSEARCH_PASSWORD}
---

Then expose the Kibana service through a NodePort-type Service:

# kibana.service.yaml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: kibana
  labels:
    app: kibana
spec:
  type: NodePort
  ports:
  - port: 5601
    name: webinterface
  selector:
    app: kibana
---

Finally, deploy Kibana with a Deployment. Since the password has to be supplied through an environment variable, the Secret object created above is referenced here:

# kibana.deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: kibana
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.8.0
        ports:
        - containerPort: 5601
          name: webinterface
        env:
        - name: ELASTICSEARCH_HOSTS
          value: "http://elasticsearch-client.elastic.svc.cluster.local:9200"
        - name: ELASTICSEARCH_USER
          value: "elastic"
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:  # reference the Secret created earlier and expose the password as an environment variable
              name: elasticsearch-pw-elastic
              key: password
        volumeMounts:
        - name: config
          mountPath: /usr/share/kibana/config/kibana.yml
          readOnly: true
          subPath: kibana.yml
      volumes:
      - name: config
        configMap:
          name: kibana-config
---

Again, simply create the manifests above to deploy it:

$ kubectl apply  -f kibana.configmap.yaml \
                 -f kibana.service.yaml \
                 -f kibana.deployment.yaml

configmap/kibana-config created
service/kibana created
deployment.apps/kibana created

After the deployment, Kibana's status can be checked from the Pod's logs:

$ kubectl logs -f -n elastic $(kubectl get pods -n elastic | grep kibana | sed -n 1p | awk '{print $1}') \
     | grep "Status changed from yellow to green"

{"type":"log","@timestamp":"2020-06-26T04:20:38Z","tags":["status","plugin:elasticsearch@7.8.0","info"],"pid":6,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}

Once the status turns green, the Kibana service can be accessed in a browser through NodePort 30474:

$ kubectl get svc kibana -n elastic   
NAME     TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
kibana   NodePort   10.101.121.31   <none>        5601:30474/TCP   8m18s

As shown below, log in with the elastic user and the generated password stored in the Secret object created above:

img

After a successful login you are redirected to the Kibana home page:

img

You can also create a new superuser of your own under Management → Stack Management → Create User:

img

Create the new user with a username and password and assign it the superuser role:

img

Once created, the new user can be used to log in to Kibana. Finally, the health of the whole cluster can be checked on the Management → Stack Monitoring page:

img

At this point we have installed ElasticSearch and Kibana; they will store and visualize our application data (metrics, logs, and traces).

With the ElasticSearch cluster configured, we will now use Metricbeat to monitor the Kubernetes cluster. Metricbeat is a lightweight agent that runs on servers and periodically collects monitoring metrics from hosts and services. This is the first part of our full-stack Kubernetes monitoring.

Metricbeat collects system metrics by default, but it also ships with a large number of modules for collecting service metrics, for example Nginx, Kafka, MySQL, Redis, and so on. The complete list of supported modules is available on the Elastic website at https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-modules.html.
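
Once the Metricbeat DaemonSet from section 5 is running, you can also ask the binary itself which modules it ships with; note that in this setup the modules are configured directly in metricbeat.yml rather than via modules.d, so this listing is mainly informational:

$ kubectl exec -n elastic $(kubectl get pods -n elastic -l app=metricbeat -o jsonpath='{.items[0].metadata.name}') -- metricbeat modules list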

4. kube-state-metrics

First we need to install kube-state-metrics, a service that listens to the Kubernetes API and exposes metrics about the state of every resource object.

Installing kube-state-metrics is straightforward; the corresponding GitHub repository ships the installation manifests:

$ git clone https://github.com/kubernetes/kube-state-metrics.git
$ cd kube-state-metrics
# run the install command
$ kubectl apply -f examples/standard/  
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics configured
clusterrole.rbac.authorization.k8s.io/kube-state-metrics configured
deployment.apps/kube-state-metrics configured
serviceaccount/kube-state-metrics configured
service/kube-state-metrics configured
$ kubectl get pods -n kube-system -l app.kubernetes.io/name=kube-state-metrics
NAME                                  READY   STATUS    RESTARTS   AGE
kube-state-metrics-6d7449fc78-mgf4f   1/1     Running   0          88s

Once the Pod reaches the Running state, the installation has succeeded.
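
To double-check that kube-state-metrics is actually serving data (an optional step), port-forward the deployment and fetch the /metrics endpoint, which is the same endpoint Metricbeat will scrape later:

$ kubectl port-forward -n kube-system deployment/kube-state-metrics 8080:8080 &
$ curl -s http://localhost:8080/metrics | head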

5. Metricbeat

Since every node has to be monitored, Metricbeat is installed with a DaemonSet controller.

First, configure Metricbeat with a ConfigMap, which is then mounted into the container at /etc/metricbeat.yml through a Volume. The configuration file contains the ElasticSearch address, username, and password, the Kibana settings, the modules to enable, and the scrape interval.

# metricbeat.settings.configmap.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: metricbeat-config
  labels:
    app: metricbeat
data:
  metricbeat.yml: |-

    # module configuration
    metricbeat.modules:
    - module: system
      period: ${PERIOD}  # interval between metric collections
      metricsets: ["cpu", "load", "memory", "network", "process", "process_summary", "core", "diskio", "socket"]
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # top 5 processes by CPU
        by_memory: 5   # top 5 processes by memory

    - module: system
      period: ${PERIOD}
      metricsets:  ["filesystem", "fsstat"]
      processors:
      - drop_event.when.regexp:  # exclude some system mount points from monitoring
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'

    - module: docker             # collect Docker container metrics (containerd is not supported)
      period: ${PERIOD}
      hosts: ["unix:///var/run/docker.sock"]
      metricsets: ["container", "cpu", "diskio", "healthcheck", "info", "memory", "network"]

    - module: kubernetes  # collect kubelet metrics
      period: ${PERIOD}
      node: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]    # the kubelet metrics port; to also monitor components such as the api-server or controller-manager, their ports need to be scraped as well
      metricsets: ["node", "system", "pod", "container", "volume"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
     
    - module: kubernetes  # collect kube-state-metrics data
      period: ${PERIOD}
      node: ${NODE_NAME}
      metricsets: ["state_node", "state_deployment", "state_replicaset", "state_pod", "state_container"]
      hosts: ["kube-state-metrics.kube-system.svc.cluster.local:8080"]

    # configure service-specific modules (here mongo) based on the k8s workload labels
    metricbeat.autodiscover:
      providers:
      - type: kubernetes
        node: ${NODE_NAME}
        templates:
        - condition.equals:
            kubernetes.labels.app: mongo
          config:
          - module: mongodb
            period: ${PERIOD}
            hosts: ["mongo.elastic:27017"]
            metricsets: ["dbstats", "status", "collstats", "metrics", "replstatus"]

    # ElasticSearch connection settings
    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}

    # connect to Kibana
    setup.kibana:
      host: '${KIBANA_HOST:kibana}:${KIBANA_PORT:5601}'

    # import the pre-built dashboards
    setup.dashboards.enabled: true

    # indice lifecycle configuration
    setup.ilm:
      policy_file: /etc/indice-lifecycle.json
---

An ElasticSearch indice lifecycle is a set of rules that can be applied to your indices based on their size or age. For example, an indice can be rolled over every day or whenever it exceeds 1GB, and different phases can be configured with their own rules. Since monitoring generates a large amount of data, quite possibly tens of gigabytes per day, we can use the indice lifecycle to configure data retention and avoid storing huge volumes of data; Prometheus offers a similar retention mechanism. In the file below, the indice is rolled over every day or whenever it exceeds 5GB, and every indice older than 10 days is deleted; keeping 10 days of monitoring data is more than enough here.

# metricbeat.indice-lifecycle.configmap.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: metricbeat-indice-lifecycle
  labels:
    app: metricbeat
data:
  indice-lifecycle.json: |-
    {
      "policy": {
        "phases": {
          "hot": {
            "actions": {
              "rollover": {
                "max_size": "5GB" ,
                "max_age": "1d"
              }
            }
          },
          "delete": {
            "min_age": "10d",
            "actions": {
              "delete": {}
            }
          }
        }
      }
    }
---
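
Once Metricbeat (deployed below) has started, you can confirm that the policy was actually installed by querying the ILM API; the policy name is assumed to default to metricbeat, and the elastic password plus a port-forward to elasticsearch-client, as in the earlier health check, are reused here:

$ curl -s -u "elastic:wmxhvsJFeti2dSjbQEAH" "http://localhost:9200/_ilm/policy/metricbeat?pretty"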

Now we can write the DaemonSet manifest for Metricbeat, as shown below:

# metricbeat.daemonset.yml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: elastic
  name: metricbeat
  labels:
    app: metricbeat
spec:
  selector:
    matchLabels:
      app: metricbeat
  template:
    metadata:
      labels:
        app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:7.8.0
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e", "-system.hostfs=/hostfs"
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-client.elastic.svc.cluster.local
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:    # reference the Secret created earlier
              name: elasticsearch-pw-elastic
              key: password
        - name: KIBANA_HOST
          value: kibana.elastic.svc.cluster.local
        - name: KIBANA_PORT
          value: "5601"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: PERIOD
          value: "10s"
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: indice-lifecycle
          mountPath: /etc/indice-lifecycle.json
          readOnly: true
          subPath: indice-lifecycle.json
        - name: dockersock
          mountPath: /var/run/docker.sock
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: dockersock
        hostPath:
          path: /var/run/docker.sock
      - name: config
        configMap:
          defaultMode: 0600
          name: metricbeat-config
      - name: indice-lifecycle
        configMap:
          defaultMode: 0600
          name: metricbeat-indice-lifecycle
      - name: data
        hostPath:
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---

Note that the two ConfigMaps above are mounted into the container. Because Metricbeat needs information from the host, several host paths are also mounted into the container, such as the proc directory, the cgroup directory, and the docker.sock file.

Since Metricbeat needs to read Kubernetes resource objects, it also requires the corresponding RBAC permissions. As these are cluster-scoped, a ClusterRole is used for the declaration:

# metricbeat.permissions.yml
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: elastic
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: elastic
  name: metricbeat
  labels:
    app: metricbeat
---

Simply create the resource objects above:

$ kubectl apply  -f metricbeat.settings.configmap.yml \
                 -f metricbeat.indice-lifecycle.configmap.yml \
                 -f metricbeat.daemonset.yml \
                 -f metricbeat.permissions.yml

configmap/metricbeat-config configured
configmap/metricbeat-indice-lifecycle configured
daemonset.extensions/metricbeat created
clusterrolebinding.rbac.authorization.k8s.io/metricbeat created
clusterrole.rbac.authorization.k8s.io/metricbeat created
serviceaccount/metricbeat created
$ kubectl get pods -n elastic -l app=metricbeat   
NAME               READY   STATUS    RESTARTS   AGE
metricbeat-2gstq   1/1     Running   0          18m
metricbeat-99rdb   1/1     Running   0          18m
metricbeat-9bb27   1/1     Running   0          18m
metricbeat-cgbrg   1/1     Running   0          18m
metricbeat-l2csd   1/1     Running   0          18m
metricbeat-lsrgv   1/1     Running   0          18m

Once the Metricbeat Pods are in the Running state, the corresponding monitoring data should show up in Kibana.
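
Before opening the dashboards, a quick way to confirm that metric documents are actually arriving (again assuming the port-forward and elastic password from section 2.4) is to list the Metricbeat indices:

$ curl -s -u "elastic:wmxhvsJFeti2dSjbQEAH" "http://localhost:9200/_cat/indices/metricbeat-*?v"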

In Kibana, open Observability → Metrics in the left-hand menu to reach the metrics page, where some monitoring data should already be visible:

img

The view can also be filtered to your needs; for example, the monitoring data can be grouped by Kubernetes Namespace:

img

Because setup.dashboards.enabled=true is set in the configuration file, Kibana imports a set of pre-built dashboards. In the left-hand menu, open Kibana → Dashboard to see a list of roughly 50 Metricbeat dashboards, which can be filtered as needed. For example, to inspect the cluster nodes, open the [Metricbeat Kubernetes] Overview ECS dashboard:

img

Since the mongodb module was enabled separately, the [Metricbeat MongoDB] Overview ECS dashboard shows its monitoring data:

img

The docker module is enabled as well, so the [Metricbeat Docker] Overview ECS dashboard shows the Docker monitoring data:

img

This completes monitoring the Kubernetes cluster with Metricbeat. Next we will learn how to collect logs with Filebeat to monitor the Kubernetes cluster.

6. Filebeat

We will now install and configure Filebeat to collect log data from the Kubernetes cluster and ship it to ElasticSearch. Filebeat is a lightweight log-shipping agent that can also be configured with specific modules to parse and visualize the log formats of applications such as databases or Nginx.

Similar to Metricbeat, Filebeat needs a configuration file that defines the connection to ElasticSearch, the connection to Kibana, and the way logs are collected and parsed.

The ConfigMap shown below holds the log-collection configuration used here (the complete list of configuration options is available on the official website):

# filebeat.settings.configmap.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: filebeat-config
  labels:
    app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      enabled: true
      paths:
      - /var/log/containers/*.log
      processors:
      - add_kubernetes_metadata:
          in_cluster: true
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
    
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition.equals:
                kubernetes.labels.app: mongo
              config:
                - module: mongodb
                  enabled: true
                  log:
                    input:
                      type: docker
                      containers.ids:
                        - ${data.kubernetes.container.id}

    processors:
      - drop_event:
          when.or:
              - and:
                  - regexp:
                      message: '^\d+\.\d+\.\d+\.\d+ '
                  - equals:
                      fileset.name: error
              - and:
                  - not:
                      regexp:
                          message: '^\d+\.\d+\.\d+\.\d+ '
                  - equals:
                      fileset.name: access
      - add_cloud_metadata:
      - add_kubernetes_metadata:
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
      - add_docker_metadata:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}

    setup.kibana:
      host: '${KIBANA_HOST:kibana}:${KIBANA_PORT:5601}'

    setup.dashboards.enabled: true
    setup.template.enabled: true

    setup.ilm:
      policy_file: /etc/indice-lifecycle.json
---

We collect all the logs under /var/log/containers/ and use the in_cluster mode to query the Kubernetes APIServer for the logs' metadata, then send the logs directly to Elasticsearch.

In addition, the indice retention policy is defined through policy_file:

# filebeat.indice-lifecycle.configmap.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: filebeat-indice-lifecycle
  labels:
    app: filebeat
data:
  indice-lifecycle.json: |-
    {
      "policy": {
        "phases": {
          "hot": {
            "actions": {
              "rollover": {
                "max_size": "5GB" ,
                "max_age": "1d"
              }
            }
          },
          "delete": {
            "min_age": "30d",
            "actions": {
              "delete": {}
            }
          }
        }
      }
    }
---

To collect the log data from every node, a DaemonSet controller is again used here with the configuration above.

#filebeat.daemonset.yml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: elastic
  name: filebeat
  labels:
    app: filebeat
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.8.0
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-client.elastic.svc.cluster.local
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-pw-elastic
              key: password
        - name: KIBANA_HOST
          value: kibana.elastic.svc.cluster.local
        - name: KIBANA_PORT
          value: "5601"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: filebeat-indice-lifecycle
          mountPath: /etc/indice-lifecycle.json
          readOnly: true
          subPath: indice-lifecycle.json
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: dockersock
          mountPath: /var/run/docker.sock
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: filebeat-indice-lifecycle
        configMap:
          defaultMode: 0600
          name: filebeat-indice-lifecycle
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: dockersock
        hostPath:
          path: /var/run/docker.sock
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---

The cluster here was built with kubeadm, so the master nodes are tainted by default; to also collect logs from the master nodes, the corresponding toleration has to be added (a sketch is shown after the RBAC manifest below). I am not collecting them, so no toleration is added here. Also, because Filebeat needs the Kubernetes metadata of the logs, such as the Pod name and namespace, it has to talk to the APIServer and therefore needs the corresponding RBAC permissions, so a permission declaration is required as well:

# filebeat.permission.yml
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: elastic
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    app: filebeat
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: elastic
  name: filebeat
  labels:
    app: filebeat
---
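
If you do want to collect logs from the master nodes as mentioned above, a toleration has to be added to the Filebeat DaemonSet. A minimal sketch using a JSON patch is shown below (run it after the DaemonSet has been created, and adjust the taint key to whatever your masters actually use):

$ kubectl -n elastic patch daemonset filebeat --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/tolerations","value":[{"key":"node-role.kubernetes.io/master","operator":"Exists","effect":"NoSchedule"}]}]'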

Then simply deploy the resource objects above:

$ kubectl apply  -f filebeat.settings.configmap.yml \
                 -f filebeat.indice-lifecycle.configmap.yml \
                 -f filebeat.daemonset.yml \
                 -f filebeat.permissions.yml 

configmap/filebeat-config created
configmap/filebeat-indice-lifecycle created
daemonset.apps/filebeat created
clusterrolebinding.rbac.authorization.k8s.io/filebeat created
clusterrole.rbac.authorization.k8s.io/filebeat created
serviceaccount/filebeat created
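
A quick way to confirm that log documents are reaching ElasticSearch (optional, reusing the port-forward and elastic password from section 2.4) is to count the documents in the Filebeat indices:

$ curl -s -u "elastic:wmxhvsJFeti2dSjbQEAH" "http://localhost:9200/_cat/count/filebeat-*?v"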

Once all the Filebeat Pods are in the Running state, the deployment is complete and logs can be viewed in Kibana under Observability → Logs in the left-hand menu.

img

Pod logs can also be reached from the Metrics page mentioned in the previous section:

img

Click Kubernetes Pod logs to open the logs of the Pod you want to inspect:

img

If the volume of log data to collect in the cluster is very large, sending it straight to ElasticSearch puts significant pressure on ES. In that case a buffer such as Kafka is usually added in between, or Logstash is used to collect the logs from Filebeat.

This completes collecting Kubernetes cluster logs with Filebeat. Next we will learn how to trace applications in the Kubernetes cluster with Elastic APM.

7. Elastic APM

Elastic APM is the application performance monitoring tool of the Elastic Stack. It lets us monitor application performance in real time by collecting incoming requests, database queries, cache calls, and so on, which makes it much easier and faster to pinpoint performance problems.

Elastic APM is OpenTracing-compliant, so a large number of existing libraries can be used to trace application performance.

For example, we can follow a request through a distributed environment (a microservice architecture) and easily find potential performance bottlenecks.

img

Elastic APM works through a component called APM-Server, which receives the trace data sent by the agents running alongside the applications and forwards it to ElasticSearch.

img

Install APM-Server

First we need to install APM-Server in the Kubernetes cluster to collect the agents' trace data and forward it to ElasticSearch. Again, a ConfigMap provides the configuration:

# apm.configmap.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: elastic
  name: apm-server-config
  labels:
    app: apm-server
data:
  apm-server.yml: |-
    apm-server:
      host: "0.0.0.0:8200"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}

    setup.kibana:
      host: '${KIBANA_HOST:kibana}:${KIBANA_PORT:5601}'
---

APM-Server must expose port 8200 so that the agents can forward their trace data to it; create a matching Service object:

# apm.service.yml
---
apiVersion: v1
kind: Service
metadata:
  namespace: elastic
  name: apm-server
  labels:
    app: apm-server
spec:
  ports:
  - port: 8200
    name: apm-server
  selector:
    app: apm-server
---

Then manage it with a Deployment resource object:

# apm.deployment.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: apm-server
  labels:
    app: apm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: apm-server
  template:
    metadata:
      labels:
        app: apm-server
    spec:
      containers:
      - name: apm-server
        image: docker.elastic.co/apm/apm-server:7.8.0
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-client.elastic.svc.cluster.local
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-pw-elastic
              key: password
        - name: KIBANA_HOST
          value: kibana.elastic.svc.cluster.local
        - name: KIBANA_PORT
          value: "5601"
        ports:
        - containerPort: 8200
          name: apm-server
        volumeMounts:
        - name: config
          mountPath: /usr/share/apm-server/apm-server.yml
          readOnly: true
          subPath: apm-server.yml
      volumes:
      - name: config
        configMap:
          name: apm-server-config
---

Deploy the resource objects above directly:

$ kubectl apply  -f apm.configmap.yml \
                 -f apm.service.yml \
                 -f apm.deployment.yml

configmap/apm-server-config created
service/apm-server created
deployment.extensions/apm-server created

Once the Pod is in the Running state, it is working:

$ kubectl get pods -n elastic -l app=apm-server
NAME                          READY   STATUS    RESTARTS   AGE
apm-server-667bfc5cff-zj8nq   1/1     Running   0          12m
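
To check that APM-Server is reachable before wiring up any agent (optional), port-forward the Service and hit the root endpoint, which by default returns build information; no secret token is configured in this setup:

$ kubectl port-forward -n elastic svc/apm-server 8200:8200 &
$ curl -s http://localhost:8200/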

Next we can install an agent into the Spring Boot application deployed in section 1.

Configure the Java Agent

We now configure an Elastic APM Java agent for the sample application spring-boot-simple. First the elastic-apm-agent-1.8.0.jar needs to be baked into the application container; adding a line like the following to the image's Dockerfile downloads the JAR:

RUN wget -O /apm-agent.jar https://search.maven.org/remotecontent?filepath=co/elastic/apm/elastic-apm-agent/1.8.0/elastic-apm-agent-1.8.0.jar

The complete Dockerfile looks like this:

FROM openjdk:8-jdk-alpine

ENV ELASTIC_APM_VERSION "1.8.0"
RUN wget -O /apm-agent.jar https://search.maven.org/remotecontent?filepath=co/elastic/apm/elastic-apm-agent/$ELASTIC_APM_VERSION/elastic-apm-agent-$ELASTIC_APM_VERSION.jar

COPY target/spring-boot-simple.jar /app.jar

CMD java -jar /app.jar

Then add the following dependencies to the sample application so that the open-tracing libraries can be integrated or the application instrumented manually with the Elastic APM API.

<dependency>
    <groupId>co.elastic.apm</groupId>
    <artifactId>apm-agent-api</artifactId>
    <version>${elastic-apm.version}</version>
</dependency>
<dependency>
    <groupId>co.elastic.apm</groupId>
    <artifactId>apm-opentracing</artifactId>
    <version>${elastic-apm.version}</version>
</dependency>
<dependency>
    <groupId>io.opentracing.contrib</groupId>
    <artifactId>opentracing-spring-cloud-mongo-starter</artifactId>
    <version>${opentracing-spring-cloud.version}</version>
</dependency>

Then the Spring Boot Deployment from the first part has to be modified to enable the Java agent and connect it to APM-Server.

# spring-boot-simple.deployment.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: elastic
  name: spring-boot-simple
  labels:
    app: spring-boot-simple
spec:
  selector:
    matchLabels:
      app: spring-boot-simple
  template:
    metadata:
      labels:
        app: spring-boot-simple
    spec:
      containers:
      - image: cnych/spring-boot-simple:0.0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-boot-simple
        command:
          - "java"
          - "-javaagent:/apm-agent.jar"
          - "-Delastic.apm.active=$(ELASTIC_APM_ACTIVE)"
          - "-Delastic.apm.server_urls=$(ELASTIC_APM_SERVER)"
          - "-Delastic.apm.service_name=spring-boot-simple"
          - "-jar"
          - "app.jar"
        env:
          - name: SPRING_DATA_MONGODB_HOST
            value: mongo
          - name: ELASTIC_APM_ACTIVE
            value: "true"
          - name: ELASTIC_APM_SERVER
            value: http://apm-server.elastic.svc.cluster.local:8200
        ports:
        - containerPort: 8080
---

Then redeploy the sample application:

$ kubectl apply -f spring-boot-simple.deployment.yml
$ kubectl get pods -n elastic -l app=spring-boot-simple
NAME                                 READY   STATUS    RESTARTS   AGE
spring-boot-simple-fb5564885-tf68d   1/1     Running   0          5m11s
$ kubectl get svc -n elastic -l app=spring-boot-simple
NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
spring-boot-simple   NodePort   10.109.55.134   <none>        8080:31847/TCP   9d

Once the sample application has been redeployed, issue the following requests:

get messages

Get all posted messages:

$ curl -X GET http://k8s.qikqiak.com:31847/message

get messages (slow request)

Use sleep=<ms> to simulate a slow request:

$ curl -X GET http://k8s.qikqiak.com:31847/message?sleep=3000

get messages (error)

Use error=true to trigger an exception:

$ curl -X GET http://k8s.qikqiak.com:31847/message?error=true

Now go to the APM page in Kibana; the data for the spring-boot-simple application should be visible there.

img

Click the application to see its various performance traces:

img

The current errors can be inspected:

img

JVM monitoring data is also available:

https://bxdc-static.oss-cn-beijing.aliyuncs.com/images/20200706093331.png

Beyond that, alerts can be configured so that you learn about the application's performance problems as soon as they happen.

Summary

With that we have completed full-stack monitoring of a Kubernetes environment with the Elastic Stack: metrics, logs, and performance traces give us insight into every aspect of our applications and speed up troubleshooting.

Troubleshooting

  1. When Kibana reads the password from the Secret, inspecting the password variable inside the Kibana container shows a garbled value; so far this has only been observed when the variable is injected into the Kibana container.

    Workaround: set the container variable to the password value directly instead of referencing the Secret.
  2. When ES creates indices automatically from the index template, too many default tags/fields are generated; this can be reduced by modifying the index template.
