systemctl status probe failures in Kubernetes pods

In the glusterd container service used by Heketi, a systemctl-based probe checks whether the glusterfs service is available, and that probe kept failing.

Investigation showed that on Ubuntu 18.04 the output of systemctl status glusterd.service is not what the K8s livenessProbe expects, so the probe hung until it timed out.

  • systemctl status glusterd.service does not report the real state of the service to the probe: it hangs, times out, and returns an error status code.
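The way a caller with a deadline sees a hanging check can be reproduced outside Kubernetes. The following is a minimal sketch, assuming GNU coreutils timeout is installed; run_bounded is a hypothetical helper name, and sleep 5 stands in for a command that hangs, so the sketch runs on machines without systemd.

```shell
#!/bin/bash
# run_bounded is a hypothetical helper for this sketch, not part of systemd or Kubernetes.
# It runs a command with a deadline in seconds; timeout(1) exits with 124 when the deadline is hit.
run_bounded() {
    local seconds="$1"
    shift
    timeout "$seconds" "$@"
}

# Stand-in for a hanging check: sleep 5 cannot finish within 1 second.
run_bounded 1 sleep 5
hang_status=$?
echo "exit status: $hang_status"   # prints "exit status: 124"
```

On a real node the same helper could wrap the actual check, for example run_bounded 3 systemctl status glusterd.service, where 3 mirrors the timeoutSeconds value used by the probe.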

The following command reports the real state of the service correctly:

systemctl is-active --quiet glusterd.service; echo $?; 

Or (something similar):

systemctl is-active sshd >/dev/null 2>&1 && echo 0 || echo 1

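Output:

  • 0 when the service is active;
  • a non-zero error code otherwise.

An exec probe is judged by the exit status of its command, so in the probe definition systemctl is-active --quiet must be the last command. With a trailing "; echo $?" the command would exit with the status of echo, which is always 0, and the probe could never fail. The probes therefore look like this:

        livenessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - systemctl is-active --quiet glusterd.service
          failureThreshold: 3
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - systemctl is-active --quiet glusterd.service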
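Whether a probe command passes depends only on its exit status, which can be checked by hand. The sketch below compares the two command shapes; fake_is_active is a hypothetical stand-in that returns 3, the status systemctl is-active typically gives for an inactive unit, so the sketch runs without systemd.

```shell
#!/bin/bash
# Hypothetical stand-in for "systemctl is-active --quiet <unit>" on an inactive unit.
fake_is_active() { return 3; }

# Shape 1: the check followed by "echo $?". The subshell exits with the status of echo.
( fake_is_active; echo $? >/dev/null )
status_with_echo=$?

# Shape 2: the check is the last command. The subshell exits with the status of the check.
( fake_is_active )
status_plain=$?

echo "with echo: $status_with_echo, plain: $status_plain"   # prints "with echo: 0, plain: 3"
```

Only the second shape lets a failed check reach the caller as a non-zero exit status.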

The modified k8s YAML file is as follows:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: glusterfs-daemon
  namespace: gluster
  labels:
    k8s-app: glusterfs-node
spec:
  selector:
    matchLabels:
      name: glusterfs-daemon
  template:
    metadata:
      labels:
        name: glusterfs-daemon
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - image: gluster/gluster-centos:latest
        imagePullPolicy: IfNotPresent
        name: glusterfs
        livenessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - systemctl is-active --quiet glusterd.service
          failureThreshold: 3
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - systemctl is-active --quiet glusterd.service
          failureThreshold: 3
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources: {}
        securityContext:
          capabilities: {}
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/heketi
          name: glusterfs-heketi
        - mountPath: /run
          name: glusterfs-run
        - mountPath: /run/lvm
          name: glusterfs-lvm
        - mountPath: /etc/glusterfs
          name: glusterfs-etc
        - mountPath: /var/log/glusterfs
          name: glusterfs-logs
        - mountPath: /var/lib/glusterd
          name: glusterfs-config
        - mountPath: /dev
          name: glusterfs-dev
        - mountPath: /sys/fs/cgroup
          name: glusterfs-cgroup
      dnsPolicy: ClusterFirst
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /var/lib/heketi
          type: ""
        name: glusterfs-heketi
      - emptyDir: {}
        name: glusterfs-run
      - hostPath:
          path: /run/lvm
          type: ""
        name: glusterfs-lvm
      - hostPath:
          path: /etc/glusterfs
          type: ""
        name: glusterfs-etc
      - hostPath:
          path: /var/log/glusterfs
          type: ""
        name: glusterfs-logs
      - hostPath:
          path: /var/lib/glusterd
          type: ""
        name: glusterfs-config
      - hostPath:
          path: /dev
          type: ""
        name: glusterfs-dev
      - hostPath:
          path: /sys/fs/cgroup
          type: ""
        name: glusterfs-cgroup

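The systemd version, and with it the available options, may differ between Linux distributions and releases; run systemctl --help to see the help for the installed version.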
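A quick way to see which systemd version a node or container image ships is sketched below. It assumes nothing beyond a POSIX-style shell and head; version_line is a name chosen for this example, and the fallback message covers machines where systemctl is not installed.

```shell
#!/bin/bash
# Report the first line of "systemctl --version" (for example "systemd 237 ..."),
# or a fallback message when systemctl is not on the PATH.
if command -v systemctl >/dev/null 2>&1; then
    version_line="$(systemctl --version | head -n 1)"
else
    version_line="systemctl not found"
fi
echo "$version_line"
```

Running it both on the host and inside the glusterfs container shows whether the two environments use the same systemd release.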
