Notes on fixing a Kubernetes Redis pod stuck in CrashLoopBackOff

Date: 2021-04-23



Background

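The physical machine hosting the Kubernetes server in our lab environment lost power unexpectedly. After the reboot, the Harbor instance deployed with helm failed to start. The first step was to check the status of the Harbor-related containers, several of which, including redis, were in CrashLoopBackOff.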

Solution

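Two of the containers in CrashLoopBackOff recovered once they were deleted with kubectl and recreated. The redis container was the real problem: deleting it did not help.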
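As a minimal sketch of the "delete and let Kubernetes recreate it" step, the helper below picks out the pods whose STATUS is CrashLoopBackOff. It assumes the default column layout of `kubectl get pods --no-headers` (NAME READY STATUS RESTARTS AGE); the function name `crashloop_pods` is made up for illustration.

```shell
# Print the names of pods whose STATUS column is CrashLoopBackOff.
# Reads `kubectl get pods --no-headers` output on stdin
# (assumed columns: NAME READY STATUS RESTARTS AGE).
crashloop_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (review the list first, then delete so the controllers recreate the pods):
#   kubectl get pods --no-headers | crashloop_pods
#   kubectl get pods --no-headers | crashloop_pods | xargs kubectl delete pod
```

Filtering on the status column rather than grepping the whole line avoids matching a pod that merely has the word in its name.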

Check the container's logs:

[root@master ~]# kubectl logs hub-redis-master-0 

Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>

In short: the append-only file (AOF) is corrupted. Redis asks you to back it up and then repair it with the command shown.
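When several pods are crashing at once, it can help to confirm which ones fail for this specific reason. The sketch below simply searches log text for the error message quoted above; the function name `has_aof_corruption` is hypothetical.

```shell
# Succeed (exit status 0) if the log text on stdin contains
# the Redis AOF corruption error quoted in this post.
has_aof_corruption() {
  grep -q 'Bad file format reading the append only file'
}

# Usage:
#   kubectl logs hub-redis-master-0 | has_aof_corruption && echo "AOF is corrupted"
# If the current container has not logged anything yet, the logs of the
# previous (crashed) container are available with:
#   kubectl logs --previous hub-redis-master-0
```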

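The real obstacle is that the pod cannot start, so there is no way to exec into it and run the repair. The first goal is therefore to get the redis container running, which means preventing it from loading the existing appendonly.aof: find the file, rename it, and let redis create a new appendonly.aof.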

Locating appendonly.aof

Next, describe the pod:

[root@master ~]# kubectl describe po hub-redis-master-0

The output contains the information we need:

/bitnami/redis/data   # path of the AOF file inside the container
Volumes:   # PVC information for the redis pod
  redis-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  redis-data-hub-redis-master-0

Identify the PV bound to the redis container's PVC and look at how it was created:

[root@master ~]# kubectl get pv | grep redis
pv006      100Gi      RWO            Recycle          Bound     default/redis-data-hub-redis-master-0 
[root@master ~]# kubectl describe pv pv006
Name:            pv006
Labels:          <none>
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"pv006","namespace":""},"spec":{"accessModes":["ReadWriteOnce"],"capac...
                 pv.kubernetes.io/bound-by-controller=yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    
Status:          Bound
Claim:           default/redis-data-hub-redis-master-0
Reclaim Policy:  Recycle
Access Modes:    RWO
Capacity:        100Gi
Node Affinity:   <none>
Message:         
Source:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    192.168.2.4
    Path:      /volume1/harbor/nfs6
    ReadOnly:  false
Events:        <none>
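The PVC to PV to NFS path lookup above can also be scripted with kubectl's jsonpath output. This is a sketch under the assumption that the PV is NFS-backed, as it is here; the function name `pvc_nfs_source` is hypothetical, and the namespace defaults to `default` to match the claim shown above.

```shell
# Print "server:path" of the NFS export backing a PVC.
# $1: PVC name, $2: namespace (optional, defaults to "default").
pvc_nfs_source() {
  local pvc=$1
  local ns=${2:-default}
  local pv

  # The bound PV's name is recorded in the PVC's spec.volumeName field.
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 1
  if [ -z "$pv" ]; then
    echo "PVC $pvc is not bound to a PV" >&2
    return 1
  fi

  # PVs are cluster-scoped, so no namespace is needed here.
  kubectl get pv "$pv" -o jsonpath='{.spec.nfs.server}:{.spec.nfs.path}'
}

# Usage:
#   pvc_nfs_source redis-data-hub-redis-master-0
```

For a PV that is not NFS-backed, the `.spec.nfs` fields are empty and the function prints only a colon, so the result should be checked before use.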

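The output shows the NFS server and path backing the volume. On the NFS server, I went to that path and renamed appendonly.aof (to appendonly.bak.aof, as the directory listing further down shows). The redis pod came up in the Running state straight away. What remains is to repair the original appendonly.aof.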
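The rename on the NFS server could look roughly like the sketch below. It is written defensively so that an existing backup is never overwritten; the function name `set_aside_aof` is hypothetical, while the file names are the ones that appear in this post.

```shell
# Move a corrupted AOF aside so that Redis starts with a fresh one.
# $1: directory holding appendonly.aof (on the NFS server, the exported path).
set_aside_aof() {
  local dir=$1
  local src="$dir/appendonly.aof"
  local dst="$dir/appendonly.bak.aof"

  if [ ! -f "$src" ]; then
    echo "no AOF found at $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    # Refuse to clobber an earlier backup.
    echo "$dst already exists" >&2
    return 1
  fi
  mv -- "$src" "$dst"
}

# Usage on the NFS server:
#   set_aside_aof /volume1/harbor/nfs6
```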

Repairing appendonly.aof

Open a shell in the container:

[root@master ~]# kubectl exec -it hub-redis-master-0 -- bash
I have no name!@hub-redis-master-0:/$ ls /bitnami/redis/data/
appendonly.aof      appendonly.bak.aof  dump.rdb

Run the repair:

redis-check-aof --fix /bitnami/redis/data/appendonly.bak.aof
0x           10f69: Expected prefix '*', got: '
AOF analyzed: size=10316900, ok_up_to=69481, diff=10247419
This will shrink the AOF from 10316900 bytes, with 10247419 bytes, to 69481 bytes
Continue? [y/N]: y
Successfully truncated AOF
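Note that `--fix` truncates the file at the first bad entry (here from 10316900 bytes down to 69481), and the Redis error message itself recommends making a backup first. A cautious variation, sketched below, keeps an untouched copy before repairing in place. The function name `fix_aof_with_backup` and the `.orig` suffix are assumptions made for illustration.

```shell
# Keep an untouched copy of a damaged AOF, then repair the file in place.
# $1: path to the damaged AOF file.
fix_aof_with_backup() {
  local aof=$1
  local backup="$aof.orig"   # hypothetical backup name

  if [ -e "$backup" ]; then
    echo "backup $backup already exists" >&2
    return 1
  fi
  cp -p -- "$aof" "$backup" || return 1

  # Interactive: redis-check-aof asks for confirmation before truncating.
  redis-check-aof --fix "$aof"
}

# Usage inside the redis container:
#   fix_aof_with_backup /bitnami/redis/data/appendonly.bak.aof
```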

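Now rename the appendonly.aof that is currently in use, rename the repaired file to appendonly.aof, and delete the pod. Kubernetes recreates the redis container automatically, and it loads the repaired file on startup. If other containers are still in CrashLoopBackOff, the likely cause is that redis was not running when they started. Once redis is repaired, delete those containers and Kubernetes will recreate them.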
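The final swap can be sketched as follows. The function name `swap_in_repaired_aof` and the `appendonly.new.aof` name for the set-aside file are hypothetical; the other file names and the pod name come from this post. Anything redis wrote to the fresh AOF since it was restarted ends up in the set-aside file and is not loaded again.

```shell
# Replace the live AOF with the repaired one.
# $1: redis data directory containing appendonly.aof and appendonly.bak.aof.
swap_in_repaired_aof() {
  local dir=$1
  local live="$dir/appendonly.aof"
  local fixed="$dir/appendonly.bak.aof"
  local aside="$dir/appendonly.new.aof"   # hypothetical name for the set-aside file

  if [ ! -f "$fixed" ]; then
    echo "repaired file $fixed not found" >&2
    return 1
  fi
  # Keep the AOF written since the restart instead of deleting it.
  if [ -f "$live" ]; then
    mv -- "$live" "$aside" || return 1
  fi
  mv -- "$fixed" "$live"
}

# Usage inside the redis container, then from the master node:
#   swap_in_repaired_aof /bitnami/redis/data
#   kubectl delete pod hub-redis-master-0
#   kubectl get pod hub-redis-master-0 -w   # watch it come back up
```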