An adventure with Kubernetes evictions

1. Overview

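  When Kubernetes eviction checks for DiskPressure, what it measures is the filesystem that holds the kubelet's root-dir. The kubelet's default root-dir is /var/lib/kubelet, and it can be changed with the --root-dir flag. Source: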

    kubernetes/cmd/kubelet/app/options/options.go

   

const defaultRootDir = "/var/lib/kubelet"

fs.StringVar(&f.RootDirectory, "root-dir", f.RootDirectory, "Directory path for managing kubelet files (volume mounts,etc).")

 kubernetes/pkg/kubelet/eviction/helpers.go

  

// diskUsage converts used bytes into a resource quantity.
func diskUsage(fsStats *statsapi.FsStats) *resource.Quantity {
	if fsStats == nil || fsStats.UsedBytes == nil {
		return &resource.Quantity{Format: resource.BinarySI}
	}
	usage := int64(*fsStats.UsedBytes)
	return resource.NewQuantity(usage, resource.BinarySI)
}

// rankDiskPressureFunc returns a rankFunc that measures the specified fs stats.
func rankDiskPressureFunc(fsStatsToMeasure []fsStatsType, diskResource v1.ResourceName) rankFunc {
	return func(pods []*v1.Pod, stats statsFunc) {
		orderedBy(exceedDiskRequests(stats, fsStatsToMeasure, diskResource), priority, disk(stats, fsStatsToMeasure, diskResource)).Sort(pods)
	}
}

// Excerpt: the nodefs.available signal is filled in from the node's Fs stats.
if nodeFs := summary.Node.Fs; nodeFs != nil {
	if nodeFs.AvailableBytes != nil && nodeFs.CapacityBytes != nil {
		result[evictionapi.SignalNodeFsAvailable] = signalObservation{
			available: resource.NewQuantity(int64(*nodeFs.AvailableBytes), resource.BinarySI),
			capacity:  resource.NewQuantity(int64(*nodeFs.CapacityBytes), resource.BinarySI),
			time:      nodeFs.Time,
		}
	}
}

type NodeStats struct {
	// Reference to the measured Node.
	NodeName string `json:"nodeName"`
	// Stats of system daemons tracked as raw containers.
	// The system containers are named according to the SystemContainer* constants.
	// +optional
	// +patchMergeKey=name
	// +patchStrategy=merge
	SystemContainers []ContainerStats `json:"systemContainers,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
	// The time at which data collection for the node-scoped (i.e. aggregate) stats was (re)started.
	StartTime metav1.Time `json:"startTime"`
	// Stats pertaining to CPU resources.
	// +optional
	CPU *CPUStats `json:"cpu,omitempty"`
	// Stats pertaining to memory (RAM) resources.
	// +optional
	Memory *MemoryStats `json:"memory,omitempty"`
	// Stats pertaining to network resources.
	// +optional
	Network *NetworkStats `json:"network,omitempty"`
	// Stats pertaining to total usage of filesystem resources on the rootfs used by node k8s components.
	// NodeFs.Used is the total bytes used on the filesystem.
	// +optional
	Fs *FsStats `json:"fs,omitempty"`
	// Stats about the underlying container runtime.
	// +optional
	Runtime *RuntimeStats `json:"runtime,omitempty"`
	// Stats about the rlimit of system.
	// +optional
	Rlimit *RlimitStats `json:"rlimit,omitempty"`
}

 

2. The incident

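   This happened a few months ago. Someone changed the fluentd pattern. fluentd is deployed as a DaemonSet with a hostPath mount of /var/log, and its own logs are written to syslog. As a result, every log line that did not match the new pattern was written to /var/log/syslog, more than 7 GB in a single hour. Disk usage then climbed straight to 90%, and the eviction policy we had configured in the kubelet was as follows: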

  

evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%

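Once usage of the disk holding the kubelet's root-dir reached 90%, pods started to be evicted. fluentd itself reported no error; the pattern simply did not match, so the logs were written to syslog. When using fluentd, be sure to configure both the log output path and the log level explicitly.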
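The threshold comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the kubelet's implementation: `belowPercentThreshold` is a hypothetical helper, the 100 GiB capacity is made up, and the 10% value mirrors the nodefs.available setting shown earlier.

```go
package main

import "fmt"

// belowPercentThreshold reports whether the available space has dropped
// below thresholdPercent of the filesystem's capacity.
// Hypothetical helper for illustration only.
func belowPercentThreshold(availableBytes, capacityBytes uint64, thresholdPercent float64) bool {
	if capacityBytes == 0 {
		return false // no capacity reported, nothing to compare against
	}
	thresholdBytes := float64(capacityBytes) * thresholdPercent / 100
	return float64(availableBytes) < thresholdBytes
}

func main() {
	const gib = uint64(1) << 30
	capacity := 100 * gib // assumed disk size, not taken from the incident

	// 12 GiB free is still above a 10% threshold.
	fmt.Println(belowPercentThreshold(12*gib, capacity, 10))
	// 9 GiB free is below a 10% threshold.
	fmt.Println(belowPercentThreshold(9*gib, capacity, 10))
}
```

With a percentage threshold, the absolute amount of free space that triggers eviction scales with the disk size, which is worth keeping in mind when nodes have differently sized system disks.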

 

3. Aftermath

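We reached the conclusion above by reading the source code, then restored service as an emergency measure. (The root cause on the monitoring side: the alert threshold for the system disk had not been reduced by the eviction threshold configured in the kubelet.) We then re-planned the monitoring thresholds and assigned attributes to the production nodes so that different workloads are deployed on different nodes.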
