In-Depth Analysis of Kubernetes Critical Pod (Part 1) covered how the Scheduler handles Critical Pods. Here we look at how the Kubelet Eviction Manager handles them, to find out whether Critical Pods are protected when the kubelet evicts pods and, if so, how.
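In syncLoop, the kubelet calls syncLoopIteration in a loop, once every 1s. Each iteration picks up an event from one of five channels (the config change channel, the PLEG channel, the sync channel, the housekeeping channel, or the liveness manager's update channel) and then invokes the matching event handler.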
Note in particular that the housekeeping channel delivers an event once per housekeeping period (10s), which in turn runs HandlePodCleanups to perform cleanup work.
pkg/kubelet/kubelet.go:1753

    func (kl *Kubelet) syncLoopIteration(configCh <-chan kubetypes.PodUpdate, handler SyncHandler,
        syncCh <-chan time.Time, housekeepingCh <-chan time.Time, plegCh <-chan *pleg.PodLifecycleEvent) bool {
        select {
        case u, open := <-configCh:
            if !open {
                glog.Errorf("Update channel is closed. Exiting the sync loop.")
                return false
            }

            switch u.Op {
            case kubetypes.ADD:
                handler.HandlePodAdditions(u.Pods)
            ...
            case kubetypes.RESTORE:
                glog.V(2).Infof("SyncLoop (RESTORE, %q): %q", u.Source, format.Pods(u.Pods))
                // These are pods restored from the checkpoint. Treat them as new
                // pods.
                handler.HandlePodAdditions(u.Pods)
            ...
            }

            if u.Op != kubetypes.RESTORE {
                ...
            }
        case e := <-plegCh:
            ...
        case <-syncCh:
            ...
        case update := <-kl.livenessManager.Updates():
            ...
        case <-housekeepingCh:
            ...
        }
        return true
    }
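syncLoopIteration also defines what happens after the kubelet restarts following a configuration change: the kubelet runs admission on the Pods that are already running, and the result may be that a Pod is rejected by this node.
HandlePodAdditions is the handler that processes events from the kubelet's configCh.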
    // HandlePodAdditions is the callback in SyncHandler for pods being added from a config source.
    func (kl *Kubelet) HandlePodAdditions(pods []*v1.Pod) {
        start := kl.clock.Now()
        sort.Sort(sliceutils.PodsByCreationTime(pods))
        for _, pod := range pods {
            ...
            if !kl.podIsTerminated(pod) {
                ...
                // Check if we can admit the pod; if not, reject it.
                if ok, reason, message := kl.canAdmitPod(activePods, pod); !ok {
                    kl.rejectPod(pod, reason, message)
                    continue
                }
            }
            ...
        }
    }
If the Pod's status is not Terminated, canAdmitPod is called to run an admission check on it. If the check rejects the Pod, the Pod's Phase is set to Failed.
pkg/kubelet/kubelet.go:1643

    func (kl *Kubelet) canAdmitPod(pods []*v1.Pod, pod *v1.Pod) (bool, string, string) {
        // the kubelet will invoke each pod admit handler in sequence
        // if any handler rejects, the pod is rejected.
        // TODO: move out of disk check into a pod admitter
        // TODO: out of resource eviction should have a pod admitter call-out
        attrs := &lifecycle.PodAdmitAttributes{Pod: pod, OtherPods: pods}
        for _, podAdmitHandler := range kl.admitHandlers {
            if result := podAdmitHandler.Admit(attrs); !result.Admit {
                return false, result.Reason, result.Message
            }
        }

        return true, "", ""
    }
canAdmitPod invokes the series of admitHandlers registered at kubelet startup to run the admission check on the Pod. One of them is the admit handler of the kubelet eviction manager.
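The "first rejection wins" chain in canAdmitPod can be illustrated with a small standalone sketch. The admitResult and admitHandler types below are simplified stand-ins for lifecycle.PodAdmitResult and lifecycle.PodAdmitHandler, pods are reduced to plain names, and the two handlers are hypothetical examples rather than anything the kubelet registers.

```go
package main

import "fmt"

// admitResult is a simplified stand-in for lifecycle.PodAdmitResult.
type admitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// admitHandler is a simplified stand-in for lifecycle.PodAdmitHandler.
type admitHandler func(pod string) admitResult

// canAdmit runs the handlers in order and stops at the first rejection.
func canAdmit(handlers []admitHandler, pod string) (bool, string, string) {
	for _, h := range handlers {
		if r := h(pod); !r.Admit {
			return false, r.Reason, r.Message
		}
	}
	return true, "", ""
}

func main() {
	allowAll := func(string) admitResult { return admitResult{Admit: true} }
	// Hypothetical handler that rejects every pod, as a node under pressure might.
	underPressure := func(pod string) admitResult {
		return admitResult{Reason: "Evicted", Message: "node has conditions: [MemoryPressure]"}
	}

	fmt.Println(canAdmit([]admitHandler{allowAll}, "web"))
	fmt.Println(canAdmit([]admitHandler{allowAll, underPressure}, "web"))
}
```

Because evaluation stops at the first rejection, a pod sees only one reason and message even when several handlers would have refused it.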
pkg/kubelet/eviction/eviction_manager.go:123

    // Admit rejects a pod if its not safe to admit for node stability.
    func (m *managerImpl) Admit(attrs *lifecycle.PodAdmitAttributes) lifecycle.PodAdmitResult {
        m.RLock()
        defer m.RUnlock()
        if len(m.nodeConditions) == 0 {
            return lifecycle.PodAdmitResult{Admit: true}
        }

        if utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) && kubelettypes.IsCriticalPod(attrs.Pod) {
            return lifecycle.PodAdmitResult{Admit: true}
        }

        if hasNodeCondition(m.nodeConditions, v1.NodeMemoryPressure) {
            notBestEffort := v1.PodQOSBestEffort != v1qos.GetPodQOS(attrs.Pod)
            if notBestEffort {
                return lifecycle.PodAdmitResult{Admit: true}
            }
        }

        return lifecycle.PodAdmitResult{
            Admit:   false,
            Reason:  reason,
            Message: fmt.Sprintf(message, m.nodeConditions),
        }
    }
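The Admit logic of the eviction manager, as the code above shows, is as follows. If the node currently has no conditions, the Pod is admitted. If the ExperimentalCriticalPodAnnotation feature gate is enabled and the Pod is a Critical Pod, it is admitted regardless of the node's conditions. If the node is under MemoryPressure and the Pod's QoS class is not BestEffort, it is admitted. In every other case the Pod is rejected, with a message that lists the node's current conditions.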
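The order of checks in the Admit code above can be condensed into a small decision function. This is a sketch under simplifying assumptions: the pod struct, the string-based node conditions, and the criticalGate parameter are all hypothetical stand-ins for v1.Pod, v1.NodeConditionType, and the ExperimentalCriticalPodAnnotation feature gate.

```go
package main

import "fmt"

type qosClass string

const (
	guaranteed qosClass = "Guaranteed"
	burstable  qosClass = "Burstable"
	bestEffort qosClass = "BestEffort"
)

// pod carries only what this sketch needs; both fields are hypothetical
// simplifications of what IsCriticalPod and GetPodQOS derive from a v1.Pod.
type pod struct {
	Critical bool
	QOS      qosClass
}

// admit follows the same order of checks as the eviction manager's Admit.
func admit(nodeConditions []string, criticalGate bool, p pod) bool {
	// A healthy node admits everything.
	if len(nodeConditions) == 0 {
		return true
	}
	// Critical pods bypass the node conditions when the gate is on.
	if criticalGate && p.Critical {
		return true
	}
	// Under memory pressure, only BestEffort pods are turned away.
	for _, c := range nodeConditions {
		if c == "MemoryPressure" && p.QOS != bestEffort {
			return true
		}
	}
	return false
}

func main() {
	pressure := []string{"MemoryPressure"}
	fmt.Println(admit(nil, false, pod{QOS: bestEffort}))
	fmt.Println(admit(pressure, false, pod{QOS: guaranteed}))
	fmt.Println(admit(pressure, false, pod{QOS: bestEffort}))
	fmt.Println(admit(pressure, true, pod{Critical: true, QOS: bestEffort}))
}
```

Note that being critical only helps when the feature gate is enabled; with the gate off, a critical BestEffort pod is treated like any other BestEffort pod.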
In addition, the eviction manager's own sync loop (synchronize) also treats Critical Pods specially, as the following code shows.
pkg/kubelet/eviction/eviction_manager.go:226

    // synchronize is the main control loop that enforces eviction thresholds.
    // Returns the pod that was killed, or nil if no pod was killed.
    func (m *managerImpl) synchronize(diskInfoProvider DiskInfoProvider, podFunc ActivePodsFunc) []*v1.Pod {
        ...

        // we kill at most a single pod during each eviction interval
        for i := range activePods {
            pod := activePods[i]
            if utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) &&
                kubelettypes.IsCriticalPod(pod) && kubepod.IsStaticPod(pod) {
                continue
            }
            ...
            return []*v1.Pod{pod}
        }
        glog.Infof("eviction manager: unable to evict any pods from the node")
        return nil
    }
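When the kubelet triggers a pod eviction, a pod is not killed by the eviction manager if it meets all of the following conditions: the ExperimentalCriticalPodAnnotation feature gate is enabled, the pod is a Critical Pod, and the pod is a static Pod.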
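The skip rule in synchronize can be sketched as a victim-selection function. This is a simplified illustration, not the eviction manager itself: the pod struct and the criticalGate parameter are hypothetical, the input is assumed to be already ranked for eviction, and the actual kill step that the original code elides is left out.

```go
package main

import "fmt"

// pod is a hypothetical reduction of v1.Pod to the properties checked
// by kubelettypes.IsCriticalPod and kubepod.IsStaticPod.
type pod struct {
	Name     string
	Critical bool
	Static   bool
}

// pickVictim returns the first pod in ranked order that may be evicted.
// Pods that are both critical and static are skipped when the gate is on.
func pickVictim(ranked []pod, criticalGate bool) (pod, bool) {
	for _, p := range ranked {
		if criticalGate && p.Critical && p.Static {
			continue
		}
		return p, true
	}
	return pod{}, false
}

func main() {
	ranked := []pod{
		{Name: "kube-proxy", Critical: true, Static: true},
		{Name: "kube-dns", Critical: true, Static: false},
		{Name: "web", Critical: false, Static: false},
	}
	fmt.Println(pickVictim(ranked, true))
	fmt.Println(pickVictim(ranked, false))
}
```

The second entry shows the limit of the protection: a critical pod that is not a static pod is still eligible for eviction.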
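From the analysis above, we get the following key points about how the Kubelet Eviction Manager handles Critical Pods:
After the kubelet restarts, the eviction manager's Admit flow treats Critical Pods specially: if the ExperimentalCriticalPodAnnotation feature gate is enabled, the Critical Pod is admitted to the node regardless of the node's conditions.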
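When the kubelet triggers a pod eviction, a Critical Pod is not killed by the eviction manager if it meets all of the following conditions: the ExperimentalCriticalPodAnnotation feature gate is enabled, and the Pod is also a static Pod.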