In practice we often need to schedule Pods onto specific Nodes. This can be done by matching a Node's label (Label) against the Pod's nodeSelector field.
We first attach a label to a node, and then reference that label in the Pod definition to pin the Pod to the labeled nodes:

```shell
kubectl label node <node-name> <key>=<value>
```
```shell
[root@k8s-master01 ~]# kubectl get node --show-labels
NAME           STATUS   ROLES    AGE    VERSION   LABELS
k8s-master01   Ready    <none>   6d6h   v1.19.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master01,kubernetes.io/os=linux,node.kubernetes.io/node=
k8s-master02   Ready    <none>   6d6h   v1.19.0   app=linux,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master02,kubernetes.io/os=linux,node.kubernetes.io/node=
k8s-node01     Ready    <none>   6d6h   v1.19.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,ds=true,ingress=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,node.kubernetes.io/node=,xinyuan=liu
k8s-node02     Ready    <none>   6d6h   v1.19.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,ds=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node02,kubernetes.io/os=linux,node.kubernetes.io/node=,zone=foo
[root@k8s-master01 ~]#
[root@k8s-master01 ~]# kubectl label node k8s-node01 itax=app
```
Configuring the YAML file
Define the node label match under spec.template.spec.nodeSelector:
```
[root@k8s-master01 ~]# cat deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      containers:
      - name: nginxweb1
        image: nginx:1.15-alpine
      nodeSelector:   # schedule onto nodes carrying this label
        itax: app     # label key and value
```
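The matching rule behind nodeSelector is a plain subset check: every key/value pair in the Pod's nodeSelector must be present in the node's label map. A minimal sketch (plain Python, not Kubernetes source code; the label data is made up):

```python
def matches_node_selector(node_labels, node_selector):
    # nodeSelector matches only if every requested key has exactly
    # the requested value on the node
    return all(node_labels.get(k) == v for k, v in node_selector.items())

node_labels = {"kubernetes.io/hostname": "k8s-node01", "itax": "app"}
print(matches_node_selector(node_labels, {"itax": "app"}))    # True
print(matches_node_selector(node_labels, {"itax": "other"}))  # False
```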
Note: nodeSelector is a hard requirement; if no node carries a matching label, the Pod stays Pending.
NodeAffinity (node affinity scheduling) supports two scheduling policies:
requiredDuringSchedulingIgnoredDuringExecution
Hard affinity: the Pod is scheduled onto a Node only if the defined rules are satisfied.
preferredDuringSchedulingIgnoredDuringExecution
Soft affinity: the scheduler tries to place the Pod on a Node that satisfies the rules; if none does, it falls back to its scoring algorithm and picks the most suitable node.
Affinity is defined under the Pod template's spec (spec.template.spec):
```
[root@k8s-master01 ~]# cat deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: zone        # label key
                operator: In     # the node's value must be in `values`; multiple expressions may be listed
                values:
                - foo            # label value
            weight: 1
      containers:
      - name: nginxweb
        image: nginx:1.15-alpine
```
This Pod uses soft affinity: the scheduler will try to place it on a node whose label zone has the value foo.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: zone
                operator: In
                values:
                - foo
            weight: 1
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: zone
                operator: In
                values:
                - bar1
      containers:
      - name: nginxweb
        image: nginx:1.15-alpine
```
If both are used at the same time, the hard affinity is matched first, and the soft affinity is then applied among the nodes that satisfied the hard rules.
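This hard-then-soft order can be sketched as a toy model (plain Python, not the real scheduler; the node names and labels are invented):

```python
def filter_required(nodes, key, allowed):
    # Hard affinity: keep only nodes whose label value for `key` is allowed.
    return [n for n in nodes if n["labels"].get(key) in allowed]

def score_preferred(node, prefs):
    # Soft affinity: sum the weight of every preference the node satisfies.
    return sum(p["weight"] for p in prefs
               if node["labels"].get(p["key"]) in p["values"])

nodes = [
    {"name": "k8s-node01", "labels": {"zone": "bar1"}},
    {"name": "k8s-node02", "labels": {"zone": "bar1", "app": "linux"}},
    {"name": "k8s-node03", "labels": {"zone": "foo"}},
]
feasible = filter_required(nodes, "zone", {"bar1"})           # hard rule first
prefs = [{"key": "app", "values": {"linux"}, "weight": 1}]
best = max(feasible, key=lambda n: score_preferred(n, prefs))  # then soft rule
print(best["name"])  # k8s-node02
```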
Pod affinity and anti-affinity (mutual exclusion)
These let users constrain which nodes a Pod can run on from another angle: Pod affinity and anti-affinity are evaluated against the labels of Pods already running, rather than against node labels.
Note that Pod affinity and anti-affinity are matched within a namespace.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      containers:
      - name: nginxweb1
        image: nginx:1.15-alpine
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mynginx
            topologyKey: kubernetes.io/hostname  # nodes with the same hostname value form one topology domain
```
This deploys the Pod onto the node where Pods labeled app=mynginx are already running.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      containers:
      - name: nginxweb1
        image: nginx:1.15-alpine
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mynginx
            topologyKey: kubernetes.io/hostname  # nodes with the same hostname value form one topology domain
```
Anti-affinity inverts the rule: this Pod will not be placed on a node where a Pod labeled app=mynginx is already running, so the replicas spread across different nodes.
Node affinity, whether in its hard (required) or soft (preferred) form, pulls Pods toward desired nodes. Taints are exactly the opposite: if a node is marked with a taint, Pods are not scheduled onto it unless they are marked as tolerating that taint. (Taints and tolerations were in beta at the time of writing.) Typical scenarios: reserving the Kubernetes master nodes for Kubernetes system components, or dedicating a group of nodes with special resources to particular Pods. Pods are not scheduled onto tainted nodes they do not tolerate.
To dedicate a set of nodes to particular users, add a taint to those nodes (run: kubectl taint nodes nodename dedicated=groupName:NoSchedule) and then add the matching toleration to their Pods (the easiest way to automate this is a custom admission controller). Pods that carry the toleration can then be scheduled onto the tainted nodes just as they would be onto any other node in the cluster.
A taint is a key/value attribute defined on a node. Its main purpose is to let the node repel Pods (those that cannot tolerate the taint), so the Pods we do want there must declare tolerations for the node's taints.
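Toleration matching can be sketched roughly as follows (a simplified Python model of the Equal/Exists operators, not the Kubernetes implementation; it ignores edge cases such as an empty key or empty effect, which in the real API act as wildcards):

```python
def tolerates(toleration, taint):
    # Effect and key must match; "Exists" ignores the value,
    # "Equal" additionally requires the value to match.
    if toleration["effect"] != taint["effect"]:
        return False
    if toleration["key"] != taint["key"]:
        return False
    if toleration["operator"] == "Exists":
        return True
    return toleration["value"] == taint["value"]

taint = {"key": "node-type", "value": "production", "effect": "NoSchedule"}
print(tolerates({"key": "node-type", "operator": "Equal",
                 "value": "production", "effect": "NoSchedule"}, taint))  # True
print(tolerates({"key": "node-type", "operator": "Equal",
                 "value": "dev", "effect": "NoSchedule"}, taint))         # False
```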
How a taint's effect determines the result for Pods
Setting taints on a node:

```shell
[root@k8s-master01 ~]# kubectl taint node k8s-master01 node-type=production:NoSchedule  # existing Pods on the node are NOT evicted
[root@k8s-master01 ~]# kubectl taint node k8s-master01 node-type=production:NoExecute   # existing Pods on the node ARE evicted
```
Configuring the Pod's toleration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginxapp
  labels:
    app: nginx-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mynginx
  template:
    metadata:
      labels:
        app: mynginx
    spec:
      containers:
      - name: nginxweb1
        image: nginx:1.15-alpine
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mynginx
            topologyKey: kubernetes.io/hostname  # nodes with the same hostname value form one topology domain
      tolerations:
      - key: "node-type"     # taint key
        operator: "Equal"    # match operator
        value: "production"  # taint value
        effect: "NoSchedule"
```
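For the NoExecute effect, a toleration may also set tolerationSeconds, which bounds how long an already-running Pod stays on the node after the taint appears. A sketch, reusing the node-type taint from above:

```yaml
tolerations:
- key: "node-type"
  operator: "Equal"
  value: "production"
  effect: "NoExecute"
  tolerationSeconds: 3600  # the Pod is evicted 3600s after the taint is added
```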
The overall scheduling flow is roughly as follows. First, a client creates a pod/service/deployment/job through the API Server's REST API (via kubectl, helm, etc.; supported payload types are mainly JSON/YAML/helm tgz). The API Server receives the request and persists the objects to etcd. The scheduler then queries, via the API Server, the list of Pods not yet bound to a node and loops over them, trying to assign a node to each Pod.
Scheduling happens in two phases:
Phase 1: filtering (predicates). The scheduler uses a set of rules to filter out nodes that do not meet the Pod's requirements. For example, if the Pod requests a certain amount of resources, nodes with less free capacity than requested are filtered out.
Phase 2: scoring (priorities). The nodes that passed filtering are scored. During scoring the scheduler applies global optimization policies, such as spreading the replicas of the same Replication Controller across different nodes, or preferring the least-loaded node.
Binding: the highest-scoring node is selected, a binding operation is performed, and the result is stored in etcd. The kubelet on the chosen node then creates the Pod according to the scheduling result.
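The two phases can be sketched as a toy filter-then-score loop (plain Python, not the real scheduler; the resource model is reduced to a single made-up free_cpu number):

```python
def schedule(pod, nodes):
    # Phase 1: predicates -- filter out nodes with too little free CPU.
    feasible = [n for n in nodes if n["free_cpu"] >= pod["cpu"]]
    if not feasible:
        return None  # no node fits; the Pod stays Pending
    # Phase 2: priorities -- prefer the least-loaded (most free CPU) node.
    return max(feasible, key=lambda n: n["free_cpu"])["name"]

nodes = [
    {"name": "k8s-node01", "free_cpu": 2},
    {"name": "k8s-node02", "free_cpu": 6},
]
print(schedule({"cpu": 4}, nodes))   # k8s-node02
print(schedule({"cpu": 10}, nodes))  # None -- Pod would stay Pending
```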