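About the author: 劉海平 (HappyLau), senior cloud computing consultant. He currently works on public cloud at Tencent Cloud and previously worked at Kugou and EasyStack. He has many years of experience in the architecture design, operations and delivery of public and private clouds, and took part in building large private cloud platforms for Kugou, China Southern Power Grid and Guotai Junan. He is proficient in open-source technologies such as Linux, Kubernetes, OpenStack and Ceph, has extensive hands-on experience in cloud computing, and has taught RHCA, OpenStack and Linux courses.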
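The previous article, "Kubernetes Tutorial Series (6): Kubernetes Resource Management and Quality of Service", introduced resource scheduling and quality of service (QoS) in Kubernetes: how to define a pod's resources, how those resources affect scheduling, and which QoS class a pod receives once resources are set. This article continues the series with the pod scheduling mechanism.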
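Kubernetes is a container orchestration engine, and one of its most important functions is scheduling containers. kube-scheduler schedules pods fully automatically. Scheduling a pod consists of two cycles: the scheduling cycle and the binding cycle. The scheduling cycle is subdivided into filtering (filter) and scoring (weight): following the configured scheduling policy, the scheduler selects the nodes that are able to run the pod and then ranks them. In the binding cycle, once kube-scheduler has chosen a node for the pod, the kubelet on that node, which watches for pods assigned to it, runs the pod.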
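The filter-then-score flow described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not kube-scheduler's actual algorithm: the Node and Pod classes, the resource figures and the "most free resources left" scoring rule are all hypothetical, and the binding cycle is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millicores: int   # hypothetical figure: CPU not yet reserved
    free_memory_mib: int       # hypothetical figure: memory not yet reserved
    schedulable: bool = True


@dataclass
class Pod:
    name: str
    cpu_millicores: int
    memory_mib: int


def filter_nodes(pod, nodes):
    """Filter step: keep only the nodes that are able to run the pod at all."""
    return [
        n for n in nodes
        if n.schedulable
        and n.free_cpu_millicores >= pod.cpu_millicores
        and n.free_memory_mib >= pod.memory_mib
    ]


def score_node(pod, node):
    """Score step: toy rule, the larger the share of free resources left, the better."""
    cpu_left = (node.free_cpu_millicores - pod.cpu_millicores) / max(node.free_cpu_millicores, 1)
    mem_left = (node.free_memory_mib - pod.memory_mib) / max(node.free_memory_mib, 1)
    return round(50 * (cpu_left + mem_left))


def schedule(pod, nodes):
    """Scheduling cycle: filter, score, then return the name of the best node."""
    feasible = filter_nodes(pod, nodes)
    if not feasible:
        return None  # nothing fits, so the pod would stay Pending
    return max(feasible, key=lambda n: score_node(pod, n)).name


nodes = [
    Node("node-1", 500, 1024, schedulable=False),
    Node("node-2", 1000, 2048),
    Node("node-3", 4000, 8192),
]
print(schedule(Pod("nginx", 500, 512), nodes))  # node-3
```

Keeping the two steps as separate functions mirrors the split in the text: filtering is a yes/no decision per node, while scoring only ranks the nodes that survived the filter.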
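The scheduling cycle therefore has a predicate (filtering) step and a scoring (priority) step: predicates select the nodes that meet the pod's conditions, and scoring rates those nodes and sorts them so that the best match comes first. Kubernetes provides a set of built-in predicate algorithms for the first step.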
Predicates check conditions on the node. These conditions can be viewed with kubectl describe node node-id (shown as a figure in the original article).
Kubernetes likewise provides a set of built-in priority (scoring) algorithms.
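nodeName is a field of PodSpec. Setting pod.spec.nodeName schedules the pod onto one specific node. The field is special in that it is normally empty; if nodeName is set, kube-scheduler skips the pod entirely and the kubelet on the named node starts it directly. Scheduling by nodeName does not use the cluster's intelligent scheduling, and pinning pods this way can leave resources unevenly distributed across nodes. It is therefore advisable to give such pods the Guaranteed QoS class so that they are not evicted when resources become unbalanced. The following example creates a pod that runs on node-3: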
[root@node-1 demo]# cat nginx-nodeName.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-run-on-nodename
  annotations:
    kubernetes.io/description: "Running the Pod on specific nodeName"
spec:
  containers:
  - name: nginx-run-on-nodename
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
  nodeName: node-3    # nodeName pins nginx-run-on-nodename to the specific node node-3
[root@node-1 demo]# kubectl apply -f nginx-nodeName.yaml
pod/nginx-run-on-nodename created

[root@node-1 demo]# kubectl get pods nginx-run-on-nodename -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
nginx-run-on-nodename   1/1     Running   0          6m52s   10.244.2.15   node-3   <none>           <none>
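Why a pod with nodeName set never passes through the scheduler can be illustrated with a small Python sketch. It is a simplified model rather than the real implementation: pods are plain dictionaries, the two helper functions are hypothetical names, and the second pod is invented for contrast.

```python
def needs_scheduling(pod):
    """kube-scheduler only looks at pods whose spec.nodeName is still empty."""
    return not pod.get("spec", {}).get("nodeName")


def pods_for_kubelet(pods, node_name):
    """The kubelet on a node runs the pods whose spec.nodeName names that node."""
    return [
        p["metadata"]["name"] for p in pods
        if p.get("spec", {}).get("nodeName") == node_name
    ]


pods = [
    {"metadata": {"name": "nginx-run-on-nodename"}, "spec": {"nodeName": "node-3"}},
    {"metadata": {"name": "nginx-unscheduled"}, "spec": {}},  # hypothetical pod
]

print([p["metadata"]["name"] for p in pods if needs_scheduling(p)])  # ['nginx-unscheduled']
print(pods_for_kubelet(pods, "node-3"))                               # ['nginx-run-on-nodename']
```

Because no filtering or scoring takes place for the first pod, nothing checks whether node-3 has enough free resources for it, which is the reason the text recommends the Guaranteed QoS class for pods placed this way.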
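nodeSelector is also a field of PodSpec, and it is the simplest way to run a pod on particular nodes. It works with key/value pairs: the node must carry matching labels, and the pod names those labels when it is scheduled. In the following example, the label app=web is added to node-2, and the pod then selects that label through nodeSelector: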
[root@node-1 demo]# kubectl label node node-2 app=web
node/node-2 labeled

[root@node-1 demo]# kubectl get nodes --show-labels
NAME     STATUS   ROLES    AGE   VERSION   LABELS
node-1   Ready    master   15d   v1.15.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
node-2   Ready    <none>   15d   v1.15.3   app=web,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-2,kubernetes.io/os=linux
node-3   Ready    <none>   15d   v1.15.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-3,kubernetes.io/os=linux
[root@node-1 demo]# cat nginx-nodeselector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-run-on-nodeselector
  annotations:
    kubernetes.io/description: "Running the Pod on specific node by nodeSelector"
spec:
  containers:
  - name: nginx-run-on-nodeselector
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
  nodeSelector:    # nodeSelector schedules the pod onto nodes that carry the given labels
    app: web
[root@node-1 demo]# kubectl apply -f nginx-nodeselector.yaml
pod/nginx-run-on-nodeselector created

[root@node-1 demo]# kubectl get pods nginx-run-on-nodeselector -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-run-on-nodeselector   1/1     Running   0          51s   10.244.1.24   node-2   <none>           <none>
The system predefines a number of built-in labels that identify node properties, such as the CPU architecture (arch), the operating system type and the hostname.
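The matching rule behind nodeSelector is simple: a node qualifies only if it carries every key/value pair listed in the selector. The Python sketch below models that rule with a subset of the labels from the kubectl get nodes --show-labels output above. The function names are hypothetical, and everything else the scheduler considers (taints, free resources and so on) is left out.

```python
# Node labels copied from the cluster output above (beta.* duplicates omitted).
NODE_LABELS = {
    "node-1": {
        "kubernetes.io/arch": "amd64",
        "kubernetes.io/hostname": "node-1",
        "kubernetes.io/os": "linux",
        "node-role.kubernetes.io/master": "",
    },
    "node-2": {
        "app": "web",
        "kubernetes.io/arch": "amd64",
        "kubernetes.io/hostname": "node-2",
        "kubernetes.io/os": "linux",
    },
    "node-3": {
        "kubernetes.io/arch": "amd64",
        "kubernetes.io/hostname": "node-3",
        "kubernetes.io/os": "linux",
    },
}


def matches_node_selector(node_selector, node_labels):
    """A node matches only if it has every selector key with exactly that value."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def candidate_nodes(node_selector):
    """Names of the nodes whose labels satisfy the selector."""
    return [
        name for name, labels in NODE_LABELS.items()
        if matches_node_selector(node_selector, labels)
    ]


print(candidate_nodes({"app": "web"}))                 # ['node-2']
print(candidate_nodes({"kubernetes.io/os": "linux"}))  # ['node-1', 'node-2', 'node-3']
```

The second call selects on a built-in label, so no kubectl label step is needed before such a selector can be used.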
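affinity/anti-affinity serves a similar purpose to nodeSelector but is considerably more expressive, and it is intended to supersede nodeSelector eventually. Its enhancements over nodeSelector include richer matching operators and preferred (soft) rules in addition to required (hard) ones.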
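The following example demonstrates node affinity. requiredDuringSchedulingIgnoredDuringExecution specifies conditions that must be satisfied, and preferredDuringSchedulingIgnoredDuringExecution specifies preferred conditions. The two are applied together: a node must satisfy the required conditions, and among those nodes the ones that also match the preferred conditions are favored.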
[root@node-1 ~]# kubectl get nodes --show-labels
NAME     STATUS   ROLES    AGE   VERSION   LABELS
node-1   Ready    master   15d   v1.15.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
node-2   Ready    <none>   15d   v1.15.3   app=web,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-2,kubernetes.io/os=linux
node-3   Ready    <none>   15d   v1.15.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node-3,kubernetes.io/os=linux
[root@node-1 demo]# cat nginx-node-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-run-node-affinity
  annotations:
    kubernetes.io/description: "Running the Pod on specific node by node affinity"
spec:
  containers:
  - name: nginx-run-node-affinity
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - node-1
            - node-2
            - node-3
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: app
            operator: In
            values: ["web"]
[root@node-1 demo]# kubectl apply -f nginx-node-affinity.yaml
pod/nginx-run-node-affinity created

[root@node-1 demo]# kubectl get pods --show-labels nginx-run-node-affinity -o wide
NAME                      READY   STATUS    RESTARTS   AGE    IP            NODE     NOMINATED NODE   READINESS GATES   LABELS
nginx-run-node-affinity   1/1     Running   0          106s   10.244.1.25   node-2   <none>           <none>            <none>
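The result above (the pod lands on node-2) can be reproduced with a simplified Python model of node affinity. The sketch rests on several assumptions: it implements only the In, NotIn, Exists and DoesNotExist operators, all function names are hypothetical, and it ignores every other input the real scheduler weighs (free resources, taints on the master, other scoring rules), so a preferred rule is only a tendency in a real cluster, not a guarantee.

```python
def match_expression(expr, labels):
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def match_term(term, labels):
    """All expressions inside one term must match; an empty term matches nothing."""
    exprs = term.get("matchExpressions", [])
    return bool(exprs) and all(match_expression(e, labels) for e in exprs)


def required_ok(affinity, labels):
    """Hard rule: at least one nodeSelectorTerm must match (terms are ORed)."""
    required = affinity.get("requiredDuringSchedulingIgnoredDuringExecution")
    if not required:
        return True
    return any(match_term(t, labels) for t in required["nodeSelectorTerms"])


def preferred_score(affinity, labels):
    """Soft rule: add up the weights of every preference the node matches."""
    preferred = affinity.get("preferredDuringSchedulingIgnoredDuringExecution", [])
    return sum(p["weight"] for p in preferred if match_term(p["preference"], labels))


def pick_node(affinity, nodes):
    """Drop nodes that fail the hard rule, then take the highest soft score."""
    feasible = [name for name, labels in nodes.items() if required_ok(affinity, labels)]
    if not feasible:
        return None
    return max(feasible, key=lambda name: preferred_score(affinity, nodes[name]))


# The nodeAffinity section of nginx-node-affinity.yaml as a Python dictionary.
AFFINITY = {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [{"matchExpressions": [
            {"key": "kubernetes.io/hostname", "operator": "In",
             "values": ["node-1", "node-2", "node-3"]},
        ]}],
    },
    "preferredDuringSchedulingIgnoredDuringExecution": [
        {"weight": 1, "preference": {"matchExpressions": [
            {"key": "app", "operator": "In", "values": ["web"]},
        ]}},
    ],
}

NODES = {
    "node-1": {"kubernetes.io/hostname": "node-1"},
    "node-2": {"kubernetes.io/hostname": "node-2", "app": "web"},
    "node-3": {"kubernetes.io/hostname": "node-3"},
}

print(pick_node(AFFINITY, NODES))  # node-2
```

All three nodes pass the required rule, so the decision falls to the preferred rule: only node-2 carries app=web, which gives it the single weight point that puts it ahead of the others.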
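This article introduced the scheduling mechanism in Kubernetes. By default, pods are scheduled fully automatically by kube-scheduler, and the process has two phases: the scheduling phase (filtering, then scoring and ranking) and the binding phase (running the pod on the node). Scheduling can also be influenced manually. There are four ways of doing so, including the nodeName, nodeSelector and node affinity approaches demonstrated above.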
Scheduling framework: https://kubernetes.io/docs/concepts/configuration/scheduling-framework/
Assigning pods to nodes: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/