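1. Background
● This article takes a detailed look at how a Deployment behaves during a rolling update.
● The relevant parameters:
livenessProbe: liveness probe. Determines whether the container in a pod is still alive; a container that fails the probe is restarted.
readinessProbe: readiness probe. Determines whether the pod is ready to serve traffic.
maxSurge: the maximum number of pods that may exist above the desired replica count during a rolling update.
maxUnavailable: the maximum number of pods that may be unavailable during a rolling update.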
2. Environment
Component | Version |
---|---|
OS | Ubuntu 18.04.1 LTS |
docker | 18.06.0-ce |
3. Preparing the images and the YAML file
First, prepare two different versions of an image for testing (two versions of an nginx image have already been published on Alibaba Cloud):

    docker pull registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
    docker pull registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1

Both images provide the same service. The only difference is that nginx:delay_v1 waits 20 seconds before it starts nginx.

    root@k8s-master:~# docker run -d --rm -p 10080:80 nginx:v1
    e88097841c5feef92e4285a2448b943934ade5d86412946bc8d86e262f80a050
    root@k8s-master:~# curl http://127.0.0.1:10080
    ----------
    version: v1
    hostname: f5189a5d3ad3
The YAML file:

    root@k8s-master:~# more roll_update.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: update-deployment
    spec:
      replicas: 3
      template:
        metadata:
          labels:
            app: roll-update
        spec:
          containers:
          - name: nginx
            image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
            imagePullPolicy: Always
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
      selector:
        app: roll-update
      ports:
      - protocol: TCP
        port: 10080
        targetPort: 80
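4. livenessProbe and readinessProbe
livenessProbe: liveness probe, mainly used to detect whether a pod's container needs to be restarted.
readinessProbe: readiness probe, used to detect whether a pod is ready to serve traffic.
● During a rolling update, pods are dynamically deleted and then created again. The liveness probe restarts containers that have stopped working, and as soon as the number of pods falls below the desired count, k8s brings up new ones, so there are always enough live pods to provide the service.
● However, while a pod is starting, the service inside it is still coming up and cannot be used yet. If traffic reaches the pod at that moment, the requests fail.
Let's simulate this scenario.
First, apply the configuration file above: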

    root@k8s-master:~# kubectl apply -f roll_update.yaml
    deployment.extensions "update-deployment" created
    service "nginx-service" created
    root@k8s-master:~# kubectl get pod -owide
    NAME                                 READY     STATUS    RESTARTS   AGE       IP              NODE
    update-deployment-7db77f7cc6-c4s2v   1/1       Running   0          28s       10.10.235.232   k8s-master
    update-deployment-7db77f7cc6-nfgtd   1/1       Running   0          28s       10.10.36.82     k8s-node1
    update-deployment-7db77f7cc6-tflfl   1/1       Running   0          28s       10.10.169.158   k8s-node2
    root@k8s-master:~# kubectl get svc
    NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
    nginx-service   ClusterIP   10.254.254.199   <none>        10080/TCP   1m

Open another terminal and test the availability of the service (a loop requests the nginx content once per second):

    root@k8s-master:~# while :; do curl http://10.254.254.199:10080; sleep 1; done
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-nfgtd
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-c4s2v
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-tflfl
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-nfgtd
    ...

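The loop above prints every response, which makes a short outage easy to miss. As a minimal sketch (not part of the original test), the helper below runs any command a fixed number of times and counts how often it fails. The function name `count_failures` and the `INTERVAL` variable are made up for this illustration.

```shell
# count_failures TOTAL CMD...
# Runs CMD TOTAL times, sleeping INTERVAL seconds (default 1) between runs,
# and prints the number of runs that exited with a non-zero status.
count_failures() {
  total=$1
  shift
  failed=0
  i=0
  while [ "$i" -lt "$total" ]; do
    "$@" >/dev/null 2>&1 || failed=$((failed + 1))
    i=$((i + 1))
    sleep "${INTERVAL:-1}"
  done
  echo "$failed"
}

# Example against the Service from this article (needs access to the cluster):
#   count_failures 60 curl -sf --max-time 1 http://10.254.254.199:10080
```

With `curl -sf`, both connection errors and HTTP error statuses produce a non-zero exit code, so each of them counts as one failed request.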
Now update the image to nginx:delay_v1. This image delays the start of nginx: it sleeps for 20s first and only then starts the nginx service. This simulates a service that is still starting up: the pod already exists, but it is not actually serving yet.

    root@k8s-master:~# kubectl patch deployment update-deployment --patch '{"metadata":{"annotations":{"kubernetes.io/change-cause":"update version to v2"}} ,"spec": {"template": {"spec": {"containers": [{"name": "nginx","image":"registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1"}]}}}}'
    deployment.extensions "update-deployment" patched

    ...
    ----------
    version: v1
    hostname: update-deployment-7db77f7cc6-h6hvt
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    curl: (7) Failed to connect to 10.254.254.199 port 10080: Connection refused
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-6th87
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-n22vz
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-njmpz
    ----------
    version: delay_v1
    hostname: update-deployment-d788c7dc6-6th87

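As the output shows, because of the delayed start nginx was not yet ready to serve, but traffic was already being sent to the backend pods, so the service was unavailable for a while.
Adding a readinessProbe is therefore essential: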

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: update-deployment
    spec:
      replicas: 3
      template:
        metadata:
          labels:
            app: roll-update
        spec:
          containers:
          - name: nginx
            image: registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:v1
            imagePullPolicy: Always
            readinessProbe:
              tcpSocket:
                port: 80
              initialDelaySeconds: 5
              periodSeconds: 10
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
      selector:
        app: roll-update
      ports:
      - protocol: TCP
        port: 10080
        targetPort: 80

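A tcpSocket probe passes when a TCP connection to the given port can be opened. The sketch below imitates that check by hand, for example against a pod IP. It is only an approximation of what the kubelet does: the names `tcp_ready` and `wait_ready` are invented here, it assumes `bash` and the coreutils `timeout` command are installed, and unlike the kubelet it gives up after a fixed number of attempts.

```shell
# tcp_ready HOST PORT
# Succeeds only if a TCP connection to HOST:PORT can be opened within 1 second.
tcp_ready() {
  timeout 1 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# wait_ready HOST PORT [INITIAL_DELAY] [PERIOD] [ATTEMPTS]
# Waits INITIAL_DELAY seconds (default 5), then probes every PERIOD seconds
# (default 10), mirroring initialDelaySeconds and periodSeconds above.
# Returns 0 on the first successful probe, 1 after ATTEMPTS (default 6) failures.
wait_ready() {
  sleep "${3:-5}"
  attempt=0
  while [ "$attempt" -lt "${5:-6}" ]; do
    tcp_ready "$1" "$2" && return 0
    attempt=$((attempt + 1))
    sleep "${4:-10}"
  done
  return 1
}

# Example (replace with the IP of a pod running nginx:delay_v1):
#   wait_ready 10.10.169.178 80 && echo "port 80 accepts connections"
```

With nginx:delay_v1 the first probes fail during the 20s sleep, which is why the pod stays out of the Service endpoints until nginx is listening.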
Repeat the steps above: first create the deployment with nginx:v1, then patch it to nginx:delay_v1.

    root@k8s-master:~# kubectl apply -f roll_update.yaml
    deployment.extensions "update-deployment" created
    service "nginx-service" created
    root@k8s-master:~# kubectl patch deployment update-deployment --patch '{"metadata":{"annotations":{"kubernetes.io/change-cause":"update version to v2"}} ,"spec": {"template": {"spec": {"containers": [{"name": "nginx","image":"registry.cn-beijing.aliyuncs.com/mrvolleyball/nginx:delay_v1"}]}}}}'
    deployment.extensions "update-deployment" patched

    root@k8s-master:~# kubectl get pod -owide
    NAME                                 READY     STATUS        RESTARTS   AGE       IP              NODE
    busybox                              1/1       Running       0          45d       10.10.235.255   k8s-master
    lifecycle-demo                       1/1       Running       0          32d       10.10.169.186   k8s-node2
    private-reg                          1/1       Running       0          92d       10.10.235.209   k8s-master
    update-deployment-54d497b7dc-4mlqc   0/1       Running       0          13s       10.10.169.178   k8s-node2
    update-deployment-54d497b7dc-pk4tb   0/1       Running       0          13s       10.10.36.98     k8s-node1
    update-deployment-6d5d7c9947-l7dkb   1/1       Terminating   0          1m        10.10.169.177   k8s-node2
    update-deployment-6d5d7c9947-pbzmf   1/1       Running       0          1m        10.10.36.97     k8s-node1
    update-deployment-6d5d7c9947-zwt4z   1/1       Running       0          1m        10.10.235.246   k8s-master

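● Because a readinessProbe is set, a pod that has started is not put into service immediately, which is why READY shows 0/1.
● Only one old pod has entered the Terminating state while the other old pods keep running: the rolling update constraints require that pods remain available, so the old pods are removed gradually rather than all at once.
Checking the curl output again, the image version moved smoothly to nginx:delay_v1 and no errors occurred: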

    root@k8s-master:~# while :; do curl http://10.254.66.136:10080; sleep 1; done
    ...
    version: v1
    hostname: update-deployment-6d5d7c9947-pbzmf
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-zwt4z
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-pbzmf
    ----------
    version: v1
    hostname: update-deployment-6d5d7c9947-zwt4z
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-pk4tb
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-4mlqc
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-pk4tb
    ----------
    version: delay_v1
    hostname: update-deployment-54d497b7dc-4mlqc
    ...

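5. maxSurge and maxUnavailable
● A rolling update can proceed in two ways: delete an old pod first and then add a new one, or add a new pod first and then delete an old one. Throughout the process the service must stay available (that is, the livenessProbe and readinessProbe checks must pass).
● In practice, maxSurge and maxUnavailable control whether old pods are deleted first or new pods are added first, and how many pods are changed at a time.
● With a desired replica count of 3:
maxSurge=1 maxUnavailable=0: at most 4 (3+1) pods may exist, and 3 (3-0) pods must be serving at all times. A new pod is created first; once it is available, an old pod is deleted. This repeats until every pod is updated.
maxSurge=0 maxUnavailable=1: at most 3 (3+0) pods may exist, and at least 2 (3-1) pods must be serving at all times. An old pod is deleted first, then a new pod is created. This repeats until every pod is updated.
● In the end, both the maxSurge and the maxUnavailable constraints must be satisfied. If maxSurge and maxUnavailable are both 0, the update cannot proceed: pods may be neither deleted nor added, so the constraints can never be met.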
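The arithmetic in the list above can be written down as a small helper, together with a function that builds a patch for the two settings. This is a sketch for illustration only: `rollout_bounds` and `strategy_patch` are made-up names, and only absolute numbers are handled, not percentages.

```shell
# rollout_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE
# Prints the upper bound on existing pods and the lower bound on available
# pods during a rolling update. Fails if both limits are 0, because then no
# pod may be added or removed.
rollout_bounds() {
  replicas=$1 surge=$2 unavailable=$3
  if [ "$surge" -eq 0 ] && [ "$unavailable" -eq 0 ]; then
    echo "invalid: maxSurge and maxUnavailable cannot both be 0" >&2
    return 1
  fi
  echo "max_total=$((replicas + surge)) min_available=$((replicas - unavailable))"
}

# strategy_patch MAX_SURGE MAX_UNAVAILABLE
# Prints a JSON patch that sets the rolling update strategy of a Deployment.
strategy_patch() {
  printf '{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":%d,"maxUnavailable":%d}}}}\n' "$1" "$2"
}

# Example against the deployment from this article (needs access to the cluster):
#   rollout_bounds 3 1 0
#   kubectl patch deployment update-deployment --patch "$(strategy_patch 1 0)"
```

Setting maxSurge=1 and maxUnavailable=0 gives the "add first, delete afterwards" behaviour described above, at the cost of temporarily running one extra pod.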
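6. Summary
● This article covered how maxSurge, maxUnavailable, liveness and readiness are used during a Deployment rolling update.
● One question about rolling updates remains open. In a large system where a service runs many pods (say 100), a rolling update inevitably leaves the pods on mixed versions for a while (some on the old version, some on the new one), so users are likely to see inconsistent results across requests until the update completes. This problem will be discussed in a later article.
That concludes this article. My knowledge is limited, so if anything here is wrong or missing, corrections are very welcome...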