RC, Deployment, and DaemonSet are all designed for stateless services: the IP, hostname, and start/stop order of the Pods they manage are random, and when a managed Pod is rebuilt, its IP and hostname change. A StatefulSet, by contrast, manages stateful services, such as MySQL or MongoDB clusters.
A StatefulSet is essentially a variant of Deployment (it reached GA in v1.9) designed to solve the problems of stateful services. The Pods it manages have fixed names and a fixed start/stop order; in a StatefulSet, the Pod name serves as its network identity (hostname), and it must also be paired with dedicated storage.
In a Deployment, the matching service object is an ordinary Service; in a StatefulSet it is a headless Service. A headless Service differs from an ordinary Service in that it has no Cluster IP: resolving its name returns the endpoint list of all Pods behind that headless Service.
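The only difference in the manifest is `clusterIP: None`. A minimal headless Service might look like the following sketch (the name and selector label here are illustrative, not from the example later in this article):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-headless        # illustrative name
spec:
  clusterIP: None            # this is what makes the Service "headless"
  selector:
    app: demo                # selects the StatefulSet's Pods
  ports:
  - port: 80
    name: web
```

A DNS query for `demo-headless.<namespace>.svc.cluster.local` then returns the A records of every ready Pod rather than a single virtual IP.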
Take a Redis cluster as an example. Because the redis containers do not all play the same role (some are masters, some are slaves), each redis container must keep its original hostname and mount its original volume when it is rebuilt; only then does each shard stay intact. Furthermore, each redis shard manages different slots and stores different data, so each shard must be connected to its own storage to keep data from being overwritten or mixed up. (Note: with a volume defined in a Deployment's Pod template, all replicas share one volume and hold the same data, because every Pod is created from the same template.)
To guarantee that each container always mounts the correct volume, Kubernetes introduces the volumeClaimTemplate.
Applications with the following requirements should therefore use a StatefulSet:
1) stable, unique network identifiers;
2) stable, persistent storage;
3) ordered, graceful deployment and scaling;
4) ordered, graceful termination and deletion;
5) ordered rolling updates.
A complete StatefulSet application consists of three parts: a headless Service, the StatefulSet controller, and a volumeClaimTemplate.
Example 1:
Because the PVs in this example are provisioned statically, first prepare the PVs:
[root@k8s-master-dev statefulset]# kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv01   5Gi        RWO,RWX        Retain           Available                                   27m
pv02   10Gi       RWO,RWX        Retain           Available                                   27m
pv03   15Gi       RWO,RWX        Retain           Available                                   27m
[root@k8s-master-dev statefulset]#
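The PVs above were created beforehand (the cleanup step later deletes them via `../volumes/pv-vol-demo.yaml`). A sketch of one such PV definition is shown below; the NFS backing store, server address, and export path are assumptions for illustration, since the original manifest is not reproduced here:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  capacity:
    storage: 5Gi
  accessModes: ["ReadWriteOnce", "ReadWriteMany"]   # shown as RWO,RWX above
  persistentVolumeReclaimPolicy: Retain
  nfs:                          # backing store is an assumption; any volume plugin works
    server: 172.16.0.10         # hypothetical NFS server
    path: /data/volumes/v1      # hypothetical export path
```

pv02 and pv03 would differ only in name, capacity (10Gi, 15Gi), and backing path.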
Then define a StatefulSet application:
[root@k8s-master-dev statefulset]# vim statefulset-demo.yaml
[root@k8s-master-dev statefulset]# cat statefulset-demo.yaml
apiVersion: v1
kind: Service
metadata:
  name: ngx-svc
  labels:
    app: ngx-svc
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: ngx-pod
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ngx
spec:
  serviceName: ngx-svc    # declares which headless Service it belongs to
  replicas: 2
  selector:
    matchLabels:
      app: ngx-pod        # has to match .spec.template.metadata.labels
  template:
    metadata:
      labels:
        app: ngx-pod      # has to match .spec.selector.matchLabels
    spec:
      containers:
      - name: ngx
        image: nginx:1.15-alpine
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: ngxvol
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: ngxvol
    spec:
      accessModes: ["ReadWriteMany"]
      resources:
        requests:
          storage: 5Gi
[root@k8s-master-dev statefulset]# kubectl apply -f statefulset-demo.yaml
service/ngx-svc created
statefulset.apps/ngx created
[root@k8s-master-dev statefulset]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   1d
ngx-svc      ClusterIP   None         <none>        80/TCP    15s
[root@k8s-master-dev statefulset]# kubectl get sts
NAME   DESIRED   CURRENT   AGE
ngx    2         2         30s
[root@k8s-master-dev statefulset]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
ngx-0   1/1     Running   0          35s
ngx-1   1/1     Running   0          34s
[root@k8s-master-dev statefulset]#
Each Pod is named $(statefulset name)-0, $(statefulset name)-1, and so on. Each Pod's FQDN is: $(pod name).$(headless service name).$(namespace).svc.cluster.local
Each PVC's name is composed of two parts: $(volumeClaimTemplates.name)-$(pod name), indicating which volumeClaimTemplate the PVC was created from and that it is always mounted by $(pod name). When the original Pod is deleted, the PVC remains and no data is lost (manually deleting the PVC releases the PV). When the replacement Pod is created, it inherits the original Pod name and mounts the original volume again.
[root@k8s-master-dev statefulset]# kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                  STORAGECLASS   REASON   AGE
pv01   5Gi        RWO,RWX        Retain           Bound       default/ngxvol-ngx-0                           35m
pv02   10Gi       RWO,RWX        Retain           Bound       default/ngxvol-ngx-1                           35m
pv03   15Gi       RWO,RWX        Retain           Available                                                  35m
[root@k8s-master-dev statefulset]# kubectl get pvc
NAME           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ngxvol-ngx-0   Bound    pv01     5Gi        RWO,RWX                       5m
ngxvol-ngx-1   Bound    pv02     10Gi       RWO,RWX                       5m
[root@k8s-master-dev statefulset]#
[root@k8s-master-dev manifests]# kubectl exec -it ngx-0 -- /bin/sh
/ # nslookup ngx-0.ngx-svc.default.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      ngx-0.ngx-svc.default.svc.cluster.local
Address 1: 10.244.4.2 ngx-0.ngx-svc.default.svc.cluster.local
/ # nslookup ngx-1.ngx-svc.default.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      ngx-1.ngx-svc.default.svc.cluster.local
Address 1: 10.244.1.101 ngx-1.ngx-svc.default.svc.cluster.local
/ #
[root@k8s-master-dev manifests]#
[root@k8s-master-dev statefulset]# kubectl exec ngx-0 -- ls /usr/share/nginx/html
[root@k8s-master-dev statefulset]# kubectl exec -it ngx-0 -- /bin/sh
/ # echo ngx-0 > /usr/share/nginx/html/index.html
/ #
[root@k8s-master-dev statefulset]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE            NOMINATED NODE
ngx-0   1/1     Running   0          9m    10.244.1.98   k8s-node1-dev   <none>
ngx-1   1/1     Running   0          9m    10.244.2.63   k8s-node2-dev   <none>
[root@k8s-master-dev statefulset]# curl http://10.244.1.98
ngx-0
[root@k8s-master-dev statefulset]# kubectl delete pod/ngx-0
pod "ngx-0" deleted
[root@k8s-master-dev statefulset]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE            NOMINATED NODE
ngx-0   1/1     Running   0          8s    10.244.1.99   k8s-node1-dev   <none>
ngx-1   1/1     Running   0          10m   10.244.2.63   k8s-node2-dev   <none>
[root@k8s-master-dev statefulset]# curl http://10.244.1.99
ngx-0
[root@k8s-master-dev statefulset]#
Pods are scaled out and scaled in strictly in order, as shown below:
[root@k8s-master-dev statefulset]# kubectl scale sts ngx --replicas=3
statefulset.apps/ngx scaled
[root@k8s-master-dev statefulset]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
ngx-0   1/1     Running   0          8m
ngx-1   1/1     Running   0          18m
ngx-2   1/1     Running   0          3s
[root@k8s-master-dev statefulset]# kubectl get pvc
NAME           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ngxvol-ngx-0   Bound    pv01     5Gi        RWO,RWX                       18m
ngxvol-ngx-1   Bound    pv02     10Gi       RWO,RWX                       18m
ngxvol-ngx-2   Bound    pv03     15Gi       RWO,RWX                       12s
[root@k8s-master-dev statefulset]# kubectl get pv
NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
pv01   5Gi        RWO,RWX        Retain           Bound    default/ngxvol-ngx-0                           49m
pv02   10Gi       RWO,RWX        Retain           Bound    default/ngxvol-ngx-1                           48m
pv03   15Gi       RWO,RWX        Retain           Bound    default/ngxvol-ngx-2                           48m
[root@k8s-master-dev statefulset]# kubectl patch sts ngx -p '{"spec":{"replicas":2}}'
statefulset.apps/ngx patched
[root@k8s-master-dev statefulset]# kubectl get pvc
NAME           STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ngxvol-ngx-0   Bound    pv01     5Gi        RWO,RWX                       20m
ngxvol-ngx-1   Bound    pv02     10Gi       RWO,RWX                       20m
ngxvol-ngx-2   Bound    pv03     15Gi       RWO,RWX                       1m
[root@k8s-master-dev statefulset]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
ngx-0   1/1     Running   0          9m
ngx-1   1/1     Running   0          20m
[root@k8s-master-dev statefulset]#
[root@k8s-master-dev statefulset]# kubectl delete -f statefulset-demo.yaml
service "ngx-svc" deleted
statefulset.apps "ngx" deleted
[root@k8s-master-dev statefulset]# kubectl delete pvc --all
persistentvolumeclaim "ngxvol-ngx-0" deleted
persistentvolumeclaim "ngxvol-ngx-1" deleted
persistentvolumeclaim "ngxvol-ngx-2" deleted
[root@k8s-master-dev statefulset]# kubectl delete -f ../volumes/pv-vol-demo.yaml
persistentvolume "pv01" deleted
persistentvolume "pv02" deleted
persistentvolume "pv03" deleted
[root@k8s-master-dev statefulset]#
Update strategies
Since Kubernetes 1.7, the .spec.updateStrategy field lets you configure or disable automatic rolling updates of a Pod's containers, labels, resource requests/limits, and annotations.
OnDelete: with .spec.updateStrategy.type set to OnDelete, the StatefulSet controller does not automatically update the Pods in the StatefulSet. Users must delete Pods manually to make the controller create new ones.
RollingUpdate: with .spec.updateStrategy.type set to RollingUpdate, Pods are rolled automatically; this is the default when .spec.updateStrategy is unspecified.
The StatefulSet controller deletes and recreates each Pod in the StatefulSet, proceeding in termination order (from the largest ordinal to the smallest) and updating one Pod at a time. It waits until an updated Pod is Running and Ready before updating the next one.
Partitions: the RollingUpdate strategy can be partitioned by setting .spec.updateStrategy.rollingUpdate.partition. If a partition is specified, all Pods with an ordinal greater than or equal to the partition are updated when the StatefulSet's .spec.template changes.
具備小於分區的序數的全部 Pod 將不會被更新,即便刪除它們也將被從新建立。若是 StatefulSet 的 .spec.updateStrategy.rollingUpdate.partition 大於其 .spec.replicas,則其 .spec.template 的更新將不會傳播到 Pod。在大多數狀況下,不須要使用分區。
Example of changing the update strategy and updating the image:
kubectl patch sts ngx -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":4}}}}'
kubectl set image sts/ngx ngx=nginx:latest