Container Orchestration with Kubernetes: The StatefulSet Controller

  In the previous post we looked at Kubernetes ConfigMap and Secret resources and how to use them; for a refresher see https://www.cnblogs.com/qiuhom-1874/p/14194944.html. Today let's talk about the StatefulSet controller in Kubernetes.

  1. What the StatefulSet controller does

  In short, the StatefulSet controller manages stateful application pods on Kubernetes. The applications we typically operate can be classified along two dimensions, whether they are stateful and whether they need storage, which gives four classes: stateful without storage, stateful with storage, stateless without storage, and stateless with storage. Most applications are either stateful with storage or stateless without storage; only a few fall into the remaining two classes. A MySQL primary/replica cluster, for example, is stateful with storage, while HTTP servers such as nginx or Apache (with no user-uploaded data) are stateless without storage.

  The essential difference is that for a stateful application the state changes with every request a user makes. When such an application runs on Kubernetes and one of its pods crashes, the pod rebuilt to replace it must satisfy two requirements: it must hold the same data as the pod it replaces, and it must fit back into the existing cluster topology. Take the MySQL replication cluster: when a replica fails, the rebuilt pod must be able to mount the previous pod's PVC so the data stays consistent, and it must then rejoin the existing replication architecture. Before a stateful service can run properly on Kubernetes and serve users, these problems have to be solved.

  For this purpose Kubernetes provides the StatefulSet controller, but it does not solve all of the problems above. It is only responsible for starting the requested number of pods and giving each pod an ordered, stable name; if a pod crashes, the rebuilt pod gets exactly the same name as the original. "Ordered" means the pod name is no longer the controller name plus a random string but the controller name plus a sequential index: if the StatefulSet is named web-demo, its pods are named web-demo-0, web-demo-1, and so on. The StatefulSet also automatically mounts the previous pod's PVC onto the rebuilt pod (provided the reclaim policy is Retain, so the backing PV survives the pod's deletion), which is how the new pod ends up holding the same data as its predecessor. Put simply, a StatefulSet only gives us a fixed number of pods, each with a name that never changes no matter how the pod is scheduled, or even if the pod is deleted and recreated, plus automatic reattachment of each pod's PVC so the data stays the same. Everything else, such as how the containers inside the pods adapt to the application's own cluster architecture, is business logic we have to handle ourselves, because every application organizes its cluster differently and the StatefulSet controller cannot adapt to every stateful application's topology with a single mechanism.
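
  As a concrete illustration of that stable identity (using the names that appear later in this post), each ordinal gets a fixed pod name and a matching claim named <claim-template>-<pod-name>, and it is this fixed pairing that lets a rebuilt pod find its old data:

web-0  <->  PVC www-web-0
web-1  <->  PVC www-web-1
web-2  <->  PVC www-web-2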

  2. StatefulSet controller diagram

  Note: a StatefulSet consists mainly of a pod template and a PVC template. The pod template defines the pods' attributes, while the PVC template provides each pod with a storage volume; the volume can be created dynamically and bound to a PV through a StorageClass (sc) resource, or an administrator can create and bind the PVs manually.

  3. Creating and using a StatefulSet

[root@master01 ~]# cat statefulset-demo.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx 
  serviceName: nginx
  replicas: 3 
  template:
    metadata:
      labels:
        app: nginx 
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx:1.14-alpine
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
[root@master01 ~]# 

  Note: a StatefulSet relies on a headless Service to govern access to its pods. The controller starts pods according to the defined replica count, pod template and PVC template, and gives each pod a fixed name, normally the StatefulSet name plus an ordinal index; the manifest above creates 3 pods named web-0, web-1 and web-2. Combined with the headless service, each pod gets its own DNS subdomain, so a pod can be addressed directly by that name; the format is $(pod_name).$(service_name).$(namespace).svc.<cluster domain>, where the cluster domain defaults to cluster.local if none was specified at cluster initialization. The manifest above defines a headless service and a StatefulSet containing one pod template and one PVC template. In the pod template, terminationGracePeriodSeconds sets the grace period for terminating containers and defaults to 30 seconds when not specified. The PVC template is defined with the volumeClaimTemplates field, whose value is a list of objects describing the PVC templates; if the backend storage supports dynamic PV provisioning, the template can also reference a StorageClass (sc) resource directly.
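
  With the manifest above applied in the default namespace and the default cluster domain, the three pods therefore end up with the following DNS identities (a sketch of the naming scheme rather than command output):

web-0.nginx.default.svc.cluster.local
web-1.nginx.default.svc.cluster.local
web-2.nginx.default.svc.cluster.local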

  Export the shared directories on the NFS server

[root@docker_registry ~]# cat /etc/exports
/data/v1 192.168.0.0/24(rw,no_root_squash)
/data/v2 192.168.0.0/24(rw,no_root_squash)
/data/v3 192.168.0.0/24(rw,no_root_squash)
[root@docker_registry ~]# ll /data/v*     
/data/v1:
total 0

/data/v2:
total 0

/data/v3:
total 0
[root@docker_registry ~]# exportfs -av
exporting 192.168.0.0/24:/data/v3
exporting 192.168.0.0/24:/data/v2
exporting 192.168.0.0/24:/data/v1
[root@docker_registry ~]# 

  Create the PVs manually

[root@master01 ~]# cat pv-demo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v1
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v1
    server: 192.168.0.99
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v2
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v2
    server: 192.168.0.99
---

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v3
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v3
    server: 192.168.0.99
[root@master01 ~]# 

  Note: manually created PVs should have their reclaim policy set to Retain, so that once the bound claim goes away the volume and its data are kept rather than being recycled out from under us and left unusable.

  Apply the manifest

[root@master01 ~]# kubectl apply -f pv-demo.yaml
persistentvolume/nfs-pv-v1 created
persistentvolume/nfs-pv-v2 created
persistentvolume/nfs-pv-v3 created
[root@master01 ~]# kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs-pv-v1   1Gi        RWO,ROX,RWX    Retain           Available                                   3s
nfs-pv-v2   1Gi        RWO,ROX,RWX    Retain           Available                                   3s
nfs-pv-v3   1Gi        RWO,ROX,RWX    Retain           Available                                   3s
[root@master01 ~]# 

  Note: if the backend storage supports dynamic PV provisioning, this step can be skipped; simply create a StorageClass (sc) object and reference its name in the StatefulSet's PVC template, and PVs will be provisioned and bound to the PVCs automatically.
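
  A minimal sketch of what that looks like, assuming an external NFS dynamic provisioner is already running in the cluster (the provisioner string below is a placeholder; use whatever name your provisioner actually registers):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-sc
provisioner: example.com/nfs        # placeholder: the name registered by your NFS provisioner
reclaimPolicy: Retain
---
# referenced from the StatefulSet's PVC template:
#   volumeClaimTemplates:
#   - metadata:
#       name: www
#     spec:
#       storageClassName: nfs-sc    # ask this StorageClass to provision the PV
#       accessModes: [ "ReadWriteOnce" ]
#       resources:
#         requests:
#           storage: 1Gi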

  Apply the StatefulSet manifest

[root@master01 ~]# kubectl apply -f statefulset-demo.yaml 
service/nginx created
statefulset.apps/web created
[root@master01 ~]# kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
nfs-pv-v1   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-0                           4m7s
nfs-pv-v2   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-1                           4m7s
nfs-pv-v3   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-2                           4m7s
[root@master01 ~]# kubectl get pvc
NAME        STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound    nfs-pv-v1   1Gi        RWO,ROX,RWX                   14s
www-web-1   Bound    nfs-pv-v2   1Gi        RWO,ROX,RWX                   12s
www-web-2   Bound    nfs-pv-v3   1Gi        RWO,ROX,RWX                   7s
[root@master01 ~]# kubectl get sts
NAME   READY   AGE
web    3/3     27s
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          38s
web-1   1/1     Running   0          36s
web-2   1/1     Running   0          31s
[root@master01 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   48m
nginx        ClusterIP   None         <none>        80/TCP    41s
[root@master01 ~]# 

  Note: after the StatefulSet manifest is applied, the PVs move from Available to Bound and three PVCs are created automatically; the pod names are no longer the controller name plus a random string but the StatefulSet name plus an ordered number. The number usually starts at 0 and counts upward; it is called the pod's ordinal index.

  View the StatefulSet's details

[root@master01 ~]# kubectl describe sts web
Name:               web
Namespace:          default
CreationTimestamp:  Mon, 28 Dec 2020 19:34:11 +0800
Selector:           app=nginx
Labels:             <none>
Annotations:        <none>
Replicas:           3 desired | 3 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=nginx
  Containers:
   nginx:
    Image:        nginx:1.14-alpine
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /usr/share/nginx/html from www (rw)
  Volumes:  <none>
Volume Claims:
  Name:          www
  StorageClass:  
  Labels:        <none>
  Annotations:   <none>
  Capacity:      1Gi
  Access Modes:  [ReadWriteOnce]
Events:
  Type    Reason            Age    From                    Message
  ----    ------            ----   ----                    -------
  Normal  SuccessfulCreate  5m59s  statefulset-controller  create Claim www-web-0 Pod web-0 in StatefulSet web success
  Normal  SuccessfulCreate  5m59s  statefulset-controller  create Pod web-0 in StatefulSet web successful
  Normal  SuccessfulCreate  5m57s  statefulset-controller  create Claim www-web-1 Pod web-1 in StatefulSet web success
  Normal  SuccessfulCreate  5m57s  statefulset-controller  create Pod web-1 in StatefulSet web successful
  Normal  SuccessfulCreate  5m52s  statefulset-controller  create Claim www-web-2 Pod web-2 in StatefulSet web success
  Normal  SuccessfulCreate  5m52s  statefulset-controller  create Pod web-2 in StatefulSet web successful
[root@master01 ~]# 

  Note: the details above show that the StatefulSet controller first creates a PVC and then creates the corresponding pod; only after the first pod and its PVC have succeeded and become ready does it move on to create and mount the next PVC and pod. In short, the process is serial and ordered.
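
  When that strict ordering is not required, it can be relaxed with the podManagementPolicy field of the StatefulSet spec (the ZooKeeper example near the end of this post uses it); a minimal sketch:

spec:
  podManagementPolicy: Parallel      # create/delete pods in parallel during creation and scaling
  # the default is OrderedReady: ordinal N only starts after ordinal N-1 is Running and Ready;
  # note this field does not change the ordering of rolling updates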

  Verification: from any node in the cluster, query the nginx service name and see whether the pod records under that service domain can be resolved

  Install the DNS utilities

[root@master01 ~]# yum install -y bind-utils

  Use dig to query CoreDNS for the resolution records of nginx.default.svc.cluster.local

[root@master01 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   64m
nginx        ClusterIP   None         <none>        80/TCP    20m
[root@master01 ~]# kubectl get svc -n kube-system                 
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   20d
[root@master01 ~]# dig nginx.default.svc.cluster.local @10.96.0.10

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.3 <<>> nginx.default.svc.cluster.local @10.96.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61539
;; flags: qr aa rd; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nginx.default.svc.cluster.local. IN    A

;; ANSWER SECTION:
nginx.default.svc.cluster.local. 30 IN  A       10.244.2.109
nginx.default.svc.cluster.local. 30 IN  A       10.244.4.27
nginx.default.svc.cluster.local. 30 IN  A       10.244.3.108

;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Mon Dec 28 19:54:34 CST 2020
;; MSG SIZE  rcvd: 201

[root@master01 ~]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE   READINESS GATES
web-0   1/1     Running   0          22m   10.244.4.27    node04.k8s.org   <none>           <none>
web-1   1/1     Running   0          22m   10.244.2.109   node02.k8s.org   <none>           <none>
web-2   1/1     Running   0          22m   10.244.3.108   node03.k8s.org   <none>           <none>
[root@master01 ~]# 

  Note: the query result shows three records on CoreDNS for the nginx service name in the default namespace, and the resolved addresses are exactly the IPs of the pods behind the service.

  Verification: does the record for web-0 resolve to the IP address of the web-0 pod?

[root@master01 ~]# kubectl get pods web-0 -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
web-0   1/1     Running   0          24m   10.244.4.27   node04.k8s.org   <none>           <none>
[root@master01 ~]# dig web-0.nginx.default.svc.cluster.local @10.96.0.10

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.3 <<>> web-0.nginx.default.svc.cluster.local @10.96.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13000
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;web-0.nginx.default.svc.cluster.local. IN A

;; ANSWER SECTION:
web-0.nginx.default.svc.cluster.local. 30 IN A  10.244.4.27

;; Query time: 0 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Mon Dec 28 19:58:58 CST 2020
;; MSG SIZE  rcvd: 119

[root@master01 ~]# 

  Note: prefixing the service domain with the pod name lets CoreDNS return that pod's IP record, which means we can later reach a specific pod directly via its pod name plus the service name.
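
  As a side note, a headless service with a named port also publishes SRV records, so the per-pod names can be discovered in a single query; a sketch of such a lookup (output omitted), using the port name web from the manifest:

dig SRV _web._tcp.nginx.default.svc.cluster.local @10.96.0.10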

  Verification: change a cluster node's DNS server to the CoreDNS service IP, then access a pod using the pod-name-plus-service-name domain and see whether the pod responds

[root@master01 ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search k8s.org
nameserver 10.96.0.10
[root@master01 ~]# curl web-0.nginx.default.svc.cluster.local
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>
[root@master01 ~]# 

  Note: getting a 403 back means the pod itself is reachable; it responds with 403 only because it has no index page yet.

  Verification: exec into the pod, provide an index page, then access it again and see whether the page content is returned

[root@master01 ~]# kubectl exec -it web-0 -- /bin/sh
/ # cd /usr/share/nginx/html/
/usr/share/nginx/html # ls
/usr/share/nginx/html # echo "this web-0 pod index" > index.html
/usr/share/nginx/html # ls
index.html
/usr/share/nginx/html # cat index.html 
this web-0 pod index
/usr/share/nginx/html # exit
[root@master01 ~]# curl web-0.nginx.default.svc.cluster.local
this web-0 pod index
[root@master01 ~]# 

  Note: the pod's page can now be reached.

  Delete web-0 and see whether the pod is recreated automatically

[root@master01 ~]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE   READINESS GATES
web-0   1/1     Running   0          33m   10.244.4.27    node04.k8s.org   <none>           <none>
web-1   1/1     Running   0          33m   10.244.2.109   node02.k8s.org   <none>           <none>
web-2   1/1     Running   0          33m   10.244.3.108   node03.k8s.org   <none>           <none>
[root@master01 ~]# kubectl delete pod web-0
pod "web-0" deleted
[root@master01 ~]# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE   READINESS GATES
web-0   1/1     Running   0          7s    10.244.4.28    node04.k8s.org   <none>           <none>
web-1   1/1     Running   0          33m   10.244.2.109   node02.k8s.org   <none>           <none>
web-2   1/1     Running   0          33m   10.244.3.108   node03.k8s.org   <none>           <none>
[root@master01 ~]# 

  Note: after web-0 is deleted manually, the controller automatically rebuilds a pod named web-0 from the pod template and brings it back up.

  Verification: access the recreated pod and see whether the same index page is served

[root@master01 ~]# curl web-0.nginx.default.svc.cluster.local
this web-0 pod index
[root@master01 ~]# 

  Note: accessing the pod via its pod name plus the service domain still returns the original index page, which means the rebuilt pod automatically mounted the deleted pod's PVC at the same path.
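
  This can also be confirmed from the API side: the rebuilt pod claims the same PVC, which is still bound to the same PV (based on the earlier output, this should print nfs-pv-v1):

kubectl get pvc www-web-0 -o jsonpath='{.spec.volumeName}{"\n"}'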

  Scale out the pod replicas

  Create new shared directories on the NFS server and export them

[root@docker_registry ~]# mkdir -pv /data/v{4,5,6}
mkdir: created directory ‘/data/v4’
mkdir: created directory ‘/data/v5’
mkdir: created directory ‘/data/v6’
[root@docker_registry ~]# echo "/data/v4 192.168.0.0/24(rw,no_root_squash)" >> /etc/exports
[root@docker_registry ~]# echo "/data/v5 192.168.0.0/24(rw,no_root_squash)" >> /etc/exports 
[root@docker_registry ~]# echo "/data/v6 192.168.0.0/24(rw,no_root_squash)" >> /etc/exports 
[root@docker_registry ~]# cat /etc/exports
/data/v1 192.168.0.0/24(rw,no_root_squash)
/data/v2 192.168.0.0/24(rw,no_root_squash)
/data/v3 192.168.0.0/24(rw,no_root_squash)
/data/v4 192.168.0.0/24(rw,no_root_squash)
/data/v5 192.168.0.0/24(rw,no_root_squash)
/data/v6 192.168.0.0/24(rw,no_root_squash)
[root@docker_registry ~]# exportfs -av
exporting 192.168.0.0/24:/data/v6
exporting 192.168.0.0/24:/data/v5
exporting 192.168.0.0/24:/data/v4
exporting 192.168.0.0/24:/data/v3
exporting 192.168.0.0/24:/data/v2
exporting 192.168.0.0/24:/data/v1
[root@docker_registry ~]# 

  Copy the PV manifest and adjust it for the new PVs to be created

[root@master01 ~]# cat pv-demo2.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v4
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v4
    server: 192.168.0.99
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v5
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v5
    server: 192.168.0.99
---

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-v6
  labels:
    storsystem: nfs
    rel: stable
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes: ["ReadWriteOnce","ReadWriteMany","ReadOnlyMany"]
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - hard
  - nfsvers=4.1
  nfs:
    path: /data/v6
    server: 192.168.0.99
[root@master01 ~]# 

  Apply the manifest to create the PVs

[root@master01 ~]# kubectl apply -f pv-demo2.yaml
persistentvolume/nfs-pv-v4 created
persistentvolume/nfs-pv-v5 created
persistentvolume/nfs-pv-v6 created
[root@master01 ~]# kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM               STORAGECLASS   REASON   AGE
nfs-pv-v1   1Gi        RWO,ROX,RWX    Retain           Bound       default/www-web-0                           55m
nfs-pv-v2   1Gi        RWO,ROX,RWX    Retain           Bound       default/www-web-1                           55m
nfs-pv-v3   1Gi        RWO,ROX,RWX    Retain           Bound       default/www-web-2                           55m
nfs-pv-v4   1Gi        RWO,ROX,RWX    Retain           Available                                               4s
nfs-pv-v5   1Gi        RWO,ROX,RWX    Retain           Available                                               4s
nfs-pv-v6   1Gi        RWO,ROX,RWX    Retain           Available                                               4s
[root@master01 ~]# 

  Scale the StatefulSet to 6 replicas

[root@master01 ~]# kubectl get sts
NAME   READY   AGE
web    3/3     53m
[root@master01 ~]# kubectl scale sts web --replicas=6
statefulset.apps/web scaled
[root@master01 ~]# 

  Watch the pods being scaled out

[root@master01 ~]# kubectl get pod -w
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          19m
web-1   1/1     Running   0          53m
web-2   1/1     Running   0          53m
web-3   0/1     Pending   0          0s
web-3   0/1     Pending   0          0s
web-3   0/1     Pending   0          0s
web-3   0/1     ContainerCreating   0          0s
web-3   1/1     Running             0          2s
web-4   0/1     Pending             0          0s
web-4   0/1     Pending             0          0s
web-4   0/1     Pending             0          2s
web-4   0/1     ContainerCreating   0          2s
web-4   1/1     Running             0          4s
web-5   0/1     Pending             0          0s
web-5   0/1     Pending             0          0s
web-5   0/1     Pending             0          2s
web-5   0/1     ContainerCreating   0          2s
web-5   1/1     Running             0          4s

  Note: the scale-out above shows that pods are added serially; web-4 only starts once web-3 is Running and ready, and so on.

  Check the PVs and PVCs

[root@master01 ~]# kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
nfs-pv-v1   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-0                           60m
nfs-pv-v2   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-1                           60m
nfs-pv-v3   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-2                           60m
nfs-pv-v4   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-4                           5m6s
nfs-pv-v5   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-3                           5m6s
nfs-pv-v6   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-5                           5m6s
[root@master01 ~]# kubectl get pvc
NAME        STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound    nfs-pv-v1   1Gi        RWO,ROX,RWX                   57m
www-web-1   Bound    nfs-pv-v2   1Gi        RWO,ROX,RWX                   57m
www-web-2   Bound    nfs-pv-v3   1Gi        RWO,ROX,RWX                   57m
www-web-3   Bound    nfs-pv-v5   1Gi        RWO,ROX,RWX                   3m31s
www-web-4   Bound    nfs-pv-v4   1Gi        RWO,ROX,RWX                   3m29s
www-web-5   Bound    nfs-pv-v6   1Gi        RWO,ROX,RWX                   3m25s
[root@master01 ~]# 

  Note: the output shows that all PVs and PVCs are in the Bound state.

  Scale the pods down to 4

[root@master01 ~]# kubectl scale sts web --replicas=4
statefulset.apps/web scaled
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS        RESTARTS   AGE
web-0   1/1     Running       0          28m
web-1   1/1     Running       0          61m
web-2   1/1     Running       0          61m
web-3   1/1     Running       0          7m46s
web-4   1/1     Running       0          7m44s
web-5   0/1     Terminating   0          7m40s
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          28m
web-1   1/1     Running   0          62m
web-2   1/1     Running   0          62m
web-3   1/1     Running   0          8m4s
[root@master01 ~]# kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
nfs-pv-v1   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-0                           66m
nfs-pv-v2   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-1                           66m
nfs-pv-v3   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-2                           66m
nfs-pv-v4   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-4                           10m
nfs-pv-v5   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-3                           10m
nfs-pv-v6   1Gi        RWO,ROX,RWX    Retain           Bound    default/www-web-5                           10m
[root@master01 ~]# kubectl get pvc
NAME        STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound    nfs-pv-v1   1Gi        RWO,ROX,RWX                   62m
www-web-1   Bound    nfs-pv-v2   1Gi        RWO,ROX,RWX                   62m
www-web-2   Bound    nfs-pv-v3   1Gi        RWO,ROX,RWX                   62m
www-web-3   Bound    nfs-pv-v5   1Gi        RWO,ROX,RWX                   8m13s
www-web-4   Bound    nfs-pv-v4   1Gi        RWO,ROX,RWX                   8m11s
www-web-5   Bound    nfs-pv-v6   1Gi        RWO,ROX,RWX                   8m7s
[root@master01 ~]# 

  Note: scaling down removes pods in reverse order starting from the highest ordinal, and afterwards the corresponding PVs and PVCs remain Bound; the scale-down and scale-up process is illustrated in the figure below.

  Note: the figure above shows scaling a StatefulSet's pods down and then back up. Scaling the replica count down leaves the backing PVCs and PVs untouched; when the replica count is increased again, each PVC is re-associated with the pod of the same name, so the pod recreated by the scale-out holds the same data as the pod removed by the earlier scale-in.

  Rolling-update the pod image version

[root@master01 ~]# kubectl set image sts web nginx=nginx:1.16-alpine
statefulset.apps/web image updated
[root@master01 ~]# 

  Watch the update process

[root@master01 ~]# kubectl get pod -w
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          38m
web-1   1/1     Running   0          71m
web-2   1/1     Running   0          71m
web-3   1/1     Running   0          17m
web-3   1/1     Terminating   0          20m
web-3   0/1     Terminating   0          20m
web-3   0/1     Terminating   0          20m
web-3   0/1     Terminating   0          20m
web-3   0/1     Pending       0          0s
web-3   0/1     Pending       0          0s
web-3   0/1     ContainerCreating   0          0s
web-3   1/1     Running             0          1s
web-2   1/1     Terminating         0          74m
web-2   0/1     Terminating         0          74m
web-2   0/1     Terminating         0          74m
web-2   0/1     Terminating         0          74m
web-2   0/1     Pending             0          0s
web-2   0/1     Pending             0          0s
web-2   0/1     ContainerCreating   0          0s
web-2   1/1     Running             0          2s
web-1   1/1     Terminating         0          74m
web-1   0/1     Terminating         0          74m
web-1   0/1     Terminating         0          75m
web-1   0/1     Terminating         0          75m
web-1   0/1     Pending             0          0s
web-1   0/1     Pending             0          0s
web-1   0/1     ContainerCreating   0          0s
web-1   1/1     Running             0          2s
web-0   1/1     Terminating         0          41m
web-0   0/1     Terminating         0          41m
web-0   0/1     Terminating         0          41m
web-0   0/1     Terminating         0          41m
web-0   0/1     Pending             0          0s
web-0   0/1     Pending             0          0s
web-0   0/1     ContainerCreating   0          0s
web-0   1/1     Running             0          1s

  Note: the update trace above shows that the StatefulSet's rolling update starts from the pod with the highest ordinal and updates one pod at a time; only when the previous pod has finished updating and is Running does the next one start, and so on down the list.
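
  The progress and revision history can be followed with the same kubectl rollout subcommands used for Deployments (a sketch):

kubectl rollout status sts/web      # blocks until the rolling update has completed
kubectl rollout history sts/web     # lists the StatefulSet's recorded revisions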

  Verification: check the StatefulSet and see whether the image has been updated to the version we specified

[root@master01 ~]# kubectl get sts -o wide
NAME   READY   AGE   CONTAINERS   IMAGES
web    4/4     79m   nginx        nginx:1.16-alpine
[root@master01 ~]# 

  Roll the pods back to the previous version

[root@master01 ~]# kubectl get sts -o wide
NAME   READY   AGE   CONTAINERS   IMAGES
web    4/4     80m   nginx        nginx:1.16-alpine
[root@master01 ~]# kubectl rollout undo sts/web
statefulset.apps/web rolled back
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS              RESTARTS   AGE
web-0   1/1     Running             0          6m6s
web-1   1/1     Running             0          6m13s
web-2   0/1     ContainerCreating   0          1s
web-3   1/1     Running             0          12s
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS              RESTARTS   AGE
web-0   1/1     Running             0          6m14s
web-1   0/1     ContainerCreating   0          1s
web-2   1/1     Running             0          9s
web-3   1/1     Running             0          20s
[root@master01 ~]# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          1s
web-1   1/1     Running   0          8s
web-2   1/1     Running   0          16s
web-3   1/1     Running   0          27s
[root@master01 ~]# kubectl get sts -o wide     
NAME   READY   AGE   CONTAINERS   IMAGES
web    4/4     81m   nginx        nginx:1.14-alpine
[root@master01 ~]# 

  Use the partition field to control how many pods are updated, for a canary-style rollout

  By default the StatefulSet controller updates pods one at a time in reverse ordinal order: it deletes a pod, waits until the replacement has finished updating and is Running, then moves on to the next, until every pod has been updated. To get a canary-style rollout we need to tell the controller where the update should stop. With a Deployment we would pause the rollout with kubectl rollout pause; a StatefulSet instead controls the update boundary through the sts.spec.updateStrategy.rollingUpdate.partition field. The default partition is 0, which means pods with an ordinal greater than or equal to 0 are updated, i.e. all of them, as illustrated in the figure below.

  Note: when the image in a StatefulSet's pod template is updated, the partition field controls how far the update proceeds: partition=3 means only pods with an ordinal greater than or equal to 3 are updated, while pods with a lower ordinal are left alone; partition=0 means all pods are updated.

  Example: change the StatefulSet's partition field to 3 on the fly

  Note: to modify the StatefulSet in place, run kubectl edit with the resource type and instance name, find the partition field under updateStrategy.rollingUpdate in the editor, change 0 to 3, then save and exit for it to take effect. Alternatively, change the manifest and re-apply it; if the field is not defined in the manifest, simply add it.
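
  The same change can be made non-interactively with kubectl patch; a sketch using the spec path described above:

kubectl patch sts web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":3}}}}'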

  Update the pod image again and see whether only the pods with an ordinal greater than or equal to 3 are updated

[root@master01 ~]# kubectl get sts -o wide
NAME   READY   AGE    CONTAINERS   IMAGES
web    4/4     146m   nginx        nginx:1.14-alpine
[root@master01 ~]# kubectl set image sts web nginx=nginx:1.16-alpine
statefulset.apps/web image updated
[root@master01 ~]# kubectl get sts -o wide                          
NAME   READY   AGE    CONTAINERS   IMAGES
web    3/4     146m   nginx        nginx:1.16-alpine
[root@master01 ~]#

  Watch the update process

[root@master01 ~]# kubectl get pods -w
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          64m
web-1   1/1     Running   0          64m
web-2   1/1     Running   0          64m
web-3   1/1     Running   0          64m
web-3   1/1     Terminating   0          51s
web-3   0/1     Terminating   0          51s
web-3   0/1     Terminating   0          60s
web-3   0/1     Terminating   0          60s
web-3   0/1     Pending       0          0s
web-3   0/1     Pending       0          0s
web-3   0/1     ContainerCreating   0          0s
web-3   1/1     Running             0          1s
^C[root@master01 ~]# kubectl get pods 
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          65m
web-1   1/1     Running   0          65m
web-2   1/1     Running   0          65m
web-3   1/1     Running   0          50s
[root@master01 ~]# 

  Note: the update trace shows that this time only web-3 was updated; the pods with ordinals below 3 were not touched.

  Resume the full update

  Note: as the demonstration shows, once the partition field on the controller is changed from 3 back to 0, the update of the remaining pods starts immediately.
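
  Again this can be done with kubectl patch, after which the remaining pods can be watched rolling in reverse ordinal order (a sketch):

kubectl patch sts web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
kubectl get pods -w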

  That is the essence of using the StatefulSet controller. I used nginx above to demonstrate the operations, but in production we deploy genuinely stateful services, and we also have to work out how the service adapts to its cluster, how each pod joins the cluster, how scale-out and scale-in are handled, and a whole series of similar operational tasks, all of which need to be expressed in the pod template.

  Example: deploying a ZooKeeper cluster on Kubernetes with a StatefulSet

apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk   # must match the pod label app: zk, otherwise the anti-affinity rule matches no pods
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        image: gcr.io/google-containers/kubernetes-zookeeper:1.0-3.4.10
        resources:
          requests:
            memory: "1Gi"
            cpu: "0.5"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: data
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: gluster-dynamic
      resources:
        requests:
          storage: 5Gi

  Example: deploying an etcd cluster on Kubernetes with a StatefulSet

apiVersion: v1
kind: Service
metadata:
  name: etcd
  labels:
    app: etcd
  annotations:
    # Create endpoints also if the related pod isn't ready
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  clusterIP: None
  selector:
    app: etcd-member
---
apiVersion: v1
kind: Service
metadata:
  name: etcd-client
  labels:
    app: etcd
spec:
  ports:
  - name: etcd-client
    port: 2379
    protocol: TCP
    targetPort: 2379
  selector:
    app: etcd-member
  type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
  labels:
    app: etcd
spec:
  serviceName: etcd
  # changing replicas value will require a manual etcdctl member remove/add
  #   # command (remove before decreasing and add after increasing)
  replicas: 3
  selector:
    matchLabels:
      app: etcd-member
  template:
    metadata:
      name: etcd
      labels:
        app: etcd-member
    spec:
      containers:
      - name: etcd
        image: "quay.io/coreos/etcd:v3.2.16"
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        env:
        - name: CLUSTER_SIZE
          value: "3"
        - name: SET_NAME
          value: "etcd"
        volumeMounts:
        - name: data
          mountPath: /var/run/etcd
        command:
          - "/bin/sh"
          - "-ecx"
          - |
            IP=$(hostname -i)
            PEERS=""
            for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
                PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}:2380"
            done
            # start etcd. If cluster is already initialized the `--initial-*` options will be ignored.
            exec etcd --name ${HOSTNAME} \
              --listen-peer-urls http://${IP}:2380 \
              --listen-client-urls http://${IP}:2379,http://127.0.0.1:2379 \
              --advertise-client-urls http://${HOSTNAME}.${SET_NAME}:2379 \
              --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}:2380 \
              --initial-cluster-token etcd-cluster-1 \
              --initial-cluster ${PEERS} \
              --initial-cluster-state new \
              --data-dir /var/run/etcd/default.etcd
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: gluster-dynamic
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi

  Note: both examples above rely on a StorageClass (sc) object to provision PVs automatically and bind them to the PVCs; prepare the backing storage and create the sc object before running them. If you do not want dynamic PV provisioning, create the PVs first and then apply the manifests (when creating PVs by hand, remove the storageClassName field from the PVC template).
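
  For reference, the gluster-dynamic class referenced in both manifests would look roughly like this; a minimal sketch assuming a Heketi-managed GlusterFS backend (the resturl endpoint is a placeholder, not something from this article):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gluster-dynamic
provisioner: kubernetes.io/glusterfs             # in-tree GlusterFS provisioner
parameters:
  resturl: "http://heketi.example.com:8080"      # placeholder: address of your Heketi REST service
reclaimPolicy: Retain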

  Finally, a word about Kubernetes operators

  As the StatefulSet examples above show, the key to deploying a truly stateful service on Kubernetes is getting the pod template right: how the image joins the cluster, how pod scale-out and scale-in are handled, and so on. Because this logic is different for every service, running stateful services on Kubernetes can be quite laborious. CoreOS came up with a solution: it packaged most of the operational work of running a stateful application on Kubernetes into an SDK called Operator. Developers use this SDK to write a program implementing the operational tasks their particular service needs, and that program runs on Kubernetes. Each service then has its own dedicated operator, and a user who wants to run that service on Kubernetes only has to tell the operator to run it. Put simply, an operator is an all-round "ops engineer" for one specific stateful service: to create a cluster of that service you tell the "ops engineer" to create a cluster, and to scale the cluster's pods out or in you tell it to scale. What that "ops engineer" is capable of depends on the capabilities its developer built into the operator. So to run a stateful application on Kubernetes, we deploy the matching operator to the cluster and then write resource manifests against it. Where do we find operators? https://github.com/operator-framework/awesome-operators is a curated list linking to the sites of many services' operators; find the service you need, follow the link, and consult its documentation for deployment and usage.
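
  To give a flavour of what "telling the operator to run a service" looks like, here is a rough sketch of a custom resource for the CoreOS etcd operator (the field names follow that operator's EtcdCluster CRD as I recall them; check the operator's own documentation before use):

apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: example-etcd-cluster
spec:
  size: 3               # the operator creates and manages a three-member etcd cluster
  version: "3.2.13"     # etcd version the operator should run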
