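Introduction: Elastic scaling of containerized K8s applications on a hybrid cloud, in practice
The software environment required for this best practice is as follows:
Application environment:
① Container Service ACK, based on private cloud V3.10.0.
② Cloud Enterprise Network (CEN) on the public cloud.
③ The Auto Scaling (ESS) scaling group service on the public cloud.
Configuration prerequisites:
1) Use the private cloud's Container Service, or manually deploy Agile PaaS on ECS.
2) Provision a dedicated cloud line to connect the VPC that hosts Container Service with the VPC on the public cloud.
3) Activate the Auto Scaling (ESS) scaling group service on the public cloud.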
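In this practice, the K8s-based business cluster runs on the private cloud, and a test workload is put under load. The setup relies on three products and capabilities:
① Alibaba Cloud's Cloud Enterprise Network dedicated line connects the private cloud and the public cloud, so that the VPC networks on the two clouds can reach each other.
② The HPA capability of K8s (Kubernetes) scales containers horizontally.
③ The K8s Cluster Autoscaler, together with Alibaba Cloud Auto Scaling (ESS) scaling groups, scales nodes automatically.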
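HPA (Horizontal Pod Autoscaler) is a K8s resource object. It dynamically adjusts the number of pods in objects such as StatefulSets and Deployments according to metrics such as CPU and memory, so that the services running on them can adapt to changes in those metrics.
When a metric of the workload under test reaches its upper limit, HPA is triggered and automatically scales out the workload's pods. When the business cluster can no longer host more pods, the public cloud ESS service is triggered: it creates ECS instances on the public cloud and automatically adds them to the K8s cluster on the private cloud.
Figure 1: Architecture diagram
This example creates an nginx application that supports HPA. Once it is created, the application is scaled out horizontally when Pod CPU utilization exceeds the 20% threshold set in this example, and scaled in when utilization falls below 20%.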
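To make the 20% threshold concrete: the Kubernetes documentation gives the HPA scaling rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below applies that rule in plain shell. The function name `desired_replicas` and the sample numbers are illustrative assumptions, and the real controller additionally applies a tolerance and a scale-down stabilization window that are left out here.

```shell
# Sketch of the HPA replica calculation (tolerance and stabilization omitted).
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL% TARGET_UTIL% MIN MAX
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * util / target.
  want=$(( (current * util + target - 1) / target ))
  # Clamp the result to the minReplicas..maxReplicas range.
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

desired_replicas 1 50 20 1 10   # one pod at 50% against a 20% target
desired_replicas 4 10 20 1 10   # four pods at 10% against a 20% target
```

With a 20% target, one pod running at 50% yields 3 replicas, and four pods running at 10% yield 2.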
1) Create an nginx application. A request value must be set for the application; otherwise HPA does not take effect.
apiVersion: apps/v1beta2
kind: Deployment
# metadata (including the Deployment name) is not shown in the original;
# the name must match scaleTargetRef.name in the HPA.
spec:
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: hpa-test
    spec:
      dnsPolicy: ClusterFirst
      terminationGracePeriodSeconds: 30
      containers:
        - image: '192.168.**.***:5000/admin/hpa-example:v1'
          imagePullPolicy: IfNotPresent
          terminationMessagePolicy: File
          terminationMessagePath: /dev/termination-log
          name: hpa-test
          resources:
            requests:
              cpu:   # a request value must be set here
      securityContext: {}
      restartPolicy: Always
      schedulerName: default-scheduler
  replicas: 1
  selector:
    matchLabels:
      app: hpa-test
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  progressDeadlineSeconds: 600
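The request matters because a CPU utilization target is evaluated as a percentage of the requested CPU, so without a request there is nothing to divide by. The following is a minimal sketch of that arithmetic. The function name `cpu_utilization_percent` and the millicore figures are assumptions for illustration and are not taken from the cluster in this article.

```shell
# Utilization as HPA sees it: usage divided by the container's CPU request.
# Usage: cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
cpu_utilization_percent() {
  usage_m=$1; request_m=$2
  # Without a request the percentage cannot be computed at all.
  if [ -z "$request_m" ] || [ "$request_m" -eq 0 ]; then
    echo "unknown"
    return 0
  fi
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_percent 50 200   # 50m used out of a 200m request
cpu_utilization_percent 50 ""    # no request set
```

The first call prints 25, which would exceed a 20% target; the second prints unknown, which corresponds to the case where HPA has no valid metric to act on.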
2) Create the HPA.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2020-04-29T06:57:28Z","reason":"ScaleDownStabilized","message":"recent recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2020-04-29T06:57:28Z","reason":"ValidMetricFound","message":"the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2020-04-29T06:57:28Z","reason":"DesiredWithinRange","message":"the desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/currentmetrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
  creationTimestamp: 2020-04-29T06:57:13Z
  name: hpa-test
  namespace: default
  resourceVersion: "3092268"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/hpa01
  uid: a770ca26-89e6-11ea-a7d7-00163e0106e9
spec:
  maxReplicas:   # set the maximum number of pods
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: centos   # must match the name of the Deployment to scale
  targetCPUUtilizationPercentage:   # set the CPU threshold (20% in this example)
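Figure 2: Access settings
Setting resource requests (Request) correctly and sensibly is a prerequisite for elastic scaling. The node autoscaling component makes its decisions from how K8s scheduling has allocated resources, and the resources allocated on a node are computed from resource requests (Request).
When a Pod enters the Pending state because its resource request (Request) cannot be satisfied, the node autoscaling component calculates the number of nodes required, based on the instance specifications and constraints configured in the scaling group.
If the scaling conditions can be met, nodes from the scaling group are added to the cluster. Conversely, when a node belongs to the scaling group and the resource requests of the Pods on it fall below the threshold, the node autoscaling component removes that node.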
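The two decisions described above can be approximated with simple arithmetic. The sketch below is a deliberate simplification: it assumes identical pods, considers CPU only, and uses hypothetical function names (`nodes_needed`, `node_is_underutilized`) and sample figures. The real Cluster Autoscaler simulates scheduling against all constraints rather than dividing totals.

```shell
# Scale-out: how many new nodes are needed to place the pending pods?
# Usage: nodes_needed PENDING_PODS POD_REQUEST_MILLICORES NODE_ALLOCATABLE_MILLICORES
nodes_needed() {
  pending=$1; pod_request_m=$2; node_allocatable_m=$3
  per_node=$(( node_allocatable_m / pod_request_m ))
  if [ "$per_node" -eq 0 ]; then
    echo "pod does not fit on this node type" >&2
    return 1
  fi
  # Integer ceiling of pending / per_node.
  echo $(( (pending + per_node - 1) / per_node ))
}

# Scale-in: is the sum of requests on a node below the threshold?
# Usage: node_is_underutilized REQUESTED_MILLICORES ALLOCATABLE_MILLICORES THRESHOLD_PERCENT
node_is_underutilized() {
  requested_m=$1; allocatable_m=$2; threshold_percent=$3
  [ $(( requested_m * 100 )) -lt $(( allocatable_m * threshold_percent )) ]
}

nodes_needed 7 500 2000
if node_is_underutilized 300 2000 20; then echo "scale-down candidate"; fi
```

Seven pending pods requesting 500m each on nodes with 2000m allocatable need 2 nodes, and a node with 300m requested out of 2000m sits below a 20% threshold.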
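1) Create an ESS scaling group, and note its minimum and maximum instance counts.
Figure 3: Modifying the scaling group
2) Create a scaling configuration, and note its ID.
Figure 4: Scaling configuration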
#!/bin/sh
# Sync the clock, then download and run the script that joins this node to the K8s cluster.
yum install -y ntpdate && ntpdate -u ntp1.aliyun.com && \
  curl http://example.com/public/hybrid/attach_local_node_aliyun.sh | bash -s -- \
    --docker-version 17.06.2-ce-3 \
    --token 9s92co.y2gkocbumal4fz1z \
    --endpoint 192.168.**.***:6443 \
    --cluster-dns 10.254.**.** \
    --region cn-huhehaote
# Write the Docker daemon configuration: registry mirror and insecure private registry.
echo "{" > /etc/docker/daemon.json
echo "\"registry-mirrors\": [" >> /etc/docker/daemon.json
echo "\"https://registry-vpc.cn-huhehaote.aliyuncs.com\"" >> /etc/docker/daemon.json
echo "]," >> /etc/docker/daemon.json
echo "\"insecure-registries\": [\"https://192.168.**.***:5000\"]" >> /etc/docker/daemon.json
echo "}" >> /etc/docker/daemon.json
# Restart Docker so the new configuration takes effect.
systemctl restart docker
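kubectl apply -f ca.yml
Create the autoscaler from ca.yml with the command above. Before applying it, change the following settings to match your actual environment.
access-key-id: "TFRBSWlCSFJyeHd2QXZ6****"
access-key-secret: "bGIyQ3NuejFQOWM0WjFUNjR4WTVQZzVPRXND****"
region-id: "Y24taHVoZWhh****"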
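Values under the `data` field of a Kubernetes Secret must be base64-encoded, which is why the settings above are not plain text. The sketch below shows one way to produce such a value. The helper name `encode_secret_value` is an assumption for illustration, and the input is the region ID that appears in the node attach script, not a claim about the masked values.

```shell
# Base64-encode a value for use under a Secret's data field.
# printf '%s' avoids the trailing newline that echo would add to the encoded value.
encode_secret_value() {
  printf '%s' "$1" | base64
}

encode_secret_value "cn-huhehaote"
```

This prints Y24taHVoZWhhb3Rl. Encode the AccessKey ID and AccessKey secret in the same way before placing them in ca.yml.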
The content of ca.yml is as follows:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events","endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get","update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch","list","get","update"]
  - apiGroups: [""]
    resources: ["pods","services","replicationcontrollers","persistentvolumeclaims","persistentvolumes"]
    verbs: ["watch","list","get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets","daemonsets"]
    verbs: ["watch","list","get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch","list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["watch","list","get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["watch","list","get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create","list","watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete","get","update","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: v1
kind: Secret
metadata:
  name: cloud-config
  namespace: kube-system
type: Opaque
data:
  access-key-id: "TFRBSWlCSFJyeHd2********"
  access-key-secret: "bGIyQ3NuejFQOWM0WjFUNjR4WTVQZzVP*********"
  region-id: "Y24taHVoZW********"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      dnsConfig:
        nameservers:
          - 100.XXX.XXX.XXX
          - 100.XXX.XXX.XXX
      nodeSelector:
        ca-key: ca-value
      priorityClassName: system-cluster-critical
      serviceAccountName: admin
      containers:
        - image: 192.XXX.XXX.XXX:XX/admin/autoscaler:v1.3.1-7369cf1
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - '--v=5'
            - '--stderrthreshold=info'
            - '--cloud-provider=alicloud'
            - '--scan-interval=30s'
            - '--scale-down-delay-after-add=8m'
            - '--scale-down-delay-after-failure=1m'
            - '--scale-down-unready-time=1m'
            - '--ok-total-unready-count=1000'
            - '--max-empty-bulk-delete=50'
            - '--expander=least-waste'
            - '--leader-elect=false'
            - '--scale-down-unneeded-time=8m'
            - '--scale-down-utilization-threshold=0.2'
            - '--scale-down-gpu-utilization-threshold=0.3'
            - '--skip-nodes-with-local-storage=false'
            - '--nodes=0:5:asg-hp3fbu2zeu9bg3clraqj'
          imagePullPolicy: "Always"
          env:
            - name: ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: access-key-id
            - name: ACCESS_KEY_SECRET
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: access-key-secret
            - name: REGION_ID
              valueFrom:
                secretKeyRef:
                  name: cloud-config
                  key: region-id
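The `--nodes` argument in ca.yml packs three values into one string in the form `<min>:<max>:<scaling group ID>`, which is where the minimum and maximum instance counts noted for the ESS scaling group are used. The sketch below splits such a value and sanity-checks it before it goes into the manifest. The helper name `parse_nodes_flag` is hypothetical and not part of Cluster Autoscaler.

```shell
# Split a --nodes value of the form MIN:MAX:GROUP_ID and check that MIN <= MAX.
parse_nodes_flag() {
  spec=$1
  min=${spec%%:*}      # text before the first colon
  rest=${spec#*:}      # text after the first colon
  max=${rest%%:*}      # text before the second colon
  group=${rest#*:}     # everything after the second colon
  if [ "$min" -gt "$max" ]; then
    echo "invalid range: min $min is greater than max $max" >&2
    return 1
  fi
  echo "min=$min max=$max group=$group"
}

parse_nodes_flag "0:5:asg-hp3fbu2zeu9bg3clraqj"
```

For the value used in this article, the output is min=0 max=5 group=asg-hp3fbu2zeu9bg3clraqj, so the autoscaler may keep between 0 and 5 nodes in that scaling group.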
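Start a busybox image and run the following command inside the pod to access the service of the application above. You can start several such pods at the same time to increase the load.
while true; do wget -q -O- http://hpa-test/index.html; done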
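Before applying load
Figure 5: Before applying load
After applying load
When CPU usage reaches the threshold, horizontal scale-out of the pods is triggered.
Figure 6: After applying load (1)
Figure 7: After applying load (2)
When cluster resources are insufficient, the newly created pods stay in the Pending state. This triggers the Cluster Autoscaler, which automatically scales out the nodes.
Figure 8: Scaling activities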
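We are the SRE team of Alibaba Cloud Intelligence Global Technology Services. We aim to be an engineering team that is grounded in technology, oriented toward service, and focused on keeping business systems highly available. We provide professional, systematic SRE services that help customers use the cloud better and build more stable and reliable business systems on it. We hope to share more techniques that help enterprise customers move to the cloud, use it well, and run their cloud workloads more reliably. You can join the Alibaba Cloud SRE Technical Academy circle on DingTalk to talk with other cloud users about cloud platforms.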