Kubernetes Scheduling: Resource Quotas

Series index

When multiple users or teams share a Kubernetes cluster with a fixed number of nodes, there is a concern that one team or user could consume more than its fair share of resources. Resource quotas are a tool administrators can use to address this problem.

A resource quota, defined by a ResourceQuota object, provides overall constraints on resource consumption within a namespace. It can limit both the number of objects that may be created in the namespace and the total amount of compute resources that may be consumed there (as mentioned earlier, compute resources include CPU, memory, disk space, and so on).

Resource quotas work roughly as follows:

  • Different teams work in different namespaces. Kubernetes does not currently enforce this; it is entirely voluntary, but the Kubernetes team plans to enforce it through ACL authorization.

  • The administrator creates one ResourceQuota for each namespace.

  • Users create resources (pods, services, and so on) in a namespace, and the quota system tracks usage to ensure it does not exceed the hard limits defined in the ResourceQuota.

  • If creating or updating a resource violates a quota constraint, the request fails with HTTP status code 403 FORBIDDEN and a message explaining which constraint was violated.

  • If quota is enabled in a namespace for compute resources such as CPU and memory, users must specify requests or limits for those resources; otherwise the quota system may reject pod creation.

Examples of policies that can be implemented with namespace quotas:

  • In a cluster with 32 GiB of memory and 16 CPU cores, let team A use 20 GiB and 10 cores, let team B use 10 GiB and 4 cores, and hold the remaining 2 GiB and 2 cores in reserve for future allocation (see the sketch after this list).

  • Limit the test namespace to 1 core and 1 GiB, and let the production namespace use the remainder.
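
As a minimal sketch of the first policy, the administrator could create one ResourceQuota per team namespace. The namespace and object names (team-a, team-b, team-a-quota, team-b-quota) are illustrative:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a       # assumed namespace for team A
spec:
  hard:
    requests.cpu: "10"    # 10 of the 16 cores
    requests.memory: 20Gi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-b-quota
  namespace: team-b       # assumed namespace for team B
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 10Gi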

When the cluster's total capacity is less than the sum of all namespace quotas, resource contention can occur. In that case Kubernetes allocates on a first-come, first-served basis.

Neither resource contention nor changes to a quota affect resources that have already been created.

Enabling resource quotas

Resource quota support is enabled by default in many Kubernetes distributions. It is enabled when ResourceQuota is one of the values of the apiserver's --enable-admission-plugins= flag.
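
For example, on a cluster where the API server command line can be edited directly, the flag might look like the following (the other plugin names are illustrative; keep whatever your distribution already enables):

kube-apiserver --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ResourceQuota ...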

A quota takes effect in a namespace once that namespace contains a ResourceQuota object.

Compute resource quota

You can limit the total amount of compute resources that can be requested in a namespace.

Kubernetes supports the following resource types:

  • cpu: Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
  • limits.cpu: Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value.
  • limits.memory: Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value.
  • memory: Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.
  • requests.cpu: Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
  • requests.memory: Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.

Resource quota for extended resources

In addition to the resources above, Kubernetes 1.10 added quota support for extended resources.
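
For example, to cap the number of nvidia.com/gpu devices that can be requested in a namespace, a quota like the following could be used (the object name gpu-quota is illustrative; for extended resources, quota is expressed through the requests. prefix):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
spec:
  hard:
    requests.nvidia.com/gpu: 4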

Storage resource quota

You can limit the total amount of storage that can be requested in a namespace.

In addition, you can limit storage consumption based on the associated storage class.

  • requests.storage: Across all persistent volume claims, the sum of storage requests cannot exceed this value.
  • persistentvolumeclaims: The total number of persistent volume claims that can exist in the namespace.
  • <storage-class-name>.storageclass.storage.k8s.io/requests.storage: Across all persistent volume claims associated with the storage class, the sum of storage requests cannot exceed this value.
  • <storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims: The total number of persistent volume claims associated with the storage class that can exist in the namespace.

For example, if an operator wants to quota storage for the gold and bronze storage classes separately, the operator can define the quota as follows:

gold.storageclass.storage.k8s.io/requests.storage: 500Gi
bronze.storageclass.storage.k8s.io/requests.storage: 100Gi
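
Wrapped in a complete ResourceQuota object (the name storage-quota is illustrative), this would look roughly like:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
spec:
  hard:
    gold.storageclass.storage.k8s.io/requests.storage: 500Gi
    bronze.storageclass.storage.k8s.io/requests.storage: 100Gi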

In version 1.8, quota support for local ephemeral storage was added as an alpha feature:

  • requests.ephemeral-storage: Across all pods in the namespace, the sum of local ephemeral storage requests cannot exceed this value.
  • limits.ephemeral-storage: Across all pods in the namespace, the sum of local ephemeral storage limits cannot exceed this value.

Object count quota

Version 1.9 added support for quotas on all standard namespaced resource types, using the following syntax:

count/<resource>.<group>

Here are examples of resources that users may want to put under object count quota:

  • count/persistentvolumeclaims

  • count/services

  • count/secrets

  • count/configmaps

  • count/replicationcontrollers

  • count/deployments.apps

  • count/replicasets.apps

  • count/statefulsets.apps

  • count/jobs.batch

  • count/cronjobs.batch

  • count/deployments.extensions

When a count/* quota is used, every object of that kind that exists on the server counts against the quota. This helps protect against exhaustion of the server's storage. For example, you may want to limit the number of secrets in a namespace if they are large; too many secrets can even prevent the server from starting. You may also want to limit the number of jobs, so that a poorly designed cron job cannot create so many jobs that it causes a denial of service.
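
A rough one-line sketch of such a quota with kubectl (the quota name and the limits are illustrative):

kubectl create quota object-quota --hard=count/secrets=10,count/jobs.batch=20 --namespace=myspace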

Quota is supported for the following resource types:

  • configmaps: The total number of config maps that can exist in the namespace.
  • persistentvolumeclaims: The total number of persistent volume claims that can exist in the namespace.
  • pods: The total number of pods in a non-terminal state that can exist in the namespace. A pod is in a terminal state if .status.phase in (Failed, Succeeded) is true.
  • replicationcontrollers: The total number of replication controllers that can exist in the namespace.
  • resourcequotas: The total number of resource quotas that can exist in the namespace.
  • services: The total number of services that can exist in the namespace.
  • services.loadbalancers: The total number of services of type load balancer that can exist in the namespace.
  • services.nodeports: The total number of services of type node port that can exist in the namespace.
  • secrets: The total number of secrets that can exist in the namespace.

For example, a pods quota limits the total number of pods in a non-terminal state in a namespace. This prevents a user from creating so many small pods that the cluster's supply of pod IPs is exhausted.

Quota scopes

Each quota can have an associated set of scopes. A quota only measures usage for a resource if the resource matches the intersection of the enumerated scopes.

When a scope is added to a quota, it limits the quota to the resources that pertain to that scope. Specifying a resource outside the allowed set results in a validation error.

  • Terminating: Match pods where .spec.activeDeadlineSeconds >= 0
  • NotTerminating: Match pods where .spec.activeDeadlineSeconds is nil
  • BestEffort: Match pods that have best effort quality of service.
  • NotBestEffort: Match pods that do not have best effort quality of service.

The BestEffort scope restricts a quota to tracking only the pods resource (a sketch follows the list below).

The Terminating, NotTerminating, and NotBestEffort scopes restrict a quota to tracking the following resources:

  • cpu

  • limits.cpu

  • limits.memory

  • memory

  • pods

  • requests.cpu

  • requests.memory
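
As a sketch, a quota that counts only BestEffort pods (the object name besteffort-quota is illustrative) could be written as:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-quota
spec:
  hard:
    pods: "5"     # only BestEffort pods count against this limit
  scopes:
  - BestEffort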

Resource quota per PriorityClass

This feature is beta as of version 1.12.

Pods can be created at a specific priority. You can control a pod's consumption of system resources based on its priority through the scopeSelector field in the quota's spec.

A quota is matched and consumed only if scopeSelector in the quota spec selects the pod.

Before using quotas scoped by PriorityClass, you need to enable the ResourceQuotaScopeSelectors feature gate.
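
On clusters where the API server flags can be edited directly, this is done roughly as follows:

kube-apiserver --feature-gates=ResourceQuotaScopeSelectors=true ...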

The following example creates quota objects that are matched by pods at specific priorities:

  • Pods in the cluster have one of three priority classes: low, medium, or high.

  • One quota object is created for each priority class.

apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-high
  spec:
    hard:
      cpu: "1000"
      memory: 200Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["high"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-medium
  spec:
    hard:
      cpu: "10"
      memory: 20Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["medium"]
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: pods-low
  spec:
    hard:
      cpu: "5"
      memory: 10Gi
      pods: "10"
    scopeSelector:
      matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values: ["low"]

Apply the YAML above with kubectl create:

kubectl create -f ./quota.yml
resourcequota/pods-high created
resourcequota/pods-medium created
resourcequota/pods-low created

Use kubectl describe quota to inspect the result:

kubectl describe quota
Name:       pods-high
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     1k
memory      0     200Gi
pods        0     10


Name:       pods-low
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     5
memory      0     10Gi
pods        0     10


Name:       pods-medium
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     10
memory      0     20Gi
pods        0     10

Now create a pod with priority high. Save the following as high-priority-pod.yml:

apiVersion: v1
kind: Pod
metadata:
  name: high-priority
spec:
  containers:
  - name: high-priority
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10;done"]
    resources:
      requests:
        memory: "10Gi"
        cpu: "500m"
      limits:
        memory: "10Gi"
        cpu: "500m"
  priorityClassName: high

Apply it with kubectl create:

kubectl create -f ./high-priority-pod.yml

Running kubectl describe quota again shows that the high-priority quota is being consumed:

Name:       pods-high
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         500m  1k
memory      10Gi  200Gi
pods        1     10


Name:       pods-low
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     5
memory      0     10Gi
pods        0     10


Name:       pods-medium
Namespace:  default
Resource    Used  Hard
--------    ----  ----
cpu         0     10
memory      0     20Gi
pods        0     10

scopeSelector supports the following values for the operator field:

  • In

  • NotIn

  • Exists

  • DoesNotExist

Requests vs. limits in quota

When allocating compute resources, each container may specify a request and a limit for CPU or memory. A quota can be configured to track either value.

In other words, each quota item tracks either requests or limits, not both at once.

If the quota specifies requests.cpu or requests.memory, it requires every incoming container to make an explicit request for those resources. If the quota specifies limits.cpu or limits.memory, it requires every incoming container to specify an explicit limit for those resources.

Viewing and setting quotas

kubectl supports creating, updating, and viewing quotas:

kubectl create namespace myspace
cat <<EOF > compute-resources.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4"
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    requests.nvidia.com/gpu: 4
EOF
kubectl create -f ./compute-resources.yaml --namespace=myspace
cat <<EOF > object-counts.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"
    replicationcontrollers: "20"
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
EOF
kubectl create -f ./object-counts.yaml --namespace=myspace
kubectl get quota --namespace=myspace
NAME                    AGE
compute-resources       30s
object-counts           32s
kubectl describe quota compute-resources --namespace=myspace
Name:                    compute-resources
Namespace:               myspace
Resource                 Used  Hard
--------                 ----  ----
limits.cpu               0     2
limits.memory            0     2Gi
pods                     0     4
requests.cpu             0     1
requests.memory          0     1Gi
requests.nvidia.com/gpu  0     4
kubectl describe quota object-counts --namespace=myspace
Name:                   object-counts
Namespace:              myspace
Resource                Used    Hard
--------                ----    ----
configmaps              0       10
persistentvolumeclaims  0       4
replicationcontrollers  0       20
secrets                 1       10
services                0       10
services.loadbalancers  0       2

kubectl also supports object count quotas for all standard namespaced resources, using the count/<resource>.<group> syntax:

kubectl create namespace myspace
kubectl create quota test --hard=count/deployments.extensions=2,count/replicasets.extensions=4,count/pods=3,count/secrets=4 --namespace=myspace
kubectl run nginx --image=nginx --replicas=2 --namespace=myspace
kubectl describe quota --namespace=myspace
Name:                         test
Namespace:                    myspace
Resource                      Used  Hard
--------                      ----  ----
count/deployments.extensions  1     2
count/pods                    2     3
count/replicasets.extensions  1     4
count/secrets                 1     4

Quota and cluster capacity

ResourceQuotas are independent of cluster capacity; they are expressed in absolute units. Therefore, if you add nodes to the cluster, this does not automatically give each namespace the ability to consume more resources.

Sometimes more complex policies are needed, for example:

  • Proportionally divide the cluster's total resources among several teams.

  • Allow each tenant to grow its resource usage on demand, but impose an overall cap to keep the cluster's resources from being exhausted.

  • Detect demand in one namespace, add nodes, and increase its quota accordingly.

Such policies can be implemented with ResourceQuota as a building block, by writing a controller that watches quota usage and adjusts each namespace's quota hard limits according to other signals.

Limiting default consumption of priority classes

Sometimes we may want pods at a particular priority, such as cluster-services, to be allowed in a namespace if and only if a matching quota object exists there.

With this mechanism, operators can restrict certain high-priority classes to a limited number of namespaces, so that not every namespace can consume them by default.

To make this take effect, the kube-apiserver flag --admission-control-config-file should be given the path of the following configuration file:

apiVersion: apiserver.k8s.io/v1alpha1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: resourcequota.admission.k8s.io/v1beta1
    kind: Configuration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: PriorityClass 
        operator: In
        values: ["cluster-services"]

Now pods with priority class cluster-services are allowed only in namespaces that contain a quota object with a matching scopeSelector, for example:

scopeSelector:
  matchExpressions:
  - scopeName: PriorityClass
    operator: In
    values: ["cluster-services"]
