As mentioned in the metrics-server article above, Kubernetes monitoring metrics fall into two categories: core metrics and custom metrics.
Core metrics only cover node and pod CPU, memory, and so on. For most workloads they are enough to drive the HPA, but if you want to scale on custom metrics such as request QPS or the number of 5xx errors, you need the custom metrics pipeline. In Kubernetes, custom metrics are usually collected by Prometheus and then aggregated into the apiserver by k8s-prometheus-adapter, giving them the same standing as the core metrics served by metrics-server.
The official metrics API projects are:
Resource Metrics API (core API)
Custom Metrics API
Prometheus can collect all kinds of additional metrics, but the metrics it scrapes cannot be consumed by Kubernetes directly because the two data formats are incompatible. Another component, k8s-prometheus-adapter, is therefore needed to convert Prometheus metrics into a format the Kubernetes API understands. Since this is a custom API, it also has to be registered with the main API server through the Kubernetes aggregator so that it can be reached directly under /apis/.
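Registration with the aggregation layer is done through an APIService object. Below is a minimal sketch, assuming the adapter is exposed by a Service named custom-metrics-apiserver in the custom-metrics namespace (the official deploy manifests typically include an equivalent object; adjust the names to your setup):

# Register the adapter's API with the aggregation layer so it appears under /apis/
kubectl apply -f - <<EOF
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  service:
    name: custom-metrics-apiserver   # assumed Service name for the adapter
    namespace: custom-metrics
  insecureSkipTLSVerify: true        # for production, supply caBundle instead
  groupPriorityMinimum: 100
  versionPriority: 100
EOF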
Manifest list:
Deployment manifests for k8s-prometheus-adapter:
These manifests create a Secret named cm-adapter-serving-certs containing two values, serving.crt and serving.key; this is the serving certificate trusted by the apiserver. The gencerts.sh and deploy.sh scripts in the kube-prometheus project can create this Secret.
All of the resources, including the Secret, live in the custom-metrics namespace, so you need to run kubectl create namespace custom-metrics first. A manual sketch of the cert and Secret creation is shown below.
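If you prefer not to use the scripts, the following minimal sketch generates a self-signed serving certificate with openssl and stores it in the Secret the adapter expects. The CN assumes the adapter's Service is called custom-metrics-apiserver; the official scripts do roughly the same with their own CA handling.

# Namespace that holds all the adapter resources
kubectl create namespace custom-metrics

# Self-signed serving certificate (CN should match the adapter's in-cluster DNS name)
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
  -keyout serving.key -out serving.crt \
  -subj "/CN=custom-metrics-apiserver.custom-metrics.svc"

# Secret consumed by the adapter deployment
kubectl -n custom-metrics create secret generic cm-adapter-serving-certs \
  --from-file=serving.crt=serving.crt \
  --from-file=serving.key=serving.key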
Once all of the above components are deployed successfully, metrics can be fetched through the custom metrics API URL.
With Prometheus scraping them, the pods expose some custom metrics, for example an http_requests counter; the queries below show how to read it.
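A quick way to verify this from the command line, assuming the sample app runs in the default namespace (adjust the namespace and metric name to your environment):

# List every custom metric the adapter currently exposes
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Read the http_requests metric for all pods in the default namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .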
Create an HPA that scales out automatically when the request rate exceeds 10 per second:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 10
Check the HPA (899m in the TARGETS column is a milli-value, i.e. the current average is about 0.9 requests per second against a target of 10):
$ kubectl get hpa
NAME      REFERENCE            TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
podinfo   Deployment/podinfo   899m / 10   2         10        2          1m
Put load on the pods:
# install hey
$ go get -u github.com/rakyll/hey

# do 10K requests rate limited at 25 QPS
$ hey -n 10000 -q 5 -c 5 http://PODINFO_SVC_IP:9898/healthz
The HPA kicks in:
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 3; reason: pods metric http_requests above target
  Normal  SuccessfulRescale  21s   horizontal-pod-autoscaler  New size: 2; reason: All metrics below target
In fact, k8s-prometheus-adapter serves both custom metrics and core metrics. In other words, if Prometheus is installed and the relevant metrics are collected completely, k8s-prometheus-adapter can replace metrics-server.
In clusters running Kubernetes 1.6 or later, k8s-prometheus-adapter can back the autoscaling/v2 HPA API.
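If the adapter is also configured to serve the resource metrics API, a quick sanity check is to query it the same way metrics-server would be queried:

# Resource Metrics API, normally served by metrics-server
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .

# kubectl top reads from the same API
kubectl top nodes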
Because it is normally deployed inside the cluster, k8s-prometheus-adapter uses in-cluster authentication by default. Its main parameters are illustrated in the sketch below.
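As a rough illustration, the container is typically started with flags like the following (flag names follow the upstream k8s-prometheus-adapter; the Prometheus URL and file paths are assumptions that depend on your deployment):

# Typical adapter startup flags (URL and paths below are assumptions)
/adapter \
  --secure-port=6443 \
  --tls-cert-file=/var/run/serving-cert/serving.crt \
  --tls-private-key-file=/var/run/serving-cert/serving.key \
  --prometheus-url=http://prometheus.monitoring.svc:9090/ \
  --metrics-relist-interval=1m \
  --config=/etc/adapter/config.yaml \
  --v=4

# --prometheus-url            address the adapter queries for metric data
# --metrics-relist-interval   how often the adapter rediscovers available series
# --config                    rules for discovering, naming and querying metrics (example below)
# --tls-cert-file / --tls-private-key-file   serving cert, mounted from cm-adapter-serving-certs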
An example of the config file's contents (from the reference docs):
rules:
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
  seriesFilters: []
  resources:
    overrides:
      namespace:
        resource: namespace
      pod_name:
        resource: pod
  name:
    matches: ^container_(.*)_seconds_total$
    as: ""
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
  seriesFilters:
  - isNot: ^container_.*_seconds_total$
  resources:
    overrides:
      namespace:
        resource: namespace
      pod_name:
        resource: pod
  name:
    matches: ^container_(.*)_total$
    as: ""
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
- seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
  seriesFilters:
  - isNot: ^container_.*_total$
  resources:
    overrides:
      namespace:
        resource: namespace
      pod_name:
        resource: pod
  name:
    matches: ^container_(.*)$
    as: ""
  metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}) by (<<.GroupBy>>)
Why can't I see my custom metric?
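A few checks usually narrow this down, assuming the adapter runs in the custom-metrics namespace under a Deployment named custom-metrics-apiserver (adjust the names to your setup): verify the APIService is registered and Available, confirm the series actually matches a rule in the config file above, and read the adapter's logs.

# Is the custom metrics API registered and marked Available?
kubectl get apiservice v1beta1.custom.metrics.k8s.io

# Does the adapter expose the metric at all?
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq '.resources[].name'

# What is the adapter logging? (deployment name assumed)
kubectl -n custom-metrics logs deploy/custom-metrics-apiserver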
Using k8s-prometheus-adapter as a reference, you can implement your own adapter, for example one that pulls metrics from an existing monitoring system and aggregates them into the apiserver. The implementation details of k8s-prometheus-adapter will be covered in a dedicated follow-up article.
This article is part of the container monitoring practice series; the full content is available at container-monitor-book.