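Background: I have recently been following Ma Ge's (馬哥) k8s tutorial. In the chapter "Container Resource Requests, Resource Limits, and HeapSter", the tutorial never got kubectl top or the Grafana graphs to display anything. Heapster will be deprecated in later versions, so there is no real need to dwell on it; I was simply curious. Below I describe the problems I ran into and how I solved them. The k8s version I installed is v1.13.3.
Check the version: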
[ryuser@cdh-master metrics]$ kubectl get nodes
NAME                    STATUS   ROLES    AGE   VERSION
cdh-master.rongyi.com   Ready    master   41d   v1.13.3
cdh-slave.rongyi.com    Ready    <none>   41d   v1.13.3
cdh-slave2.rongyi.com   Ready    <none>   39d   v1.13.3
[ryuser@cdh-master metrics]$ kubectl logs heapster-f64999bc-25tvv -n kube-system
I0326 06:23:03.317063 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
I0326 06:23:03.317170 1 heapster.go:79] Heapster version v1.5.4
I0326 06:23:03.317421 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 06:23:03.317437 1 configs.go:62] Using kubelet port 10255
I0326 06:23:03.341940 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc:8086 user:root db:k8s
I0326 06:23:03.341976 1 heapster.go:202] Starting with InfluxDB Sink
I0326 06:23:03.341985 1 heapster.go:202] Starting with Metric Sink
I0326 06:23:03.364225 1 heapster.go:112] Starting heapster on port 8082
E0326 06:24:05.006245 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:24:05.006326 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:24:05.006827 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
W0326 06:24:25.002576 1 manager.go:152] Failed to get all responses in time (got 0/3)
I0326 06:24:25.033246 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc:8086"
E0326 06:25:05.009902 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:25:05.010317 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:25:05.024937 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
W0326 06:25:25.002198 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 06:26:05.011184 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:26:05.014660 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:26:05.021066 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
[ryuser@cdh-master metrics]$ kubectl top pod
W0326 15:13:19.303263 20846 top_pod.go:259] Metrics not available for pod default/client, age: 980h4m21.303224766s
error: Metrics not available for pod default/client, age: 980h4m21.303224766s
[ryuser@cdh-master metrics]$ kubectl top node
error: metrics not available yet
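Solution: connections to kubelet port 10255 are refused, so have heapster use the kubelet HTTPS port 10250 instead.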
# Make the following changes in the heapster.yaml manifest
- --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
- --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
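To make the new --source value easier to read, here is a minimal Python sketch that splits it into its parts. The helper name parse_heapster_source is made up for illustration and is not part of Heapster. As far as I understand Heapster's source options, kubeletHttps=true makes it talk to the kubelet over HTTPS, kubeletPort=10250 selects the kubelet's secure port, and insecure=true skips certificate verification; treat that reading as an assumption and check it against the Heapster documentation.

```python
from urllib.parse import urlsplit, parse_qs


def parse_heapster_source(source: str) -> dict:
    """Split a Heapster --source value into type, API server URL and options.

    Hypothetical helper for illustration only; it is not part of Heapster.
    """
    # The value has the form "<type>:<url>?<options>".
    source_type, _, url = source.partition(":")
    parts = urlsplit(url)
    # parse_qs returns a list per key; each option appears once here,
    # so keep only the first value.
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "type": source_type,
        "api_server": f"{parts.scheme}://{parts.netloc}",
        "options": options,
    }


source = (
    "kubernetes:https://kubernetes.default"
    "?kubeletHttps=true&kubeletPort=10250&insecure=true"
)
parsed = parse_heapster_source(source)
print(parsed["type"])
print(parsed["api_server"])
for name, value in parsed["options"].items():
    print(f"{name} = {value}")
```

Printing the options one by one makes it easy to confirm that the port really changed from the default 10255 to 10250 before redeploying.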
Then delete heapster and recreate it:
kubectl delete -f heapster.yaml
kubectl apply -f heapster.yaml
The errors continue, now with a different message:
"403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
[ryuser@cdh-master metrics]$ kubectl logs -f heapster-5fcf457b-zq99c -n kube-system
I0326 07:36:23.229287 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
I0326 07:36:23.229348 1 heapster.go:79] Heapster version v1.5.4
I0326 07:36:23.229602 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 07:36:23.229618 1 configs.go:62] Using kubelet port 10250
I0326 07:36:23.334904 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc.cluster.local:8086 user:root db:k8s
I0326 07:36:23.334946 1 heapster.go:202] Starting with InfluxDB Sink
I0326 07:36:23.334955 1 heapster.go:202] Starting with Metric Sink
I0326 07:36:23.347573 1 heapster.go:112] Starting heapster on port 8082
E0326 07:37:05.028341 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:37:05.096629 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:37:05.157683 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:37:25.003226 1 manager.go:152] Failed to get all responses in time (got 0/3)
I0326 07:37:25.037245 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc.cluster.local:8086"
E0326 07:38:05.013221 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:38:05.019540 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:38:05.022849 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:38:25.003081 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 07:39:05.010246 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:39:05.019238 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:39:05.024794 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:39:25.004236 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 07:40:05.016757 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:40:05.020030 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:40:05.020763 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:40:25.002318 1 manager.go:152] Failed to get all responses in time (got 0/3)
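The 403 message already names exactly what is missing: which user, which verb, and which resource/subresource. As a rough sketch, the fields can be pulled out of a log line with a regular expression. The function name missing_permission and the rbac_resource key are my own illustrative names, and the pattern is written only against the message format seen in the log above.

```python
import re

# Pattern written against the kubelet "Forbidden (...)" text seen in the log;
# other Kubernetes versions may word the message differently.
FORBIDDEN_RE = re.compile(
    r"Forbidden \(user=(?P<user>[^,]+), verb=(?P<verb>[^,]+), "
    r"resource=(?P<resource>[^,]+), subresource=(?P<subresource>[^)]+)\)"
)


def missing_permission(log_line: str):
    """Return the fields of a kubelet 403 message, or None if there is none.

    Hypothetical helper for illustration only.
    """
    match = FORBIDDEN_RE.search(log_line)
    if match is None:
        return None
    info = match.groupdict()
    # RBAC rules name a subresource as "<resource>/<subresource>".
    info["rbac_resource"] = f'{info["resource"]}/{info["subresource"]}'
    return info


line = (
    'request failed - "403 Forbidden", response: "Forbidden '
    "(user=system:serviceaccount:kube-system:heapster, verb=create, "
    'resource=nodes, subresource=stats)"'
)
info = missing_permission(line)
print(info["user"])
print(info["verb"], "on", info["rbac_resource"])
```

Read this way, the message says that the heapster service account needs the create verb on nodes/stats, which is what the ClusterRole has to grant.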
Solution:
Inspecting the permissions of the ClusterRole system:heapster shows that it indeed has no create permission on the resource nodes/stats.
[ryuser@cdh-master metrics]$ kubectl describe clusterrole system:heapster
Name:         system:heapster
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"rbac.authorization.kubernetes.io/autoupdate"...
              rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources               Non-Resource URLs  Resource Names  Verbs
  ---------               -----------------  --------------  -----
  events                  []                 []              [get list watch]
  namespaces              []                 []              [get list watch]
  nodes/stats             []                 []              [get list watch]
  nodes                   []                 []              [get list watch]
  pods                    []                 []              [get list watch]
  deployments.extensions  []                 []              [get list watch]
Modify the permissions of the ClusterRole system:heapster.
Generate the manifest file:
kubectl get clusterrole system:heapster -o yaml > heapster_modify.yaml
Edit the file: add create to verbs and add nodes/stats to resources.
vim heapster_modify.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"rbac.authorization.kubernetes.io/autoupdate":"true"},"creationTimestamp":"2019-02-12T10:41:33Z","labels":{"kubernetes.io/bootstrapping":"rbac-defaults"},"name":"system:heapster","resourceVersion":"70","selfLink":"/apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Aheapster","uid":"c3bd303a-2eb2-11e9-9c98-005056be639a"},"rules":[{"apiGroups":[""],"resources":["events","namespaces","nodes","pods"],"verbs":["create","get","list","watch"]},{"apiGroups":["extensions"],"resources":["deployments"],"verbs":["get","list","watch"]}]}
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2019-02-12T10:41:33Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:heapster
  resourceVersion: "4109335"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Aheapster
  uid: c3bd303a-2eb2-11e9-9c98-005056be639a
rules:
- apiGroups:
  - ""
  resources:
  - events
  - namespaces
  - nodes
  - pods
  - nodes/stats # added
  verbs:
  - create # added
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - deployments
  verbs:
  - get
  - list
  - watch
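To see why this edit satisfies the 403 from the kubelet, here is a simplified Python sketch of how an RBAC rule list can be matched against a request. It only models apiGroups, resources, and verbs with a plain "*" wildcard; the real Kubernetes authorizer handles more cases (resourceNames, non-resource URLs, patterns such as nodes/*). The names rule_allows, role_allows, read_only_rules, and modified_rules are illustrative. read_only_rules is an assumed "before" state that mirrors the kubectl describe output (get, list, watch only), and modified_rules mirrors the rules section of the edited manifest.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """Simplified check of one RBAC rule (illustrative, not the real authorizer)."""

    def matches(allowed: list, wanted: str) -> bool:
        # "*" in a rule field matches anything in this simplified model.
        return "*" in allowed or wanted in allowed

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )


def role_allows(rules: list, api_group: str, resource: str, verb: str) -> bool:
    """A role allows a request if any single rule allows it."""
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


# Assumed "before" state: read-only verbs, as in the kubectl describe output.
read_only_rules = [
    {
        "apiGroups": [""],
        "resources": ["events", "namespaces", "nodes", "pods", "nodes/stats"],
        "verbs": ["get", "list", "watch"],
    },
]

# Mirrors the rules section of the edited heapster_modify.yaml.
modified_rules = [
    {
        "apiGroups": [""],
        "resources": ["events", "namespaces", "nodes", "pods", "nodes/stats"],
        "verbs": ["create", "get", "list", "watch"],
    },
    {
        "apiGroups": ["extensions"],
        "resources": ["deployments"],
        "verbs": ["get", "list", "watch"],
    },
]

# The kubelet rejected verb=create on resource=nodes, subresource=stats.
print(role_allows(read_only_rules, "", "nodes/stats", "create"))
print(role_allows(modified_rules, "", "nodes/stats", "create"))
```

Note that because create sits in the same rule as the other resources, this edit also grants create on events, namespaces, nodes, and pods. A separate rule containing only nodes/stats and create would be a narrower alternative.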
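After applying the modified ClusterRole to the cluster (for example with kubectl apply -f heapster_modify.yaml), delete heapster and redeploy it: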
kubectl delete -f heapster.yaml
kubectl apply -f heapster.yaml
The errors are finally gone.
[ryuser@cdh-master metrics]$ kubectl logs -f heapster-5fcf457b-vhrxf -n kube-system
I0326 07:47:00.574138 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
I0326 07:47:00.574204 1 heapster.go:79] Heapster version v1.5.4
I0326 07:47:00.574470 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 07:47:00.574487 1 configs.go:62] Using kubelet port 10250
I0326 07:47:00.639292 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc.cluster.local:8086 user:root db:k8s
I0326 07:47:00.639338 1 heapster.go:202] Starting with InfluxDB Sink
I0326 07:47:00.639354 1 heapster.go:202] Starting with Metric Sink
I0326 07:47:00.670576 1 heapster.go:112] Starting heapster on port 8082
I0326 07:48:05.366442 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc.cluster.local:8086"
kubectl top now returns data:
[ryuser@cdh-master metrics]$ kubectl top nodes
NAME                    CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
cdh-master.rongyi.com   158m         3%     2550Mi          69%
cdh-slave.rongyi.com    79m          1%     2386Mi          64%
cdh-slave2.rongyi.com   820m         41%    3136Mi          84%
[ryuser@cdh-master metrics]$ kubectl top pods
NAME                         CPU(cores)   MEMORY(bytes)
curl-66959f6557-bvn9r        0m           0Mi
dep-httpd-5b774f45df-vtv59   0m           21Mi
dep-httpd-5b774f45df-wd5kf   0m           15Mi
myapp-0                      0m           1Mi
myapp-1                      0m           3Mi
myapp-2                      0m           1Mi
myapp-3                      0m           1Mi
myapp-4                      0m           1Mi
pod-demo                     499m         138Mi
There was one more problem: the dashboards in Grafana showed no data. After the fixes above, the data now shows up.
Appendix: dashboard download links:
"Kubernetes Node Statistics" dashboard: https://grafana.com/dashboards/3646
"Kubernetes Pod Statistics" dashboard: https://grafana.com/dashboards/3649