Heapster is the default monitoring system for Kubernetes. It collects container metrics through cAdvisor, which is built into the kubelet.
The diagram above shows the Heapster monitoring pipeline: the collected metrics are stored in InfluxDB, which accepts data over a REST API, so anyone familiar with OpenTSDB will find it easy to pick up.
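For instance, assuming an InfluxDB instance reachable at localhost:8086 and a hypothetical database named mydb, a point can be written and read back purely over HTTP:

# create a database (the name "mydb" is only an example)
curl -XPOST 'http://localhost:8086/query' --data-urlencode 'q=CREATE DATABASE mydb'

# write one point in line protocol
curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_usage,host=node1 value=0.42'

# query it back
curl -G 'http://localhost:8086/query' --data-urlencode 'db=mydb' --data-urlencode 'q=SELECT * FROM cpu_usage'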
The whole installation can be done with YAML files:
heapster-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: gcr.io/google_containers/heapster-amd64:v1.3.0-beta.1
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb:8086
Here --source tells Heapster to discover its monitoring targets from the Kubernetes API, while --sink specifies where the data is stored; metrics are written through the InfluxDB API. The serviceAccountName above is there for RBAC, which Kubernetes 1.6 introduced as the default authorization mode.
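Once the Deployment is running, a quick way to confirm that both the source and the sink are working is to check the Heapster pod's logs; this is only a sketch, and the generated pod name will differ:

# locate the heapster pod by its label, then inspect the logs for source/sink errors
kubectl get pods -n kube-system -l k8s-app=heapster
kubectl logs -n kube-system heapster-3856142156-abcde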
heapster-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: heapster
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:heapster
  apiGroup: rbac.authorization.k8s.io
This binds the built-in system:heapster ClusterRole to the heapster ServiceAccount.
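To verify the binding, the object can be inspected directly, and with a recent enough kubectl the ServiceAccount's permissions can be probed via impersonation (a verification sketch, not part of the original setup):

# show the binding and the ClusterRole it references
kubectl describe clusterrolebinding heapster

# ask the API server whether the heapster ServiceAccount may list pods
kubectl auth can-i list pods --as=system:serviceaccount:kube-system:heapster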
heapster-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster
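With the three Heapster manifests written, creating and checking them looks roughly like this (file names as above):

kubectl create -f heapster-rbac.yaml
kubectl create -f heapster-deployment.yaml
kubectl create -f heapster-service.yaml

# confirm the pod and service exist in kube-system
kubectl get pods -n kube-system -l k8s-app=heapster
kubectl get svc heapster -n kube-system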
Because the dashboard needs to access Heapster, a Service is configured for it here. Next comes the InfluxDB database: first define its configuration file, which is mounted into the container through a ConfigMap.
influxdb-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: influxdb-config
  namespace: kube-system
data:
  config.toml: |
    reporting-disabled = true
    bind-address = ":8088"
    [meta]
      dir = "/data/meta"
      retention-autocreate = true
      logging-enabled = true
    [data]
      dir = "/data/data"
      wal-dir = "/data/wal"
      query-log-enabled = true
      cache-max-memory-size = 1073741824
      cache-snapshot-memory-size = 26214400
      cache-snapshot-write-cold-duration = "10m0s"
      compact-full-write-cold-duration = "4h0m0s"
      max-series-per-database = 1000000
      max-values-per-tag = 100000
      trace-logging-enabled = false
    [coordinator]
      write-timeout = "10s"
      max-concurrent-queries = 0
      query-timeout = "0s"
      log-queries-after = "0s"
      max-select-point = 0
      max-select-series = 0
      max-select-buckets = 0
    [retention]
      enabled = true
      check-interval = "30m0s"
    [admin]
      enabled = true
      bind-address = ":8083"
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
    [shard-precreation]
      enabled = true
      check-interval = "10m0s"
      advance-period = "30m0s"
    [monitor]
      store-enabled = true
      store-database = "_internal"
      store-interval = "10s"
    [subscriber]
      enabled = true
      http-timeout = "30s"
      insecure-skip-verify = false
      ca-certs = ""
      write-concurrency = 40
      write-buffer-size = 1000
    [http]
      enabled = true
      bind-address = ":8086"
      auth-enabled = false
      log-enabled = true
      write-tracing = false
      pprof-enabled = false
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
      https-private-key = ""
      max-row-limit = 10000
      max-connection-limit = 0
      shared-secret = ""
      realm = "InfluxDB"
      unix-socket-enabled = false
      bind-socket = "/var/run/influxdb.sock"
    [[graphite]]
      enabled = false
      bind-address = ":2003"
      database = "graphite"
      retention-policy = ""
      protocol = "tcp"
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "1s"
      consistency-level = "one"
      separator = "."
      udp-read-buffer = 0
    [[collectd]]
      enabled = false
      bind-address = ":25826"
      database = "collectd"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "10s"
      read-buffer = 0
      typesdb = "/usr/share/collectd/types.db"
    [[opentsdb]]
      enabled = false
      bind-address = ":4242"
      database = "opentsdb"
      retention-policy = ""
      consistency-level = "one"
      tls-enabled = false
      certificate = "/etc/ssl/influxdb.pem"
      batch-size = 1000
      batch-pending = 5
      batch-timeout = "1s"
      log-point-errors = true
    [[udp]]
      enabled = false
      bind-address = ":8089"
      database = "udp"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      read-buffer = 0
      batch-timeout = "1s"
      precision = ""
    [continuous_queries]
      log-enabled = true
      enabled = true
      run-interval = "1s"
influxdb-deployment.yaml
This Deployment uses the configuration file defined above:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
        - mountPath: /etc/
          name: influxdb-config
      volumes:
      - name: influxdb-storage
        emptyDir: {}
      - name: influxdb-config
        configMap:
          name: influxdb-config
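As a sanity check that the ConfigMap really lands inside the container, the mounted file can be read back once the pod is running (the pod name below is illustrative):

# the config.toml key is mounted under /etc/ per the volumeMounts above
kubectl exec -n kube-system monitoring-influxdb-1957622127-xxxxx -- cat /etc/config.toml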
Create the InfluxDB Service:
influxdb-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    name: http
  - port: 8083
    targetPort: 8083
    name: admin
  selector:
    k8s-app: influxdb
The YAML files for the web UI follow below; if access is only needed from inside the cluster, the NodePort type above can be dropped.
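Because port 8086 is the InfluxDB HTTP API, the database itself can be queried to confirm that Heapster is writing data; the ClusterIP below is only an example:

# list databases; Heapster's InfluxDB sink creates one (named "k8s" by default)
curl -G 'http://10.254.0.100:8086/query' --data-urlencode 'q=SHOW DATABASES'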
Test it:
curl http://10.254.101.26/api/v1/model/namespaces/default/pod-list/busybox,busybox1/metrics/cpu/usage_rate
{"items":[{"metrics":[{"timestamp":"2017-05-05T01:36:00Z","value":0},{"timestamp":"2017-05-05T01:37:00Z","value":0},{"timestamp":"2017-05-05T01:38:00Z","value":0},{"timestamp":"2017-05-05T01:39:00Z","value":0},{"timestamp":"2017-05-05T01:40:00Z","value":0},{"timestamp":"2017-05-05T01:41:00Z","value":0},{"timestamp":"2017-05-05T01:42:00Z","value":0},{"timestamp":"2017-05-05T01:43:00Z","value":0},{"timestamp":"2017-05-05T01:44:00Z","value":0},{"timestamp":"2017-05-05T01:45:00Z","value":0},{"timestamp":"2017-05-05T01:46:00Z","value":0},{"timestamp":"2017-05-05T01:47:00Z","value":0},{"timestamp":"2017-05-05T01:48:00Z","value":0},{"timestamp":"2017-05-05T01:49:00Z","value":0},{"timestamp":"2017-05-05T01:50:00Z","value":0}],"latestTimestamp":"2017-05-05T01:50:00Z"},{"metrics":[],"latestTimestamp":"0001-01-01T00:00:00Z"}]}
Monitoring data can be retrieved directly from the Heapster service address.
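The Heapster model API exposes other paths as well; for example (same service address as above, node name is illustrative):

# list the cluster-level metric names Heapster knows about
curl http://10.254.101.26/api/v1/model/metrics/

# CPU usage rate of a single node
curl http://10.254.101.26/api/v1/model/nodes/node1/metrics/cpu/usage_rate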
The dashboard is also installed from YAML files. Since it calls the Kubernetes API and therefore needs permissions, authorization comes first here as well.
dashboard-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: dashboard
subjects:
- kind: ServiceAccount
  name: dashboard
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
This grants the ServiceAccount cluster-admin, the highest level of access.
dashboard-controller.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: dashboard
      containers:
      - name: kubernetes-dashboard
        image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
        imagePullPolicy: IfNotPresent
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
The Service for external access:
dashboard-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090
NodePort is used here so the dashboard can be reached from outside the cluster. With that, the dashboard is ready to be accessed.
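The dashboard objects are created the same way; afterwards, look up which NodePort the Service was assigned, as shown below:

kubectl create -f dashboard-rbac.yaml
kubectl create -f dashboard-controller.yaml
kubectl create -f dashboard-service.yaml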
kubectl get svc --namespace=kube-system
NAME                   CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.254.244.7   <nodes>       80:31508/TCP   1d
The dashboard can then be reached through any worker node's IP on port 31508.
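For example, assuming one of the nodes has IP 192.168.1.10 (illustrative), the dashboard answers directly on that NodePort:

# the dashboard should return its web UI index page
curl http://192.168.1.10:31508/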