Kubernetes in Practice (20): One-Click Deployment of Highly Available Prometheus on k8s with Email Alerting

1. Basic Concepts

  This deployment uses CoreOS's prometheus-operator.

  It includes monitoring of the etcd cluster.

  It works for clusters installed either from binaries or with kubeadm.

  It targets k8s v1.10 and later; test other versions yourself.

  Project: https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus

  Installing with Helm: https://github.com/helm/charts/tree/master/stable/prometheus-operator

 

2. Installation

  Download the installation files:

[root@k8s-master01 ~]# git clone https://github.com/dotbalo/k8s.git
Cloning into 'k8s'...
remote: Enumerating objects: 373, done.
remote: Counting objects: 100% (373/373), done.
remote: Compressing objects: 100% (264/264), done.
remote: Total 373 (delta 127), reused 349 (delta 103), pack-reused 0
Receiving objects: 100% (373/373), 4.92 MiB | 553.00 KiB/s, done.
Resolving deltas: 100% (127/127), done.

[root@k8s-master01 prometheus-operator]# ls
alertmanager-config.yam.bak  bundle.yaml  mail-template.tmpl  README.md
alertmanager.yaml            deploy       manifests           teardown

 

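  Adjust the configuration:

  1) Edit the etcd certificate files referenced in the deploy script (not needed for kubeadm installs).

  2) Edit tlsConfig (not needed for kubeadm installs) and addresses (the etcd addresses) in manifests/prometheus/prometheus-etcd.yaml.

  3) Edit the email alerting settings and the recipient settings in alertmanager.yaml.

  One-click install (note: if the cluster was installed from binaries, registration during the first install can take a very long time; kubeadm installs are considerably faster):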
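  For step 3, the mail and recipient settings of an Alertmanager configuration usually look roughly like the sketch below. The repository's actual alertmanager.yaml is not shown here, so the example is written to a scratch file instead of overwriting it. The SMTP host, addresses, password, and the receiver name `ops-mail` are all placeholder assumptions.

```shell
# Write an illustrative Alertmanager config to a scratch file.
# Every host, address, credential, and receiver name is a placeholder assumption.
cat > /tmp/alertmanager-example.yaml <<'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'   # assumed SMTP server and port
  smtp_from: 'alerts@example.com'          # assumed sender address
  smtp_auth_username: 'alerts@example.com'
  smtp_auth_password: 'CHANGE_ME'
route:
  receiver: 'ops-mail'          # hypothetical receiver name
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
receivers:
- name: 'ops-mail'
  email_configs:
  - to: 'oncall@example.com'    # assumed recipient
    send_resolved: true
EOF
```

  If amtool is installed, `amtool check-config /tmp/alertmanager-example.yaml` can validate the syntax before the values are copied into alertmanager.yaml.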

[root@k8s-master01 prometheus-operator]# ./deploy
namespace/monitoring created
secret/alertmanager-main created
secret/etcd-certs created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
serviceaccount/prometheus-operator created
service/prometheus-operator created
deployment.apps/prometheus-operator created
Waiting for Operator to register custom resource definitions...done!
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
clusterrole.rbac.authorization.k8s.io/node-exporter created
daemonset.extensions/node-exporter created
serviceaccount/node-exporter created
service/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
deployment.extensions/kube-state-metrics created
rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
role.rbac.authorization.k8s.io/kube-state-metrics-resizer created
serviceaccount/kube-state-metrics created
service/kube-state-metrics created
secret/grafana-credentials created
secret/grafana-credentials unchanged
configmap/grafana-dashboard-definitions-0 created
configmap/grafana-dashboards created
configmap/grafana-datasources created
deployment.apps/grafana created
service/grafana created
service/etcd-k8s created
endpoints/etcd-k8s created
servicemonitor.monitoring.coreos.com/etcd-k8s created
configmap/prometheus-k8s-rules created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/alertmanager created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kubelet created
servicemonitor.monitoring.coreos.com/node-exporter created
servicemonitor.monitoring.coreos.com/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus created
service/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
service/alertmanager-main created
alertmanager.monitoring.coreos.com/main created
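
  The "Waiting for Operator to register custom resource definitions" step is where a slow install spends its time. The internals of the deploy script are not shown here, so the following is only a sketch of how such a wait could be written with a timeout added; the `wait_for` helper is hypothetical and not part of the repository.

```shell
# wait_for TIMEOUT_SECONDS COMMAND...
# Re-run COMMAND once per second until it succeeds or the timeout expires.
# Hypothetical helper for illustration; not taken from the deploy script.
wait_for() {
  wf_timeout=$1
  shift
  wf_waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$wf_waited" -ge "$wf_timeout" ]; then
      echo "timed out after ${wf_timeout}s: $*" >&2
      return 1
    fi
    sleep 1
    wf_waited=$((wf_waited + 1))
  done
}

# Usage against a live cluster: the command starts succeeding once the
# ServiceMonitor resource type is known to the API server.
#   wait_for 300 kubectl get servicemonitors --all-namespaces
```

  Bounding the wait makes a stalled registration fail visibly instead of hanging the terminal indefinitely.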

 

3. Verifying the Installation

  Check the pods:

[root@k8s-master01 prometheus-operator]# kubectl get po -n monitoring
NAME                                   READY     STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2       Running   0          2m
alertmanager-main-1                    2/2       Running   0          1m
alertmanager-main-2                    2/2       Running   0          1m
grafana-59f56c4789-dzvgf               1/1       Running   0          2m
kube-state-metrics-575464c49c-m8w4w    4/4       Running   0          2m
node-exporter-5kvxf                    2/2       Running   0          2m
node-exporter-66p7h                    2/2       Running   0          2m
node-exporter-clxzk                    2/2       Running   0          2m
node-exporter-hsgm8                    2/2       Running   0          2m
node-exporter-m5l24                    2/2       Running   0          2m
prometheus-k8s-0                       2/2       Running   0          2m
prometheus-k8s-1                       2/2       Running   0          2m
prometheus-operator-8597f9b976-2hvd5   1/1       Running   0          2m
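
  With this many pods, it can be easier to filter the listing than to read it. The sketch below assumes the default `kubectl get po` column layout shown above (NAME, READY, STATUS, ...); `not_ready` is a hypothetical helper name.

```shell
# not_ready: read `kubectl get po` output on stdin and print the name of
# every pod whose READY count is short (e.g. 1/2) or whose STATUS is not
# Running. Assumes the default column order: NAME READY STATUS ...
not_ready() {
  awk 'NR > 1 {
    split($2, r, "/")
    if (r[1] != r[2] || $3 != "Running") print $1
  }'
}

# Usage against a live cluster:
#   kubectl get po -n monitoring | not_ready
```

  Empty output means every pod in the listing is Running with all of its containers ready.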

  Check the services:

[root@k8s-master01 prometheus-operator]# kubectl get svc -n !$
kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
alertmanager-main       NodePort    10.106.201.155   <none>        9093:30903/TCP      2m
alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   2m
etcd-k8s                ClusterIP   None             <none>        2379/TCP            2m
grafana                 NodePort    10.99.143.133    <none>        3000:30902/TCP      2m
kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   2m
node-exporter           ClusterIP   None             <none>        9100/TCP            2m
prometheus-k8s          NodePort    10.101.175.59    <none>        9090:30900/TCP      2m
prometheus-operated     ClusterIP   None             <none>        9090/TCP            2m
prometheus-operator     ClusterIP   10.107.31.10     <none>        8080/TCP            2m

  Three NodePorts are now exposed:

  •   alertmanager UI: 30903
  •   grafana: 30902
  •   prometheus UI: 30900
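
  Since these are NodePort services, each UI should be reachable on any node's IP at the port listed. A minimal sketch for building the URLs and probing them follows; `ui_urls` is a hypothetical helper and 192.0.2.10 is a placeholder node address, not one from this cluster.

```shell
# ui_urls NODE_IP: print a name and NodePort URL for each UI listed above.
# Hypothetical helper; the ports are the ones shown in the service listing.
ui_urls() {
  node_ip=$1
  echo "alertmanager http://${node_ip}:30903"
  echo "grafana      http://${node_ip}:30902"
  echo "prometheus   http://${node_ip}:30900"
}

# Usage: print the HTTP status code each UI returns (placeholder node IP).
#   ui_urls 192.0.2.10 | while read -r name url; do
#     curl -s -o /dev/null -w "$name %{http_code}\n" "$url"
#   done
```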

 

4. Access Test

  alertmanager:

  prometheus:

  grafana:

 

 

  Viewing the alert email:

 

5. Uninstalling

 

[root@k8s-master01 prometheus-operator]# ./teardown
clusterrolebinding.rbac.authorization.k8s.io "node-exporter" deleted
clusterrole.rbac.authorization.k8s.io "node-exporter" deleted
daemonset.extensions "node-exporter" deleted
serviceaccount "node-exporter" deleted
service "node-exporter" deleted
clusterrolebinding.rbac.authorization.k8s.io "kube-state-metrics" deleted
clusterrole.rbac.authorization.k8s.io "kube-state-metrics" deleted
deployment.extensions "kube-state-metrics" deleted
rolebinding.rbac.authorization.k8s.io "kube-state-metrics" deleted
role.rbac.authorization.k8s.io "kube-state-metrics-resizer" deleted
serviceaccount "kube-state-metrics" deleted
service "kube-state-metrics" deleted
secret "grafana-credentials" deleted
configmap "grafana-dashboard-definitions-0" deleted
configmap "grafana-dashboards" deleted
configmap "grafana-datasources" deleted
deployment.apps "grafana" deleted
service "grafana" deleted
service "etcd-k8s" deleted
servicemonitor.monitoring.coreos.com "etcd-k8s" deleted
......

 

 
