若是沒有特殊指明,全部操做均在 zhaoyixin-k8s-01 節點上執行。node
kubernetes master 節點運行以下組件:linux
這三個組件均以多實例模式運行:nginx
cd /opt/k8s/work wget https://dl.k8s.io/v1.16.6/kubernetes-server-linux-amd64.tar.gz tar -xzvf kubernetes-server-linux-amd64.tar.gz cd kubernetes tar -xzvf kubernetes-src.tar.gz
將二進制文件拷貝到全部 master 節點:git
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kubernetes/server/bin/{apiextensions-apiserver,kube-apiserver,kube-controller-manager,kube-proxy,kube-scheduler,kubeadm,kubectl,kubelet,mounter} root@${node_ip}:/opt/k8s/bin/ ssh root@${node_ip} "chmod +x /opt/k8s/bin/*" done
本節部署一個三實例 kube-apiserver 集羣。github
建立證書籤名請求:json
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > kubernetes-csr.json <<EOF { "CN": "kubernetes-master", "hosts": [ "127.0.0.1", "192.168.16.8", "192.168.16.10", "192.168.16.6", "${CLUSTER_KUBERNETES_SVC_IP}", "kubernetes", "kubernetes.default", "kubernetes.default.svc", "kubernetes.default.svc.cluster", "kubernetes.default.svc.cluster.local.", "kubernetes.default.svc.${CLUSTER_DNS_DOMAIN}." ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "zhaoyixin" } ] } EOF
hosts
字段指定受權使用該證書的 IP 和域名列表,這裏列出了 master 節點 IP、kubernetes 服務的 IP 和域名。生成證書和私鑰:bootstrap
cfssl gencert -ca=/opt/k8s/work/ca.pem \ -ca-key=/opt/k8s/work/ca-key.pem \ -config=/opt/k8s/work/ca-config.json \ -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes ls kubernetes*pem
將證書和私鑰拷貝到全部 master 節點:後端
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert" scp kubernetes*.pem root@${node_ip}:/etc/kubernetes/cert/ done
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > encryption-config.yaml <<EOF kind: EncryptionConfig apiVersion: v1 resources: - resources: - secrets providers: - aescbc: keys: - name: key1 secret: ${ENCRYPTION_KEY} - identity: {} EOF
將加密配置文件拷貝到 master 節點:api
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp encryption-config.yaml root@${node_ip}:/etc/kubernetes/ done
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > audit-policy.yaml <<EOF apiVersion: audit.k8s.io/v1beta1 kind: Policy rules: # The following requests were manually identified as high-volume and low-risk, so drop them. - level: None resources: - group: "" resources: - endpoints - services - services/status users: - 'system:kube-proxy' verbs: - watch - level: None resources: - group: "" resources: - nodes - nodes/status userGroups: - 'system:nodes' verbs: - get - level: None namespaces: - kube-system resources: - group: "" resources: - endpoints users: - 'system:kube-controller-manager' - 'system:kube-scheduler' - 'system:serviceaccount:kube-system:endpoint-controller' verbs: - get - update - level: None resources: - group: "" resources: - namespaces - namespaces/status - namespaces/finalize users: - 'system:apiserver' verbs: - get # Don't log HPA fetching metrics. - level: None resources: - group: metrics.k8s.io users: - 'system:kube-controller-manager' verbs: - get - list # Don't log these read-only URLs. - level: None nonResourceURLs: - '/healthz*' - /version - '/swagger*' # Don't log events requests. - level: None resources: - group: "" resources: - events # node and pod status calls from nodes are high-volume and can be large, don't log responses # for expected updates from nodes - level: Request omitStages: - RequestReceived resources: - group: "" resources: - nodes/status - pods/status users: - kubelet - 'system:node-problem-detector' - 'system:serviceaccount:kube-system:node-problem-detector' verbs: - update - patch - level: Request omitStages: - RequestReceived resources: - group: "" resources: - nodes/status - pods/status userGroups: - 'system:nodes' verbs: - update - patch # deletecollection calls can be large, don't log responses for expected namespace deletions - level: Request omitStages: - RequestReceived users: - 'system:serviceaccount:kube-system:namespace-controller' verbs: - deletecollection # Secrets, ConfigMaps, and TokenReviews can contain sensitive & binary data, # so only log at the Metadata level. - level: Metadata omitStages: - RequestReceived resources: - group: "" resources: - secrets - configmaps - group: authentication.k8s.io resources: - tokenreviews # Get repsonses can be large; skip them. - level: Request omitStages: - RequestReceived resources: - group: "" - group: admissionregistration.k8s.io - group: apiextensions.k8s.io - group: apiregistration.k8s.io - group: apps - group: authentication.k8s.io - group: authorization.k8s.io - group: autoscaling - group: batch - group: certificates.k8s.io - group: extensions - group: metrics.k8s.io - group: networking.k8s.io - group: policy - group: rbac.authorization.k8s.io - group: scheduling.k8s.io - group: settings.k8s.io - group: storage.k8s.io verbs: - get - list - watch # Default level for known APIs - level: RequestResponse omitStages: - RequestReceived resources: - group: "" - group: admissionregistration.k8s.io - group: apiextensions.k8s.io - group: apiregistration.k8s.io - group: apps - group: authentication.k8s.io - group: authorization.k8s.io - group: autoscaling - group: batch - group: certificates.k8s.io - group: extensions - group: metrics.k8s.io - group: networking.k8s.io - group: policy - group: rbac.authorization.k8s.io - group: scheduling.k8s.io - group: settings.k8s.io - group: storage.k8s.io # Default level for all other requests. - level: Metadata omitStages: - RequestReceived EOF
分發審計策略文件:數組
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp audit-policy.yaml root@${node_ip}:/etc/kubernetes/audit-policy.yaml done
建立證書籤名請求:
cd /opt/k8s/work cat > proxy-client-csr.json <<EOF { "CN": "aggregator", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "zhaoyixin" } ] } EOF
kube-apiserver
的 --requestheader-allowed-names
參數中,不然後續訪問 metrics 時會提示權限不足。生成證書和私鑰:
cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \ -ca-key=/etc/kubernetes/cert/ca-key.pem \ -config=/etc/kubernetes/cert/ca-config.json \ -profile=kubernetes proxy-client-csr.json | cfssljson -bare proxy-client ls proxy-client*.pem
拷貝生成的證書和私鑰文件
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp proxy-client*.pem root@${node_ip}:/etc/kubernetes/cert/ done
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > kube-apiserver.service.template <<EOF [Unit] Description=Kubernetes API Server Documentation=https://github.com/GoogleCloudPlatform/kubernetes After=network.target [Service] WorkingDirectory=${K8S_DIR}/kube-apiserver ExecStart=/opt/k8s/bin/kube-apiserver \\ --advertise-address=##NODE_IP## \\ --default-not-ready-toleration-seconds=360 \\ --default-unreachable-toleration-seconds=360 \\ --feature-gates=DynamicAuditing=true \\ --max-mutating-requests-inflight=2000 \\ --max-requests-inflight=4000 \\ --default-watch-cache-size=200 \\ --delete-collection-workers=2 \\ --encryption-provider-config=/etc/kubernetes/encryption-config.yaml \\ --etcd-cafile=/etc/kubernetes/cert/ca.pem \\ --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\ --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\ --etcd-servers=${ETCD_ENDPOINTS} \\ --bind-address=##NODE_IP## \\ --secure-port=6443 \\ --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\ --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\ --insecure-port=0 \\ --audit-dynamic-configuration \\ --audit-log-maxage=15 \\ --audit-log-maxbackup=3 \\ --audit-log-maxsize=100 \\ --audit-log-truncate-enabled \\ --audit-log-path=${K8S_DIR}/kube-apiserver/audit.log \\ --audit-policy-file=/etc/kubernetes/audit-policy.yaml \\ --profiling \\ --anonymous-auth=false \\ --client-ca-file=/etc/kubernetes/cert/ca.pem \\ --enable-bootstrap-token-auth \\ --requestheader-allowed-names="aggregator" \\ --requestheader-client-ca-file=/etc/kubernetes/cert/ca.pem \\ --requestheader-extra-headers-prefix="X-Remote-Extra-" \\ --requestheader-group-headers=X-Remote-Group \\ --requestheader-username-headers=X-Remote-User \\ --service-account-key-file=/etc/kubernetes/cert/ca.pem \\ --authorization-mode=Node,RBAC \\ --runtime-config=api/all=true \\ --enable-admission-plugins=NodeRestriction \\ --allow-privileged=true \\ --apiserver-count=3 \\ --event-ttl=168h \\ --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \\ --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\ --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\ --kubelet-https=true \\ --kubelet-timeout=10s \\ --proxy-client-cert-file=/etc/kubernetes/cert/proxy-client.pem \\ --proxy-client-key-file=/etc/kubernetes/cert/proxy-client-key.pem \\ --service-cluster-ip-range=${SERVICE_CIDR} \\ --service-node-port-range=${NODE_PORT_RANGE} \\ --logtostderr=true \\ --v=2 Restart=on-failure RestartSec=10 Type=notify LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
--advertise-address
:apiserver 對外通告的 IP(kubernetes 服務後端節點 IP);--default-*-toleration-seconds
:設置節點異常相關的閾值;--max-*-requests-inflight
:請求相關的最大閾值;--etcd-*
:訪問 etcd 的證書和 etcd 服務器地址;--bind-address
: https 監聽的 IP,不能爲 127.0.0.1,不然外界不能訪問它的安全端口 6443;--secret-port
:https 監聽端口;--insecure-port=0
:關閉監聽 http 非安全端口(8080);--tls-*-file
:指定 apiserver 使用的證書、私鑰和 CA 文件;--audit-*
:配置審計策略和審計日誌文件相關的參數;--client-ca-file
:驗證 client (kue-controller-manager、kube-scheduler、kubelet、kube-proxy 等)請求所帶的證書;--enable-bootstrap-token-auth
:啓用 kubelet bootstrap 的 token 認證;--requestheader-*
:kube-apiserver 的 aggregator layer 相關的配置參數,proxy-client & HPA 須要使用;--requestheader-client-ca-file
:用於簽名 --proxy-client-cert-file 和 --proxy-client-key-file 指定的證書;在啓用了 metric aggregator 時使用;--requestheader-allowed-names
:不能爲空,值爲逗號分割的 --proxy-client-cert-file 證書的 CN 名稱,這裏設置爲 "aggregator";--service-account-key-file
:簽名 ServiceAccount Token 的公鑰文件,kube-controller-manager 的 --service-account-private-key-file 指定私鑰文件,二者配對使用;--runtime-config=api/all=true
: 啓用全部版本的 APIs,如 autoscaling/v2alpha1;--authorization-mode=Node,RBAC
、--anonymous-auth=false
: 開啓 Node 和 RBAC 受權模式,拒絕未受權的請求;--enable-admission-plugins
:啓用一些默認關閉的 plugins;--allow-privileged
:運行執行 privileged 權限的容器;--apiserver-count=3
:指定 apiserver 實例的數量;--event-ttl
:指定 events 的保存時間;--kubelet-*
:若是指定,則使用 https 訪問 kubelet APIs;須要爲證書對應的用戶(上面 kubernetes*.pem 證書的用戶爲 kubernetes) 用戶定義 RBAC 規則,不然訪問 kubelet API 時提示未受權;--proxy-client-*
:apiserver 訪問 metrics-server 使用的證書;--service-cluster-ip-range
: 指定 Service Cluster IP 地址段;--service-node-port-range
: 指定 NodePort 的端口範圍;若是 kube-apiserver 機器沒有運行 kube-proxy
,則還須要添加 --enable-aggregator-routing=true
參數;
1.
--requestheader-client-ca-file
指定的 CA 證書,必須具備client auth and server auth
2.若是--requestheader-allowed-names
不爲空,且--proxy-client-cert-file
證書的 CN 名稱不在 allowed-names 中,則後續查看 node 或 pods 的 metrics 失敗
替換模板文件中的變量,爲各節點生成 systemd unit 文件:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-apiserver.service.template > kube-apiserver-${NODE_IPS[i]}.service done ls kube-apiserver*.service
分發生成的 systemd unit 文件:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-apiserver-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-apiserver.service done
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-apiserver" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver" done
查看運行狀態
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl status kube-apiserver |grep 'Active:'" done
確保狀態是active (running)
,不然經過journalctl -u kube-apiserver
查看日誌,找出緣由。
$ kubectl cluster-info Kubernetes master is running at https://192.168.16.8:6443 To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. $ kubectl get all --all-namespaces NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default service/kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 60s $ kubectl get componentstatuses NAME AGE controller-manager <unknown> scheduler <unknown> etcd-0 <unknown> etcd-2 <unknown> etcd-1 <unknown>
檢查 kube-apiserver 監聽的端口
$ sudo netstat -lnpt|grep kube tcp 0 0 192.168.16.8:6443 0.0.0.0:* LISTEN 13722/kube-apiserve
本節部署一個三節點 kube-controller-manager 集羣。
爲保證通訊安全,先生成 x509 證書和私鑰,kube-controller-manager
在以下兩種狀況下使用該證書:
建立證書籤名請求:
cd /opt/k8s/work cat > kube-controller-manager-csr.json <<EOF { "CN": "system:kube-controller-manager", "key": { "algo": "rsa", "size": 2048 }, "hosts": [ "127.0.0.1", "192.168.16.8", "192.168.16.10", "192.168.16.6" ], "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:kube-controller-manager", "OU": "zhaoyixin" } ] } EOF
system:kube-controller-manager
,kubernetes 內置的 ClusterRoleBindings system:kube-controller-manager
賦予 kube-controller-manager 工做所需的權限。生成證書和私鑰:
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem \ -ca-key=/opt/k8s/work/ca-key.pem \ -config=/opt/k8s/work/ca-config.json \ -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager ls kube-controller-manager*pem
分發生成的證書和私鑰
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-controller-manager*.pem root@${node_ip}:/etc/kubernetes/cert/ done
kube-controller-manager 使用 kubeconfig 文件訪問 apiserver,該文件提供了 apiserver 地址、嵌入的 CA 證書和 kube-controller-manager 證書等信息:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh kubectl config set-cluster kubernetes \ --certificate-authority=/opt/k8s/work/ca.pem \ --embed-certs=true \ --server="https://##NODE_IP##:6443" \ --kubeconfig=kube-controller-manager.kubeconfig kubectl config set-credentials system:kube-controller-manager \ --client-certificate=kube-controller-manager.pem \ --client-key=kube-controller-manager-key.pem \ --embed-certs=true \ --kubeconfig=kube-controller-manager.kubeconfig kubectl config set-context system:kube-controller-manager \ --cluster=kubernetes \ --user=system:kube-controller-manager \ --kubeconfig=kube-controller-manager.kubeconfig kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
分發 kubeconfig
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" sed -e "s/##NODE_IP##/${node_ip}/" kube-controller-manager.kubeconfig > kube-controller-manager-${node_ip}.kubeconfig scp kube-controller-manager-${node_ip}.kubeconfig root@${node_ip}:/etc/kubernetes/kube-controller-manager.kubeconfig done
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > kube-controller-manager.service.template <<EOF [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] WorkingDirectory=${K8S_DIR}/kube-controller-manager ExecStart=/opt/k8s/bin/kube-controller-manager \\ --profiling \\ --cluster-name=kubernetes \\ --controllers=*,bootstrapsigner,tokencleaner \\ --kube-api-qps=1000 \\ --kube-api-burst=2000 \\ --leader-elect \\ --use-service-account-credentials\\ --concurrent-service-syncs=2 \\ --bind-address=##NODE_IP## \\ --secure-port=10252 \\ --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \\ --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \\ --port=0 \\ --authentication-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\ --client-ca-file=/etc/kubernetes/cert/ca.pem \\ --requestheader-allowed-names="aggregator" \\ --requestheader-client-ca-file=/etc/kubernetes/cert/ca.pem \\ --requestheader-extra-headers-prefix="X-Remote-Extra-" \\ --requestheader-group-headers=X-Remote-Group \\ --requestheader-username-headers=X-Remote-User \\ --authorization-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\ --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \\ --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \\ --experimental-cluster-signing-duration=876000h \\ --horizontal-pod-autoscaler-sync-period=10s \\ --concurrent-deployment-syncs=10 \\ --concurrent-gc-syncs=30 \\ --node-cidr-mask-size=24 \\ --service-cluster-ip-range=${SERVICE_CIDR} \\ --pod-eviction-timeout=6m \\ --terminated-pod-gc-threshold=10000 \\ --root-ca-file=/etc/kubernetes/cert/ca.pem \\ --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \\ --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\ --logtostderr=true \\ --v=2 Restart=on-failure RestartSec=5 [Install] WantedBy=multi-user.target EOF
--port=0
:關閉監聽非安全端口(http),同時 --address
參數無效,--bind-address
參數有效;--secure-port=10252
、--bind-address=0.0.0.0
: 在全部網絡接口監聽 10252 端口的 https /metrics 請求;--kubeconfig
:指定 kubeconfig 文件路徑,kube-controller-manager 使用它鏈接和驗證 kube-apiserver;--authentication-kubeconfig
和 --authorization-kubeconfig
:kube-controller-manager 使用它鏈接 apiserver,對 client 的請求進行認證和受權。kube-controller-manager 再也不使用 --tls-ca-file
對請求 https metrics 的 Client 證書進行校驗。若是沒有配置這兩個 kubeconfig 參數,則 client 鏈接 kube-controller-manager https 端口的請求會被拒絕(提示權限不足)。--cluster-signing-*-file
:簽名 TLS Bootstrap 建立的證書;--experimental-cluster-signing-duration
:指定 TLS Bootstrap 證書的有效期;--root-ca-file
:放置到容器 ServiceAccount 中的 CA 證書,用來對 kube-apiserver 的證書進行校驗;--service-account-private-key-file
:簽名 ServiceAccount 中 Token 的私鑰文件,必須和 kube-apiserver 的 --service-account-key-file
指定的公鑰文件配對使用;--service-cluster-ip-range
:指定 Service Cluster IP 網段,必須和 kube-apiserver 中的同名參數一致;--leader-elect=true
:集羣運行模式,啓用選舉功能;被選爲 leader 的節點負責處理工做,其它節點爲阻塞狀態;--controllers=*,bootstrapsigner,tokencleaner
:啓用的控制器列表,tokencleaner 用於自動清理過時的 Bootstrap token;--horizontal-pod-autoscaler-*
:custom metrics 相關參數,支持 autoscaling/v2alpha1;--tls-cert-file
、--tls-private-key-file
:使用 https 輸出 metrics 時使用的 Server 證書和祕鑰;--use-service-account-credentials=true
: kube-controller-manager 中各 controller 使用 serviceaccount 訪問 kube-apiserver;替換模板文件中的變量,爲各節點建立 systemd unit 文件:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-controller-manager.service.template > kube-controller-manager-${NODE_IPS[i]}.service done ls kube-controller-manager*.service
分發到全部 master 節點:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-controller-manager-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-controller-manager.service done
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-controller-manager" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager" done
檢查服務運行狀態
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl status kube-controller-manager|grep Active" done
確保狀態爲 active (running)
,不然經過journalctl -u kube-controller-manager
查看日誌,找出緣由。
kube-controller-manager 監聽 10252 端口,接收 https 請求:
$ sudo netstat -lnpt | grep kube-cont tcp 0 0 192.168.16.8:10252 0.0.0.0:* LISTEN 27533/kube-controll
$ curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://172.27.138.251:10252/metrics |head # HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend. # TYPE apiserver_audit_event_total counter apiserver_audit_event_total 0 # HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend. # TYPE apiserver_audit_requests_rejected_total counter apiserver_audit_requests_rejected_total 0 # HELP apiserver_client_certificate_expiration_seconds [ALPHA] Distribution of the remaining lifetime on the certificate used to authenticate a request. # TYPE apiserver_client_certificate_expiration_seconds histogram apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0 apiserver_client_certificate_expiration_seconds_bucket{le="1800"} 0
$ kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"zhaoyixin-k8s-01_4b717180-fdc3-42d1-8e9e-08d8bc0708b2","leaseDurationSeconds":15,"acquireTime":"2020-05-21T07:05:08Z","renewTime":"2020-05-21T07:06:29Z","leaderTransitions":0}' creationTimestamp: "2020-05-21T07:05:08Z" name: kube-controller-manager namespace: kube-system resourceVersion: "23373" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager uid: cdd98142-9541-4418-9e8f-c2f77c5009e6
可見,當前的 leader 爲 zhaoyixin-k8s-01 節點。
本節部署一個三節點 kube-scheduler 集羣。
爲保證通訊安全,本文檔先生成 x509 證書和私鑰,kube-scheduler
在以下兩種狀況下使用該證書:
建立證書籤名請求:
cd /opt/k8s/work cat > kube-scheduler-csr.json <<EOF { "CN": "system:kube-scheduler", "hosts": [ "127.0.0.1", "192.168.16.8", "192.168.16.10", "192.168.16.6" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:kube-scheduler", "OU": "zhaoyixin" } ] } EOF
system:kube-scheduler
,kubernetes 內置的 ClusterRoleBindings system:kube-scheduler
將賦予 kube-scheduler 工做所需的權限;生成證書和私鑰:
cd /opt/k8s/work cfssl gencert -ca=/opt/k8s/work/ca.pem \ -ca-key=/opt/k8s/work/ca-key.pem \ -config=/opt/k8s/work/ca-config.json \ -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler ls kube-scheduler*pem
分發證書和私鑰
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-scheduler*.pem root@${node_ip}:/etc/kubernetes/cert/ done
kube-scheduler 使用 kubeconfig 文件訪問 apiserver,該文件提供了 apiserver 地址、嵌入的 CA 證書和 kube-scheduler 證書:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh kubectl config set-cluster kubernetes \ --certificate-authority=/opt/k8s/work/ca.pem \ --embed-certs=true \ --server="https://##NODE_IP##:6443" \ --kubeconfig=kube-scheduler.kubeconfig kubectl config set-credentials system:kube-scheduler \ --client-certificate=kube-scheduler.pem \ --client-key=kube-scheduler-key.pem \ --embed-certs=true \ --kubeconfig=kube-scheduler.kubeconfig kubectl config set-context system:kube-scheduler \ --cluster=kubernetes \ --user=system:kube-scheduler \ --kubeconfig=kube-scheduler.kubeconfig kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
分發 kubeconfig 到全部 master 節點:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" sed -e "s/##NODE_IP##/${node_ip}/" kube-scheduler.kubeconfig > kube-scheduler-${node_ip}.kubeconfig scp kube-scheduler-${node_ip}.kubeconfig root@${node_ip}:/etc/kubernetes/kube-scheduler.kubeconfig done
cd /opt/k8s/work cat >kube-scheduler.yaml.template <<EOF apiVersion: kubescheduler.config.k8s.io/v1alpha1 kind: KubeSchedulerConfiguration bindTimeoutSeconds: 600 clientConnection: burst: 200 kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig" qps: 100 enableContentionProfiling: false enableProfiling: true hardPodAffinitySymmetricWeight: 1 healthzBindAddress: ##NODE_IP##:10251 leaderElection: leaderElect: true metricsBindAddress: ##NODE_IP##:10251 EOF
--kubeconfig
:指定 kubeconfig 文件路徑,kube-scheduler 使用它鏈接和驗證 kube-apiserver;--leader-elect=true
:集羣運行模式,啓用選舉功能;被選爲 leader 的節點負責處理工做,其它節點爲阻塞狀態;替換模板文件中的變量:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-scheduler.yaml.template > kube-scheduler-${NODE_IPS[i]}.yaml done ls kube-scheduler*.yaml
分發 kube-scheduler 配置文件到全部 master 節點,並重名命名爲 kube-scheduler.yaml。
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-scheduler-${node_ip}.yaml root@${node_ip}:/etc/kubernetes/kube-scheduler.yaml done
cd /opt/k8s/work source /opt/k8s/bin/environment.sh cat > kube-scheduler.service.template <<EOF [Unit] Description=Kubernetes Scheduler Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] WorkingDirectory=${K8S_DIR}/kube-scheduler ExecStart=/opt/k8s/bin/kube-scheduler \\ --config=/etc/kubernetes/kube-scheduler.yaml \\ --bind-address=##NODE_IP## \\ --secure-port=10259 \\ --port=0 \\ --tls-cert-file=/etc/kubernetes/cert/kube-scheduler.pem \\ --tls-private-key-file=/etc/kubernetes/cert/kube-scheduler-key.pem \\ --authentication-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\ --client-ca-file=/etc/kubernetes/cert/ca.pem \\ --requestheader-allowed-names="" \\ --requestheader-client-ca-file=/etc/kubernetes/cert/ca.pem \\ --requestheader-extra-headers-prefix="X-Remote-Extra-" \\ --requestheader-group-headers=X-Remote-Group \\ --requestheader-username-headers=X-Remote-User \\ --authorization-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\ --logtostderr=true \\ --v=2 Restart=always RestartSec=5 StartLimitInterval=0 [Install] WantedBy=multi-user.target EOF
替換模板文件中的變量,爲各節點建立 systemd unit 文件:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-scheduler.service.template > kube-scheduler-${NODE_IPS[i]}.service done ls kube-scheduler*.service
分發 systemd unit 文件到全部 master 節點:
cd /opt/k8s/work source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-scheduler-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-scheduler.service done
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-scheduler" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler" done
檢查服務運行狀態
source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl status kube-scheduler|grep Active" done
確保狀態爲 active (running)
,不然經過journalctl -u kube-scheduler
查看日誌,找出緣由。
kube-scheduler 監聽 10251 和 10259 端口:
兩個接口都對外提供 /metrics
和 /healthz
的訪問。
$ sudo netstat -lnpt |grep kube-sch tcp 0 0 192.168.16.8:10259 0.0.0.0:* LISTEN 3605/kube-scheduler tcp 0 0 192.168.16.8:10251 0.0.0.0:* LISTEN 3605/kube-scheduler
$ curl -s http://192.168.16.8:10251/metrics |head # HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend. # TYPE apiserver_audit_event_total counter apiserver_audit_event_total 0 # HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend. # TYPE apiserver_audit_requests_rejected_total counter apiserver_audit_requests_rejected_total 0 # HELP apiserver_client_certificate_expiration_seconds [ALPHA] Distribution of the remaining lifetime on the certificate used to authenticate a request. # TYPE apiserver_client_certificate_expiration_seconds histogram apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0 apiserver_client_certificate_expiration_seconds_bucket{le="1800"} 0
$ curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.16.8:10259/metrics |head # HELP apiserver_audit_event_total [ALPHA] Counter of audit events generated and sent to the audit backend. # TYPE apiserver_audit_event_total counter apiserver_audit_event_total 0 # HELP apiserver_audit_requests_rejected_total [ALPHA] Counter of apiserver requests rejected due to an error in audit logging backend. # TYPE apiserver_audit_requests_rejected_total counter apiserver_audit_requests_rejected_total 0 # HELP apiserver_client_certificate_expiration_seconds [ALPHA] Distribution of the remaining lifetime on the certificate used to authenticate a request. # TYPE apiserver_client_certificate_expiration_seconds histogram apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0 apiserver_client_certificate_expiration_seconds_bucket{le="1800"} 0
查看當前的 leader
$ kubectl get endpoints kube-scheduler --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"zhaoyixin-k8s-01_463d86ce-7c3f-478f-b03f-79282fd97565","leaseDurationSeconds":15,"acquireTime":"2020-05-29T03:23:30Z","renewTime":"2020-05-29T03:26:14Z","leaderTransitions":0}' creationTimestamp: "2020-05-29T03:23:30Z" name: kube-scheduler namespace: kube-system resourceVersion: "559017" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler uid: 6ca6bd7b-b3ee-49bb-9efb-f180511022df
可見,當前的 leader 爲 zhaoyixin-k8s-01 節點。