Kubernetes Quick Start - Installation

Installing a k8s cluster with kubeadm

For the design of kubeadm, see: https://github.com/kubernetes/kubeadm/blob/master/docs/design/design_v1.10.md

Network planning

Node network        Pod network                        Service network
192.168.101.0/24    10.244.0.0/16 (flannel default)    10.96.0.0/12

Deployment workflow

  1. On every master and worker node, first manually install kubelet, kubeadm and docker; kubelet is the core component responsible for running Pods, and docker is the container engine.
  2. Run kubeadm init on the master node to initialize the control plane. On the master, the API Server, controller-manager, scheduler and etcd run as Pods, and on each worker node kube-proxy also runs as a Pod; these are static Pods.
  3. On the worker nodes, run kubeadm join to join them to the cluster.
  4. The flannel add-on also runs as Pods on all master and worker nodes.

Cluster installation

System environment preparation

Node role             IP address
master node           192.168.101.40
node1 (worker node)   192.168.101.41
node2 (worker node)   192.168.101.42

All three nodes run an identical system environment:

root@node01:~# cat /etc/issue
Ubuntu 18.04.4 LTS \n \l

root@node01:~# uname -r
4.15.0-111-generic

root@node01:~# lsb_release -cr
Release:    18.04
Codename:   bionic

Run the following steps on the master and on every worker node:

# Disable swap
# Turn swap off again at boot
# Comment out the swap entries in /etc/fstab (see the sketch below)
root@node01:~# swapoff -a
root@node01:~# vim /etc/rc.local
#!/bin/bash
swapoff -a
root@node01:~# chmod +x /etc/rc.local
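
# The commands above only disable swap for the running system and via rc.local at boot;
# the /etc/fstab entry still has to be commented out by hand. A minimal sketch with sed
# (back up the file first; the exact swap line depends on how the system was installed):
root@node01:~# cp /etc/fstab /etc/fstab.bak
root@node01:~# sed -ri '/\sswap\s/s/^/#/' /etc/fstab
# Verify that no swap is active any more
root@node01:~# swapon --show
root@node01:~# free -m | grep -i swap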

# Stop and disable the ufw firewall; on CentOS 7, stop and disable firewalld instead
root@node01:~# systemctl stop ufw.service
root@node01:~# systemctl disable ufw.service
Synchronizing state of ufw.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable ufw
root@node01:~# systemctl list-unit-files | grep ufw
ufw.service                                                      disabled

# SELinux is not configured here; if it is enabled, it must be disabled

# Flush the iptables rules
root@node01:~# dpkg -l | grep iptables  # the iptables administration tool is installed by default
ii  iptables                              1.6.1-2ubuntu2                                  amd64        administration tools for packet filtering and NAT
root@node01:~# iptables -F
root@node01:~# iptables -X
root@node01:~# iptables -Z

# Switch the apt sources to the Aliyun mirror
root@node01:~# vim /etc/apt/sources.list
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse

deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse

# Install time synchronization software
root@node01:~# apt-get update && apt-get install chrony
# Fix the timezone
root@node01:~# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

# 增長aliyun的docker-ce源
root@node01:~# apt-get -y install apt-transport-https ca-certificates curl software-properties-common
root@node01:~# curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
root@node01:~# echo "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker-ce.list

# Install docker-ce
root@node01:~# apt-get update && apt-get install docker-ce

# 增長aliyun的kubernetes鏡像源
root@node01:~# apt-get install -y apt-transport-https
root@node01:~# curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add - 
root@node01:~# cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

# Add host entries for every node to /etc/hosts
192.168.101.40 node01.k8s.com node01
192.168.101.41 node02.k8s.com node02
192.168.101.42 node03.k8s.com node03
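
# The entries above assume each machine's hostname is already set to node01/node02/node03.
# If not, a quick sketch (run the matching command on each node; node01 shown as an example):
root@node01:~# hostnamectl set-hostname node01.k8s.com
root@node01:~# hostnamectl status | grep -i hostname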

On Ubuntu, iptables is not managed as a service the way it is on CentOS; iptables is just an administration tool, so it is enough to make sure no rules are active.

Ubuntu Aliyun mirror configuration reference: https://developer.aliyun.com/mirror/ubuntu?spm=a2c6h.13651102.0.0.3e221b11HFtiVe

Docker CE Aliyun mirror configuration reference: https://developer.aliyun.com/mirror/docker-ce?spm=a2c6h.13651102.0.0.3e221b11O3EaIz

Kubernetes Aliyun mirror configuration reference: https://developer.aliyun.com/mirror/kubernetes?spm=a2c6h.13651102.0.0.3e221b11HFtiVe

Choosing a docker version

The docker-ce installed by the steps above is version 19.03.12, which is newer than what kubernetes has validated, as the kubeadm system requirements explain:

Kubernetes system requirements:

 if running on linux:
   [error] if not Kernel 3.10+ or 4+ with specific KernelSpec.
   [error] if required cgroups subsystem aren't in set up.
 if using docker:
   [error/warning] if Docker endpoint does not exist or does not work, if docker version >17.03. Note: starting from 1.9, kubeadm provides better support for CRI-generic functionality; in that case, docker specific controls are skipped or replaced by similar controls for crictl

For a production environment, install version 17.03.
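
If you do need an older release, the sketch below shows how a specific docker-ce version can be pinned with apt; the version string is a placeholder, so pick one from the apt-cache madison output first.

# List the docker-ce versions available from the configured repository
root@node01:~# apt-cache madison docker-ce
# Install a specific version (replace <VERSION> with a string from the list above)
root@node01:~# apt-get install -y docker-ce=<VERSION>
# Optionally hold the package so a later apt upgrade does not replace it
root@node01:~# apt-mark hold docker-ce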

k8s cluster architecture

(diagram: k8s cluster architecture)

Master node installation

root@node01:~# apt-get install kubelet kubeadm kubectl
...
Do you want to continue? [Y/n] y
Get:1 http://mirrors.aliyun.com/ubuntu bionic/main amd64 conntrack amd64 1:1.4.4+snapshot20161117-6ubuntu2 [30.6 kB]
Get:2 http://mirrors.aliyun.com/ubuntu bionic/main amd64 socat amd64 1.7.3.2-2ubuntu2 [342 kB]
Get:3 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 cri-tools amd64 1.13.0-01 [8775 kB]
Get:4 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubernetes-cni amd64 0.8.6-00 [25.0 MB]
Get:5 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubelet amd64 1.18.6-00 [19.4 MB]
Get:6 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubectl amd64 1.18.6-00 [8826 kB]
Get:7 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubeadm amd64 1.18.6-00 [8167 kB]
Fetched 70.6 MB in 15s (4599 kB/s)
...

kubectl is the client tool for the API Server; worker nodes that will not run API Server related commands do not need it installed.

During initialization, kubeadm downloads a number of images that are hosted on k8s.gcr.io, which cannot be reached from mainland China. One workaround is to set up a proxy and have the docker daemon go through it when pulling images, configured as follows:

root@node01:~# vim /lib/systemd/system/docker.service
[Service]
Environment="HTTPS_PROXY=http://x.x.x.x:10080"
Environment="NO_PROXY=127.0.0.0/8,192.168.101.0/24"
...

# Restart docker
root@node01:~# systemctl daemon-reload
root@node01:~# systemctl stop docker
root@node01:~# systemctl start docker
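
To confirm that the docker daemon actually picked up the proxy settings (the proxy address above is a placeholder), a quick check:

# The Environment= values from the unit file should show up here
root@node01:~# systemctl show --property=Environment docker
# docker info also reports the proxy configuration it is using
root@node01:~# docker info | grep -i proxy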

Make sure the two bridge-related iptables kernel parameters are set to 1:

root@node01:~# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
root@node01:~# cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
1
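
If either value is 0, a minimal sketch for loading the br_netfilter module and persisting the two parameters via sysctl:

# Load the bridge netfilter module now and at every boot
root@node01:~# modprobe br_netfilter
root@node01:~# echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
# Persist the kernel parameters and apply them
root@node01:~# cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
root@node01:~# sysctl --system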

Make sure the kubelet service is enabled to start at boot, even though it is currently stopped:

root@node01:~# systemctl is-enabled kubelet
enabled

增長docker運行加載參數

root@node01:~# vim /etc/docker/daemon.json
{
    "exec-opts": ["native.cgroupdriver=systemd"],
    ....
}
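
For reference, a minimal complete /etc/docker/daemon.json containing only this option, followed by a docker restart and a check of the active driver; if your file already contains other settings, keep them and only add the exec-opts key.

root@node01:~# cat <<EOF > /etc/docker/daemon.json
{
    "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
root@node01:~# systemctl restart docker
# The driver should now be reported as systemd
root@node01:~# docker info | grep -i cgroup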

Without this option, kubeadm init prints a warning during initialization and eventually fails; the messages look like this:

...
[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
...

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

    Unfortunately, an error has occurred:
        timed out waiting for the condition

    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.

    Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'

Initialize kubernetes

# Check which images need to be pulled
root@node01:~# kubeadm config images list
W0725 13:02:07.511180    6409 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
k8s.gcr.io/kube-apiserver:v1.18.6
k8s.gcr.io/kube-controller-manager:v1.18.6
k8s.gcr.io/kube-scheduler:v1.18.6
k8s.gcr.io/kube-proxy:v1.18.6
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7

# Pull the required images first
root@node01:~# kubeadm config images pull
W0722 16:17:21.699535    8329 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.18.6
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.18.6
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.18.6
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.18.6
[config/images] Pulled k8s.gcr.io/pause:3.2
[config/images] Pulled k8s.gcr.io/etcd:3.4.3-0
[config/images] Pulled k8s.gcr.io/coredns:1.6.7

# Initialize this node as the master
root@node01:~# kubeadm init --kubernetes-version=v1.18.6 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12
W0722 17:02:21.625550   25074 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.6
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [node01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.101.40]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [node01 localhost] and IPs [192.168.101.40 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [node01 localhost] and IPs [192.168.101.40 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0722 17:02:25.619105   25074 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0722 17:02:25.620260   25074 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 25.005958 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node node01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node node01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: ri964b.aos1fa4h7y2zmu5g
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.101.40:6443 --token ri964b.aos1fa4h7y2zmu5g \
    --discovery-token-ca-cert-hash sha256:c7c8e629116b4bda1af8ad83236291f1a38ca01bb0abd8a7a8a46c286547d609

Note:

kubeadm join 192.168.101.40:6443 --token ri964b.aos1fa4h7y2zmu5g \
    --discovery-token-ca-cert-hash sha256:c7c8e629116b4bda1af8ad83236291f1a38ca01bb0abd8a7a8a46c286547d609
The token in this join command has a limited lifetime, 24 hours by default. If adding a worker node later fails with an error like "error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID...", the token has expired. The fix:
On the master node, run "kubeadm token create --ttl 0" to generate a new token; "--ttl 0" makes the token never expire, so include that option only if you really need it. "kubeadm token list" shows the existing tokens.
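
A sketch of regenerating a join command on the master once the original token has expired:

# List the existing bootstrap tokens and their expiry times
root@node01:~# kubeadm token list
# Create a new token and print a ready-to-use kubeadm join command
root@node01:~# kubeadm token create --print-join-command
# If the CA cert hash is needed separately, it can be recomputed from the CA certificate
root@node01:~# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
    openssl rsa -pubin -outform der 2>/dev/null | \
    openssl dgst -sha256 -hex | sed 's/^.* //'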

The master node is now initialized. As the output suggests, create a regular user to manage the kubernetes cluster:

root@node01:~# adduser k8s

# Grant sudo privileges
root@node01:~# visudo
# 增長一行
k8s ALL=(ALL) NOPASSWD:ALL

# As the k8s user
k8s@node01:~$ mkdir -p $HOME/.kube
k8s@node01:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
k8s@node01:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
k8s@node01:~$ ls -al .kube/config
-rw------- 1 k8s k8s 5454 Jul 22 17:37 .kube/config

At this point, the cluster status, node status and Pod information all show problems:

# The cluster components are unhealthy
k8s@node01:~$ sudo kubectl get componentstatus
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}

# Two Pods are not running
k8s@node01:~$ sudo kubectl get pods -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-66bff467f8-7dr57         0/1     Pending   0          85m
coredns-66bff467f8-xzf9p         0/1     Pending   0          85m
etcd-node01                      1/1     Running   0          85m
kube-apiserver-node01            1/1     Running   0          85m
kube-controller-manager-node01   1/1     Running   0          85m
kube-proxy-vlbxb                 1/1     Running   0          85m
kube-scheduler-node01            1/1     Running   0          85m

# The master node is also NotReady
k8s@node01:~$ sudo kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
node01   NotReady   master   49m   v1.18.6

The latter two problems are resolved after installing the flannel network add-on.

Install the flannel network add-on

k8s@node01:~$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
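
After applying the manifest, the flannel DaemonSet pods should come up on every node; a quick way to watch the progress and confirm that the nodes become Ready:

# Watch the kube-system pods until the flannel and coredns pods are Running
k8s@node01:~$ sudo kubectl get pods -n kube-system -o wide -w
# The nodes should then report Ready
k8s@node01:~$ sudo kubectl get nodes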

Notes on cgroupfs vs. systemd

After installation, docker uses cgroupfs as its Cgroup Driver by default:

root@node03:/var/lib/kubelet# docker info | grep -i cgroup
 Cgroup Driver: cgroupfs

The kubelet, however, defaults to the systemd Cgroup Driver, and kubelet and docker must use the same driver to work together properly. When initializing the master we edited /etc/docker/daemon.json to switch the docker daemon's Cgroup Driver to systemd. Alternatively, you can change the kubelet startup parameters so that it works in cgroupfs mode, by making sure the following configuration file contains --cgroup-driver=cgroupfs:

$ cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2 --resolv-conf=/run/systemd/resolve/resolv.conf"

Worker node node03 was joined to the cluster in exactly this way.

Common kubectl commands

kubectl get componentstatus, short form kubectl get cs: show the health of the cluster components

kubectl get nodes: list the cluster nodes

kubectl get pods -n kube-system: list the status of the Pods in the "kube-system" namespace

kubectl get ns: list the cluster's namespaces

kubectl get deployment -w: watch deployment information in real time

kubectl describe node NODENAME: show detailed information about a node

kubectl cluster-info: show cluster information

kubectl get services, short form kubectl get svc: list services

kubectl get pods --show-labels: show pods together with their labels

kubectl edit svc SERVICE_NAME: edit a running service's definition

kubectl describe deployment DEPLOYMENT_NAME: show detailed information about the specified deployment
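
A few of these commands in use (node02 and the coredns deployment are used as examples):

k8s@node01:~$ kubectl get cs                                        # component health, short form
k8s@node01:~$ kubectl get svc -n kube-system                        # services in one namespace
k8s@node01:~$ kubectl describe node node02                          # detailed node information
k8s@node01:~$ kubectl describe deployment coredns -n kube-system    # detailed deployment information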

Worker node installation and joining the cluster

# Install the required components
root@node02:~# apt-get update && apt-get -y install kubelet kubeadm

# Copy /etc/docker/daemon.json from the master node, mainly for the "exec-opts": ["native.cgroupdriver=systemd"] setting,
# otherwise the kubelet cannot start; restart docker after changing the configuration

# Enable docker and kubelet at boot
root@node02:~# systemctl enable docker kubelet

# Join the cluster
root@node02:~# kubeadm join 192.168.101.40:6443 --token ri964b.aos1fa4h7y2zmu5g --discovery-token-ca-cert-hash sha256:c7c8e629116b4bda1af8ad83236291f1a38ca01bb0abd8a7a8a46c286547d609
W0722 18:42:58.676548   25113 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Back on the master node, check the status again:

# The nodes are now Ready
k8s@node01:~$ sudo kubectl get nodes
NAME     STATUS   ROLES    AGE    VERSION
node01   Ready    master   117m   v1.18.6
node02   Ready    <none>   24m    v1.18.6

# All Pods are running normally
k8s@node01:~$ sudo kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
coredns-66bff467f8-7dr57         1/1     Running   0          116m    10.244.1.3       node02   <none>           <none>
coredns-66bff467f8-xzf9p         1/1     Running   0          116m    10.244.1.2       node02   <none>           <none>
etcd-node01                      1/1     Running   0          116m    192.168.101.40   node01   <none>           <none>
kube-apiserver-node01            1/1     Running   0          116m    192.168.101.40   node01   <none>           <none>
kube-controller-manager-node01   1/1     Running   0          116m    192.168.101.40   node01   <none>           <none>
kube-flannel-ds-amd64-djjs7      1/1     Running   0          6m35s   192.168.101.41   node02   <none>           <none>
kube-flannel-ds-amd64-hthnk      1/1     Running   0          6m35s   192.168.101.40   node01   <none>           <none>
kube-proxy-r2v2p                 1/1     Running   0          23m     192.168.101.41   node02   <none>           <none>
kube-proxy-vlbxb                 1/1     Running   0          116m    192.168.101.40   node01   <none>           <none>
kube-scheduler-node01            1/1     Running   0          116m    192.168.101.40   node01   <none>           <none>

node03 is joined to the cluster the same way; the final cluster state is:

k8s@node01:~$ sudo kubectl get nodes
NAME     STATUS   ROLES    AGE    VERSION
node01   Ready    master   124m   v1.18.6
node02   Ready    <none>   31m    v1.18.6
node03   Ready    <none>   47s    v1.18.6

Removing a worker node

To remove a node from the cluster, perform the following steps in order:

k8s@node01:~$ kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
node01   Ready    master   12d   v1.18.6
node02   Ready    <none>   12d   v1.18.6
node03   Ready    <none>   12d   v1.18.6
node04   Ready    <none>   24h   v1.18.6  # node04 is the node to be removed

# Evict the pods on node04; daemonset-managed pods do not need to be evicted
k8s@node01:~$ kubectl drain node04 --delete-local-data --force --ignore-daemonsets
node/node04 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/canal-ggt5n, kube-system/kube-flannel-ds-amd64-xhksw, kube-system/kube-proxy-g9rpd
node/node04 drained

k8s@node01:~$ kubectl get nodes
NAME     STATUS                     ROLES    AGE   VERSION
node01   Ready                      master   12d   v1.18.6
node02   Ready                      <none>   12d   v1.18.6
node03   Ready                      <none>   12d   v1.18.6
node04   Ready,SchedulingDisabled   <none>   24h   v1.18.6

k8s@node01:~$ kubectl delete nodes node04
node "node04" deleted

k8s@node01:~$ kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
node01   Ready    master   12d   v1.18.6
node02   Ready    <none>   12d   v1.18.6
node03   Ready    <none>   12d   v1.18.6

# Finally, on node04 itself, run
root@node04:~# kubeadm reset
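
kubeadm reset itself warns that it does not clean up iptables/IPVS rules or the CNI configuration; a hedged sketch of the usual extra cleanup on the removed node (skip the ipvsadm line if IPVS was never used):

# On node04, after kubeadm reset: remove leftover iptables/IPVS rules and CNI config
root@node04:~# iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
root@node04:~# ipvsadm --clear
root@node04:~# rm -rf /etc/cni/net.d

If the node had only been drained for maintenance rather than removal, kubectl uncordon node04 on the master would have re-enabled scheduling instead of deleting the node.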

Handling the Unhealthy status of controller-manager and scheduler

After the master is initialized, the following two components still report Unhealthy:

k8s@node01:~$ sudo kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}

Search results online suggest this happens when the node running the controller-manager and scheduler is not the same node where kubectl get cs is executed, which makes the request to http://127.0.0.1:10252 fail. In my case, however, kubectl get cs is run on node01, the same node where both controller-manager and scheduler run. Either way, testing shows this does not affect cluster operation.

Troubleshooting approach:

  1. First confirm that the master node is indeed not listening on ports 10251 and 10252.

  2. Check whether the Pods of the two components are running normally:

    k8s@node01:~$ sudo kubectl get pods -n kube-system -o wide | grep 'scheduler\|controller-manager'
    kube-controller-manager-node01   1/1     Running   1          7m42s   192.168.101.40   node01   <none>           <none>
    kube-scheduler-node01            1/1     Running   0          6h32m   192.168.101.40   node01   <none>           <none>

    Both components are running normally.

  3. So the two components' Pods simply do not listen on those ports at runtime. We need to find the configuration used to run them: the master initialization output shows that static Pod manifests for each component were created under /etc/kubernetes/manifests, so start there:

    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    W0722 17:02:25.619105   25074 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    k8s@node01:~$ ls /etc/kubernetes/manifests/
    etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
  4. Edit the manifest files and remove the --port=0 line; back the files up before modifying them.

    Note:

    When backing up the manifest files, do not put the backups in the same directory, i.e. /etc/kubernetes/manifests; back them up to another directory, or create a subdirectory such as /etc/kubernetes/manifests/bak. Otherwise, even after the steps below, the master node still will not listen on ports 10251 and 10252, and the components will not return to a Healthy status.

k8s@node01:~$ sudo vim /etc/kubernetes/manifests/kube-controller-manager.yaml
- command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --port=0 ########## delete this line ##########
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true

k8s@node01:~$ sudo vim /etc/kubernetes/manifests/kube-scheduler.yaml
- command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    - --port=0   ########## delete this line ##########

# Restart the kubelet service
k8s@node01:~$ sudo systemctl restart kubelet

# Check the listening ports and component status
k8s@node01:~$ sudo ss -tanlp | grep '10251\|10252'
LISTEN   0         128                        *:10251                  *:*       users:(("kube-scheduler",pid=51054,fd=5))
LISTEN   0         128                        *:10252                  *:*       users:(("kube-controller",pid=51100,fd=5))
k8s@node01:~$ sudo kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}

At this point, the single-master kubernetes cluster installation is complete. For a master HA setup, see the official documentation: https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/high-availability/
