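1. Installation requirements (check in advance)
Before you start, the machines for the Kubernetes cluster must meet the following requirements:
- Three machines running CentOS 7.5+ (minimal install)
- Hardware: 2 GB RAM, 2 CPUs, 30 GB disk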
2. Installation steps
Role | IP |
---|---|
master | 192.168.50.128 |
node1 | 192.168.50.131 |
node2 | 192.168.50.132 |
2.1 Pre-installation preparation
Note: every step in this subsection must be run on all nodes (the master and both worker nodes).
(1) Disable the firewall and SELinux
~]# systemctl disable --now firewalld
~]# setenforce 0
~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
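(3) Disable the swap partition
~]# swapoff -a
~]# sed -i.bak 's/^.*centos-swap/#&/g' /etc/fstab
swapoff -a only disables swap until the next reboot. The sed command makes the change permanent by commenting out the swap mount line in /etc/fstab (the original file is kept as /etc/fstab.bak). The pattern assumes the default CentOS volume name centos-swap.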
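The centos-swap pattern only matches the default CentOS LVM volume name. As a rough sketch for hosts where the swap device is named differently, the helper below comments out every active fstab entry that has a whitespace-delimited swap field. The function name comment_out_swap and its file argument are illustrative and not part of the original steps; GNU sed is assumed.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
# Usage (as root): comment_out_swap /etc/fstab   (a .bak copy is kept next to the file)
comment_out_swap() {
    local fstab="${1:-/etc/fstab}"
    # Skip lines that are already comments; on the remaining lines,
    # prefix '#' when a whitespace-delimited "swap" field is present.
    sed -i.bak -E '/^[[:space:]]*#/!{
/[[:space:]]swap[[:space:]]/s/^/#/
}' "$fstab"
}
```

Running it a second time changes nothing, because lines that already start with # are skipped.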
(4) Set the hostnames
On the master node:
~]# hostnamectl set-hostname master
On node1:
~]# hostnamectl set-hostname node1
On node2:
~]# hostnamectl set-hostname node2
Then run bash to start a new shell that picks up the new hostname.
(5) Add hosts entries for name resolution
~]# cat >>/etc/hosts <<EOF
192.168.50.128 master
192.168.50.131 node1
192.168.50.132 node2
EOF
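Because the heredoc appends unconditionally, running it twice leaves duplicate lines in /etc/hosts. A minimal sketch of an idempotent alternative follows; the function name add_host_entry and the optional file argument are assumptions made for illustration.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME
# does not already appear in it as a whole word.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # -q: quiet, -w: whole-word match, -F: treat NAME as a literal string.
    if ! grep -qwF -- "$name" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage (as root), with the addresses from the role table:
#   add_host_entry 192.168.50.128 master
#   add_host_entry 192.168.50.131 node1
#   add_host_entry 192.168.50.132 node2
```

The check is deliberately simple: it also treats a name that appears in a comment line as present, which errs on the side of not touching the file.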
(6) Enable IPv4 forwarding and let iptables see bridged traffic
~]# cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
~]# sysctl --system   # apply immediately
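The net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so sysctl --system can skip them without obvious complaint. The sketch below checks a key against an expected value by reading it under /proc/sys. The function name check_sysctl and the optional root argument are assumptions; the root argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: report whether a sysctl key has the expected value.
check_sysctl() {
    local key="$1" want="$2" root="${3:-/proc/sys}"
    local path got
    # net.ipv4.ip_forward -> net/ipv4/ip_forward
    path="$root/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "$key: missing (for net.bridge.* keys, try: modprobe br_netfilter)"
        return 1
    fi
    got="$(cat "$path")"
    if [ "$got" = "$want" ]; then
        echo "$key: ok"
    else
        echo "$key: $got (expected $want)"
        return 1
    fi
}

# Usage on each node:
#   check_sysctl net.ipv4.ip_forward 1
#   check_sysctl net.bridge.bridge-nf-call-iptables 1
```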
(7) Configure the yum repositories
All nodes use the base and epel repositories from the Alibaba Cloud mirror.
~]# mv /etc/yum.repos.d/* /tmp
~]# curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
~]# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
(8) Set the time zone and synchronize the clock
~]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
~]# yum install dnf ntpdate -y
~]# dnf makecache
~]# ntpdate ntp.aliyun.com
2.2 Install Docker
(1) Add the Docker yum repository
~]# curl -o /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
~]# cat /etc/yum.repos.d/docker-ce.repo
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/7/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-ce/linux/centos/gpg
.......
(2) Install docker-ce
List all installable versions:
~]# dnf list docker-ce --showduplicates
docker-ce.x86_64    3:18.09.6-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.7-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.8-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.9-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.0-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.1-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.2-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.3-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.4-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.5-3.el7    docker-ce-stable
.....
Here we install the latest version of Docker. Every node needs the Docker service.
~]# dnf install -y docker-ce docker-ce-cli
(3) Start Docker and enable it at boot
~]# systemctl enable --now docker
Check the version number to confirm that Docker was installed successfully:
~]# docker --version
Docker version 19.03.12, build 48a66213fea
The command above only shows the Docker client version. The command below is recommended instead, because it shows both the client and the server version:
~]# docker version
Client:
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        039a7df9ba
 Built:             Wed Sep 4 16:51:21 2019
 OS/Arch:           linux/amd64
 Experimental:      false
Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       039a7df
  Built:            Wed Sep 4 16:22:32 2019
  OS/Arch:          linux/amd64
  Experimental:     false
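(4) Switch Docker to a registry mirror
There are many registry mirrors inside China, for example Alibaba Cloud, Tsinghua University, USTC, and Docker's official China mirror.
~]# cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://f1bhsuge.mirror.aliyuncs.com"]
}
EOF
Restart Docker so that it loads the new registry mirror:
~]# systemctl restart docker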
2.3 Install the Kubernetes services
(1) Add the Kubernetes yum repository
How to find it: open mirrors.aliyun.com in a browser and look for kubernetes; the page shows the repository definition.
~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
(2) Install kubeadm, kubelet and kubectl
Every node needs these components.
~]# dnf list kubeadm --showduplicates
kubeadm.x86_64    1.17.7-0    kubernetes
kubeadm.x86_64    1.17.7-1    kubernetes
kubeadm.x86_64    1.17.8-0    kubernetes
kubeadm.x86_64    1.17.9-0    kubernetes
kubeadm.x86_64    1.18.0-0    kubernetes
kubeadm.x86_64    1.18.1-0    kubernetes
kubeadm.x86_64    1.18.2-0    kubernetes
kubeadm.x86_64    1.18.3-0    kubernetes
kubeadm.x86_64    1.18.4-0    kubernetes
kubeadm.x86_64    1.18.4-1    kubernetes
kubeadm.x86_64    1.18.5-0    kubernetes
kubeadm.x86_64    1.18.6-0    kubernetes
Kubernetes releases change quickly, which is why the available versions are listed first. We install version 1.18.6 on all nodes.
~]# dnf install -y kubelet-1.18.6 kubeadm-1.18.6 kubectl-1.18.6
(3) Enable the service at boot
Enable kubelet at boot now, but do not start the service yet.
~]# systemctl enable kubelet
2.4 Deploy the master node with kubeadm
(1) Generate the initial configuration file
Run the following command on the master node. It may print WARNING messages, but they do not affect the deployment:
~]# kubeadm config print init-defaults > kubeadm-init.yaml
The file kubeadm-init.yaml is the configuration used for initialization. The parameters to change are advertiseAddress, nodeRegistration.name, imageRepository, kubernetesVersion and podSubnet:
[root@master ~]# cat kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.50.128
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers   # Alibaba Cloud image mirror
kind: ClusterConfiguration
kubernetesVersion: v1.18.3   # Kubernetes version
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12   # the default is fine; a custom CIDR also works
  podSubnet: 10.244.0.0/16   # added: the pod network CIDR
scheduler: {}
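The same edits can be scripted instead of made by hand. The sketch below is one possible way to do it with GNU sed; the function name patch_kubeadm_config and its argument order are assumptions for illustration, and the image repository is the one used in this walkthrough. As a rule of thumb, kubernetesVersion should not be newer than the installed kubeadm package (1.18.6 here).

```shell
# Hypothetical helper: set the per-cluster fields in kubeadm-init.yaml.
# Arguments: FILE ADVERTISE_IP K8S_VERSION POD_CIDR
patch_kubeadm_config() {
    local file="$1" ip="$2" version="$3" pod_cidr="$4"
    # Rewrite the three existing keys in place; the original is kept as FILE.bak.
    sed -i.bak -E \
        -e "s|^([[:space:]]*advertiseAddress:).*|\1 ${ip}|" \
        -e "s|^(kubernetesVersion:).*|\1 ${version}|" \
        -e "s|^(imageRepository:).*|\1 registry.cn-hangzhou.aliyuncs.com/google_containers|" \
        "$file"
    # podSubnet is not in the generated defaults: add it right after
    # serviceSubnet, reusing that line's indentation.
    if ! grep -q 'podSubnet:' "$file"; then
        sed -i -E "s|^([[:space:]]*)serviceSubnet:.*|&\n\1podSubnet: ${pod_cidr}|" "$file"
    fi
}

# Usage on the master node:
#   patch_kubeadm_config kubeadm-init.yaml 192.168.50.128 v1.18.6 10.244.0.0/16
```

The 10.244.0.0/16 pod CIDR is the one the default kube-flannel.yml expects, which is why the walkthrough uses it.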
(2) Pull the images in advance
If you run kubeadm init directly, it pulls the images itself partway through, which is slow. I recommend doing the two steps separately, so the images are pulled first. Run the following command on the master node:
[root@master ~]# kubeadm config images pull --config kubeadm-init.yaml
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.4.3-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.6.5
If the output starts with two lines of warning messages (not shown here), do not worry: they are only warnings and do not affect the exercise.
Now that the images have been pulled, initialization can start.
(3) Initialize the Kubernetes master node
Run the following command:
[root@master ~]# kubeadm init --config kubeadm-init.yaml
[init] Using Kubernetes version: v1.18.3
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.50.128]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [192.168.50.128 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [192.168.50.128 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0629 21:47:51.709568 39444 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0629 21:47:51.711376 39444 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 14.003225 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.50.128:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:05b84c41152f72ca33afe39a7ef7fa359eec3d3ed654c2692b665e2c4810af3e
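The whole process took about 15 seconds. Initialization is this fast because the images were pulled in advance.
If there are no errors, as above, and the output ends with the kubeadm join 192.168.50.128:6443 --token abcdef.0123456789abcdef command, the master node has been initialized successfully.
As the final hints in the output say, a little follow-up work is still needed before the cluster can be used. Run it on the master node:
[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
The master node is now initialized. Pay attention to the last lines of the output: they are the command that worker nodes use to authenticate and join the cluster. The --discovery-token-ca-cert-hash value is a SHA-256 hash of the cluster CA's public key; a node needs it, together with the bootstrap token, to join this cluster.
If you list the cluster nodes now, only the master node is present.
[root@master ~]# kubectl get node
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   2m53s   v1.18.6
Next, we add the worker nodes to the cluster.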
2.5 Join the worker nodes to the Kubernetes cluster
First, be clear about the join command: it is the one printed after the master node was initialized successfully.
Note that your token and hash will differ from mine, so use the command from your own output. Mine is shown here only as a demonstration.
~]# kubeadm join 192.168.50.128:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:05b84c41152f72ca33afe39a7ef7fa359eec3d3ed654c2692b665e2c4810af3e
(1) Join node1 to the cluster
[root@node1 ~]# kubeadm join 192.168.50.128:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:05b84c41152f72ca33afe39a7ef7fa359eec3d3ed654c2692b665e2c4810af3e
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster
The line This node has joined the cluster means the node joined the cluster successfully.
(2) Join node2 to the cluster
Run the same command on node2. Once all nodes have joined, list the nodes of the cluster from the master node:
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   2m53s   v1.18.6
node1    NotReady   <none>   73s     v1.18.6
node2    NotReady   <none>   7s      v1.18.6
All three nodes are present, but the cluster cannot be used yet: the STATUS column shows NotReady for every node. The reason is that no network plugin has been installed. Here we use the flannel plugin.
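When scripting this check, the STATUS column can be filtered with awk instead of read by eye. A minimal sketch follows; the function name not_ready_nodes is illustrative, and it assumes the default column layout of kubectl get nodes shown above.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    # NR > 1 skips the header row; $2 is the STATUS column.
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage on the master node:
#   kubectl get nodes | not_ready_nodes
```

A cordoned node reports Ready,SchedulingDisabled and is therefore listed as well, which is usually what you want from a readiness check.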
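2.6 Install the Flannel network plugin
Flannel is an overlay network tool designed by the CoreOS team for Kubernetes. Its purpose is to give every host that runs Kubernetes a complete subnet of its own.
Flannel provides a virtual network for containers by assigning a subnet to each host. It encapsulates IP packets (the original implementation uses Linux TUN/TAP with UDP; the kube-flannel.yml manifest uses VXLAN) to build the overlay network, and it records the subnet assignments in etcd or, when deployed with kube-flannel.yml, through the Kubernetes API.
(1) The default method
Most tutorials online install flannel with this command:
~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
In practice this fails for many users because of network restrictions inside China, so the following approach can be used instead.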
(2) Use a mirror for the flannel image
Add the following line to the local /etc/hosts file so that raw.githubusercontent.com resolves and the manifest can be downloaded:
199.232.28.133 raw.githubusercontent.com
Then download the flannel manifest:
[root@master ~]# curl -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Change the image registry by replacing every quay.io in the YAML file with quay-mirror.qiniu.com:
[root@master ~]# sed -i 's/quay.io/quay-mirror.qiniu.com/g' kube-flannel.yml
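After the replacement it is worth confirming which images the manifest now references, for example to pre-pull them on each node. The sketch below lists the distinct image references in a manifest. The function name list_images is an assumption, and the parsing is deliberately naive: it looks for an image: token on a line rather than parsing YAML.

```shell
# Hypothetical helper: print the distinct image references found in a manifest.
list_images() {
    # Print the field that follows an "image:" token, wherever it sits on the
    # line, so both "image: x" and "- image: x" are handled.
    awk '{ for (i = 1; i < NF; i++) if ($i == "image:") print $(i + 1) }' "$1" | sort -u
}

# Usage on the master node:
#   list_images kube-flannel.yml
# A stricter variant of the replacement escapes the dot so that it matches
# only a literal "quay.io":
#   sed -i 's#quay\.io#quay-mirror.qiniu.com#g' kube-flannel.yml
```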
Then apply the manifest on the master node:
[root@master ~]# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
The flannel image can now be pulled successfully. You can also use the kube-flannel.yml file that I provide.
- Check whether the kube-flannel pods are running normally
[root@master ~]# kubectl get pod -n kube-system | grep kube-flannel
kube-flannel-ds-amd64-8svs6   1/1   Running   0   44s
kube-flannel-ds-amd64-k5k4k   0/1   Running   0   44s
kube-flannel-ds-amd64-mwbwp   0/1   Running   0   44s
(3) What to do when the image cannot be pulled
If a kube-flannel pod is not in the Running state in the check above, the pod has a problem and needs further analysis.
Run kubectl describe pod xxxx and look for errors such as:
Normal   BackOff   24m (x6 over 26m)    kubelet, master3   Back-off pulling image "quay-mirror.qiniu.com/coreos/flannel:v0.12.0-amd64"
Warning  Failed    11m (x64 over 26m)   kubelet, master3   Error: ImagePullBackOff
or:
Error response from daemon: Get https://quay.io/v2/: net/http: TLS handshake timeout
Both indicate that the image cannot be pulled because of a network problem. I have prepared the flannel image in advance; just load it:
[root@master ~]# docker load -i flannel.tar
2.7 Verify that the nodes are usable
Wait a moment, then check the node status:
[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   82m   v1.17.6
node1    Ready    <none>   60m   v1.17.6
node2    Ready    <none>   55m   v1.17.6
All nodes now report Ready, which means the cluster nodes are usable.
3. Test the Kubernetes cluster
3.1 Cluster test
(1) Create an nginx pod
Create an nginx pod in the cluster to verify that workloads run normally.
Run the following steps on the master node:
[root@master ~]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
Now look at the pod and the service:
[root@master ~]# kubectl get pod,svc -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
pod/nginx-86c57db685-kk755   1/1     Running   0          29m   10.244.1.10   node1   <none>           <none>

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE   SELECTOR
service/kubernetes   ClusterIP   10.96.0.1     <none>        443/TCP        24h   <none>
service/nginx        NodePort    10.96.5.205   <none>        80:32627/TCP   29m   app=nginx
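The first half of the output describes the pod and the second half the services. The service/nginx row shows that the service is exposed outside the cluster on node port 32627. Remember this port.
The pod details show that the pod runs on node1, whose IP address is 192.168.50.131.
(2) Access nginx to verify the cluster
Now open a browser (Firefox is recommended) and go to http://192.168.50.131:32627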
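The node port is assigned at random, so a script should look it up rather than hard-code it. The sketch below shows two ways: a small parser for the PORT(S) column (the function name node_port_of is an assumption), and, in the usage comments, kubectl's jsonpath output. NODE_IP is an illustrative variable set to node1's address from the role table.

```shell
# Hypothetical helper: pull the node port out of a PORT(S) value such as "80:32627/TCP".
node_port_of() {
    printf '%s\n' "$1" | sed -E 's|^[0-9]+:([0-9]+)/.*$|\1|'
}

# Usage on the master node:
#   NODE_IP=192.168.50.131
#   PORT=$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}')
#   curl -I "http://${NODE_IP}:${PORT}"
```

A NodePort service listens on every node, so any of the three node addresses should answer, not only the node that hosts the pod.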
3.2 Install the dashboard
(1) Create the dashboard
Download the dashboard manifest first. The download works because the hosts entry was added earlier.
~]# curl -o recommended.yaml https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
By default the Dashboard is reachable only from inside the cluster. Change its Service to the NodePort type to expose it externally.
The Service is at roughly lines 32-44 of the file. Change it as follows:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort   # add this line
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001   # add this line; port 30001 can be changed
  selector:
    k8s-app: kubernetes-dashboard
- Apply the YAML file
[root@master ~]# kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
...
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
- Check whether the dashboard is running normally
[root@master ~]# kubectl get pod,svc -n kubernetes-dashboard -o wide
NAME                                             READY   STATUS    RESTARTS   AGE     IP           NODE
pod/dashboard-metrics-scraper-76585494d8-vd9w6   1/1     Running   0          4h50m   10.244.2.3   node2
pod/kubernetes-dashboard-594b99b6f4-72zxw        1/1     Running   0          4h50m   10.244.2.2   node2

NAME                                TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE     SELECTOR
service/dashboard-metrics-scraper   ClusterIP   10.96.45.110   <none>        8000/TCP        4h50m   k8s-app=dashboard-metrics-scraper
service/kubernetes-dashboard        NodePort    10.96.217.29   <none>        443:30001/TCP   4h50m   k8s-app=kubernetes-dashboard
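The output shows that kubernetes-dashboard-594b99b6f4-72zxw runs on node2 and that the exposed port is 30001, so the access URL is https://192.168.50.132:30001
- Access from a browser
The login page asks for a token. The token value can be read with:
~]# kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Enter the token value to open the dashboard.
At this point you can log in, but you do not have enough permissions to view cluster information, because no cluster role has been bound yet. Try the steps above first, then continue with the step below.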
(2) Bind the cluster-admin role
~]# kubectl create serviceaccount dashboard-admin -n kube-system
~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
~]# kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Log in to the dashboard again with the token printed by the last command.
4. Summary of cluster errors
(1) Image not found when pulling
[root@master ~]# kubeadm config images pull --config kubeadm-init.yaml
W0801 11:00:00.705044 2780 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
failed to pull image "registry.aliyuncs.com/google_containers/kube-apiserver:v1.18.4": output: Error response from daemon: manifest for registry.aliyuncs.com/google_containers/kube-apiserver:v1.18.4 not found: manifest unknown: manifest unknown
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
The requested Kubernetes image version is too new for the mirror, which does not have it yet. Lower the version by changing kubernetesVersion in kubeadm-init.yaml.
(2) Docker cgroup driver error
The following error is common while installing Kubernetes:
failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
The cause is that the cgroup driver used by Docker differs from the one used by the kubelet. The two must match.
1. Change Docker's cgroup driver
Edit the /etc/docker/daemon.json file so that it contains:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
Then restart Docker:
systemctl daemon-reload
systemctl restart docker
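If /etc/docker/daemon.json already holds the registry mirror from section 2.2, overwriting it with only exec-opts would drop the mirror. A minimal sketch that writes both settings into one file follows; the function name write_daemon_json and its arguments are assumptions for illustration.

```shell
# Hypothetical helper: write a daemon.json that keeps the registry mirror
# and sets the systemd cgroup driver in the same file.
write_daemon_json() {
    local file="$1" mirror="$2"
    cat > "$file" <<EOF
{
  "registry-mirrors": ["${mirror}"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Usage (as root), followed by a restart and a check of the active driver:
#   write_daemon_json /etc/docker/daemon.json https://f1bhsuge.mirror.aliyuncs.com
#   systemctl restart docker
#   docker info | grep -i 'cgroup driver'
```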
(3) Worker node reports that the connection to localhost:8080 was refused
Running kubectl get pod on a worker node fails as follows:
[root@node1 ~]# kubectl get pod
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The cause is that kubectl needs the kubernetes-admin credentials to run. Without a kubeconfig file it falls back to localhost:8080.
Solution:
On the master node, copy the /etc/kubernetes/admin.conf file to the /etc/kubernetes directory of the worker node, then set the environment variable on the worker node:
[root@node1 images]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
[root@node1 images]# source ~/.bash_profile
Run kubectl get pod on the worker node again:
[root@node1 ~]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
nginx-f89759699-z4fc2   1/1     Running   0          20m
(4) Identity validation error when a worker node joins the cluster
[root@node1 ~]# kubeadm join 192.168.50.128:6443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:05b84c41152f72ca33afe39a7ef7fa359eec3d3ed654c2692b665e2c4810af3e
W0801 11:06:05.871557 2864 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: cluster CA found in cluster-info ConfigMap is invalid: none of the public keys "sha256:a74a8f5a2690aa46bd2cd08af22276c08a0ed9489b100c0feb0409e1f61dc6d0" are pinned
To see the stack trace of this error execute with --v=5 or higher
The hash was copied incorrectly. Copy the join command again from the output of the master initialization.
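If the original output is no longer available, the hash can be recomputed from the CA certificate on the master node, and kubeadm can print a fresh join command. The sketch below follows the openssl pipeline commonly shown for kubeadm. The function name ca_cert_hash is an assumption, and it assumes an RSA CA key, which is the kubeadm default.

```shell
# Hypothetical helper: print the SHA-256 hash of a CA certificate's public key,
# i.e. the value that follows "sha256:" in --discovery-token-ca-cert-hash.
ca_cert_hash() {
    local cert="${1:-/etc/kubernetes/pki/ca.crt}"
    openssl x509 -pubkey -noout -in "$cert" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}

# Usage on the master node:
#   ca_cert_hash
# Or let kubeadm create a new token and print the complete join command:
#   kubeadm token create --print-join-command
```

The second form is also the way out when the bootstrap token has expired; the ttl in kubeadm-init.yaml is 24h0m0s.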
(5) Swap was not disabled when initializing the master node
[ERROR Swap]: running with swap on is not supported. Please disable swap
Disable the swap partition:
swapoff -a
sed -i.bak 's/^.*centos-swap/#&/g' /etc/fstab
(6) kubectl get cs shows components as unhealthy
[root@master ~]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true"}
Edit the manifests of the scheduler and controller-manager components and remove the --port=0 argument from each. The manifests are in /etc/kubernetes/manifests/, in the files kube-controller-manager.yaml and kube-scheduler.yaml.
Save the files after editing. No manual restart is needed. After about half a minute the cluster recovers by itself, and kubectl get cs shows the components as healthy.
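The edit can also be scripted. The sketch below removes the argument line with GNU sed and stores the backup in a separate directory, so that no stray copy is left inside the manifests directory that the kubelet watches. The function name remove_port_zero and the backup path in the usage comments are assumptions for illustration.

```shell
# Hypothetical helper: drop the "- --port=0" argument from a static Pod manifest.
# Arguments: MANIFEST BACKUP_DIR
remove_port_zero() {
    local manifest="$1" backup_dir="$2"
    # Keep a copy outside the manifests directory before editing.
    mkdir -p "$backup_dir"
    cp -p "$manifest" "$backup_dir/"
    # Delete only a line that consists of the list item "- --port=0".
    sed -i -E '/^[[:space:]]*-[[:space:]]+--port=0[[:space:]]*$/d' "$manifest"
}

# Usage on the master node (as root):
#   remove_port_zero /etc/kubernetes/manifests/kube-scheduler.yaml /root/manifests-backup
#   remove_port_zero /etc/kubernetes/manifests/kube-controller-manager.yaml /root/manifests-backup
```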
(7) Dashboard error: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
This is really still a cluster networking problem, but the nodes and the flannel pods can all look healthy, so it is hard to track down. The quickest workaround is to schedule the dashboard onto the master node.
Edit the dashboard manifest and comment out the following lines (at roughly lines 232-234):
      nodeSelector:
        "beta.kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      # tolerations:
      #   - key: node-role.kubernetes.io/master
      #     effect: NoSchedule
That is, comment out the last three lines shown above.
Next, pin the pod to a node:
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      nodeName: master
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.0.0-beta8
          imagePullPolicy: Always
          ports:
At roughly line 190, add the line nodeName: master
Save the file and run kubectl apply again to apply the change.
If you want to investigate further, check whether the network range defined for flannel is the problem.
5. References
A blog post I consulted, recorded here: https://www.cnblogs.com/FengGeBlog/p/10810632.html