01.系統初始化和全局變量 集羣機器 test1:192.168.0.91 test2:192.168.0.92 test3:192.168.0.93 主機名 設置永久主機名稱,而後從新登陸: sudo hostnamectl set-hostname test1 # 將 test1 替換爲當前主機名 設置的主機名保存在 /etc/hostname 文件中; 修改每臺機器的 /etc/hosts 文件,添加主機名和 IP 的對應關係: grep kube-node /etc/hosts 192.168.0.91 test1 test1 192.168.0.92 test2 test2 192.168.0.93 test3 test3 添加 k8s 和 docker 帳戶 在每臺機器上添加 k8s 帳戶,能夠無密碼 sudo: sudo useradd -m k8s sudo sh -c 'echo 123456 | passwd k8s --stdin' # 爲 k8s 帳戶設置密碼 sudo visudo sudo grep '%wheel.*NOPASSWD: ALL' /etc/sudoers %wheel ALL=(ALL) NOPASSWD: ALL sudo gpasswd -a k8s wheel 在每臺機器上添加 docker 帳戶,將 k8s 帳戶添加到 docker 組中,同時配置 dockerd 參數: sudo useradd -m docker sudo gpasswd -a k8s docker sudo mkdir -p /etc/docker/ cat /etc/docker/daemon.json { "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn"], "max-concurrent-downloads": 20 } 無密碼 ssh 登陸其它節點 若是沒有特殊指明,本文檔的全部操做均在 test1 節點上執行,而後遠程分發文件和執行命令。? 設置 test1 能夠無密碼登陸全部節點的 k8s 和 root 帳戶: [k8s@test1 k8s]$ ssh-keygen -t rsa [k8s@test1 k8s]$ ssh-copy-id root@test1 [k8s@test1 k8s]$ ssh-copy-id root@test2 [k8s@test1 k8s]$ ssh-copy-id root@test3 [k8s@test1 k8s]$ ssh-copy-id k8s@test1 [k8s@test1 k8s]$ ssh-copy-id k8s@test2 [k8s@test1 k8s]$ ssh-copy-id k8s@test3 將可執行文件路徑 /opt/k8s/bin 添加到 PATH 變量中 在每臺機器上添加環境變量: sudo sh -c "echo 'PATH=/opt/k8s/bin:$PATH:$HOME/bin:$JAVA_HOME/bin' >>/root/.bashrc" echo 'PATH=/opt/k8s/bin:$PATH:$HOME/bin:$JAVA_HOME/bin' >>~/.bashrc 安裝依賴包 在每臺機器上安裝依賴包: CentOS: sudo yum install -y epel-release sudo yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp ipvs 依賴 ipset 關閉防火牆 在每臺機器上關閉防火牆: sudo systemctl stop firewalld sudo systemctl disable firewalld sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat sudo sudo iptables -P FORWARD ACCEPT 關閉 swap 分區 若是開啓了 swap 分區,kubelet 會啓動失敗(能夠經過將參數 --fail-swap-on 設置爲 false 來忽略 swap on),故須要在每臺機器上關閉 swap 分區: sudo swapoff -a 爲了防止開機自動掛載 swap 分區,能夠註釋 /etc/fstab 中相應的條目: sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab 關閉 SELinux 關閉 SELinux,不然後續 K8S 掛載目錄時可能報錯 Permission denied: sudo setenforce 0 grep SELINUX /etc/selinux/config SELINUX=disabled 修改配置文件,永久生效; 關閉 dnsmasq linux 系統開啓了 dnsmasq 後(如 GUI 環境),將系統 DNS Server 設置爲 127.0.0.1,這會致使 docker 容器沒法解析域名,須要關閉它: sudo service dnsmasq stop sudo systemctl disable dnsmasq 設置系統參數 cat > kubernetes.conf <<EOF net.bridge.bridge-nf-call-iptables=1 net.bridge.bridge-nf-call-ip6tables=1 net.ipv4.ip_forward=1 vm.swappiness=0 vm.overcommit_memory=1 vm.panic_on_oom=0 fs.inotify.max_user_watches=89100 EOF sudo cp kubernetes.conf /etc/sysctl.d/kubernetes.conf sudo sysctl -p /etc/sysctl.d/kubernetes.conf sudo mount -t cgroup -o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct 加載內核模塊 sudo modprobe br_netfilter sudo modprobe ip_vs 設置系統時區 sudo timedatectl set-timezone Asia/Shanghai 將當前的 UTC 時間寫入硬件時鐘 sudo timedatectl set-local-rtc 0 重啓依賴於系統時間的服務 sudo systemctl restart rsyslog sudo systemctl restart crond 建立目錄 在每臺機器上建立目錄: sudo mkdir -p /opt/k8s/bin sudo chown -R k8s /opt/k8s sudo sudo mkdir -p /etc/kubernetes/cert sudo chown -R k8s /etc/kubernetes sudo mkdir -p /etc/etcd/cert sudo chown -R k8s /etc/etcd/cert sudo mkdir -p /var/lib/etcd && chown -R k8s /etc/etcd/cert 集羣環境變量 後續的部署步驟將使用下面定義的全局環境變量,請根據本身的機器、網絡狀況修改: #!/usr/bin/bash # 生成 EncryptionConfig 所需的加密 key ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64) # 最好使用 當前未用的網段 來定義服務網段和 Pod 網段 # 服務網段,部署前路由不可達,部署後集羣內路由可達(kube-proxy 和 ipvs 保證) SERVICE_CIDR="10.254.0.0/16" # Pod 網段,建議 /16 段地址,部署前路由不可達,部署後集羣內路由可達(flanneld 保證) CLUSTER_CIDR="172.30.0.0/16" # 服務端口範圍 (NodePort 
Range) export NODE_PORT_RANGE="8400-9000" # 集羣各機器 IP 數組 export NODE_IPS=(192.168.0.91 192.168.0.92 192.168.0.93) # 集羣各 IP 對應的 主機名數組 export NODE_NAMES=(test1 test2 test3) # kube-apiserver 的 VIP(HA 組件 keepalived 發佈的 IP) export MASTER_VIP="192.168.0.235" # kube-apiserver VIP 地址(HA 組件 haproxy 監聽 8443 端口) export KUBE_APISERVER="https://${MASTER_VIP}:8443" # HA 節點,VIP 所在的網絡接口名稱 export VIP_IF="eth0" # etcd 集羣服務地址列表 export ETCD_ENDPOINTS="https://192.168.0.91:2379,https://192.168.0.92:2379,https://192.168.0.93:2379" # etcd 集羣間通訊的 IP 和端口 export ETCD_NODES="test1=https://192.168.0.91:2380,test2=https://192.168.0.92:2380,test3=https://192.168.0.93:2380" # flanneld 網絡配置前綴 export FLANNEL_ETCD_PREFIX="/kubernetes/network" # kubernetes 服務 IP (通常是 SERVICE_CIDR 中第一個IP) export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1" # 集羣 DNS 服務 IP (從 SERVICE_CIDR 中預分配) export CLUSTER_DNS_SVC_IP="10.254.0.2" # 集羣 DNS 域名 export CLUSTER_DNS_DOMAIN="cluster.local." # 將二進制目錄 /opt/k8s/bin 加到 PATH 中 export PATH=/opt/k8s/bin:$PATH 打包後的變量定義見 environment.sh,後續部署時會提示導入該腳本; 分發集羣環境變量定義腳本 把全局變量定義腳本拷貝到全部節點的 /opt/k8s/bin 目錄: source environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp environment.sh k8s@${node_ip}:/opt/k8s/bin/ ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*" done 02.建立 CA 證書和祕鑰 爲確保安全,kubernetes 系統各組件須要使用 x509 證書對通訊進行加密和認證。 CA (Certificate Authority) 是自簽名的根證書,用來簽名後續建立的其它證書。 本文檔使用 CloudFlare 的 PKI 工具集 cfssl 建立全部證書。 安裝 cfssl 工具集 sudo mkdir -p /opt/k8s/cert && sudo chown -R k8s /opt/k8s && cd /opt/k8s wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 mv cfssl_linux-amd64 /opt/k8s/bin/cfssl wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 mv cfssljson_linux-amd64 /opt/k8s/bin/cfssljson wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 mv cfssl-certinfo_linux-amd64 /opt/k8s/bin/cfssl-certinfo chmod +x /opt/k8s/bin/* export PATH=/opt/k8s/bin:$PATH 建立根證書 (CA) CA 證書是集羣全部節點共享的,只須要建立一個 CA 證書,後續建立的全部證書都由它簽名。 建立配置文件 CA 配置文件用於配置根證書的使用場景 (profile) 和具體參數 (usage,過時時間、服務端認證、客戶端認證、加密等),後續在簽名其它證書時須要指定特定場景。 cat > ca-config.json <<EOF { "signing": { "default": { "expiry": "87600h" }, "profiles": { "kubernetes": { "usages": [ "signing", "key encipherment", "server auth", "client auth" ], "expiry": "87600h" } } } } EOF signing:表示該證書可用於簽名其它證書,生成的 ca.pem 證書中 CA=TRUE; server auth:表示 client 能夠用該該證書對 server 提供的證書進行驗證; client auth:表示 server 能夠用該該證書對 client 提供的證書進行驗證; 建立證書籤名請求文件 cat > ca-csr.json <<EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF CN:Common Name,kube-apiserver 從證書中提取該字段做爲請求的用戶名 (User Name),瀏覽器使用該字段驗證網站是否合法; O:Organization,kube-apiserver 從證書中提取該字段做爲請求用戶所屬的組 (Group); kube-apiserver 將提取的 User、Group 做爲 RBAC 受權的用戶標識; 生成 CA 證書和私鑰 cfssl gencert -initca ca-csr.json | cfssljson -bare ca ls ca* 分發證書文件 將生成的 CA 證書、祕鑰文件、配置文件拷貝到全部節點的 /etc/kubernetes/cert 目錄下: source /opt/k8s/bin/environment.sh # 導入 NODE_IPS 環境變量 for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert && chown -R k8s /etc/kubernetes" scp ca*.pem ca-config.json k8s@${node_ip}:/etc/kubernetes/cert done k8s 帳戶須要有讀寫 /etc/kubernetes 目錄及其子目錄文件的權限; 03.部署 kubectl 命令行工具 kubectl 是 kubernetes 集羣的命令行管理工具,本文檔介紹安裝和配置它的步驟。 kubectl 默認從 ~/.kube/config 文件讀取 kube-apiserver 地址、證書、用戶名等信息,若是沒有配置,執行 kubectl 命令時會報以下錯誤: kubectl get pods The connection to the server localhost:8080 was refused - did you specify the right host or port? 
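If you hit that error, the quickest way to see which kubeconfig and cluster kubectl is actually using is kubectl's own config introspection. This is only a debugging aid using standard kubectl subcommands, not part of the original procedure:

```bash
kubectl config current-context   # which context is active, if any
kubectl config view --minify     # the cluster/user details behind that context
echo $KUBECONFIG                 # empty means ~/.kube/config is being read
```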
下載和分發 kubectl 二進制文件 下載、解壓: wget https://dl.k8s.io/v1.10.4/kubernetes-client-linux-amd64.tar.gz tar -xzvf kubernetes-client-linux-amd64.tar.gz 分發到全部使用 kubectl 的節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kubernetes/client/bin/kubectl k8s@${node_ip}:/opt/k8s/bin/ ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*" done 建立 admin 證書和私鑰 kubectl 與 apiserver https 安全端口通訊,apiserver 對提供的證書進行認證和受權。 kubectl 做爲集羣的管理工具,須要被授予最高權限。這裏建立具備最高權限的 admin 證書。 建立證書籤名請求: cat > admin-csr.json <<EOF { "CN": "admin", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "4Paradigm" } ] } EOF O 爲 system:masters,kube-apiserver 收到該證書後將請求的 Group 設置爲 system:masters; 預約義的 ClusterRoleBinding cluster-admin 將 Group system:masters 與 Role cluster-admin 綁定,該 Role 授予全部 API的權限; 該證書只會被 kubectl 當作 client 證書使用,因此 hosts 字段爲空; 生成證書和私鑰: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \ -ca-key=/etc/kubernetes/cert/ca-key.pem \ -config=/etc/kubernetes/cert/ca-config.json \ -profile=kubernetes admin-csr.json | cfssljson -bare admin ls admin* 建立 kubeconfig 文件 kubeconfig 爲 kubectl 的配置文件,包含訪問 apiserver 的全部信息,如 apiserver 地址、CA 證書和自身使用的證書; source /opt/k8s/bin/environment.sh # 設置集羣參數 kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/cert/ca.pem \ --embed-certs=true \ --server=${KUBE_APISERVER} \ --kubeconfig=kubectl.kubeconfig # 設置客戶端認證參數 kubectl config set-credentials admin \ --client-certificate=admin.pem \ --client-key=admin-key.pem \ --embed-certs=true \ --kubeconfig=kubectl.kubeconfig # 設置上下文參數 kubectl config set-context kubernetes \ --cluster=kubernetes \ --user=admin \ --kubeconfig=kubectl.kubeconfig # 設置默認上下文 kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig --certificate-authority:驗證 kube-apiserver 證書的根證書; --client-certificate、--client-key:生成的 admin 證書和私鑰,鏈接 kube-apiserver 時使用 --embed-certs=true:將 ca.pem 和 admin.pem 證書內容嵌入到生成的 kubectl.kubeconfig 文件中(不加時,寫入的是證書文件路徑); 分發 kubeconfig 文件 分發到全部使用 kubectl 命令的節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh k8s@${node_ip} "mkdir -p ~/.kube" scp kubectl.kubeconfig k8s@${node_ip}:~/.kube/config ssh root@${node_ip} "mkdir -p ~/.kube" scp kubectl.kubeconfig root@${node_ip}:~/.kube/config done 保存到用戶的 ~/.kube/config 文件; 04.部署 etcd 集羣 etcd 是基於 Raft 的分佈式 key-value 存儲系統,由 CoreOS 開發,經常使用於服務發現、共享配置以及併發控制(如 leader 選舉、分佈式鎖等)。kubernetes 使用 etcd 存儲全部運行數據。 本文檔介紹部署一個三節點高可用 etcd 集羣的步驟: 下載和分發 etcd 二進制文件; 建立 etcd 集羣各節點的 x509 證書,用於加密客戶端(如 etcdctl) 與 etcd 集羣、etcd 集羣之間的數據流; 建立 etcd 的 systemd unit 文件,配置服務參數; 檢查集羣工做狀態; etcd 集羣各節點的名稱和 IP 以下: test1:192.168.0.91 test2:192.168.0.92 test3:192.168.0.93 下載和分發 etcd 二進制文件 tar -xvf etcd-v3.3.7-linux-amd64.tar.gz 分發二進制文件到集羣全部節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp etcd-v3.3.7-linux-amd64/etcd* k8s@${node_ip}:/opt/k8s/bin ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*" done 建立 etcd 證書和私鑰 建立證書籤名請求: cat > etcd-csr.json <<EOF { "CN": "etcd", "hosts": [ "127.0.0.1", "192.168.0.91", "192.168.0.92", "192.168.0.93" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF hosts 字段指定受權使用該證書的 etcd 節點 IP 或域名列表,這裏將 etcd 集羣的三個節點 IP 都列在其中; 生成證書和私鑰: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \ -ca-key=/etc/kubernetes/cert/ca-key.pem \ -config=/etc/kubernetes/cert/ca-config.json \ -profile=kubernetes etcd-csr.json | cfssljson -bare 
etcd ls etcd* 分發生成的證書和私鑰到各 etcd 節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /etc/etcd/cert && chown -R k8s /etc/etcd/cert" scp etcd*.pem k8s@${node_ip}:/etc/etcd/cert/ done 建立 etcd 的 systemd unit 模板文件 source /opt/k8s/bin/environment.sh cat > etcd.service.template <<EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] User=k8s Type=notify WorkingDirectory=/var/lib/etcd/ ExecStart=/opt/k8s/bin/etcd \\ --data-dir=/var/lib/etcd \\ --name=##NODE_NAME## \\ --cert-file=/etc/etcd/cert/etcd.pem \\ --key-file=/etc/etcd/cert/etcd-key.pem \\ --trusted-ca-file=/etc/kubernetes/cert/ca.pem \\ --peer-cert-file=/etc/etcd/cert/etcd.pem \\ --peer-key-file=/etc/etcd/cert/etcd-key.pem \\ --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \\ --peer-client-cert-auth \\ --client-cert-auth \\ --listen-peer-urls=https://##NODE_IP##:2380 \\ --initial-advertise-peer-urls=https://##NODE_IP##:2380 \\ --listen-client-urls=https://##NODE_IP##:2379,http://127.0.0.1:2379 \\ --advertise-client-urls=https://##NODE_IP##:2379 \\ --initial-cluster-token=etcd-cluster-0 \\ --initial-cluster=${ETCD_NODES} \\ --initial-cluster-state=new Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF User:指定以 k8s 帳戶運行; WorkingDirectory、--data-dir:指定工做目錄和數據目錄爲 /var/lib/etcd,需在啓動服務前建立這個目錄; --name:指定節點名稱,當 --initial-cluster-state 值爲 new 時,--name 的參數值必須位於 --initial-cluster 列表中; --cert-file、--key-file:etcd server 與 client 通訊時使用的證書和私鑰; --trusted-ca-file:簽名 client 證書的 CA 證書,用於驗證 client 證書 --peer-cert-file、--peer-key-file:etcd 與 peer 通訊使用的證書和私鑰; --peer-trusted-ca-file:簽名 peer 證書的 CA 證書,用於驗證 peer 證書; 爲各節點分發 etcd systemd unit 文件 分發時替換模板文件中的變量 source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" etcd.service.template > etcd-${NODE_IPS[i]}.service done ls *.service NODE_NAMES 和 NODE_IPS 爲相同長度的 bash 數組,分別爲節點名稱和對應的 IP; 分發生成的 systemd unit 文件: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /var/lib/etcd && chown -R k8s /var/lib/etcd" scp etcd-${node_ip}.service root@${node_ip}:/etc/systemd/system/etcd.service done 必須先建立 etcd 數據目錄和工做目錄; 啓動 etcd 服務 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd &" done etcd 進程首次啓動時會等待其它節點的 etcd 加入集羣,命令 systemctl start etcd 會卡住一段時間,爲正常現象。 檢查啓動結果 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh k8s@${node_ip} "systemctl status etcd|grep Active" done 驗證服務狀態 部署完 etcd 集羣后,在任一 etc 節點上執行以下命令: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ETCDCTL_API=3 /opt/k8s/bin/etcdctl \ --endpoints=https://${node_ip}:2379 \ --cacert=/etc/kubernetes/cert/ca.pem \ --cert=/etc/etcd/cert/etcd.pem \ --key=/etc/etcd/cert/etcd-key.pem endpoint health done 預期輸出: https://192.168.0.91:2379 is healthy: successfully committed proposal: took = 2.192932ms https://192.168.0.92:2379 is healthy: successfully committed proposal: took = 3.546896ms https://192.168.0.93:2379 is healthy: successfully committed proposal: took = 3.013667ms 輸出均爲 healthy 時表示集羣服務正常。 05.部署 flannel 網絡 kubernetes 要求集羣內各節點(包括 master 節點)能經過 Pod 網段互聯互通。flannel 使用 vxlan 技術爲各節點建立一個能夠互通的 Pod 
網絡。 flaneel 第一次啓動時,從 etcd 獲取 Pod 網段信息,爲本節點分配一個未使用的 /24 段地址,而後建立 flannedl.1(也多是其它名稱,如 flannel1 等) 接口。 flannel 將分配的 Pod 網段信息寫入 /run/flannel/docker 文件,docker 後續使用這個文件中的環境變量設置 docker0 網橋。 下載和分發 flanneld 二進制文件 到 https://github.com/coreos/flannel/releases 頁面下載最新版本的發佈包: mkdir flannel wget https://github.com/coreos/flannel/releases/download/v0.10.0/flannel-v0.10.0-linux-amd64.tar.gz tar -xzvf flannel-v0.10.0-linux-amd64.tar.gz -C flannel 分發 flanneld 二進制文件到集羣全部節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp flannel/{flanneld,mk-docker-opts.sh} k8s@${node_ip}:/opt/k8s/bin/ ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*" done 建立 flannel 證書和私鑰 flannel 從 etcd 集羣存取網段分配信息,而 etcd 集羣啓用了雙向 x509 證書認證,因此須要爲 flanneld 生成證書和私鑰。 建立證書籤名請求: cat > flanneld-csr.json <<EOF { "CN": "flanneld", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF 該證書只會被 kubectl 當作 client 證書使用,因此 hosts 字段爲空; 生成證書和私鑰: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \ -ca-key=/etc/kubernetes/cert/ca-key.pem \ -config=/etc/kubernetes/cert/ca-config.json \ -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld ls flanneld*pem 將生成的證書和私鑰分發到全部節點(master 和 worker): source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /etc/flanneld/cert && chown -R k8s /etc/flanneld" scp flanneld*.pem k8s@${node_ip}:/etc/flanneld/cert done 向 etcd 寫入集羣 Pod 網段信息 注意:本步驟只需執行一次。 source /opt/k8s/bin/environment.sh etcdctl \ --endpoints=${ETCD_ENDPOINTS} \ --ca-file=/etc/kubernetes/cert/ca.pem \ --cert-file=/etc/flanneld/cert/flanneld.pem \ --key-file=/etc/flanneld/cert/flanneld-key.pem \ set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}' flanneld 當前版本 (v0.10.0) 不支持 etcd v3,故使用 etcd v2 API 寫入配置 key 和網段數據; 寫入的 Pod 網段 ${CLUSTER_CIDR} 必須是 /16 段地址,必須與 kube-controller-manager 的 --cluster-cidr 參數值一致; 建立 flanneld 的 systemd unit 文件 source /opt/k8s/bin/environment.sh export IFACE=eth0 cat > flanneld.service << EOF [Unit] Description=Flanneld overlay address etcd agent After=network.target After=network-online.target Wants=network-online.target After=etcd.service Before=docker.service [Service] Type=notify ExecStart=/opt/k8s/bin/flanneld \\ -etcd-cafile=/etc/kubernetes/cert/ca.pem \\ -etcd-certfile=/etc/flanneld/cert/flanneld.pem \\ -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \\ -etcd-endpoints=${ETCD_ENDPOINTS} \\ -etcd-prefix=${FLANNEL_ETCD_PREFIX} \\ -iface=${IFACE} ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker Restart=on-failure [Install] WantedBy=multi-user.target RequiredBy=docker.service EOF mk-docker-opts.sh 腳本將分配給 flanneld 的 Pod 子網網段信息寫入 /run/flannel/docker 文件,後續 docker 啓動時使用這個文件中的環境變量配置 docker0 網橋; flanneld 使用系統缺省路由所在的接口與其它節點通訊,對於有多個網絡接口(如內網和公網)的節點,能夠用 -iface 參數指定通訊接口,如上面的 eth0 接口; flanneld 運行時須要 root 權限; 分發 flanneld systemd unit 文件到全部節點 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp flanneld.service root@${node_ip}:/etc/systemd/system/ done 啓動 flanneld 服務 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld" done 檢查啓動結果 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh k8s@${node_ip} "systemctl status 
flanneld|grep Active" done 確保狀態爲 active (running),不然查看日誌,確認緣由: $ journalctl -u flanneld 檢查分配給各 flanneld 的 Pod 網段信息 查看集羣 Pod 網段(/16): source /opt/k8s/bin/environment.sh etcdctl \ --endpoints=${ETCD_ENDPOINTS} \ --ca-file=/etc/kubernetes/cert/ca.pem \ --cert-file=/etc/flanneld/cert/flanneld.pem \ --key-file=/etc/flanneld/cert/flanneld-key.pem \ get ${FLANNEL_ETCD_PREFIX}/config 輸出: {"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}} 查看已分配的 Pod 子網段列表(/24): source /opt/k8s/bin/environment.sh etcdctl \ --endpoints=${ETCD_ENDPOINTS} \ --ca-file=/etc/kubernetes/cert/ca.pem \ --cert-file=/etc/flanneld/cert/flanneld.pem \ --key-file=/etc/flanneld/cert/flanneld-key.pem \ ls ${FLANNEL_ETCD_PREFIX}/subnets 輸出: /kubernetes/network/subnets/172.30.81.0-24 /kubernetes/network/subnets/172.30.29.0-24 /kubernetes/network/subnets/172.30.39.0-24 查看某一 Pod 網段對應的節點 IP 和 flannel 接口地址: source /opt/k8s/bin/environment.sh etcdctl \ --endpoints=${ETCD_ENDPOINTS} \ --ca-file=/etc/kubernetes/cert/ca.pem \ --cert-file=/etc/flanneld/cert/flanneld.pem \ --key-file=/etc/flanneld/cert/flanneld-key.pem \ get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.81.0-24 輸出: {"PublicIP":"192.168.0.91","BackendType":"vxlan","BackendData":{"VtepMAC":"12:21:93:9e:b1:eb"}} 驗證各節點能經過 Pod 網段互通 在各節點上部署 flannel 後,檢查是否建立了 flannel 接口(名稱可能爲 flannel0、flannel.0、flannel.1 等): source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh ${node_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet" done 輸出: inet 172.30.81.0/32 scope global flannel.1 inet 172.30.29.0/32 scope global flannel.1 inet 172.30.39.0/32 scope global flannel.1 在各節點上 ping 全部 flannel 接口 IP,確保能通: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh ${node_ip} "ping -c 1 172.30.81.0" ssh ${node_ip} "ping -c 1 172.30.29.0" ssh ${node_ip} "ping -c 1 172.30.39.0" done 06-0.部署 master 節點 kubernetes master 節點運行以下組件: kube-apiserver kube-scheduler kube-controller-manager kube-scheduler 和 kube-controller-manager 能夠以集羣模式運行,經過 leader 選舉產生一個工做進程,其它進程處於阻塞模式。 對於 kube-apiserver,能夠運行多個實例(本文檔是 3 實例),但對其它組件須要提供統一的訪問地址,該地址須要高可用。本文檔使用 keepalived 和 haproxy 實現 kube-apiserver VIP 高可用和負載均衡。 下載最新版本的二進制文件 從 CHANGELOG頁面 下載 server tarball 文件。 wget https://dl.k8s.io/v1.10.4/kubernetes-server-linux-amd64.tar.gz tar -xzvf kubernetes-server-linux-amd64.tar.gz cd kubernetes tar -xzvf kubernetes-src.tar.gz 將二進制文件拷貝到全部 master 節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp server/bin/* k8s@${node_ip}:/opt/k8s/bin/ ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*" done 06-1.部署高可用組件 本文檔講解使用 keepalived 和 haproxy 實現 kube-apiserver 高可用的步驟: keepalived 提供 kube-apiserver 對外服務的 VIP; haproxy 監聽 VIP,後端鏈接全部 kube-apiserver 實例,提供健康檢查和負載均衡功能; 運行 keepalived 和 haproxy 的節點稱爲 LB 節點。因爲 keepalived 是一主多備運行模式,故至少兩個 LB 節點。 本文檔複用 master 節點的三臺機器,haproxy 監聽的端口(8443) 須要與 kube-apiserver 的端口 6443 不一樣,避免衝突。 keepalived 在運行過程當中週期檢查本機的 haproxy 進程狀態,若是檢測到 haproxy 進程異常,則觸發從新選主的過程,VIP 將飄移到新選出來的主節點,從而實現 VIP 的高可用。 全部組件(如 kubeclt、apiserver、controller-manager、scheduler 等)都經過 VIP 和 haproxy 監聽的 8443 端口訪問 kube-apiserver 服務。 安裝軟件包 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "yum install -y keepalived haproxy" done 配置和下發 haproxy 配置文件 haproxy 配置文件: cat > haproxy.cfg <<EOF global log /dev/log local0 log /dev/log local1 notice chroot /var/lib/haproxy stats socket /var/run/haproxy-admin.sock mode 660 level admin stats timeout 30s user haproxy group haproxy daemon nbproc 1 
defaults
    log global
    timeout connect 5000
    timeout client  10m
    timeout server  10m

listen admin_stats
    bind 0.0.0.0:10080
    mode http
    log 127.0.0.1 local0 err
    stats refresh 30s
    stats uri /status
    stats realm welcome login\ Haproxy
    stats auth admin:123456
    stats hide-version
    stats admin if TRUE

listen kube-master
    bind 0.0.0.0:8443
    mode tcp
    option tcplog
    balance source
    server 192.168.0.91 192.168.0.91:6443 check inter 2000 fall 2 rise 2 weight 1
    server 192.168.0.92 192.168.0.92:6443 check inter 2000 fall 2 rise 2 weight 1
    server 192.168.0.93 192.168.0.93:6443 check inter 2000 fall 2 rise 2 weight 1
EOF

haproxy exposes its status page on port 10080;
haproxy listens on port 8443 of all interfaces; this port must match the one in the ${KUBE_APISERVER} environment variable;
the server lines list the IP and port of every kube-apiserver instance;

Distribute haproxy.cfg to all master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp haproxy.cfg root@${node_ip}:/etc/haproxy
  done

Start the haproxy service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl restart haproxy"
  done

Check the haproxy service status

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status haproxy|grep Active"
  done

Make sure the state is active (running); otherwise check the logs to find the cause:

journalctl -u haproxy

Check that haproxy is listening on port 8443:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "netstat -lnpt|grep haproxy"
  done

Make sure the output looks like:

tcp        0      0 0.0.0.0:8443            0.0.0.0:*               LISTEN      120583/haproxy

Configure and distribute the keepalived configuration files

keepalived runs in a one-master, multi-backup mode, so there are two kinds of configuration file. There is a single master configuration; the number of backup configurations depends on the node count. For this document the plan is:

master: 192.168.0.91
backup: 192.168.0.92, 192.168.0.93

master configuration file:

source /opt/k8s/bin/environment.sh
cat > keepalived-master.conf <<EOF
global_defs {
    router_id lb-master-105
}

vrrp_script check-haproxy {
    script "killall -0 haproxy"
    interval 5
    weight -30
}

vrrp_instance VI-kube-master {
    state MASTER
    priority 120
    dont_track_primary
    interface ${VIP_IF}
    virtual_router_id 68
    advert_int 3
    track_script {
        check-haproxy
    }
    virtual_ipaddress {
        ${MASTER_VIP}
    }
}
EOF

the interface that holds the VIP (interface ${VIP_IF}) is eth0;
killall -0 haproxy checks whether the local haproxy process is alive; if it is not, the node's priority is lowered (weight -30), which triggers a new master election;
router_id and virtual_router_id identify the keepalived instances of this HA group; if several keepalived HA groups exist, each group must use different values;

backup configuration file:

source /opt/k8s/bin/environment.sh
cat > keepalived-backup.conf <<EOF
global_defs {
    router_id lb-backup-105
}

vrrp_script check-haproxy {
    script "killall -0 haproxy"
    interval 5
    weight -30
}

vrrp_instance VI-kube-master {
    state BACKUP
    priority 110
    dont_track_primary
    interface ${VIP_IF}
    virtual_router_id 68
    advert_int 3
    track_script {
        check-haproxy
    }
    virtual_ipaddress {
        ${MASTER_VIP}
    }
}
EOF

the interface that holds the VIP (interface ${VIP_IF}) is eth0;
killall -0 haproxy checks whether the local haproxy process is alive; if it is not, the node's priority is lowered (weight -30), which triggers a new master election;
router_id and virtual_router_id identify the keepalived instances of this HA group; if several keepalived HA groups exist, each group must use different values;
priority must be lower than the master's value;

Distribute the keepalived configuration files

Distribute the master configuration:

scp keepalived-master.conf root@192.168.0.91:/etc/keepalived/keepalived.conf

Distribute the backup configuration:

scp keepalived-backup.conf root@192.168.0.92:/etc/keepalived/keepalived.conf
scp keepalived-backup.conf root@192.168.0.93:/etc/keepalived/keepalived.conf

Start the keepalived service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl restart keepalived"
  done

Check the keepalived service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status keepalived|grep Active"
  done

Make sure the state is active (running); otherwise check the logs to find the cause:

journalctl -u keepalived
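With haproxy and keepalived running on all three nodes, it can be worth exercising a failover once before moving on. The sketch below assumes test1 (192.168.0.91) currently holds the VIP; it simply stops haproxy there so the check-haproxy script (weight -30) demotes the node, then restores it. Adjust the IP if the VIP currently lives elsewhere.

```bash
# Rough failover drill, run from the deployment node
source /opt/k8s/bin/environment.sh
ssh root@192.168.0.91 "systemctl stop haproxy"
sleep 15   # a few advert_int periods so keepalived notices and re-elects
for node_ip in ${NODE_IPS[@]}
  do
    ssh root@${node_ip} "ip addr show ${VIP_IF} | grep -w ${MASTER_VIP}" && echo "VIP is on ${node_ip}"
  done
ssh root@192.168.0.91 "systemctl start haproxy"   # with priority 120 it should reclaim the VIP
```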
Check which node holds the VIP and make sure the VIP answers ping:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh ${node_ip} "/usr/sbin/ip addr show ${VIP_IF}"
    ssh ${node_ip} "ping -c 1 ${MASTER_VIP}"
  done

View the haproxy status page

Open ${MASTER_VIP}:10080/status in a browser to view the haproxy status page.

06-1. Deploying the kube-apiserver component

This document walks through deploying a 3-node highly available master cluster with keepalived and haproxy; the LB VIP is the environment variable ${MASTER_VIP}.

Preparation

Download the latest binaries, install and configure flanneld: see 06-0.部署master節點.md.

Create the kubernetes certificate and private key

Create the certificate signing request:

source /opt/k8s/bin/environment.sh
cat > kubernetes-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "192.168.0.91",
    "192.168.0.92",
    "192.168.0.93",
    "${MASTER_VIP}",
    "${CLUSTER_KUBERNETES_SVC_IP}",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF

the hosts field lists the IPs and domain names authorized to use this certificate; here it contains the VIP, the apiserver node IPs, and the kubernetes service IP and domain names;
the last character of a domain name must not be . (e.g. kubernetes.default.svc.cluster.local. is not allowed), otherwise parsing fails with: x509: cannot parse dnsName "kubernetes.default.svc.cluster.local.";
if you use a domain other than cluster.local, such as opsnull.com, replace the last two names in the list with kubernetes.default.svc.opsnull and kubernetes.default.svc.opsnull.com;
the kubernetes service IP is created automatically by the apiserver; it is normally the first IP of the range given by --service-cluster-ip-range and can later be retrieved with:

$ kubectl get svc kubernetes
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.254.0.1   <none>        443/TCP   1d

Generate the certificate and private key:

cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
  -ca-key=/etc/kubernetes/cert/ca-key.pem \
  -config=/etc/kubernetes/cert/ca-config.json \
  -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
ls kubernetes*pem

Copy the generated certificate and private key to the master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert/ && sudo chown -R k8s /etc/kubernetes/cert/"
    scp kubernetes*.pem k8s@${node_ip}:/etc/kubernetes/cert/
  done

the k8s account must be able to read and write the /etc/kubernetes/cert/ directory;

Create the encryption configuration file

source /opt/k8s/bin/environment.sh
cat > encryption-config.yaml <<EOF
kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF

Copy the encryption configuration file to the /etc/kubernetes directory of the master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp encryption-config.yaml root@${node_ip}:/etc/kubernetes/
  done

The substituted file: encryption-config.yaml

Generate the service account key pair

cd /etc/kubernetes/
openssl genrsa -out /etc/kubernetes/sa.key 2048
openssl rsa -in /etc/kubernetes/sa.key -pubout -out /etc/kubernetes/sa.pub
ls /etc/kubernetes/sa.*
cd $HOME

Distribute the service account key pair to all nodes (here done with ansible against the k8s host group; the files end up in /etc/kubernetes/cert/):

ansible k8s -m copy -a 'src=/etc/kubernetes/sa.key dest=/etc/kubernetes/cert/ force=yes'
ansible k8s -m copy -a 'src=/etc/kubernetes/sa.pub dest=/etc/kubernetes/cert/ force=yes'
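If you are not using ansible, the same scp pattern used throughout this document works just as well for these two files. A minimal equivalent, assuming environment.sh has been sourced and /etc/kubernetes/cert already exists on every node:

```bash
# Distribute sa.key / sa.pub with scp instead of ansible
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp /etc/kubernetes/sa.key /etc/kubernetes/sa.pub k8s@${node_ip}:/etc/kubernetes/cert/
  done
```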
Create the kube-apiserver systemd unit template file

source /opt/k8s/bin/environment.sh
cat > kube-apiserver.service.template <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/opt/k8s/bin/kube-apiserver \\
  --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \\
  --anonymous-auth=false \\
  --experimental-encryption-provider-config=/etc/kubernetes/encryption-config.yaml \\
  --advertise-address=##NODE_IP## \\
  --bind-address=##NODE_IP## \\
  --insecure-port=0 \\
  --authorization-mode=Node,RBAC \\
  --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \\
  --runtime-config=api/all \\
  --enable-bootstrap-token-auth \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --service-node-port-range=${NODE_PORT_RANGE} \\
  --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\
  --client-ca-file=/etc/kubernetes/cert/ca.pem \\
  --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\
  --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\
  --service-account-key-file=/etc/kubernetes/cert/sa.pub \\
  --etcd-cafile=/etc/kubernetes/cert/ca.pem \\
  --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\
  --etcd-servers=${ETCD_ENDPOINTS} \\
  --enable-swagger-ui=true \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/kube-apiserver-audit.log \\
  --event-ttl=1h \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
Type=notify
User=k8s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

--experimental-encryption-provider-config: enables encryption of Secrets at rest;
--authorization-mode=Node,RBAC: enables the Node and RBAC authorization modes and rejects unauthorized requests;
--enable-admission-plugins: the admission plugins to enable (ServiceAccount among them);
--service-account-key-file: the public key used to verify ServiceAccount tokens; kube-controller-manager's --service-account-private-key-file points to the matching private key, and the two must be used as a pair;
--tls-*-file: the certificate, private key and CA file used by the apiserver; --client-ca-file is used to verify the certificates presented by clients (kube-controller-manager, kube-scheduler, kubelet, kube-proxy, and so on);
--kubelet-client-certificate, --kubelet-client-key: if specified, the apiserver accesses the kubelet APIs over https; RBAC rules must be defined for the user of that certificate (the kubernetes*.pem certificate above belongs to the user kubernetes), otherwise calls to the kubelet API are rejected as unauthorized;
--bind-address: must not be 127.0.0.1, otherwise the secure port 6443 cannot be reached from outside;
--insecure-port=0: disables the insecure port (8080);
--service-cluster-ip-range: the Service cluster IP range;
--service-node-port-range: the NodePort port range;
--runtime-config=api/all=true: enables all API versions, e.g. autoscaling/v2alpha1;
--enable-bootstrap-token-auth: enables token authentication for kubelet bootstrapping;
--apiserver-count=3: the number of kube-apiserver instances running in the cluster;
User=k8s: run as the k8s account;

Create and distribute a kube-apiserver systemd unit file for each node

Substitute the variables in the template to create one unit file per node:

source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
  do
    sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-apiserver.service.template > kube-apiserver-${NODE_IPS[i]}.service
  done
ls kube-apiserver*.service

NODE_NAMES and NODE_IPS are bash arrays of the same length, holding the node names and the corresponding IPs;

Distribute the generated systemd unit files:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
    scp kube-apiserver-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-apiserver.service
  done

the log directory must be created first;
the file is renamed to kube-apiserver.service;
the substituted unit file: kube-apiserver.service

Start the kube-apiserver service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} 
"systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver" done 檢查 kube-apiserver 運行狀態 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "systemctl status kube-apiserver |grep 'Active:'" done 報錯:若是找不到kube-controller-manager.kubeconfig文件會報以下錯誤: [root@test1 ~]# journalctl -u kube-controller-manager -- Logs begin at Mon 2019-02-04 17:56:47 EST, end at Tue 2019-02-05 01:04:33 EST. -- Feb 04 23:58:13 test1 systemd[1]: [/etc/systemd/system/kube-controller-manager.service:7] Failed to parse service restart specifier, Feb 04 23:58:13 test1 systemd[1]: [/etc/systemd/system/kube-controller-manager.service:7] Failed to parse service restart specifier, Feb 04 23:58:14 test1 kube-controller-manager[45817]: Flag --port has been deprecated, see --secure-port instead. Feb 04 23:58:14 test1 kube-controller-manager[45817]: Flag --horizontal-pod-autoscaler-use-rest-clients has been deprecated, Heapster Feb 04 23:58:14 test1 kube-controller-manager[45817]: I0204 23:58:14.297286 45817 flags.go:33] FLAG: --address="0.0.0.0" 確保狀態爲 active (running),不然到 master 節點查看日誌,確認緣由: journalctl -u kube-apiserver 打印 kube-apiserver 寫入 etcd 的數據 source /opt/k8s/bin/environment.sh ETCDCTL_API=3 etcdctl \ --endpoints=${ETCD_ENDPOINTS} \ --cacert=/etc/kubernetes/cert/ca.pem \ --cert=/etc/etcd/cert/etcd.pem \ --key=/etc/etcd/cert/etcd-key.pem \ get /registry/ --prefix --keys-only 檢查集羣信息 kubectl cluster-info Kubernetes master is running at https://192.168.0.235:8443 To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. kubectl get all --all-namespaces NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default service/kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 35m kubectl get componentstatuses NAME STATUS MESSAGE ERROR controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused etcd-1 Healthy {"health":"true"} etcd-0 Healthy {"health":"true"} etcd-2 Healthy {"health":"true"} 注意: 若是執行 kubectl 命令式時輸出以下錯誤信息,則說明使用的 ~/.kube/config 文件不對,請切換到正確的帳戶後再執行該命令: The connection to the server localhost:8080 was refused - did you specify the right host or port? 
When you run kubectl get componentstatuses, the apiserver queries 127.0.0.1 by default. When controller-manager and scheduler run in cluster mode they may not be on the same machine as this kube-apiserver, so their status can show as Unhealthy even though they are working normally.

Check the ports kube-apiserver listens on

sudo netstat -lnpt|grep kube
tcp        0      0 192.168.0.91:6443       0.0.0.0:*               LISTEN      13075/kube-apiserve

6443: the secure https port; all requests are authenticated and authorized;
since the insecure port is disabled, nothing listens on 8080;

Grant the kubernetes certificate access to the kubelet API

(In practice this step alone turned out not to be enough; a higher-privilege ClusterRole still has to be created, and that is done before starting kubelet.)

kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes

When kubectl exec, run, logs and similar commands are executed, the apiserver forwards the request to the kubelet. The RBAC rule above authorizes the apiserver to call the kubelet API.

06-2. Deploying a highly available kube-controller-manager cluster

This document describes how to deploy a highly available kube-controller-manager cluster.

The cluster has 3 nodes. After startup a leader is elected through competition and the other nodes block; when the leader becomes unavailable, the remaining nodes elect a new one, keeping the service available.

To secure communication, this document first generates an x509 certificate and private key, which kube-controller-manager uses in two situations:

when talking to the secure port of kube-apiserver;
when serving prometheus-format metrics on the secure port (https, 10252);

Preparation

Download the latest binaries, install and configure flanneld: see 06-0.部署master節點.md.

Create the kube-controller-manager certificate and private key

Create the certificate signing request:

cat > kube-controller-manager-csr.json <<EOF
{
    "CN": "system:kube-controller-manager",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "hosts": [
      "127.0.0.1",
      "192.168.0.91",
      "192.168.0.92",
      "192.168.0.93"
    ],
    "names": [
      {
        "C": "CN",
        "ST": "BeiJing",
        "L": "BeiJing",
        "O": "system:kube-controller-manager",
        "OU": "4Paradigm"
      }
    ]
}
EOF

the hosts list contains the IPs of all kube-controller-manager nodes;
CN and O are both system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs;

Generate the certificate and private key:

cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
  -ca-key=/etc/kubernetes/cert/ca-key.pem \
  -config=/etc/kubernetes/cert/ca-config.json \
  -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

Distribute the generated certificate and private key to all master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp kube-controller-manager*.pem k8s@${node_ip}:/etc/kubernetes/cert/
  done

Create and distribute the kubeconfig file

The kubeconfig file contains everything needed to access the apiserver: its address, the CA certificate and the client certificate to use.

source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/cert/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-controller-manager.kubeconfig

kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=kube-controller-manager.pem \
  --client-key=kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-controller-manager.kubeconfig

kubectl config set-context system:kube-controller-manager \
  --cluster=kubernetes \
  --user=system:kube-controller-manager \
  --kubeconfig=kube-controller-manager.kubeconfig

kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

Distribute the kubeconfig to all master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp kube-controller-manager.kubeconfig k8s@${node_ip}:/etc/kubernetes/
  done
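Optionally, you can smoke-test the generated kubeconfig against the VIP before writing the unit file. This is only a convenience check under the assumption that any authenticated identity may read the version endpoint; a successful server version means both the embedded certificate and the haproxy/keepalived path work:

```bash
# Should print a client and a server version if authentication via the VIP works
kubectl --kubeconfig=kube-controller-manager.kubeconfig version
```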
Create and distribute the kube-controller-manager systemd unit file

source /opt/k8s/bin/environment.sh
cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/k8s/bin/kube-controller-manager \\
  --port=0 \\
  --secure-port=10252 \\
  --bind-address=127.0.0.1 \\
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --allocate-node-cidrs=true \\
  --cluster-cidr=${CLUSTER_CIDR} \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \\
  --experimental-cluster-signing-duration=8760h \\
  --root-ca-file=/etc/kubernetes/cert/ca.pem \\
  --service-account-private-key-file=/etc/kubernetes/cert/sa.key \\
  --leader-elect=true \\
  --feature-gates=RotateKubeletServerCertificate=true \\
  --controllers=*,bootstrapsigner,tokencleaner \\
  --horizontal-pod-autoscaler-use-rest-clients=true \\
  --horizontal-pod-autoscaler-sync-period=10s \\
  --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \\
  --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \\
  --use-service-account-credentials=true \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
User=k8s

[Install]
WantedBy=multi-user.target
EOF

--port=0: disables the insecure http /metrics port; --address then has no effect and only --bind-address matters;
--secure-port=10252, --bind-address=127.0.0.1: serve https /metrics requests on 127.0.0.1:10252;
--kubeconfig: path to the kubeconfig that kube-controller-manager uses to connect to and authenticate against kube-apiserver;
--cluster-signing-*-file: used to sign the certificates created through TLS Bootstrap;
--experimental-cluster-signing-duration: validity period of the TLS Bootstrap certificates;
--root-ca-file: the CA certificate placed into each container's ServiceAccount, used to verify kube-apiserver's certificate;
--service-account-private-key-file: the private key used to sign ServiceAccount tokens; it must pair with the public key given to kube-apiserver via --service-account-key-file;
--service-cluster-ip-range: the Service cluster IP range; must match the same parameter of kube-apiserver;
--leader-elect=true: cluster mode with leader election; the elected leader does the work while the other instances block;
--feature-gates=RotateKubeletServerCertificate=true: enables automatic rotation of kubelet server certificates;
--controllers=*,bootstrapsigner,tokencleaner: the list of controllers to enable; tokencleaner automatically cleans up expired Bootstrap tokens;
--horizontal-pod-autoscaler-*: custom-metrics related parameters, supporting autoscaling/v2alpha1;
--tls-cert-file, --tls-private-key-file: the server certificate and key used when serving metrics over https;
--use-service-account-credentials=true: see the permissions note below;
User=k8s: run as the k8s account;

kube-controller-manager does not validate client certificates on https metrics requests, so --tls-ca-file is not needed; that parameter has been deprecated anyway.

Distribute the systemd unit file to all master nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp kube-controller-manager.service root@${node_ip}:/etc/systemd/system/
  done

Permissions of kube-controller-manager

The ClusterRole system:kube-controller-manager carries very few permissions (it can only create secrets, serviceaccounts and similar objects); the permissions of the individual controllers are spread across the ClusterRoles system:controller:XXX.

--use-service-account-credentials=true must be added to the kube-controller-manager startup parameters, so that the main controller creates a ServiceAccount XXX-controller for each controller. The built-in ClusterRoleBinding system:controller:XXX then grants each XXX-controller ServiceAccount the corresponding ClusterRole system:controller:XXX.

Start the kube-controller-manager service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager"
  done

the log directory must be created first;

Check the service status

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh k8s@${node_ip} "systemctl status kube-controller-manager|grep Active"
  done

Make sure the state is active (running); otherwise check the logs to find the cause:

journalctl -u kube-controller-manager

Check the exported metrics

Note: run the following commands on a kube-controller-manager node.

kube-controller-manager listens on port 10252 and accepts https requests:

sudo netstat -lnpt|grep kube-controll
tcp        0      0 
127.0.0.1:10252 0.0.0.0:* LISTEN 18377/kube-controll curl -s --cacert /etc/kubernetes/cert/ca.pem https://127.0.0.1:10252/metrics |head # HELP ClusterRoleAggregator_adds Total number of adds handled by workqueue: ClusterRoleAggregator # TYPE ClusterRoleAggregator_adds counter ClusterRoleAggregator_adds 3 # HELP ClusterRoleAggregator_depth Current depth of workqueue: ClusterRoleAggregator # TYPE ClusterRoleAggregator_depth gauge ClusterRoleAggregator_depth 0 # HELP ClusterRoleAggregator_queue_latency How long an item stays in workqueueClusterRoleAggregator before being requested. # TYPE ClusterRoleAggregator_queue_latency summary ClusterRoleAggregator_queue_latency{quantile="0.5"} 57018 ClusterRoleAggregator_queue_latency{quantile="0.9"} 57268 curl --cacert CA 證書用來驗證 kube-controller-manager https server 證書; 測試 kube-controller-manager 集羣的高可用 停掉一個或兩個節點的 kube-controller-manager 服務,觀察其它節點的日誌,看是否獲取了 leader 權限。 查看當前的 leader kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"test2_084534e2-6cc4-11e8-a418-5254001f5b65","leaseDurationSeconds":15,"acquireTime":"2018-06-10T15:40:33Z","renewTime":"2018-06-10T16:19:08Z","leaderTransitions":12}' creationTimestamp: 2018-06-10T13:59:42Z name: kube-controller-manager namespace: kube-system resourceVersion: "4540" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager uid: 862cc048-6cb6-11e8-96fa-525400ba84c6 可見,當前的 leader 爲 test2 節點。 06-3.部署高可用 kube-scheduler 集羣 本文檔介紹部署高可用 kube-scheduler 集羣的步驟。 該集羣包含 3 個節點,啓動後將經過競爭選舉機制產生一個 leader 節點,其它節點爲阻塞狀態。當 leader 節點不可用後,剩餘節點將再次進行選舉產生新的 leader 節點,從而保證服務的可用性。 爲保證通訊安全,本文檔先生成 x509 證書和私鑰,kube-scheduler 在以下兩種狀況下使用該證書: 與 kube-apiserver 的安全端口通訊; 在安全端口(https,10251) 輸出 prometheus 格式的 metrics; 準備工做 下載最新版本的二進制文件、安裝和配置 flanneld 參考:06-0.部署master節點.md 建立 kube-scheduler 證書和私鑰 建立證書籤名請求: cat > kube-scheduler-csr.json <<EOF { "CN": "system:kube-scheduler", "hosts": [ "127.0.0.1", "192.168.0.91", "192.168.0.92", "192.168.0.93" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:kube-scheduler", "OU": "4Paradigm" } ] } EOF hosts 列表包含全部 kube-scheduler 節點 IP; CN 爲 system:kube-scheduler、O 爲 system:kube-scheduler,kubernetes 內置的 ClusterRoleBindings system:kube-scheduler 將賦予 kube-scheduler 工做所需的權限。 生成證書和私鑰: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \ -ca-key=/etc/kubernetes/cert/ca-key.pem \ -config=/etc/kubernetes/cert/ca-config.json \ -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler 建立和分發 kubeconfig 文件 kubeconfig 文件包含訪問 apiserver 的全部信息,如 apiserver 地址、CA 證書和自身使用的證書; source /opt/k8s/bin/environment.sh kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/cert/ca.pem \ --embed-certs=true \ --server=${KUBE_APISERVER} \ --kubeconfig=kube-scheduler.kubeconfig kubectl config set-credentials system:kube-scheduler \ --client-certificate=kube-scheduler.pem \ --client-key=kube-scheduler-key.pem \ --embed-certs=true \ --kubeconfig=kube-scheduler.kubeconfig kubectl config set-context system:kube-scheduler \ --cluster=kubernetes \ --user=system:kube-scheduler \ --kubeconfig=kube-scheduler.kubeconfig kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig 上一步建立的證書、私鑰以及 kube-apiserver 地址被寫入到 kubeconfig 文件中; 分發 kubeconfig 到全部 master 節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp 
kube-scheduler.kubeconfig k8s@${node_ip}:/etc/kubernetes/ done 建立和分發 kube-scheduler systemd unit 文件 cat > kube-scheduler.service <<EOF [Unit] Description=Kubernetes Scheduler Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] ExecStart=/opt/k8s/bin/kube-scheduler \\ --address=127.0.0.1 \\ --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\ --leader-elect=true \\ --alsologtostderr=true \\ --logtostderr=false \\ --log-dir=/var/log/kubernetes \\ --v=2 Restart=on-failure RestartSec=5 User=k8s [Install] WantedBy=multi-user.target EOF --address:在 127.0.0.1:10251 端口接收 http /metrics 請求;kube-scheduler 目前還不支持接收 https 請求; --kubeconfig:指定 kubeconfig 文件路徑,kube-scheduler 使用它鏈接和驗證 kube-apiserver; --leader-elect=true:集羣運行模式,啓用選舉功能;被選爲 leader 的節點負責處理工做,其它節點爲阻塞狀態; User=k8s:使用 k8s 帳戶運行; 完整 unit 見 kube-scheduler.service。 分發 systemd unit 文件到全部 master 節點: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-scheduler.service root@${node_ip}:/etc/systemd/system/ done 啓動 kube-scheduler 服務 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes" ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler" done 必須先建立日誌目錄; 檢查服務運行狀態 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh k8s@${node_ip} "systemctl status kube-scheduler|grep Active" done 確保狀態爲 active (running),不然查看日誌,確認緣由: journalctl -u kube-scheduler 查看輸出的 metric 注意:如下命令在 kube-scheduler 節點上執行。 kube-scheduler 監聽 10251 端口,接收 http 請求: sudo netstat -lnpt|grep kube-sche tcp 0 0 127.0.0.1:10251 0.0.0.0:* LISTEN 23783/kube-schedule curl -s http://127.0.0.1:10251/metrics |head # HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend. # TYPE apiserver_audit_event_total counter apiserver_audit_event_total 0 # HELP go_gc_duration_seconds A summary of the GC invocation durations. 
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.7715e-05
go_gc_duration_seconds{quantile="0.25"} 0.000107676
go_gc_duration_seconds{quantile="0.5"} 0.00017868
go_gc_duration_seconds{quantile="0.75"} 0.000262444
go_gc_duration_seconds{quantile="1"} 0.001205223

Test the high availability of the kube-scheduler cluster

Pick one or two master nodes, stop their kube-scheduler service, and check in the systemd logs whether another node has taken over the leader role.

View the current leader

kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"test3_61f34593-6cc8-11e8-8af7-5254002f288e","leaseDurationSeconds":15,"acquireTime":"2018-06-10T16:09:56Z","renewTime":"2018-06-10T16:20:54Z","leaderTransitions":1}'
  creationTimestamp: 2018-06-10T16:07:33Z
  name: kube-scheduler
  namespace: kube-system
  resourceVersion: "4645"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
  uid: 62382d98-6cc8-11e8-96fa-525400ba84c6

As shown, the current leader is the test3 node.

07-0. Deploying the worker nodes

A kubernetes worker node runs the following components:

docker
kubelet
kube-proxy

Install and configure flanneld

See 05-部署flannel網絡.md.

Install dependencies

CentOS:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "yum install -y epel-release"
    ssh root@${node_ip} "yum install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp && /usr/sbin/modprobe ip_vs "
  done

Ubuntu:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "apt-get install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp && /usr/sbin/modprobe ip_vs "
  done

07-1. Deploying the docker component

docker is the container runtime and manages container lifecycles. kubelet talks to docker through the Container Runtime Interface (CRI).

Install dependencies

See 07-0.部署worker節點.md.

Download and distribute the docker binaries

Download the latest release from https://download.docker.com/linux/static/stable/x86_64/ :

wget https://download.docker.com/linux/static/stable/x86_64/docker-18.03.1-ce.tgz
tar -xvf docker-18.03.1-ce.tgz

Distribute the binaries to all worker nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker/docker*  k8s@${node_ip}:/opt/k8s/bin/
    ssh k8s@${node_ip} "chmod +x /opt/k8s/bin/*"
  done

Create and distribute the systemd unit file

cat > docker.service <<"EOF"
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/opt/k8s/bin/dockerd --log-level=error $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

the EOF marker is quoted so that bash does not substitute variables such as $DOCKER_NETWORK_OPTIONS inside the heredoc;
dockerd calls other docker binaries at runtime (e.g. docker-proxy), so the directory containing them must be on PATH;
flanneld writes the network configuration into /run/flannel/docker at startup; dockerd reads the DOCKER_NETWORK_OPTIONS environment variable from that file before starting and uses it to configure the docker0 bridge;
if several EnvironmentFile options are specified, /run/flannel/docker must come last, so that docker0 uses the bip parameter generated by flanneld;
docker must run as root;
starting with docker 1.13, the default policy of the iptables FORWARD chain may be set to DROP, which breaks pinging Pod IPs on other nodes. If that happens, set the policy back to ACCEPT manually:

sudo iptables -P FORWARD ACCEPT

and add the command to /etc/rc.local so that a reboot does not reset the FORWARD chain policy to DROP:

echo "/sbin/iptables -P FORWARD ACCEPT" >> /etc/rc.local

Distribute the systemd unit file to all worker nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker.service root@${node_ip}:/etc/systemd/system/docker.service
  done
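Before starting docker it can be worth confirming, on any node, that the file referenced by EnvironmentFile above exists and carries that node's flannel subnet. A quick look (exact flags and values depend on your flanneld options; the bip matches the /24 flanneld allocated to that node):

```bash
# The options dockerd will pick up via $DOCKER_NETWORK_OPTIONS
cat /run/flannel/docker
# expected to look roughly like:
# DOCKER_NETWORK_OPTIONS=" --bip=172.30.39.1/24 --ip-masq=true --mtu=1450"
```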
Configure and distribute the docker configuration file

Configure a docker registry mirror (dockerd must be restarted for it to take effect):

cat > docker-daemon.json <<EOF
{
    "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn"],
    "max-concurrent-downloads": 20
}
EOF

Distribute the docker configuration file to all worker nodes:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /etc/docker/"
    scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
  done

Start the docker service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl stop firewalld && systemctl disable firewalld"
    ssh root@${node_ip} "/usr/sbin/iptables -F && /usr/sbin/iptables -X && /usr/sbin/iptables -F -t nat && /usr/sbin/iptables -X -t nat"
    ssh root@${node_ip} "/usr/sbin/iptables -P FORWARD ACCEPT"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
    ssh root@${node_ip} 'for intf in /sys/devices/virtual/net/docker0/brif/*; do echo 1 > $intf/hairpin_mode; done'
    ssh root@${node_ip} "sudo sysctl -p /etc/sysctl.d/kubernetes.conf"
  done

firewalld (CentOS 7) / ufw (Ubuntu 16.04) is disabled, otherwise duplicate iptables rules may be created;
old iptables rules and chains are flushed;
hairpin mode is enabled on the virtual interfaces under the docker0 bridge;

Check the service status

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh k8s@${node_ip} "systemctl status docker|grep Active"
  done

Make sure the state is active (running); otherwise check the logs to find the cause:

journalctl -u docker

Check the docker0 bridge

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh k8s@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
  done

Confirm that on each worker node the docker0 bridge and the flannel.1 interface are in the same subnet (172.30.39.0 and 172.30.39.1 below):

3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether ce:2f:d6:53:e5:f3 brd ff:ff:ff:ff:ff:ff
    inet 172.30.39.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::cc2f:d6ff:fe53:e5f3/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:bf:65:16:5c brd ff:ff:ff:ff:ff:ff
    inet 172.30.39.1/24 brd 172.30.39.255 scope global docker0
       valid_lft forever preferred_lft forever

07-2. Deploying the kubelet component

kubelet runs on every worker node. It receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run and logs.

When kubelet starts it automatically registers the node with kube-apiserver; its built-in cadvisor collects and monitors the node's resource usage.

For security, this document only opens the secure https port, authenticates and authorizes every request, and rejects unauthorized access (e.g. from apiserver or heapster).

Download and distribute the kubelet binaries

See 06-0.部署master節點.md.

Install dependencies

See 07-0.部署worker節點.md.

Create the kubelet bootstrap kubeconfig files

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"

    # Create a token
    export BOOTSTRAP_TOKEN=$(kubeadm token create \
      --description kubelet-bootstrap-token \
      --groups system:bootstrappers:${node_name} \
      --kubeconfig ~/.kube/config)

    # Set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set client authentication parameters
    kubectl config set-credentials kubelet-bootstrap \
      --token=${BOOTSTRAP_TOKEN} \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set context parameters
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kubelet-bootstrap \
      --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

    # Set the default context
    kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
  done

What is written into each kubeconfig is a bootstrap token rather than a client certificate; the client certificate is issued later by kube-controller-manager.
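To see what one of these files actually contains, and to confirm it carries a token rather than an embedded client certificate, you can dump it with kubectl. --raw skips the usual redaction of sensitive fields, so treat the output as a secret:

```bash
# Inspect the generated bootstrap kubeconfig for test1 (the token is shown in clear text)
kubectl config view --raw --kubeconfig=kubelet-bootstrap-test1.kubeconfig
```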
爲各節點建立的 token: kubeadm token list --kubeconfig ~/.kube/config TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS k0s2bj.7nvw1zi1nalyz4gz 23h 2018-06-14T15:14:31+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test1 mkus5s.vilnjk3kutei600l 23h 2018-06-14T15:14:32+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test3 zkiem5.0m4xhw0jc8r466nk 23h 2018-06-14T15:14:32+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test2 建立的 token 有效期爲 1 天,超期後將不能再被使用,且會被 kube-controller-manager 的 tokencleaner 清理(若是啓用該 controller 的話); kube-apiserver 接收 kubelet 的 bootstrap token 後,將請求的 user 設置爲 system:bootstrap:,group 設置爲 system:bootstrappers; 各 token 關聯的 Secret: kubectl get secrets -n kube-system NAME TYPE DATA AGE bootstrap-token-k0s2bj bootstrap.kubernetes.io/token 7 1m bootstrap-token-mkus5s bootstrap.kubernetes.io/token 7 1m bootstrap-token-zkiem5 bootstrap.kubernetes.io/token 7 1m default-token-99st7 kubernetes.io/service-account-token 3 2d 分發 bootstrap kubeconfig 文件到全部 worker 節點 source /opt/k8s/bin/environment.sh for node_name in ${NODE_NAMES[@]} do echo ">>> ${node_name}" scp kubelet-bootstrap-${node_name}.kubeconfig k8s@${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig done 建立和分發 kubelet 參數配置文件 從 v1.10 開始,kubelet 部分參數需在配置文件中配置,kubelet --help 會提示: DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag 建立 kubelet 參數配置模板文件: source /opt/k8s/bin/environment.sh cat > kubelet.config.json.template <<EOF { "kind": "KubeletConfiguration", "apiVersion": "kubelet.config.k8s.io/v1beta1", "authentication": { "x509": { "clientCAFile": "/etc/kubernetes/cert/ca.pem" }, "webhook": { "enabled": true, "cacheTTL": "2m0s" }, "anonymous": { "enabled": false } }, "authorization": { "mode": "Webhook", "webhook": { "cacheAuthorizedTTL": "5m0s", "cacheUnauthorizedTTL": "30s" } }, "address": "##NODE_IP##", "port": 10250, "readOnlyPort": 0, "cgroupDriver": "cgroupfs", "hairpinMode": "promiscuous-bridge", "serializeImagePulls": false, "featureGates": { "RotateKubeletClientCertificate": true, "RotateKubeletServerCertificate": true }, "clusterDomain": "${CLUSTER_DNS_DOMAIN}", "clusterDNS": ["${CLUSTER_DNS_SVC_IP}"] } EOF address:API 監聽地址,不能爲 127.0.0.1,不然 kube-apiserver、heapster 等不能調用 kubelet 的 API; readOnlyPort=0:關閉只讀端口(默認 10255),等效爲未指定; authentication.anonymous.enabled:設置爲 false,不容許匿名?訪問 10250 端口; authentication.x509.clientCAFile:指定簽名客戶端證書的 CA 證書,開啓 HTTP 證書認證; authentication.webhook.enabled=true:開啓 HTTPs bearer token 認證; 對於未經過 x509 證書和 webhook 認證的請求(kube-apiserver 或其餘客戶端),將被拒絕,提示 Unauthorized; authroization.mode=Webhook:kubelet 使用 SubjectAccessReview API 查詢 kube-apiserver 某 user、group 是否具備操做資源的權限(RBAC); featureGates.RotateKubeletClientCertificate、featureGates.RotateKubeletServerCertificate:自動 rotate 證書,證書的有效期取決於 kube-controller-manager 的 --experimental-cluster-signing-duration 參數; 須要 root 帳戶運行; 爲各節點建立和分發 kubelet 配置文件: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" sed -e "s/##NODE_IP##/${node_ip}/" kubelet.config.json.template > kubelet.config-${node_ip}.json scp kubelet.config-${node_ip}.json root@${node_ip}:/etc/kubernetes/kubelet.config.json done 替換後的 kubelet.config.json 文件: kubelet.config.json 建立和分發 kubelet systemd unit 文件 建立 kubelet systemd unit 文件模板: cat > kubelet.service.template <<EOF [Unit] Description=Kubernetes Kubelet Documentation=https://github.com/GoogleCloudPlatform/kubernetes After=docker.service Requires=docker.service [Service] 
Create and distribute the kubelet systemd unit file

Create the kubelet systemd unit file template:

cat > kubelet.service.template <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/opt/k8s/bin/kubelet \\
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
  --cert-dir=/etc/kubernetes/cert \\
  --network-plugin=cni \\
  --cni-bin-dir=/opt/cni/bin \\
  --cni-conf-dir=/etc/cni/net.d \\
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
  --config=/etc/kubernetes/kubelet.config.json \\
  --hostname-override=##NODE_NAME## \\
  --pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest \\
  --allow-privileged=true \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes/ \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

If --hostname-override is set here, the same option must also be set for kube-proxy, otherwise kube-proxy will not find the Node;
--bootstrap-kubeconfig: points to the bootstrap kubeconfig file; kubelet uses the user name and token in this file to send a TLS bootstrapping request to kube-apiserver;
after K8S approves the kubelet's CSR, the certificate and private key are written to the --cert-dir directory and a kubeconfig is written to the path given by --kubeconfig;

The rendered unit file: kubelet.service

Create and distribute the kubelet systemd unit file for each node:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
    scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
  done

Bootstrap Token Auth and granting permissions

When kubelet starts, it checks whether the file configured by --kubeconfig exists; if it does not, kubelet uses --bootstrap-kubeconfig to send a certificate signing request (CSR) to kube-apiserver.
kube-apiserver authenticates the token in the request (the token created earlier with kubeadm); on success it sets the request's user to system:bootstrap:<token-id> and the group to system:bootstrappers. This process is called Bootstrap Token Auth.
By default this user and group do not have permission to create CSRs, so kubelet fails to start with an error like:

sudo journalctl -u kubelet -a |grep -A 2 'certificatesigningrequests'
May 06 06:42:36 test1 kubelet[26986]: F0506 06:42:36.314378   26986 server.go:233] failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:lemy40" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
May 06 06:42:36 test1 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
May 06 06:42:36 test1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
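You can reproduce this permission check without starting kubelet by impersonating the bootstrap identity from the admin kubeconfig (a minimal sketch; system:bootstrap:lemy40 is just the user name taken from the log above, substitute the token ID of one of your own tokens):

# expected to print "no" before the ClusterRoleBinding below is created, and "yes" afterwards
kubectl auth can-i create certificatesigningrequests.certificates.k8s.io \
  --as=system:bootstrap:lemy40 \
  --as-group=system:bootstrappers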
The fix is to create a ClusterRoleBinding that binds the group system:bootstrappers to the ClusterRole system:node-bootstrapper:

kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers

Grant kube-apiserver access to the kubelet API

Without this, kubectl exec into a pod will fail:

cat > apiserver-to-kubelet.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kubernetes-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kubernetes
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubernetes-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

Create the authorization objects:

kubectl create -f apiserver-to-kubelet.yaml
[root@test4 ~]# kubectl create -f apiserver-to-kubelet.yaml
clusterrole.rbac.authorization.k8s.io/system:kubernetes-to-kubelet created
clusterrolebinding.rbac.authorization.k8s.io/system:kubernetes created

Exec into a container again to check:

[root@test4 ~]# kubectl exec -it http-test-dm2-6dbd76c7dd-cv9qf sh
/ # exit

Now kubectl exec into the container works. (This output was pasted from an earlier experiment where the error had already been hit and fixed; at this point in the deployment there are no pods yet, so this step cannot be carried out now.)

Start the kubelet service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /var/lib/kubelet"
    ssh root@${node_ip} "/usr/sbin/swapoff -a"
    ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
  done

The swap partition must be turned off, otherwise kubelet fails to start;
the working and log directories must be created first;
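As with docker earlier, it is worth confirming that kubelet is actually running on every node before going further (a minimal sketch in the same style as the earlier checks):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh k8s@${node_ip} "systemctl status kubelet|grep Active"
  done

If the state is not active (running), inspect the logs on that node with journalctl -u kubelet, as shown next.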
Check the logs:

journalctl -u kubelet |tail
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.388242   22343 feature_gate.go:226] feature gates: &{{} map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.394342   22343 mount_linux.go:211] Detected OS with systemd
Jun 13 16:05:40 test2 kubelet[22343]: W0613 16:05:40.394494   22343 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399508   22343 server.go:376] Version: v1.10.4
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399583   22343 feature_gate.go:226] feature gates: &{{} map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399736   22343 plugins.go:89] No cloud provider specified.
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399752   22343 server.go:492] No cloud provider specified: "" from the config file: ""
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399777   22343 bootstrap.go:58] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.446068   22343 csr.go:105] csr for this node already exists, reusing
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.453761   22343 csr.go:113] csr for this node is still valid

After starting, kubelet uses --bootstrap-kubeconfig to send a CSR to kube-apiserver; once the CSR is approved, kube-controller-manager creates a TLS client certificate and private key for the kubelet and writes the file referenced by --kubeconfig.
Note: kube-controller-manager must be configured with the --cluster-signing-cert-file and --cluster-signing-key-file parameters, otherwise no certificate and private key are issued for TLS bootstrapping.

Check the CSRs

kubectl get csr
NAME                                                   AGE       REQUESTOR                 CONDITION
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   43s       system:bootstrap:zkiem5   Pending
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   27s       system:bootstrap:mkus5s   Pending
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   13m       system:bootstrap:k0s2bj   Pending

kubectl get nodes
No resources found.

The CSRs of all three worker nodes are in Pending state;

Approve the kubelet CSR requests

CSRs can be approved manually or automatically. The automatic way is recommended, because from v1.8 onward the certificates generated after CSR approval can also be rotated automatically.

Method 1: manually approve CSR requests

List the CSRs:

kubectl get csr
NAME                                                   AGE       REQUESTOR                 CONDITION
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   43s       system:bootstrap:zkiem5   Pending
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   27s       system:bootstrap:mkus5s   Pending
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   13m       system:bootstrap:k0s2bj   Pending

Approve a CSR:

kubectl certificate approve node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
certificatesigningrequest.certificates.k8s.io "node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk" approved

Check the result:

kubectl describe csr node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
Name:               node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Wed, 13 Jun 2018 16:05:04 +0800
Requesting User:    system:bootstrap:zkiem5
Status:             Approved
Subject:
  Common Name:    system:node:test2
  Serial Number:
  Organization:   system:nodes
Events:  <none>

Requesting User: the user that submitted the CSR; kube-apiserver authenticates and authorizes it;
Subject: the certificate information being requested;
the certificate CN is system:node:test2 and the Organization is system:nodes, so kube-apiserver's Node authorization mode grants this certificate the corresponding permissions;
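If you prefer the manual route but do not want to approve each CSR one by one, all currently pending requests can be approved in a single pass (a minimal sketch; double-check the REQUESTOR column first, since this approves everything that is Pending):

kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve

With the automatic method described next, this manual step is unnecessary.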
Method 2: automatically approve CSR requests

Create three ClusterRoleBindings, used to automatically approve client certificates and to renew client and server certificates:

cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-client-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-server-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: approve-node-server-renewal-csr
  apiGroup: rbac.authorization.k8s.io
EOF

auto-approve-csrs-for-group: automatically approves a node's first CSR; note that for the first CSR the requesting Group is system:bootstrappers;
node-client-cert-renewal: automatically approves renewal of a node's expiring client certificates; the automatically issued certificates have Group system:nodes;
node-server-cert-renewal: automatically approves renewal of a node's expiring server certificates; the automatically issued certificates have Group system:nodes;

Apply the configuration:

kubectl apply -f csr-crb.yaml

Check the kubelet status

After a while (1-10 minutes), the CSRs of all three nodes are automatically approved:

kubectl get csr
NAME                                                   AGE       REQUESTOR                 CONDITION
csr-98h25                                              6m        system:node:test2         Approved,Issued
csr-lb5c9                                              7m        system:node:test3         Approved,Issued
csr-m2hn4                                              14m       system:node:test1         Approved,Issued
node-csr-7q7i0q4MF_K2TSEJj16At4CJFLlJkHIqei6nMIAaJCU   28m       system:bootstrap:k0s2bj   Approved,Issued
node-csr-ND77wk2P8k2lHBtgBaObiyYw0uz1Um7g2pRvveMF-c4   35m       system:bootstrap:mkus5s   Approved,Issued
node-csr-Nysmrw55nnM48NKwEJuiuCGmZoxouK4N8jiEHBtLQso   6m        system:bootstrap:zkiem5   Approved,Issued
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   1h        system:bootstrap:zkiem5   Approved,Issued
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   1h        system:bootstrap:mkus5s   Approved,Issued
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   1h        system:bootstrap:k0s2bj   Approved,Issued

All nodes are Ready:

kubectl get nodes
NAME      STATUS    ROLES     AGE       VERSION
test1     Ready     <none>    18m       v1.10.4
test2     Ready     <none>    10m       v1.10.4
test3     Ready     <none>    11m       v1.10.4

kube-controller-manager has generated a kubeconfig file and a key pair for each node:

ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2293 Jun 13 17:07 /etc/kubernetes/kubelet.kubeconfig

ls -l /etc/kubernetes/cert/|grep kubelet
-rw-r--r-- 1 root root 1046 Jun 13 17:07 kubelet-client.crt
-rw------- 1 root root  227 Jun 13 17:07 kubelet-client.key
-rw------- 1 root root 1334 Jun 13 17:07 kubelet-server-2018-06-13-17-07-45.pem
lrwxrwxrwx 1 root root   58 Jun 13 17:07 kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2018-06-13-17-07-45.pem

The kubelet-server certificate is rotated periodically;

Check the ports kubelet listens on (note from testing: with v1.13.0 the 4194 port does not appear, and adding the flag to enable it prevented kubelet from starting)

The API endpoints kubelet provides

After starting, kubelet listens on several ports to serve requests from kube-apiserver and other components:

sudo netstat -lnpt|grep kubelet
tcp        0      0 192.168.0.92:4194      0.0.0.0:*      LISTEN      2490/kubelet
tcp        0      0 127.0.0.1:10248        0.0.0.0:*      LISTEN      2490/kubelet
tcp        0      0 192.168.0.92:10250     0.0.0.0:*      LISTEN      2490/kubelet

4194: cadvisor http service;
10248: healthz http service;
10250: https API service; note: the read-only port 10255 is not enabled;

For example, when you run kubectl exec -it nginx-ds-5rmws -- sh, kube-apiserver sends a request like the following to kubelet:

POST /exec/default/nginx-ds-5rmws/my-nginx?command=sh&input=1&output=1&tty=1

kubelet serves the following kinds of HTTPS requests on port 10250:
/pods, /runningpods
/metrics, /metrics/cadvisor, /metrics/probes
/spec
/stats, /stats/container
/logs
/run/, "/exec/", "/attach/", "/portForward/", "/containerLogs/" and other management endpoints;
For details see: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/server.go#L434:3

Because anonymous authentication is disabled and webhook authorization is enabled, every request to the HTTPS API on port 10250 must be authenticated and authorized.
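By contrast, the healthz endpoint on port 10248 is plain HTTP bound to 127.0.0.1, so it can be checked locally on a node without any credentials (a minimal sketch for a quick liveness check):

# run on the worker node itself; expected output: ok
curl http://127.0.0.1:10248/healthz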
The predefined ClusterRole system:kubelet-api-admin grants access to all kubelet APIs:

kubectl describe clusterrole system:kubelet-api-admin
Name:         system:kubelet-api-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources      Non-Resource URLs  Resource Names  Verbs
  ---------      -----------------  --------------  -----
  nodes          []                 []              [get list watch proxy]
  nodes/log      []                 []              [*]
  nodes/metrics  []                 []              [*]
  nodes/proxy    []                 []              [*]
  nodes/spec     []                 []              [*]
  nodes/stats    []                 []              [*]

kubelet API authentication and authorization

kubelet is configured with the following authentication parameters:
authentication.anonymous.enabled: set to false, so anonymous access to port 10250 is not allowed;
authentication.x509.clientCAFile: the CA certificate that signs client certificates, enabling HTTPS client certificate authentication;
authentication.webhook.enabled=true: enables HTTPS bearer token authentication;

and the following authorization parameter:
authorization.mode=Webhook: enables RBAC authorization;

When kubelet receives a request, it verifies the client certificate against clientCAFile, or checks whether the bearer token is valid. If neither succeeds, the request is rejected with Unauthorized:

curl -s --cacert /etc/kubernetes/cert/ca.pem https://192.168.0.92:10250/metrics
Unauthorized

curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer 123456" https://192.168.0.92:10250/metrics
Unauthorized

After authentication succeeds, kubelet sends a SubjectAccessReview request to kube-apiserver to check whether the user/group behind the certificate or token has permission to operate on the resource (RBAC);

Certificate authentication and authorization:

A certificate with insufficient permissions:

curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /etc/kubernetes/cert/kube-controller-manager.pem --key /etc/kubernetes/cert/kube-controller-manager-key.pem https://192.168.0.92:10250/metrics
Forbidden (user=system:kube-controller-manager, verb=get, resource=nodes, subresource=metrics)

The admin certificate created when deploying kubectl, which has the highest privileges:

curl -s --cacert /etc/kubernetes/cert/ca.pem --cert ./admin.pem --key ./admin-key.pem https://192.168.0.92:10250/metrics|head
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0

The values of --cacert, --cert and --key must be file paths; ./admin.pem above cannot be written without the leading ./, otherwise the request returns 401 Unauthorized;
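The user= shown in the Forbidden message comes from the client certificate's subject, which you can inspect directly (a minimal sketch using openssl, assuming the certificate paths from the curl examples above):

# CN maps to the user and O to the group that the RBAC check is performed against
openssl x509 -in /etc/kubernetes/cert/kube-controller-manager.pem -noout -subject
openssl x509 -in ./admin.pem -noout -subject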
Bearer token authentication and authorization:

Create a ServiceAccount and bind it to the ClusterRole system:kubelet-api-admin so that it has permission to call the kubelet API:

kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test --clusterrole=system:kubelet-api-admin --serviceaccount=default:kubelet-api-test
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}

curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://192.168.0.92:10250/metrics|head
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0

cadvisor and metrics

cadvisor collects the resource usage (CPU, memory, disk, network) of all containers on its node and exposes it both on its own web page (port 4194) and on port 10250 in Prometheus metrics format.

Visiting http://192.168.0.91:4194/containers/ in a browser shows the cadvisor monitoring page:
Visiting https://172.27.129.80:10250/metrics and https://172.27.129.80:10250/metrics/cadvisor returns the kubelet and cadvisor metrics respectively.

Note:
kubelet.config.json sets authentication.anonymous.enabled to false, so the HTTPS service on port 10250 cannot be accessed anonymously;
see A.瀏覽器訪問kube-apiserver安全端口.md for creating and importing the required certificates before accessing port 10250 as above;

Get the kubelet configuration

Fetch each node's configuration from kube-apiserver:

source /opt/k8s/bin/environment.sh

Using the admin certificate (the one with the highest privileges, created when deploying kubectl):

curl -sSL --cacert /etc/kubernetes/cert/ca.pem --cert ./admin.pem --key ./admin-key.pem ${KUBE_APISERVER}/api/v1/nodes/test1/proxy/configz | jq \
  '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"'
{
  "syncFrequency": "1m0s",
  "fileCheckFrequency": "20s",
  "httpCheckFrequency": "20s",
  "address": "172.27.129.80",
  "port": 10250,
  "readOnlyPort": 10255,
  "authentication": {
    "x509": {},
    "webhook": {
      "enabled": false,
      "cacheTTL": "2m0s"
    },
    "anonymous": {
      "enabled": true
    }
  },
  "authorization": {
    "mode": "AlwaysAllow",
    "webhook": {
      "cacheAuthorizedTTL": "5m0s",
      "cacheUnauthorizedTTL": "30s"
    }
  },
  "registryPullQPS": 5,
  "registryBurst": 10,
  "eventRecordQPS": 5,
  "eventBurst": 10,
  "enableDebuggingHandlers": true,
  "healthzPort": 10248,
  "healthzBindAddress": "127.0.0.1",
  "oomScoreAdj": -999,
  "clusterDomain": "cluster.local.",
  "clusterDNS": [
    "10.254.0.2"
  ],
  "streamingConnectionIdleTimeout": "4h0m0s",
  "nodeStatusUpdateFrequency": "10s",
  "imageMinimumGCAge": "2m0s",
  "imageGCHighThresholdPercent": 85,
  "imageGCLowThresholdPercent": 80,
  "volumeStatsAggPeriod": "1m0s",
  "cgroupsPerQOS": true,
  "cgroupDriver": "cgroupfs",
  "cpuManagerPolicy": "none",
  "cpuManagerReconcilePeriod": "10s",
  "runtimeRequestTimeout": "2m0s",
  "hairpinMode": "promiscuous-bridge",
  "maxPods": 110,
  "podPidsLimit": -1,
  "resolvConf": "/etc/resolv.conf",
  "cpuCFSQuota": true,
  "maxOpenFiles": 1000000,
  "contentType": "application/vnd.kubernetes.protobuf",
  "kubeAPIQPS": 5,
  "kubeAPIBurst": 10,
  "serializeImagePulls": false,
  "evictionHard": {
    "imagefs.available": "15%",
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%"
  },
  "evictionPressureTransitionPeriod": "5m0s",
  "enableControllerAttachDetach": true,
  "makeIPTablesUtilChains": true,
  "iptablesMasqueradeBit": 14,
  "iptablesDropBit": 15,
  "featureGates": {
    "RotateKubeletClientCertificate": true,
    "RotateKubeletServerCertificate": true
  },
  "failSwapOn": true,
  "containerLogMaxSize": "10Mi",
  "containerLogMaxFiles": 5,
  "enforceNodeAllocatable": [
    "pods"
  ],
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1"
}
07-3. Deploying the kube-proxy component

kube-proxy runs on all worker nodes. It watches the apiserver for changes to Services and Endpoints and creates routing rules to load-balance traffic to Services.
This document describes deploying kube-proxy; the configuration file below sets the proxy mode to iptables.

Download and distribute the kube-proxy binaries

See 06-0.部署master節點.md

Install dependency packages

Each node needs the ipvsadm and ipset commands installed and the ip_vs kernel module loaded. See 07-0.部署worker節點.md

Create the kube-proxy certificate

Create the certificate signing request:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF

CN: sets the certificate User to system:kube-proxy;
the predefined RoleBinding system:node-proxier binds User system:kube-proxy to Role system:node-proxier, which grants permission to call the kube-apiserver proxy-related APIs;
this certificate is only used by kube-proxy as a client certificate, so the hosts field is empty;

Generate the certificate and private key:

cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
  -ca-key=/etc/kubernetes/cert/ca-key.pem \
  -config=/etc/kubernetes/cert/ca-config.json \
  -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

Create and distribute the kubeconfig file

source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/cert/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config set-credentials kube-proxy \
  --client-certificate=kube-proxy.pem \
  --client-key=kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig

kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

--embed-certs=true: embeds the contents of ca.pem and kube-proxy.pem into the generated kube-proxy.kubeconfig file (without it, only the certificate file paths are written);

Distribute the kubeconfig file:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    scp kube-proxy.kubeconfig k8s@${node_name}:/etc/kubernetes/
  done

Create the kube-proxy configuration file

Starting with v1.10, some kube-proxy parameters can be set in a configuration file. You can generate this file with the --write-config-to option, or refer to the kubeproxyconfig type definitions: https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/apis/kubeproxyconfig/types.go

Create the kube-proxy config template:

cat >kube-proxy.config.yaml.template <<EOF
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: ##NODE_IP##
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: ${CLUSTER_CIDR}
healthzBindAddress: ##NODE_IP##:10256
hostnameOverride: ##NODE_NAME##
kind: KubeProxyConfiguration
metricsBindAddress: ##NODE_IP##:10249
mode: "iptables"
EOF

bindAddress: the listen address;
clientConnection.kubeconfig: the kubeconfig used to connect to the apiserver;
clusterCIDR: kube-proxy uses it to distinguish traffic inside and outside the cluster; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to Service IPs;
hostnameOverride: must match the value used by kubelet, otherwise kube-proxy cannot find the Node after starting and will not create any iptables rules;
mode: use iptables mode;

Create and distribute the kube-proxy configuration file for each node:

source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
  do
    echo ">>> ${NODE_NAMES[i]}"
    sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-proxy.config.yaml.template > kube-proxy-${NODE_NAMES[i]}.config.yaml
    scp kube-proxy-${NODE_NAMES[i]}.config.yaml root@${NODE_NAMES[i]}:/etc/kubernetes/kube-proxy.config.yaml
  done

The rendered file: kube-proxy.config.yaml

Create and distribute the kube-proxy systemd unit file

source /opt/k8s/bin/environment.sh
cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/opt/k8s/bin/kube-proxy \\
  --config=/etc/kubernetes/kube-proxy.config.yaml \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

The rendered unit file: kube-proxy.service
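Since ipvsadm, ipset and the ip_vs module were listed as dependencies above, it can be worth confirming the kernel side is in place on each node before starting kube-proxy (a minimal sketch; adjust the module name if your kernel packages it differently):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    # modprobe is idempotent; lsmod should then list the ip_vs modules
    ssh root@${node_ip} "modprobe ip_vs && lsmod | grep ip_vs"
  done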
Distribute the kube-proxy systemd unit file:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
  do
    echo ">>> ${node_name}"
    scp kube-proxy.service root@${node_name}:/etc/systemd/system/
  done

Start the kube-proxy service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /var/lib/kube-proxy"
    ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
  done

The working and log directories must be created first; reference: https://github.com/opsnull/follow-me-install-kubernetes-cluster
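As a final check, you can verify in the same style as the earlier services that kube-proxy came up and is listening on its metrics and healthz ports (a minimal sketch; 10249 and 10256 are the ports set in the configuration file above):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh k8s@${node_ip} "systemctl status kube-proxy|grep Active"
    ssh k8s@${node_ip} "sudo netstat -lnpt|grep kube-prox"
  done

If the state is not active (running), inspect the logs on that node with journalctl -u kube-proxy.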