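By default, a Kubernetes cluster installed with kubeadm runs a single etcd instance, which is a single point of failure. Ways to improve the availability of a Kubernetes cluster include: (1) backups (see the article "Exploring Kubernetes: etcd State Data and Its Backup"); (2) scaling out etcd nodes and instances; (3) running the apiserver on multiple nodes behind a load balancer. This article experiments mainly with scaling out etcd nodes and instances.
etcd is a standalone service. When used in Kubernetes, its configuration and data directory are mapped to host directories, and it runs on the host network (hostNetwork). /etc/kubernetes/manifests/etcd.yaml holds the startup parameters, /etc/kubernetes/pki/etcd holds the certificates used for HTTPS, and /var/lib/etcd holds this node's etcd data files.
A single-master Kubernetes cluster installed with kubeadm runs only one etcd instance. We want to extend it to multiple instances to reduce the risk of a single point of failure. The approach to scaling out etcd in Kubernetes is as follows: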
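Prepare the nodes that will run etcd. I use Ubuntu 18.04 LTS, then install Docker CE 18.06 and Kubernetes 1.12.3.
My three nodes are:
The container images used by k8s need to be pulled to every node in advance. See:
I first tried copying the master node's /etc/kubernetes/pki and /etc/kubernetes/manifests directories to all secondary (mate) nodes, but after startup various problems made the cluster inaccessible, with errors pointing to the CA certificate. In the end I decided to create my own certificates and deployment YAML files from scratch.
The certificates are created with cfssl. This requires downloading template files and editing the definition files, including the CA definition, the ca-config configuration, the ca-key private key, the CSR requests, and the configuration templates for the server, peer, and client certificates. The information in them must be adapted to your own environment.
The detailed procedure follows (for more information see http://www.javashuo.com/article/p-zzahcksu-hh.html).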
mkdir ~/cfssl && cd ~/cfssl
mkdir bin && cd bin
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -O cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -O cfssljson
chmod +x {cfssl,cfssljson}
export PATH=$PATH:~/cfssl/bin
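Since every later step depends on both binaries being on PATH, a quick check can save some confusion. The following is only a sketch; the function name missing_tools is made up for this example and is not part of cfssl.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every tool in the argument list
# that cannot be found on PATH. Prints nothing when all are available.
missing_tools() (
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
)

# Check the two binaries downloaded above.
missing=$(missing_tools cfssl cfssljson)
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
else
    echo "cfssl and cfssljson are available"
fi
```

If either name is reported, the export PATH line probably needs to be repeated in the current shell or added to the shell profile.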
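Create the directory for the certificate configuration files:
mkdir -p ~/cfssl/etcd-certs && cd ~/cfssl/etcd-certs
Generate the certificate configuration files and put them in ~/cfssl/etcd-certs. The templates are as follows: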
# ==============================================
# ca-config.json
{
    "signing": {
        "default": {
            "expiry": "43800h"
        },
        "profiles": {
            "server": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
# ==============================================
# ca-csr.json
{
    "CN": "My own CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "O": "My Company Name",
            "ST": "San Francisco",
            "OU": "Org Unit 1"
        }
    ]
}
# ==============================================
# server.json
{
    "CN": "etcd0",
    "hosts": [
        "127.0.0.1",
        "0.0.0.0",
        "10.1.1.201",
        "10.1.1.202",
        "10.1.1.203"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
# ==============================================
# peer1.json
# Fill in this machine's IP
{
    "CN": "etcd0",
    "hosts": [
        "10.1.1.201"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
# ==============================================
# client.json
{
    "CN": "client",
    "hosts": [
        ""
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
Run the following:
cd ~/cfssl/etcd-certs
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer peer1.json | cfssljson -bare peer1
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client
List the generated certificate files:
ls -l ~/cfssl/etcd-certs
The files include:
...
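Before copying the directory to other nodes, it can help to confirm that every certificate and key referenced later is actually there. The sketch below assumes the base names used in the cfssl commands of this article (ca, server, peer1, client); the function name missing_cert_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every expected certificate or key file that is
# missing from a directory. Prints nothing when the set is complete.
#   $1 = directory holding the cfssl output (e.g. ~/cfssl/etcd-certs)
missing_cert_files() (
    dir=$1
    for base in ca server peer1 client; do
        for file in "$base.pem" "$base-key.pem"; do
            [ -f "$dir/$file" ] || printf '%s\n' "$file"
        done
    done
)

missing_cert_files "$HOME/cfssl/etcd-certs"
```

An empty result means all eight files are present. On the other nodes the peer base name differs (peer2 and so on), so the list would need adjusting there.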
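Before starting an etcd instance, be sure to empty the /var/lib/etcd directory. Otherwise some of the configured parameters will not take effect and the previous state will be kept.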
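Emptying the data directory destroys whatever state it holds, so moving it aside is a safer way to get an empty directory. This is a minimal sketch, assuming etcd is not running against the directory at the time; the function name reset_etcd_data_dir and the .bak.TIMESTAMP suffix are choices made for this example.

```shell
#!/bin/sh
# Hypothetical helper: move an existing, non-empty etcd data directory aside
# and recreate it empty. Prints the backup path, or nothing if there was no
# data to move.
#   $1 = data directory (this article uses /var/lib/etcd)
reset_etcd_data_dir() (
    dir=$1
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]; then
        backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
        mv "$dir" "$backup" || exit 1
        echo "$backup"
    fi
    mkdir -p "$dir"
)

# Usage (as root, on the node being prepared):
#   reset_etcd_data_dir /var/lib/etcd
```

The old data stays available under the printed path until it is deleted by hand.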
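Note that the following etcd parameters only take effect on the first start (initialization), including:
Copy the cfssl/etcd-certs directory to /etc/kubernetes/pki/etcd-certs; you can upload it with scp or sftp.
Edit /etc/kubernetes/manifests/etcd.yaml, the configuration file kubelet uses to start the etcd instance.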
# /etc/kubernetes/manifests/etcd.yaml
# Note: the certificate paths in the command are container paths, so they
# use the mountPath of the etcd-certs volume (/etc/kubernetes/pki/etcd).
# Note: --name must match the member name listed in --initial-cluster.
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.1.1.201:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.pem
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.1.1.201:2380
    - --initial-cluster=etcd01=https://10.1.1.201:2380
    - --key-file=/etc/kubernetes/pki/etcd/server-key.pem
    - --listen-client-urls=https://10.1.1.201:2379
    - --listen-peer-urls=https://10.1.1.201:2380
    - --name=etcd01
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer1.pem
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer1-key.pem
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem
    image: k8s.gcr.io/etcd-amd64:3.2.18
    imagePullPolicy: IfNotPresent
    #livenessProbe:
    #  exec:
    #    command:
    #    - /bin/sh
    #    - -ec
    #    - ETCDCTL_API=3 etcdctl --endpoints=https://[10.1.1.201]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.pem
    #      --cert=/etc/kubernetes/pki/etcd/client.pem --key=/etc/kubernetes/pki/etcd/client-key.pem
    #      get foo
    #  failureThreshold: 8
    #  initialDelaySeconds: 15
    #  timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
  - hostPath:
      path: /etc/kubernetes/pki/etcd-certs
      type: DirectoryOrCreate
    name: etcd-certs
status: {}
Following the pattern above, modify the etcd startup parameters in /etc/kubernetes/manifests/etcd.yaml on each secondary node.
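The value that changes most between nodes is --initial-cluster, which grows by one entry per member. As an illustration only, the helper below assembles that value from name=IP pairs. The function name build_initial_cluster is hypothetical, and HTTPS on peer port 2380 is assumed because that is what the manifests in this article use.

```shell
#!/bin/sh
# Hypothetical helper: build the value of etcd's --initial-cluster flag.
# Each argument is a "name=IP" pair, for example "etcd01=10.1.1.201".
build_initial_cluster() (
    result=""
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        entry="$name=https://$ip:2380"
        if [ -z "$result" ]; then
            result=$entry
        else
            result="$result,$entry"
        fi
    done
    printf '%s\n' "$result"
)

build_initial_cluster etcd01=10.1.1.201 etcd02=10.1.1.202
# prints: etcd01=https://10.1.1.201:2380,etcd02=https://10.1.1.202:2380
```

The member names here are examples; whatever names are chosen, the local --name of each node has to appear in the list that node is started with.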
Enter the etcd container and run:
alias etcdv3="ETCDCTL_API=3 etcdctl --endpoints=https://[10.1.1.201]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/client.pem --key=/etc/kubernetes/pki/etcd/client-key.pem"
etcdv3 member add etcd02 --peer-urls="https://10.1.1.202:2380"
Copy the certificates from the first node (10.1.1.201) to the second node (10.1.1.202). On the second node, copy peer1.json to peer2.json and edit peer2.json:
# peer2.json
{
    "CN": "etcd1",
    "hosts": [
        "10.1.1.202"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
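Only the CN and the host IP differ between the peer definitions, so copying and hand-editing can be replaced by a small generator. This is a sketch rather than part of cfssl: the function name make_peer_csr is invented here, and the key and names fields simply repeat the template values used in this article.

```shell
#!/bin/sh
# Hypothetical helper: write a peer CSR definition for one etcd node to stdout.
#   $1 = common name for the node (e.g. etcd1)
#   $2 = IP address the node uses for peer traffic
make_peer_csr() (
    name=$1
    ip=$2
    cat <<EOF
{
    "CN": "$name",
    "hosts": [
        "$ip"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
EOF
)

# Print the definition for the second node.
make_peer_csr etcd1 10.1.1.202
```

Redirecting the output, for example make_peer_csr etcd1 10.1.1.202 > peer2.json, produces the file that the cfssl gencert step expects.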
Then generate the peer2 certificate on the second node:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer peer2.json | cfssljson -bare peer2
Start etcd on the second node (10.1.1.202) with the following manifest:
# etcd02 etcd.yaml
# Note: the certificate paths in the command are container paths, so they
# use the mountPath of the etcd-certs volume (/etc/kubernetes/pki/etcd).
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.1.1.202:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.pem
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.1.1.202:2380
    - --initial-cluster=etcd01=https://10.1.1.201:2380,etcd02=https://10.1.1.202:2380
    - --key-file=/etc/kubernetes/pki/etcd/server-key.pem
    - --listen-client-urls=https://10.1.1.202:2379
    - --listen-peer-urls=https://10.1.1.202:2380
    - --name=etcd02
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer2.pem
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer2-key.pem
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem
    # Do not wrap the value below in double quotes; that mistake cost me a lot of time.
    - --initial-cluster-state=existing
    image: k8s.gcr.io/etcd-amd64:3.2.18
    imagePullPolicy: IfNotPresent
    # livenessProbe:
    #   exec:
    #     command:
    #     - /bin/sh
    #     - -ec
    #     - ETCDCTL_API=3 etcdctl --endpoints=https://[10.1.1.202]:2379 --cacert=/etc/kubernetes/pki/etcd-certs/ca.crt
    #       --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd-certs/healthcheck-client.key
    #       get foo
    #   failureThreshold: 8
    #   initialDelaySeconds: 15
    #   timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
  - hostPath:
      path: /etc/kubernetes/pki/etcd-certs
      type: DirectoryOrCreate
    name: etcd-certs
status: {}
Enter the etcd container and run:
alias etcdv3="ETCDCTL_API=3 etcdctl --endpoints=https://[10.1.1.201]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/client.pem --key=/etc/kubernetes/pki/etcd/client-key.pem"
etcdv3 member add etcd03 --peer-urls="https://10.1.1.203:2380"
Repeat the steps above to add etcd03.
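Because the member add command is long and easy to mistype, it can be assembled and reviewed before it is run. The sketch below only prints the command. It reuses the etcdctl flags from the alias in this article and assumes the certificates are visible at /etc/kubernetes/pki/etcd inside the container; the function name member_add_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (without running) the etcdctl command that
# registers a new member with an existing cluster.
#   $1 = IP of an existing member to talk to
#   $2 = name of the new member
#   $3 = IP of the new member
member_add_cmd() (
    endpoint=$1
    name=$2
    ip=$3
    certs=/etc/kubernetes/pki/etcd   # assumed certificate path in the container
    printf 'ETCDCTL_API=3 etcdctl --endpoints=https://[%s]:2379' "$endpoint"
    printf ' --cacert=%s/ca.pem --cert=%s/client.pem --key=%s/client-key.pem' \
        "$certs" "$certs" "$certs"
    printf ' member add %s --peer-urls=https://%s:2380\n' "$name" "$ip"
)

member_add_cmd 10.1.1.201 etcd03 10.1.1.203
```

The printed line can then be pasted into the etcd container once it looks right.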
Check the health of the etcd cluster:
# etcdctl --endpoints=https://[10.1.1.201]:2379 --ca-file=/etc/kubernetes/pki/etcd-certs/ca.pem --cert-file=/etc/kubernetes/pki/etcd-certs/client.pem --key-file=/etc/kubernetes/pki/etcd-certs/client-key.pem cluster-health
member 5856099674401300 is healthy: got healthy result from https://10.1.86.201:2379
member df99f445ac908d15 is healthy: got healthy result from https://10.1.86.202:2379
cluster is healthy
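When this check is repeated after each member is added, it may be convenient to reduce the output to a member count. The following sketch parses text in the format shown above from standard input; the function names are made up for this example, and the sample text is the output quoted in this article.

```shell
#!/bin/sh
# Hypothetical helpers that read `etcdctl cluster-health` output on stdin.

# Print the number of members reported as healthy.
count_healthy_members() {
    grep -c '^member .* is healthy'
}

# Succeed only if the overall verdict line is present.
cluster_is_healthy() {
    grep -qx 'cluster is healthy'
}

sample='member 5856099674401300 is healthy: got healthy result from https://10.1.86.201:2379
member df99f445ac908d15 is healthy: got healthy result from https://10.1.86.202:2379
cluster is healthy'

printf '%s\n' "$sample" | count_healthy_members
# prints: 2
```

Comparing the count with the number of members added so far gives a quick signal that the newest member has actually joined.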
Point the kube-apiserver at the new etcd certificates with these flags:
- --etcd-cafile=/etc/kubernetes/pki/etcd-certs/ca.pem
- --etcd-certfile=/etc/kubernetes/pki/etcd-certs/client.pem
- --etcd-keyfile=/etc/kubernetes/pki/etcd-certs/client-key.pem
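At this point etcd has been extended into a multi-node distributed cluster, and Kubernetes on every node can access it.
Note:
The worker nodes deployed above can still connect to only one apiserver. The apiservers on the other secondary nodes are available but cannot be reached by the worker nodes.
The next step is to make the multiple master nodes fault tolerant, so that access can fail over to another secondary node when the main node fails.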