Background: It is well known that when building a Kubernetes cluster, the container network normally uses a dedicated private subnet for the cluster's internal pod and service networks. In practice, however, no company migrates all of its internal services into Kubernetes in one go (for reasons of business architecture and overall reliability), so there will always be scenarios where services inside the cluster and services outside it need to call each other. For HTTP services we can use proxies such as LVS, Nginx, or HAProxy to forward traffic in and out of the cluster, but TCP services are more awkward: with the Dubbo framework, for example, providers and consumers connect directly, which becomes a problem when they do not sit in networks that can reach each other. This is why the first thing large companies have to solve when running Kubernetes at scale is the network. Back at 數科, for instance, we used Contiv + BGP to interconnect the container network with the network outside the containers, which usually requires a dedicated SDN team to build and maintain. As a startup we run our business on a public cloud instead; the advantage of this asset-light model is that a professional team maintains the underlying infrastructure, so given our business needs we adopted Alibaba Cloud's Terway network plugin for our internal Kubernetes cluster network.
The above are the network solutions most commonly used with open-source Kubernetes clusters today. Our requirements also call for connecting the networks inside and outside the containers, so for cost, efficiency, and stability we chose Alibaba Cloud's Terway network solution for our Kubernetes cluster.
Note:
Alibaba Cloud Container Service for Kubernetes (ACK) also supports two network plugins by default, Flannel and Terway. The former is essentially the same as the open-source plugin. The latter supports a VPC mode and an ENI mode: in VPC mode the containers use addresses from a vSwitch subnet inside the VPC, but by default they cannot communicate with ECS hosts under other vSwitches; in ENI mode each pod is assigned an elastic network interface (ENI) to interconnect with networks outside the cluster, but Terway's ENI mode is only supported on certain instance types.
Because of the instance-type requirements of Terway's ENI mode under ACK, we bought an ECS instance and built a single-node cluster ourselves to test container connectivity on the Terway network.
Prerequisites:
Note:
The OS images officially validated for the Terway network plugin are CentOS 7.4/7.6, so keep this in mind when purchasing the ECS instance.
1. Install a single-node k8s cluster with kubeadm
Note:
Because the Terway network will bridge the pod network and the ECS network, the kernel parameter rp_filter must be set to 0 everywhere (i.e. no reverse-path validation of packet source addresses).
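A minimal sketch of one way to do this (assuming the standard sysctl interface; persist the settings in /etc/sysctl.conf or /etc/sysctl.d/ so they survive reboots):
# Turn off reverse-path filtering on all interfaces and verify
$ for f in /proc/sys/net/ipv4/conf/*/rp_filter; do echo 0 > $f; done
$ sysctl -a 2>/dev/null | grep '\.rp_filter'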
# Update the yum repos and install the k8s components
$ yum update
$ cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
$ yum clean all
$ yum install kubelet kubeadm kubectl --disableexcludes=kubernetes -y
$ yum install docker -y
# Start kubelet
## kubelet will keep restarting at this point because it cannot reach the apiserver yet
$ systemctl restart kubelet
# Start docker
## Note: the cgroup driver used by kubelet must match the one used by docker
$ systemctl restart docker
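## Optional sanity check (a sketch): confirm docker's cgroup driver and keep kubelet consistent with it.
## If they differ, adjust /etc/docker/daemon.json (e.g. "exec-opts": ["native.cgroupdriver=systemd"])
## or the kubelet configuration, then restart both services.
$ docker info 2>/dev/null | grep -i 'cgroup driver'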
# List the images kubeadm needs to bootstrap the cluster
# Note: kubeadm pulls from Google's registry by default; to use the Aliyun mirror instead,
# simply replace k8s.gcr.io with registry.cn-hangzhou.aliyuncs.com/google_containers
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.16.2
k8s.gcr.io/kube-controller-manager:v1.16.2
k8s.gcr.io/kube-scheduler:v1.16.2
k8s.gcr.io/kube-proxy:v1.16.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.15-0
k8s.gcr.io/coredns:1.6.2
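## Optional (a sketch): pre-pull the images from the Aliyun mirror and re-tag them back to k8s.gcr.io
## so that kubeadm init never has to reach the Google registry; keep the tags in sync with the list above
$ for img in kube-apiserver:v1.16.2 kube-controller-manager:v1.16.2 kube-scheduler:v1.16.2 \
    kube-proxy:v1.16.2 pause:3.1 etcd:3.3.15-0 coredns:1.6.2; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$img
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$img k8s.gcr.io/$img
  done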
# Initialize the cluster
## Note: pass the VPC subnet as the pod CIDR at init time, otherwise the VPC subnet may not be recognized later
$ kubeadm init --pod-network-cidr=172.16.48.0/20
....
....
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.62.70:6443 --token j4b3xp.78izi2bmitxxx \
--discovery-token-ca-cert-hash sha256:fd1ff50cbabd4fb22cb9a866052fbdc0db7da662168cda702exxxxxxxx
# Next, create the kubeconfig as prompted above
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the k8s node (it is NotReady at this point because no CNI network plugin has been installed yet)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
izbp18diszrt8m41b2fbpsz NotReady master 7m19s v1.16.2
2. Create the Terway network for the k8s cluster
Note:
The cluster created with kubeadm is v1.16, so the DaemonSet-related parts of the officially provided yaml need slight modifications.
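The typical change (an assumption based on the API removals in k8s 1.16 — verify against the manifest you actually use) is that the DaemonSet must declare apps/v1 instead of extensions/v1beta1, and spec.selector becomes mandatory, roughly:
apiVersion: apps/v1          # was extensions/v1beta1 in older manifests
kind: DaemonSet
metadata:
  name: terway
  namespace: kube-system
spec:
  selector:                  # required by apps/v1; the labels here are illustrative
    matchLabels:
      app: terway
  template:
    metadata:
      labels:
        app: terway
    ...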
# Install the CNI network plugin for the cluster, i.e. the terway plugin discussed above
# The Alibaba Cloud specific settings (ak, as, subnet, security_group) need to be filled in
$ curl -O https://raw.githubusercontent.com/BGBiao/k8s-ansible-playbooks/master/manifest/cni/terway/podnetwork.yaml
# Edit the settings in podnetwork.yaml (fill in the Alibaba Cloud AK/AS credentials, the VPC subnet, and the security group)
$ cat podnetwork.yaml
...
...
eni_conf: |
{
"version": "1",
"access_key": "your ak",
"access_secret": "your as",
"service_cidr": "your vpc subnet",
"security_group": "your 安全組id",
"max_pool_size": 5,
"min_pool_size": 0
}
....
....
- name: Network
value: "your vpc subnet"
....
# Create the terway network
$ kubectl apply -f podnetwork.yaml
serviceaccount/terway created
clusterrole.rbac.authorization.k8s.io/terway-pod-reader created
clusterrolebinding.rbac.authorization.k8s.io/terway-binding created
configmap/eni-config created
daemonset.apps/terway created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
# Check the CNI pods and the node status
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
izbp18diszrt8m41b2fbpsz Ready master 28m v1.16.2
$ kubectl get pods -A | grep terway
kube-system terway-b9vm8 2/2 Running 0 6m53s
At this point we have a single-node Kubernetes cluster running the Terway network. Next we can let the pods in the cluster use the VPC network, so that the container network inside the cluster sits on the same level as the networks of the other ECS hosts.
3. Test the Terway network
Note:
The single-node cluster was built with kubeadm, which taints the master node by default, so the taint has to be removed before testing.
# Remove the taint
$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/izbp18diszrt8m41b2fbpsz untainted
# Create a deployment in the default VPC mode
$ kubectl apply -f https://raw.githubusercontent.com/BGBiao/k8s-ansible-playbooks/master/manifest/cni/terway/nginx.yaml
namespace/myapp configured
deployment.apps/nginx-test created
# The pod IPs are indeed addresses from the specified VPC subnet
$ kubectl get pods -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-test-d56c87dd9-26mzs 1/1 Running 0 2m40s 172.16.48.5 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-test-d56c87dd9-hp2rv 1/1 Running 0 2m40s 172.16.48.4 izbp18diszrt8m41b2fbpsz <none> <none>
$ curl 172.16.48.4 -I
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:21:28 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
$ curl 172.16.48.5 -I
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:21:31 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
As you can see, the Terway network now works without any problems inside the cluster, but the pod network is still unreachable from other ECS hosts. That is because the default is Terway's VPC mode, which effectively behaves like calico. This is where ENI mode comes in: elastic network interfaces (ENIs) are attached to the k8s node and pod traffic leaves through them, which cleanly connects the pods to the rest of the VPC.
4. Test ENI mode
Note:
Simply add limits: aliyun/eni: N to the nginx manifest above. Note that N is the number of ENIs available on the node, which is capped by the ENI limit of the ECS instance type.
# Note:
# This experiment uses a 4c8g single-node cluster, so only 2 ENIs can be attached; without any extra network configuration this node can therefore run at most 2 pods that are reachable from the other ECS hosts across the VPC
$ cat nginx.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-v2
namespace: myapp
spec:
revisionHistoryLimit: 10
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
replicas: 2
selector:
matchLabels:
app: nginx-v2
profile: prod
template:
metadata:
labels:
app: nginx-v2
profile: prod
spec:
containers:
- name: nginx-v2
image: nginx:latest
imagePullPolicy: IfNotPresent
resources:
requests:
cpu: 200m
memory: 215Mi
limits:
cpu: 200m
memory: 215Mi
aliyun/eni: 1
# Create the pods with an ENI
$ kubectl apply -f nginx.yaml
deployment.apps/nginx-v2 configured
# Check the pod status
$ kubectl get pods -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-test-d56c87dd9-26mzs 1/1 Running 0 19m 172.16.48.5 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-test-d56c87dd9-hp2rv 1/1 Running 0 19m 172.16.48.4 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-v2-7548466fc8-d4klv 1/1 Running 0 61s 172.16.62.74 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-v2-7548466fc8-x7ft9 1/1 Running 0 61s 172.16.62.75 izbp18diszrt8m41b2fbpsz <none> <none>
# Access from the k8s node
$ curl 172.16.62.75 -I
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:38:20 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
$ curl 172.16.62.74 -I
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:38:23 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
# Both the ENI-backed pods and the earlier non-ENI pods are now fully reachable from inside the k8s cluster
5. Test connectivity between the cluster and the outside network
Note:
The k8s cluster uses the VPC network, so access from the cluster to external ECS hosts works by default; what we mainly test here is whether external ECS hosts can reach the pod network directly.
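For completeness, a quick spot-check of the cluster-to-external direction (a sketch; <ecs-internal-ip> is a placeholder for a real ECS address in your VPC):
$ kubectl run -n myapp net-test --image=busybox --rm -it --restart=Never -- ping -c 1 <ecs-internal-ip>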
# Access from another ECS host in the same VPC
# First ping each of the four pod IPs above (only the ENI-backed pods respond by default)
$ for i in 172.16.48.5 172.16.48.4 172.16.62.74 172.16.62.75 ;do ping -c 1 -w 1 $i;done
PING 172.16.48.5 (172.16.48.5) 56(84) bytes of data.
--- 172.16.48.5 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms
PING 172.16.48.4 (172.16.48.4) 56(84) bytes of data.
--- 172.16.48.4 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
PING 172.16.62.74 (172.16.62.74) 56(84) bytes of data.
64 bytes from 172.16.62.74: icmp_seq=1 ttl=64 time=0.782 ms
--- 172.16.62.74 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.782/0.782/0.782/0.000 ms
PING 172.16.62.75 (172.16.62.75) 56(84) bytes of data.
64 bytes from 172.16.62.75: icmp_seq=1 ttl=64 time=0.719 ms
--- 172.16.62.75 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.719/0.719/0.719/0.000 ms
# Test the nginx service (again, only the ENI-backed pods are reachable)
$ for i in 172.16.48.5 172.16.48.4 172.16.62.74 172.16.62.75 ;do curl --connect-timeout 1 -I $i;done
curl: (28) Connection timed out after 1001 milliseconds
curl: (28) Connection timed out after 1001 milliseconds
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:44:21 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
HTTP/1.1 200 OK
Server: nginx/1.17.5
Date: Sat, 26 Oct 2019 08:44:21 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 22 Oct 2019 14:30:00 GMT
Connection: keep-alive
ETag: "5daf1268-264"
Accept-Ranges: bytes
If we now look at the NIC information on the node, we can see that two secondary ENIs have been added.
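A quick way to confirm this from the node itself (a sketch; interface names and counts depend on the instance type and on how many ENI-backed pods are running):
$ ip -o link show | grep -E '^[0-9]+: eth'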
6. Other issues
Note:
As mentioned earlier, in ENI mode the number of ENIs an ECS instance can attach depends on its instance type, which means the number of pods that can interconnect with the outside is limited. Let's verify that here.
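Terway exposes the ENIs as the extended resource aliyun/eni on the node, so a quick way to see the cap is (a sketch, using the node name from this cluster):
$ kubectl describe node izbp18diszrt8m41b2fbpsz | grep aliyun/eni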
# If we now create more ENI-backed pods, they fail to schedule (because a 4c8g ECS supports at most two ENIs)
$ kubectl apply -f nginx-v3.yaml
deployment.apps/nginx-v3 created
[root@iZbp18diszrt8m41b2fbpsZ ~]# kubectl get pods -n myapp -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-test-d56c87dd9-26mzs 1/1 Running 0 48m 172.16.48.5 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-test-d56c87dd9-hp2rv 1/1 Running 0 48m 172.16.48.4 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-v2-7548466fc8-d4klv 1/1 Running 0 29m 172.16.62.74 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-v2-7548466fc8-x7ft9 1/1 Running 0 29m 172.16.62.75 izbp18diszrt8m41b2fbpsz <none> <none>
nginx-v3-79dd8fb956-4ghgb 0/1 Pending 0 2s <none> <none> <none> <none>
nginx-v3-79dd8fb956-str2k 0/1 Pending 0 2s <none> <none> <none> <none>
# Describe one of the Pending pods
$ kubectl describe pods -n myapp nginx-v3-79dd8fb956-4ghgb
....
....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient aliyun/eni.
As you can see, when using Terway's ENI mode, once the ECS instance reaches its ENI limit, k8s scheduling fails.
So here is the problem: normally we use k8s precisely for elastic scaling and want to run as many pods per node as possible, yet with the Terway network the number of pods that can talk to ECS hosts outside the cluster turns out to be capped by the ENI limit. How can that be acceptable?
There is no need to worry, though. The answer from the Alibaba Cloud folks is that in this case you add static routes in the VPC, and then the many pods on a node can all reach ECS hosts outside the cluster; the ENI on the ECS host simply acts as the network egress for all of its containers. That settles it: with Terway, even without ENI mode, pod addresses are globally unique within the VPC, so adding a few static routes is enough to interconnect the k8s container network with the ECS host network across the whole VPC, which nicely solves the problem we started with.
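For illustration, such a route entry would look roughly like the following, sketched here with the aliyun CLI (the IDs and CIDR are placeholders and the exact parameters are an assumption on my part; the same rule can be added from the route table page in the VPC console). It points the pod CIDR assigned to the node at the node's ECS instance:
$ aliyun vpc CreateRouteEntry --RouteTableId <vtb-id> \
    --DestinationCidrBlock <pod-cidr-of-the-node> \
    --NextHopType Instance --NextHopId <ecs-instance-id>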
Note:
Clusters created in Terway network mode on Alibaba Cloud Container Service (ACK) set up such route rules by default, so with an ACK cluster you only need to buy nodes of a Terway-capable instance type and the pods created on them can reach external ECS hosts out of the box. In that setup the ENIs attached to the ECS serve as the network egress for the node's k8s containers, while the host's own eth0 remains purely a management interface. If you are interested, give ACK's Terway network mode a try.
You are welcome to follow my WeChat official account: BGBiao. Let's keep improving together!