k8s series: the flannel network plugin

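Cross-node communication has to go through NAT, that is, source address translation is required.

    k8s network communication:

        1) Container-to-container: multiple containers in the same pod communicate over lo;

        2) Pod-to-pod: pod IP <---> pod IP; pods must be able to reach each other directly, without any address translation;

        3) Pod-to-Service: pod IP <----> cluster IP (i.e. the Service IP) <----> pod IP, implemented with iptables or ipvs. Note that ipvs cannot fully replace iptables: ipvs handles the load balancing, but kube-proxy in ipvs mode still relies on iptables for tasks such as masquerading (SNAT) and packet filtering;

        4) Communication between a Service and clients outside the cluster.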

 

[root@master pki]# kubectl get configmap -n kube-system
NAME                                 DATA      AGE
coredns                              1         22d
extension-apiserver-authentication   6         22d
kube-flannel-cfg                     2         22d
kube-proxy                           2         22d
kubeadm-config                       1         22d
kubelet-config-1.11                  1         22d
kubernetes-dashboard-settings        1         9h

  

[root@master pki]# kubectl get configmap kube-proxy  -o yaml  -n kube-system
mode: ""

  

   The mode field is empty, which means kube-proxy uses its default (iptables). Changing it to ipvs is enough to switch.
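The change itself is a one-field edit. A minimal sketch of it in Python follows, assuming the kube-proxy configuration is available as plain text; the helper name `set_proxy_mode` and the sample text are hypothetical, and in practice the ConfigMap would be edited with kubectl and the kube-proxy pods restarted.

```python
import re


def set_proxy_mode(config_text: str, mode: str) -> str:
    """Return config_text with the first `mode:` line set to the given mode."""
    # Match a line such as `mode: ""` or `mode: iptables`, keeping its indentation.
    pattern = re.compile(r'^([ \t]*mode:[ \t]*).*$', re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: m.group(1) + '"' + mode + '"', config_text, count=1
    )
    if count == 0:
        raise ValueError("no 'mode:' line found in the configuration text")
    return new_text


# Hypothetical fragment of the kube-proxy configuration, for illustration only.
sample = 'kind: KubeProxyConfiguration\nmode: ""\nnodePortAddresses: null\n'
print(set_proxy_mode(sample, "ipvs"))
```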

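    k8s relies on the CNI interface to plug in third-party plugins for networking. Popular plugins at the moment include flannel, calico, canal, and kube-router.

    These plugins are built on the following approaches:

    1) Virtual bridge: containers communicate through virtual NICs attached to a shared virtual bridge;

    2) Multiplexing: MacVLAN, where multiple containers share one physical NIC;

    3) Hardware switching: SR-IOV, where one physical NIC is virtualized into multiple interfaces; this gives the best performance.

 Where the CNI plugin configuration is stored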

[root@master ~]# cat  /etc/cni/net.d/10-flannel.conflist 
{
  "name": "cbr0",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

  

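  flannel only provides network connectivity; it does not support network policy.

    calico supports both connectivity and network policy.

    canal: the combined functionality of flannel + calico.

 

    We can deploy flannel for connectivity and then deploy calico for network policy only, instead of using canal.

    MTU: the maximum packet size that can pass through a given layer of a communication protocol.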

[root@master ~]#  ifconfig 
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::4097:d5ff:fe28:6b64  prefixlen 64  scopeid 0x20<link>
        ether 0a:58:0a:f4:00:01  txqueuelen 1000  (Ethernet)
        RX packets 1609844  bytes 116093191 (110.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1632952  bytes 577989701 (551.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:83:f8:b8:ff  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.100  netmask 255.255.255.0  broadcast 172.16.1.255
        inet6 fe80::9cf3:d9de:59f:c320  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::5707:6115:267b:bff5  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::e34:f952:2859:4c69  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:a2:4e:cb  txqueuelen 1000  (Ethernet)
        RX packets 5250378  bytes 704067861 (671.4 MiB)
        RX errors 139  dropped 190  overruns 0  frame 0
        TX packets 4988169  bytes 4151179300 (3.8 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::a82c:bcff:fef8:895c  prefixlen 64  scopeid 0x20<link>
        ether aa:2c:bc:f8:89:5c  txqueuelen 0  (Ethernet)
        RX packets 51  bytes 3491 (3.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 53  bytes 5378 (5.2 KiB)
        TX errors 0  dropped 10 overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 59118846  bytes 15473986573 (14.4 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 59118846  bytes 15473986573 (14.4 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
veth6ec94aab: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::487d:5bff:fef7:484d  prefixlen 64  scopeid 0x20<link>
        ether 4a:7d:5b:f7:48:4d  txqueuelen 0  (Ethernet)
        RX packets 88112  bytes 19831802 (18.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 105718  bytes 13343894 (12.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
vethf703483a: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::b06a:eaff:fec3:33a8  prefixlen 64  scopeid 0x20<link>
        ether b2:6a:ea:c3:33:a8  txqueuelen 0  (Ethernet)
        RX packets 760882  bytes 59400960 (56.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 763263  bytes 282299805 (269.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
vethff579703: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::d82f:37ff:fe9a:b6d0  prefixlen 64  scopeid 0x20<link>
        ether da:2f:37:9a:b6:d0  txqueuelen 0  (Ethernet)
        RX packets 760850  bytes 59398245 (56.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 764016  bytes 282349248 (269.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

  

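 From the ifconfig output we can see that flannel.1 has the address 10.244.0.0, netmask 255.255.255.255, and an MTU of 1450. The MTU is 50 bytes below the physical NIC's 1500 because room has to be left for the overlay's encapsulation overhead.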
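The 1450 figure can be checked with a little arithmetic. The sketch below assumes VXLAN over IPv4 without optional headers, where each inner frame gains an outer IPv4 header, an outer UDP header, the VXLAN header, and carries its own inner Ethernet header; the function name is only for illustration.

```python
# Per-packet overhead of VXLAN encapsulation over IPv4 (in bytes).
OUTER_IPV4_HEADER = 20
OUTER_UDP_HEADER = 8
VXLAN_HEADER = 8
INNER_ETHERNET_HEADER = 14


def vxlan_mtu(physical_mtu: int) -> int:
    """MTU left for pod traffic after subtracting the VXLAN overhead."""
    overhead = (
        OUTER_IPV4_HEADER + OUTER_UDP_HEADER + VXLAN_HEADER + INNER_ETHERNET_HEADER
    )
    return physical_mtu - overhead


print(vxlan_mtu(1500))  # 1450, matching flannel.1 and cni0 above
```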

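    cni0 only appears once a pod is running on the node.

    Pods on two different nodes can communicate through the flannel tunnel. The VXLAN backend is used by default; because of its encapsulation overhead, performance is somewhat lower.

    flannel's second backend is host-gw (host gateway): each node uses its own network interface as the gateway for its pods, so that pods on different nodes can communicate. It performs better than VXLAN because there is no encapsulation overhead. Its drawback is that all nodes must be on the same network segment.

     In addition, if the nodes hosting two pods are on the same network segment, VXLAN can be made to behave like host-gw: traffic is forwarded by routing directly over the physical NIC instead of going through the flannel overlay tunnel, which improves VXLAN performance. This flannel feature is called DirectRouting.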
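The choice between the two paths comes down to a same-subnet check. The following is only a sketch of that idea, not flannel's actual implementation; the function names are hypothetical, and 172.16.2.50 is a made-up address standing in for a node on a different segment.

```python
import ipaddress


def same_subnet(local_interface: str, remote_ip: str) -> bool:
    """True if remote_ip falls inside the network of the local interface."""
    network = ipaddress.ip_interface(local_interface).network
    return ipaddress.ip_address(remote_ip) in network


def choose_path(local_interface: str, remote_ip: str) -> str:
    """Pick a direct route for same-segment peers, the VXLAN tunnel otherwise."""
    if same_subnet(local_interface, remote_ip):
        return "direct route"
    return "vxlan tunnel"


# node1 (172.16.1.101/24) talking to node2 on the same segment.
print(choose_path("172.16.1.101/24", "172.16.1.102"))  # direct route
# A hypothetical node on another segment falls back to the tunnel.
print(choose_path("172.16.1.101/24", "172.16.2.50"))   # vxlan tunnel
```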

[root@master ~]# kubectl get daemonset -n kube-system
NAME                      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
kube-flannel-ds-amd64     3         3         3         3            3           beta.kubernetes.io/arch=amd64 

  

[root@master ~]# kubectl get pods -n kube-system -o wide
NAME                                   READY     STATUS    RESTARTS   AGE       IP             NODE
kube-flannel-ds-amd64-6zqzr            1/1       Running   8          22d       172.16.1.100   master
kube-flannel-ds-amd64-7qtcl            1/1       Running   7          22d       172.16.1.101   node1
kube-flannel-ds-amd64-kpctn            1/1       Running   6          22d       172.16.1.102   node2

  

    We can see that flannel runs as pods managed by a DaemonSet controller (flannel can also run directly on the host as a system daemon).

 

[root@master ~]# kubectl get configmap -n kube-system
NAME                                 DATA      AGE
kube-flannel-cfg                     2         22d

  

[root@master ~]# kubectl get configmap kube-flannel-cfg -o json -n kube-system
\\\"10.244.0.0/16\\\",\\n  \\\"Backend\\\": {\\n    \\\"Type\\\": \\\"vxlan\

  

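   flannel configuration parameters:

        1. Network: the network address in CIDR format that flannel uses to provide networking for pods.

            1) 10.244.0.0/16 --->

                    master: 10.244.0.0/24

                    node01: 10.244.1.0/24

                    ....

                    node255: 10.244.255.0/24

                supports up to 256 nodes (one /24 per node)

             2) 10.0.0.0/8

                    10.0.0.0/24

                    ...

                    10.255.255.0/24

                supports more than 60,000 nodes (65,536 subnets of /24)

         2. SubnetLen: the mask length used when Network is split into per-node subnets; the default is 24 bits;

         3. SubnetMin: the lowest subnet that may be handed out to a node; for example, setting it to 10.244.10.0/24 means subnets 0 to 9 are never used;

         4. SubnetMax: the highest subnet that may be handed out to a node, for example 10.244.100.0/24

         5. Backend: vxlan, host-gw, udp (the slowest)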
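The subnet arithmetic above can be reproduced with Python's standard ipaddress module. This is an illustrative sketch, not flannel's allocator: `node_subnets` is a hypothetical helper, and it takes the SubnetMin/SubnetMax bounds as plain addresses.

```python
import ipaddress


def node_subnets(network, subnet_len=24, subnet_min=None, subnet_max=None):
    """List the per-node subnets of `network`, optionally bounded by min/max."""
    net = ipaddress.ip_network(network)
    low = ipaddress.ip_address(subnet_min) if subnet_min else None
    high = ipaddress.ip_address(subnet_max) if subnet_max else None
    result = []
    for subnet in net.subnets(new_prefix=subnet_len):
        if low is not None and subnet.network_address < low:
            continue  # below SubnetMin
        if high is not None and subnet.network_address > high:
            continue  # above SubnetMax
        result.append(subnet)
    return result


print(len(node_subnets("10.244.0.0/16")))  # 256
print(len(node_subnets("10.0.0.0/8")))     # 65536

limited = node_subnets(
    "10.244.0.0/16", subnet_min="10.244.10.0", subnet_max="10.244.100.0"
)
print(limited[0], limited[-1], len(limited))  # 10.244.10.0/24 10.244.100.0/24 91
```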

    

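flannel

    supports several backends

    VXLAN

        1. vxlan

        2. DirectRouting

    host-gw: Host Gateway  # not recommended; it only works within a single layer-2 network and cannot cross networks, and with thousands of pods it is prone to broadcast storms

    UDP: poor performance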

 

[root@master ~]# kubectl get pods -o wide
NAME                             READY     STATUS             RESTARTS   AGE       IP             NODE
myapp-deploy-69b47bc96d-79fqh    1/1       Running            4          7d        10.244.1.97    node1
myapp-deploy-69b47bc96d-tc54k    1/1       Running            4          7d        10.244.2.88    node2

  

[root@master ~]# kubectl exec -it myapp-deploy-69b47bc96d-79fqh -- /bin/sh
/ # ping 10.244.2.88  # ping the IP of the container on the other node
PING 10.244.2.88 (10.244.2.88): 56 data bytes
64 bytes from 10.244.2.88: seq=0 ttl=62 time=0.459 ms
64 bytes from 10.244.2.88: seq=0 ttl=62 time=0.377 ms
64 bytes from 10.244.2.88: seq=1 ttl=62 time=0.252 ms
64 bytes from 10.244.2.88: seq=2 ttl=62 time=0.261 ms

  

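    Capturing on another node with an icmp filter shows no packets on ens192, so the ICMP traffic does not cross that node's ens192 unencapsulated.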

[root@master ~]# tcpdump -i ens192 -nn icmp

  

[root@master ~]# yum install bridge-utils -y

  

[root@master ~]# brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.024283f8b8ff       no

  

[root@master ~]# brctl show cni0
bridge name     bridge id               STP enabled     interfaces
cni0            8000.0a580af40001       no              veth6ec94aab
                                                        vethf703483a
                                                        vethff579703

  

  We can see that these veth interfaces are all bridged onto cni0.

    brctl show lists the existing bridges.

[root@node1 ~]#  tcpdump -i cni0 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:40:11.370754 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 96, length 64
23:40:11.370988 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 96, length 64
23:40:12.370888 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 97, length 64
23:40:12.371090 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 97, length 64
^X23:40:13.371015 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 98, length 64
23:40:13.371239 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 98, length 64
23:40:14.371128 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 99, length 64

  

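    As shown, on the node we can capture the packets of the ping issued inside the container on the cni0 interface.

    In fact, the ping traffic first comes in through cni0, then goes out through flannel.1, and finally leaves via the physical NIC ens192. So we can capture the packets on flannel.1 as well: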

[root@node1 ~]#  tcpdump -i flannel.1 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
03:12:36.823315 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 12840, length 64
03:12:36.823496 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 12840, length 64
03:12:37.823490 IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 4864, seq 12841, length 64
03:12:37.823634 IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 4864, seq 12841, length 64

  

  Likewise, the packets can be captured on the physical NIC ens192:

[root@node1 ~]# tcpdump -i ens192 -nn host 172.16.1.102  # 172.16.1.102 is node2's physical IP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
10:59:24.234174 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 0, length 64
10:59:24.234434 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 0, length 64
10:59:25.234301 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 1, length 64
10:59:25.234469 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 1, length 64
10:59:26.234415 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 2, length 64
10:59:26.234592 IP 172.16.1.102.54894 > 172.16.1.101.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.88 > 10.244.1.97: ICMP echo reply, id 7168, seq 2, length 64
10:59:27.234528 IP 172.16.1.101.60617 > 172.16.1.102.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.1.97 > 10.244.2.88: ICMP echo request, id 7168, seq 3, length 64
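In this capture the inner ICMP packets travel inside UDP datagrams to port 8472, the Linux kernel's default VXLAN port, which tcpdump labels as OTV. The `flags [I] (0x08)` and `instance 1` fields correspond to the 8-byte VXLAN header. A minimal sketch of that header layout follows, assuming the standard format (8 flag bits, 24 reserved bits, a 24-bit VNI, 8 reserved bits); the function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag shown by tcpdump


def build_vxlan_header(vni: int) -> bytes:
    """Pack an 8-byte VXLAN header carrying the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First word: flags in the top byte; second word: VNI in the top 3 bytes.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vxlan_header(header: bytes):
    """Return (flags, vni) from the first 8 bytes of a VXLAN payload."""
    word1, word2 = struct.unpack("!II", header[:8])
    return word1 >> 24, word2 >> 8


header = build_vxlan_header(1)       # VNI 1, as used by flannel.1
print(header.hex())                  # 0800000000000100
print(parse_vxlan_header(header))    # (8, 1)
```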

  

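 Next we switch flannel's communication mode to DirectRouting: download the manifest from GitHub, delete the existing network, and re-apply it. This procedure is not recommended, but it is what the video tutorial did. The author edited the source file and then restarted the k8s cluster; with that approach, pods created afterwards were all stuck in the Pending state.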

https://github.com/coreos/flannel
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Find:
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan",
        "Directrouting": true  // add this line; remember to add a comma at the end of the line above

 

First delete the previous flannel deployment. Do not do this in a production environment.

[root@master flannel]# kubectl delete -f kube-flannel.yml 

  

Create the new one (re-apply the edited kube-flannel.yml), then check the result:

[root@master flannel]# kubectl get pods -n kube-system 
[root@master flannel]# kubectl get configmap kube-flannel-cfg -o json -n kube-system

  "net-conf.json": "{\n  \"Network\": \"10.244.0.0/16\",\n  \"Backend\": {\n    \"Type\": \"vxlan\",\n    \"Directrouting\": true\n  }\n}\n"

  

Directrouting is present in the configuration, which shows the change has taken effect.

 

[root@master ~]# ip route show
default via 172.16.1.254 dev ens192 proto static metric 100 
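10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1 # traffic to the local pod subnet is forwarded directly on this host via cni0, without any other interface
10.244.1.0/24 via 172.16.1.101 dev ens192 # 10.244.1.0/24 is now reached via 172.16.1.101 over the local physical NIC ens192, i.e. over the physical network instead of the flannel tunnel; this is DirectRouting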
10.244.2.0/24 via 172.16.1.102 dev ens192 
172.16.1.0/24 dev ens192 proto kernel scope link src 172.16.1.100 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
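How the kernel picks one of these routes can be illustrated with a longest-prefix-match lookup over the same table. This is a simplified sketch (metrics and route types are ignored); the `ROUTES` list mirrors the output above, while the `lookup` helper and the destination 10.244.0.5 are made up for the example.

```python
import ipaddress

# (destination prefix, gateway or None for directly connected, device)
ROUTES = [
    ("0.0.0.0/0", "172.16.1.254", "ens192"),
    ("10.244.0.0/24", None, "cni0"),
    ("10.244.1.0/24", "172.16.1.101", "ens192"),
    ("10.244.2.0/24", "172.16.1.102", "ens192"),
    ("172.16.1.0/24", None, "ens192"),
    ("172.17.0.0/16", None, "docker0"),
]


def lookup(destination: str):
    """Return (prefix, gateway, device) of the most specific matching route."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), gateway, device)
        for prefix, gateway, device in ROUTES
        if address in ipaddress.ip_network(prefix)
    ]
    network, gateway, device = max(matches, key=lambda route: route[0].prefixlen)
    return str(network), gateway, device


print(lookup("10.244.2.100"))  # ('10.244.2.0/24', '172.16.1.102', 'ens192')
print(lookup("10.244.0.5"))    # ('10.244.0.0/24', None, 'cni0')
```

A pod IP on node2 resolves to the route through node2's physical address on ens192, so no flannel.1 hop is involved.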

  

Next, log in to a pod again and run a ping test:

 

[root@master ~]# kubectl get pods -o wide
NAME                             READY     STATUS             RESTARTS   AGE       IP             NODE
myapp-deploy-69b47bc96d-75g2b    1/1       Running            0          12m       10.244.1.124   node1
myapp-deploy-69b47bc96d-jwgwm    1/1       Running            0          3s        10.244.2.100   node2

  

[root@master ~]# kubectl exec  -it myapp-deploy-69b47bc96d-75g2b -- /bin/sh
/ # ping 10.244.2.100
PING 10.244.2.100 (10.244.2.100): 56 data bytes
64 bytes from 10.244.2.100: seq=0 ttl=62 time=0.536 ms
64 bytes from 10.244.2.100: seq=1 ttl=62 time=0.206 ms
64 bytes from 10.244.2.100: seq=2 ttl=62 time=0.206 ms
64 bytes from 10.244.2.100: seq=3 ttl=62 time=0.203 ms
64 bytes from 10.244.2.100: seq=4 ttl=62 time=0.210 ms

  

[root@node1 ~]# tcpdump -i ens192 -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
12:31:10.899403 IP 10.244.1.124 > 10.244.2.100: ICMP echo request, id 8960, seq 24, length 64
12:31:10.899546 IP 10.244.2.100 > 10.244.1.124: ICMP echo reply, id 8960, seq 24, length 64
12:31:11.899505 IP 10.244.1.124 > 10.244.2.100: ICMP echo request, id 8960, seq 25, length 64
12:31:11.899639 IP 10.244.2.100 > 10.244.1.124: ICMP echo reply, id 8960, seq 25, length 64

  

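  The capture shows that pings between the pods now enter and leave through the physical NIC ens192 as plain ICMP, without encapsulation. This is DirectRouting, and it performs better than the default vxlan mode.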
