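Anyone familiar with container networking knows that containers communicate through VEth devices: one end of a VEth device is attached to the host and the other end is placed inside the container, which links the host network namespace to the container network namespace. The VEth device acts as a virtual network cable connecting the two network namespaces.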
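The host end of this "cable" shows up as a network interface on the host. It is visible directly in the output of ip a and is usually named vethXXX (the device type can also be checked with ip -d link show <interface name>). But a long list of interfaces named veth plus a random string is confusing at first sight. How do these interfaces map to the ends inside the containers? Which container is at the other end of each virtual cable?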
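Below I share two methods I have worked out. The first is also the officially recommended one; the second came to me in a sudden flash of inspiration 💡, so I am writing it down right away. I wonder whether anyone else has had the same idea : )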
The setup is two Pod containers running on the same node. You can also just create two containers yourself with docker run; the details do not matter here.
[root@10-10-40-84 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 47m 10.222.1.3 10-10-40-93 <none> <none>
busybox2 1/1 Running 0 45m 10.222.1.4 10-10-40-93 <none> <none>
[root@10-10-40-84 ~]#
Output of docker ps on the node:
[root@10-10-40-93 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
70eebe80845b af2f74c517aa "sleep 3600" 31 minutes ago Up 31 minutes k8s_busybox_busybox2_default_247b9265-59f5-11e9-9c05-faf63cb42000_1
2060ba52f6ed af2f74c517aa "sleep 3600" 34 minutes ago Up 34 minutes k8s_busybox_busybox_default_c7bf5185-59f4-11e9-9c05-faf63cb42000_1
bcb7f08f8707 registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 "/pause" 2 hours ago Up 2 hours k8s_POD_busybox2_default_247b9265-59f5-11e9-9c05-faf63cb42000_0
9a23d437bf97 registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 "/pause" 2 hours ago Up 2 hours k8s_POD_busybox_default_c7bf5185-59f4-11e9-9c05-faf63cb42000_0
[root@10-10-40-93 ~]#
First, look at the output of ip a on the host where the two containers run:
[root@10-10-40-93 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:e7:af:c1:b5:00 brd ff:ff:ff:ff:ff:ff
inet 10.10.40.93/24 brd 10.10.40.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::f8e7:afff:fec1:b500/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:83:4d:a3:4e:01 brd ff:ff:ff:ff:ff:ff
inet 172.16.130.91/24 brd 172.16.130.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::f883:4dff:fea3:4e01/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 02:42:42:ad:df:4f brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
5: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN
link/ether be:1f:af:bb:6e:f5 brd ff:ff:ff:ff:ff:ff
inet 10.222.1.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::bc1f:afff:febb:6ef5/64 scope link
valid_lft forever preferred_lft forever
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP qlen 1000
link/ether e6:36:8b:52:21:62 brd ff:ff:ff:ff:ff:ff
inet 10.222.1.1/24 scope global cni0
valid_lft forever preferred_lft forever
inet6 fe80::e436:8bff:fe52:2162/64 scope link
valid_lft forever preferred_lft forever
8: vethf0808a3e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP
link/ether b2:2f:ed:b3:d1:66 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::b02f:edff:feb3:d166/64 scope link
valid_lft forever preferred_lft forever
9: vethd5962a6c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP
link/ether be:14:67:cb:39:79 brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::bc14:67ff:fecb:3979/64 scope link
valid_lft forever preferred_lft forever
[root@10-10-40-93 ~]#
The host has two VEth interfaces, vethf0808a3e and vethd5962a6c. ip -d link show confirms that both are indeed of type veth:
[root@10-10-40-93 ~]# ip -d link show vethf0808a3e
8: vethf0808a3e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT
link/ether b2:2f:ed:b3:d1:66 brd ff:ff:ff:ff:ff:ff link-netnsid 1 promiscuity 1
veth
bridge_slave state forwarding priority 32 cost 2 hairpin on guard off root_block off fastleave off learning on flood on port_id 0x8002 port_no 0x2 designated_port 32770 designated_cost 0 designated_bridge 8000.e6:36:8b:52:21:62 designated_root 8000.e6:36:8b:52:21:62 hold_timer 0.00 message_age_timer 0.00 forward_delay_timer 0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on addrgenmode eui64
[root@10-10-40-93 ~]# ip -d link show vethd5962a6c
9: vethd5962a6c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT
link/ether be:14:67:cb:39:79 brd ff:ff:ff:ff:ff:ff link-netnsid 2 promiscuity 1
veth
bridge_slave state forwarding priority 32 cost 2 hairpin on guard off root_block off fastleave off learning on flood on port_id 0x8003 port_no 0x3 designated_port 32771 designated_cost 0 designated_bridge 8000.e6:36:8b:52:21:62 designated_root 8000.e6:36:8b:52:21:62 hold_timer 0.00 message_age_timer 0.00 forward_delay_timer 0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on addrgenmode eui64
[root@10-10-40-93 ~]#
brctl show shows that both VEth interfaces are attached to the bridge cni0:
[root@10-10-40-93 ~]# brctl show
bridge name bridge id STP enabled interfaces
cni0 8000.e6368b522162 no vethd5962a6c
vethf0808a3e
docker0 8000.024242addf4f no
[root@10-10-40-93 ~]#
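Finding the peer of a VEth device through the interface indexes in the ip a output
Run ip a in each of the two Pods (containers) to look at the network interfaces inside them: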
[root@10-10-40-84 ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 47m 10.222.1.3 10-10-40-93 <none> <none>
busybox2 1/1 Running 0 45m 10.222.1.4 10-10-40-93 <none> <none>
[root@10-10-40-84 ~]# kubectl exec -it busybox -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
3: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether a6:d1:b0:67:6a:55 brd ff:ff:ff:ff:ff:ff
inet 10.222.1.3/24 scope global eth0
valid_lft forever preferred_lft forever
[root@10-10-40-84 ~]# kubectl exec -it busybox2 -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
3: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 5a:d8:0d:16:64:5e brd ff:ff:ff:ff:ff:ff
inet 10.222.1.4/24 scope global eth0
valid_lft forever preferred_lft forever
[root@10-10-40-84 ~]#
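Inside the busybox container the interface is eth0@if8, which corresponds to the host interface with index 8, vethf0808a3e. Inside busybox2 the interface is eth0@if9, which corresponds to the host interface with index 9, vethd5962a6c. To verify this with a packet capture, ping from inside the busybox container and capture on the host to see which VEth interface carries the ICMP packets: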
[root@10-10-40-84 ~]# kubectl exec -it busybox sh
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
3: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether a6:d1:b0:67:6a:55 brd ff:ff:ff:ff:ff:ff
inet 10.222.1.3/24 scope global eth0
valid_lft forever preferred_lft forever
/ # ping 10.222.1.4
PING 10.222.1.4 (10.222.1.4): 56 data bytes
^C
--- 10.222.1.4 ping statistics ---
49 packets transmitted, 0 packets received, 100% packet loss
/ #
[root@10-10-40-93 ~]# tcpdump -nn -i vethf0808a3e icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vethf0808a3e, link-type EN10MB (Ethernet), capture size 262144 bytes
21:36:23.262196 IP 10.222.1.3 > 10.222.1.4: ICMP echo request, id 5888, seq 19, length 64
21:36:24.262413 IP 10.222.1.3 > 10.222.1.4: ICMP echo request, id 5888, seq 20, length 64
21:36:25.262565 IP 10.222.1.3 > 10.222.1.4: ICMP echo request, id 5888, seq 21, length 64
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
[root@10-10-40-93 ~]# tcpdump -nn -i vethd5962a6c icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vethd5962a6c, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
[root@10-10-40-93 ~]#
Only vethf0808a3e, the host interface with index 8, captured ICMP packets, so the mapping is confirmed.
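The index matching above can also be done mechanically. The following is a minimal sketch, not an official tool: it parses saved ip a text, and the names parse_interfaces and host_peer_of, as well as the shortened sample strings, are hypothetical. It assumes the host ends of the VEth pairs live in the host network namespace, as in the output shown earlier, because interface indexes are only meaningful within one namespace.

```python
import re

# Matches header lines such as "8: vethf0808a3e@if3: <BROADCAST,...>"
# and "1: lo: <LOOPBACK,...>" in `ip a` output. Names with a suffix
# other than "@ifN" are not matched by this simple pattern.
_HEADER = re.compile(r"^(\d+):\s+([^:@\s]+)(?:@if(\d+))?:")


def parse_interfaces(ip_output):
    """Return (index, name, peer_index) tuples; peer_index is None if absent."""
    result = []
    for line in ip_output.splitlines():
        m = _HEADER.match(line.strip())
        if m:
            peer = int(m.group(3)) if m.group(3) else None
            result.append((int(m.group(1)), m.group(2), peer))
    return result


def host_peer_of(container_output, host_output, container_if="eth0"):
    """Name of the host interface whose index equals the @ifN of container_if."""
    host_by_index = {idx: name for idx, name, _ in parse_interfaces(host_output)}
    for _, name, peer in parse_interfaces(container_output):
        if name == container_if and peer is not None:
            return host_by_index.get(peer)
    return None


# Header lines copied (shortened) from the outputs in this post.
HOST = """\
6: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP
8: vethf0808a3e@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 master cni0
9: vethd5962a6c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 master cni0
"""
BUSYBOX = "3: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450"
BUSYBOX2 = "3: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450"

print(host_peer_of(BUSYBOX, HOST))   # vethf0808a3e
print(host_peer_of(BUSYBOX2, HOST))  # vethd5962a6c
```

The sketch works on text only, so the two inputs can come from ip a on the node and kubectl exec ... -- ip a, exactly as collected above.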
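The other trick finds the peer of a VEth device through the MAC address table of the Linux Bridge. In this setup one end of every VEth device is attached to the Linux Bridge, and since the bridge is the middleman that forwards packets, it has to know what sits on each side. Otherwise, how could it forward anything?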
[root@10-10-40-93 ~]# brctl show
bridge name bridge id STP enabled interfaces
cni0 8000.e6368b522162 no vethd5962a6c
vethf0808a3e
docker0 8000.024242addf4f no
[root@10-10-40-93 ~]#
[root@10-10-40-93 ~]# brctl showmacs cni0
port no mac addr is local? ageing timer
3 5a:d8:0d:16:64:5e no 80.94
2 a6:d1:b0:67:6a:55 no 72.95
2 b2:2f:ed:b3:d1:66 yes 0.00
2 b2:2f:ed:b3:d1:66 yes 0.00
3 be:14:67:cb:39:79 yes 0.00
3 be:14:67:cb:39:79 yes 0.00
[root@10-10-40-93 ~]#
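The Linux Bridge has two ports, port 2 and port 3. The first two entries, whose local flag is no, are the peer ends of the VEth devices, and entries with the same port number belong to the same VEth device. Comparing these MAC addresses with the ip a output on the host and inside the containers gives the same result as the first method, and it can be verified with a packet capture in the same way : )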
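The MAC comparison can be sketched in code as well. This is a minimal illustration under stated assumptions rather than a definitive implementation: pair_macs_by_port and match_veth_to_container are hypothetical names, and the two MAC lookup tables are filled in by hand from the ip a outputs earlier in this post. Note that non-local entries are learned from traffic and age out, so a container that has not sent anything recently may be missing from the table.

```python
from collections import defaultdict


def pair_macs_by_port(showmacs_output):
    """Group `brctl showmacs` rows by port: {port: {"local": set, "remote": set}}."""
    ports = defaultdict(lambda: {"local": set(), "remote": set()})
    for line in showmacs_output.splitlines():
        fields = line.split()
        # Data rows look like "<port> <mac> <yes|no> <ageing timer>";
        # the header line has more fields and is skipped.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port, mac, is_local = int(fields[0]), fields[1].lower(), fields[2]
        ports[port]["local" if is_local == "yes" else "remote"].add(mac)
    return dict(ports)


def match_veth_to_container(showmacs_output, host_macs, container_macs):
    """Return {host_interface: container_name} using the per-port MAC pairing."""
    matches = {}
    for entry in pair_macs_by_port(showmacs_output).values():
        host_ifs = [host_macs[m] for m in entry["local"] if m in host_macs]
        owners = [container_macs[m] for m in entry["remote"] if m in container_macs]
        # Only report a match when the port maps to exactly one of each.
        if len(host_ifs) == 1 and len(owners) == 1:
            matches[host_ifs[0]] = owners[0]
    return matches


# Data copied from the outputs in this post.
SHOWMACS = """\
port no mac addr is local? ageing timer
3 5a:d8:0d:16:64:5e no 80.94
2 a6:d1:b0:67:6a:55 no 72.95
2 b2:2f:ed:b3:d1:66 yes 0.00
2 b2:2f:ed:b3:d1:66 yes 0.00
3 be:14:67:cb:39:79 yes 0.00
3 be:14:67:cb:39:79 yes 0.00
"""
HOST_MACS = {  # MAC of each veth interface, from `ip a` on the host
    "b2:2f:ed:b3:d1:66": "vethf0808a3e",
    "be:14:67:cb:39:79": "vethd5962a6c",
}
CONTAINER_MACS = {  # MAC of eth0 in each Pod, from `ip a` inside the Pod
    "a6:d1:b0:67:6a:55": "busybox",
    "5a:d8:0d:16:64:5e": "busybox2",
}

print(match_veth_to_container(SHOWMACS, HOST_MACS, CONTAINER_MACS))
```

For the data above this maps vethf0808a3e to busybox and vethd5962a6c to busybox2, the same pairing as the index-based method.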