The Kubernetes Network Model in Practice

I. Setup

First we create a set of resources: a Deployment and a Service.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  selector:
    matchLabels:
      name: nginx1
  replicas: 1
  template:
    metadata:
      labels:
        name: nginx1
    spec:
      nodeName: meizu
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    name: nginx1
spec:
  ports:
  - port: 4432
    targetPort: 80
  selector:
    name: nginx1

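As shown above, we create an nginx Deployment pinned to a specific node (meizu) and a Service that points at its pod. We then start a pod on another node and access the Service from inside that pod.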

src pod  ->  service  ->  backend pod

172.30.83.9  ->  10.254.40.119:4432  ->  172.30.20.2:80
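The mapping from the service address to the backend pod can be illustrated with a small model. This is only a sketch of the idea (label selection followed by an address rewrite), not how kube-proxy is implemented; the `Pod` and `Service` classes and the `client` label on the src pod are hypothetical, while the addresses and the `name: nginx1` selector come from the manifest and flow above.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    ip: str
    labels: dict


@dataclass
class Service:
    cluster_ip: str
    port: int
    target_port: int
    selector: dict


def matches(selector, labels):
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def resolve(service, pods):
    """Return the (ip, port) backends that the service address can be rewritten to."""
    return [(pod.ip, service.target_port)
            for pod in pods if matches(service.selector, pod.labels)]


pods = [
    Pod(ip="172.30.20.2", labels={"name": "nginx1"}),  # the nginx pod
    Pod(ip="172.30.83.9", labels={"name": "client"}),  # the src pod; label is hypothetical
]
svc = Service(cluster_ip="10.254.40.119", port=4432, target_port=80,
              selector={"name": "nginx1"})

backends = resolve(svc, pods)
print(f"{svc.cluster_ip}:{svc.port} -> {backends}")
```

Only the pod labelled `name: nginx1` is selected, so a request to 10.254.40.119:4432 ends up at 172.30.20.2:80.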

 

All of the operations below are run on the node that hosts the nginx pod.

II. Physical NIC

1. Capture on the nginx pod / src pod addresses

sudo tcpdump -i enp4s0 'dst 172.30.20.2'

sudo tcpdump -i enp4s0 'src 172.30.83.9'

Neither command produces any output.

2. Capture on the service address

sudo tcpdump -i enp4s0 'dst 10.254.40.119'

No output.

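3. The physical address of the node hosting the src pod is 10.167.226.38. We now capture all UDP packets that this node sends to port 8472 on the local node. Note that 8472 is the port flannel listens on (the Linux default VXLAN port, which flannel's vxlan backend uses).

sudo tcpdump  -i enp4s0 'src 10.167.226.38 and port 8472 and udp'
This produces the following output: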

11:25:22.220286 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [S], seq 154323928, win 29200, options [mss 1460,sackOK,TS val 3546750064 ecr 0,nop,wscale 7], length 0
11:25:22.221179 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [.], ack 4141357270, win 229, options [nop,nop,TS val 3546750065 ecr 248682180], length 0
11:25:22.221383 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [P.], seq 0:81, ack 1, win 229, options [nop,nop,TS val 3546750065 ecr 248682180], length 81: HTTP: GET / HTTP/1.1
11:25:22.221933 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [.], ack 234, win 237, options [nop,nop,TS val 3546750066 ecr 248682181], length 0
11:25:22.221949 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [.], ack 847, win 247, options [nop,nop,TS val 3546750066 ecr 248682181], length 0
11:25:22.222347 IP xiaomi.49008 > meizu.otv: OTV, flags [I] (0x08), overlay 0, instance 1 IP 172.30.83.0.38200 > 172.30.20.2.http: Flags [F.], seq 81, ack 847, win 247, options [nop,nop,TS val 3546750067 ecr 248682181], length 0

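On the physical NIC, the packet's source address is the IP of the source physical host (xiaomi) and its destination address is the IP of the physical host where the target pod runs (meizu).

The source address carried inside the packet body is 172.30.83.0, which is the address of the flannel.1 interface on the src pod's host. The inner destination address is 172.30.20.2, the address of the nginx pod.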
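A note on reading the capture: tcpdump labels UDP port 8472 as "OTV", but the fields it prints (`flags [I] (0x08)`, `instance 1`) line up with the 8-byte VXLAN header, where the I flag marks a valid VXLAN Network Identifier (VNI) and the instance is the VNI. The sketch below packs and unpacks that header to show the layout. It assumes flannel's vxlan backend with VNI 1, as the capture suggests; the function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag shown in the capture


def pack_vxlan_header(vni):
    """Build the 8-byte header: flags (8 bits), reserved (24), VNI (24), reserved (8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_header(data):
    """Return (flags, vni) from the first 8 bytes of a VXLAN payload."""
    word1, word2 = struct.unpack("!II", data[:8])
    return word1 >> 24, word2 >> 8


header = pack_vxlan_header(1)
flags, vni = unpack_vxlan_header(header)
print(header.hex(), hex(flags), vni)
```

Everything after these 8 bytes is the inner Ethernet frame, which is where the 172.30.83.0 > 172.30.20.2 TCP segment lives.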

 

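4. Now check the same capture for TCP

sudo tcpdump  -i enp4s0 'src 10.167.226.38 and port 8472 and tcp'

There is no output, so the traffic between the flannel endpoints is carried over UDP, not TCP.

Summary: on the physical NIC, all traffic travels between the physical addresses of the source and destination hosts. These packets encapsulate the flannel traffic that runs from the flannel.1 address of the src pod's node to the address of the destination nginx pod.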

 

III. Flannel NIC

Run the command below, where 172.30.83.0 is the flannel interface address of the node hosting the src pod.

[wlh@meizu ~]$ sudo tcpdump  -i flannel.1 'host 172.30.83.0 and tcp'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:42:43.362239 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [S], seq 3941617519, win 29200, options [mss 1460,sackOK,TS val 3547791182 ecr 0,nop,wscale 7], length 0
11:42:43.363702 IP 172.30.20.2.http > 172.30.83.0.50350: Flags [S.], seq 3977445704, ack 3941617520, win 27960, options [mss 1410,sackOK,TS val 249723323 ecr 3547791182,nop,wscale 7], length 0
11:42:43.364106 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [.], ack 1, win 229, options [nop,nop,TS val 3547791184 ecr 249723323], length 0
11:42:43.364180 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [P.], seq 1:82, ack 1, win 229, options [nop,nop,TS val 3547791184 ecr 249723323], length 81: HTTP: GET / HTTP/1.1
11:42:43.364218 IP 172.30.20.2.http > 172.30.83.0.50350: Flags [.], ack 82, win 219, options [nop,nop,TS val 249723324 ecr 3547791184], length 0
11:42:43.364482 IP 172.30.20.2.http > 172.30.83.0.50350: Flags [P.], seq 1:234, ack 82, win 219, options [nop,nop,TS val 249723324 ecr 3547791184], length 233: HTTP: HTTP/1.1 200 OK
11:42:43.364608 IP 172.30.20.2.http > 172.30.83.0.50350: Flags [FP.], seq 234:846, ack 82, win 219, options [nop,nop,TS val 249723324 ecr 3547791184], length 612: HTTP
11:42:43.364868 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [.], ack 234, win 237, options [nop,nop,TS val 3547791185 ecr 249723324], length 0
11:42:43.364888 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [.], ack 847, win 247, options [nop,nop,TS val 3547791185 ecr 249723324], length 0
11:42:43.365226 IP 172.30.83.0.50350 > 172.30.20.2.http: Flags [F.], seq 82, ack 847, win 247, options [nop,nop,TS val 3547791185 ecr 249723324], length 0
11:42:43.365271 IP 172.30.20.2.http > 172.30.83.0.50350: Flags [.], ack 83, win 219, options [nop,nop,TS val 249723325 ecr 3547791185], length 0

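The packet from the physical layer has been decapsulated and handed to the flannel.1 interface. The traffic that flannel.1 handles runs between the flannel.1 address of the src pod's node and the address of the target pod.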
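One detail in this capture is worth a quick calculation: the nginx pod advertises `mss 1410` in its SYN-ACK rather than the usual 1460. That value is what the encapsulation overhead would predict, as the arithmetic below shows. It is a sketch under the assumptions that enp4s0 has a standard 1500-byte MTU and that all headers are IPv4 and TCP without options.

```python
PHYSICAL_MTU = 1500   # assumption: standard Ethernet MTU on enp4s0

# Bytes added in front of the inner IP packet by VXLAN encapsulation.
OUTER_IP = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14

# Headers of the inner packet itself (no IP or TCP options assumed).
INNER_IP = 20
INNER_TCP = 20

overhead = OUTER_IP + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
overlay_mtu = PHYSICAL_MTU - overhead
mss = overlay_mtu - INNER_IP - INNER_TCP

print(f"overhead={overhead} overlay_mtu={overlay_mtu} mss={mss}")
```

The result, an MSS of 1410 from an overlay MTU of 1450, matches the value in the SYN-ACK above.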

IV. docker0 NIC

Here 172.30.83.0 is again the flannel.1 interface address of the host where the src pod runs.

[wlh@meizu ~]$ sudo tcpdump  -i docker0 'host 172.30.83.0 and tcp'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:51:00.681066 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [S], seq 2690808127, win 29200, options [mss 1460,sackOK,TS val 3548288489 ecr 0,nop,wscale 7], length 0
11:51:00.681110 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [S.], seq 115108410, ack 2690808128, win 27960, options [mss 1410,sackOK,TS val 250220641 ecr 3548288489,nop,wscale 7], length 0
11:51:00.681548 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [.], ack 1, win 229, options [nop,nop,TS val 3548288490 ecr 250220641], length 0
11:51:00.681560 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [P.], seq 1:82, ack 1, win 229, options [nop,nop,TS val 3548288490 ecr 250220641], length 81: HTTP: GET / HTTP/1.1
11:51:00.681608 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [.], ack 82, win 219, options [nop,nop,TS val 250220641 ecr 3548288490], length 0
11:51:00.681773 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [P.], seq 1:234, ack 82, win 219, options [nop,nop,TS val 250220642 ecr 3548288490], length 233: HTTP: HTTP/1.1 200 OK
11:51:00.681853 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [FP.], seq 234:846, ack 82, win 219, options [nop,nop,TS val 250220642 ecr 3548288490], length 612: HTTP
11:51:00.682018 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [.], ack 234, win 237, options [nop,nop,TS val 3548288490 ecr 250220642], length 0
11:51:00.682031 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [.], ack 847, win 247, options [nop,nop,TS val 3548288490 ecr 250220642], length 0
11:51:00.682504 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [F.], seq 82, ack 847, win 247, options [nop,nop,TS val 3548288491 ecr 250220642], length 0
11:51:00.682523 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [.], ack 83, win 219, options [nop,nop,TS val 250220642 ecr 3548288491], length 0

The traffic here is plain TCP and, as on the flannel interface, it runs between the flannel interface address of the source host and the address of the target pod.
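To check a claim like this over a longer capture without reading every line, the tcpdump text can be parsed and reduced to the set of address pairs it contains. The sketch below handles only the line shape seen in these captures (`time IP src.port > dst.port: Flags [...]`); the regular expression and the helper names are hypothetical and would need adjusting for other tcpdump output formats.

```python
import re

# Matches lines such as:
# 11:51:00.681066 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [S], seq ...
LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\S+)\.(?P<sport>[^.\s]+) > "
    r"(?P<dst>\S+)\.(?P<dport>[^.\s:]+): "
    r"Flags \[(?P<flags>[^\]]+)\]"
)


def parse(line):
    """Return the parsed fields as a dict, or None for non-packet lines."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


def address_pairs(lines):
    """Collect the unordered (src, dst) address pairs seen in a capture."""
    pairs = set()
    for line in lines:
        record = parse(line)
        if record:
            pairs.add(frozenset((record["src"], record["dst"])))
    return pairs


capture = [
    "listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes",
    "11:51:00.681066 IP 172.30.83.0.56252 > 172.30.20.2.http: Flags [S], seq 2690808127, win 29200, length 0",
    "11:51:00.681110 IP 172.30.20.2.http > 172.30.83.0.56252: Flags [S.], seq 115108410, ack 2690808128, win 27960, length 0",
]
pairs = address_pairs(capture)
print(pairs)
```

A single pair in the result means every packet in the capture ran between the same two addresses.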

V. Container NIC

The output on the container's interface is similar to the captures above, so it is not repeated here.

 

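VI. Summary

The whole process can be summarized as follows:

  1. A process inside the container sends a request to the service. The request is handed to the container's network interface, which sends it to the other end of the veth pair. At this point the request is src pod ip -> service ip.
  2. Following the rules in the NAT table, the destination address is rewritten to the IP of the backend pod, and the request is passed on to the docker0 interface. At this point the request is src pod ip -> backend pod ip.
  3. docker0 sends the request straight out, and the routing table (route -n) directs it to the flannel.1 interface. At this point the request is flannel.1 ip -> backend pod ip; the source address has changed, most likely because of a masquerade (SNAT) rule.
  4. flannel.1 is a VXLAN device, so in this setup (port 8472, vxlan backend) the kernel wraps the request in a VXLAN/UDP header, using the forwarding entries that the flanneld process maintains from the configuration it reads from etcd. The resulting packet is source node ip -> dst node ip.
  5. The local NIC receives the encapsulated packet and sends it out over the physical network to the remote host.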
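The address rewrites in the steps above can be traced with a small model. This is a sketch that only tracks header fields and is not a description of how iptables or the kernel VXLAN driver work internally. The `Packet` class and the helper functions are hypothetical, the /24 subnet size is an assumption, and the nodes are named xiaomi and meizu as they appear in the capture.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int
    outer_src: Optional[str] = None  # set once the packet is encapsulated
    outer_dst: Optional[str] = None


def dnat(pkt, services):
    """Step 2: rewrite a service address to its backend pod address."""
    backend = services.get((pkt.dst, pkt.dport))
    if backend is None:
        return pkt
    return replace(pkt, dst=backend[0], dport=backend[1])


def masquerade(pkt, out_ip):
    """Step 3: replace the source with the outgoing interface address."""
    return replace(pkt, src=out_ip)


def encapsulate(pkt, local_node, routes):
    """Step 4: pick the remote node by destination subnet and add outer addresses."""
    dst = ipaddress.ip_address(pkt.dst)
    for subnet, node in routes.items():
        if dst in ipaddress.ip_network(subnet):
            return replace(pkt, outer_src=local_node, outer_dst=node)
    raise LookupError(f"no overlay route for {pkt.dst}")


services = {("10.254.40.119", 4432): ("172.30.20.2", 80)}
routes = {"172.30.20.0/24": "meizu"}  # assumption: one /24 pod subnet per node

sent = Packet(src="172.30.83.9", dst="10.254.40.119", dport=4432)
after_dnat = dnat(sent, services)
after_snat = masquerade(after_dnat, "172.30.83.0")
on_wire = encapsulate(after_snat, "xiaomi", routes)

for label, pkt in [("sent", sent), ("dnat", after_dnat),
                   ("snat", after_snat), ("wire", on_wire)]:
    print(label, pkt)
```

The final packet carries xiaomi -> meizu on the outside and 172.30.83.0 -> 172.30.20.2:80 on the inside, which is what the captures on enp4s0 and flannel.1 showed.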

 
