How Rancher Uses iptables

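For operations engineers, iptables is a very useful tool; combined with tcpdump/wireshark it is especially effective for tracking down layer-4 send and receive problems. For example, a packet reaches a device in the protocol stack and tcpdump captures the exchange there, yet the packet is never delivered up to the layer-4 application. Some mechanism or policy in between must be blocking the delivery, and at that point iptables, driven from user space, can often capture a good deal of useful information.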

 

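iptables itself is a user-space program that works with the kernel's netfilter subsystem to control packets (firewalling, load balancing and other common uses). The classic (legacy) iptables binary pushes its rule sets to the kernel with getsockopt/setsockopt on a raw socket, while the netlink path belongs to the libipq packet-queueing library that ships in the iptables source tree: it calls the glibc function socket(PF_NETLINK, SOCK_RAW, NETLINK_FIREWALL) (or socket(PF_NETLINK, SOCK_RAW, NETLINK_IP6_FW) for IPv6), traps into the kernel through a system call, and obtains a socket handle to netfilter over which packets intercepted at the firewall hooks (PREROUTING, FORWARD, INPUT, OUTPUT, POSTROUTING) are handed to user space for processing. That logic can be found in the ipq_create_handle function in libipq.c; it is not explored further here.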

 

What follows is a short walkthrough that uses iptables to analyze how Rancher steers packets.

 

Flush the rules from every chain in the iptables mangle table:

  • iptables -t mangle -F

 

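Add LOG rules to the five hooks of the mangle table. The rules match TCP packets whose source or destination port is 3306 (a MySQL application exposing port 3306 will be deployed on Rancher shortly) and log them at level 7 (debug):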

  • iptables -t mangle -A PREROUTING -p tcp --dport 3306 -j LOG --log-prefix "M-PREROUTING:" --log-level 7
  • iptables -t mangle -A POSTROUTING -p tcp --dport 3306 -j LOG --log-prefix "M-POSTROUTING:" --log-level 7
  • iptables -t mangle -A FORWARD -p tcp --dport 3306 -j LOG --log-prefix "M-FORWARD:" --log-level 7
  • iptables -t mangle -A OUTPUT -p tcp --dport 3306 -j LOG --log-prefix "M-OUTPUT:" --log-level 7
  • iptables -t mangle -A INPUT -p tcp --dport 3306 -j LOG --log-prefix "M-INPUT:" --log-level 7
  • iptables -t mangle -A PREROUTING -p tcp --sport 3306 -j LOG --log-prefix "M-PREROUTING:" --log-level 7
  • iptables -t mangle -A POSTROUTING -p tcp --sport 3306 -j LOG --log-prefix "M-POSTROUTING:" --log-level 7
  • iptables -t mangle -A FORWARD -p tcp --sport 3306 -j LOG --log-prefix "M-FORWARD:" --log-level 7
  • iptables -t mangle -A OUTPUT -p tcp --sport 3306 -j LOG --log-prefix "M-OUTPUT:" --log-level 7
  • iptables -t mangle -A INPUT -p tcp --sport 3306 -j LOG --log-prefix "M-INPUT:" --log-level 7
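The ten commands above differ only in the chain name and the port direction, so they could also be generated with a loop. This is a minimal sketch, assuming bash; the function name gen_log_rules is made up for illustration, and it only prints the commands instead of running them:

```shell
#!/usr/bin/env bash
# Print (not execute) the LOG rules for the five mangle hooks.
# gen_log_rules is a hypothetical helper; the port defaults to 3306.
gen_log_rules() {
    local port="${1:-3306}"
    local dir chain
    for dir in --dport --sport; do
        for chain in PREROUTING POSTROUTING FORWARD OUTPUT INPUT; do
            # Same rule shape as the hand-written commands above
            printf 'iptables -t mangle -A %s -p tcp %s %s -j LOG --log-prefix "M-%s:" --log-level 7\n' \
                "$chain" "$dir" "$port" "$chain"
        done
    done
}

gen_log_rules 3306
```

To actually install the rules, the output could be piped to a root shell, for example `gen_log_rules 3306 | sudo sh`.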

 

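Deploy a MySQL application on Rancher with port 3306 exposed, then, on the host that runs the container, look at the main iptables rules Rancher added for the service. They set the stage for the packet analysis below: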

  • nat table
    • -A CATTLE_HOSTPORTS_POSTROUTING -s 10.42.158.152/32 -d 10.42.158.152/32 -p tcp -m tcp --dport 3306 -j MASQUERADE
    • -A CATTLE_OUTPUT -p tcp -m tcp --dport 3306 -m addrtype --dst-type LOCAL -j DNAT --to-destination 10.42.158.152:3306
    • -A CATTLE_PREROUTING ! -i docker0 -p tcp -m tcp --dport 3306 -j DNAT --to-destination 10.42.158.152:3306
    • -A CATTLE_PREROUTING -p tcp -m tcp --dport 3306 -m addrtype --dst-type LOCAL -j DNAT --to-destination 10.42.158.152:3306
  • filter table
    • -A CATTLE_FORWARD -m mark --mark 0x1068 -j ACCEPT
    • -A CATTLE_FORWARD -m mark --mark 0x4000 -j ACCEPT
    • -A CATTLE_FORWARD -d 10.42.0.0/16 -o docker0 -j ACCEPT
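One way to pull a listing like this out of a full rule dump is to filter iptables-save output for the CATTLE chains. A rough sketch, assuming bash and grep; filter_cattle_rules is a hypothetical helper name, and the demo input is a hand-made sample (two rules copied from the listing above plus one unrelated rule invented for contrast), not real output:

```shell
#!/usr/bin/env bash
# Keep table headers ("*nat", "*filter") and any rule that mentions a CATTLE chain.
# filter_cattle_rules is a hypothetical helper; it reads iptables-save text on stdin.
filter_cattle_rules() {
    grep -E '^\*|^-A .*CATTLE'
}

# Hand-made sample input so the sketch runs without root privileges
sample='*nat
-A CATTLE_PREROUTING ! -i docker0 -p tcp -m tcp --dport 3306 -j DNAT --to-destination 10.42.158.152:3306
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
*filter
-A CATTLE_FORWARD -m mark --mark 0x1068 -j ACCEPT'

printf '%s\n' "$sample" | filter_cattle_rules
```

On a real host the same filter could be fed from `sudo iptables-save | filter_cattle_rules`.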

 

Inspect /var/log/kern.log and trace how the packets flow:

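  • Accessing port 3306 on the MySQL host from another host
    • The SYN packet reaches the PREROUTING chain of the mangle table and writes the log below; it is then DNATed in the PREROUTING chain of the nat table. The matching rule is in the custom CATTLE_PREROUTING chain listed above.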
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196230] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
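    • The SYN packet reaches the FORWARD chain of the mangle table and writes the log below; it is then ACCEPTed in the FORWARD chain of the filter table. The matching rule is in the custom CATTLE_FORWARD chain listed above.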
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196283] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
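    • The SYN packet reaches the POSTROUTING chain of the mangle table and writes the log below. It then passes the POSTROUTING chain of the nat table without SNAT, leaves the host's protocol stack, and finally reaches the container. The container answers the SYN with a SYN+ACK; what happens inside the container is not traced in this article.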
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196296] M-POSTROUTING:IN= OUT=docker0 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
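    • The SYN+ACK packet reaches the PREROUTING chain of the mangle table and writes the log below. No DNAT is applied: as a reply on a connection that conntrack already tracks, it is not matched against the nat PREROUTING rules again.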
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196717] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • The SYN+ACK packet reaches the FORWARD chain of the mangle table and writes the log below; in the FORWARD chain of the filter table it is ACCEPTed by default.
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196729] M-FORWARD:IN=docker0 OUT=enp0s8 PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
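    • The SYN+ACK packet reaches the POSTROUTING chain of the mangle table and writes the log below, still carrying the container address as its source. After that its source is rewritten back to the host address 172.168.1.200. This is conntrack reversing the earlier DNAT rather than a hit on the CATTLE_HOSTPORTS_POSTROUTING rule listed above, which only matches traffic whose source and destination are both 10.42.158.152. The packet then leaves the host's protocol stack and reaches the network interface of the host that initiated the connection.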
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196735] M-POSTROUTING:IN= OUT=enp0s8 PHYSIN=vethr1368e17ce3 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
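    • The ACK packet arrives at the host and follows the same path as the SYN in the first step, reusing the DNAT mapping already recorded for the connection.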
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196984] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196996] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068
      • Nov 17 07:07:39 cattleh2 kernel: [46619.197002] M-POSTROUTING:IN= OUT=docker0 SRC=172.168.1.204 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068

 

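  • Accessing port 3306 from the host itself (127.0.0.1) follows the same pattern as the cross-host case, except that locally generated packets enter through OUTPUT (where CATTLE_OUTPUT applies the DNAT) instead of PREROUTING and FORWARD, and the replies are delivered through INPUT. Below are the logs written by the mangle table; the analysis above should help in reading them.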
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906692] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906708] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906776] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906791] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906806] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19504 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906812] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19504 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907322] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=147 TOS=0x08 PREC=0x00 TTL=64 ID=34614 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907331] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=147 TOS=0x08 PREC=0x00 TTL=64 ID=34614 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907362] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19505 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907367] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19505 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290761] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19506 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290777] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19506 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290939] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34615 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290948] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34615 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473238] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19507 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473255] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19507 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473378] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34616 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473386] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34616 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
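Raw kernel log lines like these are long, and the fields that matter for the analysis (hook, source, destination, TCP flags) are easy to lose in them. As a rough aid, the lines can be condensed with awk. This is a minimal sketch, assuming bash and a POSIX awk; summarize_log is a made-up helper name, and it relies on the "M-<CHAIN>:" log prefixes configured earlier:

```shell
#!/usr/bin/env bash
# Condense LOG-target lines to "<CHAIN> <SRC> -> <DST> [<TCP flags>]".
# summarize_log is a hypothetical helper; it reads kern.log lines on stdin.
summarize_log() {
    awk '
    {
        chain = ""; src = ""; dst = ""; flags = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^M-[A-Z]+:/) {            # token such as "M-PREROUTING:IN=enp0s8"
                chain = $i
                sub(/^M-/, "", chain)           # drop the "M-" prefix
                sub(/:.*/, "", chain)           # drop everything after the chain name
            } else if ($i ~ /^SRC=/) {
                src = substr($i, 5)
            } else if ($i ~ /^DST=/) {
                dst = substr($i, 5)
            } else if ($i == "SYN" || $i == "ACK" || $i == "PSH" || $i == "FIN" || $i == "RST") {
                flags = (flags == "" ? $i : flags "+" $i)
            }
        }
        if (chain != "") printf "%s %s -> %s [%s]\n", chain, src, dst, flags
    }'
}

# Demo on two lines copied from the cross-host trace above
summarize_log <<'EOF'
Nov 17 07:07:39 cattleh2 kernel: [46619.196230] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
Nov 17 07:07:39 cattleh2 kernel: [46619.196283] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
EOF
# prints:
# PREROUTING 172.168.1.204 -> 172.168.1.200 [SYN]
# FORWARD 172.168.1.204 -> 10.42.158.152 [SYN]
```

In the condensed form the DNAT is visible at a glance: the destination changes from 172.168.1.200 to 10.42.158.152 between PREROUTING and FORWARD. On a live host the input could come from something like `grep 'M-' /var/log/kern.log | summarize_log`.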

 

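From the analysis above, we can see that Rancher does not use docker-proxy to expose services. With Docker's userland proxy, imagine running 10,000 services: the host would have to open 10,000 listening ports, each held by its own docker-proxy process, just to expose them, which is no small overhead for the kernel. Rancher's design, which uses iptables to steer packets on the host, is without doubt a sound approach. Some readers may ask: why not IPVS? In this humble editor's shallow opinion, that could be worth trying too. Thanks, everyone.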
