Inside a Kubernetes cluster, applications usually reach each other through Services, so understanding the strengths and weaknesses of the Service mechanism helps with application planning and deployment. With that in mind, this article uses a simple example to explore the pros and cons of the ClusterIP type of Service.
To make the walkthrough easier, first create the following application and Service:
# kubectl run --image=nginx nginx-web-1 --image-pull-policy='IfNotPresent'
# kubectl expose deployment nginx-web-1 --port=80 --target-port=80
The author's Kubernetes environment is version 1.9, where the Service is implemented by kube-proxy, which by default uses iptables. Exploring how Services work in this cluster therefore means studying how they are implemented with iptables.
As shown below, requests sent to the nginx-web-1 Service really do reach the backend pod:
# The nginx pod IP address:
# kubectl describe pod nginx-web-1-fb8d45f5f-dcbtt | grep "IP"
IP:                 10.129.1.22

# The Service: requests to 172.30.132.253:80 actually reach 10.129.1.22:80
# kubectl describe svc nginx-web-1
...
Type:               ClusterIP
IP:                 172.30.132.253
Port:               <unset>  80/TCP
TargetPort:         80/TCP
Endpoints:          10.129.1.22:80
Session Affinity:   None
...

# Overwrite the nginx web page:
# kubectl exec -it nginx-web-1-fb8d45f5f-dcbtt -- \
    sh -c "echo hello>/usr/share/nginx/html/index.html"
# curl 10.129.1.22
hello
# curl 172.30.132.253
hello
The CLUSTER-IP assigned to the Service, as well as the port it listens on, are both virtual: running ip a or netstat -an on any cluster node will not show them. In reality, the IP and port are configured by iptables on every node. Running the following command on a node finds the iptables rules related to this Service; a brief analysis follows:
# iptables-save | grep nginx-web-1
-A KUBE-SEP-UWNFTKZFYWNNNTK7 -s 10.129.1.22/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-UWNFTKZFYWNNNTK7 -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m tcp -j DNAT --to-destination 10.129.1.22:80
-A KUBE-SERVICES -d 172.30.132.253/32 -p tcp -m comment \
    --comment "demo/nginx-web-1: cluster IP" -m tcp --dport 80 -j KUBE-SVC-SNP24T7IBBNZDJ76
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-SEP-UWNFTKZFYWNNNTK7
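To double-check that the ClusterIP really is virtual, a quick look on any node might resemble the following sketch (interface names and installed tools vary; ss is used here in place of netstat):

# The Service IP is not bound to any interface and nothing listens on it locally:
# ip addr | grep 172.30.132.253        # expected: no output
# ss -lnt | grep 172.30.132.253        # expected: no output
# Yet the address is reachable, because iptables rewrites packets destined for it:
# iptables-save -t nat | grep 172.30.132.253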
Looking at the iptables rules in more detail: running iptables-save shows that the PREROUTING and OUTPUT chains of the nat table both jump to the KUBE-SERVICES chain, and in each case this jump is the first rule.
*nat
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
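The position of the jump can be confirmed directly; a minimal check (run as root on a cluster node) is sketched below, where the KUBE-SERVICES jump should show up as rule 1 in both chains:

# iptables -t nat -L PREROUTING -n --line-numbers | head -5
# iptables -t nat -L OUTPUT -n --line-numbers | head -5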
When an application is accessed through the Service, the traffic is handled by the PREROUTING chain of the nat table and jumps to the KUBE-SERVICES chain, which contains the rules for each individual Service. As shown below, traffic to 172.30.132.253:80 matches a rule that jumps to the KUBE-SVC-... chain, which in turn jumps to the KUBE-SEP-... chain.
-A KUBE-SERVICES -d 172.30.132.253/32 -p tcp -m comment \
    --comment "demo/nginx-web-1: cluster IP" -m tcp --dport 80 -j KUBE-SVC-SNP24T7IBBNZDJ76
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-SEP-UWNFTKZFYWNNNTK7
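One way to watch this chain traversal happen is to look at the per-rule packet counters while generating traffic. A rough sketch (the chain name below is the one from this cluster's dump; yours will differ):

# iptables -t nat -Z KUBE-SVC-SNP24T7IBBNZDJ76         # zero the counters of this chain
# curl -s 172.30.132.253 >/dev/null
# iptables -t nat -L KUBE-SVC-SNP24T7IBBNZDJ76 -n -v   # the jump to KUBE-SEP should now count 1 packet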
As shown below, the KUBE-SEP-... chain contains two rules: the first marks packets whose source is the pod itself (the hairpin case) so that they will later be masqueraded, and the second DNATs the traffic to the pod at 10.129.1.22:80:
-A KUBE-SEP-UWNFTKZFYWNNNTK7 -s 10.129.1.22/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-UWNFTKZFYWNNNTK7 -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m tcp -j DNAT --to-destination 10.129.1.22:80
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x1/0x1
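The mark set by KUBE-MARK-MASQ (0x1 on this cluster, as seen in --set-xmark above; the value differs between kube-proxy builds) is consumed later in the POSTROUTING stage, where marked packets are masqueraded. A hedged way to see where that happens:

# Exact rule text varies by kube-proxy version; look for a MASQUERADE rule matching the mark:
# iptables-save -t nat | grep -E 'KUBE-POSTROUTING|KUBE-MARK-MASQ'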
Run the following command to scale the Deployment to 3 pods, and then look again at how the Service balances load and what issues that raises.
# kubectl scale deploy/nginx-web-1 --replicas=3
Dumping the firewall rules again shows that the Service now uses the iptables statistic module to spread traffic across the endpoints in random mode, so that each backend receives roughly an equal share of requests (an even, random distribution rather than strict round-robin).
# iptables-save | grep nginx-web-1
-A KUBE-SEP-BI762VOIAZZWU5S7 -s 10.129.1.27/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-BI762VOIAZZWU5S7 -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m tcp -j DNAT --to-destination 10.129.1.27:80
-A KUBE-SEP-CDQIKEVSTA766BRK -s 10.129.1.28/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-CDQIKEVSTA766BRK -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m tcp -j DNAT --to-destination 10.129.1.28:80
-A KUBE-SEP-W5HTO42ZVNHJQWBG -s 10.129.3.57/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-W5HTO42ZVNHJQWBG -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m tcp -j DNAT --to-destination 10.129.3.57:80
-A KUBE-SERVICES -d 172.30.132.253/32 -p tcp -m comment \
    --comment "demo/nginx-web-1: cluster IP" -m tcp --dport 80 -j KUBE-SVC-SNP24T7IBBNZDJ76
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-BI762VOIAZZWU5S7
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-CDQIKEVSTA766BRK
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-SEP-W5HTO42ZVNHJQWBG
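The three probabilities combine into an even split: the first endpoint is chosen with probability 1/3; if not (2/3 of the time), the second rule fires half of that time, again 1/3 overall; whatever falls through (the remaining 1/3) goes to the last endpoint. A rough way to observe the distribution is to label each pod and count responses; the sketch below assumes the pods carry the run=nginx-web-1 label that kubectl run added for this Deployment:

# Write each pod's hostname into its index.html, then sample the Service 300 times:
# for p in $(kubectl get pod -l run=nginx-web-1 -o name); do \
      kubectl exec "${p##*/}" -- sh -c 'hostname > /usr/share/nginx/html/index.html'; \
  done
# for i in $(seq 1 300); do curl -s 172.30.132.253; done | sort | uniq -c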
As shown below, edit the Service to enable session affinity, with the affinity timeout set to 3 hours (note: if left unset, the default is also 3 hours):
# kubectl edit svc nginx-web-1
...
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
...
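The same change can be applied non-interactively; a one-liner sketch using kubectl patch:

# kubectl patch svc nginx-web-1 -p \
    '{"spec":{"sessionAffinity":"ClientIP","sessionAffinityConfig":{"clientIP":{"timeoutSeconds":10800}}}}'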
Looking at the iptables rules again, the recent module has now been added on top of the earlier rules; it is this module that provides session affinity. In other words, by combining the statistic and recent modules in iptables, kube-proxy implements both the Service's even load distribution and its session affinity.
In the KUBE-SVC-SNP... chain, the recent rules are placed ahead of the statistic rules, so a returning client is first checked against the affinity table and only falls through to the random endpoint selection when no entry matches:
# iptables-save | grep nginx-web-1
-A KUBE-SEP-BI762VOIAZZWU5S7 -s 10.129.1.27/32 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-MARK-MASQ
-A KUBE-SEP-BI762VOIAZZWU5S7 -p tcp -m comment --comment "demo/nginx-web-1:" \
    -m recent --set --name KUBE-SEP-BI762VOIAZZWU5S7 --mask 255.255.255.255 \
    --rsource -m tcp -j DNAT --to-destination 10.129.1.27:80
# 2 similar KUBE-SEP rules omitted
...
-A KUBE-SERVICES -d 172.30.132.253/32 -p tcp -m comment \
    --comment "demo/nginx-web-1: cluster IP" -m tcp --dport 80 -j KUBE-SVC-SNP24T7IBBNZDJ76
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-BI762VOIAZZWU5S7 \
    --mask 255.255.255.255 --rsource -j KUBE-SEP-BI762VOIAZZWU5S7
# 2 similar KUBE-SVC rules omitted
...
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-BI762VOIAZZWU5S7
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-CDQIKEVSTA766BRK
-A KUBE-SVC-SNP24T7IBBNZDJ76 -m comment --comment "demo/nginx-web-1:" \
    -j KUBE-SEP-W5HTO42ZVNHJQWBG
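The per-client entries kept by the recent module can be inspected directly, since xt_recent exposes them under /proc, and repeating the sampling test from a single client should now always return the same pod. A sketch (the chain names are the ones from this cluster's dump):

# Each KUBE-SEP chain that records entries gets its own tracking table under /proc:
# ls /proc/net/xt_recent/
# cat /proc/net/xt_recent/KUBE-SEP-BI762VOIAZZWU5S7   # client IPs currently pinned to this endpoint
# Repeated requests from the same client should now all hit one pod:
# for i in $(seq 1 10); do curl -s 172.30.132.253; done | sort | uniq -c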
A Kubernetes Service provides load balancing and session affinity. It is implemented by configuring iptables through the Linux kernel's netfilter framework; packets are forwarded entirely inside the kernel and only a small number of rules need to be matched, so it is very efficient. The load-balancing capability, however, is fairly weak: traffic is spread evenly using the statistic module's random rules, and more sophisticated policies such as least connections are not possible. For this reason, Kubernetes 1.9 and later versions extended kube-proxy so that Service load balancing can also be implemented with IPVS.
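For reference, switching kube-proxy to IPVS mode (beta in 1.9) is largely a configuration change; a hedged sketch, assuming the required kernel modules and the ipvsadm tool are available on the nodes:

# Load the kernel modules IPVS mode relies on:
# modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4
# Start kube-proxy with --proxy-mode=ipvs (optionally --ipvs-scheduler=lc for least connection),
# then inspect the virtual servers it programs:
# ipvsadm -Ln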