I recently attended an OpenShift training, and the product team's walkthrough refreshed some of my understanding of OpenShift. Today let's start with my weakest area: the network.
Open vSwitch is the core component of the OpenShift SDN. Log in to the cluster and list all the pods on one node:
[root@master ~]# oc adm manage-node node1.example.com --list-pods

Listing matched pods on node: node1.example.com

NAMESPACE              NAME                                          READY     STATUS    RESTARTS   AGE
default                docker-registry-1-trrdq                       1/1       Running   3          16d
default                router-1-6zjmb                                1/1       Running   3          16d
openshift-monitoring   alertmanager-main-0                           3/3       Running   11         16d
openshift-monitoring   alertmanager-main-1                           3/3       Running   10         6d
openshift-monitoring   alertmanager-main-2                           3/3       Running   11         6d
openshift-monitoring   cluster-monitoring-operator-df6d9f48d-8lzcw   1/1       Running   4          16d
openshift-monitoring   grafana-76cc4f64c-m25c4                       2/2       Running   8          16d
openshift-monitoring   kube-state-metrics-8db94b768-tgfgq            3/3       Running   11         16d
openshift-monitoring   node-exporter-mjclp                           2/2       Running   86         164d
openshift-monitoring   prometheus-k8s-0                              4/4       Running   15         6d
openshift-monitoring   prometheus-k8s-1                              4/4       Running   14         6d
openshift-monitoring   prometheus-operator-959fc8dfd-ppc78           1/1       Running   9          16d
openshift-node         sync-7mpsc                                    1/1       Running   47         164d
openshift-sdn          ovs-rzc7j                                     1/1       Running   42         164d
openshift-sdn          sdn-77s8t                                     1/1       Running   60         164d
Sync Pod
The sync pod is created by the sync DaemonSet. It watches /etc/sysconfig/atomic-openshift-node for changes, in particular the BOOTSTRAP_CONFIG_NAME setting.
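To see which group a node was bootstrapped with, a minimal sketch (the file path comes from above; the value shown is only an example and depends on the node's role):

# Show the node configuration group this node bootstraps from
grep BOOTSTRAP_CONFIG_NAME /etc/sysconfig/atomic-openshift-node
# example output: BOOTSTRAP_CONFIG_NAME=node-config-compute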
BOOTSTRAP_CONFIG_NAME is set by the openshift-ansible installer; it names a ConfigMap that is based on a node configuration group. OpenShift ships with the following default node configuration groups (a quick way to list the backing ConfigMaps follows the list):
node-config-master
node-config-infra
node-config-compute
node-config-all-in-one
node-config-master-infra
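These groups are backed by ConfigMaps; a minimal check, assuming the default openshift-node namespace that openshift-ansible uses:

oc get configmaps -n openshift-node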
The Sync Pod converts the ConfigMap data into kubelet configuration and renders /etc/origin/node/node-config.yaml on the node. If that rendered file changes, the kubelet is restarted.
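A sketch for watching this mechanism in action (inotifywait is an assumption here, provided by the inotify-tools package; install it first if the node lacks it):

# Inspect the kubelet configuration the Sync Pod renders
cat /etc/origin/node/node-config.yaml

# Watch the file for rewrites; a change to the backing ConfigMap
# should eventually show up here
inotifywait -m /etc/origin/node/node-config.yaml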
OVS Pod
The ovs pod provides a container network for OpenShift based on Open vSwitch; it is a containerized deployment of Open vSwitch. Its core components are introduced below.
ovs-vswitchd
This daemon is the heart of OVS: together with the datapath kernel module it implements OVS's flow-based packet switching. As the core component, it speaks the OpenFlow protocol to an upstream OpenFlow controller, the OVSDB protocol to ovsdb-server, and netlink to the datapath kernel module. On startup, ovs-vswitchd reads its configuration from ovsdb-server and then configures the kernel datapaths and all OVS switches; whenever the configuration in ovsdb changes (for example via the ovs-vsctl tool), ovs-vswitchd automatically updates itself to stay in sync with the database.
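You can watch that database-driven behavior directly. A throwaway sketch (the port name demo0 is ours; try this on a test node, not production):

# Write a disposable internal port into ovsdb; ovs-vswitchd notices and
# creates it immediately, no restart involved
ovs-vsctl add-port br0 demo0 -- set Interface demo0 type=internal
ovs-vsctl list-ports br0      # demo0 now appears
ovs-vsctl del-port br0 demo0  # and it is removed just as dynamically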
Exec into the ovs pod and you can see the core process running:
root 14874 14873 0 02:02 ? 00:00:01 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
ovsdb-server
This is OVS's lightweight database server. It holds the configuration for the whole of OVS, including interfaces, switching contents, VLANs and so on; the main OVS process, ovs-vswitchd, works from the configuration stored in this database. Below are the details of the ovsdb-server process:
sh-4.2# ps -ef | grep ovsdb-server
root     14009 13989  0 02:01 ?        00:00:00 /bin/bash -c #!/bin/bash
set -euo pipefail

# if another process is listening on the cni-server socket, wait until it exits
trap 'kill $(jobs -p); exit 0' TERM
retries=0
while true; do
  if /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then
    echo "warning: Another process is currently managing OVS, waiting 15s ..." 2>&1
    sleep 15 & wait
    (( retries += 1 ))
  else
    break
  fi
  if [[ "${retries}" -gt 40 ]]; then
    echo "error: Another process is currently managing OVS, exiting" 2>&1
    exit 1
  fi
done

# launch OVS
function quit {
    /usr/share/openvswitch/scripts/ovs-ctl stop
    exit 0
}
trap quit SIGTERM
/usr/share/openvswitch/scripts/ovs-ctl start --no-ovs-vswitchd --system-id=random

# Restrict the number of pthreads ovs-vswitchd creates to reduce the
# amount of RSS it uses on hosts with many cores
# https://bugzilla.redhat.com/show_bug.cgi?id=1571379
# https://bugzilla.redhat.com/show_bug.cgi?id=1572797
if [[ `nproc` -gt 12 ]]; then
    ovs-vsctl --no-wait set Open_vSwitch . other_config:n-revalidator-threads=4
    ovs-vsctl --no-wait set Open_vSwitch . other_config:n-handler-threads=10
fi
/usr/share/openvswitch/scripts/ovs-ctl start --no-ovsdb-server --system-id=random

tail --follow=name /var/log/openvswitch/ovs-vswitchd.log /var/log/openvswitch/ovsdb-server.log &

sleep 20
while true; do
  if ! /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then
    echo "OVS seems to have crashed, exiting"
    quit
  fi
  sleep 15
done
root     14382 13989  0 02:02 ?        00:00:00 ovsdb-server: monitoring pid 14383 (healthy)
root     14383 14382  0 02:02 ?        00:00:00 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
root     15101 14009  0 02:02 ?        00:00:00 tail --follow=name /var/log/openvswitch/ovs-vswitchd.log /var/log/openvswitch/ovsdb-server.log
root     44875 36935  0 02:27 ?        00:00:00 grep ovsdb-server
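To query the database directly rather than through ovs-vsctl, ovsdb-client (shipped with OVS) can dump it over the same unix socket visible in the process list above; a minimal sketch:

ovsdb-client dump unix:/var/run/openvswitch/db.sock Open_vSwitch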
OpenFlow is an open protocol for managing a switch's flow tables; its position within OVS can be seen in the architecture diagram above: it is the protocol spoken between the Controller and ovs-vswitchd. Note that OpenFlow is an independent, complete flow-table protocol that does not depend on OVS; OVS merely supports it. Thanks to that support, we can use an OpenFlow controller to manage the flow tables in OVS. Nor is OpenFlow limited to virtual switches; some hardware switches support the OpenFlow protocol as well.
Controller means an OpenFlow controller. An OpenFlow controller can connect over the OpenFlow protocol to any switch that supports it, such as OVS, and steers traffic by pushing flow rules down to the switch. Besides configuring OVS flows through an OpenFlow controller, you can also use the ovs-ofctl command that ships with OVS to speak OpenFlow to the switch and configure flows; the same command can also dynamically monitor the switch's runtime state, as sketched below.
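A few representative ovs-ofctl invocations, all standard subcommands (br0 is the OpenShift SDN bridge discussed below):

ovs-ofctl -O OpenFlow13 show br0         # ports, tables, capabilities
ovs-ofctl -O OpenFlow13 dump-flows br0   # current flow entries
ovs-ofctl -O OpenFlow13 snoop br0        # watch OpenFlow messages live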
datapath is a Linux kernel module; it is responsible for the actual packet switching.
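A quick way to confirm the kernel side is present and see the datapath it serves (both are standard OVS/Linux tooling):

lsmod | grep openvswitch   # the datapath kernel module is loaded
ovs-dpctl show             # kernel datapaths and the ports attached to them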
1. Network architecture
The diagram below shows this quite intuitively.
Let's look at an actual environment first:
[root@node1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:dc:99:1a brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fedc:991a/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:1e:32:6a:69 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 42:ea:54:f3:41:32 brd ff:ff:ff:ff:ff:ff
9: br0: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:be:50:76:b7:45 brd ff:ff:ff:ff:ff:ff
10: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 56:e0:7e:45:e9:5b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::54e0:7eff:fe45:e95b/64 scope link
       valid_lft forever preferred_lft forever
11: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 1a:57:ef:7d:84:2a brd ff:ff:ff:ff:ff:ff
    inet 10.130.0.1/23 brd 10.130.1.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::1857:efff:fe7d:842a/64 scope link
       valid_lft forever preferred_lft forever
12: veth4da99f3a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 4e:7c:96:3b:db:3c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::4c7c:96ff:fe3b:db3c/64 scope link
       valid_lft forever preferred_lft forever
13: veth4596fc96@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 2a:83:b7:1b:e0:81 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::2883:b7ff:fe1b:e081/64 scope link
       valid_lft forever preferred_lft forever
14: veth04caa6b2@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 3a:5b:9a:62:9c:f4 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::385b:9aff:fe62:9cf4/64 scope link
       valid_lft forever preferred_lft forever
15: veth14f14b18@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether ca:d2:96:48:84:be brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::c8d2:96ff:fe48:84be/64 scope link
       valid_lft forever preferred_lft forever
16: veth31713a78@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether ce:c3:8f:3e:b9:41 brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::ccc3:8fff:fe3e:b941/64 scope link
       valid_lft forever preferred_lft forever
17: veth9ff2f96a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 16:f8:11:d2:28:7a brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::14f8:11ff:fed2:287a/64 scope link
       valid_lft forever preferred_lft forever
18: veth86a4a302@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 56:75:ab:7a:17:28 brd ff:ff:ff:ff:ff:ff link-netnsid 6
    inet6 fe80::5475:abff:fe7a:1728/64 scope link
       valid_lft forever preferred_lft forever
19: vethb4141622@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 46:d1:df:02:5c:d3 brd ff:ff:ff:ff:ff:ff link-netnsid 7
    inet6 fe80::44d1:dfff:fe02:5cd3/64 scope link
       valid_lft forever preferred_lft forever
20: vethae772509@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether c6:12:af:03:f0:b7 brd ff:ff:ff:ff:ff:ff link-netnsid 8
    inet6 fe80::c412:afff:fe03:f0b7/64 scope link
       valid_lft forever preferred_lft forever
25: veth4fbfae38@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default
    link/ether 82:f3:e8:87:9b:e9 brd ff:ff:ff:ff:ff:ff link-netnsid 9
    inet6 fe80::80f3:e8ff:fe87:9be9/64 scope link
       valid_lft forever preferred_lft forever
That is a pile of interfaces. On a given node, OpenShift SDN generally creates the following types of interface:
br0: the OVS bridge device that containers are attached to. OpenShift SDN also configures a set of non-subnet-specific flow rules on this bridge.
tun0: the gateway for external network access. An OVS internal port (port 2 on br0). It is assigned the cluster subnet gateway address and is used for external network access. OpenShift SDN configures netfilter and routing rules to enable access from the cluster subnet to the external network via NAT.
vxlan_sys_4789: the device for reaching pods on other nodes. The OVS VXLAN device (port 1 on br0), which provides access to containers on remote nodes. It is referred to as vxlan0 in the OVS rules.
vethX (in the main netns): the virtual NIC paired with a pod. A Linux virtual ethernet peer of eth0 in the Docker netns. It is attached to the OVS bridge on one of the other ports.
Read this list against the diagram above and the picture becomes quite clear.
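To tie a host-side vethX back to a specific pod, a small sketch (<pod-name> is a placeholder; the /sys path is standard Linux):

# Inside the pod, eth0's iflink is the ifindex of its host-side veth peer
oc exec <pod-name> -- cat /sys/class/net/eth0/iflink

# On the node, look up that ifindex (replace 12 with the value printed above)
ip link | grep '^12:'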
The brctl command cannot see OVS bridges; it only shows docker0:
[root@node1 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.02421e326a69       no
To see the br0 bridge and inspect what is attached to it, you need the openvswitch userspace tools. Per
https://access.redhat.com/solutions/3710321
Open vSwitch moved to a containerized (static pod) implementation in OpenShift 3.10, so on 3.10+ the openvswitch package is no longer preinstalled on nodes. If you need it, install it with the steps below; alternatively, you can simply run the same commands from inside the ovs pod.
# subscription-manager register
# subscription-manager list --available
# subscription-manager attach --pool=<pool_id>
# subscription-manager repos --enable=rhel-7-server-extras-rpms
# subscription-manager repos --enable=rhel-7-server-optional-rpms
# yum install -y openvswitch
List the bridges:
[root@node1 network-scripts]# ovs-vsctl list-br
br0
List the ports on the bridge:
[root@node1 network-scripts]# ovs-ofctl -O OpenFlow13 dump-ports-desc br0
OFPST_PORT_DESC reply (OF1.3) (xid=0x2):
 1(vxlan0): addr:a2:01:b9:7d:a6:c3
     config:     0
     state:      LIVE
     speed: 0 Mbps now, 0 Mbps max
 2(tun0): addr:de:48:d6:e7:a9:91
     config:     0
     state:      LIVE
     speed: 0 Mbps now, 0 Mbps max
 3(veth04fb5821): addr:66:20:89:fa:e0:9f
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 4(vethbdebc4f7): addr:d6:f8:92:06:3e:da
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 5(veth9fd3926f): addr:5e:60:13:c8:30:9e
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 6(veth3b466fa9): addr:ee:3f:1b:cb:cf:9b
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 7(veth866c42b5): addr:be:7e:e6:d2:2f:f1
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 8(veth2446bc66): addr:96:5b:fe:87:10:30
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 9(veth1afcb012): addr:36:3b:de:9d:82:8b
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 10(vethd31bb8c7): addr:06:8f:41:50:ba:72
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 11(vethe02a7907): addr:36:c4:af:c3:c8:26
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 12(veth06b13117): addr:0a:53:73:5e:21:d7
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br0): addr:b2:a3:b4:47:9d:4e
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
Show the complete configuration:
[root@node1 ~]# ovs-vsctl show
c2bad35f-9494-4055-aa1c-b255d0ee7e60
    Bridge "br0"
        fail_mode: secure
        Port "tun0"
            Interface "tun0"
                type: internal
        Port "vethfff599a6"
            Interface "vethfff599a6"
        Port "br0"
            Interface "br0"
                type: internal
        Port "veth38efc56f"
            Interface "veth38efc56f"
        Port "veth80b6c074"
            Interface "veth80b6c074"
        Port "veth4cdb026d"
            Interface "veth4cdb026d"
        Port "veth42f411d0"
            Interface "veth42f411d0"
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {dst_port="4789", key=flow, remote_ip=flow}
        Port "veth5bf0c012"
            Interface "veth5bf0c012"
        Port "vethb80cca24"
            Interface "vethb80cca24"
        Port "veth2335064f"
            Interface "veth2335064f"
        Port "vethf857e799"
            Interface "vethf857e799"
        Port "veth325c8496"
            Interface "veth325c8496"
    ovs_version: "2.9.0"
Dump the flows:
[root@node1 network-scripts]# ovs-ofctl -O OpenFlow13 dump-flows br0
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=250,ip,in_port=2,nw_dst=224.0.0.0/4 actions=drop
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,arp,in_port=1,arp_spa=10.128.0.0/14,arp_tpa=10.130.0.0/23 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=1,nw_src=10.128.0.0/14 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=1,nw_dst=10.128.0.0/14 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=98, n_bytes=4116, priority=200,arp,in_port=2,arp_spa=10.130.0.1,arp_tpa=10.128.0.0/14 actions=goto_table:30
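When the flow dump gets overwhelming, ovs-appctl can trace a single synthetic packet through the tables; a sketch (the addresses are made up, substitute real pod IPs from your node):

ovs-appctl ofproto/trace br0 'tcp,in_port=3,nw_src=10.130.0.12,nw_dst=10.128.2.34,tp_dst=8080'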
That is enough for an opening post. For more SDN troubleshooting, see
https://docs.openshift.com/container-platform/3.11/admin_guide/sdn_troubleshooting.html
For more Open vSwitch material, see
https://opengers.github.io/openstack/openstack-base-use-openvswitch/