Getting Started with OpenvSwitch on OpenShift

Some time ago I attended an OpenShift training, and the product team's walkthrough refreshed parts of my understanding of OpenShift. Today let me start with my weakest link: networking.

OpenvSwitch is the core component of the OpenShift SDN. Log in to the cluster and list all the pods on a node:

[root@master ~]# oc adm manage-node node1.example.com --list-pods

Listing matched pods on node: node1.example.com

NAMESPACE              NAME                                          READY     STATUS    RESTARTS   AGE
default                docker-registry-1-trrdq                       1/1       Running   3          16d
default                router-1-6zjmb                                1/1       Running   3          16d
openshift-monitoring   alertmanager-main-0                           3/3       Running   11         16d
openshift-monitoring   alertmanager-main-1                           3/3       Running   10         6d
openshift-monitoring   alertmanager-main-2                           3/3       Running   11         6d
openshift-monitoring   cluster-monitoring-operator-df6d9f48d-8lzcw   1/1       Running   4          16d
openshift-monitoring   grafana-76cc4f64c-m25c4                       2/2       Running   8          16d
openshift-monitoring   kube-state-metrics-8db94b768-tgfgq            3/3       Running   11         16d
openshift-monitoring   node-exporter-mjclp                           2/2       Running   86         164d
openshift-monitoring   prometheus-k8s-0                              4/4       Running   15         6d
openshift-monitoring   prometheus-k8s-1                              4/4       Running   14         6d
openshift-monitoring   prometheus-operator-959fc8dfd-ppc78           1/1       Running   9          16d
openshift-node         sync-7mpsc                                    1/1       Running   47         164d
openshift-sdn          ovs-rzc7j                                     1/1       Running   42         164d
openshift-sdn          sdn-77s8t                                     1/1       Running   60         164d

Sync Pod

The sync pods are created by the sync daemonset. Their main job is to watch /etc/sysconfig/atomic-openshift-node for changes, specifically the BOOTSTRAP_CONFIG_NAME setting.
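
To see what they are watching, you can check the env file on a node and the daemonset itself. A quick check, assuming a default openshift-ansible install where the daemonset is simply named sync:

# grep BOOTSTRAP_CONFIG_NAME /etc/sysconfig/atomic-openshift-node
# oc get daemonset sync -n openshift-node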

BOOTSTRAP_CONFIG_NAME is set by the openshift-ansible installer and names a ConfigMap derived from the node configuration group. OpenShift ships the following node configuration groups by default:

  • node-config-master

  • node-config-infra

  • node-config-compute

  • node-config-all-in-one

  • node-config-master-infra

The sync pod converts the ConfigMap data into kubelet configuration and renders /etc/origin/node/node-config.yaml for the node. If that file's configuration changes, the kubelet restarts.
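
These ConfigMaps live in the openshift-node namespace, so you can compare what the sync pod consumes with what it renders. A sketch (the ConfigMap name depends on which group the node belongs to):

# oc get configmaps -n openshift-node
# oc describe configmap node-config-compute -n openshift-node
# cat /etc/origin/node/node-config.yaml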

0. Introduction to OVS

OVS Pod

The ovs pod provides the container network for OpenShift on top of OpenvSwitch; it is a containerized deployment of OpenvSwitch. Its core components are described below.

  • ovs-vswitchd

The ovs-vswitchd daemon is the core component of OVS. Together with the datapath kernel module it implements OVS's flow-based packet switching. As the core component, it speaks the OpenFlow protocol to an upstream OpenFlow controller, the OVSDB protocol to ovsdb-server, and netlink to the datapath kernel module. On startup, ovs-vswitchd reads the configuration from ovsdb-server and then configures the kernel datapaths and all OVS switches. When the configuration in ovsdb changes (for example via the ovs-vsctl tool), ovs-vswitchd automatically updates its own configuration to stay in sync with the database.

Entering the ovs pod, you can see the core process running:
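
One way in is oc rsh with the ovs pod name from the node listing at the top (adjust the pod name for your node):

# oc rsh -n openshift-sdn ovs-rzc7j
sh-4.2# ps -ef | grep ovs-vswitchd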

root      14874  14873  0 02:02 ?        00:00:01 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor

  • ovsdb-server

ovsdb-server is OVS's lightweight database service. It holds the configuration for the whole OVS instance, including interfaces, switching contents, VLANs, and so on. The main OVS process, ovs-vswitchd, works from the configuration stored in this database. Here are the details of the ovsdb-server process:

sh-4.2# ps -ef | grep ovsdb-server
root      14009  13989  0 02:01 ?        00:00:00 /bin/bash -c #!/bin/bash set -euo pipefail  # if another process is listening on the cni-server socket, wait until it exits trap 'kill $(jobs -p); exit 0' TERM retries=0 while true; do   if /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then     echo "warning: Another process is currently managing OVS, waiting 15s ..." 2>&1     sleep 15 & wait     (( retries += 1 ))   else     break   fi   if [[ "${retries}" -gt 40 ]]; then     echo "error: Another process is currently managing OVS, exiting" 2>&1     exit 1   fi done  # launch OVS function quit {     /usr/share/openvswitch/scripts/ovs-ctl stop     exit 0 } trap quit SIGTERM /usr/share/openvswitch/scripts/ovs-ctl start --no-ovs-vswitchd --system-id=random  # Restrict the number of pthreads ovs-vswitchd creates to reduce the # amount of RSS it uses on hosts with many cores # https://bugzilla.redhat.com/show_bug.cgi?id=1571379 # https://bugzilla.redhat.com/show_bug.cgi?id=1572797 if [[ `nproc` -gt 12 ]]; then     ovs-vsctl --no-wait set Open_vSwitch . other_config:n-revalidator-threads=4     ovs-vsctl --no-wait set Open_vSwitch . other_config:n-handler-threads=10 fi /usr/share/openvswitch/scripts/ovs-ctl start --no-ovsdb-server --system-id=random  tail --follow=name /var/log/openvswitch/ovs-vswitchd.log /var/log/openvswitch/ovsdb-server.log & sleep 20 while true; do   if ! /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then     echo "OVS seems to have crashed, exiting"     quit   fi   sleep 15 done
root      14382  13989  0 02:02 ?        00:00:00 ovsdb-server: monitoring pid 14383 (healthy)
root      14383  14382  0 02:02 ?        00:00:00 ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
root      15101  14009  0 02:02 ?        00:00:00 tail --follow=name /var/log/openvswitch/ovs-vswitchd.log /var/log/openvswitch/ovsdb-server.log
root      44875  36935  0 02:27 ?        00:00:00 grep ovsdb-server
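
Since ovsdb-server owns the configuration, ovs-vsctl is in essence a client of this database. For example (run inside the ovs pod, or on the host once the tools from section 2 are installed; the version string matches the ovs_version shown by ovs-vsctl show later in this post):

sh-4.2# ovs-vsctl get Open_vSwitch . ovs_version
"2.9.0"
sh-4.2# ovs-vsctl list Open_vSwitch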
  • OpenFlow

OpenFlow is an open protocol for managing a switch's flow tables; its position in OVS can be seen in the architecture diagram above. It is the protocol spoken between the Controller and ovs-vswitchd. Note that OpenFlow is an independent, complete flow-table protocol that does not depend on OVS; OVS merely supports it. Thanks to that support, we can use an OpenFlow controller to manage the flow tables in OVS. And OpenFlow is not limited to virtual switches: some hardware switches support the protocol as well.
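
Which OpenFlow versions a bridge accepts is itself just ovsdb configuration. OpenShift SDN pins br0 to OpenFlow 1.3, which is why the ovs-ofctl commands later in this post pass -O OpenFlow13. You can verify this yourself (a sketch; output may differ by version):

sh-4.2# ovs-vsctl get bridge br0 protocols
["OpenFlow13"]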

  • Controller

Controller here means an OpenFlow controller. An OpenFlow controller can connect, over the OpenFlow protocol, to any switch that supports OpenFlow, such as OVS, and it steers traffic by pushing flow rules down to the switch. Besides configuring flows in OVS from an OpenFlow controller, you can also use the ovs-ofctl command that ships with OVS to connect to OVS over OpenFlow and configure flows; the same command can dynamically monitor OVS's runtime state.
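
A few read-only ovs-ofctl invocations that illustrate this (run inside the ovs pod or on a host with the tools installed; br0 is the bridge shown later):

# ovs-ofctl -O OpenFlow13 show br0         # OpenFlow features and port numbers
# ovs-ofctl -O OpenFlow13 dump-flows br0   # read the flow tables
# ovs-ofctl -O OpenFlow13 snoop br0        # watch OpenFlow messages in real time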

  • Kernel Datapath

The datapath is a Linux kernel module that performs the actual packet switching.
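
On the node you can confirm the module is loaded and inspect the kernel datapath directly. A minimal sketch, once the openvswitch tools from section 2 are available:

# lsmod | grep openvswitch     # is the kernel module loaded?
# ovs-dpctl show               # datapaths and their attached ports
# ovs-dpctl dump-flows         # flows currently cached in the kernel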

 

1. Network Architecture

The diagram below makes this fairly intuitive.

Let's find an environment and take a look:

[root@node1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:dc:99:1a brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.104/24 brd 192.168.56.255 scope global noprefixroute enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fedc:991a/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:1e:32:6a:69 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 42:ea:54:f3:41:32 brd ff:ff:ff:ff:ff:ff
9: br0: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default qlen 1000
    link/ether 8e:be:50:76:b7:45 brd ff:ff:ff:ff:ff:ff
10: vxlan_sys_4789: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether 56:e0:7e:45:e9:5b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::54e0:7eff:fe45:e95b/64 scope link 
       valid_lft forever preferred_lft forever
11: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 1a:57:ef:7d:84:2a brd ff:ff:ff:ff:ff:ff
    inet 10.130.0.1/23 brd 10.130.1.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::1857:efff:fe7d:842a/64 scope link 
       valid_lft forever preferred_lft forever
12: veth4da99f3a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 4e:7c:96:3b:db:3c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::4c7c:96ff:fe3b:db3c/64 scope link 
       valid_lft forever preferred_lft forever
13: veth4596fc96@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 2a:83:b7:1b:e0:81 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::2883:b7ff:fe1b:e081/64 scope link 
       valid_lft forever preferred_lft forever
14: veth04caa6b2@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 3a:5b:9a:62:9c:f4 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::385b:9aff:fe62:9cf4/64 scope link 
       valid_lft forever preferred_lft forever
15: veth14f14b18@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether ca:d2:96:48:84:be brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::c8d2:96ff:fe48:84be/64 scope link 
       valid_lft forever preferred_lft forever
16: veth31713a78@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether ce:c3:8f:3e:b9:41 brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::ccc3:8fff:fe3e:b941/64 scope link 
       valid_lft forever preferred_lft forever
17: veth9ff2f96a@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 16:f8:11:d2:28:7a brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::14f8:11ff:fed2:287a/64 scope link 
       valid_lft forever preferred_lft forever
18: veth86a4a302@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 56:75:ab:7a:17:28 brd ff:ff:ff:ff:ff:ff link-netnsid 6
    inet6 fe80::5475:abff:fe7a:1728/64 scope link 
       valid_lft forever preferred_lft forever
19: vethb4141622@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 46:d1:df:02:5c:d3 brd ff:ff:ff:ff:ff:ff link-netnsid 7
    inet6 fe80::44d1:dfff:fe02:5cd3/64 scope link 
       valid_lft forever preferred_lft forever
20: vethae772509@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether c6:12:af:03:f0:b7 brd ff:ff:ff:ff:ff:ff link-netnsid 8
    inet6 fe80::c412:afff:fe03:f0b7/64 scope link 
       valid_lft forever preferred_lft forever
25: veth4fbfae38@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ovs-system state UP group default 
    link/ether 82:f3:e8:87:9b:e9 brd ff:ff:ff:ff:ff:ff link-netnsid 9
    inet6 fe80::80f3:e8ff:fe87:9be9/64 scope link 
       valid_lft forever preferred_lft forever

 

You can see a whole pile of interfaces. On a node, OpenShift SDN generally creates the following types of interfaces:

  • br0: the OVS bridge device that containers are attached to. OpenShift SDN also configures a set of non-subnet-specific flow rules on this bridge.

  • tun0: the gateway for external network access. An OVS internal port (port 2 on br0); it is assigned the cluster subnet gateway address and is used for external network access. OpenShift SDN configures netfilter and routing rules to enable access from the cluster subnet to the external network via NAT.

  • vxlan_sys_4789: the device for reaching pods on other nodes. The OVS VXLAN device (port 1 on br0), which provides access to containers on remote nodes. Referred to as vxlan0 in the OVS rules.

  • vethX (in the main netns): the virtual NIC paired with a pod. A Linux virtual ethernet peer of eth0 in the Docker netns; it is attached to the OVS bridge on one of the other ports.

Read together with the diagram above, this description becomes quite clear.
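
The @ifN suffix on each vethX name is the key to the pairing: it is the ifindex of the peer eth0 inside the container's network namespace, and eth0 carries the host-side index the same way. A sketch of tracing one pod's interface on the node (the container ID is a placeholder; this assumes a docker-based 3.x node):

# pid=$(docker inspect -f '{{.State.Pid}}' <container-id>)
# nsenter -t $pid -n ip link show eth0
3: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 ...

Here eth0's own index is 3 (hence the @if3 on every host-side veth above), and @if12 points back to host interface 12, which is veth4da99f3a in the listing.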

 

With the brctl command you cannot see the OVS bridge; only docker0 shows up:

[root@node1 ~]# brctl show
bridge name    bridge id        STP enabled    interfaces
docker0        8000.02421e326a69    no        

To see the br0 bridge and the interfaces attached to it, you need to install openvswitch.

2. Installing OpenvSwitch

Based on:

https://access.redhat.com/solutions/3710321

Because openvswitch moved to a static pod implementation in 3.10 and later, those versions no longer pre-install openvswitch on the nodes. If you need it, follow the steps below; alternatively, you can run the same checks directly inside the ovs pod.

# subscription-manager register
# subscription-manager list --available
# subscription-manager attach --pool=<pool_id>
# subscription-manager repos --enable=rhel-7-server-extras-rpms
# subscription-manager repos --enable=rhel-7-server-optional-rpms

# yum install -y openvswitch
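
The ovs pod uses the host's /var/run/openvswitch (note the db.sock path in the ps output earlier), so the CLI installed here talks to the same ovsdb-server and sees the pod-managed bridge. A quick sanity check:

# ovs-vsctl --version
# ovs-vsctl show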

 

List the bridges:

[root@node1 network-scripts]#  ovs-vsctl list-br
br0

View the ports on the bridge:

[root@node1 network-scripts]# ovs-ofctl -O OpenFlow13 dump-ports-desc br0
OFPST_PORT_DESC reply (OF1.3) (xid=0x2):
 1(vxlan0): addr:a2:01:b9:7d:a6:c3
     config:     0
     state:      LIVE
     speed: 0 Mbps now, 0 Mbps max
 2(tun0): addr:de:48:d6:e7:a9:91
     config:     0
     state:      LIVE
     speed: 0 Mbps now, 0 Mbps max
 3(veth04fb5821): addr:66:20:89:fa:e0:9f
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 4(vethbdebc4f7): addr:d6:f8:92:06:3e:da
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 5(veth9fd3926f): addr:5e:60:13:c8:30:9e
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 6(veth3b466fa9): addr:ee:3f:1b:cb:cf:9b
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 7(veth866c42b5): addr:be:7e:e6:d2:2f:f1
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 8(veth2446bc66): addr:96:5b:fe:87:10:30
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 9(veth1afcb012): addr:36:3b:de:9d:82:8b
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 10(vethd31bb8c7): addr:06:8f:41:50:ba:72
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 11(vethe02a7907): addr:36:c4:af:c3:c8:26
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 12(veth06b13117): addr:0a:53:73:5e:21:d7
     config:     0
     state:      LIVE
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br0): addr:b2:a3:b4:47:9d:4e
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max

 

View the complete configuration:

[root@node1 ~]# ovs-vsctl show
c2bad35f-9494-4055-aa1c-b255d0ee7e60
    Bridge "br0"
        fail_mode: secure
        Port "tun0"
            Interface "tun0"
                type: internal
        Port "vethfff599a6"
            Interface "vethfff599a6"
        Port "br0"
            Interface "br0"
                type: internal
        Port "veth38efc56f"
            Interface "veth38efc56f"
        Port "veth80b6c074"
            Interface "veth80b6c074"
        Port "veth4cdb026d"
            Interface "veth4cdb026d"
        Port "veth42f411d0"
            Interface "veth42f411d0"
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {dst_port="4789", key=flow, remote_ip=flow}
        Port "veth5bf0c012"
            Interface "veth5bf0c012"
        Port "vethb80cca24"
            Interface "vethb80cca24"
        Port "veth2335064f"
            Interface "veth2335064f"
        Port "vethf857e799"
            Interface "vethf857e799"
        Port "veth325c8496"
            Interface "veth325c8496"
    ovs_version: "2.9.0"

 

Dump the flows:

[root@node1 network-scripts]# ovs-ofctl -O OpenFlow13 dump-flows br0
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=250,ip,in_port=2,nw_dst=224.0.0.0/4 actions=drop
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,arp,in_port=1,arp_spa=10.128.0.0/14,arp_tpa=10.130.0.0/23 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=1,nw_src=10.128.0.0/14 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=1,nw_dst=10.128.0.0/14 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:10
 cookie=0x0, duration=788.192s, table=0, n_packets=98, n_bytes=4116, priority=200,arp,in_port=2,arp_spa=10.130.0.1,arp_tpa=10.128.0.0/14 actions=goto_table:30
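
To see how a particular packet would walk these tables, ovs-appctl ofproto/trace replays a synthetic flow through br0 and prints every table lookup along the way. A sketch with illustrative addresses from this cluster's subnet:

# ovs-appctl ofproto/trace br0 'in_port=2,ip,nw_src=10.130.0.1,nw_dst=10.130.0.5'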

 

This is just a start; that's it for now. For more network diagnostics, see:

https://docs.openshift.com/container-platform/3.11/admin_guide/sdn_troubleshooting.html

 

For more OpenvSwitch material, see:

https://opengers.github.io/openstack/openstack-base-use-openvswitch/
