Exploring OpenStack (7): A Deep Dive into Neutron: Open vSwitch (OVS) + GRE on the Neutron Node

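0. Test environment

Hardware: the same four-node OpenStack deployment as before; see http://www.cnblogs.com/sammyliu/p/4190843.html

OpenStack configuration:

  • tenant: three tenants: demo, tenant-one, and tenant-two
  • network: the three tenants share one public network; each tenant has its own subnet and a router that connects that subnet to the public network
  • VMs: three VMs, one in tenant-one and two in tenant-two, all on the compute node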

1. Network components on the Neutron node

Using the same method as in http://www.cnblogs.com/sammyliu/p/4201143.html, we can draw a diagram of the network components on the Neutron node:

Observations:

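  (1). The roles of the three agents on the Neutron node:

  • Neutron-OVS-Agent: receives the tunnel and tunnel-flow configuration from the OVS plugin and drives OVS to build the GRE tunnels
  • Neutron-DHCP-Agent: configures dnsmasq for every network/subnet that has DHCP enabled, and writes the MAC address/IP address information into the dnsmasq DHCP lease file
  • Neutron-L3-Agent: sets up the iptables/routing/NAT tables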

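(2). Like the compute node, the Neutron node has the OVS tunnel bridge br-tun and the OVS integration bridge br-int. In addition it has br-ex, which provides the connection to the external network and is bound to the physical NIC eth0. One problem that shows up here is that eth0's IP address can no longer be pinged. The cause is that a physical Ethernet NIC that is part of an Open vSwitch bridge cannot carry an IP address; if one is configured, it simply has no effect. When that happens, network access can be restored by binding the IP address to an Open vSwitch "internal" device, which is the fix OVS recommends: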

ifconfig eth0 0.0.0.0 
ifconfig br-ex 192.168.1.19

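(3). Neutron uses Linux network namespaces to isolate the tenants' networks from each other. This example has six network namespaces, one qrouter and one qdhcp namespace for each of the three tenants; each namespace has its own interfaces, routing tables, iptables rules, and so on.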

root@network:/home/s1# ip netns
qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e
qrouter-e506f8fe-3260-4880-bd06-32246225aeae
qdhcp-d04a0a06-7206-4d05-9432-3443843bc199
qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1
qrouter-9ba04071-f32b-435e-8f44-e32936568102
qdhcp-0a4cd030-d951-401a-8202-937b788bea43
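
The namespace names encode their role: a qdhcp- prefix is followed by a network ID, and a qrouter- prefix is followed by a router ID. As a rough sketch (the helper name group_namespaces is made up for this illustration, and the input is simply the listing above pasted in as text), the output of ip netns can be grouped by role like this:

```python
from collections import defaultdict

# The `ip netns` listing from the Neutron node, pasted in as plain text.
SAMPLE = """\
qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e
qrouter-e506f8fe-3260-4880-bd06-32246225aeae
qdhcp-d04a0a06-7206-4d05-9432-3443843bc199
qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1
qrouter-9ba04071-f32b-435e-8f44-e32936568102
qdhcp-0a4cd030-d951-401a-8202-937b788bea43
"""


def group_namespaces(ip_netns_output):
    """Group namespace names by their Neutron prefix (qdhcp / qrouter)."""
    groups = defaultdict(list)
    for line in ip_netns_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]  # ignore anything printed after the name
        prefix, sep, resource_id = name.partition("-")
        if sep and prefix in ("qdhcp", "qrouter"):
            groups[prefix].append(resource_id)
        else:
            groups["other"].append(name)
    return dict(groups)


if __name__ == "__main__":
    for kind, ids in group_namespaces(SAMPLE).items():
        print(kind, len(ids))
```

The ID that follows qdhcp- is the network ID and the one that follows qrouter- is the router ID, so the result can be matched against the Neutron network and router lists.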

(4). Neutron assigns every network a local VLAN ID and a network namespace for DHCP. The DHCP server attaches to br-int through a tap device whose tag is that local VLAN ID, so ports H1/H2/H3 carry different VLAN IDs.

# Tags of the DHCP namespace ports on br-int when several networks exist
       Port "tap0f45d165-9f"
            tag: 5
            Interface "tap0f45d165-9f"
                type: internal
        Port "tap89874f55-97"
            tag: 4
            Interface "tap89874f55-97"
                type: internal
        Port "tap5522533d-fe"
            tag: 3
            Interface "tap5522533d-fe"
                type: internal   
        Port "tap56c9730c-9c"
            tag: 4095
            Interface "tap56c9730c-9c"
                type: internal                            
        Port "tap1fd04a93-09"
            tag: 4095
            Interface "tap1fd04a93-09"
                type: internal
        Port "tap777c1047-ed"
            tag: 2
            Interface "tap777c1047-ed"
                type: internal  
        Port "tap3fca96e0-c6"
            tag: 1
            Interface "tap3fca96e0-c6"
                type: internal  
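
To read the tags out of a listing like the one above without scanning it by eye, a small parser is enough. This is only a sketch: the function names are hypothetical, and it assumes the text has the Port / tag / Interface layout shown above. Tag 4095 lies outside the usable VLAN range (1 to 4094); my understanding is that the OVS agent parks ports there when they are not wired to a live network, but treat that as an assumption rather than something this post verifies.

```python
import re

# Assumption: 4095 marks ports the agent has not attached to a live network.
DEAD_VLAN_TAG = 4095

SAMPLE = '''
        Port "tap0f45d165-9f"
            tag: 5
            Interface "tap0f45d165-9f"
                type: internal
        Port "tap56c9730c-9c"
            tag: 4095
            Interface "tap56c9730c-9c"
                type: internal
        Port "tap3fca96e0-c6"
            tag: 1
            Interface "tap3fca96e0-c6"
                type: internal
'''


def parse_port_tags(listing):
    """Return {port_name: vlan_tag} for every port that has a tag line."""
    tags = {}
    current_port = None
    for raw in listing.splitlines():
        line = raw.strip()
        port = re.match(r'Port "?([^"]+)"?$', line)
        if port:
            current_port = port.group(1)
            continue
        tag = re.match(r"tag: (\d+)$", line)
        if tag and current_port is not None:
            tags[current_port] = int(tag.group(1))
    return tags


def active_ports(tags):
    """Drop ports that sit on the (assumed) dead VLAN."""
    return {p: t for p, t in tags.items() if t != DEAD_VLAN_TAG}
```

Calling active_ports(parse_port_tags(SAMPLE)) leaves only the ports with tags 5 and 1.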
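(5). I do not know why a direct path between br-ex and br-int is still needed. From what I have read online, it seems that not every environment uses this path. In the current environment, for example, all outbound traffic goes through the router to br-ex and should never go directly from br-int to br-ex. It may be needed in certain configurations. One explanation is that eth0 is the physical NIC for the VM network, in which case the path would be required.
  (6). Neutron-OVS-Agent reads the GRE endpoint information from the ml2_gre_endpoints table in the Neutron database. If that table contains a wrong IP address, a wrong GRE tunnel shows up on the Neutron node. The fix is to delete the bad record from the database and then restart the neutron-plugin-openvswitch-agent service.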
 

1.1 br-tun OpenFlow rules

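A quick aside on MAC address basics:

  • A MAC address is the 48-bit (6-byte) address used at Ethernet layer 2 to identify a device. It has two parts: the first 24 bits are the Organizationally Unique Identifier (OUI), and the last 24 bits are assigned by the vendor. It is usually written as six hexadecimal bytes, for example XX-XX-XX-XX-XX-XX.
  • Broadcast address: FF:FF:FF:FF:FF:FF
  • Multicast address: the least significant bit of the first byte is 1. For example, 01:80:C2:00:00:00 is a multicast address that denotes the 802.1d bridge group. Bridges use it to exchange configuration information with each other and to run the distributed spanning tree algorithm, which removes loops from the network topology.
  • Unicast address: the least significant bit of the first byte is 0. Every NIC is given a unique unicast address at the factory: the first 24 bits are the manufacturer's number, assigned by the IEEE (Institute of Electrical and Electronics Engineers), and the last 24 bits are a unique number the manufacturer gives the NIC. For example, 8C-70-5A-29-3A-48 is a unicast address (8C = 10001100).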
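
The unicast/multicast test on the first byte is the same bit that the br-tun table 2 rules in the dump that follows match with dl_dst=.../01:00:00:00:00:00. A minimal sketch of both checks, with hypothetical helper names (parse_mac, classify_mac, masked_match):

```python
def parse_mac(mac):
    """Parse 'XX:XX:XX:XX:XX:XX' or 'XX-XX-XX-XX-XX-XX' into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 bytes in MAC address: %r" % mac)
    return bytes(int(p, 16) for p in parts)


def classify_mac(mac):
    """Return 'broadcast', 'multicast' or 'unicast'."""
    raw = parse_mac(mac)
    if raw == b"\xff" * 6:
        return "broadcast"
    # Least significant bit of the first byte: 1 = group, 0 = individual.
    return "multicast" if raw[0] & 0x01 else "unicast"


def masked_match(mac, value, mask):
    """OpenFlow-style masked match: compare only the bits set in mask."""
    m, v, k = (int.from_bytes(parse_mac(x), "big") for x in (mac, value, mask))
    return (m & k) == (v & k)
```

With the mask 01:00:00:00:00:00 only that single bit is compared, so a value of 00:00:00:00:00:00 selects every unicast destination and a value of 01:00:00:00:00:00 selects every multicast or broadcast destination.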

root@network:/home/s1# ovs-ofctl dump-flows br-tun

NXST_FLOW reply (xid=0x4):

 cookie=0x0, duration=33.236s, table=0, n_packets=0, n_bytes=0, idle_age=33, priority=1,in_port=1 actions=resubmit(,2) // traffic coming in from H1 goes to table 2

 cookie=0x0, duration=32.131s, table=0, n_packets=0, n_bytes=0, idle_age=32, priority=1,in_port=2 actions=resubmit(,3) // traffic coming in from the GRE port goes to table 3

 cookie=0x0, duration=33.178s, table=0, n_packets=6, n_bytes=480, idle_age=24, priority=0 actions=drop

 cookie=0x0, duration=33.121s, table=2, n_packets=0, n_bytes=0, idle_age=33, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20) // unicast destination address: go to table 20

 cookie=0x0, duration=33.066s, table=2, n_packets=0, n_bytes=0, idle_age=33, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22) // multicast (including broadcast) destination address: go to table 22

 cookie=0x0, duration=30.614s, table=3, n_packets=0, n_bytes=0, idle_age=30, priority=1,tun_id=0x1 actions=mod_vlan_vid:1,resubmit(,10) // traffic from tunnel 1: set the VLAN ID to 1, then go to table 10

 cookie=0x0, duration=29.291s, table=3, n_packets=0, n_bytes=0, idle_age=29, priority=1,tun_id=0x2 actions=mod_vlan_vid:3,resubmit(,10) // traffic from tunnel 2: set the VLAN ID to 3, then go to table 10

 cookie=0x0, duration=30.241s, table=3, n_packets=0, n_bytes=0, idle_age=30, priority=1,tun_id=0x3 actions=mod_vlan_vid:2,resubmit(,10) // traffic from tunnel 3: set the VLAN ID to 2, then go to table 10

 cookie=0x0, duration=33.001s, table=3, n_packets=0, n_bytes=0, idle_age=33, priority=0 actions=drop

 cookie=0x0, duration=32.932s, table=4, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=drop

 cookie=0x0, duration=32.874s, table=10, n_packets=0, n_bytes=0, idle_age=32, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1 // learn a new rule and add it to table 20, then send the packet out port 1 into br-int

 cookie=0x0, duration=32.815s, table=20, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=resubmit(,22) // go to table 22

 cookie=0x0, duration=29.35s, table=22, n_packets=0, n_bytes=0, idle_age=29, dl_vlan=3 actions=strip_vlan,set_tunnel:0x2,output:2

 cookie=0x0, duration=30.293s, table=22, n_packets=0, n_bytes=0, idle_age=30, dl_vlan=2 actions=strip_vlan,set_tunnel:0x3,output:2

 cookie=0x0, duration=30.682s, table=22, n_packets=0, n_bytes=0, idle_age=30, dl_vlan=1 actions=strip_vlan,set_tunnel:0x1,output:2 // the three rules above set the tunnel ID according to the VLAN ID, strip the VLAN tag, and send the packet out the GRE port, through the GRE tunnel to the compute node

 cookie=0x0, duration=32.752s, table=22, n_packets=0, n_bytes=0, idle_age=32, priority=0 actions=drop

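To sum up, br-tun:

  • tags traffic arriving from the GRE port with the matching local VLAN ID and sends it to br-int
  • strips the VLAN ID from traffic arriving from br-int/patch-int, sets the matching tunnel ID, and sends it through the GRE port H1 to the compute node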
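
The two directions summarized above can be modeled in a few lines. This is a deliberately simplified sketch, not how OVS evaluates flows: the mappings are read off tables 3 and 22 of the dump, port 1 is taken to be the patch port to br-int and port 2 the GRE port (as the annotations suggest), and the MAC learning in tables 10 and 20 is left out. The function name br_tun is hypothetical.

```python
# Table 3 in the dump: tunnel ID -> local VLAN ID.
TUN_TO_VLAN = {0x1: 1, 0x2: 3, 0x3: 2}
# Table 22 in the dump is the reverse mapping: local VLAN ID -> tunnel ID.
VLAN_TO_TUN = {vlan: tun for tun, vlan in TUN_TO_VLAN.items()}

PATCH_INT_PORT = 1  # assumption: port 1 leads to br-int
GRE_PORT = 2        # assumption: port 2 is the GRE tunnel port


def br_tun(in_port, vlan=None, tun_id=None):
    """Return (out_port, vlan, tun_id) for a packet, or None if it is dropped."""
    if in_port == GRE_PORT:
        # Tables 3 and 10: tag with the local VLAN, hand over to br-int.
        local_vlan = TUN_TO_VLAN.get(tun_id)
        if local_vlan is None:
            return None  # table 3 default rule: drop
        return (PATCH_INT_PORT, local_vlan, None)
    if in_port == PATCH_INT_PORT:
        # Tables 2, 20 and 22: strip the VLAN, set the tunnel ID, send via GRE.
        tunnel = VLAN_TO_TUN.get(vlan)
        if tunnel is None:
            return None  # table 22 default rule: drop
        return (GRE_PORT, None, tunnel)
    return None  # table 0 default rule: drop
```

For example, br_tun(2, tun_id=0x2) yields a packet tagged with VLAN 3 on port 1, and br_tun(1, vlan=2) yields an untagged packet with tunnel ID 3 on port 2.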

2. Router Server

2.1 Taking the router of tenant-one (which has one VM) as an example, first look at its interfaces (lo omitted)

root@network:/home/s1# ip netns exec qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1 ip addr
22: qr-d3d3e235-d4: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether fa:16:3e:b3:06:e8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.11.1/24 brd 10.0.11.255 scope global qr-d3d3e235-d4
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feb3:6e8/64 scope link
       valid_lft forever preferred_lft forever
26: qg-6c06581b-bd: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether fa:16:3e:0b:ac:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.114/24 brd 192.168.1.255 scope global qg-6c06581b-bd
       valid_lft forever preferred_lft forever
    inet 192.168.1.115/32 brd 192.168.1.115 scope global qg-6c06581b-bd
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe0b:ac82/64 scope link
       valid_lft forever preferred_lft forever
Observations:
  • qg-6c06581b-bd connects to br-ex
  • qr-d3d3e235-d4 connects to br-int

Now look at its routing table:

root@network:/home/s1# ip netns exec qrouter-33e2b1bf-04cb-4811-9c58-7e03856022c1 route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 qg-6c06581b-bd 

// Default route: all traffic whose destination is not on a directly connected network goes out through the qg-6c06581b-bd interface to the external gateway 192.168.1.1

10.0.11.0       0.0.0.0         255.255.255.0   U     0      0        0 qr-d3d3e235-d4 

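// Traffic destined for the tenant subnet goes out through qr-d3d3e235-d4, the router's interface on that subnet, which holds the subnet gateway address 10.0.11.1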

192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 qg-6c06581b-bd

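// Traffic destined for 192.168.1.0/24 is delivered directly through qg-6c06581b-bd; the 0.0.0.0 gateway means the network is directly connected, so no next hop is needed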
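
The kernel picks the most specific matching route (longest prefix). The following sketch replays that choice for the three routes above using Python's standard ipaddress module. The ROUTES list is copied from the table and the lookup function is a hypothetical helper; None stands in for the 0.0.0.0 gateway of a directly connected network.

```python
import ipaddress

# (destination network, gateway or None if directly connected, interface)
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1", "qg-6c06581b-bd"),
    ("10.0.11.0/24", None, "qr-d3d3e235-d4"),
    ("192.168.1.0/24", None, "qg-6c06581b-bd"),
]


def lookup(dst):
    """Return (gateway, interface) of the longest-prefix route matching dst."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for net, gateway, dev in ROUTES:
        network = ipaddress.ip_network(net)
        if addr in network:
            matches.append((network.prefixlen, gateway, dev))
    # The default route matches everything, so matches is never empty here.
    _, gateway, dev = max(matches, key=lambda m: m[0])
    return gateway, dev
```

A destination such as 10.0.11.5 resolves to the qr- interface with no gateway, while anything outside both /24 networks falls through to the default route via 192.168.1.1.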

2.2 How Neutron floating IPs are implemented

The netfilter NAT table in the router namespace implements Neutron floating IPs. Below is the NAT table of the router of tenant-two (which has two VMs):

root@network:/home/s1# ip netns exec qrouter-e506f8fe-3260-4880-bd06-32246225aeae  iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 192.168.1.118/32 -j DNAT --to-destination 10.0.22.200
-A neutron-l3-agent-OUTPUT -d 192.168.1.117/32 -j DNAT --to-destination 10.0.22.202
-A neutron-l3-agent-POSTROUTING ! -i qg-cba7b139-04 ! -o qg-cba7b139-04 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 192.168.1.118/32 -j DNAT --to-destination 10.0.22.200
-A neutron-l3-agent-PREROUTING -d 192.168.1.117/32 -j DNAT --to-destination 10.0.22.202
-A neutron-l3-agent-float-snat -s 10.0.22.200/32 -j SNAT --to-source 192.168.1.118
-A neutron-l3-agent-float-snat -s 10.0.22.202/32 -j SNAT --to-source 192.168.1.117
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -s 10.0.22.0/24 -j SNAT --to-source 192.168.1.116
-A neutron-postrouting-bottom -j neutron-l3-agent-snat
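  • SNAT (source NAT) rewrites the source IP address of traffic coming from the VMs, i.e. the fixed IPs 10.0.22.200/202, to the floating IPs 192.168.1.118/117; the traffic is then routed to br-ex and on to the external network
  • DNAT (destination NAT) rewrites the destination IP address of traffic coming from the external network, i.e. the floating IPs 192.168.1.118/117, to the fixed IPs 10.0.22.200/202 used by the VMs; the traffic is then routed to br-int and on to the VMs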
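
The address rewriting done by these rules can be sketched as two lookups. The tables are copied from the iptables output above. The function names are hypothetical, and the sketch ignores connection tracking and the ACCEPT rule for traffic that does not cross the qg- interface.

```python
import ipaddress

# neutron-l3-agent-PREROUTING / OUTPUT: floating IP -> fixed IP
FLOATING_TO_FIXED = {
    "192.168.1.118": "10.0.22.200",
    "192.168.1.117": "10.0.22.202",
}
# neutron-l3-agent-float-snat: fixed IP -> floating IP
FIXED_TO_FLOATING = {fixed: fl for fl, fixed in FLOATING_TO_FIXED.items()}
# neutron-l3-agent-snat fallback for the rest of the subnet
SUBNET = ipaddress.ip_network("10.0.22.0/24")
ROUTER_SNAT_IP = "192.168.1.116"


def dnat(dst_ip):
    """Inbound: rewrite a floating destination IP to the VM's fixed IP."""
    return FLOATING_TO_FIXED.get(dst_ip, dst_ip)


def snat(src_ip):
    """Outbound: rewrite a fixed source IP to its floating IP, or fall back
    to the router's own address for VMs that have no floating IP."""
    if src_ip in FIXED_TO_FLOATING:
        return FIXED_TO_FLOATING[src_ip]
    if ipaddress.ip_address(src_ip) in SUBNET:
        return ROUTER_SNAT_IP
    return src_ip
```

The fallback mirrors the rule order in the neutron-l3-agent-snat chain: the per-VM float-snat rules are tried first, and only then does the subnet-wide SNAT to 192.168.1.116 apply.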

 3. DHCP Server

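Every network with DHCP enabled has a DHCP service on the Neutron node, and each DHCP server is a dnsmasq process running in its own network namespace. dnsmasq is a lightweight DNS and DHCP server for Linux; see http://www.thekelleys.org.uk/dnsmasq/docs/dnsmasq-man.html for details.

3.1 Each DHCP server is one process on the Neutron host, and its namespace is named qdhcp-<net id>: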

nobody    2049     1  0 06:43 ?        00:00:00 dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces --interface=tap15865c29-9b --except-interface=lo --pid-file=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host --addn-hosts=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/opts --leasefile-ro --dhcp-range=set:tag0,10.0.22.0,static,86400s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal

Notes:

1.  --interface=tap15865c29-9b: the process binds to and listens on a TAP device, H3 in the diagram above

2.  --dhcp-hostsfile=/var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host:

root@network:/home/s1# cat /var/lib/neutron/dhcp/d24963da-5221-481e-adf5-fe033d6e0b4e/host

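fa:16:3e:4d:6b:44,host-10-0-22-201.openstacklocal,10.0.22.201 // MAC address and IP of this subnet's DHCP server itself (M3)

fa:16:3e:79:07:5e,host-10-0-22-1.openstacklocal,10.0.22.1 // MAC address, name, and IP of this subnet's router (N3)

fa:16:3e:bf:69:36,host-10-0-22-200.openstacklocal,10.0.22.200 // MAC address, host name, and fixed IP of VM 1 on this subnet

fa:16:3e:19:65:62,host-10-0-22-202.openstacklocal,10.0.22.202 // MAC address, host name, and fixed IP of VM 2 on this subnet

fa:16:3e:88:99:c1,host-10-0-0-116.openstacklocal,10.0.0.116 // MAC address and IP address of subnet 1's DHCP server (H1). Why is there no corresponding entry for H2?

While a VM is being created, Neutron writes this information to the file (the IP is presumably an available address taken from the Neutron database). When the VM later uses its MAC address to ask the DHCP server for an IP address, dnsmasq reads the file and returns that address.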
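
The lookup dnsmasq performs against this file amounts to a table keyed by MAC address. A minimal sketch, assuming every line has the simple mac,hostname,ip layout shown above (real files may carry extra fields); parse_dhcp_hosts is a hypothetical helper, not part of dnsmasq or Neutron:

```python
# Contents of the host file shown above, without the annotations.
SAMPLE = """\
fa:16:3e:4d:6b:44,host-10-0-22-201.openstacklocal,10.0.22.201
fa:16:3e:79:07:5e,host-10-0-22-1.openstacklocal,10.0.22.1
fa:16:3e:bf:69:36,host-10-0-22-200.openstacklocal,10.0.22.200
fa:16:3e:19:65:62,host-10-0-22-202.openstacklocal,10.0.22.202
fa:16:3e:88:99:c1,host-10-0-0-116.openstacklocal,10.0.0.116
"""


def parse_dhcp_hosts(text):
    """Return {mac: (hostname, ip)} from lines of the form mac,hostname,ip."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, hostname, ip = line.split(",")[:3]
        table[mac.lower()] = (hostname, ip)
    return table


def ip_for_mac(table, mac):
    """Return the fixed IP reserved for mac, or None if it is unknown."""
    entry = table.get(mac.lower())
    return entry[1] if entry else None
```

With the sample data, the MAC address of VM 1 (fa:16:3e:bf:69:36) resolves to 10.0.22.200, and an unknown MAC address gets no address at all, which fits the static keyword in the --dhcp-range option above.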

3.2 DHCP interfaces (lo omitted)

root@network:/home/s1# ip netns exec qdhcp-0a4cd030-d951-401a-8202-937b788bea43 ip addr

18: tap6356d532-32: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default 

    link/ether fa:16:3e:88:99:c1 brd ff:ff:ff:ff:ff:ff

    inet 10.0.0.116/24 brd 10.0.0.255 scope global tap6356d532-32

       valid_lft forever preferred_lft forever

    inet6 fe80::f816:3eff:fe88:99c1/64 scope link 

       valid_lft forever preferred_lft forever

 

root@network:/home/s1# ip netns exec qdhcp-d04a0a06-7206-4d05-9432-3443843bc199 ip addr

17: tap8dfd0bd8-45: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default 

    link/ether fa:16:3e:82:fd:26 brd ff:ff:ff:ff:ff:ff

    inet 10.0.11.101/24 brd 10.0.11.255 scope global tap8dfd0bd8-45

       valid_lft forever preferred_lft forever

    inet6 fe80::f816:3eff:fe82:fd26/64 scope link 

       valid_lft forever preferred_lft forever


root@network:/home/s1# ip netns exec qdhcp-d24963da-5221-481e-adf5-fe033d6e0b4e ip addr
19: tap15865c29-9b: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether fa:16:3e:4d:6b:44 brd ff:ff:ff:ff:ff:ff
    inet 10.0.22.201/24 brd 10.0.22.255 scope global tap15865c29-9b
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe4d:6b44/64 scope link
       valid_lft forever preferred_lft forever

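The DHCP server takes the first available IP address in the fixed IP range as its own address. The MAC address of its interface, fa:16:3e:4d:6b:44, will appear in the br-tun rules.

3.3 A VM requesting/looking up its fixed IP from the DHCP server

The detailed steps are described in the next post.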
