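OpenFlow Controller: SDN splits the traditional network architecture into a Control Plane and a Data Plane. The OpenFlow Controller is the Control Plane; it uses an agreed-upon protocol to remotely manage OpenFlow Switches, adding, deleting, or modifying their Flow Entries.
OpenFlow Switch: a switch that implements the OpenFlow Switch specification.
- OpenFlow-only Switch: every packet can be processed only by the OpenFlow flow-table pipeline.
- OpenFlow-hybrid Switch: a switch that supports both the traditional network protocol stack and the OpenFlow protocol.
OpenFlow Channel: the interface an OpenFlow Switch exposes to the outside. It carries the protocol messages from the Remote Controller, which the controller uses to operate the switch.
OpenFlow Protocol: a communication protocol specification for message exchange between the Remote Controller and the OpenFlow Switch.
Flow Table: holds multiple Flow Entry records that control where packets go.
Group Table: complements the Flow Table with more advanced forwarding behavior, such as Flooding, Multipath, Fast Reroute, and Link Aggregation.
Meter Table: applies QoS policies to packets that match a flow entry.
Pipeline: a chain of Flow Tables that together determine the sequence of operations applied to a packet.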
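An OpenFlow switch forwards packets based on multiple flow tables (Flow Table) and one group table (Group Table), and communicates with the OpenFlow controller over the OpenFlow Channel. The controller can push configuration to the switch to add, delete, or update flow entries (Flow Entry) in the flow tables. The switch can also forward packets to the controller and let the controller decide how to handle them.
The OpenFlow specification mainly defines the functional modules of an OpenFlow switch and the communication channel between the switch and the controller. The specification is still evolving; this article is based on OpenFlow 1.3.5.
The OpenFlow specification is divided into four main parts: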
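An OpenFlow switch can be configured with up to 65280 ports. The specification divides switch ports into 3 categories:
OpenFlow currently defines 8 reserved ports: ALL, CONTROLLER, TABLE, IN_PORT, ANY, LOCAL, NORMAL, and FLOOD.
Reserved port types are encoded as 16-bit values (this encoding, like the 65280-port limit, comes from OpenFlow 1.0; OpenFlow 1.1 and later use 32-bit port numbers):
The last 3 (LOCAL, NORMAL, and FLOOD) are optional ports; NORMAL and FLOOD exist only on hybrid OpenFlow switches.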
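The reserved ports listed above can be written down as a small lookup table. The sketch below uses the 16-bit values from the OpenFlow 1.0 header file; in OpenFlow 1.3 the same names sit at the top of a 32-bit range instead. The class and function names (`ReservedPortV10`, `describe_port`) are made up for illustration.

```python
from enum import IntEnum


class ReservedPortV10(IntEnum):
    """Reserved port numbers as 16-bit values (OpenFlow 1.0 encoding)."""
    IN_PORT = 0xFFF8
    TABLE = 0xFFF9
    NORMAL = 0xFFFA
    FLOOD = 0xFFFB
    ALL = 0xFFFC
    CONTROLLER = 0xFFFD
    LOCAL = 0xFFFE
    NONE = 0xFFFF  # called ANY in later versions


OFPP_MAX = 0xFF00  # highest number usable for an ordinary port (65280)


def describe_port(port_no: int) -> str:
    """Return a readable name for a 16-bit port number."""
    if not 0 <= port_no <= 0xFFFF:
        raise ValueError("port number does not fit in 16 bits")
    try:
        return ReservedPortV10(port_no).name
    except ValueError:
        pass  # not a reserved port
    if 1 <= port_no <= OFPP_MAX:
        return f"port {port_no}"
    return "unassigned"
```

Note that `OFPP_MAX` is exactly the 65280 limit mentioned above.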
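An OpenFlow switch can have multiple flow tables, and each flow table can contain multiple flow entries. A packet is matched against the entries of each table it traverses, so a single packet may match several flow entries across tables.
**OpenFlow matches and processes packets using user-defined or preset flow entries.** All flow entries are organized into Flow Tables, and within a Flow Table they are matched in order of priority. An OpenFlow switch contains at least one Flow Table and may contain several, numbered sequentially from 0. The specification defines a pipelined processing flow: a packet entering the switch must start matching at Flow Table 0. A flow entry can use the Goto-Table instruction to jump forward, possibly skipping tables, but only to a higher-numbered Flow Table, never back to a lower-numbered one. When a packet matches a flow entry, the switch first updates that entry's statistics (its counters, such as the total number of matched packets and bytes) and then carries out the entry's instructions, for example jumping to a later Flow Table for further processing, or modifying or immediately executing the packet's Action Set. When the packet is in the last Flow Table, all Actions in its Action Set are executed, such as forwarding to a port, modifying a header field, or dropping the packet. The specification describes and defines all currently supported Instructions and Actions in detail.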
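The pipeline described above can be sketched in a few lines of Python. This is a simplified model, not switch code: packets and matches are plain dictionaries, the action set is a dictionary keyed by action type, and `FlowEntry`, `run_pipeline`, and the example `tables` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FlowEntry:
    priority: int
    match: dict                                        # field name -> required value
    write_actions: dict = field(default_factory=dict)  # merged into the action set
    goto_table: Optional[int] = None                   # None ends the pipeline
    packets: int = 0                                   # per-entry counter


def matches(entry, packet):
    """An entry matches when every field it names has the required value."""
    return all(packet.get(k) == v for k, v in entry.match.items())


def run_pipeline(tables, packet):
    """Walk the tables starting at table 0 and return the final action set."""
    action_set = {}
    table_id = 0
    while table_id is not None:
        candidates = [e for e in tables.get(table_id, []) if matches(e, packet)]
        if not candidates:
            return {}  # nothing matched in this table: treat as drop
        entry = max(candidates, key=lambda e: e.priority)
        entry.packets += 1                       # update counters first
        action_set.update(entry.write_actions)   # then apply the instructions
        if entry.goto_table is not None and entry.goto_table <= table_id:
            raise ValueError("Goto-Table must target a higher-numbered table")
        table_id = entry.goto_table
    return action_set


tables = {
    0: [FlowEntry(10, {"in_port": 1}, {"set_vlan": 100}, goto_table=2)],
    2: [FlowEntry(5, {"eth_type": 0x0800}, {"output": 3})],
}
```

Calling `run_pipeline(tables, {"in_port": 1, "eth_type": 0x0800})` walks table 0, skips table 1, and finishes in table 2 with both actions accumulated in the action set.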
OpenFlow pipeline (Pipeline Processing):
An OpenFlow flow entry consists of the following parts:
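The main Actions in an Action Set include:
A Set-Field Action can be one of the following types:
NOTE: To implement QoS, CoS (Class of Service) is used together with ToS. ToS is a field in the IPv4 header; Set VLAN priority corresponds to CoS.
A flow entry's match fields can match on any field of a packet's L2, L3, or L4 headers, for example the source MAC address of an Ethernet frame, the protocol type and IP addresses of an IP packet, or the TCP/UDP port numbers. The OpenFlow specification also lets switch vendors optionally support wildcard matching.
Match fields in v1.0 (v1.3 defines more field types):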
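Packet processing flow inside an OpenFlow switch:
The switch first parses the incoming packet and starts matching at Table 0, checking that table's flow entries from highest to lowest priority. Within one flow table a packet matches at most one flow entry. The matched entry's instructions decide whether the packet continues to another table: if not, matching ends and the action set is executed; if so, matching continues in the target table, whose ID must be greater than the current table's ID. If no entry matches and the table has a table-miss entry, the switch follows that entry's instructions, typically sending the packet to the OpenFlow controller, dropping it, or passing it to another flow table. If there is no table-miss entry, the packet is dropped by default. A table-miss entry usually has all match fields empty (fully wildcarded) and priority 0.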
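The single-table lookup, including the table-miss case, could look roughly like the sketch below. It models entries as dictionaries and action names as strings; `lookup`, `handle`, and `table0` are illustrative names only, and the table-miss entry is simply the entry with an empty match and priority 0.

```python
def lookup(table, packet):
    """Return the highest-priority entry that matches the packet, or None."""
    best = None
    for entry in table:
        if all(packet.get(k) == v for k, v in entry["match"].items()):
            if best is None or entry["priority"] > best["priority"]:
                best = entry
    return best


def handle(table, packet):
    """Return the actions to run; with no match at all the packet is dropped."""
    entry = lookup(table, packet)
    return entry["actions"] if entry else ["drop"]


table0 = [
    {"priority": 100, "match": {"eth_type": 0x0806}, "actions": ["FLOOD"]},
    {"priority": 0, "match": {}, "actions": ["CONTROLLER"]},  # table-miss entry
]
```

Because the empty match is satisfied by every packet, the table-miss entry only wins when no higher-priority entry matches.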
Modes in which the OpenFlow controller and the OpenFlow switch interact over flow entries:
OpenFlow flow table example:
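Besides processing packets directly in a flow entry, a flow entry can also hand packets to the Group Table for forwarding. Several flow entries can reference the same group in their actions so that the same actions are applied to all their packets, which simplifies flow-table maintenance.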
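To illustrate why sharing a group simplifies maintenance, here is a small sketch in which two flow entries reference the same group. Only the `all` group type (run every bucket) is modelled; the dictionary layout and the `expand` helper are assumptions made for this example.

```python
groups = {
    1: {"type": "all", "buckets": [["output:2"], ["output:3"]]},
}

flows = [
    {"match": {"ip_dst": "10.0.0.1"}, "actions": ["group:1"]},
    {"match": {"ip_dst": "10.0.0.2"}, "actions": ["group:1"]},
]


def expand(actions, groups):
    """Resolve group references; an 'all' group runs every bucket."""
    out = []
    for action in actions:
        if action.startswith("group:"):
            group = groups[int(action.split(":", 1)[1])]
            if group["type"] != "all":
                raise NotImplementedError(group["type"])
            for bucket in group["buckets"]:
                out.extend(bucket)
        else:
            out.append(action)
    return out
```

Adding a bucket to group 1 changes the behavior of both flow entries at once, without touching either entry.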
OpenFlow 1.3 also adds the Meter table. A meter is attached to flow entries and applies a QoS policy to the packets that match them. Its fields are:
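As a rough illustration of what a meter with a drop band does, the sketch below rate-limits packets with a fixed one-second window. Real switches typically use token buckets, so this is a deliberately simplified stand-in; `DropMeter` and its parameters are hypothetical names.

```python
class DropMeter:
    """Drops packets once the bytes seen in the current 1-second window exceed the rate."""

    def __init__(self, rate_kbps: int):
        self.limit_bytes = rate_kbps * 1000 // 8  # bytes allowed per second
        self.window = None                        # index of the current window
        self.used = 0                             # bytes accepted in this window

    def allow(self, now: float, size: int) -> bool:
        """Return True if a packet of `size` bytes arriving at `now` may pass."""
        window = int(now)
        if window != self.window:                 # new one-second window: reset usage
            self.window, self.used = window, 0
        if self.used + size > self.limit_bytes:
            return False                          # over the band rate: drop
        self.used += size
        return True
```

Timestamps are passed in explicitly so the behavior is deterministic and easy to test.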
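This part of the OpenFlow specification defines how an OpenFlow Switch establishes a connection with the Controller, how they communicate, and the message types involved.
Once an OpenFlow switch has started, it can establish a connection with the OpenFlow controller; this connection is called the "OpenFlow channel". The connection is initiated by the switch toward the controller. For security and high availability, the specification also covers how to encrypt the channel between Controller and Switch and how to set up multiple connections (a main connection plus auxiliary connections). The OpenFlow channel supports TLS: the controller and the switch authenticate each other with server and client certificates, and the default TCP port for the secure channel is 6633 (later revisions of the specification use the IANA-assigned port 6653). After the switch has connected, the controller sometimes performs topology discovery for the OpenFlow network, which can be done with LLDP (Link Layer Discovery Protocol).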
OpenFlow message header:
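Every OpenFlow message starts with the same 8-byte header: an 8-bit version, an 8-bit type, a 16-bit total length, and a 32-bit transaction id (xid), all in network byte order. A minimal sketch of packing and parsing it with Python's `struct` module follows; the helper names `pack_header` and `unpack_header` are illustrative.

```python
import struct

OFP_HEADER = struct.Struct("!BBHI")  # version, type, length, xid
OFP_VERSION_1_3 = 0x04               # wire version used by OpenFlow 1.3
OFPT_HELLO = 0


def pack_header(msg_type, xid, body=b"", version=OFP_VERSION_1_3):
    """Build a message: the length field covers the header plus the body."""
    return OFP_HEADER.pack(version, msg_type, OFP_HEADER.size + len(body), xid) + body


def unpack_header(data):
    """Parse the first 8 bytes of a message into a dictionary."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(data)
    return {"version": version, "type": msg_type, "length": length, "xid": xid}
```

A bare Hello message is just this header with an empty body, so its length field is 8.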
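Overview of OpenFlow message types (v1.0):
The OpenFlow specification defines three kinds of messages, each with several subtypes. The controller and the switch use these three kinds of messages to establish connections, install flow entries, and exchange information, which is how the controller controls every OpenFlow switch in the network.
Controller-to-Switch messages: initiated by the Controller and received and processed by the Switch; they mainly include the following. The Controller uses them to query Switch state, change its configuration, and so on.
Asynchronous messages: sent by the Switch to the Controller to report asynchronous events occurring on the Switch; they mainly include the following. For example, when a rule is deleted because it timed out, the Switch automatically sends a Flow-Removed message so that the Controller can react, for instance by reinstalling the rule.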
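The Flow-Removed example can be sketched as a periodic expiry check on the switch side. This simplified model ignores the flag that controls whether a Flow-Removed message is sent at all, and it treats a timeout of 0 as "never expires"; the dictionary keys and the `expire_flows` helper are assumptions for illustration.

```python
def expire_flows(flows, now):
    """Remove timed-out flows in place and return Flow-Removed notifications."""
    removed = []
    for flow in list(flows):  # iterate over a copy because we mutate `flows`
        idle, hard = flow["idle_timeout"], flow["hard_timeout"]
        if hard and now - flow["installed_at"] >= hard:
            reason = "hard_timeout"
        elif idle and now - flow["last_matched_at"] >= idle:
            reason = "idle_timeout"
        else:
            continue
        flows.remove(flow)
        removed.append({"type": "FLOW_REMOVED", "cookie": flow["cookie"], "reason": reason})
    return removed
```

The returned notifications are what the switch would send to the controller, which can then decide whether to reinstall the rule.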
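Symmetric messages: as the name suggests, these are bidirectional messages that either side can send. They are mainly used to establish the connection and to check whether the peer is still alive, and include the following three messages.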
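The liveness check works by echoing: one side sends an Echo Request and the other answers with an Echo Reply that carries the same xid and payload. A minimal sketch, assuming the standard 8-byte header and the type codes 2 (Echo Request) and 3 (Echo Reply); `make_echo_reply` is an illustrative helper name.

```python
import struct

OFP_HEADER = struct.Struct("!BBHI")  # version, type, length, xid
OFPT_ECHO_REQUEST = 2
OFPT_ECHO_REPLY = 3


def make_echo_reply(request: bytes) -> bytes:
    """Build the Echo Reply for an Echo Request, copying xid and payload."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(request)
    if msg_type != OFPT_ECHO_REQUEST:
        raise ValueError("not an echo request")
    payload = request[OFP_HEADER.size:length]
    reply_length = OFP_HEADER.size + len(payload)
    return OFP_HEADER.pack(version, OFPT_ECHO_REPLY, reply_length, xid) + payload
```

Either the controller or the switch can be the sender of the request, which is what makes the message symmetric.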
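The figure below shows a typical message exchange between an OpenFlow Controller and a Switch:
The last part of the OpenFlow specification defines the data structures of the various OpenFlow messages in detail, including the message header. They are not repeated here; for details, see the data-structure definitions in the openflow.h header file of the OpenFlow source code.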
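The evolution of the OpenFlow protocol has always revolved around two goals: enhancing the control plane, so that the system is richer in function and more flexible, and enhancing the forwarding plane, so that more fields can be matched and more actions executed. Every later version improved on its predecessor to some degree, but versions from OpenFlow 1.1 onward are not compatible with earlier ones. To give the industry a stable platform, the ONF, the organization that maintains the protocol, designated OpenFlow 1.0 and 1.3 as long-term-support stable versions; for some time, later versions are expected to stay compatible with them.
OpenFlow 1.0 specifies that each OpenFlow switch has a single flow table, used for packet lookup, processing, and forwarding, and that the switch can communicate with only one controller. The flow table is maintained by the controller through OpenFlow messages. A flow table consists of multiple flow entries, each of which is a forwarding rule made up of header fields, counters, and actions.
OpenFlow 1.1 added support for multiple flow tables (256 levels), splitting the matching process into several steps and forming a pipeline. This makes effective and flexible use of the multi-table structure inherent in the hardware, and distributing packet processing across tables avoids the bloat of a single flow table. OpenFlow 1.1 also added VLAN and MPLS tag handling and the Group table: different flow entries can reference the same group in their actions to apply the same actions to packets, which simplifies flow-table maintenance. OpenFlow 1.1 is a watershed in the protocol's history: it is not compatible with OpenFlow 1.0, but all later versions build on it. OpenFlow 1.1 also renamed the header fields to match fields.
To make the protocol more extensible, OpenFlow 1.2 stopped defining the match fields of a rule as a fixed-length structure and adopted a TLV structure instead, called OXM (OpenFlow Extensible Match). Users can thus flexibly specify their own match fields, which adds more matchable fields while saving flow-table space. OpenFlow 1.2 also allows multiple controllers to connect to the same switch for reliability, and the controllers can change their roles by sending messages. Another important point is that IPv6 is supported from OpenFlow 1.2 onward.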
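An OXM TLV consists of a 32-bit header followed by the value (and an optional mask of the same size). The header packs a 16-bit class, a 7-bit field number, a 1-bit has-mask flag, and an 8-bit payload length. The sketch below encodes two fields of the basic OpenFlow class; the function name `oxm_tlv` is made up for this example.

```python
import struct

OFPXMC_OPENFLOW_BASIC = 0x8000  # class for the standard match fields
OFPXMT_OFB_IN_PORT = 0          # 4-byte ingress port
OFPXMT_OFB_ETH_TYPE = 5         # 2-byte Ethernet type


def oxm_tlv(field, value: bytes, mask: bytes = b"", oxm_class=OFPXMC_OPENFLOW_BASIC):
    """Encode one OXM TLV: 32-bit header, then value and optional mask."""
    if mask and len(mask) != len(value):
        raise ValueError("mask must be the same length as the value")
    payload = value + mask
    has_mask = 1 if mask else 0
    header = (oxm_class << 16) | (field << 9) | (has_mask << 8) | len(payload)
    return struct.pack("!I", header) + payload
```

For example, `oxm_tlv(OFPXMT_OFB_ETH_TYPE, struct.pack("!H", 0x0800))` produces the six bytes `80 00 0a 02 08 00`: only the fields actually matched take up space, unlike the fixed-length match structure of earlier versions.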
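Building on 1.1 and 1.2, OpenFlow 1.3, released in April 2012, became the long-term-support stable version. Its flow tables support 40 match fields, enough for existing network applications. OpenFlow 1.3 also added the Meter table, which controls the transmission rate of packets for the associated flow entries, although the control is still fairly simple. It improved version negotiation, letting the switch and controller agree on an OpenFlow protocol version based on their capabilities. Connection setup gained auxiliary connections, which improve the switch's processing efficiency and allow applications to run in parallel. Other additions include IPv6 extension headers and the table-miss entry.
OpenFlow 1.4, released in 2013, is still an incremental improvement over 1.3. The forwarding plane changed little. The main additions are a flow-table synchronization mechanism, in which multiple flow tables share the same match fields but can define different actions, and Bundle messages, which ensure a consistent state when the controller sends a complete group of messages or sends messages to multiple switches at once. It also supports optical port property descriptions, flow monitoring for multiple controllers, and other features.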
Packet type identification (Ethernet packets, PPP packets) and the egress table.
NOTE: idle_timeout is not included in the output of ovs-ofctl dump-flows br_name.
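Common fields:
NOTE: In the layered network model, a higher-layer field must not be given a specific value unless the lower-layer field it depends on is also given a specific value. In other words, a flow rule may set lower-layer protocol fields to specific values while leaving higher-layer fields wildcarded or unspecified (matching any value), but it may not set a higher-layer field to a specific value while leaving the lower-layer field wildcarded or unspecified. Otherwise, all flow rules in ovs-vswitchd are lost and the network becomes unreachable.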
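The layering rule in the NOTE above can be checked before a rule is installed. The sketch below covers only a handful of fields and ignores the ovs-ofctl shorthands (such as ip or tcp) that set the lower-layer fields implicitly; the `PREREQS` table and `missing_prerequisites` are illustrative, not part of any Open vSwitch API.

```python
# Illustrative subset: each field maps to the lower-layer field it depends on.
PREREQS = {
    "nw_src": "dl_type",
    "nw_dst": "dl_type",
    "nw_proto": "dl_type",
    "tp_src": "nw_proto",
    "tp_dst": "nw_proto",
}


def missing_prerequisites(match):
    """Return (field, required_field) pairs whose lower-layer field is not set."""
    problems = []
    for name in match:
        required = PREREQS.get(name)
        while required is not None:  # walk down the dependency chain
            if required not in match:
                problems.append((name, required))
            required = PREREQS.get(required)
    return problems
```

A match such as `{"dl_type": 0x0800, "nw_proto": 6, "tp_dst": 80}` passes the check, while `{"tp_dst": 80}` on its own is reported as missing both `nw_proto` and `dl_type`.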
Details:
Field (key/value) | Meaning |
---|---|
in_port=port | Matches OpenFlow port port |
dl_vlan=vlan | Matches IEEE 802.1q Virtual LAN tag vlan. |
dl_vlan_pcp=priority | Matches IEEE 802.1q Priority Code Point (PCP) priority, which is specified as a value between 0 and 7, inclusive. A higher value indicates a higher frame priority level. |
dl_src=xx:xx:xx:xx:xx:xx dl_dst=xx:xx:xx:xx:xx:xx | Matches an Ethernet source (or destination) address specified as 6 pairs of hexadecimal digits delimited by colons (e.g. 00:0A:E4:25:6B:B0). |
dl_src=xx:xx:xx:xx:xx:xx/xx:xx:xx:xx:xx:xx dl_dst=xx:xx:xx:xx:xx:xx/xx:xx:xx:xx:xx:xx | Matches an Ethernet source (or destination) address specified as 6 pairs of hexadecimal digits delimited by colons (e.g. 00:0A:E4:25:6B:B0), with a wildcard mask following the slash. Useful masks: 01:00:00:00:00:00 matches only the multicast bit, so dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 matches all multicast (including broadcast) Ethernet packets and dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 matches all unicast Ethernet packets; ff:ff:ff:ff:ff:ff is an exact match (equivalent to omitting the mask); 00:00:00:00:00:00 wildcards all bits (equivalent to dl_dst=*). |
dl_type=ethertype | Matches Ethernet protocol type ethertype, which is specified as an integer between 0 and 65535 |
nw_src=ip[/netmask] nw_dst=ip[/netmask] | When dl_type is 0x0800 (possibly via shorthand, e.g. ip or tcp), matches IPv4 source (or destination) address ip, which may be specified as an IP address or host name. When dl_type=0x0806 or arp is specified, matches the ar_spa or ar_tpa field, respectively, in ARP packets for IPv4 and Ethernet. When dl_type=0x8035 or rarp is specified, matches the ar_spa or ar_tpa field, respectively, in RARP packets for IPv4 and Ethernet. |
nw_proto=proto | When ip or dl_type=0x0800 is specified, matches IP protocol type proto, which is specified as a decimal number between 0 and 255, inclusive (e.g. 1 to match ICMP packets or 6 to match TCP packets). When ipv6 or dl_type=0x86dd is specified, matches IPv6 header type proto, which is specified as a decimal number between 0 and 255, inclusive (e.g. 58 to match ICMPv6 packets or 6 to match TCP). When arp or dl_type=0x0806 is specified, matches the lower 8 bits of the ARP opcode. When rarp or dl_type=0x8035 is specified, matches the lower 8 bits of the ARP opcode. |
nw_tos=tos | Matches IP ToS/DSCP or IPv6 traffic class field tos, which is specified as a decimal number between 0 and 255, inclusive. |
nw_ecn=ecn | Matches ecn bits in IP ToS or IPv6 traffic class fields, which is specified as a decimal number between 0 and 3, inclusive. |
nw_ttl=ttl | Matches IP TTL or IPv6 hop limit value ttl, which is specified as a decimal number between 0 and 255, inclusive. |
tp_src=port tp_dst=port | When dl_type and nw_proto specify TCP or UDP, tp_src and tp_dst match the UDP or TCP source or destination port port |
icmp_type=type icmp_code=code | When dl_type and nw_proto specify ICMP or ICMPv6, type matches the ICMP type and code matches the ICMP code. |
table=number | If specified, limits the flow manipulation and flow dump commands to only apply to the table with the given number between 0 and 254. |
vlan_tci=tci[/mask] | Matches modified VLAN TCI tci. If mask is omitted, tci is the exact VLAN TCI to match; if mask is specified, then a 1-bit in mask indicates that the corresponding bit in tci must match exactly, and a 0-bit wildcards that bit. |
ip_frag=frag_type | When dl_type specifies IP or IPv6, frag_type specifies what kind of IP fragments or non-fragments to match. The following values of frag_type are supported: no Matches only non-fragmented packets. yes Matches all fragments. first Matches only fragments with offset 0. later Matches only fragments with nonzero offset. not_later Matches non-fragmented packets and fragments with zero offset. |
arp_sha=xx:xx:xx:xx:xx:xx arp_tha=xx:xx:xx:xx:xx:xx | When dl_type specifies either ARP or RARP, arp_sha and arp_tha match the source and target hardware address, respectively. |
tun_id=tunnel-id[/mask] | Matches tunnel identifier tunnel-id. Only packets that arrive over a tunnel that carries a key (e.g. GRE with the RFC 2890 key extension and a nonzero key value) will have a nonzero tunnel ID. |
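
The masked dl_src/dl_dst match in the table above boils down to a bitwise comparison: only the bits set in the mask have to agree. A small sketch, with `mac_to_int` and `mac_matches` as illustrative helper names:

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def mac_matches(addr: str, value: str, mask: str = "ff:ff:ff:ff:ff:ff") -> bool:
    """True if addr equals value on every bit that is set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(addr) & m) == (mac_to_int(value) & m)


MULTICAST_BIT = "01:00:00:00:00:00"
```

With `value` and `mask` both set to `MULTICAST_BIT`, the function accepts every multicast and broadcast address; with `value` set to all zeros and the same mask, it accepts every unicast address.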
Common fields:
NOTE: A flow rule can have multiple actions; they are executed one after another in the order specified.
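Because actions run in the order given, the position of an output action relative to a modification matters. The sketch below interprets a tiny subset of action strings on a packet modelled as a dictionary; `apply_actions` is a hypothetical helper, not Open vSwitch code.

```python
def apply_actions(packet, actions):
    """Run actions left to right; return a (port, packet snapshot) pair per output."""
    packet = dict(packet)  # work on a copy so the caller's packet is untouched
    sent = []
    for action in actions:
        name, _, arg = action.partition(":")
        if name == "mod_vlan_vid":
            packet["vlan_vid"] = int(arg)
        elif name == "strip_vlan":
            packet.pop("vlan_vid", None)
        elif name == "output":
            sent.append((int(arg), dict(packet)))
        else:
            raise ValueError(f"unsupported action: {action}")
    return sent
```

With the actions `output:1,mod_vlan_vid:20,output:2`, the copy sent to port 1 keeps the original VLAN id and only the copy sent to port 2 carries VLAN 20.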
Details:
Field (key/value) | Meaning |
---|---|
output:port | Outputs the packet to port |
output:src[start..end] | Outputs the packet to the OpenFlow port number read from src, which must be an NXM field as described above. For example, output:NXM_NX_REG0[16..31] outputs to the OpenFlow port number written in the upper half of register 0. |
enqueue:port:queue | Enqueues the packet on the specified queue within port port |
normal | Subjects the packet to the device’s normal L2/L3 processing. |
flood | Outputs the packet on all switch physical ports other than the port on which it was received and any ports on which flooding is disabled |
all | Outputs the packet on all switch physical ports other than the port on which it was received. |
controller(key=value...) | Sends the packet to the OpenFlow controller as a "packet in" message. The supported key-value pairs are: max_len=nbytes: Limit to nbytes the number of bytes of the packet to send to the controller. By default the entire packet is sent. reason=reason: Specify reason as the reason for sending the message in the "packet in" message. The supported reasons are action (the default), no_match, and invalid_ttl. id=controller-id: Specify controller-id |
in_port | Outputs the packet on the port from which it was received. |
drop | Discards the packet, so no further processing or forwarding takes place. |
mod_vlan_vid:vlan_vid | Modifies the VLAN id on a packet. |
mod_vlan_pcp:vlan_pcp | Modifies the VLAN priority on a packet. |
strip_vlan | Strips the VLAN tag from a packet if it is present. |
push_vlan:ethertype | Push a new VLAN tag onto the packet. |
push_mpls:ethertype | If the packet does not already contain any MPLS labels, changes the packet’s Ethertype to ethertype, which must be either the MPLS unicast Ethertype 0x8847 or the MPLS multicast Ethertype 0x8848, and then pushes an initial label stack entry. |
pop_mpls:ethertype | Strips the outermost MPLS label stack entry. |
mod_dl_src:mac | Sets the source Ethernet address to mac. |
mod_dl_dst:mac | Sets the destination Ethernet address to mac. |
mod_nw_src:ip | Sets the IPv4 source address to ip. |
mod_nw_dst:ip | Sets the IPv4 destination address to ip. |
mod_tp_src:port | Sets the TCP or UDP source port to port. |
mod_tp_dst:port | Sets the TCP or UDP destination port to port. |
mod_nw_tos:tos | Sets the IPv4 ToS/DSCP field to tos, which must be a multiple of 4 between 0 and 255. |
resubmit([port],[table]) | Re-searches this OpenFlow flow table (or the table whose number is specified by table) with the in_port field replaced by port (if port is specified) |
set_tunnel:id set_tunnel64:id | If outputting to a port that encapsulates the packet in a tunnel and supports an identifier (such as GRE), sets the identifier to id. |
set_queue:queue | Sets the queue that should be used to queue when packets are output. |
pop_queue | Restores the queue to the value it was before any set_queue actions were applied. |
dec_ttl dec_ttl[(id1,id2)] | Decrement TTL of IPv4 packet or hop limit of IPv6 packet. |
set_mpls_ttl:ttl | Set the TTL of the outer MPLS label stack entry of a packet. ttl should be in the range 0 to 255 inclusive. |
dec_mpls_ttl | Decrement TTL of the outer MPLS label stack entry of a packet. |
move:src[start..end]->dst[start..end] | Copies the named bits from field src to field dst. src and dst must be NXM field names as defined in nicira-ext.h, e.g. NXM_OF_UDP_SRC or NXM_NX_REG0. Examples: move:NXM_NX_REG0[0..5]->NXM_NX_REG1[26..31] copies the six bits numbered 0 through 5, inclusive, in register 0 into bits 26 through 31, inclusive; move:NXM_NX_REG0[0..15]->NXM_OF_VLAN_TCI[] copies the least significant 16 bits of register 0 into the VLAN TCI field. |
load:value->dst[start..end] | Writes value to bits start through end, inclusive, in field dst. Example: load:55->NXM_NX_REG2[0..5] loads value 55 (bit pattern 110111) into bits 0 through 5, inclusive, in register 2. |
push:src[start..end] | Pushes start to end bits inclusive, in fields on top of the stack. Example: push:NXM_NX_REG2[0..5] push the value stored in register 2 bits 0 through 5, inclusive, on to the internal stack. |
pop:dst[start..end] | Pops from the top of the stack, retrieves the start to end bits inclusive, from the value popped and store them into the corresponding bits in dst. Example: pop:NXM_NX_REG2[0..5] pops the value from top of the stack. Set register 2 bits 0 through 5, inclusive, based on bits 0 through 5 from the value just popped. |
set_field:value->dst | Writes the literal value into the field dst, which should be specified as a name used for matching. Example: set_field:fe80:0123:4567:890a:a6ba:dbff:fefe:59fa->ipv6_src |
learn(argument[,argument]...) | This action adds or modifies a flow in an OpenFlow table, similar to ovs-ofctl --strict mod-flows. The arguments specify the flow’s match fields, actions, and other properties, as follows |
idle_timeout=seconds hard_timeout=seconds priority=value | These key-value pairs have the same meaning as in the usual ovs−ofctl flow syntax. |
fin_idle_timeout=seconds fin_hard_timeout=seconds | Adds a fin_timeout action with the specified arguments to the new flow. |
table=number | The table in which the new flow should be inserted. Specify a decimal number between 0 and 254. The default, if table is unspecified, is table 1. |
field=value field[start..end]=src[start..end] field[start..end] | Adds a match criterion to the new flow. |
load:value->dst[start..end] load:src[start..end]->dst[start..end] | Adds a load action to the new flow. |
output:field[start..end] | Add an output action to the new flow’s actions, that outputs to the OpenFlow port taken from field[start..end], which must be an NXM field as described above. |
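
The bit-range notation used by load and move in the table above can be mimicked with ordinary shifts and masks. The sketch below treats a register as a 32-bit integer; `load_bits` and `move_bits` are illustrative helpers that only model the semantics described in the table, not Open vSwitch code.

```python
def load_bits(register: int, value: int, start: int, end: int) -> int:
    """Write value into bits start..end (inclusive) of a 32-bit register."""
    width = end - start + 1
    if value >> width:
        raise ValueError("value does not fit in the bit range")
    mask = ((1 << width) - 1) << start
    return (register & ~mask & 0xFFFFFFFF) | (value << start)


def move_bits(src: int, src_start: int, src_end: int, dst: int, dst_start: int) -> int:
    """Copy bits src_start..src_end of src into dst, beginning at dst_start."""
    width = src_end - src_start + 1
    field = (src >> src_start) & ((1 << width) - 1)
    return load_bits(dst, field, dst_start, dst_start + width - 1)
```

`load_bits(0, 55, 0, 5)` corresponds to the table's load example (bit pattern 110111 in bits 0 through 5), and `move_bits(reg0, 0, 5, reg1, 26)` corresponds to the move example that copies six bits into bits 26 through 31.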
http://net.zol.com.cn/461/4610667.html
https://www.sdnlab.com/sdn-guide/14716.html
http://www.just4coding.com/blog/2016/12/31/introducing-openflow/
https://www.li-rui.top/2018/12/01/network/openflow介紹/
https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-switch-v1.3.5.pdf
https://www.sdnlab.com/14484.html