Reference: http://www.hyper-v.nu/archives/marcve/2013/01/lbfo-hyper-v-switch-qos-and-actual-performance-part-1/
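An EtherChannel can be established using one of three mechanisms: PAgP (Cisco-proprietary), LACP (IEEE 802.3ad), or static configuration (mode on).
Before EtherChannel is configured: STP blocks the redundant ports.
After it is configured: STP treats the bundle as one logical port, so all member links stay active.
Why can't packets within a single session be load balanced across links? A session is split into many packets in transit and reassembled at the destination, so the packets must keep a definite order. If the order is scrambled, the receiver either has to buffer and reorder the packets or, for protocols without sequencing, ends up with meaningless data. This is why all packets of one session must travel in order over the same physical link. As a result, a 10 Gb aggregate built from ten 1 Gb links is never as fast or as efficient as a single 10 Gb link: any one session is limited to 1 Gb.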
Cisco's description: EtherChannel reduces part of the binary pattern that the addresses in the frame form to a numerical value that selects one of the links in the channel in order to distribute frames across the links in a channel. EtherChannel frame distribution uses a Cisco-proprietary hashing algorithm. The algorithm is deterministic; if you use the same addresses and session information, you always hash to the same port in the channel. This method prevents out-of-order packet delivery.
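The determinism described above can be illustrated with a small sketch. Cisco's real hash is proprietary, so the XOR-and-mask scheme below, and the name select_link, are assumptions chosen only to show the property that matters: the same address pair always maps to the same link.

```python
import ipaddress


def select_link(src_ip: str, dst_ip: str, num_links: int) -> int:
    """Pick a link index for a flow.

    Illustrative only: this is NOT Cisco's proprietary algorithm.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # XOR the two addresses and keep the low 3 bits: a value in 0..7.
    bucket = (src ^ dst) & 0b111
    # Map the bucket onto one of the available links (assumed mapping).
    return bucket % num_links


if __name__ == "__main__":
    # The same pair of addresses selects the same link on every call.
    print(select_link("10.10.10.1", "10.10.10.2", 4))
    print(select_link("10.10.10.1", "10.10.10.2", 4))
```

Because the result depends only on the addresses, every packet of a flow follows one link and cannot be reordered by the channel itself.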
All ports in each EtherChannel must be the same speed. You can base the load-balance policy (frame distribution) on a MAC address (Layer 2 [L2]), an IP address (Layer 3 [L3]), or a port number (Layer 4 [L4]). You can activate these policies, respectively, if you issue the set port channel all distribution {ip | mac | session | ip-vlan-session} [source | destination | both] command. The session keyword is supported on the Supervisor Engine 2 and Supervisor Engine 720. The ip-vlan-session keyword is only supported on the Supervisor Engine 720. Use this keyword in order to specify the frame distribution method, with the IP address, VLAN, and Layer 4 traffic.
If the physical switch also performs link aggregation, we first need to understand how the physical switch and the host aggregate the links between them, that is, LACP.
Cisco's proprietary implementation is EtherChannel. Supported scenarios:
LACP runs at the MAC layer and assumes that all links are full-duplex, point-to-point ports of the same speed.
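As we know, under the layered network model TCP and IP forwarding are largely independent of each other. The forwarding plane (the routers) forwards packets on a best-effort basis, while TCP is unaware of the links beneath it. To make the best use of bandwidth, TCP begins in slow start and rapidly grows its congestion window until a loss occurs. It then shrinks the window, entering fast retransmit/fast recovery and then congestion avoidance when it receives three duplicate ACKs, or falling back to slow start after a retransmission timeout, and afterwards starts growing the window again.
IP forwarding is free to ignore how TCP behaves; the protocols do not require otherwise. But if the forwarding path can do something to make TCP flows smoother, so much the better.
The following example shows how packet reordering in multi-core forwarding reduces TCP throughput, and how the problem can be solved.
Suppose the sender transmits five segments numbered 1, 2, 3, 4, 5, and the receiver expects them in that order. If the receiver gets 1, does not get 2, but then gets 3, 4, and 5, it sends three ACKs, each saying that the next expected sequence number is 2. Having received three duplicate ACKs in a row, the sender enters fast retransmit/fast recovery: the congestion window shrinks to half its size plus three segments, which reduces the sender's transmit rate. Think of a tap that was fully open with a strong flow; after the problem occurs the tap is only a little more than half open, and the flow drops sharply.
With single-core forwarding this is rarely a problem: packets are normally processed first come, first served, so order is preserved.
With multi-core forwarding the problem appears easily. Several cores process packets from the same input port, and because different packets take different processing paths (TCP, UDP, ICMP, and so on), some are processed faster than others. In the example above, suppose the system has five cores that handle segments 1 through 5 respectively. Core 2 is slow or blocked for some reason, while cores 3, 4, and 5 finish quickly and forward segments 3, 4, and 5 first. The receiver, seeing segments it did not expect, sends three ACKs in a row indicating that it expects segment 2, which shrinks the sender's window and reduces throughput.
In this case the drop in throughput is actually caused by reordering inside the forwarding system.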
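The five-segment example can be reproduced with a minimal sketch of a cumulative-ACK receiver. It assumes a simplified model in which every segment has a whole-number sequence number and the receiver sends one ACK per arrival; the function names are hypothetical.

```python
def acks_for_arrivals(arrivals):
    """Return the ACK (next expected sequence number) sent after each arrival."""
    received = set()
    next_expected = 1
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def max_duplicate_acks(acks):
    """Longest run of repeated ACK values, not counting the first one."""
    best = run = 0
    for prev, cur in zip(acks, acks[1:]):
        run = run + 1 if cur == prev else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    in_order = acks_for_arrivals([1, 2, 3, 4, 5])
    reordered = acks_for_arrivals([1, 3, 4, 5, 2])  # segment 2 delayed by a slow core
    print(in_order, max_duplicate_acks(in_order))
    print(reordered, max_duplicate_acks(reordered))
```

With in-order delivery there are no duplicate ACKs. When segment 2 arrives last, the receiver repeats "expecting 2" three times after the first such ACK, which is enough to trigger the sender's window reduction even though nothing was lost.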
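Reference: http://www.supersonicdog.com/2013/04/24/lacp/
Prerequisites: vSphere 5.1 with a distributed virtual switch. LACP can only be configured through the vSphere Web Client.
Good fit: traffic to many different IP addresses, for example a web server.
Benefit: multiple IP sessions of one VM are spread across multiple physical NICs. The same VM can use both links for different TCP or UDP sessions.
Poor fit: traffic between a fixed pair of IP addresses, for example storage access such as a VM using NFS storage. The source and destination addresses in the IP header never change, so the hash always selects the same link.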
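The difference between the two cases can be sketched as follows. The hash used here (XOR of source and destination address, modulo the number of uplinks) is an assumption for illustration, and pick_uplink and uplinks_used are hypothetical names, not a VMware API.

```python
import ipaddress


def pick_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    """Assumed IP-hash rule: XOR both addresses, then take the remainder."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_uplinks


def uplinks_used(src_ip, dst_ips, num_uplinks):
    """Set of uplinks that one VM's traffic ends up on."""
    return {pick_uplink(src_ip, dst, num_uplinks) for dst in dst_ips}


if __name__ == "__main__":
    vm = "10.0.0.5"
    many_clients = ["192.0.2.%d" % i for i in range(1, 9)]  # web-server style traffic
    nfs_target = ["10.0.0.200"]                             # one fixed storage address
    print(uplinks_used(vm, many_clients, 2))
    print(uplinks_used(vm, nfs_target, 2))
```

Traffic to many peers lands on both uplinks, while traffic to a single storage address can only ever use one, no matter how many links are in the bundle.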
Concept: LACP must be configured on both the virtual switch and the physical switch. The physical switch's configuration governs inbound traffic; outbound traffic is governed by the NIC teaming setting, which must be IP hash.
For VMs that host applications needing access to multiple target IP addresses, LACP links combined with the IP hash load balancing algorithm provide a good balance of traffic across all connections. Compared to traditional NIC teaming, all links get utilized simultaneously. While traditional NIC teaming is simple to configure, without any extra steps needed on the physical switch, a given VM could only be active on one link at a time (as the MAC appearing on two ports on the switch that are not LACP configured would cause one of the ports to be shut down).
Static teaming (IEEE 802.3ad draft v1)
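Advantage: it can be used when the switch does not support LACP and only supports static aggregation.
Disadvantages: a single VM can use the bandwidth of only one NIC. Static aggregation cannot detect cabling or configuration errors.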
In Static teaming mode there is no check for incorrectly plugged cables or other errors. This mode is useful when the preferred bandwidth exceeds a single physical NIC and the switch does not support LACP, but the switch does support static teaming.
Reference: http://blog.ipspace.net/2010/11/vmware-virtual-switch-no-need-for-stp.html
Two NICs belong to one vSwitch. Because the vSwitch supports neither LACP nor STP, both connections are active. Instead of relying on STP or port blocking, the vSwitch uses a special forwarding rule, split-horizon switching (Cisco UCS documentation uses the term End Host Mode), which prevents forwarding loops.
In a traditional Ethernet switch, the same forwarding rules are used for all ports. Virtual switch uses different forwarding rules for vNICs and uplinks.
The hypervisor knows the MAC addresses of all virtual machines running in the ESX server; there’s no need to perform MAC address learning.
Virtual switch is not running Spanning Tree Protocol (STP) and does not send STP Bridge Protocol Data Units (BPDU). STP BPDUs received by the virtual switch are ignored. Uplinks are never blocked based on STP information.
As ESX doesn’t run STP, you should also configure spanning-tree portfast on these ports.
Packets received through one of the uplinks are never forwarded to other uplinks. This rule prevents forwarding loops through the virtual switch.
Broadcast or multicast packets originated by a virtual machine are sent to all other virtual machines in the same port group (VMware terminology for a VLAN). They are also sent through one of the uplinks like a regular unicast packet (they are not flooded through all uplinks). This ensures that the outside network receives a single copy of the broadcast.
The uplink through which the broadcast packet is sent is chosen based on the load balancing mode configured for the virtual switch or the port group.
Broadcasts/multicasts received through an uplink port are sent to all virtual machines in the port group (identified by VLAN tag), but not to other uplinks (see split-horizon forwarding).
Unicast packets sent from virtual machines to unknown MAC addresses are sent through one of the uplinks (selected based on the load balancing mode). They are not flooded.
Unicast packets received through the uplink ports and addressed to unknown MAC addresses are dropped.
The virtual switch sends a single copy of a broadcast/multicast/unknown unicast packet to the outside network (see the no flooding rules above), but the physical switch always performs full flooding and sends copies of the packet back to the virtual switch through all other uplinks. VMware thus has to check the source MAC addresses of packets received through the uplinks. A packet received through one of the uplinks and having a source MAC address belonging to one of the virtual machines is silently dropped.
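The forwarding rules listed above can be condensed into one decision function. This is a simplified sketch under stated assumptions: a single port group, no VLAN handling, multicast treated like broadcast and omitted, and one pre-selected uplink standing in for the load balancing mode. Frame, forward, and the "uplink0" label are hypothetical names.

```python
from dataclasses import dataclass

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass(frozen=True)
class Frame:
    src: str
    dst: str


def forward(frame, from_uplink, local_macs, chosen_uplink="uplink0"):
    """Return the list of egress targets (VM MACs and/or one uplink)."""
    if from_uplink:
        if frame.src in local_macs:
            return []  # our own frame flooded back by the physical switch
        if frame.dst == BROADCAST:
            return sorted(local_macs)  # all VMs, never another uplink
        if frame.dst in local_macs:
            return [frame.dst]
        return []  # unknown unicast arriving on an uplink is dropped
    # Frame originated by a local virtual machine.
    if frame.dst == BROADCAST:
        return sorted(local_macs - {frame.src}) + [chosen_uplink]
    if frame.dst in local_macs:
        return [frame.dst]
    return [chosen_uplink]  # unknown unicast: one uplink, no flooding


if __name__ == "__main__":
    vms = frozenset({"00:50:56:00:00:01", "00:50:56:00:00:02"})
    bcast = Frame("00:50:56:00:00:01", BROADCAST)
    print(forward(bcast, from_uplink=False, local_macs=vms))
    print(forward(bcast, from_uplink=True, local_macs=vms))
```

No branch ever returns an uplink for a frame that arrived on an uplink, which is the split-horizon property that makes STP unnecessary on the virtual switch.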
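Reference: https://blogs.vmware.com/vsphere/2012/11/vsphere-5-1-vds-new-features-bpdu-filter.html
http://rickardnobel.se/esxi-5-1-bdpu-guard/
BPDU frames
BPDUs are the frames that switches exchange to run STP. There is no authentication mechanism and every BPDU is trusted, so forged BPDUs are possible.
A virtual switch does not support STP. It never sends BPDUs itself and does not process BPDUs received from the physical switch.
If a virtual machine generates and sends BPDUs, it can bring down the whole cluster, for example by sending a forged BPDU in order to win the root bridge role.
To stop specific ports from receiving BPDUs, vendors introduced BPDU Guard on Cisco devices and BPDU Protection on HP network devices.
As soon as a BPDU is seen on a protected port, that port is shut down.
This is the motivation for the BPDU filter, which is available on both VDS and VSS switches. It has to be changed on each host, one by one.
Settings: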
LACP itself doesn't provide the ability to bond across multiple switches; it bonds across multiple ports on a single ethernet switch, and depending on the vendor there might even be restrictions on which ports on a switch can be bonded together.
Some vendors have proprietary protocols (typically called MLAG) that allow for bonded ethernet channels across different ethernet switches; this may not be helpful when working with a server's ethernet ports.
Without synchronized ESX-switch configuration you can experience one of the following two symptoms:
Configuration on the virtual switch:
Set LACP Active or Passive mode on the uplink port group.
Set IP hash on the port group.
On the physical switch, LACP and the VLANs must be configured correctly (all ports in the group must carry the same VLANs).
Reference: http://stretch-cloud.info/2013/09/lacp-primer-vsphere-5-5/
ESXi 5.1 supports only one LAG per vDS, but you can create multiple vDSes and therefore multiple LAGs.
In vSphere 5.1, the LACP implementation had these constraints: it supports only one LAG per VDS per host; all uplinks in the dvuplink port group are included in this LAG; only the IP hash load balancing algorithm is supported. (Source: http://stretch-cloud.info/2013/09/much-awaited-lacp-enhancement-vsphere-5-5/)
Hashing Algorithm - The hashing algorithm determines the LAG member used for traffic. LACP can use different properties of the outgoing traffic (e.g. source IP/Port number) to distribute traffic across all the links participating in a LAG. When you configure LACP on the physical switch, you also have to choose a hash algorithm, which determines how inbound traffic is distributed within the LAG.
A:DC networks moving towards 10GbE, which require multiple etherchannels
B:Hosts with mix of 1GbE and 10GbE NICs need multiple etherchannel support
Enhancement In vSphere 5.5
Support multiple LACP LAGs
Max 32 LAG per Host
Max 64 LAG per VDS
Support all supported hashing algorithms in LACP (22)
Note: Uplinks must be going to either the same switch or a pair of switches appearing as a single logical switch (using vPC, VSS, MLAG, SMLT, or similar technology).
CatOS
The Cisco-proprietary hash algorithm computes a value in the range 0 to 7. With this value as a basis, a particular port in the EtherChannel is chosen. The port setup includes a mask which indicates which values the port accepts for transmission. With the maximum number of ports in a single EtherChannel, which is eight ports, each port accepts only one value. If you have four ports in the EtherChannel, each port accepts two values, and so forth. This table lists the ratios of the values that each port accepts, which depends on the number of ports in the EtherChannel:
Number of Ports in the EtherChannel | Load Balancing |
8 | 1:1:1:1:1:1:1:1 |
7 | 2:1:1:1:1:1:1 |
6 | 2:2:1:1:1:1 |
5 | 2:2:2:1:1 |
4 | 2:2:2:2 |
3 | 3:3:2 |
2 | 4:4 |
Note: This table only lists the number of values, which the hash algorithm calculates, that a particular port accepts. You cannot control the port that a particular flow uses. You can only influence the load balance with a frame distribution method that results in the greatest variety.
Note: The hash algorithm cannot be configured or changed to load balance the traffic among the ports in an EtherChannel.
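The ratios in the table follow from spreading eight hash values over the member ports as evenly as possible. The sketch below reproduces the counts; assigning value v to port v modulo the number of ports is an assumption made for illustration, since the table only states how many values each port accepts, not which ones.

```python
from collections import Counter
from typing import List

HASH_VALUES = 8  # the hash yields a value in the range 0 to 7


def values_per_port(num_ports: int) -> List[int]:
    """Number of hash values each port accepts, for 1 to 8 ports."""
    if not 1 <= num_ports <= HASH_VALUES:
        raise ValueError("an EtherChannel has between 1 and 8 ports")
    # Assumed assignment: hash value v goes to port v % num_ports.
    counts = Counter(value % num_ports for value in range(HASH_VALUES))
    return [counts[port] for port in range(num_ports)]


if __name__ == "__main__":
    for ports in range(8, 1, -1):
        print(ports, ":".join(str(n) for n in values_per_port(ports)))
```

Only 2, 4, and 8 ports divide the eight values evenly, so bundles of 3, 5, 6, or 7 links carry an uneven share even when the traffic itself hashes uniformly.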
Issue the show port channel mod/port info command in order to check the frame distribution policy. In version 6.1(x) and later, you can determine the port for use in the port channel to forward traffic, with the frame distribution policy as the basis. The command for this determination is show channel hash channel-id {src_ip_addr | dest_ip_addr | src_mac_addr | dest_mac_addr | src_port |dest_port} [dest_ip_addr | dest_mac_addr | dest_port] .
These are some examples:
Console> (enable) show channel hash 865 10.10.10.1 10.10.10.2
Selected channel port: 1/1
Console> (enable) show channel hash 865 00-02-fc-26-24-94 00-d0-c0-d7-2d-d4
!--- This command should be on one line.
Selected channel port: 1/2
Cisco IOS
EtherChannel load balancing can use MAC addresses, IP addresses, or Layer 4 port numbers with a Policy Feature Card 2 (PFC2) and either source mode, destination mode, or both. The mode you select applies to all EtherChannels that you configure on the switch. Use the option that provides the greatest variety in your configuration. For example, if the traffic on a channel only goes to a single MAC address, use of the destination MAC address results in the choice of the same link in the channel each time. Use of source addresses or IP addresses can result in a better load balance. Issue the port-channel load-balance {src-mac | dst-mac | src-dst-mac | src-ip | dst-ip | src-dst-ip | src-port | dst-port | src-dst-port | mpls} global configuration command in order to configure the load balancing.
6509#remote login switch
Trying Switch ...
Entering CONSOLE for Switch
Type "^C^C^C" to end this session
6509-sp#test etherchannel load-balance interface port-channel 1 ip 10.10.10.2 10.10.10.1
!--- This command should be on one line.
Would select Gi6/1 of Po1
6509-sp#
6509#remote login switch
Trying Switch ...
Entering CONSOLE for Switch
Type "^C^C^C" to end this session
6509-sp#test etherchannel load-balance interface port-channel 1 mac 00d0.c0d7.2dd4 0002.fc26.2494
!--- This command should be on one line.
Would select Gi6/1 of Po1
6509-sp#
PAgP aids in the automatic creation of EtherChannel links. PAgP packets are sent between EtherChannel-capable ports in order to negotiate the formation of a channel. Some restrictions are deliberately introduced into PAgP. The restrictions are:
PAgP does not form a bundle on ports that are configured for dynamic VLANs. PAgP requires that all ports in the channel belong to the same VLAN or are configured as trunk ports. When a bundle already exists and a VLAN of a port is modified, all ports in the bundle are modified to match that VLAN.
PAgP does not group ports that operate at different speeds or port duplex. If speed and duplex change when a bundle exists, PAgP changes the port speed and duplex for all ports in the bundle.
PAgP modes are off, auto, desirable, and on. Only the combinations auto-desirable, desirable-desirable, and on-on allow the formation of a channel. The device on the other side must have PAgP set to on if a device on one side of the channel does not support PAgP, such as a router.
PAgP is currently supported on these switches:
Catalyst 4500/4000
Catalyst 5500/5000
Catalyst 6500/6000
Catalyst 2940/2950/2955/3550/3560/3750
Catalyst 1900/2820
These switches do not support PAgP:
Catalyst 2900XL/3500XL
Catalyst 2948G-L3/4908G-L3
Catalyst 8500
You can configure EtherChannel connections with or without Inter-Switch Link Protocol (ISL)/IEEE 802.1Q trunking. After the formation of a channel, the configuration of any port in the channel as a trunk applies the configuration to all ports in the channel. Identically configured trunk ports can be configured as an EtherChannel. You must have all ISL or all 802.1Q; you cannot mix the two. ISL/802.1Q encapsulation, if enabled, takes place independently of the source/destination load-balancing mechanism of Fast EtherChannel. The VLAN ID has no influence on the link that a packet takes. ISL/802.1Q simply enables that trunk to belong to multiple VLANs. If trunking is not enabled, all ports that are associated with the Fast EtherChannel must belong to the same VLAN.
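To put an interface into PAgP desirable mode, use the command "channel-group 1 mode desirable".
To put an interface into PAgP auto mode, use the command "channel-group 1 mode auto".
To put an interface into LACP active mode, use the command "channel-group 1 mode active".
To put an interface into LACP passive mode, use the command "channel-group 1 mode passive".
Port-channel load balancing: port-channel load-balance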
sw1(config)#port-channel load-balance ?
dst-ip Dst IP Addr
dst-mac Dst Mac Addr
src-dst-ip Src XOR Dst IP Addr
src-dst-mac Src XOR Dst Mac Addr
src-ip Src IP Addr
src-mac Src Mac Addr
1. An EtherChannel can bundle at most 8 physical links.
2. Bundled ports must follow these rules:
(1) Same VLAN
(2) Same trunking mode
(3) Same speed and duplex
LACP is a standards-based method to control the bundling of several physical network links together to form a logical channel for increased bandwidth and redundancy purposes. LACP enables a network device to negotiate an automatic bundling of links by sending LACP packets to the peer.
LACP works by sending frames down all links that have the protocol enabled. If it finds a device on the other end of the link that also has LACP enabled, it also sends frames independently along the same links, enabling the two units to detect multiple links between themselves and then combine them into a single logical link.
This dynamic protocol provides these advantages over the static link aggregation method supported by previous versions of vSphere:
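EtherChannels come in two kinds: Layer 2 and Layer 3.
Ethernet link bundling is used to increase bandwidth and provide load balancing. The topology is as follows:
SW1 configuration: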
interface FastEthernet0/1
channel-group 1 mode desirable
switchport mode trunk
interface FastEthernet0/2
channel-group 1 mode desirable
switchport mode trunk
interface Port-channel 1
switchport mode trunk
SW2 configuration:
interface FastEthernet0/1
channel-group 1 mode desirable
switchport mode trunk
interface FastEthernet0/2
channel-group 1 mode desirable
switchport mode trunk
interface Port-channel 1
switchport mode trunk
Use show etherchannel summary to check the state of the EtherChannel:
SW2#show etherchannel summary
Flags: D - down P - in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator
u - unsuitable for bundling
w - waiting to be aggregated
d - default port
Number of channel-groups in use: 1
Number of aggregators: 1
Group Port-channel Protocol Ports
------+-------------+-----------+----------------------------------------------
1 Po1(SU) PAgP Fa0/1(P) Fa0/2(P)
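S means a Layer 2 EtherChannel, U means the channel is up (in use), and P means the two interfaces are bundled in the port-channel.
Note: configuration applied to the logical (port-channel) interface overrides the configuration on the physical interfaces.
Now you can see the effect.
Also pay attention to the EtherChannel modes.
EtherChannel modes:
1. PAgP modes:
on: no negotiation and no negotiation traffic, similar to nonegotiate.
auto: passive negotiating state. Accepts negotiation started by the peer but does not initiate it. (default)
desirable: active negotiating state. Initiates negotiation by sending PAgP packets.
2. LACP modes:
passive: passive negotiating state. Accepts negotiation but does not initiate it. (default)
active: active negotiating state. Initiates negotiation.
Mode combinations:
on on OK
desirable desirable OK
desirable auto OK
auto auto does not form
auto on does not form
active active OK
passive active OK
passive passive does not form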
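The combinations above reduce to a small rule: static "on" only pairs with "on", and a negotiated channel needs at least one side that initiates. A minimal sketch of that rule follows; channel_forms is a hypothetical helper, not a switch command.

```python
PAGP_MODES = {"auto", "desirable"}
LACP_MODES = {"active", "passive"}
ALL_MODES = PAGP_MODES | LACP_MODES | {"on"}


def channel_forms(mode_a: str, mode_b: str) -> bool:
    """Return True if the two ends can bring up an EtherChannel."""
    modes = {mode_a, mode_b}
    unknown = modes - ALL_MODES
    if unknown:
        raise ValueError("unknown mode(s): %s" % ", ".join(sorted(unknown)))
    if modes == {"on"}:
        return True  # static on both ends, no negotiation
    if "on" in modes:
        return False  # one end static, the other waiting for negotiation
    if modes <= PAGP_MODES:
        return "desirable" in modes  # at least one end must initiate PAgP
    if modes <= LACP_MODES:
        return "active" in modes  # at least one end must initiate LACP
    return False  # PAgP on one end and LACP on the other


if __name__ == "__main__":
    for pair in [("on", "on"), ("desirable", "auto"), ("auto", "auto"),
                 ("auto", "on"), ("passive", "active"), ("passive", "passive")]:
        print(pair, "OK" if channel_forms(*pair) else "does not form")
```

The last branch covers a case the list does not mention: one end running PAgP and the other LACP, which cannot negotiate with each other.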
S1(config)# interface range f0/13 -15
S1(config-if-range)# channel-group 1 mode ?
  active     Enable LACP unconditionally
  auto       Enable PAgP only if a PAgP device is detected
  desirable  Enable PAgP unconditionally
  on         Enable Etherchannel only
  passive    Enable LACP only if a LACP device is detected
S1(config-if-range)# channel-group 1 mode active
Creating a port-channel interface Port-channel 1
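VMFS is a VMware file system, VMDK is VMware's virtual disk file format, and RDM stands for Raw Device Mapping.
In VMDK mode, the LUN is mounted by ESXi as storage and used as a datastore. The LUN is formatted as VMFS, and the VM's virtual disk is stored as a VMDK file in that VMFS datastore. In RDM mode, the LUN is treated as a standalone disk, that is, a LUN on the storage device. It can hold any file system, such as NTFS, EXT3, EXT4, or FAT32, depending on the operating system that owns the LUN. The VM reads and writes the LUN directly, bit by bit, without translation by the hypervisor.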