Lab topology diagram:
Fault description:
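1. With every PC's cross-subnet next hop pointing to the 3750X, all hosts can reach each other and nothing is abnormal.
2. With every PC's cross-subnet next hop pointing to the N3K VIP, the following problems appear:
(1) When PC1's cross-subnet next hop points to the N3K VIP, no other machine that uses the N3K VIP as its next hop can reach other subnets. In this lab the trigger was PC1: as soon as PC1's cross-subnet next hop no longer points to the N3K, the other machines in the same segment, even other virtual machines on the same host, can reach other subnets, although with some packet loss.
(2) Cross-subnet access occasionally drops packets, and data transfer reaches only a few hundred K/s.
(3) Machines outside the VLAN200 segment whose cross-subnet next hop points to the 3750X cannot reach the 192.168.253.0 segment.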
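Troubleshooting:
1. Checked routing and HSRP on the N3K: both normal.
2. Pointed PC1's cross-subnet next hop to the N3K VIP, pinged PC2 from PC1 with 30 packets, and ran tcpdump on both PC1 and PC2.
(1) The capture on PC1 shows 30 packets sent to PC2 and none received from PC2.
(2) The capture on PC2 (see the figure below) shows only 4 packets received from PC1, each of which was answered.
(3) Finally I contacted Cisco technical support. Analysis of a packet capture on the N3K standby showed that packets passing through the N3K were not forwarded by the forwarding chip but went through the CPU instead. After checking the relevant material, support answered that this is a bug in this software release. The version I am running is: version 6.0(2)A6(3)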
Solution:
Restart (shut / no shut) all SVI interfaces and run the no autostate command on each SVI interface.
Bug link: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCup65482/?reffering_site=dumpcr
Original bug text:
Nexus3500: Traffic incorrectly punted to CPU matching copp-s-l3mtufail
CSCup65482
Symptom:
Traffic flowing through the switch may get punted to the CPU, matching the 'copp-s-l3mtufail' class in the CoPP.
Conditions:
Nexus 3500 switch running one of the affected releases
AND
This issue is triggered after SVI(s) flap or going down.
On SVI flap, due to this bug, MTU value is getting misprogrammed.
Workaround:
Shut / no shut of VLAN SVIs stop the traffic incorrectly sent to CPU.
Further Problem Description:
SVI flaps when all the Layer2 interfaces in that vlan goes down.
Configure 'no autostate' on SVI(s) to avoid the issue from happening.
Command Reference:
http://www.cisco.com/en/US/docs/switches/datacenter/nexus3000/sw/interfaces/503_U5_1/b_Cisco_n3k_Interfaces_Configuration_Guide_503_u5_1_chapter_010.html#task_FC25C8615CC443F28DA237782DD9B0A0
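The workaround above (shut / no shut on each SVI, then no autostate) could be applied roughly as in the following NX-OS sketch. Vlan200 is used only because the post mentions VLAN200; the actual list of SVIs on the switch is not given, so treat the interface name as an assumption and repeat the stanza for every SVI. The final show command is one way to check whether the CoPP counters are still growing; look for the copp-s-l3mtufail class in its output.

```
! Sketch only: Vlan200 is an assumed example; repeat for every SVI on the N3K.
configure terminal
interface vlan 200
  ! Flap the SVI so the MTU value is programmed again (workaround from the bug notice)
  shutdown
  no shutdown
  ! Keep the SVI up even when all Layer 2 ports in the VLAN go down
  no autostate
end
! Inspect the CoPP statistics and check the copp-s-l3mtufail class
show policy-map interface control-plane
```

Running the show command twice, a short interval apart, shows whether the class counters are still increasing after the change.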
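(4) Retest: cross-subnet data transfer now exceeds 40M, and the strange behavior seen when PC1's cross-subnet next hop pointed to the N3K VIP has also disappeared.
(5) Retest: PC3 and PC4 still cannot reach each other. I ran tcpdump on PC3 and PC4 at the same time, then pinged PC4 from PC3 with 10 packets.
Result: PC4 received all 10 packets and replied, but PC3 received nothing from PC4.
Because PC4's next hop is on the 3750X, I logged in to the 3750X and enabled debug, which showed that the ICMP packets destined for PC3 were being redirected to 172.16.101.178.
The 3750X routing table contained a route for 192.168.253.0 with next hop 172.16.101.178. After I deleted it, the network returned to normal.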
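The route check and cleanup on the 3750X might look like the following IOS sketch. The post gives neither the mask of the stray route nor how it was learned, so the /24 mask and the assumption that it is a static route are illustrative only; use whatever the running configuration actually shows.

```
! On the 3750X: see which route currently covers 192.168.253.0
show ip route 192.168.253.0
! List the configured static routes to find the exact statement
show running-config | include ip route
! Assumption: the stray entry is a static /24 route; adjust the mask to match
configure terminal
no ip route 192.168.253.0 255.255.255.0 172.16.101.178
end
! Confirm that the next hop is no longer 172.16.101.178
show ip route 192.168.253.0
```

If the route turns out to come from a routing protocol and not a static entry, removing it this way will not work and the source of the advertisement has to be fixed instead.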