Luke Huang
Cisco Employee
Last edited by fushuang on 2020-1-15 09:59
Topology
085723rqq36dq12w14bwfi.png
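Problem description
1. ACI is configured with two EPGs, EPG_SVI601 and EPG_SVI606, under the same BD fushuang, with the gateway subnets configured under the EPGs. The external Cat2960 is attached using the "Extend the EPG out of the ACI fabric" mode, with static bindings on the EPGs that specify the interface and encap VLAN.
2. The C2960 can ping the EPG subnets without problems. After a shut/no shut of g2/0/11 on the C2960, the customer found an abnormal entry in the C2960 ARP table: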
Switch#show arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 15 0022.bdf8.19ff ARPA Vlan601 << should be Vlan606
Internet 60.1.1.1 16 0022.bdf8.19ff ARPA Vlan601 << ACI Subnet
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606 << local
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601 << local
Switch#

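3. Other devices in the customer network, such as a C3750, do not show the problem. TAC used an N7K in the lab as the layer 2 extension switch and also saw no problem. Once a C2960 was connected in the TAC lab, the problem could be reproduced.
4. Running clear ip arp 60.6.6.1 on the C2960 resolves the problem; after flapping g2/0/11 again, it reappears.
5. The customer's question at the time was whether the ACI layer 2 extension method itself has a defect or hidden risk.
Troubleshooting analysis
1. Following the "ACI Layer 2 Connection to the Outside Network" document, the customer's related configuration was deleted and reconfigured; the problem persisted.
2. The C2960 has a similar bug, "ARP request made in the wrong vlan". "no arp arpa" was configured, but the problem persisted.
3. Checked how the ACI subnets are laid out: the BD VLAN holds multiple subnets as secondary IP addresses, which is normal behavior: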
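The mismatch in the ARP table from step 2 can also be spotted mechanically. The following is a minimal Python sketch, not a tool used in the original case: it parses the `show arp` text from the post and flags any learned entry whose interface differs from the SVI that owns its subnet. The function name `find_mismatched_entries` and the /24 prefix length are assumptions made for illustration.

```python
import ipaddress
import re

# "show arp" output copied from the post, with the "<<" annotations removed.
SHOW_ARP = """\
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 15 0022.bdf8.19ff ARPA Vlan601
Internet 60.1.1.1 16 0022.bdf8.19ff ARPA Vlan601
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601
"""

ARP_LINE = re.compile(
    r"^Internet\s+(?P<ip>\S+)\s+(?P<age>\S+)\s+(?P<mac>\S+)\s+ARPA\s+(?P<intf>\S+)"
)


def find_mismatched_entries(show_arp_text, prefix_len=24):
    """Return (ip, learned_intf, expected_intf) for entries on the wrong SVI.

    prefix_len is an assumption: the switch output does not show masks.
    """
    entries = []
    for line in show_arp_text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            entries.append(match.groupdict())

    # Local entries (age "-") tell us which subnet each SVI owns.
    owner = {}
    for entry in entries:
        if entry["age"] == "-":
            network = ipaddress.ip_interface(
                "{}/{}".format(entry["ip"], prefix_len)
            ).network
            owner[network] = entry["intf"]

    mismatches = []
    for entry in entries:
        if entry["age"] == "-":
            continue
        address = ipaddress.ip_address(entry["ip"])
        for network, intf in owner.items():
            if address in network and entry["intf"] != intf:
                mismatches.append((entry["ip"], entry["intf"], intf))
    return mismatches


if __name__ == "__main__":
    for ip, learned, expected in find_mismatched_entries(SHOW_ARP):
        print("{} learned on {}, expected {}".format(ip, learned, expected))
```

Run against the sample above, the only entry flagged is 60.6.6.1, learned on Vlan601 instead of Vlan606.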
Leaf102# show vlan id 41 ex
VLAN Name Encap Ports
---- -------------------------------- ---------------- ------------------------
41 fushuang:BD_VRF_102 vxlan-14942176 Eth1/31

Leaf102# show ip int vlan 41
IP Interface Status for VRF "fushuang:VRF_102"
vlan41, Interface status: protocol-up/link-up/admin-up, iod: 97, mode: pervasive
IP address: 60.1.1.1, IP subnet: 60.1.1.0/24
IP address: 60.6.6.1, IP subnet: 60.6.6.0/24 secondary
IP broadcast address: 255.255.255.255
IP primary address route-preference: 1, tag: 0
4. Checked the ACI EP information and found a mismatch corresponding to the one on the C2960:
093500rooyz8hyx5kkhyxc.png
093559bbiiuq1rmqbqnrqn.png
5. Clearing the EP information on ACI had no effect:
vsh -c "clear system internal epm endpoint key vrf ip "
6. Checked the ACI EPM and EPMC information and found the IP moving back and forth between MAC addresses:
Leaf102# grep 60.6.6.2.*move /var/log/dme/log/epmc-trace.txt
[2019 Aug 14 09:18:54.969622923:1034721:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
[2019 Aug 14 09:18:54.970854763:1034766:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b41 to MAC 0026.527c.7b42
[2019 Aug 14 09:18:56.969662351:1034816:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
7. Checked debug arp on the C2960 and found some clues:
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan606 <<< ACI to C2960 ARP request
*Mar 27 16:46:23.263: IP ARP: sent rep src 60.6.6.2 0026.527c.7b42, dst 60.6.6.1 0022.bdf8.19ff Vlan606 <<< C2960 reply
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan601 <<< ACI to C2960 ARP request, wrong Vlan601
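8. The likely cause is that IP routing is not enabled by default on the C2960 (see the document "C2960 not enable ip routing by default"). The C2960 SDM template was therefore changed, the switch was reloaded, and IP routing was enabled. On retest, the problem disappeared: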
conf t
sdm prefer lanbase-routing
do write
reload
...
conf t
ip routing
9. debug arp on the C2960 after IP routing was enabled:
*Mar  1 00:18:55.786: IP ARP: creating incomplete entry for IP address: 60.6.6.1 interface Vlan606
*Mar 1 00:18:55.786: IP ARP: sent req src 60.6.6.2 0026.527c.7b42, dst 60.6.6.1 0000.0000.0000 Vlan606
*Mar 1 00:18:55.791: IP ARP: rcvd rep src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan606
*Mar 1 00:18:55.791: IP ARP rep filtered src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 0026.527c.7b42 wrong cable, interface Vlan601 <<< C2960 filtered out the wrong ARP information
Switch#show arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 10 0022.bdf8.19ff ARPA Vlan606
Internet 60.1.1.1 10 0022.bdf8.19ff ARPA Vlan601
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601
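
The two log checks in steps 6 and 7 can be scripted when the traces are long. This is a rough Python sketch under stated assumptions: the helper names `count_ip_moves` and `requests_on_multiple_vlans` are hypothetical, and the regular expressions are written only against the line formats quoted in this post, so other software versions may need different patterns.

```python
import re
from collections import Counter, defaultdict

# Lines copied from the epmc-trace.txt excerpt in step 6.
EPMC_TRACE = """\
[2019 Aug 14 09:18:54.969622923:1034721:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
[2019 Aug 14 09:18:54.970854763:1034766:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b41 to MAC 0026.527c.7b42
[2019 Aug 14 09:18:56.969662351:1034816:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
"""

# Lines copied from the "debug arp" excerpt in step 7, annotations removed.
DEBUG_ARP = """\
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan606
*Mar 27 16:46:23.263: IP ARP: sent rep src 60.6.6.2 0026.527c.7b42, dst 60.6.6.1 0022.bdf8.19ff Vlan606
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan601
"""

MOVE = re.compile(r"IP (\S+) moved from MAC (\S+) to MAC (\S+)")
RCVD_REQ = re.compile(r"IP ARP: rcvd req src (\S+) \S+, dst (\S+) (\S+)")


def count_ip_moves(trace_text):
    """Count how many IP-to-MAC moves the leaf logged for each IP."""
    moves = Counter()
    for line in trace_text.splitlines():
        match = MOVE.search(line)
        if match:
            moves[match.group(1)] += 1
    return dict(moves)


def requests_on_multiple_vlans(debug_text):
    """Find ARP requests for one (source, target) pair seen on more than one SVI."""
    seen = defaultdict(set)
    for line in debug_text.splitlines():
        match = RCVD_REQ.search(line)
        if match:
            src_ip, dst_ip, intf = match.groups()
            seen[(src_ip, dst_ip)].add(intf)
    return {pair: sorted(intfs) for pair, intfs in seen.items() if len(intfs) > 1}


if __name__ == "__main__":
    print(count_ip_moves(EPMC_TRACE))
    print(requests_on_multiple_vlans(DEBUG_ARP))
```

On the excerpts above, the first helper reports three moves for 60.6.6.2, and the second shows the request from 60.6.6.1 to 60.6.6.2 arriving on both Vlan601 and Vlan606.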

Comments
David Chou
Level 7
Thanks for sharing, definitely bookmarking this.
one-time
Level 13
Thanks to the expert for sharing, thank you!
wuhao0015
Spotlight
Hello, this only solves the problem. What is the root cause? What exactly does this have to do with ip routing? As engineers we should dig a little deeper.
Luke Huang
Cisco Employee
wuhao0015 posted on 2020-1-16 14:04
Hello, this only solves the problem. What is the root cause? What exactly does this have to do with ip routing? As engineers we should dig a little ...

Root Cause
1. Without the routing function, the C2960 may try to resolve ARP using the wrong SVI IP and MAC.
2. ACI receives the packet and records an EP IP move; when it replies, it uses the wrong encap VLAN ID.
3. Without the routing function, the C2960 installs the ARP entry with the wrong VLAN ID.
4. The fix is to enable ip routing on the C2960, which is also why the C3750 and N7K do not have the same problem.
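
To make the difference concrete, here is a toy Python model of the ARP learning step. It is only an illustration of the behavior described above and seen in the debug output, not the actual IOS implementation; the function names `learn_arp` and `replay`, the `subnet_check` flag standing in for "ip routing enabled", and the /24 masks are all assumptions.

```python
import ipaddress

# SVIs on the C2960 as listed in the post; the /24 masks are assumed.
SVIS = {
    "Vlan601": ipaddress.ip_interface("60.1.1.2/24"),
    "Vlan606": ipaddress.ip_interface("60.6.6.2/24"),
}


def learn_arp(table, sender_ip, sender_mac, in_intf, subnet_check):
    """Install an ARP entry, optionally rejecting senders outside the SVI subnet.

    subnet_check=True mimics the "wrong cable" filtering seen in the debug
    output once ip routing was enabled; subnet_check=False mimics the
    behavior observed without it. Both are simplifications.
    """
    address = ipaddress.ip_address(sender_ip)
    if subnet_check and address not in SVIS[in_intf].network:
        return False  # filtered, comparable to the "wrong cable" message
    table[sender_ip] = (sender_mac, in_intf)
    return True


def replay(subnet_check):
    """Replay ARP from 60.6.6.1 arriving first on Vlan606, then on Vlan601."""
    table = {}
    learn_arp(table, "60.6.6.1", "0022.bdf8.19ff", "Vlan606", subnet_check)
    learn_arp(table, "60.6.6.1", "0022.bdf8.19ff", "Vlan601", subnet_check)
    return table


if __name__ == "__main__":
    print("without check:", replay(subnet_check=False))
    print("with check:   ", replay(subnet_check=True))
```

In this model the entry for 60.6.6.1 ends up on Vlan601 when the check is off, because the later frame overwrites the earlier one, and stays on Vlan606 when the check is on.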
wuhao0015
Spotlight
Thanks to the original poster for sharing.