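Last edited by fushuang on 2020-1-15 09:59
Topology diagram
Problem description
1. ACI is configured with two EPGs, EPG_SVI601 and EPG_SVI606, under the same BD, fushuang, and the gateway subnets are configured under the EPGs. An external Cat2960 is attached in "Extend the EPG out of the ACI fabric" mode, with static bindings on the EPGs that specify the interface and encap VLAN.
2. The C2960 can ping the EPG subnets without any problem. After a shut/no shut on C2960 g2/0/11, the customer found an abnormal entry in the C2960 ARP table: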
Switch#show arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 15 0022.bdf8.19ff ARPA Vlan601 << should be Vlan606
Internet 60.1.1.1 16 0022.bdf8.19ff ARPA Vlan601 << ACI Subnet
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606 << local
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601 << local
Switch#
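3. Other devices in the customer network, such as the C3750, do not show the problem. TAC used an N7K in the lab as the Layer 2 extension switch and saw no problem either. With a C2960 attached in the TAC lab, the problem can be reproduced.
4. C2960# clear ip arp 60.6.6.1 resolves the problem; flapping g2/0/11 again makes it reappear.
5. The customer's question at the time was whether the ACI Layer 2 extension method itself has a problem or a hidden risk.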
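The symptom in item 2 can be checked mechanically. The following is a minimal sketch, not a Cisco tool: it parses `show arp` text and flags dynamic entries learned on a different SVI than the local entry of the same subnet. The function names and the /24 prefix length are assumptions made for illustration (the /24 matches the subnets shown on the ACI side).

```python
import ipaddress
import re

# Sample copied from the faulty `show arp` output in item 2.
SHOW_ARP = """\
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 15 0022.bdf8.19ff ARPA Vlan601
Internet 60.1.1.1 16 0022.bdf8.19ff ARPA Vlan601
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601
"""

# Columns: Protocol, Address, Age, Hardware Addr, Type, Interface
ROW = re.compile(r"^Internet\s+(\S+)\s+(\S+)\s+(\S+)\s+ARPA\s+(\S+)$")


def parse_show_arp(text):
    """Return one dict per ARP row; the header line is skipped."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            ip, age, mac, intf = match.groups()
            entries.append({"ip": ipaddress.ip_address(ip), "age": age,
                            "mac": mac, "intf": intf})
    return entries


def find_mismatches(entries, prefix_len=24):
    """Flag dynamic entries whose interface differs from the local SVI.

    prefix_len is an assumption: `show arp` does not print the mask.
    """
    # Local entries (age "-") tell us which subnet lives on which SVI.
    svi_by_net = {}
    for entry in entries:
        if entry["age"] == "-":
            net = ipaddress.ip_network(f"{entry['ip']}/{prefix_len}", strict=False)
            svi_by_net[net] = entry["intf"]

    mismatches = []
    for entry in entries:
        if entry["age"] == "-":
            continue
        for net, intf in svi_by_net.items():
            if entry["ip"] in net and entry["intf"] != intf:
                mismatches.append((str(entry["ip"]), entry["intf"], intf))
    return mismatches


if __name__ == "__main__":
    for ip, seen, expected in find_mismatches(parse_show_arp(SHOW_ARP)):
        print(f"{ip}: learned on {seen}, expected {expected}")
```

Run against the sample above, it reports only 60.6.6.1 (learned on Vlan601, expected Vlan606), which is the entry the customer noticed.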
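Problem analysis
1. Following the document "ACI Layer 2 Connection to the Outside Network", the customer's related configuration was deleted and reconfigured; the problem remained.
2. The C2960 has a similar bug, "ARP request made in the wrong vlan"; configuring "no arp arpa" did not help, and the problem remained.
3. Checked how the ACI subnets are laid out: the BD VLAN records the multiple subnets as secondary IP addresses, which is normal behavior.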
Leaf102# show vlan id 41 ex
VLAN Name Encap Ports
---- -------------------------------- ---------------- ------------------------
41 fushuang:BD_VRF_102 vxlan-14942176 Eth1/31
Leaf102# show ip int vlan 41
IP Interface Status for VRF "fushuang:VRF_102"
vlan41, Interface status: protocol-up/link-up/admin-up, iod: 97, mode: pervasive
IP address: 60.1.1.1, IP subnet: 60.1.1.0/24
IP address: 60.6.6.1, IP subnet: 60.6.6.0/24 secondary
IP broadcast address: 255.255.255.255
IP primary address route-preference: 1, tag: 0
4. Checked the ACI EP information and found a mismatch that corresponds to what the C2960 shows:
5. Clearing the EP information on ACI had no effect:
vsh -c "clear system internal epm endpoint key vrf ip "
6. Checked the ACI EPM and EPMC information and found the IP moving back and forth between MACs:
Leaf102# grep 60.6.6.2.*move /var/log/dme/log/epmc-trace.txt
[2019 Aug 14 09:18:54.969622923:1034721:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
[2019 Aug 14 09:18:54.970854763:1034766:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b41 to MAC 0026.527c.7b42
[2019 Aug 14 09:18:56.969662351:1034816:epmc_process_l3_upd:3522:t] IP 60.6.6.2 moved from MAC 0026.527c.7b42 to MAC 0026.527c.7b41
7. Checked debug arp on the C2960 and found some clues:
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan606 <<< ACI to C2960 ARP request
*Mar 27 16:46:23.263: IP ARP: sent rep src 60.6.6.2 0026.527c.7b42, dst 60.6.6.1 0022.bdf8.19ff Vlan606 <<< C2960 reply
*Mar 27 16:46:23.263: IP ARP: rcvd req src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan601 <<< ACI to C2960 ARP request, wrong Vlan601
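8. This looks like a consequence of IP routing being disabled by default on the C2960 (see the document "C2960 not enable ip routing by default"). The C2960 SDM template was therefore changed, the switch was reloaded, and IP routing was enabled. On retest, the problem was gone.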
conf t
sdm prefer lanbase-routing
do write
reload
...
conf t
ip routing
9. debug arp on the C2960 after IP routing is enabled:
*Mar 1 00:18:55.786: IP ARP: creating incomplete entry for IP address: 60.6.6.1 interface Vlan606
*Mar 1 00:18:55.786: IP ARP: sent req src 60.6.6.2 0026.527c.7b42, dst 60.6.6.1 0000.0000.0000 Vlan606
*Mar 1 00:18:55.791: IP ARP: rcvd rep src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 Vlan606
*Mar 1 00:18:55.791: IP ARP rep filtered src 60.6.6.1 0022.bdf8.19ff, dst 60.6.6.2 0026.527c.7b42 wrong cable, interface Vlan601 <<< C2960 filtered out the wrong ARP information
Switch#show arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 60.6.6.1 10 0022.bdf8.19ff ARPA Vlan606
Internet 60.1.1.1 10 0022.bdf8.19ff ARPA Vlan601
Internet 60.6.6.2 - 0026.527c.7b42 ARPA Vlan606
Internet 60.1.1.2 - 0026.527c.7b41 ARPA Vlan601
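The difference between steps 7 and 9 can be illustrated with a toy model. This is only a sketch of the behavior the debug output suggests, not IOS internals: it assumes that the "wrong cable" filter drops an ARP packet whose sender IP is outside the subnet of the receiving SVI, and that without the filter the last packet received wins. The names `accept_arp` and `learn` and the /24 masks on the C2960 SVIs are assumptions for illustration.

```python
import ipaddress

# SVI subnets on the C2960; /24 is assumed from the ACI-side subnets.
SVI_SUBNETS = {
    "Vlan601": ipaddress.ip_network("60.1.1.0/24"),
    "Vlan606": ipaddress.ip_network("60.6.6.0/24"),
}

# ARP packets from the ACI gateway as seen in the step 7 debug:
# the same sender IP arrives on Vlan606 and then on Vlan601.
PACKETS = [
    ("60.6.6.1", "0022.bdf8.19ff", "Vlan606"),
    ("60.6.6.1", "0022.bdf8.19ff", "Vlan601"),
]


def accept_arp(sender_ip, rx_interface, subnets=SVI_SUBNETS):
    """Accept only if the sender IP belongs to the receiving SVI's subnet."""
    net = subnets.get(rx_interface)
    return net is not None and ipaddress.ip_address(sender_ip) in net


def learn(packets, check_subnet):
    """Build an ARP table; the last accepted packet for an IP wins."""
    table = {}
    for sender_ip, mac, intf in packets:
        if check_subnet and not accept_arp(sender_ip, intf):
            continue  # corresponds to the "wrong cable" filtered message
        table[sender_ip] = (mac, intf)
    return table


if __name__ == "__main__":
    print("no filter  :", learn(PACKETS, check_subnet=False))
    print("with filter:", learn(PACKETS, check_subnet=True))
```

Without the check, the model ends with 60.6.6.1 on Vlan601, as in the faulty table; with the check it stays on Vlan606, as in the table after IP routing was enabled.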