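Problem description:
A customer recently opened a case reporting that one route in the routing table appears and disappears intermittently, and that a lookup for it sometimes returns the correct entry and sometimes a wrong one: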
RP/0/RSP0/CPU0:A9K-B#show route 192.168.82.41
Thu Jun 7 14:54:48.024 Beijing
Routing entry for 192.168.82.40/29
Known via "ospf 9812", distance 110, metric 100, type intra area
RP/0/RP0/CPU0:CRS-C#show route 192.168.82.41
Routing entry for 192.168.80.0/20
Known via "bgp 9812", distance 114, metric 0, type internal
RP/0/RSP0/CPU0:A9K-B#show route 192.168.82.41
Routing entry for 192.168.80.0/20
Known via "static", distance 1, metric 0 (connected)
Routing Descriptor Blocks
directly connected, via Null0
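The network environment is as follows: CRS-A has a Bundle sub-interface with address 192.168.82.41, which is advertised into OSPF, and the device at the other end is a Huawei 9504. A9K-B and CRS-C are both directly connected to CRS-A and are in the same area.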
Analysis:
1) Under normal conditions, a lookup for 192.168.82.41 should resolve to 192.168.82.40/29, learned from OSPF. Repeated runs of show route 192.168.82.41, however, showed that this OSPF route was flapping.
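The alternating results in the outputs above are what longest-prefix matching would produce when the /29 comes and goes: whenever the OSPF /29 is withdrawn, the lookup falls back to the covering /20. The following is a minimal sketch of that behavior, not of how the router's RIB is implemented. The function name best_route and the source labels are illustrative assumptions; only the two prefixes come from the outputs.

```python
import ipaddress

def best_route(dest, routes):
    """Return the (prefix, source) entry with the longest prefix covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(ipaddress.ip_network(prefix), source)
               for prefix, source in routes
               if addr in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    net, source = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), source

# Prefixes taken from the show route outputs; the source strings are labels only.
with_ospf = [("192.168.82.40/29", "ospf"), ("192.168.80.0/20", "static")]
without_ospf = [("192.168.80.0/20", "static")]

print(best_route("192.168.82.41", with_ospf))     # the /29 wins while it is present
print(best_route("192.168.82.41", without_ospf))  # the /20 is used once the /29 is gone
```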
2) During the period in which 192.168.82.40/29 was flapping, the OSPF adjacency between CRS-A and its peer had already been stable for a week.
RP/0/RP0/CPU0:CRS-A#show ospf neighbor
Fri Jun 8 10:46:40.373 Beijing
211.154.94.160 1 FULL/BDR 00:00:38 192.168.82.42 Bundle-Ether1.144.
Neighbor is up for 1w0d
3) Checking the OSPF database showed that the Network LSA kept reaching MaxAge while its sequence number kept increasing.
RP/0/RSP0/CPU0:A9K-B#sh ospf database
Sat Jun 9 20:29:07.163 Beijing
192.168.82.41 10.100.255.187 3603 0x80238f9d 0x00a2a3
N9504# sh ip ospf database
192.168.82.41 10.100.255.187 3601 0x8023901c 0xa224
RP/0/RP0/CPU0:CRS-C#sh ospf database
Sat Jun 9 20:55:50.654 Beijing
192.168.82.41 10.100.255.187 4 0x80239079 0x00e781
RP/0/RP0/CPU0:CRS-C#sh ospf database
Sat Jun 9 20:54:45.406 Beijing
192.168.82.41 10.100.255.187 8 0x80239070 0x00f978
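To put a number on how fast the sequence number is growing, the two CRS-C samples above can be compared directly. This is only a rough sketch: the helper name reorigination_rate is an assumption, and it presumes both samples were taken on the same day and that the sequence number did not wrap in between.

```python
from datetime import datetime

def reorigination_rate(sample_a, sample_b):
    """Return (sequence increments, seconds elapsed) between two (time, seq) samples."""
    (t_a, seq_a), (t_b, seq_b) = sorted([sample_a, sample_b])
    fmt = "%H:%M:%S.%f"
    elapsed = (datetime.strptime(t_b, fmt) - datetime.strptime(t_a, fmt)).total_seconds()
    return seq_b - seq_a, elapsed

# Timestamps and sequence numbers copied from the two CRS-C outputs.
increments, seconds = reorigination_rate(("20:54:45.406", 0x80239070),
                                         ("20:55:50.654", 0x80239079))
print(increments, round(seconds, 3), round(seconds / increments, 2))
```

For these two samples the result is 9 new instances in about 65 seconds, roughly one every 7 seconds. For comparison, RFC 2328 sets LSRefreshTime to 30 minutes, so a healthy LSA would normally gain one sequence number per half hour.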
4) The OSPF trace on CRS-A showed that it kept receiving older copies of its own Net LSA, and that OSPF kept forcing an update of the Net LSA for 192.168.82.41.
RP/0/RP0/CPU0:CRS-A#show ospf trace all
Trace buffer: adj
16 Jun 11 13:43:56.739* db_install: Rxd our older Net LSA: ar 0.0.0.0, seq 0x8023d855, vrf 0x60000000 - force update
17 Jun 11 13:43:56.764 ospf_build_net_lsa: intf BE1.144 rtrid 10.100.255.187 lsid 192.168.82.41 area 0.0.0.0
18 Jun 11 13:44:04.524 db_install: Rxd our older Net LSA: ar 0.0.0.0, seq 0x8023d856, vrf 0x60000000 - force update
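When the trace buffer is long, the forced updates can be pulled out with a small script instead of being read by eye. The sketch below assumes the line layout seen in the three trace lines above; the regular expression and the function name forced_updates are illustrative, and other software releases may format the trace differently.

```python
import re

# Sample lines copied from the trace output above.
TRACE = """\
16 Jun 11 13:43:56.739* db_install: Rxd our older Net LSA: ar 0.0.0.0, seq 0x8023d855, vrf 0x60000000 - force update
17 Jun 11 13:43:56.764 ospf_build_net_lsa: intf BE1.144 rtrid 10.100.255.187 lsid 192.168.82.41 area 0.0.0.0
18 Jun 11 13:44:04.524 db_install: Rxd our older Net LSA: ar 0.0.0.0, seq 0x8023d856, vrf 0x60000000 - force update
"""

# Timestamp, optional "*", then the db_install message and its sequence number.
PATTERN = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3})\*?\s+db_install: Rxd our\s*older Net LSA:"
    r".*?seq (0x[0-9a-fA-F]+)"
)

def forced_updates(trace_text):
    """Return (timestamp, sequence number) for every 'older Net LSA' event."""
    return [(ts, int(seq, 16)) for ts, seq in PATTERN.findall(trace_text)]

events = forced_updates(TRACE)
for ts, seq in events:
    print(ts, hex(seq))
```

Consecutive events whose sequence numbers differ by one, a few seconds apart, match the pattern described in this step.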
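5) This shows that the Net LSA for 192.168.82.41 is being prematurely aged. According to OSPF RFC 2328:
A router may only prematurely age its own self-originated LSAs. The router may not prematurely age LSAs that have been originated by other routers. An LSA is considered self-originated when either 1) the LSA's Advertising Router is equal to the router's own Router ID or 2) the LSA is a network-LSA and its Link State ID is equal to one of the router's own IP interface addresses.
Under condition 2, any other router that has 192.168.82.41 configured on one of its own interfaces would treat this Net LSA as self-originated and would be allowed to flush it. This indicates that a duplicate address exists in the network.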
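The two conditions in the RFC text can be written down as a short check. This is a sketch of the rule as quoted, not of any vendor's implementation; the LsaHeader type, the function name, and the second router's ID 10.0.0.99 are hypothetical and used only for illustration.

```python
from dataclasses import dataclass

NETWORK_LSA = 2  # LS type 2 is the network-LSA in RFC 2328

@dataclass(frozen=True)
class LsaHeader:
    ls_type: int
    link_state_id: str
    advertising_router: str

def is_self_originated(lsa, router_id, interface_addresses):
    """Apply the two RFC 2328 conditions for a self-originated LSA."""
    # Condition 1: the Advertising Router equals our own Router ID.
    if lsa.advertising_router == router_id:
        return True
    # Condition 2: a network-LSA whose Link State ID is one of our interface addresses.
    return lsa.ls_type == NETWORK_LSA and lsa.link_state_id in interface_addresses

# The Net LSA from this case: Link State ID 192.168.82.41, advertised by 10.100.255.187.
lsa = LsaHeader(NETWORK_LSA, "192.168.82.41", "10.100.255.187")

# Hypothetical second router: different Router ID, but one interface reuses 192.168.82.41.
print(is_self_originated(lsa, "10.0.0.99", {"192.168.82.41", "10.0.0.99"}))
# The same router without the duplicate address.
print(is_self_originated(lsa, "10.0.0.99", {"10.0.0.99"}))
```

The first call returns True even though the Router ID does not match, which is the situation that lets a second router flush an LSA it never originated.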
6) Looking next at the Router with ID (10.100.255.187) entry in the OSPF database showed that the Link ID in the LSA was 192.168.82.41.
Link connected to: a Transit Network
(Link ID) Designated Router address: 192.168.82.41
(Link Data) Router Interface address: 192.168.82.41
Number of TOS metrics: 0
TOS 0 Metrics: 50
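The customer then tried rebuilding the OSPF adjacency. The Link ID changed to 192.168.82.42, meaning the peer was now the Designated Router and the Net LSA no longer used the duplicated address as its Link State ID. After that the route was stable and no longer flapped.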
Link connected to: a Transit Network
(Link ID) Designated Router address: 211.154.82.42
(Link Data) Router Interface address: 211.154.82.41
Number of TOS metrics: 0
TOS 0 Metrics: 50
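Lab reproduction
In the topology shown in the figure below, all routers are in Area 0, and G0/2 on 3925-2 has the same address as G0/6/1/6 on CRS16-A. G0/2 on 3925-2 does not need to be advertised into OSPF.
The lab reproduction showed that when 3925-2 receives the Net LSA flooded by CRS16-A, it detects that the Link ID matches the address of its own G0/2, so it sets the Age of that LSA to MaxAge and floods it to the whole network. This process can be observed with #debug ospf test flood.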
On ASR9922-A:
[17:28:43]RP/0/RP0/CPU0:Jun 28 09:28:44.546 : ospf[1019]: received update from 59.43.0.39, GigabitEthernet0/2/0/7
[17:28:43]RP/0/RP0/CPU0:Jun 28 09:28:44.546 : ospf[1019]: Rcv Update Type 2, LSID 73.1.1.1, Adv rtr 59.43.0.39, age 1, seq 0x8000019f
[17:28:43]RP/0/RP0/CPU0:Jun 28 09:28:44.546 : ospf[1019]: Mask 255.255.255.0
[17:28:47]RP/0/RP0/CPU0:Jun 28 09:28:49.168 : ospf[1019]: Flooding update on GigabitEthernet0/2/0/10 to 224.0.0.5 Area 0
[17:28:47]RP/0/RP0/CPU0:Jun 28 09:28:49.168 : ospf[1019]: Send Type 2, LSID 73.1.1.1, Adv rtr 59.43.0.39, age 6, seq 0x8000019f (0), vrf default vrfid 0x60000000
[17:28:52]RP/0/RP0/CPU0:Jun 28 09:28:54.109 : ospf[1019]: received update from 74.1.1.1, GigabitEthernet0/2/0/10
[17:28:52]RP/0/RP0/CPU0:Jun 28 09:28:54.109 : ospf[1019]: Rcv Update Type 2, LSID 73.1.1.1, Adv rtr 59.43.0.39, age 3600, seq 0x8000019f
[17:28:52]RP/0/RP0/CPU0:Jun 28 09:28:54.109 : ospf[1019]: Mask 255.255.255.0
[17:28:52]RP/0/RP0/CPU0:Jun 28 09:28:54.115 : ospf[1019]: Flooding update on GigabitEthernet0/2/0/7 to 224.0.0.5 Area 0
[17:28:52]RP/0/RP0/CPU0:Jun 28 09:28:54.115 : ospf[1019]: Send Type 2, LSID 73.1.1.1, Adv rtr 59.43.0.39, age 3600, seq 0x8000019f (0), vrf default vrfid 0x60000000
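The debug output shows one half of the cycle: the same Type 2 LSA arrives first with age 1 and then again with age 3600 and an unchanged sequence number. The loop as a whole can be sketched as a toy model. It is a simplification under stated assumptions, not a protocol implementation: the function name simulate is hypothetical, timers and flooding are ignored, and the originator is assumed to re-originate with the next sequence number each time its LSA is flushed, as the growing sequence numbers in the production case suggest.

```python
MAX_AGE = 3600  # seconds, the MaxAge constant in RFC 2328

def simulate(rounds, start_seq=0x8000019F):
    """Model the exchange between the originator and the duplicate-address router."""
    seq = start_seq
    events = []
    for _ in range(rounds):
        # The originator (CRS16-A in the lab) floods its Net LSA with a fresh age.
        events.append(("originate", seq, 1))
        # The router holding the duplicate address (3925-2 in the lab) treats the
        # LSA as self-originated, sets its age to MaxAge and floods it back out.
        events.append(("flush", seq, MAX_AGE))
        # The originator sees its LSA flushed and builds the next instance.
        seq += 1
    return events

for kind, seq, age in simulate(3):
    print(f"{kind:9s} seq={seq:#010x} age={age}")
```

Each round adds one to the sequence number and briefly removes the Net LSA from every database in the area, which matches the flapping route seen in the customer network.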