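Introduction
This document describes the steps to identify and troubleshoot an IP SLA tracked device on a remote POD in an ACI PBR Multipod environment.
Prerequisites
Requirements
Cisco recommends that you have knowledge of these topics:
Components Used
The information in this document is based on these software and hardware versions: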
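- Cisco ACI version 4.2(7l)
- Cisco leaf switch N9K-C93180YC-EX
- Cisco spine switch N9K-C9336PQ
- Nexus 7k version 8.2(2)
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.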
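Network Topology
Topology
Background Information
With a service graph, Cisco ACI can redirect traffic between security zones to a firewall or load balancer, without the need for the firewall or load balancer to be the default gateway for the servers.
The IP SLA feature in a PBR setup allows the ACI fabric to monitor the service node (L4-L7 device) in your environment, so that the fabric does not redirect traffic between source and destination to a service node that is unreachable.
Note: ACI IPSLA depends on the fabric system GIPO (multicast address 239.255.255.240/28) to send the probes and to distribute the tracking state.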
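As a quick illustration of what the /28 in the note covers, the sketch below uses Python's standard ipaddress module to list the boundaries of the range. The function name gipo_range_summary is hypothetical and chosen only for this example.

```python
import ipaddress

# Fabric system GIPO range exactly as written in the note above.
SYSTEM_GIPO_RANGE = ipaddress.ip_network("239.255.255.240/28")


def gipo_range_summary(network=SYSTEM_GIPO_RANGE):
    """Return (first address, last address, number of addresses) of the range."""
    return (str(network[0]), str(network[-1]), network.num_addresses)


if __name__ == "__main__":
    first, last, count = gipo_range_summary()
    print(f"{SYSTEM_GIPO_RANGE}: {first} to {last} ({count} addresses)")
```

Any multicast group-list on the IPN that is meant to carry this traffic has to include these addresses.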
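Scenario
In this example, East-West connectivity cannot be completed between source endpoint 192.168.150.1 on POD-1 and destination server 192.168.151.1 on POD-2. Traffic is redirected to PBR node 172.16.1.1 from service leaf 103 on POD-1. PBR uses IP SLA monitoring and a redirect health group policy.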
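Troubleshooting Steps
Step 1. Identify the IP SLA Status
- On the APIC UI, navigate to Tenants > Your_Tenant > Faults.
- Look for faults F2911, F2833, and F2992.
IP SLA Faults
Step 2. Identify the Node ID with the Health Group in Down Status
- On the APIC CLI, run the moquery command with fault codes F2911, F2833, and F2992.
- You can see that health group lb1::lb-healthGrp is down on leaf 202 in POD-2.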
MXS2-AP002# moquery -c faultInst -f 'fault.Inst.code == "F2911"'
# fault.Inst
code : F2911
ack : no
alert : no
annotation :
cause : svcredir-healthgrp-down
changeSet : operSt (New: disabled), operStQual (New: healthgrp-service-down)
childAction :
created : 2024-01-31T19:07:31.505-06:00
delegated : yes
descr : PBR service health grp lb1::lb-healthGrp on nodeid 202 fabric hostname MXS2-LF202 is in failed state, reason Health grp service is down.
dn : topology/pod-2/node-202/sys/svcredir/inst/healthgrp-lb1::lb-healthGrp/fault-F2911 <<<
domain : infra
extMngdBy : undefined
highestSeverity : major
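When several faults are raised, the pod, node ID, and health group can be pulled out of the moquery text with a small script. This is a minimal sketch that assumes the "key : value" layout shown above; the function names and the sample string are illustrative only and are not part of any Cisco tool.

```python
import re


def parse_moquery_object(text):
    """Parse the 'key : value' lines of one moquery object into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# fault.Inst' class header
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def locate_fault(fields):
    """Extract pod, node, and health group name from the fault dn."""
    match = re.search(
        r"topology/pod-(\d+)/node-(\d+)/.*?healthgrp-(.+?)/fault-",
        fields.get("dn", ""),
    )
    if match is None:
        return None
    pod, node, group = match.groups()
    return {"pod": int(pod), "node": int(node), "health_group": group}


# Sample lines copied from the output above.
SAMPLE_FAULT = """\
# fault.Inst
code : F2911
cause : svcredir-healthgrp-down
created : 2024-01-31T19:07:31.505-06:00
dn : topology/pod-2/node-202/sys/svcredir/inst/healthgrp-lb1::lb-healthGrp/fault-F2911
"""

if __name__ == "__main__":
    print(locate_fault(parse_moquery_object(SAMPLE_FAULT)))
```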
Step 3. Validate that the PBR Device Is Learned as an Endpoint and Is Reachable from the Service Leaf
MXS2-LF103# show system internal epm endpoint ip 172.16.1.1
MAC : 40ce.2490.5743 ::: Num IPs : 1
IP# 0 : 172.16.1.1 ::: IP# 0 flags : ::: l3-sw-hit: No
Vlan id : 22 ::: Vlan vnid : 13192 ::: VRF name : lb1:vrf1
BD vnid : 15958043 ::: VRF vnid : 2162693
Phy If : 0x1a00b000 ::: Tunnel If : 0
Interface : Ethernet1/12
Flags : 0x80004c04 ::: sclass : 16391 ::: Ref count : 5
EP Create Timestamp : 02/01/2024 00:36:23.229262
EP Update Timestamp : 02/02/2024 01:43:38.767306
EP Flags : local|IP|MAC|sclass|timer|
MXS2-LF103# iping 172.16.1.1 -V lb1:vrf1
PING 172.16.1.1 (172.16.1.1) from 172.16.1.254: 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=255 time=1.046 ms
64 bytes from 172.16.1.1: icmp_seq=1 ttl=255 time=1.074 ms
64 bytes from 172.16.1.1: icmp_seq=2 ttl=255 time=1.024 ms
64 bytes from 172.16.1.1: icmp_seq=3 ttl=255 time=0.842 ms
64 bytes from 172.16.1.1: icmp_seq=4 ttl=255 time=1.189 ms
--- 172.16.1.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min/avg/max = 0.842/1.034/1.189 ms
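If the reachability check is repeated across several service leaves, the iping summary line can be evaluated programmatically. The sketch below assumes the summary format shown above; ping_summary and is_reachable are hypothetical helper names.

```python
import re

# Matches the statistics line printed at the end of the iping output.
PING_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, ([\d.]+)% packet loss"
)


def ping_summary(output):
    """Return sent/received/loss figures from ping output, or None if absent."""
    match = PING_SUMMARY.search(output)
    if match is None:
        return None
    sent, received, loss = match.groups()
    return {"sent": int(sent), "received": int(received), "loss_pct": float(loss)}


def is_reachable(output):
    """Treat the target as reachable only when every probe was answered."""
    summary = ping_summary(output)
    return summary is not None and summary["received"] > 0 and summary["loss_pct"] == 0.0


if __name__ == "__main__":
    sample = "5 packets transmitted, 5 packets received, 0.00% packet loss"
    print(ping_summary(sample), is_reachable(sample))
```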
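Step 4. Check the PBR Health Group in the Local POD and the Remote POD
Leaf 103 is the service leaf on POD-1. Therefore, POD-1 is considered the local POD and POD-2 the remote POD.
Health groups are programmed only on leaf switches where a source or destination EPG has a contract that requires the group to be deployed.
1. The source EPG is located on leaf node 102 in POD-1. You can see that the PBR device is tracked as UP from service leaf 103 in POD-1.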
MXS2-LF102# show service redir info health-group lb1::lb-healthGrp
=======================================================================================================================================
LEGEND
TL: Threshold(Low) | TH: Threshold(High) | HP: HashProfile | HG: HealthGrp | BAC: Backup-Dest | TRA: Tracking | RES: Resiliency
=======================================================================================================================================
HG-Name HG-OperSt HG-Dest HG-Dest-OperSt
======= ========= ======= ==============
lb1::lb-healthGrp enabled dest-[172.16.1.1]-[vxlan-2162693]] up
2. The destination EPG is located on leaf node 202 in POD-2. You can see that the PBR device is tracked as DOWN from service leaf 103 in POD-1.
MXS2-LF202# show service redir info health-group lb1::lb-healthGrp
=======================================================================================================================================
LEGEND
TL: Threshold(Low) | TH: Threshold(High) | HP: HashProfile | HG: HealthGrp | BAC: Backup-Dest | TRA: Tracking | RES: Resiliency
=======================================================================================================================================
HG-Name HG-OperSt HG-Dest HG-Dest-OperSt
======= ========= ======= ==============
lb1::lb-healthGrp disabled dest-[172.16.1.1]-[vxlan-2162693]] down <<<<< Health Group is down.
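The comparison between the local and remote leaf can be sketched as a small parser over the health-group table. It assumes the four-column row layout shown above (HG-Name, HG-OperSt, HG-Dest, HG-Dest-OperSt); the function names are hypothetical, and a leaf where the group is not programmed is simply skipped.

```python
def parse_health_group(output, group_name):
    """Return the table row for group_name as a dict, or None if not present."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == group_name:
            return {
                "group": parts[0],
                "oper_state": parts[1],
                "dest": parts[2],
                "dest_state": parts[3],
            }
    return None


def leaves_with_group_down(outputs_by_leaf, group_name):
    """List leaves where the group is programmed but not enabled/up."""
    down = []
    for leaf, output in outputs_by_leaf.items():
        row = parse_health_group(output, group_name)
        if row is None:
            continue  # group is not deployed on this leaf
        if row["oper_state"] != "enabled" or row["dest_state"] != "up":
            down.append(leaf)
    return sorted(down)


if __name__ == "__main__":
    outputs = {
        "LF102": "lb1::lb-healthGrp enabled dest-[172.16.1.1]-[vxlan-2162693]] up",
        "LF202": "lb1::lb-healthGrp disabled dest-[172.16.1.1]-[vxlan-2162693]] down",
    }
    print(leaves_with_group_down(outputs, "lb1::lb-healthGrp"))
```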
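Step 5. Capture IP SLA Probes with the ELAM Tool
Note: You can use the built-in capture tool, Embedded Logic Analyzer Module (ELAM), to capture the incoming packet. The ELAM syntax depends on the hardware type. Another approach is to use the ELAM Assistant app.
To capture the IP SLA probes, you must use these values in the ELAM syntax to understand where the packet arrives or is dropped.
ELAM Inner L2 Header
Source MAC = 00-00-00-00-00-01
Destination MAC = 01-00-00-00-00-00
Note: The source MAC and destination MAC (shown previously) are fixed values on the inner header of IP SLA packets.
ELAM Outer L3 Header
Source IP = TEP of the service leaf (leaf 103 TEP in the lab = 172.30.200.64)
Destination IP = 239.255.255.240 (the fabric system GIPO must always be the same)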
trigger reset
trigger init in-select 14 out-select 0
set inner l2 dst_mac 01-00-00-00-00-00 src_mac 00-00-00-00-00-01
set outer ipv4 src_ip 172.30.200.64 dst_ip 239.255.255.240
start
stat
ereport
...
------------------------------------------------------------------------------------------------------------------------------------------------------
Inner L2 Header
------------------------------------------------------------------------------------------------------------------------------------------------------
Inner Destination MAC : 0100.0000.0000
Source MAC : 0000.0000.0001
802.1Q tag is valid : no
CoS : 0
Access Encap VLAN : 0
------------------------------------------------------------------------------------------------------------------------------------------------------
Outer L3 Header
------------------------------------------------------------------------------------------------------------------------------------------------------
L3 Type : IPv4
DSCP : 0
Don't Fragment Bit : 0x0
TTL : 27
IP Protocol Number : UDP
Destination IP : 239.255.255.240
Source IP : 172.30.200.64
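Because only the service leaf TEP changes between environments, the trigger lines used in this step can be templated. The sketch below simply reproduces the commands shown above with the TEP substituted; the function name is hypothetical, and the in-select and out-select values are copied from this lab and may differ on other hardware types.

```python
import ipaddress


def ipsla_elam_commands(service_leaf_tep, system_gipo="239.255.255.240"):
    """Build the ELAM trigger lines from this step for a given service leaf TEP."""
    # Validate the inputs before they are placed into CLI text.
    ipaddress.ip_address(service_leaf_tep)
    if not ipaddress.ip_address(system_gipo).is_multicast:
        raise ValueError(f"{system_gipo} is not a multicast address")
    return [
        "trigger reset",
        "trigger init in-select 14 out-select 0",  # values taken from this lab
        # Fixed inner MAC addresses of IP SLA packets, as stated in this step.
        "set inner l2 dst_mac 01-00-00-00-00-00 src_mac 00-00-00-00-00-01",
        f"set outer ipv4 src_ip {service_leaf_tep} dst_ip {system_gipo}",
        "start",
    ]


if __name__ == "__main__":
    print("\n".join(ipsla_elam_commands("172.30.200.64")))
```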
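Step 6. Verify the Fabric System GIPO (239.255.255.240) Is Programmed on the Local and Remote Spines
Note: For each GIPO, only one spine node in each POD is elected as the authoritative device to forward multicast frames and send IGMP joins toward the IPN.
1. Spine 1001 in POD-1 is the authoritative switch that forwards multicast frames and sends IGMP joins to the IPN.
Interface Eth1/3 faces the N7K IPN.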
MXS2-SP1001# show isis internal mcast routes gipo | more
IS-IS process: isis_infra
VRF : default
GIPo Routes
====================================
System GIPo - Configured: 0.0.0.0
Operational: 239.255.255.240
====================================
<OUTPUT CUT> ...
GIPo: 239.255.255.240 [LOCAL]
OIF List:
Ethernet1/35.36
Ethernet1/3.3(External) <<< Interface must point out to IPN on elected Spine
Ethernet1/16.40
Ethernet1/17.45
Ethernet1/2.37
Ethernet1/36.42
Ethernet1/1.43
MXS2-SP1001# show ip igmp gipo joins | grep 239.255.255.240
239.255.255.240 0.0.0.0 Join Eth1/3.3 43 Enabled
2. Spine 2001 in POD-2 is the authoritative switch that forwards multicast frames and sends IGMP joins to the IPN.
Interface Eth1/36 faces the N7K IPN.
MXS2-SP2001# show isis internal mcast routes gipo | more
IS-IS process: isis_infra
VRF : default
GIPo Routes
====================================
System GIPo - Configured: 0.0.0.0
Operational: 239.255.255.240
====================================
<OUTPUT CUT> ...
GIPo: 239.255.255.240 [LOCAL]
OIF List:
Ethernet1/2.40
Ethernet1/1.44
Ethernet1/36.36(External) <<< Interface must point out to IPN on elected Spine
MXS2-SP2001# show ip igmp gipo joins | grep 239.255.255.240
239.255.255.240 0.0.0.0 Join Eth1/36.36 76 Enabled
3. Ensure that the outgoing-interface-list gipo is not empty in VSH on both spines.
MXS2-SP1001# vsh
MXS2-SP1001# show forwarding distribution multicast outgoing-interface-list gipo | more
....
Outgoing Interface List Index: 1
Reference Count: 1
Number of Outgoing Interfaces: 5
Ethernet1/35.36
Ethernet1/3.3
Ethernet1/2.37
Ethernet1/36.42
Ethernet1/1.43
External GIPO OIFList
Ext OIFL: 8001
Ref Count: 393
No OIFs: 1
Ethernet1/3.3
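The check for an IPN-facing interface in the GIPo OIF list can be automated when many spines need to be reviewed. This is a rough sketch that assumes the text layout of the show isis internal mcast routes gipo output shown in this step; external_oifs is a hypothetical helper name.

```python
def external_oifs(output, gipo):
    """Return the OIF entries tagged (External) under the given GIPo."""
    in_group = False
    in_oif_list = False
    found = []
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("GIPo:"):
            # A new GIPo section starts; track whether it is the one we want.
            in_group = line.split()[1] == gipo
            in_oif_list = False
        elif in_group and line.startswith("OIF List:"):
            in_oif_list = True
        elif in_group and in_oif_list and line:
            token = line.split()[0]  # drop any trailing annotation text
            if "(External)" in token:
                found.append(token.replace("(External)", ""))
    return found


# Sample copied from the spine 2001 output above.
SAMPLE_ISIS = """\
GIPo Routes
System GIPo - Configured: 0.0.0.0
Operational: 239.255.255.240
GIPo: 239.255.255.240 [LOCAL]
OIF List:
Ethernet1/2.40
Ethernet1/1.44
Ethernet1/36.36(External) <<< Interface must point out to IPN on elected Spine
"""

if __name__ == "__main__":
    print(external_oifs(SAMPLE_ISIS, "239.255.255.240"))
```

An empty result on the elected spine would indicate that no interface toward the IPN is programmed for that GIPo.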
Step 7. Verify That the GIPO (239.255.255.240) Is Configured on the IPN
1. GIPO 239.255.255.240 is missing from the IPN configuration.
N7K-ACI_ADMIN-VDC-ACI-IPN-MPOD# show run pim
...
ip pim rp-address 192.168.100.2 group-list 225.0.0.0/15 bidir
ip pim ssm range 232.0.0.0/8
N7K-ACI_ADMIN-VDC-ACI-IPN-MPOD# show ip mroute 239.255.255.240
IP Multicast Routing Table for VRF "default"
(*, 239.255.255.240/32), uptime: 1d01h, igmp ip pim
Incoming interface: Null, RPF nbr: 0.0.0.0 <<< Incoming interface and RPF are MISSING
Outgoing interface list: (count: 2)
Ethernet3/3.4, uptime: 1d01h, igmp
Ethernet3/1.4, uptime: 1d01h, igmp
2. GIPO 239.255.255.240 is now configured on the IPN.
N7K-ACI_ADMIN-VDC-ACI-IPN-MPOD# show run pim
...
ip pim rp-address 192.168.100.2 group-list 225.0.0.0/15 bidir
ip pim rp-address 192.168.100.2 group-list 239.255.255.240/28 bidir <<< GIPO is configured
ip pim ssm range 232.0.0.0/8
N7K-ACI_ADMIN-VDC-ACI-IPN-MPOD# show ip mroute 225.0.42.16
IP Multicast Routing Table for VRF "default"
(*, 225.0.42.16/32), bidir, uptime: 1w6d, ip pim igmp
Incoming interface: loopback1, RPF nbr: 192.168.100.2
Outgoing interface list: (count: 2)
Ethernet3/1.4, uptime: 1d02h, igmp
loopback1, uptime: 1d03h, pim, (RPF)
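The gap found in this step is that no bidir group-list on the IPN covered the system GIPO. A check of this kind can be sketched with Python's ipaddress module, assuming the ip pim rp-address ... group-list ... bidir line format shown above; bidir_rps_for_group is a hypothetical helper name.

```python
import ipaddress
import re

# Matches PIM bidir RP lines as they appear in 'show run pim' above.
GROUP_LIST = re.compile(r"^ip pim rp-address (\S+) group-list (\S+) bidir")


def bidir_rps_for_group(pim_config, group):
    """Return the RP addresses whose bidir group-list covers the group."""
    group_addr = ipaddress.ip_address(group)
    rps = []
    for line in pim_config.splitlines():
        match = GROUP_LIST.match(line.strip())
        if match and group_addr in ipaddress.ip_network(match.group(2)):
            rps.append(match.group(1))
    return rps


BEFORE = """\
ip pim rp-address 192.168.100.2 group-list 225.0.0.0/15 bidir
ip pim ssm range 232.0.0.0/8
"""

AFTER = """\
ip pim rp-address 192.168.100.2 group-list 225.0.0.0/15 bidir
ip pim rp-address 192.168.100.2 group-list 239.255.255.240/28 bidir
ip pim ssm range 232.0.0.0/8
"""

if __name__ == "__main__":
    print(bidir_rps_for_group(BEFORE, "239.255.255.240"))
    print(bidir_rps_for_group(AFTER, "239.255.255.240"))
```

With the first configuration the list is empty for 239.255.255.240, which is consistent with the mroute entry that shows a Null incoming interface and an RPF neighbor of 0.0.0.0.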
Step 8. Confirm That IP SLA Tracking Is UP on the Remote POD
MXS2-LF202# show service redir info health-group lb1::lb-healthGrp
=======================================================================================================================================
LEGEND
TL: Threshold(Low) | TH: Threshold(High) | HP: HashProfile | HG: HealthGrp | BAC: Backup-Dest | TRA: Tracking | RES: Resiliency
=======================================================================================================================================
HG-Name HG-OperSt HG-Dest HG-Dest-OperSt
======= ========= ======= ==============
lb1::lb-healthGrp enabled dest-[172.16.1.1]-[vxlan-2162693]] up
Related Information