Introduction
This document describes how to triage the error message INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN.
Prerequisites
Requirements
Cisco recommends that you have a basic knowledge and working experience with Cisco IOS® XR routers.
Components Used
The information in this document is based on these software and hardware versions:
- Cisco 8000 Routers
- Cisco ASR 9000 Series Aggregation Services Routers
- Cisco Network Convergence System (NCS) 5500 Series Routers
- Cisco IOS XR software
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
Problem
The router logs a syslog message that contains the keyword INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN.
The Ethernet Switch Driver (ESD) is a node-scoped process that provides VLAN-based Layer 2 (L2) switching infrastructure with the Control Ethernet (CE) switches. These CE switches, sometimes also referred to as Ethernet Out-of-Band Channel (EOBC) switches, reside on the different modules of the chassis, such as the Route Processor (RP) or Route Switch Processor (RSP), the Line Card (LC), or the System Controller (SC) of the NCS 5500 Series routers. They are connected to each other to build an internal control Ethernet network that is used for intra-chassis communication on Cisco IOS XR routers.
The message indicates that the CE switch port named in the message is down on the module that generated the message. It is therefore common to see this message while a module reloads or fails to boot. In that case, the port is expected to come back up once the module has fully booted.
What if the message does not clear, or the port keeps flapping, while the module is up and running on the router?
Solution
This procedure helps to identify the link behind the port and to recover it if the failure is transient.
- Identify the CE switch link connection for the error message.
- Check the port statistics on both ends of the link for any error or failure.
- Manually reset the port if this method is available on the platform.
- Fully reload the module(s).
- Physically reseat the module(s).
If none of the previous steps recovers the port, collect the data listed in the troubleshooting examples for your platform and open a case with the Cisco Technical Assistance Center (TAC).
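The escalation order of the recovery steps above can be sketched as a small loop that stops at the first step that brings the port back. This is only an illustration: run_recovery and the try_step callback are hypothetical names, and the callback stands in for whatever manual action and verification an operator performs on the router.

```python
from typing import Callable, Optional

# Recovery actions in escalation order, least disruptive first.
# The two diagnostic steps (identify the link, check statistics)
# come before these and are not repeated here.
RECOVERY_STEPS = [
    "reset the port",
    "reload the module(s)",
    "reseat the module(s)",
]


def run_recovery(try_step: Callable[[str], bool]) -> Optional[str]:
    """Try each step in order and return the one that recovered the port.

    try_step is a hypothetical callback that performs the step and
    returns True when the port is up again afterwards. None is returned
    when every step fails, which is the point to collect the show tech
    files and open a TAC case.
    """
    for step in RECOVERY_STEPS:
        if try_step(step):
            return step
    return None


if __name__ == "__main__":
    # Simulated outcome: the port only recovers after a module reload.
    outcome = run_recovery(lambda step: step == "reload the module(s)")
    print(outcome or "escalate to TAC")
```

The ordering matters more than the code: each step is more disruptive than the one before it, so a later step is attempted only when the earlier ones have failed.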
Troubleshooting Examples
This section illustrates these troubleshooting steps on Cisco 8000 Series routers, Cisco ASR 9000 Series Aggregation Services Routers, and Cisco NCS 5500 Series routers, respectively.
Cisco 8000 Series Routers
RP/0/RP0/CPU0:Mar 6 23:01:56.591 UTC: esd[163]: %INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN : The physical link state of the control ethernet switch port 14 has changed. New Link state DOWN, Admin state: UP
The beginning of the message shows where the message was generated, which is 0/RP0/CPU0 in this case. The body of the message shows that port 14 went down.
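When many of these messages have to be triaged, the node and port can be extracted with a regular expression. The sketch below is written only against the sample message in this document; the pattern, the EsdPortEvent type, and the function name are assumptions for illustration and are not part of any Cisco tool.

```python
import re
from typing import NamedTuple, Optional


class EsdPortEvent(NamedTuple):
    node: str         # node that generated the message, e.g. "0/RP0/CPU0"
    port: int         # CE switch port number
    link_state: str   # e.g. "DOWN"
    admin_state: str  # e.g. "UP"


# Assumed pattern, derived from the sample message in this document.
# The optional "RP/" prefix covers messages logged in the XR VM.
_LINK_DOWN = re.compile(
    r"^(?:RP/)?(?P<node>\d+/\w+/\w+):.*"
    r"%INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN\s*:.*"
    r"switch port (?P<port>\d+) has changed\.\s*"
    r"New Link state (?P<link>\w+), Admin state: (?P<admin>\w+)"
)

SAMPLE = (
    "RP/0/RP0/CPU0:Mar 6 23:01:56.591 UTC: esd[163]: "
    "%INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN : The physical link state of "
    "the control ethernet switch port 14 has changed. "
    "New Link state DOWN, Admin state: UP"
)


def parse_esd_link_down(line: str) -> Optional[EsdPortEvent]:
    """Return the node, port, and states from a LINK_DOWN message, or None."""
    match = _LINK_DOWN.match(line)
    if match is None:
        return None
    return EsdPortEvent(
        node=match.group("node"),
        port=int(match.group("port")),
        link_state=match.group("link"),
        admin_state=match.group("admin"),
    )


if __name__ == "__main__":
    print(parse_esd_link_down(SAMPLE))
```

The node field tells which location to query, and the port field tells which row to look for in the switch statistics output.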
The CLI command show controllers switch statistics location 0/RP0/CPU0 shows not only the port traffic statistics but also what each port is connected to.
RP/0/RP0/CPU0:C8K#show controllers switch statistics location 0/RP0/CPU0
.
.
.
                                            Tx      Rx
      Phys   State    Tx         Rx         Drops/  Drops/
Port  State  Changes  Packets    Packets    Errors  Errors  Connects To
.
.
.
14    Up     2905     3431926    2157       0       121     LC15
.
.
.
The previous output shows that port 14 is connected to LC15. Enter the same CLI command for location 0/15/CPU0.
RP/0/RP0/CPU0:C8K#show controllers switch statistics location 0/15/CPU0
.
.
.
                                            Tx      Rx
      Phys   State    Tx         Rx         Drops/  Drops/
Port  State  Changes  Packets    Packets    Errors  Errors  Connects To
0     Up     3154     1787       4266       0       0       RP0
.
.
.
The end-to-end connection for the link in question is between 0/RP0/CPU0 CE switch port 14 and 0/15/CPU0 CE switch port 0. In this example, some Rx errors are seen on 0/RP0/CPU0, and both sides show a large number of state changes.
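The same reading of the two statistics tables can be sketched in code: parse each data row, then note the far end, the error counters, and the number of state changes. The column order is taken from the output shown in this document; the type and function names and the flap threshold of 100 state changes are assumptions chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class PortStats(NamedTuple):
    port: int
    phys_state: str
    state_changes: int
    tx_packets: int
    rx_packets: int
    tx_errors: int   # "Tx Drops/Errors" column
    rx_errors: int   # "Rx Drops/Errors" column
    connects_to: str


def parse_port_stats(output: str) -> Dict[int, PortStats]:
    """Parse the data rows of the statistics table, keyed by port number."""
    ports: Dict[int, PortStats] = {}
    for line in output.splitlines():
        fields = line.split(None, 7)
        # Data rows start with a port number, then a state and five
        # counters. Header, separator, and "." lines are skipped.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        if not all(f.isdigit() for f in fields[2:7]):
            continue
        counters = [int(f) for f in fields[2:7]]
        connects_to = fields[7].strip() if len(fields) > 7 else ""
        ports[int(fields[0])] = PortStats(
            int(fields[0]), fields[1], *counters, connects_to
        )
    return ports


def findings(stats: PortStats, flap_threshold: int = 100) -> List[str]:
    """List what looks abnormal for one port (threshold is an assumption)."""
    notes = []
    if stats.phys_state.lower() != "up":
        notes.append("link is down")
    if stats.state_changes > flap_threshold:
        notes.append(f"{stats.state_changes} state changes")
    if stats.tx_errors:
        notes.append(f"{stats.tx_errors} Tx drops/errors")
    if stats.rx_errors:
        notes.append(f"{stats.rx_errors} Rx drops/errors")
    return notes


RP0_OUTPUT = """\
Port State Changes Packets Packets Errors Errors Connects To
14 Up 2905 3431926 2157 0 121 LC15
"""
LC15_OUTPUT = "0 Up 3154 1787 4266 0 0 RP0"

if __name__ == "__main__":
    for name, text in (("0/RP0/CPU0", RP0_OUTPUT), ("0/15/CPU0", LC15_OUTPUT)):
        for stats in parse_port_stats(text).values():
            print(name, stats.port, "->", stats.connects_to, findings(stats))
```

Splitting on whitespace rather than fixed column positions keeps the parser tolerant of differences in column alignment between platforms.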
Manually reset CE switch port 14 on 0/RP0/CPU0 and port 0 on 0/15/CPU0 with these CLI commands:
set controller switch port reset location 0/RP0/CPU0 port 14
set controller switch port reset location 0/15/CPU0 port 0
Reload modules using these CLI commands:
reload location 0/RP0
reload location 0/15
Tip: In order to reset the entire board, specify the location 0/15, not 0/15/CPU0.
Physically reseat the modules LC 0/15 and 0/RP0, also known as Online Insertion and Removal (OIR).
If all of these methods are exhausted, collect these show tech files and open a Service Request (SR) with the Cisco TAC:
show tech-support
show tech-support ctrace
show tech-support control-ethernet
Cisco ASR 9000 Series Aggregation Services Routers
Cisco ASR 9000 Series routers can run one of two types of Cisco IOS XR software: the 32-bit OS (cXR) or the 64-bit OS (eXR).
Example of an ASR 9000 Router Running eXR
0/2/ADMIN0:Jul 11 13:24:02.797 UTC: esd[3510]: %INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN : The physical link state of the control ethernet switch port 33 has changed. New Link state DOWN, Admin state: UP
The message indicates that port 33 on LC 0/2 went down.
The admin mode CLI command show controller switch reachable lists all the CE switches in the router along with their locations.
sysadmin-vm:0_RP0#show controller switch reachable
Tue Nov 21 17:57:09.691 UTC+00:00
Rack Card Switch
--------------------
0 RP0 RP-SW
0 RP0 RP-SW1
0 RP1 RP-SW
0 RP1 RP-SW1
0 LC0 LC-SW
0 LC2 LC-SW
0 LC6 LC-SW
0 LC9 LC-SW
0 LC10 LC-SW
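Each row of this output maps to a location string of the form rack/card/switch, which is the form the other admin mode commands in this document take (for example, 0/LC2/LC-SW). A minimal sketch of that mapping, with a hypothetical function name, could look like this:

```python
from typing import List

REACHABLE_OUTPUT = """\
Rack Card Switch
--------------------
0 RP0 RP-SW
0 RP0 RP-SW1
0 RP1 RP-SW
0 RP1 RP-SW1
0 LC0 LC-SW
0 LC2 LC-SW
0 LC6 LC-SW
0 LC9 LC-SW
0 LC10 LC-SW
"""


def switch_locations(output: str) -> List[str]:
    """Turn 'rack card switch' rows into 'rack/card/switch' locations."""
    locations = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly three columns and start with the rack
        # number; the header, separator, and timestamp lines do not.
        if len(fields) == 3 and fields[0].isdigit():
            locations.append("/".join(fields))
    return locations


if __name__ == "__main__":
    for location in switch_locations(REACHABLE_OUTPUT):
        print(location)
```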
The admin mode CLI command show controller switch summary location shows the port number, the physical state, the admin state, the port speed, and what the port connects to. Typically, the port is in forwarding mode if the physical state is up. If the physical state is down and the admin state is up, the other end has not brought the link up.
sysadmin-vm:0_RP0#show controller switch summary location 0/LC2/LC-SW
Tue Nov 21 17:57:41.265 UTC+00:00
Rack Card Switch Rack Serial Number
--------------------------------------
0 LC2 LC-SW
      Phys   Admin  Port      Protocol  Forward  Connects
Port  State  State  Speed     State     State    To
---------------------------------------------------------------
.
.
.
33    Down   Up     10-Gbps   -         -        NP3
.
.
.
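The interpretation of the physical and admin states in the summary output can be sketched as a small classifier. The first two outcomes follow the explanation in this section; the third (both states down means the port was shut administratively) is an assumption, and the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class PortSummary(NamedTuple):
    port: int
    phys_state: str
    admin_state: str
    speed: str
    connects_to: str


def parse_summary_row(line: str) -> Optional[PortSummary]:
    """Parse one data row of the summary table, or return None."""
    fields = line.split(None, 6)
    if len(fields) < 7 or not fields[0].isdigit():
        return None
    port, phys, admin, speed, _protocol, _forward, connects_to = fields
    return PortSummary(int(port), phys, admin, speed, connects_to.strip())


def interpret(row: PortSummary) -> str:
    """Classify a port from its physical and admin states."""
    phys = row.phys_state.lower()
    admin = row.admin_state.lower()
    if phys == "up":
        return "link is up"
    if admin == "up":
        return "enabled locally but link is down: check the far end"
    # Assumption: both states down means the port was shut on purpose.
    return "administratively down"


if __name__ == "__main__":
    row = parse_summary_row("33 Down Up 10-Gbps - - NP3")
    if row is not None:
        print(row.port, row.connects_to, "-", interpret(row))
```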
In order to see the port statistics, use the admin mode CLI command show controller switch statistics location. This CLI command shows the number of times the link state has changed, the total RX packets, the total TX packets, the RX dropped packets, and the TX dropped packets.
Tip: In order to see the detailed statistics for a port, use the admin mode CLI command show controllers switch statistics detail location <loc> <port>.
In this case, port 33 on LC 0/2 is connected to NP3 on the same module.
Manually reset the port if this method is available on the platform:
controller switch port-state location 0/LC2/LC-SW 33 down
controller switch port-state location 0/LC2/LC-SW 33 up
Fully reload the module in admin mode with the CLI command reload location 0/2 all.
Physically reseat (OIR) the module 0/2/CPU0.
Note: For the module 0/0/CPU0 on the ASR 9903 platform, a power cycle of the whole chassis is needed because it is a fixed module.
If all previous methods are exhausted, collect these show tech files and open an SR with the Cisco TAC:
show tech-support
show tech-support ethernet controllers
show tech-support ctrace
admin show tech-support control-ethernet
Example of an ASR 9000 Router Running cXR
0/1/ADMIN0:Oct 1 21:31:03.806 : esd[3347]: %INFRA-ESD-6-PORT_STATE_CHANGE_LINK_DOWN : The physical link state of the control ethernet switch port 51 has changed. New Link state DOWN, Admin state: UP
In this example, port 51 went down on LC 0/1.
The CLI command show controllers epm-switch port-mapping location shows the port connection and status.
RP/0/RSP0/CPU0:A9K-cXR#show controllers epm-switch port-mapping location 0/1/CPU0
Tue Nov 21 17:13:07.206 UTC
Port | Link Status | Vlan | Connected to
------------|-----------------|---------------|---------------
.
.
.
51 | Down | VLAN_EOBC_1 | RSP_1_0
.
.
.
Port 51 is connected to RSP1. Enter the same CLI command for the other end, 0/RSP1/CPU0.
RP/0/RSP0/CPU0:A9K-cXR#show controllers epm-switch port-mapping location 0/RSP1/CPU0
Tue Nov 21 17:13:08.206 UTC
Port | Link Status | Vlan | Connected to
------------|-----------------|---------------|---------------
.
.
.
40 | Down | VLAN_EOBC_0 | LC_EOBC_1_0
.
.
.
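Because the cXR port-mapping output is pipe-delimited, it is straightforward to parse when the two ends of a link have to be compared. The sketch below uses the rows shown in this document; the type and function names are hypothetical.

```python
from typing import Dict, NamedTuple


class PortMapping(NamedTuple):
    link_status: str
    vlan: str
    connected_to: str


def parse_port_mapping(output: str) -> Dict[int, PortMapping]:
    """Parse 'Port | Link Status | Vlan | Connected to' rows by port."""
    mapping: Dict[int, PortMapping] = {}
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows have four cells and start with the port number;
        # the header and separator lines do not.
        if len(cells) != 4 or not cells[0].isdigit():
            continue
        mapping[int(cells[0])] = PortMapping(*cells[1:])
    return mapping


LC1_OUTPUT = """\
Port | Link Status | Vlan | Connected to
------------|-----------------|---------------|---------------
51 | Down | VLAN_EOBC_1 | RSP_1_0
"""
RSP1_OUTPUT = "40 | Down | VLAN_EOBC_0 | LC_EOBC_1_0"

if __name__ == "__main__":
    for name, text in (("0/1/CPU0", LC1_OUTPUT), ("0/RSP1/CPU0", RSP1_OUTPUT)):
        for port, entry in parse_port_mapping(text).items():
            print(name, port, entry.link_status, "->", entry.connected_to)
```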
The CLI command show controllers epm-switch mac-stats <port> location shows detailed traffic statistics for the port.
RP/0/RSP0/CPU0:A9K-cXR#show controllers epm-switch mac-stats 51 location 0/1/CPU0
Tue Nov 21 17:15:07.206 UTC
Port MAC counters : port 51
Good Packets Rcv = 302005552 | Good Bytes Rcv = 72995992385
Good Packets Sent = 229201631 | Good Bytes Sent = 62405266641
Bad Packets Rcv = 0 | Bad Bytes Rcv = 0
Unicast Packets Rcv = 192484322 | Unicast Packets Sent = 220568253
Broadcast Packets Rcv = 0 | Broadcast Packets Sent = 1
Multicast Packets Rcv = 109521230 | Multicast Packets Sent = 8633377
0-64 bytes Packets = 31
65-127 bytes Packets = 306484671
128-255 bytes Packets = 110661438
256-511 bytes Packets = 56302837
512-1023 bytes Packets = 15340912
1024-max bytes Packets = 42417294
Mac Transmit Errors = 0
Excessive Collisions = 0
Unrecognized MAC Cntr Rcv = 0
Flow Control Sent = 0
Good Flow Control Rcv = 0
Drop Events = 0
Undersize Packets Rcv = 0
Fragmented Packets = 0
Oversized Packets = 0
Jabber Packets = 0
MAC Receive Error = 0
Bad CRC = 0
Collisions = 0
Late Collisions = 0
Bad Flow Control Rcv = 0
Multiple Packets Sent = 0
Deferred Packets Sent = 0
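With this many counters, it can help to pull out only the error-related ones that are non-zero. The sketch below parses the name = value pairs from the output above. Which counters count as errors is an assumption made for this example (the ERROR_COUNTERS tuple), and the function names are hypothetical.

```python
from typing import Dict

# Assumption: counter names treated as error indicators in this sketch.
ERROR_COUNTERS = (
    "Bad Packets Rcv", "Bad Bytes Rcv", "Mac Transmit Errors",
    "Excessive Collisions", "Drop Events", "Undersize Packets Rcv",
    "Fragmented Packets", "Oversized Packets", "Jabber Packets",
    "MAC Receive Error", "Bad CRC", "Collisions", "Late Collisions",
    "Bad Flow Control Rcv",
)

# Excerpt of the output shown in this document.
MAC_STATS_OUTPUT = """\
Port MAC counters : port 51
Good Packets Rcv = 302005552 | Good Bytes Rcv = 72995992385
Bad Packets Rcv = 0 | Bad Bytes Rcv = 0
Mac Transmit Errors = 0
MAC Receive Error = 0
Bad CRC = 0
"""


def parse_mac_stats(output: str) -> Dict[str, int]:
    """Parse 'name = value' pairs; a line may hold two pairs split by '|'."""
    counters: Dict[str, int] = {}
    for line in output.splitlines():
        for chunk in line.split("|"):
            name, sep, value = chunk.partition("=")
            if sep and value.strip().isdigit():
                counters[name.strip()] = int(value.strip())
    return counters


def nonzero_errors(counters: Dict[str, int]) -> Dict[str, int]:
    """Return only the error counters that are greater than zero."""
    return {
        name: counters[name]
        for name in ERROR_COUNTERS
        if counters.get(name, 0) > 0
    }


if __name__ == "__main__":
    stats = parse_mac_stats(MAC_STATS_OUTPUT)
    print(nonzero_errors(stats) or "no error counters incremented")
```

For the output in this example, every error counter is zero, so the MAC statistics alone do not explain why the link is down.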
Fully reload the module from admin mode with the hw-module location 0/1/CPU0 reload command.
Physically reseat (OIR) the module LC 0/1/CPU0.
If all of these methods are exhausted, collect these show tech files and open an SR with the Cisco TAC:
show tech-support
show tech-support ethernet controllers
admin show tech-support control-ethernet
Cisco NCS 5500 Series
0/2/ADMIN0:Aug 3 10:37:14.791 HKT: esd[3440]: %INFRA-ESD-6-PORT_STATE_CHANGE_ADMIN_DOWN : The admin state of the control ethernet switch port 18 has changed. New Admin state: DOWN, Link state DOWN
The message is from LC 0/2 and shows that its CE switch port 18 went down.
The admin mode CLI command show controller switch reachable lists all the CE switches in the router along with their locations.
Note: All CE switch CLI commands for the NCS 5500 platform are admin mode commands.
sysadmin-vm:0_RP0# show controller switch reachable
Wed Nov 8 16:39:00.502 UTC+00:00
Rack Card Switch
---------------------
0 SC0 SC-SW
0 SC0 EPC-SW
0 SC0 EOBC-SW
0 SC1 SC-SW
0 SC1 EPC-SW
0 SC1 EOBC-SW
0 LC0 LC-SW
0 LC2 LC-SW
0 LC5 LC-SW
0 LC7 LC-SW
0 FC1 FC-SW
0 FC2 FC-SW
0 FC3 FC-SW
0 FC4 FC-SW
0 FC5 FC-SW
Enter the admin mode CLI command show controller switch statistics location 0/LC2/LC-SW in order to check the port statistics and connection mapping.
sysadmin-vm:0_RP0# show controller switch statistics location 0/LC2/LC-SW
Tue Aug 4 11:12:47.199 UTC+00:00
Rack Card Switch Rack Serial Number
--------------------------------------
0 LC2 LC-SW
                                            Tx      Rx
      Phys   State    Tx         Rx         Drops/  Drops/
Port  State  Changes  Packets    Packets    Errors  Errors  Connects To
---------------------------------------------------------------------------
.
.
.
18    Down   97       236972058  272457269  128     0       SC0 EOBC-SW
.
.
.
Tip: The admin mode CLI command show controller switch statistics detail location 0/LC2/LC-SW 18 shows more details for the specific port.
The previous output shows that port 18 is connected to 0/SC0/EOBC-SW. Now enter the same CLI command for the location 0/SC0/EOBC-SW.
sysadmin-vm:0_RP0# show controller switch statistics location 0/SC0/EOBC-SW
Rack Card Switch Rack Serial Number
--------------------------------------
0 SC0 EOBC-SW
                                            Tx      Rx
      Phys   State                          Drops/  Drops/
Port  State  Changes  Tx Packets Rx Packets Errors  Errors  Connects To
---------------------------------------------------------------------------
.
.
.
13    Up     113      722686694  706445299  0       0       LC2
.
.
.
The full connection for the error message is therefore between 0/LC2/LC-SW CE port 18 and 0/SC0/EOBC-SW port 13.
Manually reset the ports:
controller switch port-state location 0/LC2/LC-SW 18 down
controller switch port-state location 0/LC2/LC-SW 18 up
controller switch port-state location 0/SC0/EOBC-SW 13 down
controller switch port-state location 0/SC0/EOBC-SW 13 up
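When both ends of a link have to be bounced, the down/up command pairs can be generated from the endpoints identified earlier. The sketch below only builds the command strings in the syntax shown above and does not send anything to a router; the function name is hypothetical.

```python
from typing import List, Tuple


def port_bounce_commands(endpoints: List[Tuple[str, int]]) -> List[str]:
    """Build the down/up command pair for each (location, port) endpoint."""
    commands = []
    for location, port in endpoints:
        for state in ("down", "up"):
            commands.append(
                f"controller switch port-state location {location} {port} {state}"
            )
    return commands


if __name__ == "__main__":
    link = [("0/LC2/LC-SW", 18), ("0/SC0/EOBC-SW", 13)]
    for command in port_bounce_commands(link):
        print(command)
```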
Fully reload the modules in admin mode:
hw-module loc 0/2 reload
hw-module loc 0/SC0 reload
Tip: Do not use the exec mode CLI command reload location force, because it does not reset the onboard CE switch.
Physically reseat the modules.
If all of these methods are exhausted, collect these show tech files and open an SR with the Cisco TAC:
admin show tech card-mgr
admin show tech os
admin show tech-support control-ethernet
admin show tech ctrace
admin show tech shelf-mgr