THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Software Product | Affected Release | Affected Release Number | Comments |
---|---|---|---|
NX-OS System Software-ACI | 15 | 15.2(7f), 15.2(7g) | |
NX-OS System Software-ACI | 16 | 16.0(1g), 16.0(1j) | |
Defect ID | Headline |
CSCwd83293 | Switch is getting reset with the fatal error during longevity test. |
For affected releases of Cisco Application Policy Infrastructure Controller (APIC) Software, any update of the telemetryStatsServerP managed object on a Cisco Application Centric Infrastructure (ACI) switch can result in a process crash, causing the switch to reboot.
Cisco Intersight Device Connector (DC) processes were added to Cisco NX-OS Software in ACI mode starting with releases 15.2(7f) and 16.0(1g).
Cisco ACI switches that are running software releases that are affected by CSCwd83293 can crash if the switch-based Intersight DC processes are not fully spawned when receiving a policy update from APIC. Changing the Nexus Dashboard for integrated ACI sites, enabling the Nexus Dashboard Insight application, or disabling the Nexus Dashboard Insight application could cause the software to attempt to update these processes, which could result in a switch crash.
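A switch that is running an affected release reloads with a process core after one of the operations named above is performed: changing the Nexus Dashboard for integrated ACI sites, enabling the Nexus Dashboard Insight application, or disabling the Nexus Dashboard Insight application.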
The process core can be on any of these Cisco Intersight DC processes: dcgrpc, dc_nae, or dc.
The crash is the result of an update to one of the following managed objects within Cisco ACI: telemetryStatsServerP or intersightDeviceConnectorInst.
Contact Cisco Technical Assistance Center (TAC) to perform the workaround, which requires root access.
Open a case with Cisco TAC using the Cisco Support Case Manager.
Note: The workaround is NOT persistent. If the workaround is applied to a Cisco ACI switch and the switch reloads for any reason at a later date, the workaround will need to be re-applied.
Upgrading the Cisco ACI fabric to a software release that contains the fix for the defect is strongly recommended.
Use the following steps to identify affected products.
1. Use the show version command to verify that the switch is running an affected release of the software.
Switch# show version
Software
  BIOS: version 07.69
  kickstart: version 15.2(7g) [build 15.2(7g)]
  system: version 15.2(7g) [build 15.2(7g)]
  PE: version 5.2(7g)
  BIOS compile time: 04/07/2021
  kickstart image file is: /bootflash/aci-n9000-dk9.15.2.7g.bin
  kickstart compile time: 08/01/2024 16:26:55 [08/01/2024 16:26:55]
  system image file is: /bootflash/auto-s
  system compile time: 08/01/2024 16:26:55 [08/01/2024 16:26:55]
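When many switches need to be checked, the comparison in step 1 can be scripted. The following Python sketch is illustrative only: the function names are hypothetical, the release list is copied from the Affected Release Number column of this notice, and collecting the show version output from each switch is left to the reader.

```python
import re

# Affected releases, copied from the table at the top of this field notice.
AFFECTED_RELEASES = {"15.2(7f)", "15.2(7g)", "16.0(1g)", "16.0(1j)"}


def system_version(show_version_output):
    """Return the release on the 'system: version' line, or None if absent."""
    match = re.search(r"system:\s+version\s+(\S+)", show_version_output)
    return match.group(1) if match else None


def is_affected(show_version_output):
    """True if the reported system release is listed as affected."""
    return system_version(show_version_output) in AFFECTED_RELEASES
```

The check keys on the system line rather than the kickstart line; this is an assumption made for the sketch, and in the sample output above both lines report the same release.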
2. Check the switch system reset reason for "Reset Requested due to Fatal System Error."
Switch# vsh_lc -c "show logging onboard internal reset-reason"
Last log in OBFL was written at time Thu Jul 13 22:52:44 2023
Reset Reason for this card:
Image Version : 15.2(7g)
Reset Reason (LCM): Unknown (0) at time Thu Jul 13 20:52:23 2023
Reset Reason (SW): Reset Requested due to Fatal System Error (3) at time Thu Jul 13 20:45:47 2023
Service (Additional Info): Reset Requested due to Fatal System Error
Reset Reason (HW): Reset Requested due to Fatal System Error (3) at time Thu Jul 13 20:52:23 2023
Reset Cause (HW): 0x01 at time Thu Jul 13 20:52:23 2023
Reset internal (HW): 0x00 at time Thu Jul 13 20:52:23 2023
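The reset-reason check in step 2 can be sketched in Python as well. The function names below are hypothetical, and the sketch assumes that the captured output has one entry per line and that the software reset entry begins with "Reset Reason (SW):", as in the sample output above.

```python
FATAL_REASON = "Reset Requested due to Fatal System Error"


def software_reset_reasons(obfl_output):
    """Collect the text that follows each 'Reset Reason (SW):' label."""
    reasons = []
    for line in obfl_output.splitlines():
        line = line.strip()
        if line.startswith("Reset Reason (SW):"):
            # Split on the first colon only; the timestamp also contains colons.
            reasons.append(line.split(":", 1)[1].strip())
    return reasons


def had_fatal_reset(obfl_output):
    """True if any software reset entry reports the fatal system error."""
    return any(
        reason.startswith(FATAL_REASON)
        for reason in software_reset_reasons(obfl_output)
    )
```

A match here only narrows the search. The same reset reason can have other causes, so the core file and process state checks in the following steps still apply.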
3. Verify that the core file is for one of the following processes: dcgrpc, dc_nae, or dc.
Switch# show cores
VDC  Module  Instance  Process-name     PID       Date(Year-Month-Day Time)
---  ------  --------  ---------------  --------  -------------------------
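The process-name comparison in step 3 could be scripted along the following lines. This is a sketch under assumptions: the function name is hypothetical, and because this notice shows only the column headers, the code assumes that each data row is whitespace-separated, begins with a numeric VDC value, and carries the process name in the fourth column.

```python
# Cisco Intersight DC processes named in this field notice.
DC_PROCESSES = {"dcgrpc", "dc_nae", "dc"}


def dc_core_processes(show_cores_output):
    """Return Device Connector process names found in the Process-name column."""
    found = []
    for line in show_cores_output.splitlines():
        fields = line.split()
        # Assumed row layout: VDC, Module, Instance, Process-name, PID, Date.
        # Requiring a numeric first field skips the prompt, header, and dashes.
        if len(fields) >= 4 and fields[0].isdigit() and fields[3] in DC_PROCESSES:
            found.append(fields[3])
    return found
```

An empty result means that no core file for the listed processes was found in the parsed output. Verify the row layout against real output before relying on the result.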
4. Verify that the intersightProcessState managed object on the ACI switch has device connector processes in DUMMY state.
Switch# moquery -c intersightProcessState
# intersight.ProcessState
DCGrpcProcessStatus : ACTIVE
DCGrpcServerState : DUMMY <<<<<<<<<<
DCGrpcVersion : dcgrpc-1.0.2-1.x86_64.rpm
DCProcessStatus : ACTIVE
DCState : DUMMY <<<<<<<<<<
DCVersion : nxos-connector-1.0.11.438-1.el7.x86_64
NaeProcessStatus : ACTIVE
NaeState : DUMMY <<<<<<<<<<
NaeVersion : nae-6.2.0.380-1.x86_64.rpm
TinyProxyProcessStatus : ACTIVE
TinyProxyState : DUMMY <<<<<<<<<<
TinyProxyVersion : tinyproxy-1.0.1-1.x86_64.rpm
A process in DUMMY state on an affected software release is not fully initialized and is susceptible to the defect conditions.
Note: If the ACI switch is running a software release that has the fix for the defect, a DUMMY state for any of the services is not cause for concern.
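The DUMMY state check in step 4 can be sketched as a small parser over the moquery output. The function name is hypothetical, and the sketch assumes one "name : value" pair per line, which is how moquery normally prints attributes.

```python
import re


def dummy_states(moquery_output):
    """Return the names of attributes whose value is DUMMY."""
    # Capture 'name : value' at the start of each line. Only the first token
    # of the value is kept, so trailing annotations are ignored.
    pairs = re.findall(r"^\s*(\w+)\s*:\s*(\S+)", moquery_output, flags=re.MULTILINE)
    return [name for name, value in pairs if value == "DUMMY"]
```

Consistent with the note above, a non-empty result matters only when the switch is also running one of the affected releases.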
Version | Description | Section | Date |
1.0 | Initial Release | — | 2024-OCT-11 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC).
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.