THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Product Name | Description | Comments |
---|---|---|
UCSX-9508-FAN | UCS 9508 Fan Module | |
UCSX-9508-FAN= | UCS 9508 Fan Module | |
Defect ID | Headline |
---|---|
CSCwd37309 | UCSX Chassis Fans logic enhancement |
Customers who have Cisco UCS X-Series systems may receive the following Intersight alert: [Chassis fan module] has a critical speed threshold condition.
The fault may occur and then clear on its own. If the fault clears on its own, no action is required. The faults do not impact chassis operation.
The chassis logs (obfl-full) may look like the following:
Go to XXXXXXXXX_intersight_XXXXX.tar\XXXXXX-IoCard-X.tar\XXXXXXXXX-IoCard-X.tar\techsupport_detailed_iocardX\cmc\log\
2022-10-25T02:28:06.963537+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20a | 10/25/2022 | 02:27:29 | Fan FAN2_FRONT_SPEED | Lower Critical going low | Reading 960 low < Threshold 1140 RPM | Asserted#012
2022-10-25T02:28:06.963622+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20b | 10/25/2022 | 02:27:29 | Fan FAN2_FRONT_SPEED | Lower Non-recoverable going low | Reading 960 low < Threshold 1080 RPM | Asserted#012
2022-10-25T02:28:06.963669+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20c | 10/25/2022 | 02:27:29 | Cooling Device FAN2_FRONT_FAIL | Predictive Failure Asserted#012
2022-10-25T02:30:11.172146+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20d | 10/25/2022 | 02:29:03 | Fan FAN2_FRONT_SPEED | Lower Non-recoverable going low | Reading 2880 low > Threshold 1080 RPM | Deasserted#012
2022-10-25T02:30:11.172294+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20e | 10/25/2022 | 02:29:03 | Fan FAN2_FRONT_SPEED | Lower Critical going low | Reading 2880 low > Threshold 1140 RPM | Deasserted#012
2022-10-25T02:30:11.172365+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20f | 10/25/2022 | 02:29:03 | Cooling Device FAN2_FRONT_FAIL | Predictive Failure Deasserted#012
The fans may stop spinning for about 90 seconds before restarting themselves. In some cases, the recovery can take longer.
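The assert and deassert entries in the log excerpt above can be paired to estimate how long a fan stayed below threshold. The following is a minimal Python sketch, not a Cisco tool: the regular expression is inferred only from the sample lines in this notice, and the names `parse_sel`, `stoppage_durations`, and `SAMPLE` are hypothetical. Real log bundles may contain line formats that this sketch does not handle.

```python
import re
from datetime import datetime

# Pattern inferred from the sample lines in this notice (an assumption, not a
# documented format): "SEL: <id> | MM/DD/YYYY | HH:MM:SS | <sensor> | <details>".
SEL_RE = re.compile(
    r"SEL:\s*\w+\s*\|\s*(?P<date>\d{2}/\d{2}/\d{4})\s*\|\s*(?P<time>\d{2}:\d{2}:\d{2})"
    r"\s*\|\s*(?P<sensor>[^|]+?)\s*\|\s*(?P<rest>.*)$"
)


def parse_sel(line):
    """Return (sensor, event, state, timestamp), or None if the line does not match."""
    m = SEL_RE.search(line.strip())
    if m is None:
        return None
    rest = m["rest"].replace("#012", "").strip()  # drop the trailing "#012" marker
    for state in ("Deasserted", "Asserted"):
        if rest.endswith(state):
            break
    else:
        return None
    # The event name is the first "|"-separated field once the state is removed.
    event = rest[: -len(state)].split("|")[0].strip()
    when = datetime.strptime(f"{m['date']} {m['time']}", "%m/%d/%Y %H:%M:%S")
    return m["sensor"], event, state, when


def stoppage_durations(lines):
    """Pair each Asserted entry with the next matching Deasserted entry."""
    open_events = {}
    durations = []
    for line in lines:
        parsed = parse_sel(line)
        if parsed is None:
            continue
        sensor, event, state, when = parsed
        key = (sensor, event)
        if state == "Asserted":
            open_events[key] = when
        elif key in open_events:
            seconds = (when - open_events.pop(key)).total_seconds()
            durations.append((sensor, event, seconds))
    return durations


# Two lines copied verbatim from the excerpt in this notice.
SAMPLE = [
    "2022-10-25T02:28:06.963537+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20a | 10/25/2022 | 02:27:29 | Fan FAN2_FRONT_SPEED | Lower Critical going low | Reading 960 low < Threshold 1140 RPM | Asserted#012",
    "2022-10-25T02:30:11.172294+00:00 CMC NOCSN_cmc_manager-6-CMC OBFL:0:obfl_log_sels:SEL: 20e | 10/25/2022 | 02:29:03 | Fan FAN2_FRONT_SPEED | Lower Critical going low | Reading 2880 low > Threshold 1140 RPM | Deasserted#012",
]

if __name__ == "__main__":
    for sensor, event, seconds in stoppage_durations(SAMPLE):
        print(f"{sensor}: {event} lasted {seconds:.0f} s")
```

For the sample entries, the event is asserted at 02:27:29 and deasserted at 02:29:03, a span of 94 seconds, which is consistent with the roughly 90-second stoppage described above.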
This behavior is caused by an oversensitive internal controller that incorrectly detects an abnormal current, resulting in a rotor stoppage for approximately 90 seconds. This, in turn, forces all other fans to run at 100 percent until the failure is de-asserted, causing an unpleasant noise. This issue is related to low fan speed conditions only.
Customers may hear loud noise from Cisco UCS X-Series systems because some fans are running at 100 percent.
The chassis thermal log will indicate TACH, as shown in the following example. If the logs are collected when the fans are back to normal, the keyword TACH may not appear and speed will show as normal.
XXXXXXXXX_intersight_XXXXX.tar\XXXXXX-IoCard-X.tar\XXXXXXXXX-IoCard-X.tar\techsupport_detailed_iocardX\cmc\log\
fan[main/1].fault/read/req: 0/100/100 # OK
fan[main/2].fault/read/req: 2/100/100 # TACH
fan[main/3].fault/read/req: 0/100/100 # OK
fan[main/4].fault/read/req: 0/100/100 # OK
fan[ifm1/2].fault/read/req: 0/100/100 # OK
fan[ifm1/3].fault/read/req: 0/100/100 # OK
fan[ifm2/1].fault/read/req: 0/100/100 # OK
fan[ifm2/2].fault/read/req: 0/100/100 # OK
fan[ifm2/3].fault/read/req: 0/100/100 # OK
fan[fem1/1].fault/read/req: 0/100/100 # OK
fan[fem1/2].fault/read/req: 0/100/100 # OK
fan[fem1/3].fault/read/req: 0/100/100 # OK
fan[fem2/1].fault/read/req: 0/100/100 # OK
fan[fem2/2].fault/read/req: 0/100/100 # OK
fan[fem2/3].fault/read/req: 0/100/100 # OK
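When the thermal log is long, the entries that are not marked OK can be filtered out programmatically. The following Python sketch assumes the line layout shown in the example above (`fan[<name>].fault/read/req: <n>/<n>/<n> # <status>`); the function name `faulted_fans` is hypothetical, and the sketch does not interpret the numeric fields beyond reporting them.

```python
import re

# Layout inferred from the example lines in this notice (an assumption).
FAN_RE = re.compile(
    r"fan\[(?P<name>[^\]]+)\]\.fault/read/req:\s*"
    r"(?P<fault>\d+)/(?P<read>\d+)/(?P<req>\d+)\s*#\s*(?P<status>\S+)"
)


def faulted_fans(text):
    """Return (fan name, fault code, status) for every fan whose status is not OK."""
    return [
        (m["name"], int(m["fault"]), m["status"])
        for m in FAN_RE.finditer(text)
        if m["status"] != "OK"
    ]


if __name__ == "__main__":
    example = (
        "fan[main/1].fault/read/req: 0/100/100 # OK\n"
        "fan[main/2].fault/read/req: 2/100/100 # TACH\n"
        "fan[main/3].fault/read/req: 0/100/100 # OK\n"
    )
    print(faulted_fans(example))
```

As noted above, a log collected after the fan has recovered may show every entry as OK, so an empty result does not rule out an earlier occurrence.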
The chassis OBFL-CMC log will show the fan intermittently changing speed, and an alert will be generated, as shown in the following example:
Fan FAN2_FRONT_SPEED | Lower Critical going low | Reading 960 low < Threshold 1140 RPM | Asserted#012
Fan FAN2_FRONT_SPEED | Lower Non-recoverable going low | Reading 960 low < Threshold 1080 RPM | Asserted#012
Cooling Device FAN2_FRONT_FAIL | Predictive Failure Asserted#012
To fix the issue that is described in this field notice, upgrade to a fixed software release.
Cisco UCS Manager Software Release | First Fixed Release |
---|---|
4.2 | 4.2(2e) bundle, 4.2(3d) bundle |
4.3 | 4.3(2b) bundle |
Note: CSCwd37309 is fixed in Releases 4.2(2d) and 4.2(3b), but because of FN74007, customers should upgrade to Release 4.2(2e) or Release 4.2(3d).
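The table and note above can be turned into a simple check of an installed release string. The following Python sketch is illustrative only: the names `parse_release`, `RECOMMENDED_MINIMUM`, and `is_recommended` are hypothetical, and it assumes that patch letters within one maintenance release sort alphabetically. It covers only the release trains named in this notice and returns `None` for anything else, in which case the release notes should be consulted.

```python
import re

# Release strings such as "4.2(2e)": major.minor(maintenance + patch letter).
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z])\)$")

# Minimum recommended patch letter per train, taken from the table and note
# in this notice. Trains that are not listed here are deliberately not judged.
RECOMMENDED_MINIMUM = {
    (4, 2, 2): "e",
    (4, 2, 3): "d",
    (4, 3, 2): "b",
}


def parse_release(release):
    """Split a release string into (major, minor, maintenance, patch letter)."""
    m = RELEASE_RE.match(release.strip())
    if m is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, maintenance, patch = m.groups()
    return int(major), int(minor), int(maintenance), patch


def is_recommended(release):
    """True or False for trains listed in this notice, None for all others."""
    major, minor, maintenance, patch = parse_release(release)
    minimum = RECOMMENDED_MINIMUM.get((major, minor, maintenance))
    if minimum is None:
        return None
    # Assumption: patch letters order alphabetically within one train.
    return patch >= minimum


if __name__ == "__main__":
    for release in ("4.2(2d)", "4.2(2e)", "4.2(3b)", "4.2(3d)", "4.3(2b)"):
        print(release, is_recommended(release))
```

Releases 4.2(2d) and 4.2(3b) evaluate to `False` here because the sketch follows the FN74007 recommendation in the note rather than the first release that contains the CSCwd37309 fix.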
The software can be found at https://software.cisco.com/download/home/283612660/type/283655658/release/.
It is not necessary to replace the fan. If the upgrade does not resolve the issue, reseat the fan.
To verify the logs and events, collect the log bundle for the chassis.
Version | Description | Section | Date |
---|---|---|---|
1.0 | Initial Release | — | 2023-NOV-08 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC).
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.