THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
2.1 | 16-Nov-22 | Updated the Upgrade Program to Use Support Case Manager (SCM) |
2.0 | 28-Oct-22 | Updated the Products Affected and Background Sections |
1.1 | 10-May-22 | Updated to correct 64G memory PID |
1.0 | 05-May-22 | Initial Release |
Affected Product ID | Comments |
---|---|
CNBR-MR-X32G2RT-H | |
UCS-ML-X64G4RS-H | |
CSP-MR-X32G2RS-H | |
CIT3-MR-X16G1RS-H | |
ULTM-MR-X32G2RS-H | |
UCS-MR-X32G2RT-H= | |
BE7K-RAM | |
CSP-MR-X16G1RS-H | |
UCS-MR-X32G2RS-H= | |
BE6K-RAM-M5-NEW | |
CSP-MR-X16G1RT-H | |
CSP-MR-X32G2RT-H | |
BE7K-RAM-M5-NEW | |
UCS-ML-X64G4RT-H= | |
BE6K-RAM | |
UCS-MR-X16G1RS-H= | |
UCS-MR-X16G1RT-H= | |
HX-ML-X64G4RS-H= | |
HX-MR-X32G2RS-H= | |
HX-MR-X32G2RT-H= | |
HX-MR-X16G1RT-H= | |
HX-ML-X64G4RT-H= | |
HX-MR-X16G1RS-H= | |
HX-MR-X16G1RT-H= | |
UCS-MR-X64G2RT-H= | |
UCS-ML-128G4RT-H= | |
UCS-ML-128G4RT-H | |
Defect ID | Headline |
---|---|
CSCwb13808 | DIMMs from specific MFG failing at higher than expected rate |
A limited number of DIMMs shipped from Cisco are impacted by a known deviation in the memory supplier's manufacturing process. This deviation might result in a higher rate of failure.
DIMM manufacturers compose their DIMMs of multiple memory modules to reach the desired capacity. A 16GB DIMM might be composed of the same modules that a 32GB DIMM is composed of. In this case, a manufacturing deviation in specific modules impacts 16GB, 32GB, 64GB, and 128GB DIMMs. This deviation was contained to a specific date range, and the DIMMs which use these chips were manufactured during the middle to end of 2020. Since the discovery of this deviation, additional limits have been imposed on the manufacturing process to ensure that future DIMMs are not exposed to this process variation.
Most DIMMs with this manufacturing deviation will exhibit persistent correctable memory errors. If left untreated, the DIMMs might eventually encounter an uncorrectable memory event. If encountered during runtime, uncorrectable errors will cause a sudden unexpected server reset. If encountered during Power-On Self-Test (POST), the DIMM will be mapped out and the total available memory reduced. In some cases a boot error might be seen.
Various DIMM Reliability, Availability, and Serviceability (RAS) features or even operating system features might mask the extent of these correctable errors. It is recommended to check your DIMMs for exposure using the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. Only specific DIMMs are impacted by this issue, so do not rely solely on the DIMM error count to judge exposure.
This is a hardware failure. A replacement is strongly recommended in order to avoid potential for unexpected server failure.
A replacement DIMM placed in the same slot as a previously failed DIMM might not immediately show as healthy. If a DIMM does not come up healthy on the first boot after the replacement process, verify the physical DIMM seating. Seating is the most common cause for immediate DIMM errors after replacement.
Cisco recommends running memory diagnostics before placing servers into production in order to mitigate early runtime errors. For more details, see the Testing memory section of Cisco UCS HX M5 Memory Technical Overview - Memory RAS Features.
Impacted DIMMs can be identified by their serial numbers. Once you have collected your DIMM serial numbers, enter them into the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. The methods below can be used against any Cisco Unified Computing System (UCS) or HyperFlex server that is accessible through a management utility (Cisco Integrated Management Controller (IMC) or UCS Manager).
Note: The manufacturer's serial numbers are 18 alphanumeric characters long. Cisco UCS Manager output will truncate this to the last eight characters. This truncated serial number is unique and sufficient to identify an impacted DIMM. If you have trouble retrieving your serial number, there are other methods available to Cisco. Reach out to your account team or the Technical Assistance Center (TAC) for further instructions.
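The truncation described in the note above can be illustrated with a minimal Python sketch. The helper names `short_serial` and `same_dimm` and the 18-character sample value are hypothetical and exist only for illustration; the sketch assumes that comparing the last eight characters, ignoring case and surrounding whitespace, is sufficient to match a full manufacturer serial number against UCS Manager output.

```python
def short_serial(serial: str) -> str:
    """Return the last eight characters of a DIMM serial number, upper-cased.

    Assumption: surrounding whitespace and letter case are not significant.
    """
    cleaned = serial.strip().upper()
    return cleaned[-8:]


def same_dimm(full_serial: str, truncated_serial: str) -> bool:
    """Compare a full manufacturer serial with a truncated UCS Manager serial."""
    return short_serial(full_serial) == short_serial(truncated_serial)


# Hypothetical 18-character serial whose last eight characters match a
# truncated value as it would appear in UCS Manager output.
full = "ABCDEFGHIJ18ED63ED"
print(short_serial(full), same_dimm(full, "18ed63ed"))
```

Normalizing both values through the same function keeps the comparison symmetric, so it does not matter which tool produced which form of the serial number.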
UCS Manager CLI (Simplest Method)
Use SSH to connect to your UCS Manager CLI and enter the show server inventory memory detail | egrep "^Server|Serial" command. The Vendor Serial (SN) field is the serial number of your DIMM(s) and can be entered into the Serial Number Validation Tool. Note that unpopulated memory slots will show as blank.
    FI-B# show server inventory memory detail | egrep "^Server|Serial"
    Server 1/1:
    Equipped Serial (SN): FCH185071HQ
    Acknowledged Serial (SN): FCH185071HQ
    Serial (SN): FCH185071HQ
    Serial (SN):
    Vendor Serial (SN): 18ED63ED
    Vendor Serial (SN): 18ED63EC
    Vendor Serial (SN):
    Vendor Serial (SN): 18ED6F62
    Vendor Serial (SN): 18ED63EE
    Vendor Serial (SN):
    Vendor Serial (SN): 18F0457C
    Vendor Serial (SN): 18ED6E94
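If you save the command output to a file, a short script can collect the populated Vendor Serial (SN) values for the Serial Number Validation Tool. The following Python sketch is one possible approach, not a Cisco-provided utility. It assumes the saved output has one field per line, as printed by the CLI, and the function name `vendor_serials` is hypothetical.

```python
import re

# Matches lines such as "Vendor Serial (SN): 18ED63ED"; the captured value
# is empty for unpopulated slots.
VENDOR_SERIAL = re.compile(r"^\s*Vendor Serial \(SN\):\s*(\S*)\s*$")


def vendor_serials(cli_output: str) -> list[str]:
    """Return the non-blank Vendor Serial (SN) values, in order of appearance."""
    serials = []
    for line in cli_output.splitlines():
        match = VENDOR_SERIAL.match(line)
        if match and match.group(1):
            serials.append(match.group(1))
    return serials


# Excerpt of the sample output shown in this field notice.
sample = """Server 1/1:
Equipped Serial (SN): FCH185071HQ
Vendor Serial (SN): 18ED63ED
Vendor Serial (SN): 18ED63EC
Vendor Serial (SN):
Vendor Serial (SN): 18ED6F62
"""
print(vendor_serials(sample))
```

The pattern is anchored on the "Vendor Serial" label so that the server-level Serial (SN) lines, which are not DIMM serial numbers, are ignored.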
Cisco IMC CLI
Log into the Cisco IMC via SSH and enter these commands.
    C220-FCHXXXXXXXX# scope chassis
    C220-FCHXXXXXXXX /chassis # show dimm detail | grep Serial
    Serial Number: 80BA3892
    Serial Number: NA
    Serial Number: NA
    Serial Number: 80BA3863
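The Cisco IMC output can be filtered in a similar way. The Python sketch below assumes the output was saved with one "Serial Number:" field per line and that a value of NA marks a slot with no serial number to report, which is an inference from the sample output rather than a documented rule. The function name `imc_serials` is hypothetical.

```python
def imc_serials(cli_output: str) -> list[str]:
    """Return DIMM serial numbers from 'show dimm detail | grep Serial' output.

    Assumption: 'NA' (or a blank value) means the slot has no serial to report.
    """
    serials = []
    for line in cli_output.splitlines():
        label, sep, value = line.partition(":")
        # Lines without a colon (for example the CLI prompt) are skipped.
        if sep and label.strip() == "Serial Number":
            value = value.strip()
            if value and value != "NA":
                serials.append(value)
    return serials


# Sample output as shown in this field notice.
sample = """C220-FCHXXXXXXXX /chassis # show dimm detail | grep Serial
Serial Number: 80BA3892
Serial Number: NA
Serial Number: NA
Serial Number: 80BA3863
"""
print(imc_serials(sample))
```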
Intersight - Managed Object Browser or API Browser (Preferred Method)
You can use the Managed Object Browser (MOB), which is a developer tool, to retrieve and then export the DIMM serial numbers and their location. Open a web browser and log into your Intersight account. Then, open a new tab in the same browser and open the Intersight Developer Center.
Note: For users who have an Intersight appliance, use:
https://[FQDN of appliance]/mobrowser/list/memory/Unit.
Add a search attribute of "Serial" with the value "ne ''" (this filters for "Not Equal to NULL") to filter out empty slots. You can then export the results for use with the Serial Number Validation Tool.
Alternatively, you can use the Intersight API REST Client to generate and send a query which will return your DIMM SNs and location. After you log into Intersight, open a new tab and open the Intersight Developer Center API Reference.
Note: For users who have an Intersight appliance, use:
https://[fqdn of appliance]/apidocs/apirefs/api/v1/memory/Units/get/.
Add the following Query Parameters as Key/Value pairs: "$select" with the value "Serial,Dn,Location,RegisteredDevice"; "$filter" with the value "Serial ne ''"; "$expand" with the value "RegisteredDevice($select=DeviceHostname)"; and "$top" with the value "1000". The resulting request filters out null values and returns a list of DIMMs with their serial numbers and locations for use with the Serial Number Validation Tool.
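For readers who script their queries instead of using the API Reference page, the same parameters can be assembled into a URL query string. The Python sketch below only builds and inspects the query string. It does not send a request, because Intersight API calls must be authenticated and that step is outside the scope of this field notice. The function name `dimm_query` is hypothetical, and appending the result to the memory/Units endpoint is an assumption based on the API Reference path given above.

```python
from urllib.parse import parse_qs, urlencode


def dimm_query(skip: int = 0, top: int = 1000) -> str:
    """Build the URL-encoded query string for the DIMM serial number request."""
    params = {
        "$select": "Serial,Dn,Location,RegisteredDevice",
        "$filter": "Serial ne ''",  # drop empty slots
        "$expand": "RegisteredDevice($select=DeviceHostname)",
        "$top": str(top),
    }
    if skip:
        # Only needed when paging past the first batch of results.
        params["$skip"] = str(skip)
    return urlencode(params)


query = dimm_query()
print(query)
# Decoding the string again shows the original key/value pairs.
print(parse_qs(query))
```

Using `urlencode` avoids hand-escaping the spaces, quotes, and dollar signs in the keys and values.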
Note: This request returns a maximum of 1000 DIMMs (the limit per API call). If your install base contains more than 1000 DIMMs, add a Key/Value pair of "$skip" "1000" (or 2000, 3000, and so on) and query again. If you are unsure whether you have more than 1000 DIMMs installed, add the "$inlinecount" "allpages" key/value pair to return the "count" of populated DIMMs. If you have difficulty querying your DIMM serial numbers, reach out to your account team or the Technical Assistance Center (TAC) for assistance.
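As a worked example of the paging arithmetic in the note above, the following Python sketch lists the "$skip" values needed to cover an install base of a given size. It assumes you already know the total from the "count" returned with "$inlinecount"; the function name `skip_offsets` is hypothetical.

```python
def skip_offsets(total: int, page_size: int = 1000) -> list[int]:
    """Return the $skip value for each query needed to cover `total` DIMMs."""
    if total < 0 or page_size <= 0:
        raise ValueError("total must be >= 0 and page_size must be > 0")
    return list(range(0, total, page_size))


# 2500 populated DIMMs need three queries: $skip of 0, 1000, and 2000.
print(skip_offsets(2500))
```

A "$skip" of 0 corresponds to the first query, which can be sent without a "$skip" parameter at all.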
Intersight (HTTP UI)
You can view the individual DIMM serial numbers in the Intersight UI. Navigate to the Server > Inventory page and expand the Memory list.
Intersight Advantage and Premier Customers
Intersight Advantage and Premier customers will be automatically alerted for impacted DIMMs. Review the Advisories tab of your Intersight account for FN72368.
Cisco provides a tool to verify whether a device is impacted by this issue. In order to check the device, enter the device's serial number in the Serial Number Validation Tool.
Note: For security reasons, you must click on the Serial Number Validation Tool link provided in this section to check the serial number for the device. Use of the Serial Number Validation Tool URL external to this field notice will fail.
Click on the following link to open Support Case Manager in a new tab:
https://mycase.cloudapps.cisco.com/fieldnotice?fn=FN72368
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC) by one of the following methods:
My Notifications—Set up a profile to receive email updates about reliability, safety, network security, and end-of-sale issues for the Cisco products you specify.