THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 19-Nov-21 | Initial Release |
Affected OS Type | Affected Software Product | Affected Release | Affected Release Number | Comments |
---|---|---|---|---|
NON-IOS | DNA Center Software | 2 | 2.1.1.0, 2.1.1.3, 2.1.2.0, 2.1.2.3, 2.1.2.4, 2.1.2.5, 2.1.2.6, 2.1.2.7, 2.2.1.0, 2.2.2.0, 2.2.2.1, 2.2.2.3, 2.2.2.4, 2.2.2.5 | |
Defect ID | Headline |
---|---|
CSCvy83860 | InfluxDB crash specific to org.apache.kafka.common.metrics.consumer metrics cardinality Increase |
The Cisco DNA Center InfluxDB service might continuously crash during operations, after an upgrade, or after a fresh install of Cisco DNA Center. This crash does not impact any customer-facing production services, but it does impact the internal metrics collection used for diagnostics.
The InfluxDB service crash occurs due to a condition that allows the service's database size to grow larger than the maximum available memory. The service crashes and reports an "Out of Memory" error. Several normal operational activities might, over time, result in the database exceeding the maximum memory boundary. Older versions of Cisco DNA Center software also include a defect that can cause the InfluxDB database to grow significantly.
The InfluxDB service crashes with an exit code of 137 and reason "OOMKilled". In order to display the crash status, enter the magctl service status influxdb CLI command on the Cisco DNA Center console as shown in this example:
influxdb:
  Container ID: docker://367809f29e388598a36a08bf7d2cd14427eee2534d453424f4a7e4c3e96839a7
  Image: maglev-registry.maglev-system.svc.cluster.local:5000/influxdb:v1.4.3-1.7.1
  Image ID: docker-pullable://maglev-registry.maglev-system.svc.cluster.local:5000/influxdb@sha256:de261b61cbf1427ab854eeed528d037f199f7017cda479792f3d1b8dffac4dc2
  Ports: 8083/TCP, 8086/TCP, 8089/TCP
  Host Ports: 0/TCP, 0/TCP, 0/TCP
  State: Running
    Started: Wed, 14 Aug 2019 14:07:04 +0000
  Last State: Terminated
    Reason: OOMKilled
    Exit Code: 137
    Started: Wed, 14 Aug 2019 12:32:44 +0000
    Finished: Wed, 14 Aug 2019 14:07:02 +0000
  Ready: True
  Restart Count: 1813
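The check for the crash signature described above can be scripted. The following is a minimal sketch, not a Cisco-provided tool: the function name influxdb_oom_check is hypothetical, and it assumes the status output uses the "Reason:", "Exit Code:", and "Restart Count:" labels shown in the example. If the output contains several such lines, the last one wins.

```shell
# Hypothetical helper: reads `magctl service status influxdb` output on stdin
# and reports whether it shows the OOMKilled / exit code 137 signature.
# Exit code 137 is 128 + 9, that is, the process was ended by SIGKILL.
influxdb_oom_check() {
  awk '
    /Reason:/        { reason = $2 }
    /Exit Code:/     { code = $3 }
    /Restart Count:/ { restarts = $3 }
    END {
      if (reason == "OOMKilled" && code == 137) {
        printf "AFFECTED reason=%s exit_code=%s restarts=%s\n", reason, code, restarts
        exit 1
      }
      print "OK no OOMKilled termination found"
    }
  '
}
```

On the Cisco DNA Center console it could be used as `magctl service status influxdb | influxdb_oom_check`. The function returns a non-zero status when the signature is found, so it can also gate other commands.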
There is currently no user interface instrumentation available in affected versions of Cisco DNA Center to view the status of the InfluxDB database size or check if the service is near its memory limit. Customers are strongly encouraged to implement one of the solutions outlined in the Workaround/Solution section in order to avoid the problem.
The problem has been fixed in Cisco DNA Center Software Release 2.2.2.6 and later. In order to avoid the problem, customers can upgrade to one of the fixed releases.
There is also a workaround for customers who run Cisco DNA Center Releases 2.1.2.1 through 2.2.2.5 and are unable to upgrade immediately. The workaround requires installation of a script that is available for download on Cisco.com. Follow these instructions to download, install, and run the script:
Enter the curl CLI command in order to download the script file from this Cisco URL:
https://software.cisco.com/download/redirect?config=f8aaa84ad087351b9faace14b5b40eeb
Enter the scp CLI command to copy the downloaded CSCvy83860-Workaround.zip file onto the other two nodes:
scp -P2222 CSCvy83860-Workaround.zip maglev@<Node_IP_Address>:/data/tmp
Replace <Node_IP_Address> with the IP address of the Cisco DNA Center cluster node.
Extract the file with the unzip CLI command as shown in this example:
cd /data/tmp
mkdir -p CSCvy83860-Workaround
unzip CSCvy83860-Workaround.zip -d CSCvy83860-Workaround
Run the scripts with the sh CLI command as shown in this example:
cd CSCvy83860-Workaround
sh CSCvy83860-kafka-cleanup.sh
sh CSCvy83860-Garbage-Retention.sh
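The extract-and-run steps can be wrapped so that a failure in one step stops the rest. This is only a sketch under stated assumptions: the function name run_cscvy83860_workaround is hypothetical, the archive is assumed to sit at /data/tmp/CSCvy83860-Workaround.zip unless another path is given, and the two script names are taken from the example above.

```shell
# Hypothetical wrapper: extracts the workaround archive and runs both scripts,
# stopping at the first command that fails.
run_cscvy83860_workaround() {
  zip_path=${1:-/data/tmp/CSCvy83860-Workaround.zip}
  if [ ! -f "$zip_path" ]; then
    echo "archive not found: $zip_path" >&2
    return 1
  fi
  work_dir=$(dirname "$zip_path")/CSCvy83860-Workaround
  # The subshell keeps the caller's working directory unchanged.
  (
    mkdir -p "$work_dir" &&
    unzip "$zip_path" -d "$work_dir" &&
    cd "$work_dir" &&
    sh CSCvy83860-kafka-cleanup.sh &&
    sh CSCvy83860-Garbage-Retention.sh
  )
}
```

Calling `run_cscvy83860_workaround` with no argument uses the default path. If the target directory already holds extracted files, unzip prompts before it overwrites them.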
There is no workaround for Cisco DNA Center Releases earlier than 2.1.2.1. Customers who encounter the InfluxDB service crash on those releases should contact the Cisco Technical Assistance Center (TAC) for assistance with manual remediation.
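The release boundaries stated in this notice (fixed in 2.2.2.6 and later, workaround script for 2.1.2.1 through 2.2.2.5, TAC for anything earlier) can be expressed as a small lookup. This is an illustrative sketch: the function names version_le and remediation_for are hypothetical, and the comparison assumes a `sort` that supports the GNU `-V` (version sort) option.

```shell
# version_le A B: succeeds when dotted version A is less than or equal to B.
# Assumes GNU sort with -V (version sort).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# remediation_for RELEASE: prints the path this notice describes for RELEASE.
remediation_for() {
  if version_le 2.2.2.6 "$1"; then
    echo "fixed"              # 2.2.2.6 and later contain the fix
  elif version_le 2.1.2.1 "$1"; then
    echo "workaround-script"  # 2.1.2.1 through 2.2.2.5
  else
    echo "contact-tac"        # earlier than 2.1.2.1
  fi
}
```

For example, `remediation_for 2.2.2.5` prints `workaround-script`, and `remediation_for 2.1.2.0` prints `contact-tac`.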
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).