Introduction
This document describes an issue with the pattern of log file generation in Redundancy Configuration Manager (RCM) and how to resolve it.
Overview
Note: Cisco recommends that you have knowledge of Redundancy Configuration Manager (RCM).
In RCM, log files are produced for each component (pod) and are retained for up to 4 days, after which RCM automatically deletes them.
As per the configuration in RCM:
Maximum number of files that can be generated = 10
(this can vary with the RCM Docker configuration, but it is typically 9 or 10)
Maximum size per file = 10 MB
(this can vary with the RCM Docker configuration, but 10 MB is the most common setting)
Problem
New log files are created at intervals of 3 to 10 minutes. As soon as RCM accumulates 10 files, it removes the older ones, which is why the log files from the past 4 days are not retained.
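The arithmetic behind this can be sketched as follows. This is a rough estimate only: it assumes that each file covers the full 3 to 10 minute interval and that exactly 10 files are kept. The function name retention_window is illustrative and is not part of RCM.

```python
from datetime import timedelta


def retention_window(max_files: int, minutes_per_file: float) -> timedelta:
    """Approximate history covered when only `max_files` log files are kept."""
    return timedelta(minutes=max_files * minutes_per_file)


MAX_FILES = 10                 # file-count limit described in the Overview
EXPECTED = timedelta(days=4)   # retention period RCM is expected to provide

# Evaluate both ends of the observed 3 to 10 minute interval.
for minutes in (3, 10):
    window = retention_window(MAX_FILES, minutes)
    share = window / EXPECTED  # timedelta / timedelta yields a float
    print(f"{minutes} min per file: {window} retained ({share:.1%} of 4 days)")
```

Under these assumptions, the 10 retained files cover only about 30 to 100 minutes of history, which is a small fraction of the expected 4 days.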
Troubleshoot
Point of concern: Files are generated so rapidly that the threshold of 10 files is reached quickly.
A review of one of the log files shows that extra events are logged at the debug level, as shown here.
{"log":"2023/03/14 10:04:44.399 [DEBUG] [ApplicationContext.go:1922] [infra.application.core] Ping method is found for the rpc rcm-checkpointmgr-19, host rcm-checkpointmgr-191\n","stream":"stdout","time":"2023-03-14T10:04:44.399280518Z"}
{"log":"2023/03/14 10:04:44.399 [DEBUG] [ApplicationContext.go:1760] [infra.dpd.core] Ping reachable client Id 4 Name: rcm-checkpointmgr-193 Setname: rcm-checkpointmgr-19 Host: rcm-checkpointmgr-19 Port: 9003 Url: \n","stream":"stdout","time":"2023-03-14T10:04:44.399284297Z"}
{"log":"2023/03/14 10:04:47.418 [DEBUG] [ApplicationContext.go:1760] [infra.dpd.core] Ping reachable client Id 2 Name: rcm-checkpointmgr-141 Setname: rcm-checkpointmgr-14 Host: rcm-checkpointmgr-14 Port: 9003 Url: \n","stream":"stdout","time":"2023-03-14T10:04:47.418602948Z"}
{"log":"2023/03/14 10:04:47.418 [DEBUG] [ApplicationContext.go:1760] [infra.dpd.core] Ping reachable client Id 2 Name: rcm-checkpointmgr-111 Setname: rcm-checkpointmgr-11 Host: rcm-checkpointmgr-11 Port: 9003 Url: \n","stream":"stdout","time":"2023-03-14T10:04:47.418606903Z"}
{"log":"2023/03/14 10:04:47.418 [DEBUG] [ApplicationContext.go:1922] [infra.application.core] Ping method is found for the rpc rcm-checkpointmgr-14, host rcm-checkpointmgr-141\n","stream":"stdout","time":"2023-03-14T10:04:47.418610757Z"}
The identified events are infrastructure logs configured at the debug level. They produce an excessive number of Ping reachability events that are not essential. Consequently, each log file reaches the 10 MB size threshold quickly, which causes a buildup of multiple log files.
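One way to confirm which loggers fill the files is to count events per level and logger name. The sketch below is a minimal example that assumes the log lines use the JSON format shown in the sample, with the level and the logger name in square brackets inside the "log" field. The function name count_events and the SAMPLE list are illustrative and are not part of RCM; in practice the lines would be read from a collected log file.

```python
import json
import re
from collections import Counter

# Matches "[LEVEL] [File.go:line] [logger.name]" inside the "log" field.
PATTERN = re.compile(r"\[(?P<level>[A-Z]+)\] \[[^\]]+\] \[(?P<logger>[^\]]+)\]")


def count_events(lines):
    """Count log events per (level, logger) pair, skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if not isinstance(record, dict):
            continue
        match = PATTERN.search(record.get("log", ""))
        if match:
            counts[(match["level"], match["logger"])] += 1
    return counts


# Two lines copied from the sample log above (raw strings keep the JSON "\n" escape).
SAMPLE = [
    r'{"log":"2023/03/14 10:04:44.399 [DEBUG] [ApplicationContext.go:1922] [infra.application.core] Ping method is found for the rpc rcm-checkpointmgr-19, host rcm-checkpointmgr-191\n","stream":"stdout","time":"2023-03-14T10:04:44.399280518Z"}',
    r'{"log":"2023/03/14 10:04:44.399 [DEBUG] [ApplicationContext.go:1760] [infra.dpd.core] Ping reachable client Id 4 Name: rcm-checkpointmgr-193 Setname: rcm-checkpointmgr-19 Host: rcm-checkpointmgr-19 Port: 9003 Url: \n","stream":"stdout","time":"2023-03-14T10:04:44.399284297Z"}',
]

for (level, logger), total in count_events(SAMPLE).most_common():
    print(f"{level:6} {logger:28} {total}")
```

If the infra.* loggers dominate the DEBUG counts in a full log file, that is consistent with the infrastructure debug logging described above.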
As per the recommendations:
Configure debug-level logs exclusively for the RCM application. This logging configuration must be enabled so that RCM filters out unnecessary log events from other sources, such as the infrastructure (infra.*) loggers.
RCM Ops-Centre Logging Level
This is the recommended logging level configuration for RCM.
logging level application debug
logging level transaction debug
logging level tracing off
logging name infra.application.core level application warn
logging name infra.application.core level transaction warn
logging name infra.application.core level tracing off
logging name infra.dpd.core level application warn
logging name infra.dpd.core level transaction warn
logging name infra.dpd.core level tracing off
logging name infra.config.core level application warn
logging name infra.config.core level transaction warn
logging name infra.config.core level tracing off
logging name infra.heap_dump.core level application warn
logging name infra.heap_dump.core level transaction warn
logging name infra.heap_dump.core level tracing off
logging name infra.resource_monitor.core level application warn
logging name infra.resource_monitor.core level transaction warn
logging name infra.resource_monitor.core level tracing off
logging name infra.topology.core level application warn
logging name infra.topology.core level transaction warn
logging name infra.topology.core level tracing off
logging name infra.transaction.core level application warn
logging name infra.transaction.core level transaction warn
logging name infra.transaction.core level tracing off
logging name infra.diagnostics.core level application warn
logging name infra.diagnostics.core level transaction warn
logging name infra.diagnostics.core level tracing off
After these improper logging configurations are corrected, log files are no longer generated at an excessive rate, and the issue is resolved.
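The recommended configuration follows a regular pattern: three global lines, then three lines for each infrastructure logger. As a convenience, the lines can be generated with a short script so that no logger is missed. This is only a sketch; the helper name logging_config is hypothetical, and the output still has to be reviewed and applied in the RCM Ops-Centre in the usual way.

```python
# Infrastructure logger names taken from the recommended configuration.
INFRA_LOGGERS = [
    "infra.application.core",
    "infra.dpd.core",
    "infra.config.core",
    "infra.heap_dump.core",
    "infra.resource_monitor.core",
    "infra.topology.core",
    "infra.transaction.core",
    "infra.diagnostics.core",
]


def logging_config(loggers, infra_level="warn"):
    """Build the configuration lines: debug for RCM, `infra_level` for infra loggers."""
    lines = [
        "logging level application debug",
        "logging level transaction debug",
        "logging level tracing off",
    ]
    for name in loggers:
        lines.append(f"logging name {name} level application {infra_level}")
        lines.append(f"logging name {name} level transaction {infra_level}")
        lines.append(f"logging name {name} level tracing off")
    return lines


print("\n".join(logging_config(INFRA_LOGGERS)))
```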