Introduction
This document describes the solution for sessmgr instances that go into WARN state due to high memory usage by the acsmgr_icsr_frwk_instance_chkpt_falied() function.
Problem Description
Platform: ASR5500
SW Version: 21.27.4 and 21.19.10
Session manager (sessmgr) instances are in WARN state due to high memory consumption by the acsmgr_icsr_frwk_instance_chkpt_falied() function when session recovery is disabled:
[local]ASR5500# show task resources | grep -v good
task cputime memory files sessions
cpu facility inst used allc used alloc used allc used allc S status
----------------------- ----------- ------------- --------- ------------- ------
1/0 sessmgr 13 26% 100% 930.8M 900.0M 37 500 4643 12000 I warn
1/0 sessmgr 36 32% 100% 938.8M 900.0M 39 500 5155 12000 I warn
1/0 sessmgr 53 29% 100% 937.8M 900.0M 40 500 4916 12000 I warn
1/0 sessmgr 56 29% 100% 930.2M 900.0M 41 500 4649 12000 I warn
1/0 sessmgr 83 35% 100% 970.2M 900.0M 40 500 5382 12000 I warn
1/0 sessmgr 90 24% 100% 931.3M 900.0M 42 500 4621 12000 I warn
1/0 sessmgr 130 28% 100% 935.0M 900.0M 40 500 4907 12000 I warn
1/0 sessmgr 141 26% 100% 936.7M 900.0M 37 500 4917 12000 I warn
1/0 sessmgr 145 23% 100% 933.9M 900.0M 39 500 4883 12000 I warn
1/0 sessmgr 174 26% 100% 927.4M 900.0M 37 500 4620 12000 I warn
1/0 sessmgr 188 31% 100% 963.0M 900.0M 40 500 5305 12000 I warn
1/0 sessmgr 223 26% 100% 933.5M 900.0M 38 500 4631 12000 I warn
Aggregate consumption per proc:
-------- ------------------------------------------ -------------- -------------- -------------- --------- ---------
| Nr | Process | Similar | Total Bytes | Human Bytes | Percent | % Acum |
======== ========================================== ============== ============== ============== ========= =========
| 1 | acsmgr_icsr_frwk_instance_chkpt_falied() | 757 | 108301860 | 103.3 MB | 13.95% | 13.95% |
| 2 | egtpc_allocate_peer_rec() | 89 | 77599472 | 74.0 MB | 10.00% | 23.95% |
| 3 | sn_slist_dnode_alloc() | 471 | 64427392 | 61.4 MB | 8.30% | 32.25% |
| 4 | sessmgr_allocate_callline() | 156 | 48601944 | 46.4 MB | 6.26% | 38.51% |
| 5 | sn_aaa_buffer_alloc_more_type() | 45 | 34836120 | 33.2 MB | 4.49% | 43.00% |
[local]ASR5500# show session recovery status
Session Recovery Status:
Overall Status : Not Enabled
Last Status Update : 8 seconds ago
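The show task resources rows above can also be checked programmatically. This is a minimal Python sketch, not a Cisco tool: it assumes the column layout shown in this document, assumes memory values are always reported with an M suffix, and uses a hypothetical helper name (over_memory). It lists the instances whose used memory exceeds the allocated memory.

```python
import re

# Sample rows copied from the "show task resources" output in this document.
SAMPLE = """\
1/0 sessmgr 13 26% 100% 930.8M 900.0M 37 500 4643 12000 I warn
1/0 sessmgr 83 35% 100% 970.2M 900.0M 40 500 5382 12000 I warn
8/0 sessmgr 10 20% 100% 981.8M 900.0M 43 500 4142 12000 I warn
"""

# One data row: cpu, facility, instance, CPU used/allc, memory used/alloc,
# files used/allc, sessions used/allc, state flag, status.
# Assumption: memory is always reported in megabytes with an "M" suffix.
ROW = re.compile(
    r"^(?P<cpu>\d+/\d+)\s+(?P<facility>\S+)\s+(?P<inst>\d+)\s+"
    r"(?P<cpu_used>[\d.]+)%\s+(?P<cpu_allc>[\d.]+)%\s+"
    r"(?P<mem_used>[\d.]+)M\s+(?P<mem_allc>[\d.]+)M\s+"
    r"(?P<files_used>\d+)\s+(?P<files_allc>\d+)\s+"
    r"(?P<sess_used>\d+)\s+(?P<sess_allc>\d+)\s+"
    r"(?P<state>\S)\s+(?P<status>\S+)\s*$"
)


def over_memory(text):
    """Return (instance, used_mb, alloc_mb) for rows where used > allocated."""
    hits = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers, separators, and Total lines
        used = float(match["mem_used"])
        alloc = float(match["mem_allc"])
        if used > alloc:
            hits.append((int(match["inst"]), used, alloc))
    return hits


if __name__ == "__main__":
    for inst, used, alloc in over_memory(SAMPLE):
        print(f"sessmgr {inst}: {used}M used of {alloc}M allocated")
```

Header, separator, and Total lines do not match the row pattern and are skipped, so captured CLI output can be passed in unmodified.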
Analysis
To determine whether a high number of subscribers causes acsmgr_icsr_frwk_instance_chkpt_falied() to consume excessive memory, a busy-out was performed on a sessmgr instance. The sessmgr memory utilization did not decrease:
[local]ASR5500> show task resources facility sessmgr instance 10
task cputime memory files sessions
cpu facility inst used allc used alloc used allc used allc S status
----------------------- ----------- ------------- --------- ------------- ------
8/0 sessmgr 10 20% 100% 981.8M 900.0M 43 500 4142 12000 I warn
Total 1 20.20% 981.8M 43 4142
[local]ASR5500> task sessmgr instance 10 busy-out
[local]ASR5500> show task resources facility sessmgr instance 10
task cputime memory files sessions
cpu facility inst used allc used alloc used allc used allc S status
----------------------- ----------- ------------- --------- ------------- ------
8/0 sessmgr 10 19% 100% 979.7M 900.0M 42 500 3946 12000 B warn
Total 1 19.35% 979.7M 42 3946
[local]ASR5500> task sessmgr instance 10 enable
[local]ASR5500> show task resources facility sessmgr instance 10
task cputime memory files sessions
cpu facility inst used allc used alloc used allc used allc S status
----------------------- ----------- ------------- --------- ------------- ------
8/0 sessmgr 10 17% 100% 979.8M 900.0M 40 500 4141 12000 I warn
Total 1 17.33% 979.8M 40 4141
As the output shows, a busy-out on one of the affected sessmgr instances decreases the number of used sessions, but used memory remains above the allocation and keeps the sessmgr instance in WARN state.
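The busy-out comparison can be reduced to simple arithmetic. This short Python sketch uses only the values from the outputs above for sessmgr instance 10 (before the busy-out and during it); the helper name pct_drop is illustrative.

```python
# Values taken from the "show task resources" outputs for sessmgr instance 10.
before = {"sessions": 4142, "memory_mb": 981.8}   # state I, before busy-out
during = {"sessions": 3946, "memory_mb": 979.7}   # state B, during busy-out
ALLOC_MB = 900.0                                  # memory allocation shown in the output


def pct_drop(old, new):
    """Percentage decrease from old to new."""
    return (old - new) / old * 100.0


sess_drop = pct_drop(before["sessions"], during["sessions"])
mem_drop = pct_drop(before["memory_mb"], during["memory_mb"])
still_over = during["memory_mb"] - ALLOC_MB

if __name__ == "__main__":
    print(f"sessions dropped by {sess_drop:.1f}%")
    print(f"memory dropped by {mem_drop:.1f}%")
    print(f"memory still {still_over:.1f}M above the allocation")
```

The session count falls by a few percent while memory barely moves and stays well above the 900.0M allocation, which is consistent with the memory not being tied to the active session count.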
On further investigation, the acsmgr_icsr_frwk_instance_chkpt_falied() function is called while checkpoint information is processed. This function performs list add, update, and delete operations that do not work as expected when session recovery is disabled, which is the reason for the increased memory consumption. In this scenario, the memory used accumulates over time. This behavior occurs only when require session recovery is not configured. The memory accumulated by acsmgr_icsr_frwk_instance_chkpt_falied() is not freed when no require session recovery is in effect, which potentially causes the memory leak.
Solution
Implement session recovery in order to resolve this issue.
Procedure
Step 1. At the Exec mode prompt, use the show license info command to verify that the session recovery feature is enabled through the session and feature use licenses on the system. If the current status of the Session Recovery feature is Disabled, you cannot enable this feature until a license key is installed in the system.
Step 2. Use this configuration example to enable session recovery.
configure
require session recovery
end
This feature does not take effect until after the system has been restarted.
Step 3. Save your configuration as described in Verifying and Saving Your Configuration.
Step 4. Perform a system restart with the reload command. When the confirmation prompt appears, confirm the system restart and enter Yes.
When restarted, the system enables session recovery, creates all mirrored "standby-mode" tasks, and performs packet processing card reservations and other operations automatically.
Step 5. After the system has been restarted, verify that the system is prepared to support this feature as described in Viewing Session Recovery Status.
More advanced users can opt to insert the require session recovery command into an existing configuration file with a text editor or other means, and then manually apply the configuration file. Exercise caution when you do this: the command must be placed among the first few lines of the existing configuration file, and it must appear before the creation of any non-local context.
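For users who edit the configuration file manually, the ordering requirement can be checked before the file is applied. This is a minimal Python sketch under stated assumptions: it assumes each context is created by a line of the form context <name>, that the local context is named local, and that the command appears on a line of its own. The function name and the sample configurations are hypothetical.

```python
def require_before_nonlocal_context(config_text):
    """Return True if 'require session recovery' appears on its own line
    before the first 'context <name>' line whose name is not 'local'."""
    require_idx = None
    context_idx = None
    for idx, raw in enumerate(config_text.splitlines()):
        line = raw.strip()
        if require_idx is None and line == "require session recovery":
            require_idx = idx
        parts = line.split()
        if (context_idx is None and len(parts) >= 2
                and parts[0] == "context" and parts[1] != "local"):
            context_idx = idx
    if require_idx is None:
        return False  # command is missing (or only the "no" form is present)
    return context_idx is None or require_idx < context_idx


# Hypothetical configuration fragments, for illustration only.
GOOD = """\
config
  require session recovery
  context local
  #exit
  context example-ctx
  #exit
end
"""

BAD = """\
config
  context local
  #exit
  context example-ctx
  #exit
  require session recovery
end
"""

if __name__ == "__main__":
    print(require_before_nonlocal_context(GOOD))
    print(require_before_nonlocal_context(BAD))
```

The check covers only line order. It does not validate the rest of the configuration or confirm that the license is installed.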