简介
tacacsd是与Tacacs AAA服务关联的IOS XR进程。本文讨论一个软件Bug及其症状,该程序可能导致运行IOS XR版本4.2.X或更低版本的路由器观察不断高的CPU利用率。
先决条件
本文档没有任何特定的要求。
使用的组件
本文档中涉及的问题适用于Cisco GSR、ASR9000、CRS和运行IOS XR的其他路由器。下列输出来自运行4.2.X以下IOS XR版本的实验路由器。
问题
运行IOS XR版本4.2.X或更低版本的路由器可能会发现由于已知软件Bug而导致警报记录器进程导致的CPU使用率持续较高。Show process cpu输出将显示占用CPU使用率最大数量的警报记录器进程。
show proc cpu | ex "0% 0% 0%"
CPU utilization for one minute: 100%; five minutes: 100%; fifteen minutes: 100%
PID 1Min 5Min 15Min Process
<snip>
53281 2% 2% 2% syslogd_helper
57379 1% 1% 1% fabricq_prp_driver
69636 1% 1% 1% correlatord
69677 6% 6% 6% syslogd
118842 1% 1% 1% sysdb_svr_local
122962 3% 3% 3% gsp
229604 2% 2% 2% eem_ed_syslog
262456 1% 1% 1% tacacsd
452726918 67% 71% 72% alarm-logger
463302887 1% 1% 1% exec
<snip>
在日志记录缓冲区中,您可能会看到类似以下内容的持续日志:
tacacsd[XXXX]: %SECURITY-TACACSD-7-GENERIC_ERROR:无法获取以下项的请求: key -XXXXX/XXXX/XXXX/XXXX会话XXXXX
show log
<snip>
RP/0/7/CPU0:Dec 26 04:02:03.149 : tacacsd[1110]: %SECURITY-TACACSD-6-SERVER_UP :
TACACS+ server 32.95.X.X/XXXX is UP
RP/0/7/CPU0:Dec 26 04:02:05.956 : tacacsd[1110]: %SECURITY-TACACSD-6-SERVER_DOWN :
TACACS+ server 32.95.X.X/XXXX is DOWN - Socket 43: Connection timed out
RP/0/7/CPU0:Dec 26 04:02:09.468 : tacacsd[1110]: %SECURITY-TACACSD-6-SERVER_DOWN :
TACACS+ server 199.37.X.X/XXXX is DOWN - Socket 43: Connection timed out
RP/0/7/CPU0:Dec 26 04:02:09.647 : tacacsd[1110]: %SECURITY-TACACSD-6-TIMEOUT_IGNORED :
A time out event has been ignored for context key -953829129/1073/60000000/6486405
(session 6486405)
RP/0/7/CPU0:Dec 26 04:02:11.647 : tacacsd[1110]: %SECURITY-TACACSD-7-GENERIC_ERROR :
Failed to get request for: key -953829129/1073/60000000/6486405 session 105407493
RP/0/0/CPU0:last message repeated 520 times
RP/0/7/CPU0:Dec 26 04:02:34.064 : tacacsd[1110]: %SECURITY-TACACSD-6-SERVER_UP :
TACACS+ server 32.95.X.X/XXXX is UP
RP/0/7/CPU0:Dec 26 04:02:34.064 : tacacsd[1110]: %SECURITY-TACACSD-7-GENERIC_ERROR :
Failed to get request for: key -953829129/1073/60000000/6486405 session 105407493
alarm-logger和tacacsd进程详细信息如下所示。
show processes alarm-logger
<snip>
Job Id: 114
PID: 135303
Executable path: /c12k-os-4.2.4/sbin/alarm-logger
Instance #: 1
Version ID: 00.00.0000
Respawn: ON
Respawn count: 1
Max. spawns per minute: 12
Last started: Tue Aug 13 02:17:23 2013
Process state: Run
Package state: Normal
core: MAINMEM
Max. core: 0
Level: 91
Placement: None
startup_path: /pkg/startup/alarm-logger.startup
Ready: 0.672s
Process cpu time: 1401.018 user, 49.774 kernel, 1450.792 total
JID TID Stack pri state TimeInState HR:MM:SS:MSEC NAME
114 1 88K 10 Receive 0:00:02:0071 0:00:40:0919 alarm-logger
114 2 88K 10 Receive 3242:46:17:0308 0:00:00:0000 alarm-logger
114 3 88K 10 Reply 0:00:00:0000 0:23:08:0029 alarm-logger
114 4 88K 10 Mutex 0:00:00:0000 0:00:21:0957 alarm-logger
-------------------------------------------------------------------------------
<snip>
show processes tacacsd
<snip>
Job Id: 1110
PID: 266551
Executable path: /disk0/iosxr-infra-4.2.4/bin/tacacsd
Instance #: 1
Version ID: 00.00.0000
Respawn: ON
Respawn count: 1
Max. spawns per minute: 12
Last started: Tue Aug 13 02:23:47 2013
Process state: Run
Package state: Normal
Started on config: cfg/gl/aaa/tacacs/
Process group: central-services
core: MAINMEM
Max. core: 0
Placement: Placeable
startup_path: /pkg/startup/tacacsd.startup
Ready: 3.954s
Process cpu time: 1010.118 user, 185.932 kernel, 1196.050 total
JID TID Stack pri state TimeInState HR:MM:SS:MSEC NAME
1110 1 108K 16 Sigwaitinfo 3242:46:40:0742 0:00:00:0116 tacacsd
1110 2 108K 10 Nanosleep 0:01:03:0835 0:00:00:0019 tacacsd
1110 3 108K 10 Receive 3242:46:41:0593 0:00:00:0002 tacacsd
1110 4 108K 10 Reply 0:00:00:0000 0:08:55:0970 tacacsd
1110 5 108K 16 Receive 3242:46:40:0771 0:00:00:0000 tacacsd
1110 6 108K 10 Receive 0:07:07:0403 0:04:03:0462 tacacsd
1110 7 108K 10 Receive 0:00:01:0389 0:03:28:0939 tacacsd
1110 8 108K 10 Receive 0:00:01:0332 0:03:03:0622 tacacsd
-------------------------------------------------------------------------------
<snip>
高CPU是由于系统日志消息泛洪导致警报记录器缓冲区已满造成的。因此,警报记录器进程仍然处于繁忙状态,试图同时处理消息并面临缓冲区已满的情况。 在本例中,TACACS进程是压倒性的警报记录器。由于警报记录器是受害者,重新启动警报记录器进程不会有所帮助,因为共享内存缓冲区在进程重新启动后会保持持久性。
解决方案
此问题已通过软件Bug CSCuh得到解决和修复98484 - Tacacsd“Failed to get request for key”错误导致高CPU。此处显示漏洞详细信息
请注意,重启tacacsd进程是一种应停止日志的解决方法,CPU利用率应返回到正常水平。重新启动tacacsd进程不会影响任何功能或数据包转发,它将使进程处于初始状态。
此Bug已在以下IOS XR版本中修复。
4.3.2.SP2
4.3.2.SP3
4.3.2.SP5
4.3.2.SP6
4.3.2.SP7
4.3.2.SP8