This document helps troubleshoot a system that does not respond. The document also discusses the cause, and how you can eliminate the problem.
A router appears to stop working when the system is not responsive to the console or to queries sent from the network (for example, Telnet, Simple Network Management Protocol (SNMP), and so on). These problems can be classified into two broad categories:
When the console does not respond.
When traffic does not go through.
There are no specific requirements for this document.
The information in this document is based on these software and hardware versions:
All Cisco IOS® software versions
All Cisco routers
This document does not apply to Cisco Catalyst switches or MGX platforms.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
For more information on document conventions, refer to the Cisco Technical Tips Conventions.
Console problems occur when the router becomes unresponsive to input at the console port. If the console is not responsive, it means that a high priority process prevents the console driver from responding to input.
Verify cable connectivity.
Verify that the power supply is on.
Verify the router LED status. If all LEDs are down, it is most likely an issue with the power supply of the router.
If traffic still flows through the router:
Disconnect network interfaces and see if the router responds. Many times the router assumes it is doing something too important to service exec sessions.
You can also attempt to reproduce the problem after you issue these commands:
On the Cisco 7200 and 7500 Series:
configure terminal scheduler allocate 3000 1000 ^Z
The scheduler allocate command guarantees CPU time for low priority processes. It puts a maximum time allocated to fast-switching (3000 microseconds - usec) and process-switching (1000 usec) per network interrupt context.
On all other platforms, use:
configure terminal scheduler interval 500 ^Z
The scheduler interval command allows low priority processes to be scheduled every 500 usec, and thereby allows some commands to be typed even if CPU usage is at 100%.
Check the Basic System Management Commands in the Cisco IOS Software Command Reference for more information on these commands.
If the console does not respond because the router CPU utilization is high, it is important to find and correct the cause of the high CPU utilization. For example, if process-switched IP traffic causes problems, then this is reflected in the "IP Input" process in the output from the show processes cpu command. In this situation, it is important to collect the output from show interfaces, show interfaces stat, and possibly show processes to further diagnose the problem. To fix the problem, you would need to reduce the amount of IP traffic that is process switched. See Troubleshooting High CPU Utilization on Cisco Routers for more information.
Another possible cause of an apparent hang is memory allocation failure; that is, either the router has used all available memory, or the memory has been fragmented into such small pieces that the router cannot find a usable available block. For more information, see Troubleshooting Memory Problems.
The router can stop responding due to a security-related problem, such as a worm or virus. This is especially likely to be the cause if there have not been recent changes to the network, such as a router IOS upgrade. Usually, a configuration change, such as adding additional lines to your access lists can mitigate the effects of this problem. The Cisco Security Advisories and Notices page contains information on detection of the most likely causes and specific workarounds.
For additional information, refer to:
If the router appears to freeze during the bootup process, it can be the result of an improperly configured feature or of a software defect in a configured feature. This is often evident from the appearance of a warning or error message on the console immediately before the router freezes.
As a workaround to this problem, boot the router into ROMMON, and bypass the stored configuration, and then configure it again. Complete these steps:
Attach a terminal or PC with terminal emulation to the console port of the router.
Use these terminal settings:
9600 baud rate
No parity
8 data bits
1 stop bit
No flow control
Reboot the router and break into ROMMON by pressing break on the terminal keyboard within 60 seconds of the power-up. If the break sequence doesn't work, see Standard Break Key Sequence Combinations During Password Recovery for other key combinations.
Change the configuration register to 0x2142 and then reset the router. For this, execute the confreg 0x2142 command at the rommon 1> prompt. Then type reset at the rommon 2> prompt. This causes the router to boot from Flash without loading the configuration.
Type no after each setup question or press Ctrl-C to skip the initial setup procedure.
Type enable at the Router> prompt.
You are in enable mode, and see the Router# prompt.
Now, you can save an empty configuration (all commands removed). Issue the copy running-config startup-config command. Alternatively, if you suspect that a certain command causes the problem, you can edit the configuration. To do so, issue the copy startup-config running-config command. Then type configure terminal, and make the changes.
When finished, change the configuration-register back to 0x2102. For this, type config-register 0x2102. Issue the copy running-config startup-config command to commit the changes.
If traffic does not flow through the router:
If traffic no longer passes through the router and the console is unresponsive, there is probably a problem with the system. Generally this means that the router is caught in a continuous loop or stuck at a function. This is almost always caused by a bug in the software. Install the most recent maintenance release of the Cisco IOS software train you currently run.
Before you create a service request with the Cisco TAC, obtain a stack trace from ROM Monitor. Obtaining stack traces during a problem makes it possible to determine where in the code the router is looping or stuck.
Traffic problems occur when the console remains responsive but traffic does not pass through the router. In this case, part of the traffic or part of the interfaces are not responsive. This behavior can be caused by a variety of different causes. When this problem occurs, information can be collected from the router through the console port. The causes for these traffic problems can range from errors on the interfaces to software and hardware problems.
Routing issue – Changes in the network topology or in the configuration of some routers could have affected the routing tables.
High CPU Utilization – Issue the show process cpu command. If the CPU is above 95%, the performance of the router can be affected, and packets can be delayed or dropped. Refer to Troubleshooting High CPU Utilization on Routers for more information.
Interface down – One of the router interfaces can be down. There are multiple events that could cause this, which can range from a wrong configuration command to a hardware failure of the interface or the cable. If some interfaces appear down when you issue a show interfaces command, try to find out what caused it.
Wedged interfaces – This is a particular case of buffer leaks that causes the input queue of an interface to fill up to the point where it can no longer accept packets. Reload the router. This frees that input queue, and restores traffic until the queue is full again. This can take anywhere from a few seconds to a few weeks, based on the severity of the leak.
The easiest way to identify a wedged interface is to issue a show interfaces command, and to look for something similar to this:
Output queue 0/40, 0 drops; input queue 76/75, 27 drops
See Troubleshooting Buffer Leaks for detailed guidelines and examples.
K-trace refers to the procedure used to obtain a stack trace from the router from ROM Monitor. On routers with older ROM Monitor code, a stack trace is obtained with the k command. On routers that run more recent ROM Monitor code, the stack command can also be used.
Complete these steps to obtain stack traces from a router that does not respond:
Enable the break sequence. For this, change the configuration register value. The eighth bit value must be set to zero so that break is not ignored. A value of 0x2002 works.
Router#configure terminal Enter configuration commands, one per line. End with CNTL/Z. Router(config)#config-register 0x2002
Reload the router so that the new configuration register value is used.
Send the break sequence when the problem occurs. The ROM Monitor prompt ">" or "rommon 1 >" must be displayed.
Capture a stack trace. For this, collect the output from either the k 50 or stack 50 commands. Add 50 to the command to print a longer stack trace.
Issue the c or cont command to continue.
Repeat the three last steps several times to ensure that multiple points in a continuous loop have been captured.
After you have obtained several stack traces, reboot the router to recover from the hung state.
Here is an example of this procedure:
User break detected at location 0x80af570 rommon 1 > k 50 Stack trace: PC = 0x080af570 Frame 00: FP = 0x02004750 RA = 0x0813d1b4 Frame 01: FP = 0x02004810 RA = 0x0813a8b8 Frame 02: FP = 0x0200482c RA = 0x08032000 Frame 03: FP = 0x0200483c RA = 0x040005b0 Frame 04: FP = 0x02004b34 RA = 0x0401517a Frame 05: FP = 0x02004bf0 RA = 0x04014d9c Frame 06: FP = 0x02004c00 RA = 0x040023d0 Frame 07: FP = 0x02004c68 RA = 0x04002e9e Frame 08: FP = 0x02004c78 RA = 0x040154fe Frame 09: FP = 0x02004e68 RA = 0x04001fc0 Frame 10: FP = 0x02004f90 RA = 0x0400c41e Frame 11: FP = 0x02004fa4 RA = 0x04000458 Suspect bogus FP = 0x00000000, aborting rommon 2 > cont
Repeat this procedure several times in the event of a system problem to collect multiple instances of the stack trace.
When a router does not respond, it is almost always a software problem. In this case, collect as much information as possible, including the stack trace, before you open a TAC service request. It is also important to include output from the show version, show run, and show interfaces commands.
If you open a TAC Service Request, please attach the following information to your request for troubleshooting Router Hangs: |
Note: If the console is responsive, please do not manually reload or power-cycle the router before collecting the above information, unless required to troubleshoot router hangs, as this can cause important information to be lost that is needed for determining the root cause of the problem. |
Revision | Publish Date | Comments |
---|---|---|
1.0 |
02-Aug-2006 |
Initial Release |