Valuable time and resources are often wasted replacing hardware that actually functions properly. This document helps troubleshoot common hardware issues with the Cisco 12000 Series Internet Router, and provides pointers for identifying whether or not the fault is in the hardware.
Note: This document does not cover any software-related failures except for those that are often mistaken as hardware issues.
Readers of this document should have knowledge of these topics:
Hardware Troubleshooting for the Cisco 12000 Series Internet Router
Troubleshooting Line Card Crashes on the Cisco 12000 Series Internet Router
If you feel that the problem is related to a hardware fault, this document can help you identify the cause of the failure.
The information in this document is based on these software and hardware versions:
All Cisco 12000 Series Internet Routers, including the 12008, 12012, 12016, 12404, 12406, 12410, and the 12416.
All Cisco IOS® software versions that support the Cisco 12000 Series Internet Router.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
Whenever you install a new line card, module, or Cisco IOS software image, it is important to verify whether the router has enough memory, and that the hardware and software are compatible with the features you want to use.
Complete these recommended steps to check for hardware-software compatibility and memory requirements:
Use the Software Advisor (registered customers only) tool to choose software for your network device.
Tip:
The Software Support for Features (registered customers only) section helps you determine the Cisco IOS software image needed by choosing the types of features you wish to implement.
Use the Download Software Area (registered customers only) to check the minimum amount of memory (RAM and Flash) required by the Cisco IOS software, and/or download the Cisco IOS software image. To determine the amount of memory (RAM and Flash) installed on your router, see How to Choose a Cisco IOS Software Release - Memory Requirements.
Tips:
If you want to keep the same features as the version that currently runs on your router, but do not know which feature set you use, enter the show version command on your Cisco device and paste its output into the Output Interpreter tool. You can use Output Interpreter (registered customers only) to display potential issues and fixes. To use Output Interpreter, you must be a registered customer, be logged in, and have JavaScript enabled. It is important to check for feature support, especially if you plan to use recent software features.
If you need to upgrade the Cisco IOS software image to a new version or feature set, refer to How to Choose a Cisco IOS Software Release for more information.
If you determine that a Cisco IOS software upgrade is required, follow the Software Installation and Upgrade Procedure for the Cisco 12000 Series Router.
Tip: For information on how to recover a Cisco 12000 series router stuck in ROMmon (rommon # > prompt), see ROMmon Recovery Procedure for the Cisco 12000.
For more information on document conventions, see the Cisco Technical Tips Conventions.
With the help of the information in this section, you will be able to determine whether the problems you face with your line card are hardware-related.
The first thing you need to do is identify the cause of the line card crash or console errors that you encounter. To see which card is possibly at fault, it is essential that you collect the output from these commands:
show context summary
show logging
show logging summary
show diag <slot>
show context slot <slot>
Along with these specific show commands, you must also gather this information:
Console logs and/or Syslog information: These can be crucial to determine the originating issue if multiple symptoms occur. If the router is set up to send logs to a syslog server, the server may hold information on what happened. For console logs, it is best to connect directly to the console port of the router with system message logging enabled.
show technical-support: The show technical-support command is a compilation of many different commands, and includes show version, show running-config, and show stacks. When a router runs into problems, the Cisco Technical Assistance Center (TAC) engineer usually asks for this information. It is important to collect the show technical-support command output before you reload or power-cycle your device, because these actions can cause all information about the problem to be lost.
Here are some examples of output that you can expect to see if your Gigabit Route Processor (GRP) or line card has crashed:
Router#show context summary
CRASH INFO SUMMARY
Slot 0 : 0 crashes
Slot 1 : 1 crashes
1 - crash at 10:36:20 UTC Wed Dec 19 2001
Slot 2 : 0 crashes
Slot 3 : 0 crashes
Slot 4 : 0 crashes
Slot 5 : 0 crashes
Slot 6 : 0 crashes
Slot 7 : 0 crashes
Slot 8 : 0 crashes
Slot 9 : 0 crashes
Slot 10: 0 crashes
Slot 11: 0 crashes
Slot 12: 0 crashes
Slot 13: 0 crashes
Slot 14: 0 crashes
Slot 15: 0 crashes

Router#show logging
Syslog logging: enabled (2 messages dropped, 0 messages rate-limited, 0 flushes, 0 overruns)
Console logging: level debugging, 24112 messages logged
Monitor logging: level debugging, 0 messages logged
Buffer logging: level debugging, 24411 messages logged
Logging Exception size (4096 bytes)
Trap logging: level informational, 24452 message lines logged
5d16h: %LCINFO-3-CRASH: Line card in slot 1 crashed
5d16h: %GRP-4-RSTSLOT: Resetting the card in the slot: 1,Event: 38
5d16h: %IPCGRP-3-CMDOP: IPC command 3
5d16h: %CLNS-5-ADJCHANGE: ISIS: Adjacency to malachim2 (GigabitEthernet1/0) Up, n8 (slot1/0): linecard is disabled
-Traceback=602ABCA8 602AD8B8 602B350C 602B3998 6034312C 60342290 601A2BC4 601A2BB0
5d16h: %LINK-5-CHANGED: Interface GigabitEthernet1/0, changed state to administratively down
5d16h: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0, changed state to down
5d16h: %GRP-3-CARVE_INFO: Setting mtu above 8192 may reduce available buffers on Slot: 1.
SLOT 1:00:00:09: %SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software
IOS (tmew adjacency) GS Software (GLC1-LC-M), Version 12.0(17)ST3, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Thu 08-Nov-01 20:21 by dchih
5d16h: %GRPGE-6-AUTONEG_STATE: Interface GigabitEthernet1/0: Link OK - autonegotiation complete
5d16h: %LINK-3-UPDOWN: Interface GigabitEthernet1/0, changed state to up
5d16h: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0, changed state to up

Router#show diag 1
SLOT 1 (RP/LC 1 ): 3 Port Gigabit Ethernet
MAIN: type 68, 800-6376-01 rev E0 dev 0
HW config: 0x00 SW key: 00-00-00
PCA: 73-4775-02 rev E0 ver 2
HW version 2.0 S/N CAB0450G8FX
MBUS: Embedded Agent
Test hist: 0x00 RMA#: 00-00-00 RMA hist: 0x00
DIAG: Test count: 0x00000001 Test results: 0x00000000
FRU: Linecard/Module: 3GE-GBIC-SC=
Route Memory: MEM-GRP/LC-64=
Packet Memory: MEM-LC1-PKT-256=
L3 Engine: 2 - Backbone OC48 (2.5 Gbps)
MBUS Agent Software version 01.46 (RAM) (ROM version is 02.10)
Using CAN Bus A
ROM Monitor version 10.06
Fabric Downloader version used 05.01 (ROM version is 05.01)
Primary clock is CSC 0
Board is analyzed
Board State is Line Card Enabled (IOS RUN )
Insertion time: 00:00:10 (5d16h ago)
DRAM size: 67108864 bytes
FrFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
ToFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
1 crash since restart

Router#show context slot 1
CRASH INFO: Slot 1, Index 1, Crash at 10:36:20 UTC Wed Dec 19 2001
VERSION:
GS Software (GLC1-LC-M), Version 12.0(17)ST3, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Compiled Thu 08-Nov-01 20:21 by dchih
Card Type: 3 Port Gigabit Ethernet, S/N
System exception: sig=10, code=0x10, context=0x41036514
System restarted by a Bus Error exception
STACK TRACE:
-Traceback= 406914C8 4004EEAC 4005BCE4 400A33F4 400A33E0
CONTEXT:
$0 : 00000000, AT : 41030000, v0 : 00000000, v1 : 41036290
a0 : 00000030, a1 : 412C6CA0, a2 : 00000000, a3 : 00000000
t0 : 00008100, t1 : 34008101, t2 : 400C5590, t3 : FFFF00FF
t4 : 400C5560, t5 : 00040000, t6 : 00000000, t7 : 413D1D78
s0 : FF012345, s1 : 00000031, s2 : 41032B10, s3 : 41BB8F00
s4 : 00000000, s5 : 00000001, s6 : 4101D620, s7 : 00000000
t8 : 418EA1C8, t9 : 00000000, k0 : 4142C7A0, k1 : 400C7538
gp : 40F57DC0, sp : 41BB8EE8, s8 : 41023740, ra : 406914C8
EPC : 0x406914C8, SREG : 0x34008103, Cause : 0x00000010
ErrorEPC : 0x400B3A5C
-Process Traceback= No Extra Traceback
SLOT 1:00:00:09: %SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software
IOS (tm) GS Software (GLC1-LC-M), Version 12.0(17)ST3, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Thu 08-Nov-01 20:21 by dchih
SLOT 1:20:18:09: %LCGE-6-GBIC_OIR: 3 Port Gigabit Ethernet GBIC removed from port 2
SLOT 1:20:18:29: %LCGE-6-GBIC_OIR: 3 Port Gigabit Ethernet GBIC inserted in port 2
SLOT 1:3d20h: %LCGE-6-GBIC_OIR: 3 Port Gigabit Ethernet GBIC removed from port 2
SLOT 1:3d20h: %LCGE-6-GBIC_OIR: 3 Port Gigabit Ethernet GBIC inserted in port 2
SLOT 1:00:00:09: %SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software
IOS (tm) GS Software (GLC1-LC-M), Version 12.0(17)ST3, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Thu 08-Nov-01 20:21 by dchih
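When the chassis has many slots, it can help to pick the crashed slots out of the show context summary output automatically. The following is a minimal Python sketch, assuming the output has been saved as text; the function name crashed_slots and the sample string are illustrative and are not part of any Cisco tool.

```python
import re

# Matches entries such as "Slot 1 : 1 crashes" or "Slot 10: 0 crashes".
SLOT_LINE = re.compile(r"Slot\s+(\d+)\s*:\s*(\d+)\s+crash(?:es)?")


def crashed_slots(summary_text):
    """Return {slot: crash_count} for every slot with at least one crash."""
    result = {}
    for match in SLOT_LINE.finditer(summary_text):
        slot, count = int(match.group(1)), int(match.group(2))
        if count > 0:
            result[slot] = count
    return result


# Shortened sample modeled on the show context summary output above.
SAMPLE = """\
CRASH INFO SUMMARY
Slot 0 : 0 crashes
Slot 1 : 1 crashes
1 - crash at 10:36:20 UTC Wed Dec 19 2001
Slot 2 : 0 crashes
Slot 10: 0 crashes
"""

if __name__ == "__main__":
    print(crashed_slots(SAMPLE))
```

Each slot that the function returns is a candidate for the show diag <slot> and show context slot <slot> commands.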
If a line card has crashed, and you have identified which line card it is, you now need to determine the cause of the crash. The output from the show context slot <slot> command enables you to do this. Here is an example:
Router#show context slot 2
CRASH INFO: Slot 2, Index 1, Crash at 12:24:22 MET Wed Nov 28 2001
VERSION:
GS Software (GLC1-LC-M), Version 12.0(18)S1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Compiled Fri 07-Sep-01 20:13 by nmasa
Card Type: 3 Port Gigabit Ethernet, S/N
System exception: sig=23, code=0x24, context=0x4103FE84
System restarted by a Software forced crash
STACK TRACE:
-Traceback= 400BEB08 40599554 4004FB64 4005B814 400A1694 400A1680
CONTEXT:
$0 : 00000000, AT : 41040000, v0 : 00000032, v1 : 4103FC00
a0 : 4005B0A4, a1 : 41400A20, a2 : 00000000, a3 : 00000000
t0 : 41D75220, t1 : 8000D510, t2 : 00000001, t3 : FFFF00FF
t4 : 400C2670, t5 : 00040000, t6 : 00000000, t7 : 4150A398
s0 : 0000003C, s1 : 00000036, s2 : 4103C4D0, s3 : 41D7EC60
s4 : 00000000, s5 : 00000001, s6 : 41027040, s7 : 00000000
t8 : 41A767B8, t9 : 00000000, k0 : 415ACE20, k1 : 400C4260
gp : 40F0DD00, sp : 41D7EC48, s8 : 4102D120, ra : 40599554
EPC : 0x400BEB08, SREG : 0x3400BF03, Cause : 0x00000024
ErrorEPC : 0x400C6698, BadVaddr : 0xFFBFFFFB
-Process Traceback= No Extra Traceback
SLOT 2:00:00:09: %SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software
IOS (tm) GS Software (GLC1-LC-M), Version 12.0(18)S1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Fri 07-Sep-01 20:13 by nmasa
You can identify the type of crash that has occurred from the "sig=" value in the show context slot <slot> command output. See the SIG Code Table for details.
Here are some links that provide more information on the three most common types of line card crashes, and explain how to troubleshoot them:
In the example above, the line card has crashed due to a "software-forced crash" and, as the name suggests, a software exception has caused the reload. Once you have determined the cause and collected the necessary output, you can check for a bug in your Cisco IOS software release with the Bug Toolkit (registered customers only).
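To triage several crash records quickly, the signal value and the restart reason can be extracted from saved show context slot <slot> output. The sketch below is an illustration only: the function name crash_cause is hypothetical, it assumes one message per line, and it reports only what the output itself states rather than reproducing the SIG Code Table.

```python
import re

# The signal number appears as "sig=23"; match case-insensitively to be safe.
SIG_RE = re.compile(r"sig=(\d+)", re.IGNORECASE)
# The restart reason follows "System restarted by a ..." on the same line.
REASON_RE = re.compile(r"System restarted by (?:a |an )?(.+)")


def crash_cause(context_text):
    """Return (signal_number, restart_reason); either item may be None."""
    sig = SIG_RE.search(context_text)
    reason = REASON_RE.search(context_text)
    return (
        int(sig.group(1)) if sig else None,
        reason.group(1).strip() if reason else None,
    )


# Two lines taken from the show context slot 2 example above.
SAMPLE = """\
System exception: sig=23, code=0x24, context=0x4103FE84
System restarted by a Software forced crash
STACK TRACE:
"""

if __name__ == "__main__":
    print(crash_cause(SAMPLE))
```

For the slot 1 example earlier in this document, the same function would report signal 10 with the reason "Bus Error exception".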
When you have determined whether the problems are system errors in the log or an actual crash, you must check the current status of the line card to see whether it has recovered from the fault that has occurred. In order to identify the status of individual line cards, you can either examine the Light Emitting Diodes (LEDs) located on the front of the card, or issue the show led command. Here is a sample output:
Router#show led
SLOT 1 : RUN IOS
SLOT 6 : DNLD FABL
SLOT 7 : RP ACTV
SLOT 10 : RUN IOS
SLOT 11 : RUN IOS
SLOT 13 : RUN IOS
SLOT 14 : RUN IOS
Table 1 and Table 2 describe the most common types of output that you see from this command and their meanings.
Note: It is possible for the value of the LED to be reversed. For example, IOS RUN can be displayed as RUN IOS.
Table 1 – RP LED Status and Meaning

RP LED Status | Meaning of LED Status |
---|---|
RP UP | RP is running Cisco IOS software and functioning correctly |
MSTR RP | RP is acting as the Primary GRP |
SLAV RP | RP is acting as the Slave GRP |
RP ACTV | RP is acting as the Primary GRP |
RP SEC | RP is acting as the Slave GRP |
MEM INIT | RP is trying to size the memory |

Table 2 – LC LED Status and Meaning

LC LED Status | Meaning of LED Status |
---|---|
DIAG DNLD | Line card is downloading Field Diagnostic software |
DIAG FAIL | Line card has failed Field Diagnostic test |
DIAG PASS | Line card has passed Field Diagnostic test |
DIAG TEST | Line card is executing Field Diagnostic software |
FABL DNLD | Line card is launching "Fabric Downloader" |
FABL WAIT | Line card is waiting to load "Fabric Downloader" |
IN RSET | Line card is resetting |
IOS DNLD | Line card is downloading Cisco IOS software through the switch fabric |
IOS RUN | Line card is now enabled |
IOS UP | Line card has finished loading and is now running Cisco IOS software |
MBUS DNLD | Line card is downloading Maintenance Bus (MBUS) agent |
MEM INIT | Line card is trying to size memory |
PWR OFF | Line card is powered off |
If the line card status is anything other than "IOS RUN", or the GRP is neither the active Master/Primary nor the Slave/Secondary, this means that there is a problem and the card has not fully loaded correctly. Before you replace the card, Cisco recommends that you try these steps to fix the issue:
Reload the microcode through the microcode reload <slot> global configuration command.
Reload the card through the hw-module slot <slot> reload command. This causes the line card to reset and re-download the Maintenance Bus (MBUS) and Fabric Downloader software modules before it attempts to re-download the line card Cisco IOS software.
Reseat the line card manually (remove it and reinsert it firmly). This can rule out any problems that are caused by a bad connection to the MBUS or switching fabric.
Note: For more information on how to troubleshoot line cards stuck in any status other than RUN IOS, see Understanding the Booting Process on the Cisco 12000 Series Internet Router.
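The status check described above can be sketched as a small script that reads saved show led output and lists the slots that need attention. This is an illustrative example only: the function name unhealthy_slots is hypothetical, and the choice to treat every RP state in Table 1 except MEM INIT as healthy is an assumption made for this sketch. The word order of each status is ignored, because the LED value can be displayed reversed.

```python
import re

# One entry per slot, for example "SLOT 6 : DNLD FABL".
LED_LINE = re.compile(r"SLOT\s+(\d+)\s*:\s*(\S+)\s+(\S+)")

# States treated as healthy (assumption for this sketch). frozenset ignores
# word order, so "RUN IOS" and "IOS RUN" compare equal.
HEALTHY = {
    frozenset(("IOS", "RUN")),
    frozenset(("RP", "UP")),
    frozenset(("MSTR", "RP")),
    frozenset(("SLAV", "RP")),
    frozenset(("RP", "ACTV")),
    frozenset(("RP", "SEC")),
}


def unhealthy_slots(show_led_text):
    """Return {slot: status} for slots whose status is not in HEALTHY."""
    problems = {}
    for match in LED_LINE.finditer(show_led_text):
        words = (match.group(2), match.group(3))
        if frozenset(words) not in HEALTHY:
            problems[int(match.group(1))] = " ".join(words)
    return problems


# Sample modeled on the show led output above.
SAMPLE = """\
SLOT 1 : RUN IOS
SLOT 6 : DNLD FABL
SLOT 7 : RP ACTV
SLOT 10 : RUN IOS
"""

if __name__ == "__main__":
    print(unhealthy_slots(SAMPLE))
```

In the sample, only slot 6 is reported, because it is still at the Fabric Downloader stage.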
Fabric ping failures occur when either a line card or the secondary GRP fails to respond to a fabric ping request from the primary GRP over the switch fabric. Such failures are a problem symptom that you must investigate. They are indicated by these error messages:
%GRP-3-FABRIC_UNI: Unicast send timed out (1)
%GRP-3-COREDUMP: Core dump incident on slot 1, error: Fabric ping failure
%LCINFO-3-CRASH: Line card in slot 1 crashed
You can find more information about this issue at Troubleshooting Fabric Ping Timeouts and Failures on the Cisco 12000 Series Internet Router.
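If you keep logs on a syslog server, a short script can flag the fabric ping messages shown above. The sketch below is a minimal example under stated assumptions: the function name fabric_ping_events is hypothetical, the log is plain text with one message per line, and only the two message patterns quoted in this document are matched.

```python
import re

# Message patterns quoted in this document for fabric ping failures.
FABRIC_PATTERNS = (
    re.compile(r"%GRP-3-FABRIC_UNI\b"),
    re.compile(r"%GRP-3-COREDUMP:.*Fabric ping failure"),
)
# The slot number, when the message names one ("... on slot 1, ...").
SLOT_RE = re.compile(r"\bslot (\d+)")


def fabric_ping_events(log_text):
    """Return a list of (slot_or_None, line) for matching log lines."""
    events = []
    for line in log_text.splitlines():
        if any(pattern.search(line) for pattern in FABRIC_PATTERNS):
            slot = SLOT_RE.search(line)
            events.append((int(slot.group(1)) if slot else None, line.strip()))
    return events


SAMPLE = """\
%GRP-3-FABRIC_UNI: Unicast send timed out (1)
%GRP-3-COREDUMP: Core dump incident on slot 1, error: Fabric ping failure
%LCINFO-3-CRASH: Line card in slot 1 crashed
"""

if __name__ == "__main__":
    for slot, line in fabric_ping_events(SAMPLE):
        print(slot, line)
```

The %LCINFO-3-CRASH line is deliberately not matched, because a line card can crash for reasons other than a fabric ping failure.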
The Cisco 12000 Series Internet Router Parity Error Fault Tree document explains the steps to troubleshoot and isolate a part or component of the Cisco 12000 Series Internet Router that fails, after you encounter a variety of parity error messages.
If you experience any error messages related to one of the line cards, you can use the Cisco Error Message Decoder (registered customers only) to find information about the meaning of the error message. Some of them point to a hardware issue of the line card, whereas others indicate a Cisco IOS software bug, or a hardware issue on another part of the router. This document does not cover all these messages.
Some Cisco Express Forwarding (CEF) and Inter Process-Communication (IPC)-related messages are explained in Troubleshooting CEF-Related Error Messages.
Line card Field Diagnostic software is designed to identify any faulty line card within a Cisco 12000 (all 12xxx series) router. Prior to Cisco IOS software release 12.0(22)S, the Field Diagnostic software was embedded within the Cisco IOS software. From Cisco IOS software release 12.0(22)S onwards, this software has been unbundled, and you can download it from CCO through the Download Software Area (registered customers only) (select FIELD DIAGS under 120XX platform). It is still run from a command initiated while running Cisco IOS software, but you must specify the source (either Trivial File Transfer Protocol (TFTP) boot server, or PCMCIA Flash memory) on the command line. All Field Diagnostics commands are run at the enable level of Cisco IOS software.
From Cisco IOS software release 12.0(22)S onwards, Cisco Systems has unbundled the Cisco 12000 Field Diagnostic line card image from the Cisco IOS software image. In earlier versions, you launched diagnostics from the command line and the embedded diagnostic image ran. To accommodate customers with 20 MB Flash memory cards, the line card Field Diagnostic software is now stored and maintained as a separate image: c12k-fdiagsbflc-mz.xxx-xx.S.bin (where x is the version number). To launch Field Diagnostics, this image must therefore be available on a separate Flash card or on a TFTP boot server. The latest version is always available on Cisco.com. For Performance Route Processor (PRP) cards, Gigabit Route Processor (GRP) cards, and fabric tests, the diagnostics remain embedded in the Cisco IOS software image. The command-line options have changed to reflect this.
While the diagnostic test is in progress, the line card does not function normally and cannot pass any traffic for the duration of the test (5-20 minutes, based on the complexity of the line card). Without the verbose keyword, the command gives a truncated output that shows a Pass or Fail for the card. When you communicate with the TAC, the verbose mode is most helpful to identify specific problems. The truncated output of the diagnostic test looks like this:
Router# diag 7 verbose tftp://223.255.254.254/muckier/award/c12k-fdiagsbflc-mz
Running DIAG config check
Fabric Download for Field Diags chosen: If timeout occurs, try 'mbus' option.
Running Diags will halt ALL activity on the requested slot. [confirm]
Router#
Launching a Field Diagnostic for slot 7
Downloading diagnostic tests to slot 7 via fabric (timeout set to 300 sec.)
5d20h: %GRP-4-RSTSLOT: Resetting the card in the slot: 7,Event: EV_ADMIN_FDIAG
Loading muckier/award/c12k-fdiagsbflc-mz from 223.255.254.254 (via Ethernet0):
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
5d20h: Downloading diags from tftp file tftp://223.255.254.254/muckier/award/
c12k-fdiagsbflc-mz
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[OK - 13976524 bytes]
FD 7> *****************************************************
FD 7> GSR Field Diagnostics V6.05
FD 7> Compiled by award on Tue Jul 30 13:00:41 PDT 2002
FD 7> view: award-conn_isp.FieldDiagRelease
FD 7> *****************************************************
Executing all diagnostic tests in slot 7
(total/indiv. timeout set to 2000/600 sec.)
FD 7> BFR_CARD_TYPE_OC12_4P_POS testing...
FD 7> Available test types 2
FD 7> 1
FD 7> Completed f_diags_board_discovery() (0x1)
FD 7> Test list selection received: Test ID 1, Device 0
FD 7> running in slot 7 (30 tests from test list ID 1)
FD 7> Skipping MBUS_FDIAG command from slot 2
FD 7> Just into idle state
Field Diagnostic ****PASSED**** for slot 7
Shutting down diags in slot 7
Board will reload
5d20h: %GRP-4-RSTSLOT: Resetting the card in the slot: 7,Event:
EV_ADMIN_FDIAG
5d20h: %GRP-4-RSTSLOT: Resetting the card in the slot: 7,Event:
EV_FAB_DOWNLOADER_DOWNLOAD_FAILURE
SLOT 7:00:00:09: %SYS-5-RESTART: System restarted --
Cisco Internetwork Operating System Software
IOS (tm) GS Software (GLC1-LC-M), Experimental Version 12.0(20020509:045149)
[award-conn_isp.f_diag_new 337]
Copyright (c) 1986-2002 by cisco Systems, Inc.
Compiled Tue 25-Jun-02 15:51 by award
The line card reloads automatically only after it passes the test.
If the line card fails the test, it does not reload automatically; the example from a Cisco IOS software release earlier than 12.0(22)S, later in this document, shows such a failure. In that case, you can manually reload the line card with the hw-module slot <slot> reload command.
When you use the verbose keyword, the output includes each individual test that is performed. If the test PASSES, the next test is begun. A sample output looks like this:
Router# diag 7 verbose tftp tftp://223.255.254.254/muckier/award/c12k-fdiagsbflc-mz
Running DIAG config check
Fabric Download for Field Diags chosen: If timeout occurs, try 'mbus' option.
Verbose mode: Test progress and errors will be displayed
Runnning Diags will halt ALL activity on the requested slot. [confirm]
Router#
Launching a Field Diagnostic for slot 7
Downloading diagnostic tests to slot 7 via fabric (timeout set to 300 sec.)
00:07:41: %GRP-4-RSTSLOT: Resetting the card in the slot: 7,Event: EV_ADMIN_FDIAG
Loading muckier/award/c12k-fdiagsbflc-mz from 223.255.254.254 (via Ethernet0):
!!!!!! (...)
00:08:24: Downloading diags from tftp file tftp://223.255.254.254/muckier/
award/c12k-fdiagsbflc-mz
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!
[OK - 13976524 bytes]
FD 7> *****************************************************
FD 7> GSR Field Diagnostics V6.05
FD 7> Compiled by award on Tue Jul 30 13:00:41 PDT 2002
FD 7> view: award-conn_isp.FieldDiagRelease
FD 7> *****************************************************
Executing all diagnostic tests in slot 7
(total/indiv. timeout set to 2000/600 sec.)
FD 7> BFR_CARD_TYPE_OC12_4P_POS testing...
FD 7> Available test types 2
FD 7> 1
FD 7> Completed f_diags_board_discovery() (0x1)
FD 7> Verbosity now (0x00000011) TESTSDISP FATL
FD 7> Test list selection received: Test ID 1, Device 0
FD 7> running in slot 7 (30 tests from test list ID 1)
FD 7> Just into idle state
FDIAG_STAT_IN_PROGRESS(7): test #1 Dram Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #2 Dram Datapins
FDIAG_STAT_IN_PROGRESS(7): test #3 Dram Busfloat
FDIAG_STAT_IN_PROGRESS(7): test #4 RBM SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #5 RBM SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(7): test #6 RBM SSRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #7 RBM SSRAM Datapins Memory
FDIAG_STAT_IN_PROGRESS(7): test #8 TBM SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #9 TBM SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(7): test #10 TBM SSRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #11 TBM SSRAM Datapins Memory
FDIAG_STAT_IN_PROGRESS(7): test #12 PSA TLU SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #13 PSA TLU SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(7): test #14 PSA PLU SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #15 PSA PLU SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(7): test #16 PSA SRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(7): test #17 PSA SRAM Datapins
FDIAG_STAT_IN_PROGRESS(7): test #18 To Fabric SOP FIFO SRAM Memory
FDIAG_STAT_IN_PROGRESS(7): test #19 From Fabric SOP FIFO SRAM Memory
FDIAG_STAT_IN_PROGRESS(7): test #20 RBM to SALSA Packet
FDIAG_STAT_IN_PROGRESS(7): test #21 TBM to SALSA Packet
FDIAG_STAT_IN_PROGRESS(7): test #22 RBM to TBM SLI Packet Loopback
FDIAG_STAT_IN_PROGRESS(7): test #23 TBM to PSA Packet -Framer Loopback
FDIAG_STAT_IN_PROGRESS(7): test #24 TBM to TX SOP Packet
FDIAG_STAT_IN_PROGRESS(7): test #25 TBM to RX SOP Packet -4302 Terminal Loopback
FDIAG_STAT_IN_PROGRESS(7): test #26 TBM to RX SOP Packet -Framer System Bus Loop
FDIAG_STAT_IN_PROGRESS(7): test #27 RBM to TBM Fabric Packet Loopback
FDIAG_STAT_IN_PROGRESS(7): test #28 TBM to RBM Packet, RBM page crossing
FDIAG_STAT_IN_PROGRESS(7): test #29 TBM to TX SOP Packet Simultaneous
FDIAG_STAT_IN_PROGRESS(7): test #30 TBM to PSA Multicast Packets -Framer Loopback
FDIAG_STAT_DONE(7)
FD 7> Changed current_status to FDIAG_STAT_IDLE
Field Diagnostic ****PASSED**** for slot 7
Field Diag eeprom values: run 62 fail mode 0 (PASS) slot 7
last test failed was 0, error code 0
Shutting down diags in slot 7
Board will reload
These results are then stored in an Electrically Erasable Programmable Read-Only Memory (EEPROM) on the line card. You can view the results of the last diagnostic performed on the line card with the diag <slot> previous command. Here is a sample output:
Router#diag 3 previous
Field Diag eeprom values: run 0 fail mode 0 (PASS) slot 3
last test failed was 0, error code 0
If no previous field diagnostics have been performed on the card, the output looks like this:
Router#diag 3 previous
Field Diags have not been run on this board previously - EE prom results uninitialized.
Field Diag eeprom values: run 16777215 fail mode 0 (PASS) slot 9
last test failed was 65535, error code 65535
There have been some bugs in the past that have caused the diagnostic tests to fail even though the card is not faulty, so, as a precaution, if the line card fails and it has already been replaced previously, it would be useful to check this output with the Technical Assistance Center (TAC).
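Before you send results to the TAC, it can be convenient to turn the diag <slot> previous output into structured fields. The following Python sketch assumes the two-line "Field Diag eeprom values" format shown in this document; the function name parse_diag_previous and the dictionary keys are hypothetical.

```python
import re

# Matches the "Field Diag eeprom values" record, which may wrap onto a
# second line before "last test failed was".
EEPROM_RE = re.compile(
    r"Field Diag eeprom values: run (\d+) fail mode (\d+) \((.+?)\) slot (\d+)\s+"
    r"last test failed was (\d+), error code (\d+)"
)


def parse_diag_previous(text):
    """Return the stored result as a dict, or None if no record is found."""
    match = EEPROM_RE.search(text)
    if match is None:
        return None
    return {
        "run": int(match.group(1)),
        "fail_mode": int(match.group(2)),
        "status": match.group(3),
        "slot": int(match.group(4)),
        "last_failed_test": int(match.group(5)),
        "error_code": int(match.group(6)),
        # The router prints this phrase when diagnostics have never been run.
        "initialized": "results uninitialized" not in text,
    }


SAMPLE = """\
Field Diag eeprom values: run 0 fail mode 0 (PASS) slot 3
last test failed was 0, error code 0
"""

if __name__ == "__main__":
    print(parse_diag_previous(SAMPLE))
```

A record with "initialized" set to False means that the remaining values are placeholders and not a real test result.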
In Cisco IOS software releases earlier than 12.0(22)S, line card Field Diagnostic software is bundled with the main Cisco IOS software to enable you to test whether or not the suspect line card is faulty. To use this feature, you must be in privileged enable mode, and issue the diag <slot> [verbose] command.
While the diagnostic test is in progress, the line card does not function normally and cannot pass any traffic for the duration of the test (5-15 minutes, based on the complexity of the line card). Without the verbose keyword, the command gives a truncated output that shows a Pass or Fail for the card. The output of the diagnostic test without the verbose keyword looks like this:
Router#diag 3
Running DIAG config check
Running Diags will halt ALL activity on the requested slot [confirm]
Router#
Launching a Field Diagnostic for slot 3
Downloading diagnostic tests to slot 3 (timeout set to 600 sec.)
*Nov 18 22:20:40.237: %LINK-5-CHANGED: Interface GigabitEthernet3/0, changed state to administratively down
Field Diag download COMPLETE for slot 3
FD 3> *****************************************************
FD 3> GSR Field Diagnostics V4.0
FD 3> Compiled by award on Thu May 18 13:43:04 PDT 2000
FD 3> view: award-conn_isp.FieldDiagRelease
FD 3> *****************************************************
FD 3> BFR_CARD_TYPE_1P_GE testing...
FD 3> running in slot 3 (83 tests)
Executing all diagnostic tests in slot 3
(total/indiv. timeout set to 600/200 sec.)
Field Diagnostic: ****TEST FAILURE**** slot 3: last test run 51, Fabric Packet Loopback, error 3
Shutting down diags in slot 3
slot 3 done, will not reload automatically
The line card reloads automatically only after it passes the test. In the example above, the line card failed the test and thus did not reload automatically. You can manually reload the line card with the hw-module slot <slot> reload command.
When you use the verbose keyword, the output includes each individual test that is performed, and whether or not each test has passed or failed. Here is a sample output:
Router#diag 3 verbose
Running DIAG config check
Running Diags will halt ALL activity on the requested slot. [confirm]
Router#
Launching a Field Diagnostic for slot 3
Downloading diagnostic tests to slot 3 (timeout set to 600 sec.)
Field Diag download COMPLETE for slot 3
FD 3> *****************************************************
FD 3> GSR Field Diagnostics V4.0
FD 3> Compiled by award on Thu May 18 13:43:04 PDT 2000
FD 3> view: award-conn_isp.FieldDiagRelease
FD 3> *****************************************************
FD 3> BFR_CARD_TYPE_1P_GE testing...
FD 3> running in slot 3 (83 tests)
Executing all diagnostic tests in slot 3
(total/indiv. timeout set to 600/200 sec.)
FD 3> Verbosity now (0x00000001) TESTSDISP
FDIAG_STAT_IN_PROGRESS(3): test #1 R5K Internal Cache
FDIAG_STAT_IN_PROGRESS(3): test #2 Burst Operations
FDIAG_STAT_IN_PROGRESS(3): test #3 Subblock Ordering
FDIAG_STAT_IN_PROGRESS(3): test #4 P4/EEPROM Clock Speed Matching
FDIAG_STAT_IN_PROGRESS(3): test #5 Dram Marching Pattern
FDIAG_STAT_IN_PROGRESS(3): test #6 Dram Datapins
FDIAG_STAT_IN_PROGRESS(3): test #7 Dram Busfloat
FDIAG_STAT_IN_PROGRESS(3): test #8 To Fabric (RX) BMA SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(3): test #9 To Fabric (RX) BMA SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(3): test #10 To Fabric (RX) BMA Q Manager SRAM Busfloat
FDIAG_STAT_IN_PROGRESS(3): test #11 To Fabric (RX) BMA Q Manager SRAM Datapins
FDIAG_STAT_IN_PROGRESS(3): test #12 To Fabric (RX) BMA Q Manager SRAM Marching Pa
FDIAG_STAT_IN_PROGRESS(3): test #13 From Fabric (TX) BMA SDRAM Marching Pattern
FDIAG_STAT_IN_PROGRESS(3): test #14 From Fabric (TX) BMA SDRAM Datapins
FDIAG_STAT_IN_PROGRESS(3): test #15 From Fabric (TX) BMA Q Manager SRAM Busfloat
FDIAG_STAT_IN_PROGRESS(3): test #16 From Fabric (TX) BMA Q Manager SRAM Datapins
FDIAG_STAT_IN_PROGRESS(3): test #17 From Fabric (TX) BMA Q Manager SRAM Marching
FDIAG_STAT_IN_PROGRESS(3): test #18 To Fabric SOP FIFO SRAM Memory
FDIAG_STAT_IN_PROGRESS(3): test #19 From Fabric SOP FIFO SRAM Memory
FDIAG_STAT_IN_PROGRESS(3): test #20 SALSA Asic Registers
FDIAG_STAT_IN_PROGRESS(3): test #21 Salsa Dram Access
FDIAG_STAT_IN_PROGRESS(3): test #22 Salsa P4 Timeout
FDIAG_STAT_IN_PROGRESS(3): test #23 Salsa Asic General Purpose Counter
FDIAG_STAT_IN_PROGRESS(3): test #24 Salsa Asic Real Time Interrupt
FDIAG_STAT_IN_PROGRESS(3): test #25 Salsa Errors
FDIAG_STAT_IN_PROGRESS(3): test #26 Salsa DRAM Burst Operations Error
FDIAG_STAT_IN_PROGRESS(3): test #27 Salsa Dram Read Around Write
FDIAG_STAT_IN_PROGRESS(3): test #28 Salsa Dram Write Parity Error test
FDIAG_STAT_IN_PROGRESS(3): test #29 Salsa Prefetch/Write Buffers
FDIAG_STAT_IN_PROGRESS(3): test #30 Salsa FrFab BMA SDram Read Around Write
FDIAG_STAT_IN_PROGRESS(3): test #31 Salsa ToFab BMA SDram Read Around Write
FDIAG_STAT_IN_PROGRESS(3): test #32 Salsa FrFab Network Interrupt Disable Timer
FDIAG_STAT_IN_PROGRESS(3): test #33 Salsa ToFab Network Interrupt Disable Timer
FDIAG_STAT_IN_PROGRESS(3): test #34 Salsa ToFab Network Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #35 Salsa FrFab Network Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #36 Salsa ToFab BMA Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #37 Salsa FrFab BMA Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #38 Salsa - To Fabric BMA Packet - Early Clear
FDIAG_STAT_IN_PROGRESS(3): test #39 Salsa - From Fabric BMA Packet - Early Clear
FDIAG_STAT_IN_PROGRESS(3): test #40 Salsa To Fabric SOP Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #41 Salsa From Fabric SOP Interrupt Mask
FDIAG_STAT_IN_PROGRESS(3): test #42 SALSA ECC Generation
FDIAG_STAT_IN_PROGRESS(3): test #43 SALSA ECC Correction
FDIAG_STAT_IN_PROGRESS(3): test #44 To Fabric FIA48 ASIC Registers
FDIAG_STAT_IN_PROGRESS(3): test #45 To Fabric FIA48 Packet
FDIAG_STAT_IN_PROGRESS(3): test #46 To Fabric FIA48 Asic BMA Bus Parity Error
FDIAG_STAT_IN_PROGRESS(3): test #47 To Fabric FIA48 Asic CiscoCell Fifo Parity Er
FDIAG_STAT_IN_PROGRESS(3): test #48 From Fabric FIA48 ASIC Registers
FDIAG_STAT_IN_PROGRESS(3): test #50 SLI Packet Loopback
FDIAG_STAT_IN_PROGRESS(3): test #51 Fabric Packet Loopback
FD 3> INT_CAUSE_REG = 0x00000620
FD 3> Unexpected L3FE Interrupt occurred.
FD 3> ERROR: TX FIA48 Asic Interrupt Occurred
FD 3> *** 0-INT: External Interrupt ***
FD 3> Dumping out TX FIA Status Registers, Disabling
FD 3> TX FIA Interrupt, resetting Asics, continuing...
FDIAG_STAT_DONE_FAIL(3) test_num 51, error_code 3
Field Diagnostic: ****TEST FAILURE**** slot 3: last test run 51, Fabric Packet Loopback, error 3
Field Diag eeprom values: run 3 fail mode 1 (TEST FAILURE) slot 3
last test failed was 51, error code 3
Shutting down diags in slot 3
slot 3 done, will not reload automatically
Router#
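Verbose output is long, so it can be useful to pull out just the failed test. The sketch below is one possible approach in Python, assuming the verbose output has been captured with one message per line; the function name failed_test and the returned keys are hypothetical.

```python
import re

# Progress lines name each test, for example "test #51 Fabric Packet Loopback".
TEST_RE = re.compile(r"FDIAG_STAT_IN_PROGRESS\((\d+)\): test #(\d+) (.+)")
# The failure summary carries the slot, test number, and error code.
FAIL_RE = re.compile(
    r"FDIAG_STAT_DONE_FAIL\((\d+)\) test_num (\d+), error_code (\d+)"
)


def failed_test(verbose_text):
    """Return details of the failed test, or None if no failure is reported."""
    names = {}
    for line in verbose_text.splitlines():
        match = TEST_RE.search(line)
        if match:
            names[int(match.group(2))] = match.group(3).strip()
    fail = FAIL_RE.search(verbose_text)
    if fail is None:
        return None
    slot, test_num, error_code = (int(value) for value in fail.groups())
    return {
        "slot": slot,
        "test": test_num,
        "name": names.get(test_num),  # None if the progress line is missing
        "error_code": error_code,
    }


# Shortened sample modeled on the verbose output above.
SAMPLE = """\
FDIAG_STAT_IN_PROGRESS(3): test #50 SLI Packet Loopback
FDIAG_STAT_IN_PROGRESS(3): test #51 Fabric Packet Loopback
FD 3> ERROR: TX FIA48 Asic Interrupt Occurred
FDIAG_STAT_DONE_FAIL(3) test_num 51, error_code 3
"""

if __name__ == "__main__":
    print(failed_test(SAMPLE))
```

The test number, test name, and error code are the details that are most useful to include when you open a service request.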
These results are then stored in an Electrically Erasable Programmable Read-Only Memory (EEPROM) on the line card. You can view the results of the last diagnostic performed on the line card with the diag <slot> previous command. Here is a sample output:
Router#diag 3 previous
Field Diag eeprom values: run 0 fail mode 0 (PASS) slot 3
last test failed was 0, error code 0
If no previous field diagnostics have been performed on the card, the output looks like this:
Router#diag 3 previous
Field Diags have not been run on this board previously - EE prom results uninitialized.
Field Diag eeprom values: run 16777215 fail mode 0 (PASS) slot 9
last test failed was 65535, error code 65535
There have been some bugs in the past that have caused the diagnostic tests to fail even though the card is not faulty, so, as a precaution, if the line card fails and it has already been replaced previously, it would be useful to check this output with the Technical Assistance Center (TAC).
If you have identified a component that needs to be replaced, contact your Cisco partner or reseller to request a replacement for the hardware component that is causing the issue. If you have a support contract directly with Cisco, use the TAC Service Request Tool (registered customers only) to open a TAC service request for a hardware replacement. Make sure you attach the following information:
Revision | Publish Date | Comments |
---|---|---|
1.0 | 09-Mar-2009 | Initial Release |