Monitoring the Control Plane

To verify the overall health of your system, monitor control plane resources on a regular basis.

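This chapter includes the following sections:

  • Avoiding Problems Through Regular Monitoring
  • Control Plane Overview
  • Monitoring Control Plane Resources
  • For More Information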

Avoiding Problems Through Regular Monitoring

Monitoring system resources allows you to detect potential problems before they cause outages. The following examples show the advantages of regular monitoring:

  • In a real-life example, customers installed new line cards. After the line cards were in operation for a few years, lack of memory on those line cards caused major outages in some cases. Monitoring memory usage would have identified a memory issue and avoided an outage.
  • Regular monitoring establishes a baseline for a normal system load. You can use this information as a basis for comparison when you upgrade hardware or software—to see if the upgrade has affected resource usage.
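
As a minimal sketch of how a baseline might be used, the following Python compares a current reading with a recorded baseline and reports the metrics that grew beyond a tolerance. The metric names, the sample values, and the 20 percent tolerance are illustrative assumptions, not values from this guide.

```python
# Hypothetical baseline comparison. Metric names, sample values, and the
# default tolerance are assumptions made for illustration only.

def compare_to_baseline(baseline, current, tolerance_pct=20.0):
    """Return metrics whose value grew more than tolerance_pct over baseline."""
    flagged = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # a relative change cannot be computed
        change_pct = (current[name] - base_value) / base_value * 100.0
        if change_pct > tolerance_pct:
            flagged[name] = round(change_pct, 1)
    return flagged


baseline = {"rp0_mem_used_kb": 2000000, "rp0_load_5min": 0.30}
after_upgrade = {"rp0_mem_used_kb": 2600000, "rp0_load_5min": 0.31}

# Memory use grew by 30 percent, so only that metric is reported.
print(compare_to_baseline(baseline, after_upgrade))
```

Recording the baseline before an upgrade and running the same comparison afterward gives a quick indication of whether the upgrade changed resource usage.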

Control Plane Overview

The following sections contain a high-level overview of the control plane.

Cisco ASR 1000 Series Routers Control Plane Architecture

The major components in the control plane are:

  • Cisco ASR 1000 Series Route Processor (RP)—A general purpose CPU responsible for routing protocols, CLI, network management interfaces, code storage, logging, and chassis management. The Cisco ASR 1000 Series RPs process network control packets as well as protocols not supported by the Cisco ASR 1000 Series ESP.
  • Cisco ASR 1000 Series Embedded Services Processor (ESP)—A forwarding processor that handles forwarding control plane traffic, and performs packet processing functions such as firewall inspection, ACLs, encryption, and QoS.
  • Cisco ASR 1000 Series SPA Interface Processor (SIP)—An interface processor that provides the connection between the Route Processor and the shared port adapters (SPAs).

Distributed Control Plane Architecture

Cisco ASR 1000 Series Routers have a distributed control plane architecture. A separate control processor is embedded on each major component in the control plane, as shown in Figure 2-1:

  • Route Processor (RP)
  • Forwarding Engine Control Processor (FECP)
  • I/O Control Processor (IOCP)

The RP manages and maintains the control plane using a dedicated Gigabit Ethernet out-of-band channel (EOBC). The internal EOBC is used to continuously exchange system state information among the major components. Because state is exchanged continuously, when a failure causes a switchover, the standby RP and ESP are immediately ready to assume the control plane or data forwarding functions of the failed component.

The inter-integrated circuit (I2C) bus monitors the health of hardware components. The Enhanced SerDes Interconnect (ESI) is a set of serial links on the midplane that form the data path connecting the RP, SIPs, and standby ESPs to the active ESP.

Figure 2-1 Cisco ASR 1000 Series Routers Control Plane Architecture

 

 

The control plane processors perform the following functions:

RP

  • Runs the router control plane (Cisco IOS), including processing network control packets, computing routes, and setting up connections.
  • Monitors interface and environmental status, including management ports, LEDs, alarms, and SNMP network management.
  • Downloads code to other components in the system.
  • Selects the active RP and ESP and synchronizes the standby RP and ESP.
  • Manages logging facilities, on-board failure logging (OBFL), and statistics aggregation.

FECP

  • Provides direct CPU access to the forwarding engine subsystem, the Cisco QuantumFlow Processor (QFP) subsystem, which is the forwarding processor chipset and resides on the ESP.
  • Manages the forwarding engine subsystem and its connection to I/O.
  • Manages the forwarding processor chipset.

IOCP

  • Provides direct CPU access to SPAs installed in a SIP.
  • Manages the SPAs.
  • Handles SPA online insertion and removal (OIR) events.
  • Runs SPA drivers that initialize and configure SPAs.

Cisco IOS XE Software Architecture

The control plane processors run Cisco IOS XE software, which is an operating system that consists of a Linux-based kernel and a common set of operating system-level utility programs. It is a distributed software architecture that moves many operating system responsibilities out of the IOS process.

In this architecture, IOS runs as one of many Linux processes while allowing other Linux processes to share responsibility for running the router. IOS runs as a user process on the RP. Hardware-specific components have been removed from the IOS process and are handled by separate middleware processes in Cisco IOS XE software. If a hardware-specific issue is discovered, the middleware process can be modified without touching the IOS process.

Figure 2-2 shows the main components of the Cisco IOS XE software architecture. This modular architecture increases network resiliency by distributing operating responsibility among separate processes. The architecture also allows for better allocation of memory so the router can run more efficiently.

All Cisco IOS XE software modules run in their own protected memory spaces, which facilitates fault containment. A software failure in an individual module is localized to that module, and all other software processes continue to operate. For example, a separate driver process runs on the SIP for each SPA, even if multiple SPAs of the same type are present. Because each SPA driver runs in its own protected memory, failure or upgrade of an individual driver is localized to the affected SPA.

Figure 2-2 Cisco IOS XE Software Architecture

 

 

Using the Linux architecture, Cisco IOS XE provides the following benefits:

  • The ability to integrate multi-core (multiple CPUs on a single piece of silicon) processors.
  • The IOS process has no direct access to hardware components, thus providing a greater level of resiliency.
  • The ability to run active and standby IOS processes on the non-hardware-redundant Cisco ASR 1002 Router and Cisco ASR 1004 Router.
  • The IOS process operates as a virtual machine under the RP Linux kernel. Upon bootup, the RP Linux kernel allocates 50 percent of available memory to IOS processes as a one-time event. For systems that have a single IOS process, IOS is allocated approximately 45 percent of total RP memory. For redundant IOS process systems, each IOS process is allocated approximately 20 percent of total RP memory.
  • Hardware components are managed through memory-protected middleware processes.
  • SPA drivers run as unique processes allowing the ability to upgrade and restart individual SPAs.
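
The approximate memory shares quoted in the list above can be turned into a rough estimate. The following Python is only a sketch of that arithmetic: the function name is hypothetical, and the 45 percent and 20 percent figures are the approximations stated above, so actual allocations on a given router will differ.

```python
# Approximate shares taken from the text above; real allocations vary.
SINGLE_IOS_SHARE = 0.45     # one IOS process: about 45% of total RP memory
REDUNDANT_IOS_SHARE = 0.20  # each redundant IOS process: about 20%


def approx_ios_memory_gb(total_rp_gb, redundant=False):
    """Estimate the memory (in GB) allocated to one IOS process."""
    share = REDUNDANT_IOS_SHARE if redundant else SINGLE_IOS_SHARE
    return total_rp_gb * share


# Example for an RP with 8 GB of memory (an illustrative size).
print(approx_ios_memory_gb(8))                  # single IOS process
print(approx_ios_memory_gb(8, redundant=True))  # each of two IOS processes
```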

Monitoring Control Plane Resources

The following sections discuss monitoring memory and CPU from the perspective of the IOS process and from the perspective of the overall control plane:

  • IOS Process Resources
  • Overall Control Plane Resources

IOS Process Resources

For information about memory and CPU utilization from within the IOS process, use the show memory command and the show process cpu command. Note that these commands provide a representation of memory and CPU utilization from the perspective of the IOS process only; they do not include information for resources on the entire route processor. For example, show memory on an RP2 with 8 GB of RAM running a single IOS process shows the following memory usage:

Router# show memory
 
           Head            Total(b)    Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor  2ABEA4316010  4489061884  314474916  4174586968  3580216380  3512323496
lsmpi_io   2ABFAFF471A8     6295128    6294212         916         916         916
Critical   2ABEB7C72EB0     1024004         92     1023912     1023912     1023912
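
When this output is collected regularly, a short script can extract the pool figures for trending. The following Python is a minimal sketch that assumes the summary rows have exactly the layout shown above (pool name, head address, and five byte counts); the function name and the derived used_pct field are illustrative, not part of any Cisco tool.

```python
def parse_show_memory(text):
    """Parse the pool rows of 'show memory' summary output into dicts."""
    pools = {}
    for line in text.splitlines():
        fields = line.split()
        # A pool row has a name, a hex head address, and five byte counts.
        if len(fields) != 7 or not all(f.isdigit() for f in fields[2:]):
            continue
        total, used, free, lowest, largest = (int(f) for f in fields[2:])
        pools[fields[0]] = {
            "total": total,
            "used": used,
            "free": free,
            "lowest": lowest,
            "largest": largest,
            "used_pct": round(used / total * 100, 1),  # derived, not in output
        }
    return pools


sample = """\
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 2ABEA4316010 4489061884 314474916 4174586968 3580216380 3512323496
lsmpi_io 2ABFAFF471A8 6295128 6294212 916 916 916
Critical 2ABEB7C72EB0 1024004 92 1023912 1023912 1023912
"""

for name, pool in parse_show_memory(sample).items():
    print(name, pool["used"], pool["free"], pool["used_pct"])
```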
 

For the dual-core RP2, the show process cpu command reports a single IOS CPU utilization figure averaged across both cores:

Router# show process cpu
 
CPU utilization for five seconds: 0%/0%; one minute: 0%; five minutes: 0%
PID  Runtime(ms)  Invoked  uSecs   5Sec   1Min   5Min  TTY  Process
  1          583    48054     12  0.00%  0.00%  0.00%    0  Chunk Manager
  2          991   176805      5  0.00%  0.00%  0.00%    0  Load Meter
  3            0        2      0  0.00%  0.00%  0.00%    0  IFCOM Msg Hdlr
  4            0       11      0  0.00%  0.00%  0.00%    0  Retransmission o
  5            0        3      0  0.00%  0.00%  0.00%    0  IPC ISSU Dispatc
  6       230385   119697   1924  0.00%  0.01%  0.00%    0  Check heaps
  7           49       28   1750  0.00%  0.00%  0.00%    0  Pool Manager
  8            0        2      0  0.00%  0.00%  0.00%    0  Timers
  9        17268   644656     26  0.00%  0.00%  0.00%    0  ARP Input
 10          197   922201      0  0.00%  0.00%  0.00%    0  ARP Background
 11            0        2      0  0.00%  0.00%  0.00%    0  ATM Idle Timer
 12            0        1      0  0.00%  0.00%  0.00%    0  ATM ASYNC PROC
 13            0        1      0  0.00%  0.00%  0.00%    0  AAA_SERVER_DEADT
 14            0        1      0  0.00%  0.00%  0.00%    0  Policy Manager
 15            0        2      0  0.00%  0.00%  0.00%    0  DDR Timers
 16            1       15     66  0.00%  0.00%  0.00%    0  Entity MIB API
 17           13     1195     10  0.00%  0.00%  0.00%    0  EEM ED Syslog
 18           93       46   2021  0.00%  0.00%  0.00%    0  PrstVbl
 19            0        1      0  0.00%  0.00%  0.00%    0  RO Notify Timers
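
The summary line at the top of this output is usually the part worth recording. The following Python is a sketch that extracts its four figures with a regular expression. It assumes the header wording shown above and follows the usual IOS convention that the five-second value before the slash is total utilization and the value after the slash is time spent at interrupt level. The function and key names are hypothetical.

```python
import re

# Matches the summary line of 'show process cpu' as shown above.
HEADER_RE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_header(line):
    """Return the utilization figures from the summary line, or None."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    total_5s, interrupt_5s, one_min, five_min = map(int, match.groups())
    return {
        "five_sec_total": total_5s,
        "five_sec_interrupt": interrupt_5s,
        "one_min": one_min,
        "five_min": five_min,
    }


line = "CPU utilization for five seconds: 0%/0%; one minute: 0%; five minutes: 0%"
print(parse_cpu_header(line))
```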
 

Overall Control Plane Resources

For information about control plane memory and CPU utilization on each control processor, use the show platform software status control-processor brief command (summary view) or the show platform software status control-processor command (detailed view).

All control processors should show a status of Healthy. Other possible status values are Warning and Critical. Warning indicates that the router is operational but that the operating level should be reviewed. Critical implies that the router is near failure.

If you see a status of Warning or Critical, take the following actions:

  • Reduce static and dynamic loads on the system by reducing the number of elements in the configuration or by limiting the capacity for dynamic services.
  • Reduce the number of routes and adjacencies, limit the number of ACLs and other rules, reduce the number of VLANs, and so on.

The following sections describe the fields in the output of the show platform software status control-processor command.

Load Average

Load average represents the process queue, that is, process contention for CPU resources. For example, on a single-core processor, an instantaneous load of 7 means that seven processes are ready to run, one of which is currently running. On a dual-core processor, a load of 7 means that seven processes are ready to run, two of which are currently running.
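
The arithmetic in this example can be sketched as follows. The function name is hypothetical, and the sketch simply restates the reasoning above: at most one process runs per core, and the remainder wait.

```python
def split_load(load, cores):
    """Split an instantaneous load into (running, waiting) process counts."""
    running = min(load, cores)    # at most one running process per core
    waiting = max(0, load - cores)
    return running, waiting


# The two cases described above.
print(split_load(7, 1))  # single-core: 1 running, 6 waiting
print(split_load(7, 2))  # dual-core: 2 running, 5 waiting
```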

Memory Utilization

Memory utilization is represented by the following fields:

  • Total—Total line card memory
  • Used—Consumed memory
  • Free—Available memory
  • Committed—Virtual memory committed to processes

CPU Utilization

CPU utilization is an indication of the percentage of time the CPU is busy and is represented by the following fields:

  • CPU—The allocated processor
  • User—Non-Linux kernel processes
  • System—Linux kernel processes
  • Nice—Low priority processes
  • Idle—Percentage of time the CPU was inactive
  • IRQ—Interrupts
  • SIRQ—System Interrupts
  • IOwait—Percentage of time CPU was waiting for I/O
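
As a rough illustration of how these fields relate, the following Python adds the non-idle fields to estimate how busy a CPU is. Treating IOwait separately from busy time is an assumption of this sketch, not something this guide specifies, and the function name is hypothetical. The input values are the RP0 figures from the sample command output in this chapter.

```python
def busy_percent(user, system, nice, irq, sirq):
    """Sum the fields that represent work done by the CPU (percentages)."""
    return user + system + nice + irq + sirq


# RP0 values from the sample command output in this chapter.
rp0 = {"user": 30.12, "system": 1.69, "nice": 0.00, "irq": 0.13, "sirq": 0.41}
idle, iowait = 67.63, 0.00

busy = busy_percent(**rp0)
print(f"busy={busy:.2f}% idle={idle:.2f}% iowait={iowait:.2f}%")
# All fields together should come close to 100; rounding in the command
# output can leave a small gap.
print(f"total={busy + idle + iowait:.2f}%")
```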

The following are examples of the show platform software status control-processor command.

Router# show platform software status control-processor brief
Load Average
Slot  Status    1-Min   5-Min  15-Min
RP0   Healthy    0.25    0.30    0.44
RP1   Healthy    0.31    0.19    0.12
ESP0  Healthy    0.01    0.05    0.02
ESP1  Healthy    0.03    0.05    0.01
SIP1  Healthy    0.15    0.07    0.01
SIP2  Healthy    0.03    0.03    0.00

Memory (kB)
Slot  Status     Total     Used (Pct)     Free (Pct)  Committed (Pct)
RP0   Healthy  3722408  2514836 (60%)  1207572 (29%)    1891176 (45%)
RP1   Healthy  3722408  2547488 (61%)  1174920 (28%)    1889976 (45%)
ESP0  Healthy  2025468  1432088 (68%)   593380 (28%)   3136912 (149%)
ESP1  Healthy  2025468  1377980 (65%)   647488 (30%)   3084412 (147%)
SIP1  Healthy   480388   293084 (55%)   187304 (35%)     148532 (28%)
SIP2  Healthy   480388   273992 (52%)   206396 (39%)      93188 (17%)

CPU Utilization
Slot  CPU   User  System   Nice   Idle    IRQ   SIRQ  IOwait
RP0     0  30.12    1.69   0.00  67.63   0.13   0.41    0.00
RP1     0  21.98    1.13   0.00  76.54   0.04   0.12    0.16
ESP0    0  13.37    4.77   0.00  81.58   0.07   0.19    0.00
ESP1    0   5.76    3.56   0.00  90.58   0.03   0.05    0.00
SIP1    0   3.79    0.13   0.00  96.04   0.00   0.02    0.00
SIP2    0   3.50    0.12   0.00  96.34   0.00   0.02    0.00
 
 
Router# show platform software status control-processor
RP0: online, statistics updated 10 seconds ago
Load Average: healthy
1-Min: 0.30, status: healthy, under 5.00
5-Min: 0.31, status: healthy, under 5.00
15-Min: 0.47, status: healthy, under 5.00
Memory (kb): healthy
Total: 3722408
Used: 2514776 (60%), status: healthy, under 90%
Free: 1207632 (29%), status: healthy, over 10%
Committed: 1891176 (45%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 30.12, System: 1.69, Nice: 0.00, Idle: 67.63
IRQ: 0.13, SIRQ: 0.41, IOwait: 0.00
 
RP1: online, statistics updated 5 seconds ago
Load Average: healthy
1-Min: 0.14, status: healthy, under 5.00
5-Min: 0.11, status: healthy, under 5.00
15-Min: 0.09, status: healthy, under 5.00
Memory (kb): healthy
Total: 3722408
Used: 2547488 (61%), status: healthy, under 90%
Free: 1174920 (28%), status: healthy, over 10%
Committed: 1889976 (45%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 21.98, System: 1.13, Nice: 0.00, Idle: 76.54
IRQ: 0.04, SIRQ: 0.12, IOwait: 0.16
 
ESP0: online, statistics updated 5 seconds ago
Load Average: healthy
1-Min: 0.06, status: healthy, under 5.00
5-Min: 0.09, status: healthy, under 5.00
15-Min: 0.03, status: healthy, under 5.00
Memory (kb): healthy
Total: 2025468
Used: 1432088 (68%), status: healthy, under 90%
Free: 593380 (28%), status: healthy, over 10%
Committed: 3136912 (149%), status: healthy, under 300%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 13.37, System: 4.77, Nice: 0.00, Idle: 81.58
IRQ: 0.07, SIRQ: 0.19, IOwait: 0.00
 
ESP1: online, statistics updated 5 seconds ago
Load Average: healthy
1-Min: 0.22, status: healthy, under 5.00
5-Min: 0.08, status: healthy, under 5.00
15-Min: 0.02, status: healthy, under 5.00
Memory (kb): healthy
Total: 2025468
Used: 1377980 (65%), status: healthy, under 90%
Free: 647488 (30%), status: healthy, over 10%
Committed: 3084412 (147%), status: healthy, under 300%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 5.76, System: 3.56, Nice: 0.00, Idle: 90.58
IRQ: 0.03, SIRQ: 0.05, IOwait: 0.00
 
SIP1: online, statistics updated 6 seconds ago
Load Average: healthy
1-Min: 0.05, status: healthy, under 5.00
5-Min: 0.06, status: healthy, under 5.00
15-Min: 0.00, status: healthy, under 5.00
Memory (kb): healthy
Total: 480388
Used: 293084 (55%), status: healthy, under 90%
Free: 187304 (35%), status: healthy, over 10%
Committed: 148532 (28%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 3.79, System: 0.13, Nice: 0.00, Idle: 96.04
IRQ: 0.00, SIRQ: 0.02, IOwait: 0.00
 
SIP2: online, statistics updated 8 seconds ago
Load Average: healthy
1-Min: 0.03, status: healthy, under 5.00
5-Min: 0.03, status: healthy, under 5.00
15-Min: 0.00, status: healthy, under 5.00
Memory (kb): healthy
Total: 480388
Used: 273992 (52%), status: healthy, under 90%
Free: 206396 (39%), status: healthy, over 10%
Committed: 93188 (17%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
User: 3.50, System: 0.12, Nice: 0.00, Idle: 96.34
IRQ: 0.00, SIRQ: 0.02, IOwait: 0.00
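
To check the status column automatically, the brief output can be parsed with a few lines of code. The following Python is a minimal sketch that reads only the Load Average table and lists any slot whose status is not Healthy. It assumes the table layout shown in the brief example above, and the function names are hypothetical.

```python
def parse_load_average(text):
    """Parse the Load Average table of the brief output into a list of dicts."""
    rows = []
    in_section = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["Load", "Average"]:
            in_section = True
        elif in_section and not fields:
            break  # a blank line ends the Load Average table
        elif in_section and len(fields) == 5 and fields[0] != "Slot":
            rows.append({
                "slot": fields[0],
                "status": fields[1],
                "one_min": float(fields[2]),
                "five_min": float(fields[3]),
                "fifteen_min": float(fields[4]),
            })
    return rows


def slots_needing_review(rows):
    """Return the slots whose status is Warning, Critical, or anything else."""
    return [row["slot"] for row in rows if row["status"] != "Healthy"]


sample = """\
Load Average
Slot Status 1-Min 5-Min 15-Min
RP0 Healthy 0.25 0.30 0.44
RP1 Healthy 0.31 0.19 0.12
ESP0 Healthy 0.01 0.05 0.02
ESP1 Healthy 0.03 0.05 0.01
SIP1 Healthy 0.15 0.07 0.01
SIP2 Healthy 0.03 0.03 0.00

Memory (kB)
"""

rows = parse_load_average(sample)
print(len(rows), "slots parsed")
print("Slots needing review:", slots_needing_review(rows))
```

The same approach extends to the Memory and CPU Utilization tables, which use the same slot and status columns.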
 

For More Information

For more information about the topics discussed in this chapter, see the following documents:

Topic                 Document
Command descriptions  Cisco IOS Master Command List, All Releases
                      Command Lookup Tool (requires Cisco.com user ID and password)