Information About High Availability
High Availability in controllers allows you to reduce the downtime of the wireless networks that could occur due to failover of controllers.
A 1:1 (Active:Standby-Hot) stateful switchover of access points and clients is supported (HA SSO). In a High Availability architecture, one controller is configured as the primary controller and another controller as the secondary controller.
After you enable High Availability, the primary and secondary controllers are rebooted. During the boot process, the role of the primary controller is negotiated as active and the role of the secondary controller as standby-hot. After a switchover, the secondary controller becomes the active controller and the primary controller becomes the standby-hot controller. After subsequent switchovers, the roles are interchanged between the primary and the secondary controllers. Most switchover events are caused by a manual trigger, a controller failure, or a network failure.
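The distinction between the configured unit (primary or secondary) and the negotiated role (active or standby-hot) can be sketched as follows. This is a minimal illustrative model, not controller software; the Controller class and the negotiate_roles and switchover functions are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

ACTIVE = "active"
STANDBY_HOT = "standby-hot"


@dataclass
class Controller:
    unit: str  # configured unit, "primary" or "secondary"; never changes
    role: str  # negotiated role; changes on every switchover


def negotiate_roles(primary: Controller, secondary: Controller) -> None:
    """First boot after HA is enabled: primary is active, secondary is standby-hot."""
    primary.role = ACTIVE
    secondary.role = STANDBY_HOT


def switchover(a: Controller, b: Controller) -> None:
    """Interchange the negotiated roles; the configured units are untouched."""
    a.role, b.role = b.role, a.role


primary = Controller(unit="primary", role="")
secondary = Controller(unit="secondary", role="")

negotiate_roles(primary, secondary)
switchover(primary, secondary)  # first switchover
roles_after_first = (primary.role, secondary.role)
switchover(primary, secondary)  # second switchover
roles_after_second = (primary.role, secondary.role)
```

After the first switchover the primary unit holds the standby-hot role, and after the second it is active again, which matches the interchange described above.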
During an HA SSO failover event, all of the AP CAPWAP sessions and client sessions in RUN state on the controller are statefully switched over to the standby controller without interruption, except PMIPv6 clients, which will need to reconnect and authenticate to the controller following an HA SSO switchover. For additional client SSO behaviors and limitations, see the "Client SSO" section in the High Availability (SSO) Deployment Guide at:
https://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/8-1/HA_SSO_DG/High_Availability_DG.html#pgfId-53637
The standby-hot controller continuously monitors the health of the active controller through its dedicated redundancy port. Both the controllers share the same configurations, including the IP address of the management interface.
Before you enable High Availability, ensure that both the controllers can successfully communicate with one another through their dedicated redundancy port, either through a direct cable connection or through Layer 2. For more details, see the "Redundancy Port Connectivity" section in the High Availability (SSO) Deployment Guide:
https://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/8-1/HA_SSO_DG/High_Availability_DG.html#pgfId-83028
In Release 8.0 and later, the output of the show ap join stats summary command displays the status of each access point based on whether the access point joined the controller or was synchronized from the active controller. One of the following statuses is displayed:
-
Synched—The access point joined the controller before the SSO.
-
Connected—The access point joined the controller after the SSO.
-
Joined—The access point rejoined the controller, or a new AP has joined the controller after the SSO.
In Release 8.0 and later, the output of the show redundancy summary command displays the bulk synchronization status of access points and clients after the pair-up of active and standby controllers occurs. The values are:
-
Pending—Indicates that synchronization of access points and the corresponding client details from the active to the standby controller is yet to begin.
-
In-progress—Indicates that synchronization of access points and the corresponding client details from the active to the standby controller has begun and is in progress.
-
Complete—Indicates that synchronization is complete and the standby controller is ready for a switchover to resume the services of the active controller.
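The three bulk synchronization values can be treated as a simple readiness gate before a planned switchover. The sketch below is illustrative only: it assumes that the access point status and the client status have already been read from the show redundancy summary output as two separate strings, and the BulkSync and standby_ready names are hypothetical.

```python
from enum import Enum


class BulkSync(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "In-progress"
    COMPLETE = "Complete"


def standby_ready(ap_sync: str, client_sync: str) -> bool:
    """Return True only when both AP and client bulk sync report Complete.

    Raises ValueError if either string is not one of the three known values.
    """
    ap = BulkSync(ap_sync)
    client = BulkSync(client_sync)
    return ap is BulkSync.COMPLETE and client is BulkSync.COMPLETE
```

For example, standby_ready("Complete", "In-progress") returns False, so a planned switchover would be deferred until client synchronization finishes.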
In Release 8.0 and later, in a High Availability scenario, the sleeping timer is synchronized between the active and standby controllers.
ACL and NAT IP configurations are synchronized to the High Availability standby controller when these parameters are configured before High Availability pair-up. If the NAT IP is set on the management interface, the access point sets the AP manager IP address as the NAT IP address.
The following are some guidelines for high availability:
-
We recommend that you do not pair two controllers of different hardware models. If they are paired, the higher controller model becomes the active controller and the other controller goes into maintenance mode.
-
We recommend that you do not pair two controllers on different controller software releases. If they are paired, the controller with the lower redundancy management address becomes the active controller and the other controller goes into maintenance mode.
-
We recommend that you disable High Availability before you add licenses on Cisco 5520 and 8540 controllers (RTU based). However, disabling High Availability is not mandatory, because AP licenses added on the primary controller are inherited by the secondary controller.
-
All download file types, such as image, configuration, web-authentication bundle, and signature files are downloaded on the active controller first and then pushed to the standby-hot controller.
-
Certificates should be downloaded separately on each controller before they are paired.
-
You can upload file types such as configuration files, event logs, crash files, and so on, from the standby-hot controller using the GUI or CLI of the active controller. You can also specify a suffix to the filename to identify the uploaded file.
-
To perform a peer upload, use the service port. In a management network, you can also use the redundancy management interface (RMI) that is mapped to the redundancy port or RMI VLAN, or both, where the RMI is the same as the management VLAN. Note that the RMI and the redundancy port must be in two separate Layer 2 VLANs; this configuration is mandatory.
-
If the controllers cannot reach each other through the redundancy port and the RMI, the primary controller becomes active and the standby-hot controller goes into maintenance mode.
Note
When the RMIs of two paired controllers that are mapped to the same VLAN and connected to the same Layer 3 switch stop working, the standby controller is restarted.
The XML message "mobilityHaMac is out of range" is seen during the second active/standby switchover in a High Availability setup. This occurs if the mobility HA MAC field is more than 128.
-
When High Availability is enabled, the standby controller always uses the redundancy management interface (RMI), and all the other interfaces (dynamic and management) are invalid.
Note
The RMI is meant to be used only for active and standby communications and not for any other purpose.
-
You must ensure that the maximum transmission unit (MTU) on RMI port is 1500 bytes or higher before you enable high availability.
-
When High Availability is enabled, ensure that you do not use the backup image. If this image is used, the High Availability feature might not work as expected.
-
The service port and route information that is configured is lost after you enable SSO. You must configure the service port and route information again after you enable SSO. You can configure the service port and route information for the standby-hot controller using the peer-service-port and peer-route commands.
-
We recommend that you do not use the reset command on the standby-hot controller directly. If you use this, unsaved configurations will be lost.
-
We recommend that you enable link aggregation configuration on the controllers before you enable the port channel in the infrastructure switches.
-
All configurations that require a reboot of the active controller also result in a reboot of the standby-hot controller.
-
The Rogue AP Ignore list is not synchronized from the active controller to the standby-hot controller. The list is relearned through SNMP messages from Cisco Prime Infrastructure after the standby-hot controller becomes active.
-
Client SSO related guidelines:
-
The standby controller maintains two client lists: one is a list of clients in the Run state and the other is a list of transient clients in all the other states.
-
Only the clients that are in the Run state are maintained during failover. Clients that are in transition, such as roaming, 802.1X key regeneration, web authentication logout, and so on, are dissociated.
-
As with AP SSO, Client SSO is supported only on WLANs. The controllers must be in the same subnet. A Layer 3 connection is not supported.
-
In Release 7.3.x, AP SSO is supported but client SSO is not. As a result, after a switchover in a High Availability setup that uses Release 7.3.x, all the clients associated with the controller are deauthenticated and forced to reassociate.
-
When a peer controller runs a controller software release earlier than Release 7.2, you must manually configure the mobility MAC address on the newly active controller after a switchover.
-
To enable an access point to maintain controlled quality of service (QoS) for voice and video parameters, all the bandwidth-based or static call admission control (CAC) parameters are synchronized from active to standby when a switchover occurs.
-
When the standby controller is unable to connect to the default gateway using the redundancy port, it does not reboot; instead, it enters maintenance mode. Once the controller reconnects to the default gateway, the standby controller reboots and High Availability pairing with the active controller is initiated. However, the active controller still reboots before entering maintenance mode.
-
The following are supported from Release 8.0:
-
Static CAC synchronization—To maintain controlled Quality-of-Service (QoS) for voice and video parameters, all the bandwidth-based or static CAC parameters services are readily available for clients when a switchover occurs.
-
Internal DHCP server—To serve wireless clients of the controller, the internal DHCP server data is synchronized from the active controller to the standby controller. All the assigned IP addresses remain valid, and IP address assignment continues when the role changes from active to standby.
-
Enhanced debugging and serviceability—All the debugging and serviceability services are enhanced for users.
-
The physical connectivity or topology of the access points on the switch is not synchronized from the active to the standby controller. The standby controller learns the details only when the synchronization is complete. Hence, you must execute the show ap cdp neighbors all command only after synchronization is complete, and only after the standby controller has become the active controller.
-
To enable access points to join the HA SKU secondary controller that has been reset to factory defaults, you must:
-
Configure the HA SKU controller as secondary controller. To do this, you must run the config redundancy unit secondary command on the HA SKU controller.
-
Reboot the HA SKU controller after you successfully execute the config redundancy unit secondary command.
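The pairing guidelines in the list above for mismatched hardware models and mismatched software releases can be expressed as a small decision function. This is a sketch under stated assumptions, not the controller's actual negotiation logic: the Unit class, the numeric model_rank field (a larger number stands for a "higher" model), the function name, and the choice to check hardware before software are all hypothetical. The addresses come from a documentation range.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional, Tuple


@dataclass(frozen=True)
class Unit:
    name: str
    model_rank: int   # hypothetical ordering; larger means a higher model
    release: str      # controller software release, compared for equality only
    rmi_address: str  # redundancy management address


def mismatch_outcome(a: Unit, b: Unit) -> Optional[Tuple[str, str]]:
    """Return (active, maintenance) names for a mismatched pair, or None.

    Checking hardware before software is an assumption of this sketch.
    """
    if a.model_rank != b.model_rank:
        # Different hardware models: the higher model becomes active.
        active, other = (a, b) if a.model_rank > b.model_rank else (b, a)
        return active.name, other.name
    if a.release != b.release:
        # Different releases: the lower redundancy management address wins.
        a_lower = IPv4Address(a.rmi_address) < IPv4Address(b.rmi_address)
        active, other = (a, b) if a_lower else (b, a)
        return active.name, other.name
    return None  # matched pair: normal active/standby-hot negotiation applies


hw_case = mismatch_outcome(
    Unit("wlc-a", 2, "8.0", "192.0.2.20"),
    Unit("wlc-b", 1, "8.0", "192.0.2.10"),
)
sw_case = mismatch_outcome(
    Unit("wlc-c", 1, "8.0", "192.0.2.9"),
    Unit("wlc-d", 1, "8.1", "192.0.2.10"),
)
```

The addresses are compared as IPv4Address objects rather than as strings, so that 192.0.2.9 is correctly treated as lower than 192.0.2.10.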
Redundancy Management Interface
The active and standby-hot controllers use the RMI to check the health of the peer controller and the default gateway of the management interface through network infrastructure.
The RMI is also used to send notifications from the active controller to the standby-hot controller if a failure or manual reset occurs. The standby-hot controller uses the RMI to communicate with the syslog, NTP/SNTP, FTP, and TFTP servers.
It is mandatory to configure the IP addresses of the Redundancy Management Interface and the Management Interface in the same subnet on both the primary and secondary controllers.
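The same-subnet requirement can be verified offline before the addresses are configured. The following sketch uses the Python standard ipaddress module; the function name and the sample addresses (taken from documentation ranges) are illustrative assumptions and do not come from any controller.

```python
from ipaddress import ip_address, ip_interface


def rmi_in_management_subnet(mgmt_cidr: str, rmi_ip: str) -> bool:
    """Return True if rmi_ip falls inside the management interface subnet.

    mgmt_cidr is the management interface address with its prefix length,
    for example "192.0.2.10/24".
    """
    mgmt_network = ip_interface(mgmt_cidr).network
    return ip_address(rmi_ip) in mgmt_network


# Run the check once per controller, primary and secondary.
primary_ok = rmi_in_management_subnet("192.0.2.10/24", "192.0.2.11")
secondary_ok = rmi_in_management_subnet("192.0.2.10/24", "198.51.100.12")
```

Here the first check passes and the second fails, because 198.51.100.12 lies outside 192.0.2.0/24.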
Redundancy Port
The redundancy port is used for configuration, operational data synchronization, and role negotiation between the primary and secondary controllers.
The redundancy port checks for peer reachability by sending UDP keepalive messages every 100 milliseconds (default frequency) from the standby-hot controller to the active controller. If a failure of the active controller occurs, the redundancy port is used to notify the standby-hot controller.
If an NTP/SNTP server is not configured, the redundancy port performs a time synchronization from the active controller to the standby-hot controller.
The redundancy ports can connect over an L2 switch. Ensure that the redundancy port round-trip time is less than 80 milliseconds if the keepalive timer is set to the default of 100 milliseconds, or less than 80 percent of the keepalive timer if you have configured the keepalive timer in the range of 100 milliseconds to 400 milliseconds. The failure detection time is derived from the keepalive timer. For example, if the keepalive timer is set to 100 milliseconds: 3 * 100 = 300 milliseconds, plus 60 milliseconds, plus jitter (12 milliseconds), which gives approximately 400 milliseconds. Also, ensure that the bandwidth between the redundancy ports is 60 Mbps or higher and that the maximum transmission unit (MTU) is 1500 bytes or higher.
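The redundancy port requirements above lend themselves to a short pre-deployment check. The sketch below is illustrative: the function names are hypothetical, and applying the 60-millisecond and 12-millisecond terms unchanged to keepalive values other than 100 milliseconds is an assumption, because only the 100-millisecond example is given.

```python
def max_round_trip_ms(keepalive_ms: int = 100) -> float:
    """Round-trip time limit: 80 percent of the keepalive timer."""
    if not 100 <= keepalive_ms <= 400:
        raise ValueError("keepalive timer must be between 100 and 400 ms")
    return keepalive_ms * 80 / 100


def failure_detection_ms(keepalive_ms: int = 100) -> int:
    """3 keepalive intervals + 60 ms + 12 ms of jitter.

    The constants are taken from the 100 ms example; reusing them for
    other keepalive values is an assumption of this sketch.
    """
    return 3 * keepalive_ms + 60 + 12


def redundancy_link_ok(rtt_ms: float, bandwidth_mbps: float,
                       mtu_bytes: int, keepalive_ms: int = 100) -> bool:
    """Check round-trip time, bandwidth, and MTU against the stated limits."""
    return (
        rtt_ms < max_round_trip_ms(keepalive_ms)
        and bandwidth_mbps >= 60
        and mtu_bytes >= 1500
    )


default_detection = failure_detection_ms()  # 372, that is, roughly 400 ms
link_ok = redundancy_link_ok(rtt_ms=5, bandwidth_mbps=1000, mtu_bytes=1500)
```

With the default keepalive timer the computed detection time is 372 milliseconds, which is consistent with the approximate figure of 400 milliseconds.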