Information About DMVPN-Tunnel Health Monitoring and Recovery Backup NHS
NHS States
An NHS passes through different states while associating with a spoke to form a spoke-to-hub tunnel. The table below describes the NHS states.
| State | Description |
|---|---|
| DOWN | NHS is waiting to get scheduled. |
| PROBE | NHS is declared as “DOWN” but it is still actively probed by the spoke to bring it “UP”. |
| UP | NHS is associated with a spoke to establish a tunnel. |
NHS Priorities
NHS priority is a numerical value assigned to a hub that controls the order in which spokes select hubs to establish a spoke-to-hub tunnel. The priority value ranges from 0 to 255, where 0 is the highest and 255 is the lowest priority.
You can assign hub priorities in the following ways:
- Unique priorities to all NHSs.
- Same priority level to a group of NHSs.
- Unspecified priority (value 0) for an NHS, a group of NHSs, or all NHSs.
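The ordering rule above can be illustrated with a minimal Python sketch. The `Nhs` type, the hub names, and the tie-breaking rule (NHSs with equal priority keep their configured order) are assumptions made for illustration; they are not taken from the device implementation.

```python
from typing import NamedTuple


class Nhs(NamedTuple):
    name: str          # hypothetical hub name
    priority: int = 0  # 0 means "unspecified", which is also the highest priority


def selection_order(servers):
    """Return NHS names in the order a spoke would try them.

    A lower number means a higher priority. sorted() is stable, so NHSs
    that share a priority keep their configured order (an assumption).
    """
    return [s.name for s in sorted(servers, key=lambda s: s.priority)]


hubs = [Nhs("hub-c", 2), Nhs("hub-a", 1), Nhs("hub-d", 2), Nhs("hub-b")]
print(selection_order(hubs))  # ['hub-b', 'hub-a', 'hub-c', 'hub-d']
```

Here `hub-b` has no explicit priority, so it defaults to 0 and is tried first.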
NHS Clusterless Model
NHS clusterless model is a model where you assign the priority values to the NHSs and do not place the NHSs into any group. NHS clusterless model groups all NHSs to a default group and maintains redundant connections based on the maximum NHS connections configured. Maximum NHS connections is the number of NHS connections in a cluster that must be active at any point in time. The valid range for maximum NHS connections is from 0 to 255.
Priority values are assigned to the hubs to control the order in which the spokes select hubs to establish the spoke-to-hub tunnel. However, assigning these priorities in a clusterless model has certain limitations.
The table below provides an example of limitations for assigning priorities in a clusterless model.
Maximum Number of Connections = 3

| NHS | NHS Priority | Scenario 1 | Scenario 2 |
|---|---|---|---|
| NHS A1 | 1 | UP | UP |
| NHS B1 | 1 | UP | PROBE |
| NHS C1 | 1 | UP | UP |
| NHS A2 | 2 | DOWN | UP |
| NHS B2 | 2 | DOWN | DOWN |
| NHS C2 | 2 | DOWN | DOWN |
Consider a scenario with three data centers A, B, and C. Each data center consists of two NHSs: NHS A1 and NHS A2 comprise one data center, NHS B1 and NHS B2 another, and NHS C1 and NHS C2 another.
Although two NHSs are available for each data center, the spoke is connected to only one NHS of each data center at any point in time. Hence, the maximum connection value is set to 3. That is, three spoke-to-hub tunnels are established. If any one NHS, for example, NHS B1, becomes inactive, the spoke-to-hub tunnel associated with NHS B1 goes down. Based on the priority model, NHS A2 has the next priority value and is the next available NHS in the queue, so it forms the spoke-to-hub tunnel and goes up. However, this does not meet the requirement that a hub from data center B be associated with the spoke to form a tunnel. Hence, no connection is made to data center B.
This problem can be addressed by placing NHSs into different groups. Each group can be configured with a group-specific maximum connection value. NHSs that are not assigned to any group belong to the default group.
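The clusterless limitation described above can be reproduced with a short Python sketch. It assumes a simplified selection rule (take the reachable NHSs in priority order, up to the maximum connection value); the function name and the `(name, priority)` tuples are illustrative, not part of any device API.

```python
def pick_active(servers, reachable, max_connections):
    """Pick up to max_connections reachable NHSs, lowest priority number first.

    servers is a list of (name, priority) pairs; reachable is a set of names.
    """
    ranked = sorted(servers, key=lambda s: s[1])  # stable: ties keep list order
    return [name for name, _ in ranked if name in reachable][:max_connections]


servers = [("A1", 1), ("B1", 1), ("C1", 1), ("A2", 2), ("B2", 2), ("C2", 2)]
all_up = {name for name, _ in servers}

print(pick_active(servers, all_up, 3))           # ['A1', 'B1', 'C1']
print(pick_active(servers, all_up - {"B1"}, 3))  # ['A1', 'C1', 'A2']
```

In the second call, NHS A2 fills the free slot, so data center B is left without a tunnel even though NHS B2 is reachable.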
NHS Clusters
The table below presents an example of cluster functionality. NHSs corresponding to different data centers are grouped to form clusters. NHS A1 and NHS A2 with priority 1 and 2, respectively, are grouped as cluster1; NHS B1 and NHS B2 with priority 1 and 2, respectively, are grouped as cluster2; and NHS C1 and NHS C2 with priority 1 and 2, respectively, are grouped as cluster3. NHS 7, NHS 8, and NHS 9 are part of the default cluster. The maximum number of connections is set to 1 for cluster1, cluster2, and cluster3 and to 2 for the default cluster, so that at least one spoke-to-hub tunnel is continuously established with each of the four clusters.
In scenario 1, NHS A1, NHS B1, and NHS C1 with the highest priority in each cluster are in the UP state. In scenario 2, the connection between the spoke and NHS A1 breaks, and a connection is established between the spoke and NHS A2 (hub from the same cluster). NHS A1 with the highest priority attains the PROBE state. In this way, at any point in time a connection is established to all the three data centers.
| NHS | NHS Priority | Cluster | Maximum Number of Connections | Scenario 1 | Scenario 2 |
|---|---|---|---|---|---|
| NHS A1 | 1 | 1 | 1 | UP | PROBE |
| NHS A2 | 2 | 1 | 1 | DOWN | UP |
| NHS B1 | 1 | 2 | 1 | UP | UP |
| NHS B2 | 2 | 2 | 1 | DOWN | DOWN |
| NHS C1 | 1 | 3 | 1 | UP | UP |
| NHS C2 | 2 | 3 | 1 | DOWN | DOWN |
| NHS 7 | 1 | Default | 2 | UP | DOWN |
| NHS 8 | 2 | Default | 2 | UP | UP |
| NHS 9 | 0 | Default | 2 | PROBE | UP |
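As a rough illustration of the cluster behavior in this table, the following Python sketch applies the maximum connection value per cluster instead of globally. It covers only the three data-center clusters, and the function name, the tuple layout, and the cluster labels are assumptions made for the example.

```python
from collections import defaultdict


def pick_per_cluster(servers, reachable, max_connections):
    """Select reachable NHSs per cluster, lowest priority number first.

    servers: list of (name, priority, cluster) tuples
    max_connections: dict mapping cluster -> maximum number of connections
    """
    by_cluster = defaultdict(list)
    for name, _priority, cluster in sorted(servers, key=lambda s: s[1]):
        if name in reachable:
            by_cluster[cluster].append(name)
    # Clusters with no reachable NHS map to an empty list.
    return {c: by_cluster[c][:limit] for c, limit in max_connections.items()}


servers = [
    ("A1", 1, "cluster1"), ("A2", 2, "cluster1"),
    ("B1", 1, "cluster2"), ("B2", 2, "cluster2"),
    ("C1", 1, "cluster3"), ("C2", 2, "cluster3"),
]
limits = {"cluster1": 1, "cluster2": 1, "cluster3": 1}
everything = {name for name, _, _ in servers}

# Scenario 2: NHS A1 is unreachable, so NHS A2 of the same cluster takes over.
print(pick_per_cluster(servers, everything - {"A1"}, limits))
# {'cluster1': ['A2'], 'cluster2': ['B1'], 'cluster3': ['C1']}
```

Because the limit is applied inside each cluster, a failure in one data center is always backed up by a hub from the same data center.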
NHS Fallback Time
Fallback time is the time that the spoke waits for the NHS to become active before detaching itself from an NHS with a lower priority and connecting to the NHS with the highest priority to form a spoke-to-hub tunnel. Fallback time helps in avoiding excessive flaps.
The table below shows how the spoke flaps excessively from one NHS to another when the fallback time is not configured on the spoke. Five NHSs with different priorities are available to connect to the spoke to form a spoke-to-hub tunnel. All these NHSs belong to the default cluster. The maximum number of connections is one.
| NHS | NHS Priority | Cluster | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
|---|---|---|---|---|---|---|---|
| NHS 1 | 1 | Default | PROBE | PROBE | PROBE | PROBE | UP |
| NHS 2 | 2 | Default | PROBE | PROBE | PROBE | UP | DOWN |
| NHS 3 | 3 | Default | PROBE | PROBE | UP | DOWN | DOWN |
| NHS 4 | 4 | Default | PROBE | UP | DOWN | DOWN | DOWN |
| NHS 5 | 5 | Default | UP | DOWN | DOWN | DOWN | DOWN |
In scenario 1, NHS 5, which has the lowest priority, is connected to the spoke to form a tunnel. All the other NHSs, which have higher priorities than NHS 5, are in the PROBE state.
In scenario 2, when NHS 4 becomes active, the spoke breaks the existing tunnel and establishes a new connection with NHS 4. In scenario 3 and scenario 4, the spoke breaks the existing connection as soon as an NHS with a higher priority becomes active and establishes a new tunnel. In scenario 5, as the NHS with the highest priority (NHS 1) becomes active, the spoke connects to it to form a tunnel and continues with it until the NHS becomes inactive. Because NHS 1 has the highest priority, no other NHS is in the PROBE state.
The table below shows how to avoid the excessive flapping by configuring the fallback time. The maximum number of connections is one. A fallback time of 30 seconds is configured on the spoke. In scenario 2, when an NHS with a higher priority than the NHS associated with the spoke becomes active, the spoke does not break the existing tunnel connection until the fallback time elapses. Hence, although NHS 4 becomes active, it does not form a tunnel and does not attain the UP state. NHS 4 remains active (shown as UP-hold in the table) but does not form a tunnel until the fallback time elapses. Once the fallback time elapses, the spoke connects to the NHS that has the highest priority among the active NHSs.
This way, the flaps that occur as soon as an NHS of higher priority becomes active are avoided.
| NHS | NHS Priority | Cluster | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
|---|---|---|---|---|---|---|---|
| NHS 1 | 1 | Default | PROBE | PROBE | PROBE | UP-hold | UP |
| NHS 2 | 2 | Default | PROBE | PROBE | UP-hold | UP-hold | DOWN |
| NHS 3 | 3 | Default | PROBE | UP-hold | UP-hold | UP-hold | DOWN |
| NHS 4 | 4 | Default | UP-hold | UP-hold | UP-hold | UP-hold | DOWN |
| NHS 5 | 5 | Default | UP | UP | UP | UP | DOWN |
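The effect of the fallback time on the number of tunnel switches can be sketched in Python. This is a simplified model under stated assumptions: the recovery times are invented for illustration, and a single fallback timer is assumed to start when the first higher-priority NHS becomes active and not to restart on later recoveries. The actual timer behavior on a device may differ.

```python
def simulate(recoveries, current, fallback):
    """Return the list of (time, nhs_name) tunnel switches made by the spoke.

    recoveries: list of (time, name, priority) for NHSs that become active
    current:    (name, priority) of the NHS in use at the start
    fallback:   seconds to wait after the first recovery before switching
    """
    switches = []
    pending = []     # recovered NHSs that outrank the current one (UP-hold)
    deadline = None  # expiry time of the running fallback timer, if any

    def flush(now):
        # Move the tunnel to the best NHS that recovered during the hold.
        nonlocal current, pending, deadline
        best = min(pending, key=lambda n: n[1])
        current, pending, deadline = best, [], None
        switches.append((now, best[0]))

    for time, name, priority in sorted(recoveries):
        if deadline is not None and time >= deadline:
            flush(deadline)
        if priority < current[1]:
            pending.append((name, priority))
            if deadline is None:
                deadline = time + fallback
            if time >= deadline:  # only true when fallback is 0
                flush(time)
    if pending:
        flush(deadline)
    return switches


# Hypothetical recovery times, in seconds.
recoveries = [(5, "NHS 4", 4), (10, "NHS 3", 3), (15, "NHS 2", 2), (20, "NHS 1", 1)]

print(simulate(recoveries, ("NHS 5", 5), fallback=0))
# [(5, 'NHS 4'), (10, 'NHS 3'), (15, 'NHS 2'), (20, 'NHS 1')]
print(simulate(recoveries, ("NHS 5", 5), fallback=30))
# [(35, 'NHS 1')]
```

Without a fallback time the spoke rebuilds its tunnel four times; with a 30-second fallback time it switches once, directly to NHS 1.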
NHS Recovery Process
NHS recovery is a process of establishing an alternative spoke-to-hub tunnel when the existing tunnel becomes inactive, and connecting to the preferred hub upon recovery.
The following sections explain NHS recovery:
Alternative Spoke to Hub NHS Tunnel
When a spoke-to-hub tunnel fails, it must be backed up with a new spoke-to-hub tunnel. The new NHS is picked from the same cluster to which the failed hub belonged. This ensures that the required number of spoke-to-hub tunnels is always present even when one or more tunnel paths are unavailable.
The table below presents an example of NHS backup functionality.
| NHS | NHS Priority | Cluster | Maximum Number of Connections | Scenario 1 | Scenario 2 | Scenario 3 |
|---|---|---|---|---|---|---|
| NHS A1 | 1 | 1 | 1 | UP | PROBE | PROBE |
| NHS A2 | 2 | 1 | 1 | DOWN | UP | DOWN |
| NHS A3 | 2 | 1 | 1 | DOWN | DOWN | UP |
| NHS A4 | 2 | 1 | 1 | DOWN | DOWN | DOWN |
| NHS B1 | 1 | 3 | 1 | UP | PROBE | PROBE |
| NHS B2 | 2 | 3 | 1 | DOWN | UP | DOWN |
| NHS B3 | 2 | 3 | 1 | DOWN | DOWN | UP |
| NHS B4 | 2 | 3 | 1 | DOWN | DOWN | DOWN |
| NHS 9 | Default | Default | 1 | UP | UP | DOWN |
| NHS 10 | Default | Default | 1 | DOWN | DOWN | UP |
Four NHSs in each of cluster 1 and cluster 3 and two NHSs in the default cluster are available for setting up spoke-to-hub tunnels. All NHSs have different priorities. The maximum number of connections is set to 1 for all three clusters. That is, at any point in time, at least one NHS from each cluster must be connected to the spoke to form a tunnel.
In scenario 1, NHS A1 from cluster 1, NHS B1 from cluster 3, and NHS 9 from the default cluster are UP. They establish a contact with the spoke to form different spoke-to-hub tunnels. In scenario 2, NHS A1 and NHS B1 with the highest priority in their respective clusters become inactive. Hence a tunnel is established from the spoke to NHS A2 and NHS B2, which have the next highest priority values. However, the spoke continues to probe NHS A1 and NHS B1 because they have the highest priority. Hence, NHS A1 and NHS B1 remain in the PROBE state.
In scenario 3, NHS A2, NHS B2, and NHS 9 become inactive. The spoke checks whether the NHSs in the PROBE state have become active. If so, the spoke establishes a connection to the NHS that has become active. However, as shown in scenario 3, because none of the NHSs in the PROBE state is active, the spoke connects to NHS A3 of cluster 1 and NHS B3 of cluster 3. NHS A1 and NHS B1 continue to be in the PROBE state until they associate themselves with the spoke to form a tunnel and attain the UP state.
Returning to Preferred NHS Tunnel upon Recovery
When a spoke-to-hub tunnel fails, a backup tunnel is established using an NHS having the next higher priority value. Even though the tunnel is established with an NHS of lower priority, the spoke continuously probes the NHS having the highest priority value. Once the NHS having the highest priority value becomes active, the spoke establishes a tunnel with the NHS and hence the NHS attains the UP state.
The table below presents NHS recovery functionality. Four NHSs in each of cluster 1 and cluster 3 and two NHSs in the default cluster are available for setting up spoke-to-hub tunnels. All NHSs have different priorities. The maximum connection value is set to 1. In scenario 1, NHS A4, NHS B4, and NHS 10, which have the lowest priority in their respective clusters, associate with the spoke to establish a tunnel. The spoke continues to probe NHSs of higher priority to establish a connection with the NHS that has the highest priority value. Hence, in scenario 1, the NHSs that have the highest priority value in their respective clusters are in the PROBE state. In scenario 2, NHS A1 becomes active, forms a tunnel with the spoke, and attains the UP state. Because NHS A1 has the highest priority, the spoke does not probe any other NHS in the cluster. Hence, all the other NHSs in cluster 1 are in the DOWN state.
When the connection with NHS B4 breaks, the spoke connects to NHS B3, which has the next higher priority value, because NHS B1 of cluster 3 is not active. In scenario 3, NHS A1 continues to be in the UP state and NHS B1, which has the highest priority in cluster 3, becomes active, forms a tunnel, and attains the UP state. Hence, no other NHSs in cluster 3 are in the PROBE state. However, because NHS 10, which has the lowest priority value in the default cluster, is in the UP state, the spoke continues to probe NHS 9, which has the highest priority in the cluster.
In scenario 4, NHS A1 and NHS B1 continue to be in the UP state and NHS 9 having the highest priority in the default cluster attains the UP state. Hence, because the spoke is associated with the NHSs having the highest priority in all the clusters, none of the NHSs are in the PROBE state.
| NHS | NHS Priority | Cluster | Maximum Number of Connections | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 |
|---|---|---|---|---|---|---|---|
| NHS A1 | 1 | 1 | 1 | PROBE | UP | UP | UP |
| NHS A2 | 2 | 1 | 1 | DOWN | DOWN | DOWN | DOWN |
| NHS A3 | 2 | 1 | 1 | DOWN | DOWN | DOWN | DOWN |
| NHS A4 | 2 | 1 | 1 | UP | DOWN | DOWN | DOWN |
| NHS B1 | 1 | 3 | 1 | PROBE | PROBE | UP | UP |
| NHS B2 | 10 | 3 | 1 | PROBE | DOWN | DOWN | DOWN |
| NHS B3 | 10 | 3 | 1 | PROBE | UP | DOWN | DOWN |
| NHS B4 | 30 | 3 | 1 | UP | DOWN | DOWN | DOWN |
| NHS 9 | Default | Default | 1 | PROBE | PROBE | PROBE | UP |
| NHS 10 | 100 | Default | 1 | UP | UP | UP | DOWN |
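The state assignments in the recovery table follow a pattern that can be approximated in Python. This sketch assumes a cluster with a maximum connection value of 1 and a simplified rule inferred from the tables: the best reachable NHS is UP, NHSs with a strictly higher priority than the UP one are in PROBE, and the rest are DOWN. The function name, the handling of a cluster with no reachable NHS, and the reachability sets are assumptions for illustration.

```python
def cluster_states(servers, reachable):
    """Return a dict mapping each NHS name to UP, PROBE, or DOWN.

    servers:   dict of name -> priority for one cluster (max connections = 1)
    reachable: set of NHS names that currently respond to the spoke
    """
    candidates = [n for n in sorted(servers, key=servers.get) if n in reachable]
    if not candidates:
        # Assumption: with nothing reachable, the spoke probes every NHS.
        return dict.fromkeys(servers, "PROBE")
    active = candidates[0]
    states = {}
    for name, priority in servers.items():
        if name == active:
            states[name] = "UP"
        elif priority < servers[active]:
            states[name] = "PROBE"  # preferred over the active NHS, keep probing
        else:
            states[name] = "DOWN"
    return states


cluster3 = {"B1": 1, "B2": 10, "B3": 10, "B4": 30}

print(cluster_states(cluster3, {"B4"}))        # scenario 1
# {'B1': 'PROBE', 'B2': 'PROBE', 'B3': 'PROBE', 'B4': 'UP'}
print(cluster_states(cluster3, {"B3"}))        # scenario 2
# {'B1': 'PROBE', 'B2': 'DOWN', 'B3': 'UP', 'B4': 'DOWN'}
print(cluster_states(cluster3, {"B1", "B3"}))  # scenario 3
# {'B1': 'UP', 'B2': 'DOWN', 'B3': 'DOWN', 'B4': 'DOWN'}
```

Under these assumptions the three calls reproduce the cluster 3 rows for scenarios 1 to 3: once NHS B1 is reachable, it becomes UP and nothing else in the cluster is probed.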