Troubleshoot EGTP Path Failures Due to Mismatch in Restart Counter Values


Updated: December 13, 2023

Document ID: 221238


Introduction

This document describes how to troubleshoot Evolved GPRS Tunneling Protocol (EGTP) path failures caused by a mismatch in restart counter values between the SGSN/MME and the GGSN/Serving Gateway or PDN Gateway (SPGW).

Troubleshooting Commands

show egtpc peers interface
show egtpc peers path-failure-history
show egtpc statistics path-failure-reasons
show egtp-service all
show egtpc sessions
show egtpc statistics

egtpc test echo gtp-version 2 src-address <source node IP address>  peer-address <remote node IP address>

For more details about these commands, refer to this link:

https://www.cisco.com/c/en/us/support/docs/wireless-mobility/gateway-gprs-support-node-ggsn/119246-technote-ggsn-00.html

Analysis

From the logs and statistics, the restart counter value is 11 at the Mobility Management Entity (MME) end and 12 at the EPG end.

You can observe the traps as mentioned here:

Internal trap notification 1112 (EGTPCPathFail) context s11mme, service s11-mme, interface type mme, self address <X.X.X.X>,  peer address <Y.Y.Y.Y>, peer old restart counter 4,  peer new restart counter 4,  peer session count  240, failure reason no-response-from-peer,  path failure detection Enabled.
Internal trap notification 1112 (EGTPCPathFail) context XGWin, service EGTP1, interface type pgw-ingress, self address <X.X.X.X>,  peer address <Y.Y.Y.Y>, peer old restart counter 54,  peer new restart counter 12,  peer session count  1107240, failure reason restart-counter-change
Internal trap notification 1112 (EGTPCPathFail) context XGWin, service EGTP1, interface type pgw-ingress, self address <X.X.X.X>,  peer address <Y.Y.Y.Y>, peer old restart counter 12,  peer new restart counter 54,  peer session count  1107207, failure reason create-sess-restart-counter-change
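
When many such traps are collected, it can help to pull the old and new restart counter values and the failure reason out of each line. The snippet here is a minimal sketch in Python that assumes the trap text has the same layout as the lines shown above; the function name parse_path_fail_trap and the field names are hypothetical and not part of StarOS.

```python
import re

# Pattern built only from the trap layout shown in this document (assumption:
# other releases may format the trap text differently).
TRAP_RE = re.compile(
    r"\(EGTPCPathFail\)\s+context\s+(?P<context>\S+),\s+"
    r"service\s+(?P<service>\S+),\s+"
    r"interface type\s+(?P<interface>\S+),.*?"
    r"peer old restart counter\s+(?P<old_rc>\d+),\s+"
    r"peer new restart counter\s+(?P<new_rc>\d+),.*?"
    r"failure reason\s+(?P<reason>[\w-]+)"
)


def parse_path_fail_trap(line):
    """Return the key fields of an EGTPCPathFail trap line, or None if no match."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["old_rc"] = int(fields["old_rc"])
    fields["new_rc"] = int(fields["new_rc"])
    # True when the peer advertised a different restart counter than the stored one.
    fields["rc_changed"] = fields["old_rc"] != fields["new_rc"]
    return fields


if __name__ == "__main__":
    sample = (
        "Internal trap notification 1112 (EGTPCPathFail) context XGWin, "
        "service EGTP1, interface type pgw-ingress, self address <X.X.X.X>,  "
        "peer address <Y.Y.Y.Y>, peer old restart counter 54,  "
        "peer new restart counter 12,  peer session count  1107240, "
        "failure reason restart-counter-change"
    )
    print(parse_path_fail_trap(sample))
```

A trap whose old and new values are equal (as in the first trap above, with reason no-response-from-peer) is reported with rc_changed set to False, which separates plain reachability failures from restart counter changes.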

The vendor gateway (GW) does not accept a lower restart counter value from the Serving GPRS Support Node (SGSN) when the restart counter changes. If the vendor GW has stored a higher (old) value and the Cisco SGSN sends a lower value after a node reload, the vendor GW does not accept it.

Note: As per TS 29.060:

1. If the SGSN is in contact with the Gateway GPRS Support Node (GGSN) for the first time, or has recently restarted and has not yet indicated the new Restart Counter value to the GGSN, it includes a Recovery information element in the Create Packet Data Protocol (PDP) Context Request. The GGSN that receives a Recovery information element in the Create PDP Context Request handles it in the same way as when it receives an Echo Response message. The Create PDP Context Request message is considered a valid activation request for the PDP context included in the message.

2. The GGSN includes the Recovery information element in the Create PDP Context Response if the GGSN is in contact with the SGSN for the first time, or if the GGSN has restarted recently and the new Restart Counter value has not yet been indicated to the SGSN. The SGSN that receives the Recovery information element handles it in the same way as when it receives an Echo Response message. However, it considers the PDP context being created as active if the response indicates successful context activation at the GGSN.

3. The GTP interface uses a restart counter in order to track the number of restarts. As per TS 23.060, GTP nodes must use persistent storage in order to keep track of their local GTP restart counters, so these restart counters are expected to only ever increase. However, if a peer node detects a decrease in the restart counter, the GTP node behavior is described in section '18 GTP-C based restart procedures' of TS 23.007. If the restart counter value previously stored for a peer is larger than the value received in the Echo Response message or the GTP-C message, taking the integer rollover into account, this indicates a possible race condition (a newer message arrived before an older one). The received Restart Counter value is discarded and an error can be logged. In other words, when the GTP node detects a lower restart counter from a peer, it never records that new restart counter.
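
The comparison in item 3 can be sketched as follows. The Recovery information element carries the restart counter in a single octet, so values wrap from 255 back to 0. The exact rule for "taking the integer rollover into account" is not quoted in this document, so the forward window of 127 used here is an illustrative assumption, and the function name classify_restart_counter is hypothetical.

```python
RC_MODULUS = 256  # the restart counter occupies one octet, so it wraps at 256


def classify_restart_counter(stored, received, forward_window=127):
    """Classify a received peer restart counter against the stored one.

    forward_window is an assumed limit (not taken from the specification text
    quoted above): a modular step of 1..forward_window is treated as a genuine
    increase, anything beyond it as a decrease.
    """
    delta = (received - stored) % RC_MODULUS
    if delta == 0:
        return "unchanged"
    if delta <= forward_window:
        return "peer-restarted"  # counter moved forward, possibly across 255 -> 0
    return "discard"  # counter appears lower: keep the stored value, log an error


if __name__ == "__main__":
    # Values taken from the Analysis section of this document.
    print(classify_restart_counter(12, 11))   # lower value received
    print(classify_restart_counter(54, 12))   # lower value received
    print(classify_restart_counter(255, 0))   # rollover, treated as an increase
```

Under this assumption, a peer that stored 12 and then receives 11 classifies the new value as "discard", which matches the behavior described for the vendor GW.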

StarOS Perspective

From the StarOS end, you can explicitly change the restart counter (RC) value through the file /flash/restart_file_cntr.txt. This is done at the time of the upgrade.

Consistent with the behavior described in the Analysis section, the MME RC value in the current configuration was lower than the RC value stored at the vendor GW. In order to address the issue, the RC value at the vendor GW node was modified.

After the RC value was changed, the EGTPC path failures stopped, but the session count did not increase and the EGTPC links still showed as inactive.

These are the commands that were used during troubleshooting:

show sgtp-service all | grep "restart"  ----------------- to check RC value

[local]Nodename# show egtp-service all | more
  Service name                          : egtpc_sv_service
  Service-Id                            : 5
  Context                               : SGs
  Interface Type                        : mme
  Status                                : STARTED
  Restart Counter                       : 11   ----------------- RC value to verify
  Max Remote Restart Counter Change     : 255
  Message Validation Mode               : Standard
  GTPU-Context                          :
  GTPC Retransmission Timeout           : 5000 (milliseconds)
  GTPC Maximum Request Retransmissions  : 4
  GTPC IP QOS DSCP value                : 10
  GTPC Echo                             : Enabled
  GTPC Echo Mode                        : Default
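
If the output above is captured to a text file for several nodes, the Restart Counter of each service can be extracted with a short script. This is a minimal sketch in Python that assumes the "key : value" layout shown above; the function name restart_counters is hypothetical.

```python
import re


def restart_counters(show_output):
    """Map each service name to its Restart Counter from 'show egtp-service all' text."""
    counters = {}
    service = None
    for line in show_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # prompt lines and blank lines carry no "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "Service name":
            service = value
        elif key == "Restart Counter" and service is not None:
            # Take only the leading digits so trailing annotations are ignored.
            number = re.match(r"\d+", value)
            if number:
                counters[service] = int(number.group())
    return counters


if __name__ == "__main__":
    sample = """\
  Service name                          : egtpc_sv_service
  Service-Id                            : 5
  Restart Counter                       : 11
  Max Remote Restart Counter Change     : 255
"""
    print(restart_counters(sample))
```

The exact key match means the "Max Remote Restart Counter Change" line is not mistaken for the local Restart Counter.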


[local]Nodename# show egtpc peers           ------------ To check link status
Sunday February 05 15:31:00 IST 2023
+----Status:                 (I) - Inactive  (A) - Active                        
|                                                                                
|+---GTPC Echo:              (D) - Disabled  (E) - Enabled                       
||                                                                               
||+--Restart Counter Sent:   (S) - Sent      (N) - Not Sent                      
|||                                                                              
|||+-Peer Restart Counter:   (K) - Known     (U) - Unknown                       
||||                                                                             
||||+-Type of Node:          (S) - SGW       (P) - PGW                           
|||||                        (M) - MME       (G) - SGSN                          
|||||                        (L) - LGW       (E) - ePDG                          
|||||                        (C) - CGW       (B) - MBMS                          
|||||                        (U) - Unknown                                       
|||||                                                                            
|||||  Service                       Restart--------+  No. of                    
|||||  ID                            Counter        |  restarts                  
|||||   |                                           |   |   Current       Max    
vvvvv   v              Peer Address                 v   v   sessions    sessions  LCI OCI
-----  --- --------------------------------------- --- --- ----------- ------------------
IDSKS  10                           X.X.X.X         91   0           0           0   X   X
IDNKS  11                           Y.Y.Y.Y          4  95           0       34005   X   X
IDNKS  11                           Z.Z.Z.Z         10 103           0       16805   X   X
IDNKS  11                           A.A.A.A        104  95           0        7250   X   X
AESKS  11                           B.B.B.B          0   0        4004       47649   X   X
AESKS  11                           C.C.C.C          0   0        4053       46571   X   X
AESKS  11                           D.D.D.D          0   0        4026       46734   X   X

In the output above, the first four peers show no current sessions and their links are inactive (status I).

Further, check the echo request/response (to be checked in hidden mode):
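
To pick out such peers from a saved copy of the show egtpc peers output, the data rows can be parsed as shown here. This is a minimal sketch in Python that assumes each data row has the nine whitespace-separated columns seen above (flags, service ID, peer address, restart counter, number of restarts, current sessions, max sessions, LCI, OCI); the names Peer, parse_peers, and inactive_without_sessions are hypothetical.

```python
import re
from typing import NamedTuple


class Peer(NamedTuple):
    flags: str
    service_id: int
    address: str
    restart_counter: int
    restarts: int
    current_sessions: int
    max_sessions: int


# Five flag letters taken from the legend: status, echo, RC sent, peer RC, node type.
FLAGS_RE = re.compile(r"[IA][DE][SN][KU][SPMGLECBU]")


def parse_peers(output):
    """Return a Peer for every data row; legend and header lines are skipped."""
    peers = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 9 or not FLAGS_RE.fullmatch(parts[0]):
            continue
        numeric = (parts[1], parts[3], parts[4], parts[5], parts[6])
        if not all(p.isdigit() for p in numeric):
            continue
        peers.append(Peer(parts[0], int(parts[1]), parts[2],
                          int(parts[3]), int(parts[4]), int(parts[5]), int(parts[6])))
    return peers


def inactive_without_sessions(peers):
    """Peers whose status flag is I (Inactive) and that hold no current sessions."""
    return [p for p in peers if p.flags[0] == "I" and p.current_sessions == 0]


if __name__ == "__main__":
    sample = """\
IDNKS  11                           Y.Y.Y.Y          4  95           0       34005   X   X
AESKS  11                           B.B.B.B          0   0        4004       47649   X   X
"""
    for peer in inactive_without_sessions(parse_peers(sample)):
        print(peer.address, peer.restart_counter)
```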

egtpc test echo gtp-version 2 src-address <MME end IP> peer-address <EPG end IP>

This is the output after the Restart Counter value is corrected and configured to match that of the MME on the S11 interface for the affected EGTP peer. The Echo request/response succeeds, but the link is still inactive.

[s11mme]Nodename# egtpc test echo gtp-version 2 src-address <X.X.X.X> peer-address <Y.Y.Y.Y>

Sunday February 05 16:22:42 IST 2023

EGTPC test echo

---------------

Peer: X.X.X.X                          Tx/Rx:  1/1  RTT(ms): 1    (COMPLETE) Recovery: 10 (0x0A)

However, the same change does not work as expected on the other affected GWs. The echo request/response still fails, as shown here.

[s11mme]Nodename# egtpc test echo gtp-version 2 src-address <X.X.X.X> peer-address <Y.Y.Y.Y>




Sunday February 05 16:46:11 IST 2023

EGTPC test echo

---------------

Peer: X.X.X.X                         Tx/Rx:  1/0  RTT(ms): 0    (FAILURE)
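
When the echo test is repeated against several peers, the result lines can be summarized with a small parser. This is a minimal sketch in Python that assumes the result line has the same layout as the two outputs shown above; the function name parse_echo_result is hypothetical.

```python
import re

# Pattern built from the two result lines shown in this document (assumption:
# the Recovery field is present only when a response was received).
ECHO_RE = re.compile(
    r"Peer:\s*(?P<peer>\S+)\s+Tx/Rx:\s*(?P<tx>\d+)/(?P<rx>\d+)\s+"
    r"RTT\(ms\):\s*(?P<rtt>\d+)\s+\((?P<status>\w+)\)"
    r"(?:\s+Recovery:\s*(?P<recovery>\d+))?"
)


def parse_echo_result(line):
    """Return peer, tx, rx, status, and recovery value from an echo result line."""
    match = ECHO_RE.search(line)
    if match is None:
        return None
    recovery = match.group("recovery")
    return {
        "peer": match.group("peer"),
        "tx": int(match.group("tx")),
        "rx": int(match.group("rx")),
        "status": match.group("status"),
        # The peer restart counter reported in the Echo Response, if any.
        "recovery": int(recovery) if recovery is not None else None,
    }


if __name__ == "__main__":
    ok = "Peer: X.X.X.X   Tx/Rx:  1/1  RTT(ms): 1    (COMPLETE) Recovery: 10 (0x0A)"
    bad = "Peer: X.X.X.X   Tx/Rx:  1/0  RTT(ms): 0    (FAILURE)"
    print(parse_echo_result(ok))
    print(parse_echo_result(bad))
```

The recovery value from a successful echo can then be compared with the restart counter that show egtpc peers lists for the same peer.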

Workaround

1. In order to fix this issue, take note of the current restart counter in /flash/restart_file_cntr.txt before the VNF deactivation. Later, when the VNF is activated with the new software, log in to the CF and update the file /flash/restart_file_cntr.txt with the old restart counter. Then, as in a normal upgrade procedure, reload the VNF with the day-N configuration.

2. Modify the file /flash/restart_file_cntr.txt to the required value and reload the node with the current configuration.

Note: As an initial step, you can also try an SGTPC restart once.

Revision History

Revision	Publish Date	Comments
1.0	13-Dec-2023	Initial Release

Contributed by Cisco Engineers

Bharati Choudary
Cisco TAC Engineer
Krishna Kishore DV
Cisco TAC Engineer
Naveen Sampath
Cisco TAC Engineer


This Document Applies to These Products

ASR 5000 Series
