Cisco SDWAN bug: Identify vEdge Certificate Expired

You may hit a bug in which a vEdge with an expired certificate loses its control plane connections, which eventually affects data plane connections and results in loss of service. The affected platforms are vEdge 1000, vEdge 2000, and vEdge 100M/B.

Fig 1.1- Cisco Viptela SDWAN

In our case, the affected vEdge and controller versions were:
vEdge: 20.4.2
Controllers: 20.6.4

Fixed Versions
vEdge: 20.6.4.1
Controllers: 20.6.4.1

You can check the affected versions and the fixed versions in the Cisco article below:
Identify vEdge Certificate Expired on May 9th 2023 - Cisco

How will you know the cert is expired?

Run the command below to check the status of the certificate:

vEdge_NDNA1# show control local-properties

personality                       vedge
sp-organization-name      NDNA-1122
organization-name           NDNA-1122
root-ca-chain-status         Installed
certificate-status               Installed
certificate-validity           Not Valid -  certificate has expired    
certificate-not-valid-before  May 10 05:11:21 2013 GMT
certificate-not-valid-after    Jan 04 03:25:07 2038 GMT
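
If you collect this output from many routers, a small script can flag the expired ones. The following is a minimal sketch, assuming you already have the text of "show control local-properties" saved as a string; the function names are hypothetical and it only looks at the certificate-validity line shown above.

```python
def parse_local_properties(output: str) -> dict:
    """Turn 'key   value' lines into a dict; prompt lines and blanks are skipped."""
    props = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        # A line such as "vEdge_NDNA1# show control ..." is the CLI prompt, not a property.
        if len(parts) == 2 and "#" not in parts[0]:
            props[parts[0]] = parts[1].strip()
    return props


def certificate_expired(output: str) -> bool:
    """Return True when the certificate-validity field reports an expired certificate."""
    validity = parse_local_properties(output).get("certificate-validity", "")
    return "expired" in validity.lower()


sample = """\
vEdge_NDNA1# show control local-properties
personality                       vedge
certificate-status               Installed
certificate-validity           Not Valid -  certificate has expired
"""

print(certificate_expired(sample))
```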

What will be the impact of this?
The impacts are listed below. Make sure the device is not rebooted under any circumstances.

  • Loss of connections to vSmart
  • Loss of connections to vManage
  • Port-Hop 
  • Control policy changes such as topology changes in the network
  • Clear control connection
  • Interface Flaps
  • Device Reload

How to resolve this issue?

You need to upgrade the controllers to the fixed version from Cisco. In our case the vManage/vSmart/vBond controllers were on 20.6.4 and the fix is 20.6.4.1, so we first upgrade the controllers and then upgrade the vEdge to the same fixed release, 20.6.4.1.
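
To check an inventory against the fixed release, a version comparison along these lines may help. This is only a sketch: the helper names are hypothetical, and 20.6.4.1 is the fixed release for our case. Other release trains have their own fixed versions in the Cisco article, so a plain comparison against 20.6.4.1 is not meaningful for them.

```python
def parse_version(text: str) -> tuple:
    """Convert a dotted version such as '20.6.4.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(running: str, fixed: str = "20.6.4.1") -> bool:
    """True when the running version sorts below the fixed version.

    Assumption: both versions belong to the same release train as in this post.
    """
    return parse_version(running) < parse_version(fixed)


# Versions taken from this post.
inventory = {"vEdge": "20.4.2", "Controllers": "20.6.4"}

for name, version in inventory.items():
    status = "upgrade needed" if needs_upgrade(version) else "at or above the fix"
    print(f"{name} {version}: {status}")
```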

Let's start the upgrade procedure. Before upgrading, take a database backup and generate the AURA report from the vManage CLI as shown below:

STEP 1: Database backup
NDNA_vManage# request nms configuration-db backup path /home/admin/backup-may23.tar.gz

STEP 2: AURA report on SDWAN
Download the Python script from the link below and run it from the vManage vshell:
GitHub – CiscoDevNet/sure: SD-WAN Upgrade Readiness Experience
NDNA_vManage# vshell
NDNA_vManage:~$
NDNA_vManage:~$ python3 py3_sure.py -u NDNAuser1
vManage Password:
#########################################################
###         SURE – Version 3.0.0                      ###
#########################################################
###     Performing SD-WAN Upgrade Readiness Check     ###
#########################################################

*Starting Checks, this may take several minutes

**** Performing Critical checks

Critical Check:#01
Critical Check:#02
Critical Check:#03
Critical Check:#04
Critical Check:#05

STEP 3: The AURA report flagged some issues with the Elasticsearch indices, so we removed those indices one by one from vManage

INFO:#08:Check:vManage:Elasticsearch Indices version
ERROR:#08: Check result:   Failed
ERROR:#08: Check Analysis: StatsDB indices with version below than 6.0 found for vManage version(20, 6, 4, 1)
ERROR:#08: List of indices with older versions  :
{'alarm_2018_09_30t23_15_36': 5.5999999999999996, 'ipsalert_2019_08_26t04_17_46': 5.5999999999999996, 'bridgemacstatistics_2018_09_30t23_15_35': 5.5999999999999996, 

STEP 4: Run the following from the vManage CLI to remove these indices

NDNA_vManage:~$ curl -X DELETE 'http://localhost:9200/alarm_2018_09_30t23_15_36'
NDNA_vManage:~$ curl -X DELETE 'http://localhost:9200/ipsalert_2019_08_26t04_17_46'
NDNA_vManage:~$ curl -X DELETE 'http://localhost:9200/bridgemacstatistics_2018_09_30t23_15_35'
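
When the report lists many old indices, typing each command by hand is error-prone. The sketch below, with hypothetical helper names, takes the index-to-version mapping as reported by AURA and prints the matching curl commands. It deliberately only prints them and does not delete anything, so you can review the list before running it. The three entries are the ones visible in the report above; the 6.0 threshold comes from the check analysis message.

```python
def stale_indices(index_versions: dict, minimum: float = 6.0) -> list:
    """Return the names of indices whose version is below the minimum, sorted by name."""
    return sorted(name for name, ver in index_versions.items() if ver < minimum)


def delete_commands(names, base_url: str = "http://localhost:9200") -> list:
    """Build one 'curl -X DELETE' command per index name (nothing is executed)."""
    return [f"curl -X DELETE '{base_url}/{name}'" for name in names]


# Index names and versions copied from the AURA report in this post.
reported = {
    "alarm_2018_09_30t23_15_36": 5.6,
    "ipsalert_2019_08_26t04_17_46": 5.6,
    "bridgemacstatistics_2018_09_30t23_15_35": 5.6,
}

for command in delete_commands(stale_indices(reported)):
    print(command)
```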

STEP 5: Now upgrade the controllers to 20.6.4.1 and then the vEdge to 20.6.4.1 from the vManage GUI, following your usual upgrade process

If you upgrade and reboot the router and the control connections do not come up, the router may have booted into the older version. You may need OOB or console access to the router to resolve that and retry the upgrade to the fixed version.

You need OOB/console/SSH access to the device to apply the steps below, which bring the control connections back up:
  • Roll back the clock to May 1st, 2023 on the vEdge whose control connections are DOWN
  • Wait 2-3 minutes for the board-id to initialize. Check "show control local-properties" to ensure the device now has a serial number (SN) listed in the output. (If this doesn't happen within 2-3 minutes, reload the vEdge and check the "show control local-properties" output again for the SN)
  • Revert the clock to the current time
  • Verify that the control connections are up
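
The serial number check in the second step can be scripted against saved output. This is a rough sketch under stated assumptions: the field name "serial-num" and the placeholder value for an uninitialized board-id are not shown anywhere in this post, so verify both against your own device output. The helper names and the sample serial number are hypothetical.

```python
def serial_number(output: str, field: str = "serial-num"):
    """Return the value of the serial number field, or None if the field is absent.

    Assumption: the field is named 'serial-num' in 'show control local-properties'.
    """
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == field:
            return parts[1].strip()
    return None


def board_initialized(output: str, placeholders=("BOARD-ID-NOT-INITIALISED",)) -> bool:
    """True when a serial number is present and is not a known placeholder.

    Assumption: the placeholder string is a guess; pass your own if it differs.
    """
    value = serial_number(output)
    return value is not None and value.upper() not in placeholders


# Hypothetical output fragment, not taken from a real device.
sample = "personality   vedge\nserial-num    12345678\n"
print(board_initialized(sample))
```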