Cisco SD-WAN App-aware SLA Based routing in Action

App-aware SLA-based routing in action: solving a real-life problem

Today's article is more of a conversation with a customer about how a feature can solve a real-life problem; I'm sharing the experience with you! One fine day, I connected with the customer to understand his experience with Cisco SD-WAN and how things were working during the PoC, in which a small set of sites had recently been migrated to the SD-WAN setup. The customer said it was doing well, but asked how they could be more “PROACTIVE” in their approach when a critical application is slow, without waiting for a call from the remote site.

Hmmm… I was silent for a moment, and then asked for more information to understand the case. The customer said that the day before there had been an issue with the HRMS application: it was very slow and clients were complaining, but because the solution was new to them, they could not decide how to troubleshoot and resolve the issue. Could I help?

After going through the basic checks, it was clear that the SD-WAN setup was running a basic configuration in its default mode: flow-based load sharing across both links (Internet and MPLS).

One of the troubleshooting tools in vManage (Simulate Flow) helped to prove that the application was using both links. Using this feature, a user can simulate traffic to understand how it flows through the network:

Figure 1.1 – HRMS Running over Private and Internet link between DC

Not all users in the branch were facing the issue, only those whose sessions were established over the degraded link. Now the question was: how do we check which link is degraded? It was not difficult; there are many places where a network admin can find the SLA measurements for the links:

Figure 1.2: vManage Main Dashboard Snapshot

From the output above we can see that the Internet link between the branch and DC has 100+ milliseconds of latency and more than 2% loss. It is easy to conclude that all the users on the Internet link faced slow responses from the HRMS application, and that the SLAs measured over the two links are poles apart.

But wait a minute: that view is for the complete network. What about the remote site? Do we have a way to see how its links are performing? I said yes, you can get a device-level network view.

Figure 3: Branch Internet Link Loss Percentage is 8% compared to 0% loss on MPLS/private 1 TLOC (link)

It is clearly visible from the output above that there was heavy loss on the biz-internet link, and that was the reason for the poor end-user experience while accessing the HRMS application. Finally, the customer asked how they could be proactive in future, so that if something goes wrong with a link's SLA, the application shifts to the next available backup path automatically.

Application-aware routing is the answer to such issues: it makes the network application aware. The administrator-defined SLA for an application is compared with the measured link performance, and in the event of an SLA breach, traffic is forwarded to a backup link. The configuration steps were as follows:

Step 1: Create the following lists

  • IP Prefix list – contains all the HRMS application subnets
  • SLA list – contains the SLA required for the HRMS application
  • Branch list – contains all the branches where the App-aware policy needs to apply
  • VPN list – contains the VPNs on which the App-aware routing policy will be implemented

Figure 4: IP Prefix List - IPs for HRMS Applications

Figure 5: SLA list - SLA for HRMS is less than 80 ms latency

Figure 6: Branch list - will be used when applying the policy

Figure 7: VPN list - VPN 10 (LAN Subnet) will be used when applying the policy
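
To make the role of these four lists concrete, here is a minimal Python sketch of how they combine to decide whether a flow falls under the policy. It is only an illustration, not vManage configuration: the prefix `10.10.20.0/24`, the site IDs `101` and `102`, and the names `SlaClass` and `policy_applies` are hypothetical placeholders. Only the 80 ms latency target (Figure 5) and VPN 10 (Figure 7) come from the setup described here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class SlaClass:
    """An SLA list entry: a name plus a latency target in milliseconds."""
    name: str
    max_latency_ms: int


# IP Prefix list: HRMS application subnets (placeholder prefix, not from the article).
HRMS_PREFIXES = [ip_network("10.10.20.0/24")]

# SLA list: latency below 80 ms, as in Figure 5.
HRMS_SLA = SlaClass(name="HRMS-SLA", max_latency_ms=80)

# Branch list: sites where the policy applies (placeholder site IDs).
BRANCH_SITES = {101, 102}

# VPN list: VPN 10 is the LAN side, as in Figure 7.
LAN_VPNS = {10}


def policy_applies(site_id: int, vpn: int, dst_ip: str) -> bool:
    """Return True when a flow matches the branch, VPN and HRMS prefix lists."""
    in_scope = site_id in BRANCH_SITES and vpn in LAN_VPNS
    is_hrms = any(ip_address(dst_ip) in net for net in HRMS_PREFIXES)
    return in_scope and is_hrms
```

For example, `policy_applies(101, 10, "10.10.20.5")` matches all three lists, while the same destination seen in a different VPN or from a site outside the branch list does not.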

Once the relevant lists are created, it is time to define the traffic rules for the HRMS application. We are going to define a rule stating that all HRMS application traffic should use the MPLS link (private 1 TLOC) and should meet the HRMS SLA shown in Figure 5. In case of an SLA breach, traffic is load balanced across the other link as well, since we don't have any link better than MPLS.

But if there are more than two links and you want traffic to move to a second MPLS link when the first MPLS link goes down, you can select another option to use a preferred backup path.

Figure 8: App-aware routing definition using traffic sequence
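
The rule described above can be pictured with a small Python sketch. This is a simplified model of the behavior as described in this article, not the actual path-selection algorithm running on the edge routers. The names `LinkStats` and `select_paths`, the color strings, and the MPLS latency figures in the usage note are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LinkStats:
    """Measured state of one transport link (TLOC)."""
    color: str
    latency_ms: float
    loss_pct: float
    up: bool = True


def select_paths(links: List[LinkStats], preferred_color: str,
                 max_latency_ms: float) -> List[str]:
    """Pick the link colors that HRMS traffic may use.

    1. If the preferred link is up and its latency is below the SLA
       target, use only that link.
    2. Otherwise (SLA breach or preferred link down), load balance
       across every link that is still up.
    """
    up_links = [link for link in links if link.up]
    preferred_ok = [
        link for link in up_links
        if link.color == preferred_color and link.latency_ms < max_latency_ms
    ]
    if preferred_ok:
        return [link.color for link in preferred_ok]
    return [link.color for link in up_links]
```

With an assumed MPLS latency of 20 ms and the degraded Internet link (100+ ms latency, 8% loss), `select_paths(links, "private1", 80)` returns only `private1`. If the MPLS latency rose to, say, 150 ms, the function would return both colors, which corresponds to the load-balancing fallback.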

Figure 9: Application of the App-aware policy to the branch and LAN VPN

Figure 10: Policy preview

After this, the App-aware SLA-based policy was configured to make the network proactive, and it was verified with the Simulate Flow utility in the troubleshooting tools. This time, the behavior followed the policy: HRMS traffic flows through the MPLS link, and only in case of high latency is it load shared across the links.

Figure 11: After policy activation, HRMS is working over MPLS link

So, I conclude this article with a message: a much-talked-about feature truly solves real customer problems. And a happy customer appreciates a technology feature when it solves a real challenge, not when it is deployed for the sake of compliance!