Chapter 7. Introduction to Performance Routing (PfR)

Bandwidth cost, WAN latency, and lack of bandwidth availability all contribute to the complexities of running an efficient and cost-effective network that meets the unique, application-heavy workloads of today’s enterprise organizations. But as the volume of content and applications traveling across the network grows exponentially, organizations must optimize their WAN investments.

Cisco Performance Routing (PfR) is the IWAN intelligent path control component that can help administrators to accomplish the following:

■ Augment the WAN with additional bandwidth by including lower-cost connectivity options such as the Internet

■ Realize the cost benefits of provider flexibility and the ability to choose different transport technologies (such as MPLS L3VPN, VPLS, or the Internet)

■ Offload the corporate WAN with highly secure direct Internet access

■ Improve application performance and availability based upon an application’s performance requirements

■ Protect critical applications from fluctuating WAN performance

Performance Routing (PfR)

Cisco Performance Routing (PfR) improves application delivery and WAN efficiency. PfR dynamically controls data packet forwarding decisions by looking at application type, performance, policies, and path status. PfR protects business applications from fluctuating WAN performance while intelligently load-balancing traffic over the best-performing path based on the application policy.

Simplified Routing over a Transport-Independent Design

One of the critical IWAN components, and a key design decision, was to architect the next-generation WAN around a transport-independent design (TID). The choice of DMVPN was explained extensively in Chapter 2, “Transport Independence.” This overlay approach allows the use of a single routing protocol over the WAN and greatly simplifies the routing decision process and Performance Routing, the two main benefits being

■ Simplified reachability information

■ Single routing domain

The first benefit of this overlay approach is simplified reachability information.

Traditional routing protocols were designed to solve the endpoint reachability problem in a hop-by-hop, destination-only forwarding environment of unknown topology. These protocols choose only the best path, based on statically assigned cost. There are a few exceptions where the network path can be engineered to a degree: BGP and MPLS traffic engineering (TE), for example, can select a path that is not the shortest one.

Designing deterministic routing behavior is difficult with multiple transport providers but is much simpler thanks to DMVPN. The DMVPN network topology is flat, and it is consistent because it is an overlay network that masks the network complexity underneath. This approach simplifies the logical view of the network and minimizes fundamental topology changes. Logically, only reachability to the next hop across the WAN can change.

An overlay network’s routing information is very simple: a set of destination prefixes, and a set of potential transport next hops for each destination. As a result, PfR just needs a mapping service that stores and serves all resolved forwarding states for connectivity per overlay network. Each forwarding state contains destination prefix, next hop (overlay IP address), and corresponding transport address.

The second benefit of using overlay networks is the single routing domain design. In traditional hybrid designs, it is common to have two (or more) routing domains:

■ One routing domain for the primary path over MPLS—EBGP, static, or default routes

■ One routing domain on the secondary path over the Internet—EIGRP, IBGP, or floating static routes

The complexity increases when routes are exchanged between the multiple routing domains, which can lead to suboptimal routing or routing loops. Using DMVPN for all WAN transports allows the use of a single routing protocol for all paths regardless of the transport choice. Whether the topology is hybrid (MPLS plus Internet) or dual Internet (two Internet paths), the routing configuration remains exactly the same. If a provider changes how it delivers connectivity, or you wish to add or change a provider underneath the DMVPN, the investment in your WAN routing architecture is secure.

EIGRP and IBGP are the best routing protocol options today with DMVPN.

After routing connectivity is established, PfR enters the picture and provides the advanced path control in IWAN. PfR is not a replacement for the routing protocol and never will be. As an adjunct, PfR gleans the next-hop information from the routing protocol and overrides it based on real-time performance and link utilization. This next-hop information per destination prefix is essential for PfR to work correctly and is a critical element in the routing design. Having a single routing domain and a very basic mapping service requirement greatly simplifies PfR’s interaction with the routing protocol.

“Classic” Path Control Used in Routing Protocols

Path control, commonly referred to as “traffic engineering,” is the process of choosing the network path on which traffic is sent. The simplest form is trivial: send all traffic down the primary path unless the path goes down; in that case, send everything through the backup path.

Figure 7-1 illustrates the concept where R31 (branch) sends traffic to R11 (headquarters). When R31’s link to the MPLS provider fails, traffic is sent through the Internet.


Figure 7-1 Traffic Flow over Primary and Backup Links

This approach has two main drawbacks:

■ Traffic is forwarded over a single path regardless of the application type, performance, or bandwidth issues.

■ The backup path is used only when the primary link goes down, not when there is performance degradation or a brownout on the primary path, because the routing protocol peers usually remain up and running and do not detect such performance issues.

Path Control with Policy-Based Routing

The next level of path control lets the administrator specify categories of traffic to send on a specific path as long as that path remains up. One of the most common options is policy-based routing (PBR), which routes based on DSCP values:

■ DSCP values that are mapped to critical business applications and voice/video types of applications are assigned a next hop that is over the preferred path.

■ DSCP values that are mapped to best-effort applications or applications that do not suffer from performance degradation are assigned a next hop over the secondary path. (A configuration sketch follows this list.)
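As a sketch of this classic approach, the following IOS configuration steers EF- and AF31-marked traffic toward a preferred next hop and everything else toward a secondary next hop. The interface name and next-hop addresses are invented for illustration and are not taken from this book’s topology:

ip access-list extended CRITICAL-DSCP
 permit ip any any dscp ef
 permit ip any any dscp af31
!
route-map WAN-PBR permit 10
 match ip address CRITICAL-DSCP
 set ip next-hop 172.16.1.1
!
! Sequence 20 has no match clause, so it catches all remaining traffic
route-map WAN-PBR permit 20
 set ip next-hop 192.168.1.1
!
interface GigabitEthernet0/1
 description LAN-facing interface where PBR is applied on ingress
 ip policy route-map WAN-PBR

Note that the route map forwards to 172.16.1.1 as long as that next hop resolves; it has no awareness of delay, loss, or jitter on the path, which is exactly the limitation discussed next.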

However, this approach is not intelligent and does not take into account the dynamic behavior of the network. Routing protocols have keepalive timers that can determine whether the next hop is available, but they cannot detect when the selected path suffers from degraded performance, so the system cannot compensate.

Figure 7-2 illustrates the situation where R31 (branch) sends traffic to R11 (headquarters). When R31’s path across the MPLS provider experiences performance issues, traffic continues to be sent through the MPLS backbone. PBR alone is unaware of any performance problems. An additional mechanism is needed to detect events like these, such as the use of IP SLA probes.


Figure 7-2 PBR’s Inability to Detect Problematic Links

Intelligent Path Control—Performance Routing

Classic routing protocols and path control with PBR cannot detect performance issues or move affected traffic to an alternative path. Intelligent path control solves this problem by monitoring actual application performance on the path that the applications are traversing, and by directing traffic to the appropriate path based on these real-time performance measurements.

When the current path experiences performance degradation, Cisco intelligent path control moves the affected flows according to user-defined policies.

Figure 7-3 illustrates the situation where R31 sends traffic to R11. When R31’s path across the MPLS provider experiences performance issues, only affected traffic is sent to the Internet path. The choice of traffic to fall back is based on defined policies. For example, voice or business application flows are forwarded over the secondary path, whereas best-effort traffic remains on the MPLS path.


Figure 7-3 Traffic Flow over Multiple Links with Cisco Intelligent Path Control

Advanced path control should include the following:

■ Detection of issues such as delay, loss, and jitter, and enforcement of defined path preference, before the associated application is impacted.

■ Passive performance measurement based on real user traffic when available, monitored on existing WAN edge routers. This helps support SLAs that protect critical traffic.

■ Efficient load distribution across the WAN links for medium-priority and best-effort traffic.

■ Effective reaction to any network outages before they can affect users or other aspects of the network. These include blackouts, which cause a complete loss of connectivity, as well as brownouts, which are network slowdowns caused by path degradation along the route to the destination. Although blackouts can be detected easily, brownouts are much more challenging to track and are usually responsible for a bad user experience.

■ Application-based policies that are designed to support the specific performance needs of applications (for example, point of sale, enterprise resource planning [ERP], and so on).

■ Low WAN overhead to ensure that control traffic does not contribute to overall traffic issues.

■ Easy management options, including a single point of administration and the ability to scale without a stacked deployment.

Cisco Performance Routing (PfR), part of Cisco IOS software, provides intelligent path control in IWAN and complements traditional routing technologies by using the intelligence of a Cisco IOS infrastructure to improve application performance and availability.

As explained before, PfR is not a replacement for the routing protocols but instead runs alongside them to glean the next hop per destination prefix. PfR has APIs into NHRP, BGP, EIGRP, and the routing table to request information. It can monitor and then modify the path selected for each application based on advanced criteria such as reachability, delay, loss, and jitter. PfR intelligently load-balances the remainder of the traffic among available paths based on the tunnel bandwidth utilization ratio.


Note

The routing table, known as the routing information base (RIB), is built from dynamic routing protocols and static and directly connected routes. The routing table is referred to as the RIB throughout the rest of this chapter.


Cisco PfR has evolved and improved over several releases with a focus on simplicity, ease of deployment, and scalability. Table 7-1 provides a list of features that have evolved with each version of PfR.


Table 7-1 Evolution of PfR Versions and Features

Introduction to PfRv3

Performance Routing Version 3 (PfRv3) is the latest generation of the original PfR created more than ten years ago. PfRv3 focuses on ease of use and scalability to make it easy to transition to an intelligent network with PfR. It uses one-touch provisioning with multisite coordination, which simplifies configuration and deployment compared with previous versions of PfR. PfRv3 is a DSCP- and application-based policy-driven framework that provides multisite path control optimization and is bandwidth aware for WAN- and cloud-based applications. PfRv3 is tightly integrated with existing AVC components such as Performance Monitor, QoS, and NBAR2.

PfR is composed of devices performing two roles: master controller (MC) and border router (BR). The MC serves as the control plane of PfR, and the BR is the forwarding plane, which selects the path based on the MC’s decisions.


Note

The MC and BR roles are implemented as Cisco IOS software features on WAN routers.


Figure 7-4 illustrates the mechanics of PfRv3. Traffic policies are defined based on DSCP values or application names. Policies can state requirements and preferences for applications and path selection. A sample policy can state that voice traffic uses the preferred path MPLS unless delay rises above 200 ms. PfR learns the traffic and starts measuring its bandwidth and performance characteristics. The MC then makes a decision by comparing the real-time metrics with the policies and instructs the BRs to use the appropriate path.


Figure 7-4 Mechanics of PfRv3
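As a minimal sketch of how the sample voice policy above might be expressed on a Hub MC, assuming a domain named iwan and the path names MPLS and INET used in this book (exact syntax varies by IOS release; Chapter 8 covers provisioning in detail):

domain iwan
 vrf default
  master hub
   source-interface Loopback0
   ! Voice: stay on MPLS unless one-way delay exceeds 200 ms
   class VOICE sequence 10
    match dscp ef policy custom
     priority 1 one-way-delay threshold 200
    path-preference MPLS fallback INET

With a policy like this, voice traffic remains on the MPLS path and falls back to INET only when the measured one-way delay crosses the 200 ms threshold.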


Note

The BRs automatically build a tunnel (known as an auto-tunnel) between other BRs at a site. If the MC instructs a BR to redirect traffic to a different BR, traffic is forwarded across the auto-tunnel to reach the other BR.



Note

The first iteration of PfRv3 was introduced in summer 2014 with IOS 15.4(3)M and IOS XE 3.13.


Introduction to the IWAN Domain

An IWAN domain is a collection of sites that share the same set of policies and are managed by the same logical PfR domain controller. Each site runs PfR and gets its path control configuration and policies from the logical IWAN domain controller through the IWAN peering service. At each site, an MC is the local decision maker and controls the BRs responsible for performance measurement and path enforcement. The IWAN domain can be an entire enterprise WAN, a particular region, and so forth.

The key point for PfRv3 is that provisioning is fully centralized at a logical domain controller, whereas path control decisions and enforcement are fully distributed within the sites that are in the domain, making the solution very scalable.

Figure 7-5 shows a typical IWAN domain with central and branch sites. R10, R20, R31, R41, and R51 are all MCs for their corresponding sites and retrieve their configuration from the logical domain controller. R11, R12, R21, R22, R31, R41, R51, and R52 are all BRs that report back to their local MC. Notice that R31, R41, and R51 operate as both the MC and the BR for their sites.


Figure 7-5 IWAN Domain Concepts


Note

In the remainder of this book, all references to PfR mean PfRv3.


IWAN Sites

An IWAN domain includes a mandatory hub site, optional transit sites, as well as branch sites. Each site has a unique identifier called a site ID that is derived from the loopback address of the local MC.

Central and headquarters sites play a significant role in PfR and are called IWAN Points of Presence (POPs). Each site has a unique identifier called a POP ID. These sites house DMVPN hub routers and therefore provide the following traffic flows (streams):

■ Traditional DMVPN spoke-to-hub connectivity.

■ Spoke-to-hub-to-spoke connectivity until DMVPN spoke-to-spoke tunnels establish.

■ Connectivity through NHS chaining until DMVPN spoke-to-spoke tunnels establish.

■ Transit connectivity to another site via a data center interconnect (DCI) or shared data center network segment. In essence, these sites act as transit sites for the traffic crossing them. Imagine in Figure 7-5 that R31 goes through R21 to reach a network that resides in Site 1. R21 does not terminate the traffic at the local site; it provides transit connectivity to Site 1 via the DCI.

■ Data centers may or may not be colocated with the hub site; some hub sites contain data centers, whereas others (such as outsourced colocation cages) do not.

Hub site

■ The logical domain controller functions reside on this site’s MC.

■ Only one hub site exists per IWAN domain because of the uniqueness of the logical domain controller. The MC for this site is known as the Hub MC, thereby making this site the hub site.

■ MCs from all other sites (transit or branch) connect to the Hub MC for PfR configuration and policies.

■ A POP ID of 0 is automatically assigned to a hub site.

■ A hub site may contain all other properties of a transit site, as defined next.

Transit sites

■ Transit sites are located in an enterprise central site, headquarters, or carrier-neutral facilities.

■ They provide transit connectivity to access servers in the data centers or for spoke-to-spoke traffic.

■ A data center may or may not be colocated with the transit site. A data center can be reached via a transit site.

■ A POP ID is configured for each transit site. This POP ID has to be unique in the domain.

■ The local MC (known as a Transit MC) peers with the Hub MC (domain controller) to get its policies, monitoring configuration, and timers.

Branch sites

■ Branch sites are always DMVPN spokes and are stub sites where transit traffic is not allowed.

■ The local Branch MC peers with the logical domain controller (Hub MC) to get its policies and monitoring guidelines.

Figure 7-6 shows the IWAN sites in a domain with two central sites (one is defined as the hub site and the other as a transit site). R10, R11, and R12 belong to the hub site, and R20, R21, and R22 belong to a transit site. R31, R41, R51, and R52 belong to branch sites. The dotted lines represent each site’s local MC peering with the Hub MC.


Figure 7-6 IWAN Domain Hub and Transit Sites

Device Components and Roles

The PfR architecture consists of two major Cisco IOS components, a master controller (MC) and a border router (BR). The MC is a policy decision point where policies are defined and applied to various traffic classes (TCs) that traverse the BR systems. The MC can be configured to learn and control TCs on the network:

■ Border routers (BRs) are in the data-forwarding path. BRs collect data from their Performance Monitor cache and smart probe results, provide a degree of aggregation of this information, and influence the packet forwarding path as directed by the site-local MC to manage the traffic.

■ The master controller (MC) is the policy decision maker. At a large site, such as a data center or campus, the MC is a dedicated (physical or logical) router. For smaller branch locations, the MC is typically colocated (configured) on the same platform as the BR. As a general rule, large locations manage more network prefixes and applications than a branch deployment, thus consuming more CPU and memory resources for the MC function. Therefore, it is a good design practice to dedicate a chassis for the MC at large sites.

Each site in the PfR domain must include a local MC and at least one BR.

The branch typically manages fewer active network prefixes and applications. Because of the costs associated with dedicating a chassis at each branch, the network manager can colocate the local MC and BR on the same router platform. CPU and memory utilization should be monitored on platforms operating as MCs, and if utilization is high, the network manager should consider an MC platform with a higher-capacity CPU and memory. The local MC communicates with BRs and the Hub MC over an authenticated TCP socket but has no requirement for populating its own IP routing table with anything more than a route to reach the Hub MC and local BRs.

PfR is an intelligent path selection technology and requires

■ At least two external interfaces under the control of PfR

■ At least one internal interface under the control of PfR

■ At least one configured BR

■ If only one BR is configured, both external interfaces are attached to the single BR.

■ If more than one BR is configured, two or more external interfaces are configured across these BRs.

The BR, therefore, owns external links, or exit points; they may be logical (tunnel interfaces) or physical links (serial, Ethernet, and so on). With the IWAN prescriptive design, external interfaces are always logical DMVPN tunnels.

A device can fill five different roles in an IWAN domain:

■ Hub MC: This is the MC at the hub site. It acts as MC for the site, makes optimization decisions for that site, and provides the path control policies for all the other MCs. The Hub MC contains the logical PfR domain controller role.

■ Transit MC: This is the MC at a transit site that makes optimization decisions for that site. There is no policy configuration on Transit MCs because they receive their policies from the Hub MC.

■ Branch MC: The Branch MC is the MC for a branch site that makes optimization decisions for that site. There is no policy configuration on Branch MCs because they receive their policies from the Hub MC.

■ Transit BR: The Transit BR is the BR at a hub or transit site. The WAN interface terminates on the BR, and PfR is enabled on that interface. At the time of this writing, only one WAN interface is supported on a Transit BR. This limitation is overcome by using multiple BR devices.


Note

Some Cisco documentation may refer to a Transit BR as a Hub BR, but the two function identically because transit site capabilities were included in a later release of PfR.


■ Branch BR: The Branch BR resides at the branch site and forwards traffic based on the decisions of the Branch MC. The only PfR configuration is the identification of the Branch MC and setting the device’s role as a BR. The WAN interface that terminates on the device is detected automatically.

The PfR Hub MC is currently supported only on the IOS and IOS XE operating systems.
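The following minimal sketches show how these roles might be declared in configuration. The domain name iwan, the loopback numbering, and the Hub MC address 10.0.0.10 are assumptions for illustration, and exact syntax varies by IOS release:

! Branch router colocating the Branch MC and Branch BR roles
domain iwan
 vrf default
  border
   source-interface Loopback0
   master local
  master branch
   source-interface Loopback0
   hub 10.0.0.10
!
! Dedicated Transit MC at a transit site (POP ID 1)
domain iwan
 vrf default
  master transit 1
   source-interface Loopback0
   hub 10.0.0.10

A Branch BR on a separate chassis would carry only the border section, with the master command pointing at the local Branch MC’s loopback address instead of master local.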

IWAN Peering

PfR uses an IWAN peering service between the MCs and BRs that is based on a publish/subscribe architecture. The current IWAN peering service uses Cisco SAF to distribute information between sites, including but not limited to

■ Learned site prefixes

■ PfR policies

■ Performance Monitor information

The IWAN peering service provides an environment for service advertisement and discovery in a network. It is made up of two primary elements: client and forwarder.

■ An IWAN peering service client is a producer (advertises to the network), a consumer of services (requests a service from the network), or both.

■ An IWAN peering service SAF forwarder receives services advertised by clients, distributes the services reliably through the network, and makes services available to clients.

■ An IWAN peering service client needs to send a register message to a forwarder before it is able to advertise (publish) or request (subscribe to) services.

The IWAN peering service also adopts a logical unicast topology to implement the peering system. Each instance that joins the IWAN peering service serves as both a client and a forwarder:

■ The Hub MC listens for unicast packets for advertisements or publications from Transit MCs, Branch MCs, and local BRs.

■ The Transit MC peers with the Hub MC and listens to its local BRs.

■ The Branch MC peers with the Hub MC and listens to its local BRs.

■ BRs always peer with their local MC.

Figure 7-7 illustrates the IWAN peering service with the policies advertised from the Hub MC, the advertisement of monitors, and the exchange of site prefixes.


Figure 7-7 IWAN SAF Peering Service

SAF is automatically configured when PfR is enabled on a site. SAF dynamically discovers peers and establishes the peerings as defined previously. The Hub MC advertises all policies and monitoring configuration to all the sites. Every site is responsible for advertising its own site prefix information to the other sites in the domain.

Each instance must use an interface with an IP address that is reachable (routed) through the network to join in the IWAN peering system. PfRv3 requires that this address be a loopback address. It is critical that all these loopback addresses be reachable across the IWAN domain.

Parent Route Lookups

PfR uses the concept of a parent route lookup, which refers to locating all the paths that a packet can take to a specific network destination regardless of the best-path calculation. The parent route lookup is performed so that PfR can monitor all paths and thereby prevent network traffic from being blackholed when, for example, the BRs carry only summary routes in their routing tables. PfR has direct API accessibility into EIGRP and BGP and can identify all the paths available for a prefix regardless of whether alternative paths were installed into the RIB.

PfR requires a parent route for every WAN path (primary, secondary, and so on) to work effectively. PfR searches the following locations, in the order listed, to locate all the paths for a destination:

1. NHRP cache (when spoke-to-spoke direct tunnels are established)

2. BGP table (where applicable)

3. EIGRP topology table (where applicable)

4. Static routes (where applicable)

5. RIB. Only one path is selected by default. For multiple paths to be selected, the same routing protocol must find the paths to be of equal cost. This is known as equal-cost multipathing (ECMP).


Note

If a protocol other than EIGRP or BGP is used, all the paths have to be ECMP in the RIB. Without ECMP in the RIB, PfR cannot identify alternative paths, and that hinders PfR’s effectiveness.


The following logic is used for parent route lookups:

Image The parent route lookup is done during channel creation (see the following section, “Intelligent Path Control Principles,” for more information).

Image For PfR Internet-bound traffic, the parent route lookup is done every time traffic is controlled.

In a typical IWAN design, BGP or EIGRP is configured to make sure MPLS is the preferred path and the Internet the backup path. Therefore, for any destination prefix, MPLS is the only available path in the RIB. But PfR looks into the BGP or EIGRP table and knows if the Internet is also a possible path and can use it for traffic forwarding in a loop-free manner.
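As a hedged sketch of this preference, a spoke running iBGP might raise the local preference of routes learned from the hub across the MPLS tunnel, so that only the MPLS path lands in the RIB while PfR still sees the Internet path in the BGP table. The AS number and the neighbor addresses (hub tunnel endpoints on Tunnel100 and Tunnel200) are invented for illustration:

route-map PREFER-MPLS permit 10
 set local-preference 200
!
router bgp 65000
 ! Hub neighbor reached across the MPLS tunnel (preferred path)
 neighbor 10.0.100.1 remote-as 65000
 neighbor 10.0.100.1 route-map PREFER-MPLS in
 ! Hub neighbor reached across the Internet tunnel (default preference 100)
 neighbor 10.0.200.1 remote-as 65000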

Intelligent Path Control Principles

PfR is able to provide intelligent path control and visibility into applications by integrating with the Cisco Performance Monitoring Agent available on the WAN edge (BR) routers. Performance metrics are passively collected based on user traffic and include bandwidth, one-way delay, jitter, and loss.

PfR Policies

PfR policies are global to the IWAN domain and are configured on the Hub MC, then distributed to all MCs via the IWAN peering system. Policies can be defined per DSCP or per application name.

Branch and Transit MCs also receive the Cisco Performance Monitor instance definition, and they can instruct the local BRs to configure Performance Monitors over the WAN interfaces with the appropriate thresholds.

PfR policies are divided into three main groups:

■ Administrative policies: These policies define path preference, path of last resort, and zero SLA (used to minimize control traffic on a metered interface).

■ Performance policies: These policies define thresholds for delay, loss, and jitter for user-defined DSCP values or application names.

■ Load-balancing policy: Load balancing can be enabled or disabled globally, or it can be enabled for specific network tunnels. In addition, load balancing can provide specific path preference (for example, the primary path can be INET01 and INET02 with a fallback of MPLS01 and MPLS02). A configuration sketch combining the three groups follows this list.
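The following sketch shows how the three policy groups might appear together on a Hub MC. The class name, DSCP value, and thresholds are illustrative assumptions, not values prescribed by this book, and exact syntax varies by release:

domain iwan
 vrf default
  master hub
   source-interface Loopback0
   ! Load-balancing policy: distribute non-performance traffic across exits
   load-balance
   ! Performance policy: loss and delay thresholds for a custom class
   class CRITICAL-DATA sequence 20
    match dscp af31 policy custom
     priority 1 packet-loss-rate threshold 5
     priority 2 one-way-delay threshold 100
    ! Administrative policy: prefer MPLS, fall back to INET
    path-preference MPLS fallback INET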

Site Discovery

PfRv3 was designed to simplify the configuration and deployment of branch sites. The configuration is kept to a minimum and includes the IP address of the Hub MC. All MCs connect to the Hub MC in a hub-and-spoke topology.

When a Branch or Transit MC starts:

■ It uses the loopback address of the local MC as its site ID.

■ It registers with the Hub MC, providing its site ID, then starts building the IWAN peering with the Hub MC to get all information needed to perform path control. That includes policies and Performance Monitor definitions.

■ The Hub MC advertises the site ID information for all sites to all its Branch or Transit MC clients.

At the end of this phase, all MCs have a site prefix database that contains the site ID for every site in the IWAN domain.


Note

The site ID is based on the local MC loopback address and is a critical piece of PfR. Routing for MC addresses must be carefully designed to ensure that this address is correctly advertised across all available paths.


Figure 7-8 shows the IWAN peering between all MCs and the Hub MC. R10 is the Hub MC for this topology. R20, R31, R41, and R51 peer with R10. This is the initial phase for site discovery.


Figure 7-8 Demonstration of IWAN Peering to the Domain Controller

Site Prefix Database

PfR maintains a topology database that contains all the network prefixes and their associated site IDs. A site prefix is the combination of a network prefix and the ID of the site where that prefix resides. This PfR topology table is known as the site prefix database and is a vital component of PfR. The site prefix database resides on both MCs and BRs. The site prefix database on an MC learns and manages the site prefixes and their origins from both local egress flows and advertisements from remote MC peers. The site prefix database on a BR learns and manages the site prefixes and their origins only from the advertisements from remote peers. The site prefix database is organized as a longest-prefix-matching tree for efficient search.

Table 7-2 provides the site prefix database on all MCs and BRs for the IWAN domain shown in Figure 7-8. It provides a mapping between a destination prefix and a destination site.


Table 7-2 Site Prefix Database for an IWAN Domain


Note

The site prefix database can contain multiple network prefixes per site and is not limited to just one. A second entry was added to the table for Site 1 to illustrate the concept.


To learn prefixes advertised from remote peers via the peering infrastructure, every MC and BR subscribes to the site prefix subservice of the PfR peering service. MCs publish and receive site prefixes; BRs only receive site prefixes. An MC publishes the list of site prefixes learned from local egress flows by encoding the site prefixes and their origins into a message. This message is received by all the other MCs and BRs that subscribe to the peering service, where it is decoded and added to the site prefix databases. Site prefixes are explained in more detail in Chapter 8, “PfR Provisioning.”


Note

Site prefixes are dynamically learned at branch sites but must be statically defined at hub and transit sites. The branch site prefixes can be statically defined too.
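As a sketch, statically defining the site prefixes at a hub site might look like the following, where the prefix-list name and the 10.1.0.0/16 prefix are assumptions for illustration:

ip prefix-list SITE1-PREFIXES seq 10 permit 10.1.0.0/16
!
domain iwan
 vrf default
  master hub
   source-interface Loopback0
   site-prefixes prefix-list SITE1-PREFIXES

A Transit MC would carry the same site-prefixes command under its master transit section for its own site’s prefixes.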


PfR Enterprise Prefixes

The enterprise-prefix prefix list defines the boundary for all the internal enterprise prefixes. A prefix that is not in the enterprise-prefix prefix list is considered a PfR Internet prefix. PfR does not monitor performance (delay, jitter, byte loss, or packet loss) for traffic destined for Internet prefixes.
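A minimal sketch of defining this boundary on the Hub MC, assuming for illustration that the enterprise uses 10.0.0.0/8 internally:

ip prefix-list ENTERPRISE-PREFIXES seq 10 permit 10.0.0.0/8
!
domain iwan
 vrf default
  master hub
   enterprise-prefix prefix-list ENTERPRISE-PREFIXES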

In Figure 7-9, all the network prefixes for remote sites (Sites 3, 4, and 5) have been dynamically learned. The central sites (Site 1 and Site 2) have been statically configured. The enterprise-prefix prefix list has been configured to include all the network prefixes in each of the sites so that PfR can monitor performance.


Figure 7-9 PfR Site and Enterprise Prefixes


Note

In centralized Internet access models, in order for PfR to monitor performance to Internet-based services (email hosting and so forth), the hosting network prefix must be assigned to the enterprise-prefix prefix list. In addition, the hosting network is added to all the site prefix lists for sites that provide Internet connectivity.


WAN Interface Discovery

Border router WAN interfaces are connected to different SPs and have to be defined or discovered by PfR. This definition creates the relationship between the SPs and the administrative policies based on the path name in PfR. A typical example is to define an MPLS-VPN path as the preferred one for all business applications and the Internet-based path as a fallback path when there is a performance issue on the primary.

Hub and Transit Sites

In a PfR domain, a path name and a path identifier need to be configured for every WAN interface (DMVPN tunnel) on the hub site and all transit sites:

■ The path name uniquely identifies a transport network. For example, this book uses a primary transport network called MPLS for the MPLS-based transport and a secondary transport network called INET for the Internet-based transport.

■ The path identifier uniquely identifies a path on a site. This book uses path-id 1 for DMVPN tunnel 100 connected to MPLS and path-id 2 for tunnel 200 connected to INET.

IWAN supports multiple BRs for the same DMVPN network on the hub and transit sites only. The path identifier has been introduced in PfR to be able to track every BR individually.
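A sketch of this naming on the hub BRs, using the tunnel numbers and path names from this book (the domain name iwan is an assumption, and exact syntax varies by release):

! On the hub BR that terminates the MPLS transport
interface Tunnel100
 domain iwan path MPLS path-id 1
!
! On the hub BR that terminates the Internet transport
interface Tunnel200
 domain iwan path INET path-id 2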

Every BR on a hub or transit site periodically sends a discovery packet with path information to every discovered site. The discovery packets are created with the following default parameters:

■ Source IP address: Local MC IP address

■ Destination IP address: Remote site ID (remote MC IP address)

■ Source port: 18000

■ Destination port: 19000

Branch Sites

WAN interfaces are automatically discovered on Branch BRs. There is no need to configure the transport names over the WAN interfaces.

When a BR on a branch site receives a discovery probe from a central site (hub or transit site):

■ It extracts the path name and path identifier information from the probe payload.

■ It stores the mapping between the WAN interface and the path name.

■ It sends the interface name, path name, and path identifier information to the local MC.

■ The local MC thereby learns that a new WAN interface is available and that a BR is available on that path with the given path identifier.

The BR associates the tunnel with the correct path information, enables the Performance Monitors, collects performance metrics, collects site prefix information, and identifies traffic that can be controlled.

This discovery process simplifies the deployment of PfR.

Channel

Channels are logical entities used to measure path performance per DSCP between two sites. A channel is keyed on a unique combination of factors such as interface, site, next hop, and path, and it is created based on real user traffic observed on the BRs or on synthetic traffic generated by the BRs, called smart probes. A channel is added every time a new DSCP, interface, or site is added to the prefix database or when a new smart probe is received. A channel is a logical construct in PfR used to keep track of next-hop reachability and to collect the performance metrics per DSCP.


Note

In the IWAN 2.1 architecture, multiple next-hop capability was added so that PfR could monitor a path taken through a transit site. A channel is actually created per next hop. In topologies that include a transit site, a channel is created for every next hop to the destination prefix to monitor performance.


Figure 7-10 illustrates the channel creation over the MPLS path for DSCP EF. Every channel is used to track the next-hop availability and collect the performance metrics for the associated DSCP and destination site.


Figure 7-10 Channel Creation for Monitoring Performance Metrics

When a channel needs to be created on a path, PfR creates corresponding channels for any alternative paths to the same destination. This allows PfR to keep track of the performance for the destination prefix and DSCP for every DMVPN network. Channels are deemed active or standby based on the routing decisions and PfR policies.

Multiple BRs can sit on a hub or transit site connected to the same DMVPN network. DMVPN hub routers function as NHRP NHSs for DMVPN and as the BRs for PfR. PfR supports multiple next-hop addresses for hub and transit sites only but limits each BR to hosting only one DMVPN tunnel. This limitation is overcome by placing multiple BRs in a hub or transit site.

The combination of multiple next hops and transit sites creates a high level of availability. A destination prefix can be available across multiple central sites and multiple BRs. For example, if a next hop connected on the preferred path DMVPN tunnel 100 (MPLS) experiences delays, PfR is able to fail over to the other next hop available for DMVPN tunnel 100 that is connected to a different router. This avoids failing over to a less preferred path using DMVPN tunnel 200, which uses the Internet as a transport.

Figure 7-11 illustrates a branch with DSCP EF packets flowing to a hub or transit site that has two BRs connected to the MPLS DMVPN tunnel. Each path has the same path name (MPLS) and a unique path identifier (path-id 1 and path-id 2). If BR1 experiences performance issues, PfR fails over the affected traffic to BR2 over the same preferred path MPLS.


Figure 7-11 Channels per Next Hop

A parent route lookup is done during channel creation. PfR first checks to see if there is an NHRP shortcut route available; if not, it then checks for parent routes in the order of BGP, EIGRP, static, and RIB. If at any point an NHRP shortcut route appears, PfR selects that and relinquishes using the parent route from one of the routing protocols. This behavior allows PfR to dynamically measure and utilize DMVPN shortcut paths to protect site-to-site traffic according to the defined polices as well.

A channel is deemed reachable if the following happens:

■ Traffic is received from the remote site.

■ An unreachable event is not received for two monitor intervals.

A channel is declared unreachable in both directions in the following circumstances:

■ No packets have been received from the peer since the last unreachable time, as detected by the BR. This unreachable timer is one second by default and can be tuned if needed.

■ The MC receives an unreachable event from a remote BR. The MC notifies the local BR to mark the channel unreachable.

When a channel becomes unreachable, it is processed through the threshold crossing alert (TCA) messages, which will be described later in the chapter.

Smart Probes

Smart probes are synthetic packets that are generated from a BR and are primarily used for WAN interface discovery, delay calculation, and performance metric collection for standby channels. This synthetic traffic is generated only when real traffic is not present, except for periodic packets for one-way-delay measurement. The probes (RTP packets) are sent over the channels to the sites that have been discovered.

Smart probe traffic is sent at periodic intervals:

■ Periodic probes: Periodic packets are sent to compute one-way delay. These probes are sent at regular intervals whether actual traffic is present or not. By default the interval is one-third of the monitoring interval (which defaults to 30 seconds), so periodic probes are sent every 10 seconds.

■ On-demand probes: These packets are sent only when there is no traffic on a channel. Twenty packets per second are generated per channel. As soon as user traffic is detected on a channel, the BR stops sending on-demand probes.

Traffic Class

PfR manages aggregations of flows called traffic classes (TCs). A traffic class is an aggregation of flows going to the same destination prefix, with the same DSCP or application name (if application-based policies are used).

Traffic classes are learned on the BR by monitoring a WAN interface’s egress traffic. This is based on a Performance Monitor instance applied on the external interface.

Traffic classes are divided into two groups:

■ Performance TCs: These are any TCs with performance metrics defined (delay, loss, jitter).

■ Non-performance TCs: The default group; these are the TCs that do not have any defined performance metrics (delay, loss, jitter), that is, TCs that do not have any match statements in the policy definition on the Hub MC.

For every TC, the PfR route control maintains a list of active channels (current exits) and standby channels.


Note

Real-time load balancing affects only non-performance TCs. PfR moves default TCs between paths to keep bandwidth utilization within the boundaries of a predefined ratio. For performance TCs, new TCs use the least loaded path. After a traffic class is established, it stays on the path defined, unless that path becomes out of policy.


Path Selection

Path and next-hop selection in PfR depends on the routing design in conjunction with the PfR policies. From a central site (hub and transit) to a branch site, there is only one possible next hop per path. From a branch site to a central site, multiple next hops can be available and may span multiple sites. PfR has to make a choice among all next hops available to reach the destination prefix of the traffic to control.

Direction from Central Sites (Hub and Transit) to Spokes

Each central site is a distinct site by itself and controls only traffic toward the spoke on the WAN paths to that site. PfR does not redirect traffic between central sites across the DCI or WAN core to reach a remote site. If the WAN design requires that all the links be considered from POP to spoke, use a single MC to control all BRs from both central sites.

Direction from Spoke to Central Sites (Hub and Transit)

The path selection from BR to a central site router can vary based on the overall network design. The following sections provide more information on PfR’s path selection process.

Active/Standby Next Hop

The spoke considers all the paths (multiple next hops) toward the central sites and maintains a list of active/standby candidate next hops per prefix and interface. The concept of active and standby next hops is based on the best routing metric, which identifies the preferred POP for a given prefix. If the best metric for a given prefix is on a specific central site, all the next hops on that site for all the paths are tagged as active (only for that prefix). A next hop in a given list is considered to have the best metric based on the following criteria:

■ Advertised mask length

■ BGP weight and local preference

■ EIGRP feasible distance (FD) and successor FD

Transit Site Affinity

Transit Site Affinity (also called POP preference) is used in the context of a multiple-transit-site deployment where the same set of prefixes is advertised from multiple central sites. Some branches prefer a specific transit site over the other sites. The affinity of a branch to a transit site is configured by altering the routing metrics for prefix advertisements sent to the branch from the transit site. If one of the central sites advertising a specific prefix has the best next hop, the entire site is preferred over the other sites for all TCs to this destination prefix. Transit Site Affinity is a higher-priority filter and takes precedence over path preference. The Transit Site Affinity feature was introduced in Cisco IWAN 2.1.

Path Preference

During the Policy Decision Point (PDP) process, the exits are sorted first on available bandwidth, then on Transit Site Affinity, and finally by a third sort that places all primary path preferences at the front of the list, followed by fallback preferences. A common deployment use case is to define a primary path (MPLS) and a fallback path (INET). During PDP, MPLS is selected as the primary channel, and if INET is within policy it is selected as the fallback.

■ With path preference configured, PfR first considers all the links belonging to the preferred path (that is, it includes both the active and the standby links belonging to the preferred path) and then uses the fallback provider links.

■ Without path preference configured, PfR gives preference to the active channels and then the standby channels (active/standby is per prefix) with respect to the performance and policy decisions.


Note

Active/standby tagging happens whether Transit Site Affinity is enabled or disabled. The active and standby channels (per prefix) may span central sites if they advertise the same prefix. Spoke routers use a hash to choose the active channel.


Transit Site Affinity and Path Preference Usage

Transit Site Affinity and path preference are used in combination to influence the next-hop selection per TC. For example, this book uses a topology with two central sites (Site 1 and Site 2) and two paths (MPLS and INET). Both central sites advertise the same prefix (10.10.0.0/16 as an example), and Site 1 has the best next hop for that prefix (R11 advertises 10.10.0.0/16 with the highest BGP local preference). With Transit Site Affinity enabled and a path preference defined with MPLS as the primary and INET as the fallback path, the BR identifies the following routers (in order) for the next hop:

1. R11 is the primary next hop for TCs with 10.10.0.0/16 as the destination prefix

2. Then R12 (same site, because of Transit Site Affinity)

3. Then R21 (Site 2, because of path preference)

4. Then R22

Performance Monitoring

The PfR monitoring system interacts with the IOS component called Performance Monitor to achieve the following tasks:

■ Learning site prefixes and applications

■ Collecting and analyzing performance metrics per DSCP

■ Generating threshold crossing alerts

■ Generating out-of-policy reports

Performance Monitor is a common infrastructure within Cisco IOS that passively collects performance metrics, number of packets, number of bytes, statistics, and more within the router. In addition, Performance Monitor organizes the metrics, formats them, and makes the information accessible and presentable based upon user needs. Performance Monitor provides a central repository for other components to access these metrics.

PfR is a client of Performance Monitor, and through the performance monitoring metrics, PfR builds a database from that information and uses it to make an appropriate path decision. When a BR component is enabled on a device, PfR configures and activates three Performance Monitor instances (PMIs) over all discovered WAN interfaces of branch sites, or over all configured WAN interfaces of hub or transit sites. Enablement of PMI on these interfaces is dynamic and completely automated by PfR. This configuration does not appear in the startup or running configuration file.

The PMIs are

■ Monitor 1: Site prefix learning (egress direction)

■ Monitor 2: Egress aggregate bandwidth per traffic class

■ Monitor 3: Performance measurements (ingress direction)

Monitor 3 contains two monitors: one dedicated to the business and media applications where failover time is critical (called quick monitor), and one allocated to the default traffic.

PfR policies are applied either to an application definition or to DSCP. Performance is measured only per DSCP because SPs can differentiate traffic only based on DSCP and not based on application.

Performance is measured between two sites where there is user traffic. This could be between hub and a spoke, or between two spokes; the mechanism remains the same.

■ The egress aggregate monitor instance captures the number of bytes and packets per TC on egress at the source site. This provides the bandwidth utilization per TC.

■ The ingress per-DSCP monitor instance collects the performance metrics per DSCP (channel) on ingress at the destination site. Policies are applied to either application or DSCP. However, performance is measured per DSCP because SPs differentiate traffic only based on DSCP and not based on discovered application definitions. All TCs that have the same DSCP value get the same QoS treatment from the provider, and therefore there is no real need to collect performance metrics per application-based TC.

PfR passively collects metrics based on real user traffic and collects metrics on alternative paths too. The source MC then instructs the BR connected to the secondary paths to generate smart probes to the destination site. The PMI on the remote site collects statistics in the same way it would for actual user traffic. Thus, the health of a secondary path is known prior to being used for application traffic, and PfR can choose the best of the available paths on an application or DSCP basis.

Figure 7-12 illustrates PfR performance measurement with network traffic flowing from left to right. On the ingress BRs (BRs on the right), PfR monitors performance per channel. On the egress BRs (BRs on the left), PfR collects the bandwidth per TC. Metrics are collected from the user traffic on the active path and based on smart probes on the standby paths.


Figure 7-12 PfR Performance Measurement via Performance Monitor


Note

Smart probes are not IP SLA probes. Smart probes are directly forged in the data plane from the source BR and discarded on the destination BR after performance metrics are collected.


Threshold Crossing Alert (TCA)

Threshold crossing alert (TCA) notifications are alerts for when network traffic exceeds a set threshold for a specific PfR policy. TCAs are generated from the PMI attached to the BR’s ingress WAN interfaces and smart probes. Figure 7-13 displays a TCA being raised on the destination BR.


Figure 7-13 Threshold Crossing Alert (TCA)

Threshold crossing alerts are managed on both the destination BR and source MC for the following scenarios:

■ The destination BR receives performance TCA notifications from the PMI, which monitors the ingress traffic statistics and reports TCA alerts when threshold crossing events occur.

■ The BR forwards the performance TCA notifications to the MC on the source site that actually generates the traffic. This source MC is selected from the site prefix database based on the source prefix of the traffic. TCA notifications are transmitted via multiple paths for reliable delivery.

■ The source MC receives the TCA notifications from the destination BR and translates the TCA notifications (which contain performance statistics) into an out-of-policy (OOP) event for the corresponding channel.

■ The source MC waits for the TCA processing delay time for all the notifications to arrive, then starts processing the TCA. The processing involves selecting TCs that are affected by the TCA and moving them to an alternative path.

Path Enforcement

PfR uses the Route Control Enforcement module for optimal traffic redirection and path enforcement. This module performs lookups and reroutes traffic similarly to policy-based routing but without using an ACL. The MC makes path decisions for every unique TC. The MC picks the next hop for a TC’s path and instructs the local BR how to forward packets within that TC.

Because of how path enforcement is implemented, the next hop has to be directly connected to each BR. When there are multiple BRs at a site, PfR sets up an mGRE tunnel between all of them to accommodate path enforcement. Every time a WAN exit point is discovered, or an up/down interface notification is sent to the MC, the MC relays the notification to all other BRs in the site, and an endpoint pointing toward that BR is added to the mGRE tunnel.

When packets are received on the LAN side of a BR, the route control functionality determines if it must exit via a local WAN interface or via another BR. If the next hop is via another BR, the packet is sent out on the tunnel toward that BR. Thus the packet arrives at the destination BR within the same site. Route control gets the packet, looks at the channel identifier, and selects the outgoing interface. The packet is then sent out of this interface across the WAN.

Summary

This chapter provided a thorough overview of Cisco intelligent path control, which is a core pillar of the Cisco IWAN architecture and is based upon Performance Routing (PfR). The following chapters will expand upon these theories while explaining the configuration of PfR.

PfR provides the following benefits for a WAN architecture:

■ Maximizes WAN bandwidth utilization

■ Protects applications from performance degradation

■ Uses passive monitoring to track application performance across the WAN

■ Enables the Internet as a viable WAN transport

■ Provides multisite coordination to simplify network-wide provisioning

■ Provides an application-based policy-driven framework that is tightly integrated with existing Performance Monitor components

■ Provides a smart and scalable multisite solution to enforce application SLAs while optimizing network resource utilization

PfRv3 is the third-generation multisite-aware bandwidth and path control/optimization solution for WAN- and cloud-based applications and is available on the Cisco Integrated Services Router (ISR) Generation 2, ISR 4000 Series, CSR 1000V, and ASR 1000 Series routers.

Further Reading

Cisco. “Performance Routing Version 3.” www.cisco.com.

Cisco. “PfRv3 Transit Site Support.” www.cisco.com.
