Chapter 6. Designing Fault-Tolerant IPSec VPNs

Chapter 2, “IPSec Overview,” and Chapter 3, “Enhanced IPSec Features,” presented the fundamental concepts of IPSec. Chapter 5, “IPSec VPN Architectures,” covered IPSec VPN architectural models at a conceptual level. In the next few chapters, you will focus on the design aspects of IPSec and begin to apply mechanisms and protocols you have learned in building real-life IPSec VPNs. In this chapter, you will explore mechanisms and architectures for fault-tolerant IPSec VPN design.

Link Fault Tolerance

Because VPN data networks are a critical element of the overall business process, you must ensure that the VPN provides a reliable service to users and their applications. This section focuses on designing fault-tolerant networks. A fault-tolerant VPN is a network that is resilient to changes in the routing paths that may be due to hardware, software, or path failures between the VPN ingress and egress points, including access to the VPN.

One of the primary rules of fault-tolerant network design is that there is no such thing as a cookie-cutter design that can be applied to all networks. You can, however, focus on VPN fault-tolerant design principles, which are dictated by the goals and objectives of the network. In many cases, the design might be more driven by economic factors than technical reasoning. Similarly, the design of fault-tolerant IPSec VPN networks depends on what faults the VPN needs to be able to withstand. Let’s start by looking at the types of faults that can happen in an IPSec VPN, and look at how to design a VPN that is tolerant to these failures.

You know from previous chapters that IPSec uses a peer-to-peer model that assumes the peers are reachable via an IP path provided by the network connecting the two peers. Figure 6-1 shows a basic IPSec VPN connecting two sites.

Figure 6-1. IPSec Peer Relationship

The model in the figure helps identify which components of the IPSec VPN are susceptible to failure, so that you can examine mechanisms for designing fault-tolerant VPNs. From a fault-tolerance perspective, the IPSec VPN can be broken down into the following components:

  • The backbone IP network, connecting the sites of the VPN

  • Access link—The link that connects the IPSec gateway to the IP backbone

  • The IPSec gateway itself

Backbone Network Fault Tolerance

The backbone network, connecting the sites of an IPSec VPN, can be the public Internet, a private Layer 2 network, or a single service provider IP network. This network may be owned and operated by an organization other than the owner of the IPSec VPN. It is usually built to be fault tolerant to link and IP routing failures within the network. IPSec protocols simply use this backbone for transport and inherently use the IP packet-routing functions provided by this network. In many cases, the IPSec VPN designers have no control over the backbone IP fault tolerance capabilities. We will focus our attention on those components where we can affect fault tolerance: the VPN gateways.

Note

Because an entire book can be written on the fault-tolerant design principles of this backbone and it is not the subject of this book, the design rules and mechanisms to be used in the backbone are not covered here. If you are interested in learning more about this topic, you may wish to reference books published on this subject, including Fault-Tolerant IP and MPLS Networks, by Iftekhar Hussain (Cisco Press, 2004).

Access Link Fault Tolerance

Figure 6-1 showed the access link terminating directly on the IPSec gateway. Note that this is a conceptual representation, and it may not look like this in a real network. Figure 6-2 shows one common physical representation of a site.

Figure 6-2. Typical IPSec VPN Site Connectivity

In this figure, the access link from the backbone terminates on an IP gateway (INET-GW1-EAST) router that connects the site to the backbone. The IPSec gateway (VPN-GW1-EAST) that terminates IPSec from the remote site is a physically separate entity that is responsible for the IPSec functions. In this model, the INET-GW1-EAST router is the general-purpose gateway router to the site; connectivity to this site is obviously broken if the access link between the INET-GW router and the backbone goes down. One simple mechanism to protect from this type of failure is adding a second access link from the INET-GW to the backbone. In addition to this simple scheme, there are many more ways to make this design fault tolerant to link and node failures of the INET-GW. This is a generic IP network design issue—as IPSec designers, we won’t dwell on this element. The INET-GW, just like the backbone network, simply provides IP transport to the IPSec gateway. For all discussions in this chapter, we will use the conceptual representation of the IPSec VPN shown in Figure 6-1.

So far, we have discussed what we will not cover in this chapter; now, let’s explore the components that we will address with respect to IPSec fault-tolerant design. From an IPSec point of view, the chapter covers two points of failure that must be addressed to provide a fault-tolerant IPSec VPN site. The first point of failure is the access link that terminates on the IPSec gateway (VPN-GW1-WEST). The second point of failure is the gateway itself; the chapter addresses VPN-GW1-WEST gateway failure in the peer redundancy section. This section focuses on designing the site to withstand access link failures.

The most obvious solution to the access link failure is shown in Figure 6-3. Adding a second link to terminate on VPN-GW1-WEST and enabling both links for IPSec certainly improves the fault tolerance.

Figure 6-3. Redundant IPSec Access Links

Although the solution is, conceptually, quite simple, there are some interesting “twists” in the IPSec forwarding and control plane that need to be considered to make this work. IPSec redundancy in the forwarding plane will be achieved only if the same IPSec policies are applied to both links. The control plane (IKE) is a bit more complicated because the application of IPSec on the redundant links creates two possible IKE identities for VPN-GW1-WEST. Recall from Chapter 2, “IPSec Overview,” that the IKE process establishes IPSec connectivity and validates the peer based on the identity provided during the initial IKE exchange. Therefore, the selection of an IKE identity is critical in building an IKE relationship with a peer. The designer may choose to create either multiple IKE identities or a single IKE identity. In the next sections, we explore the implications of choosing either of these identity models.

Multiple IKE Identities

You know, from previous chapters, that the IKE identity of an initiator is derived from the source IP address in the initial IKE message. The initiator knows the IKE identity of its peer from the “set peer” configuration. In Figure 6-3, the VPN-GW1-WEST has two access links enabled for IPSec, and either of these access link IP addresses can be configured on SPOKE-1-WEST as the IKE identity of the VPN-GW1-WEST using the set peer command. In the multiple IKE identity model, both access link IP addresses are configured on SPOKE-1-WEST as IKE identities of VPN-GW1-WEST. When SPOKE-1-WEST initiates IKE negotiation, the first peer IP address is used by IKE and becomes VPN-GW1-WEST’s IKE identity for this peer. If this IKE SA times out during the negotiation, the second IP address becomes the IKE identity of the VPN-GW1-WEST.

Example 6-1 shows the configuration of the IPSec peers in Figure 6-3.

Example 6-1. IPSec Peer with Multiple IKE Identities

VPN-GW1-WEST
crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.130
crypto isakmp keepalive 60
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.130
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface Serial1/0:0
 ip address 9.1.1.22 255.255.255.252
 crypto map vpn
!
interface Serial1/1:0
 ip address 9.1.1.26 255.255.255.252
 crypto map vpn
!
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.0.0 0.255.255.255 10.0.64.0 0.0.0.255
__________________________________________________________
SPOKE-1-WEST
crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.22
crypto isakmp key cisco address 9.1.1.26
crypto isakmp keepalive 60
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.26
 set peer 9.1.1.22
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface Ethernet0
 ip address 10.0.64.1 255.255.255.0
!
interface Serial0
 ip address 9.1.1.130 255.255.255.252
 crypto map vpn
!
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.64.0 0.0.0.255 10.0.0.0 0.255.255.255

Note the two set peer statements on SPOKE-1-WEST’s crypto map configuration. The order of the set peer statements on the spoke is important, because it is the order of the statements that determines which IKE identity will be used by the spoke for the first IKE initialization. Step through the process of the peer establishment and see how IKE SAs and IPSec SAs are set up in this model; for this example, assume the IKE is initiated by SPOKE-1-WEST to VPN-GW1-WEST.

  1. Interesting traffic that matches the IPSec policy on SPOKE-1-WEST triggers an IKE connection toward VPN-GW1-WEST.

  2. The IKE message in step 1 uses the IP address from the first set peer statement on SPOKE-1-WEST (9.1.1.26) as the IKE identity of VPN-GW1-WEST. The IKE message is destined to this IP address with a source address of the serial0 interface IP address (9.1.1.130).

  3. The IKE packet is received by VPN-GW1-WEST and passed on for IKE processing.

  4. The IKE response packet is built by VPN-GW1-WEST and is sent to the source IP address of the original IKE message (the Serial0 IP address of SPOKE-1-WEST).

  5. The IKE response is received by the spoke and processed.

Recall from Chapter 2, “IPSec Overview,” that this completes phase 1 of IKE and creates the IKE SAs on both peers. IKE phase 2 follows, and IPSec SAs are built on both peers. So far, so good! Next, you’ll see how IP routing configuration on VPN-GW1-WEST can add some interesting twists in the data plane.

It’s possible that IP routing configuration on VPN-GW1-WEST is such that the access link used to send packets to SPOKE-1-WEST is not the same as the access link on which packets arrive from SPOKE-1-WEST to VPN-GW1-WEST. This is known as asymmetric routing, and it is not uncommon in a multi-path configuration. Let’s see what effect asymmetric routing has on IPSec.

IKE phase 1 is unaffected by asymmetric routing. IKE phase 2 establishes the IPSec SAs between peers and binds them to the interface that matches the IKE identity. As an example, assume that SPOKE-1-WEST initiates IKE to VPN-GW1-WEST’s serial1/1:0 IP address (9.1.1.26). Example 6-2 shows how the IKE and IPSec SAs look on VPN-GW1-WEST after IKE negotiation is complete between the two peers.

Example 6-2. IKE and IPSec SA State for Distinct IKE Identities

VPN-GW1-WEST IKE Status
Vpn-gw1-west# show crypto isakmp sa
dst             src             state            conn-id     slot
9.1.1.130       9.1.1.26        QM_IDLE                2        0

VPN-GW1-WEST IPSec Status
Vpn-gw1-west# show crypto ipsec sa

interface: Serial1/0:0
    Crypto map tag: vpn, local addr. 9.1.1.22

   local  ident (addr/mask/prot/port): (10.0.0.0/255.0.0.0/0/0)
   remote ident (addr/mask/prot/port): (10.0.64.0/255.255.255.0/0/0)
   current_peer: 9.1.1.130
     PERMIT, flags={origin_is_acl,}
    #pkts encaps: 0, #pkts encrypt: 0, #pkts digest 0
    #pkts decaps: 0, #pkts decrypt: 0, #pkts verify 0
    #pkts compressed: 0, #pkts decompressed: 0
    #pkts not compressed: 0, #pkts compr. failed: 0, #pkts decompress failed: 0
    #send errors 0, #recv errors 0

     local crypto endpt.: 9.1.1.22, remote crypto endpt.: 9.1.1.130
     path mtu 1500, media mtu 1500
     current outbound spi: 0

     inbound esp sas:

     inbound ah sas:

     inbound pcp sas:

     outbound esp sas:

     outbound ah sas:

     outbound pcp sas:


interface: Serial1/1:0
    Crypto map tag: vpn, local addr. 9.1.1.26

   local ident (addr/mask/prot/port): (10.0.0.0/255.0.0.0/0/0)
   remote ident (addr/mask/prot/port): (10.0.64.0/255.255.255.0/0/0)
   current_peer: 9.1.1.130
     PERMIT, flags={origin_is_acl,}
    #pkts encaps: 4, #pkts encrypt: 4, #pkts digest 4
    #pkts decaps: 4, #pkts decrypt: 4, #pkts verify 4
    #pkts compressed: 0, #pkts decompressed: 0
    #pkts not compressed: 0, #pkts compr. failed: 0, #pkts decompress failed: 0
    #send errors 6, #recv errors 0

     local crypto endpt.: 9.1.1.26, remote crypto endpt.: 9.1.1.130
     path mtu 1500, media mtu 1500
     current outbound spi: 8711A91

     inbound esp sas:
      spi: 0x4DC4EC6(81546950)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2029, flow_id: 1, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607999/3117)
        IV size: 8 bytes
        replay detection support: Y

     inbound ah sas:

     inbound pcp sas:
     outbound esp sas:
      spi: 0x8711A91(141630097)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2030, flow_id: 2, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607999/3108)
        IV size: 8 bytes
        replay detection support: Y

     outbound ah sas:

     outbound pcp sas:

You see that the IPSec SA on VPN-GW1-WEST is bound to the Serial1/1:0 interface. You can derive this from the output because both the inbound and outbound SAs exist only on Serial1/1:0.

Now, see what happens in the forwarding plane. Remember, IP routing is asymmetric here, which means packets from SPOKE-1-WEST to VPN-GW1-WEST arrive on Serial1/1:0, but packets from VPN-GW1-WEST to SPOKE-1-WEST leave through Serial1/0:0. You know that the IPSec SAs on VPN-GW1-WEST are bound to Serial1/1:0; therefore, IPSec traffic traveling inbound to the gateway will be fine, but you have a problem for outbound traffic from VPN-GW1-WEST to SPOKE-1-WEST: There is no IPSec SA on Serial1/0:0. Actually, things are not all that bad because there is a crypto map on Serial1/0:0 that has the same policy as the one on Serial1/1:0. Recall from Chapter 2, “IPSec Overview,” that interesting traffic out of an interface will trigger IKE negotiation. As a result of this IKE negotiation, a new set of IKE and IPSec SAs is initiated from VPN-GW1-WEST to SPOKE-1-WEST.

The spoke processes this new IKE SA and replaces IPSec SAs for this crypto map. This means that the spoke may have multiple IKE identities for the same VPN-GW. All this extra IKE and IPSec SA creation obviously means more memory usage on the peers, which is not desirable. Figure 6-4 illustrates this scenario.

Figure 6-4. Asymmetric IKE Peering due to Fault-Tolerant Configuration

Even if there is no asymmetric routing and the primary link on the VPN-GW fails (assuming the remote peer detects three missing IKE keepalive messages), the remote peer initiates another IKE and IPSec SA with the VPN-GW, and the original IKE and IPSec SA associated with Serial 1/1:0 still remain in the VPN-GW1-WEST SADB until they time out. Therefore, this transient memory usage must be taken into account even without asymmetric routing in the multiple IKE identity model.
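The per-interface binding described in this section can be sketched as a toy model. The following Python is an illustrative simplification, not IOS code: the Gateway class, its method names, and the dictionary keyed by interface and peer are all assumptions made for the example. It shows why return traffic routed over Serial1/0:0 finds no SA when the SAs were negotiated on Serial1/1:0, and how the second negotiation leaves two sets of SAs in memory.

```python
# Toy model of per-interface SA binding in the multiple IKE identity model.
# Class and method names are hypothetical; this is not how IOS is implemented.

class Gateway:
    def __init__(self, interfaces):
        # interface name -> local IP address used as the IKE identity
        self.interfaces = dict(interfaces)
        # (interface, peer) -> SAs bound to that interface
        self.sadb = {}
        self.negotiations = 0

    def negotiate(self, interface, peer):
        """Record IKE and IPSec SAs bound to the given interface."""
        self.negotiations += 1
        self.sadb[(interface, peer)] = {
            "local_identity": self.interfaces[interface],
            "peer": peer,
        }

    def send(self, egress_interface, peer):
        """Send protected traffic, negotiating first if the egress interface has no SA."""
        if (egress_interface, peer) not in self.sadb:
            self.negotiate(egress_interface, peer)
            return "negotiated new SAs"
        return "used existing SAs"


gw = Gateway({"Serial1/0:0": "9.1.1.22", "Serial1/1:0": "9.1.1.26"})
spoke = "9.1.1.130"

# The spoke initiates IKE to 9.1.1.26, so the SAs are bound to Serial1/1:0.
gw.negotiate("Serial1/1:0", spoke)

# Routing sends return traffic out Serial1/0:0 (asymmetric routing).
first = gw.send("Serial1/0:0", spoke)
second = gw.send("Serial1/0:0", spoke)

print(first)         # negotiated new SAs
print(second)        # used existing SAs
print(len(gw.sadb))  # 2
```

The two entries left in the model's table correspond to the extra memory usage discussed above; the model does not age out stale SAs, which a real gateway does when they time out.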

Multiple IKE Identities Associated with Dial Backup

Dial backup solutions introduce a slight modification to the previous scenario. Note in Figure 6-3 that the access link on the VPN-GW is protected, whereas the access link on the spoke is not protected. Dial backup solutions are a special case of multiple IKE identities in which both ends have unique IKE identities. Of course, the underlying premise is that the spoke can detect a failure in the primary path to the VPN-GW. There are many ways to do this (for example, use dialer watch on the spoke to detect local interface failure or watch for route removal due to primary path failure). Figure 6-6 shows the topology in which dial backup is used to provide access link redundancy.

Note

Using dialer watch to monitor the availability of a local interface does not allow the spoke to detect the failure of the access link at the VPN-GW. If the access link fails, the spoke will persistently retry an IPSec connection via the backbone. Various methods exist for validating the primary path through the backbone. Examples include using BGP between the hub and spoke, building GRE over IPSec with an IGP routing protocol, or using an IPSec-aware probe on the spoke to monitor the viability of the backbone path. Upon detection of backbone path failure, the spoke transitions to the dial backup interface.

The important thing to remember is that the failure of the primary path causes the spoke to invoke a temporary backup connection via the PSTN (Public Switched Telephone Network) to the VPN-GW. Note from Figure 6-6 that the spoke and the VPN-GW have unique IP addresses on these dialer interfaces; hence, there are multiple IKE identities on both the spoke and VPN-GW.

Figure 6-5. Sequence of IKE Initialization for Asymmetric Routing

Figure 6-6. Multiple IKE Identities Using Dial Backup

One of the more important abilities when designing access link redundancy using dial backup is knowing when to tear down the backup link. IPSec will attempt to keep the dial backup link active with IKE keepalives (or the equivalent if GRE over IPSec is used over the dial backup interface). Because the cost of dial backup services can be quite high, the designer should be very careful to design a failback solution to avoid using the dial backup path more than necessary. Again, the same mechanisms used for primary path failure detection may be used to reset the conditions at the spoke such that the dial backup interface is no longer preferred.
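The failover and failback decision can be sketched as a small state machine. This Python is a hedged illustration only: the DialBackupController class, the probe-result input, and both thresholds are assumptions chosen for the example, not values taken from any Cisco feature. The idea it shows is that the same primary-path check drives both directions, with a longer run of good results required before the costly backup link is released.

```python
# Sketch of a failover/failback decision for a dial backup link.
# The thresholds and the probe-result input are assumptions for illustration.

class DialBackupController:
    def __init__(self, fail_threshold=3, recover_threshold=5):
        self.fail_threshold = fail_threshold        # consecutive failed probes before dialing
        self.recover_threshold = recover_threshold  # consecutive good probes before hanging up
        self.backup_active = False
        self._failures = 0
        self._successes = 0

    def observe(self, primary_path_ok):
        """Feed one probe result for the primary path; return True if backup is active."""
        if primary_path_ok:
            self._successes += 1
            self._failures = 0
        else:
            self._failures += 1
            self._successes = 0

        if not self.backup_active and self._failures >= self.fail_threshold:
            self.backup_active = True
        elif self.backup_active and self._successes >= self.recover_threshold:
            self.backup_active = False
        return self.backup_active


ctl = DialBackupController()
probes = [True, False, False, False, True, True, False, True, True, True, True, True]
states = [ctl.observe(p) for p in probes]
print(states)
```

In this run the backup link comes up after the third consecutive failed probe and is released only after five consecutive good probes, so a brief recovery of the primary path does not cause the link to flap.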

Single IKE Identity

As the name suggests, in this model the VPN-GW is identified by all of its peers with a single IKE identity, which is achieved by using a loopback address on the VPN-GW as the source IP address for IKE exchange messages.

Note

Any interface with an IPv4 address may be used as an IKE identity. The loopback interface is commonly chosen because it is always in an active state, whereas other interfaces are dependent upon the state of the media connection.

Figure 6-7 demonstrates the topology with the single IKE identity defined by a loopback interface.

Figure 6-7. IPSec Gateway with a Single IKE Identity

Figure 6-7 shows that a single IKE identity associated with the loopback interface (9.1.1.49) may be used for an IKE security association over either interface 9.1.1.22 or 9.1.1.26. Although a single IKE SA is shared by the interfaces, unique IPSec SAs must be established for the respective interfaces.

The biggest advantages of this scheme are:

  • Establishment of a single IKE connection with a given peer saves resources on the VPN-GW and makes troubleshooting easier. A common IKE SA will exist between the peers regardless of which interface is used. This minimizes instability during failover periods by reusing the existing IKE SA.

  • Decoupling of the IKE connection from the state of the access link that terminates IPSec expedites the transition of the IPSec SA to an alternate interface. The IKE connection may be tied to the loopback interface whose state never goes down. In that case, IKE phase 1 does not need to be reestablished during failover periods.

Example 6-3 shows the hub VPN-GW1-WEST and the remote peer SPOKE-1-WEST. The configuration is essentially the same as the multiple IKE model except for the use of the loopback address.

Example 6-3. IPSec Gateway with a Single IKE Identity

VPN-GW1-WEST
crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.130
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn local-address Loopback0
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.130
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface Loopback0
 ip address 9.1.1.49 255.255.255.252
!
interface Ethernet4/0
 ip address 10.1.0.1 255.255.255.0
!
interface Serial1/0:0
 ip address 9.1.1.22 255.255.255.252
 crypto map vpn
!
interface Serial1/1:0
 ip address 9.1.1.26 255.255.255.252
 crypto map vpn
!
router bgp 100
 no synchronization
 network 9.1.1.48 mask 255.255.255.252
 neighbor 9.1.1.21 remote-as 50
 neighbor 9.1.1.25 remote-as 50
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.0.0 0.255.255.255 10.0.64.0 0.0.0.255
__________________________________________________________
SPOKE-1-WEST
crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.49
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.49
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface Ethernet0
 ip address 10.0.64.1 255.255.255.0
!
interface Serial0
 ip address 9.1.1.130 255.255.255.252
 crypto map vpn
!
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.64.0 0.0.0.255 10.0.0.0 0.255.255.255

In this configuration, both access links advertise IP reachability of the loopback interface on the VPN-GW. In the event of failure of one of the access links, the loopback is still reachable from VPN-GW’s peers. The redundant link will maintain the established IKE and IPSec security associations.

On the VPN-GW, both access links have the same crypto map configuration as in the multiple IKE model. Cisco IOS IPSec implementation causes an interesting and useful effect in this model. IOS mirrors all the IPSec SAs on both the access links, even if they were originally set up on one of the access links. In the given example, if IP routing dictates that the path to and from a remote peer to the VPN-GW’s loopback is via Serial 1/1:0, the IPSec SAs that are created will be replicated on all access links where the crypto map is applied. In the multiple IKE model, the IKE and IPSec SAs are associated only on the interface where IKE packets arrived in the VPN-GW from the spoke. In the scenario of a failed primary access link, IPSec SAs need not be negotiated with the remote peer; they pre-exist. The IKE SA in this model is also not renegotiated if one of the access links goes down because the loopback (IKE identity of the VPN-GW) never goes down.

Example 6-4 shows the IKE and IPSec SAs on VPN-GW1-WEST. The inbound and outbound security parameter indexes (SPIs) of both serial interfaces are identical. This demonstrates that encrypted packets arriving on either interface may be decrypted with the existing IPSec SA rather than requiring the establishment of a new IPSec SA on the alternate interface.

Example 6-4. VPN-GW1-WEST Crypto State

Vpn-gw1-west# show crypto ipsec sa map vpn

interface: Serial1/0:0
    Crypto map tag: vpn, local addr. 9.1.1.49
   local  ident (addr/mask/prot/port): (10.0.0.0/255.0.0.0/0/0)
   remote ident (addr/mask/prot/port): (10.0.64.0/255.255.255.0/0/0)
   current_peer: 9.1.1.130
     PERMIT, flags={origin_is_acl,}
    #pkts encaps: 7, #pkts encrypt: 7, #pkts digest 7
    #pkts decaps: 7, #pkts decrypt: 7, #pkts verify 7
    #pkts compressed: 0, #pkts decompressed: 0
    #pkts not compressed: 0, #pkts compr. failed: 0, #pkts decompress failed: 0
    #send errors 0, #recv errors 0

     local crypto endpt.: 9.1.1.49, remote crypto endpt.: 9.1.1.130
     path mtu 1500, media mtu 1500
     current outbound spi: 178108C4

     inbound esp sas:
      spi: 0x9393CB47(2475936583)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2029, flow_id: 1, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607997/3105)
        IV size: 8 bytes
        replay detection support: Y

     inbound ah sas:

     inbound pcp sas:

     outbound esp sas:
      spi: 0x178108C4(394332356)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2030, flow_id: 2, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607998/3096)
        IV size: 8 bytes
        replay detection support: Y

     outbound ah sas:

     outbound pcp sas:


interface: Serial1/1:0
    Crypto map tag: vpn, local addr. 9.1.1.49

   local  ident (addr/mask/prot/port): (10.0.0.0/255.0.0.0/0/0)
   remote ident (addr/mask/prot/port): (10.0.64.0/255.255.255.0/0/0)
   current_peer: 9.1.1.130
     PERMIT, flags={origin_is_acl,}
    #pkts encaps: 7, #pkts encrypt: 7, #pkts digest 7
    #pkts decaps: 7, #pkts decrypt: 7, #pkts verify 7
    #pkts compressed: 0, #pkts decompressed: 0
    #pkts not compressed: 0, #pkts compr. failed: 0, #pkts decompress failed: 0
    #send errors 0, #recv errors 0

     local crypto endpt.: 9.1.1.49, remote crypto endpt.: 9.1.1.130
     path mtu 1500, media mtu 1500
     current outbound spi: 178108C4

     inbound esp sas:
      spi: 0x9393CB47(2475936583)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2029, flow_id: 1, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607997/3096)
        IV size: 8 bytes
        replay detection support: Y

     inbound ah sas:

     inbound pcp sas:

     outbound esp sas:
      spi: 0x178108C4(394332356)
        transform: esp-des esp-md5-hmac ,
        in use settings ={Tunnel, }
        slot: 0, conn id: 2030, flow_id: 2, crypto map: vpn
        sa timing: remaining key lifetime (k/sec): (4607998/3096)
        IV size: 8 bytes
        replay detection support: Y

     outbound ah sas:

     outbound pcp sas:

The IPSec SA is replicated on all of the interfaces where the crypto map is applied and is associated with the single IKE SA. In effect, the connection state is shared across all the interfaces. As a result, memory and processing resources are conserved. In addition, the end-to-end traffic flow may be restored as fast as the routing protocols converge between the two IPSec peers. Once routing converges between the two peers, the IPSec flow, which uses the IKE identity addresses as the tunnel endpoint addresses in the IPSec headers, is also restored.
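If you want to confirm this replication from captured output rather than by eye, a short script can compare the ESP SPIs listed under each interface. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, the parser handles only the output layout shown in Example 6-2 and Example 6-4, and the sample strings are abbreviated copies of those examples.

```python
import re

def esp_spis_by_interface(show_output):
    """Return {interface: {"inbound": [...], "outbound": [...]}} of ESP SPIs."""
    result = {}
    interface = None
    direction = None
    for raw in show_output.splitlines():
        line = raw.strip()
        if line.startswith("interface:"):
            interface = line.split(":", 1)[1].strip()
            result[interface] = {"inbound": [], "outbound": []}
            direction = None
        elif line.endswith("sas:"):
            # Track only the ESP sections; AH and PCP sections reset the state.
            words = line.split()
            is_esp = len(words) == 3 and words[1] == "esp"
            direction = words[0] if is_esp else None
        elif line.startswith("spi:") and interface and direction:
            match = re.match(r"spi:\s*(0x[0-9A-Fa-f]+)", line)
            if match:
                result[interface][direction].append(match.group(1))
    return result

def sas_are_mirrored(spis):
    """True if every interface lists the same non-empty ESP SPIs."""
    entries = list(spis.values())
    if not entries or not entries[0]["inbound"] or not entries[0]["outbound"]:
        return False
    return all(entry == entries[0] for entry in entries)

# Abbreviated from Example 6-4 (single IKE identity).
SINGLE_IDENTITY = """
interface: Serial1/0:0
     inbound esp sas:
      spi: 0x9393CB47(2475936583)
     inbound ah sas:
     outbound esp sas:
      spi: 0x178108C4(394332356)
interface: Serial1/1:0
     inbound esp sas:
      spi: 0x9393CB47(2475936583)
     outbound esp sas:
      spi: 0x178108C4(394332356)
"""

# Abbreviated from Example 6-2 (multiple IKE identities).
MULTIPLE_IDENTITIES = """
interface: Serial1/0:0
     inbound esp sas:
     outbound esp sas:
interface: Serial1/1:0
     inbound esp sas:
      spi: 0x4DC4EC6(81546950)
     outbound esp sas:
      spi: 0x8711A91(141630097)
"""

print(sas_are_mirrored(esp_spis_by_interface(SINGLE_IDENTITY)))      # True
print(sas_are_mirrored(esp_spis_by_interface(MULTIPLE_IDENTITIES)))  # False
```

The check is deliberately strict: it reports mirroring only when every interface carries the same inbound and outbound SPIs, which is the condition that lets traffic move to the alternate link without a new negotiation.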

Caution

If the routing protocol convergence takes longer than the IKE dead peer detection, then the IKE SAs expire and the memory resources allocated to the IPSec SA are removed. Care must be taken to ensure that routing transients don’t induce repeated disruptions of the IPSec processes. You accomplish this by tuning the routing protocols between the IPSec peers so they converge faster than IKE detects a dead peer or by extending the dead peer detection interval of IKE such that the routing protocol can converge first.
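The timing relationship in this caution reduces to simple arithmetic. The sketch below assumes, as stated earlier in the chapter, that a peer is declared dead after three missed IKE keepalives; the function names and the 45-second convergence time are hypothetical, and real implementations add retry timers that this simplification ignores.

```python
# Simplified timing check: does routing converge before IKE declares the peer dead?
# Assumes a peer is declared dead after three missed keepalives.

def dead_peer_detection_seconds(keepalive_interval, missed_keepalives=3):
    """Approximate time for IKE to declare a peer dead."""
    return keepalive_interval * missed_keepalives

def sas_survive_reconvergence(convergence_seconds, keepalive_interval, missed_keepalives=3):
    """True if routing converges before the IKE SAs would be torn down."""
    detection = dead_peer_detection_seconds(keepalive_interval, missed_keepalives)
    return convergence_seconds < detection

# Hypothetical 45-second routing convergence against two keepalive settings.
print(sas_survive_reconvergence(45, 60))  # True: 45 s is less than 180 s
print(sas_survive_reconvergence(45, 10))  # False: 45 s is not less than 30 s
```

With the 60-second keepalive from Example 6-1 the SAs outlast the assumed convergence time; with a 10-second keepalive they do not, so either the routing protocol must converge faster or the keepalive interval must be lengthened.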

Single IKE Identity Using Multi-link PPP on the Access Links

A variation of the single IKE identity model can be implemented if the access links that terminate on the IPSec gateway are members of a multi-link PPP bundle (MLPPP). MLPPP allows multiple PPP links terminating on a router to appear as a single logical access link from a Layer 3 perspective; the multiple links may be used to provide link redundancy. Although the primary function of MLPPP is to provide load balancing across the multiple PPP links, a secondary benefit is the resiliency created by the additional links, which can be exploited. When MLPPP is used, individual links in the MLPPP bundle may become active or inactive without changing the state of the MLPPP interface. With IPSec associated with the MLPPP interface as opposed to the individual interfaces in the MLPPP bundle, the IPSec state is decoupled from the state of the individual links in the MLPPP bundle. The use of MLPPP is shown in Figure 6-8.

Figure 6-8. IPSec Associated with an MLPPP Interface

In this configuration, the gateway is identified to its peers by a single IP address (the address configured on the multi-link interface) similar to the loopback interface in the single IKE identity model. Example 6-5 shows the configuration of the VPN-GW shown in Figure 6-8.

Example 6-5. VPN-GW1-WEST Multi-link PPP Bundle

interface Multilink1
 ip address 9.1.1.22 255.255.255.252
 ppp multilink
 no ppp multilink fragmentation
 multilink-group 1
 crypto map vpn
!
interface Serial1/0:0
 encapsulation ppp
 no fair-queue
 ppp multilink
 multilink-group 1
!
interface Serial1/1:0
 encapsulation ppp
 no fair-queue
 ppp multilink
 multilink-group 1
!

The loss of a single PPP link in a multi-link bundle on the VPN-GW does not affect end-to-end IP connectivity from the remote peer. The IPSec peer relationship between the remote peer (SPOKE-1-WEST) and the hub (VPN-GW1-WEST) remains active and, more importantly, the IKE and IPSec SAs are not renegotiated. Note also that the IPSec configuration is simplified because the same crypto map is applicable to all the physical and virtual interfaces associated with the link bundle.
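The decoupling of IPSec state from member link state can be pictured with a very small model. In this Python sketch the MultilinkBundle class and its method names are hypothetical, and the rule that the bundle stays up while any member is up is a simplification that ignores any minimum-links policy a platform might enforce.

```python
# Toy model: the logical multilink interface stays up while any member link is up.
# Names are hypothetical; minimum-links policies are ignored.

class MultilinkBundle:
    def __init__(self, member_links):
        self.member_up = {name: True for name in member_links}

    def set_member(self, name, up):
        """Record a state change on one physical member link."""
        self.member_up[name] = up

    @property
    def is_up(self):
        """The crypto map follows this logical state, not the member states."""
        return any(self.member_up.values())


bundle = MultilinkBundle(["Serial1/0:0", "Serial1/1:0"])

bundle.set_member("Serial1/0:0", False)
after_one_failure = bundle.is_up

bundle.set_member("Serial1/1:0", False)
after_both_failures = bundle.is_up

print(after_one_failure)    # True
print(after_both_failures)  # False
```

Because the crypto map is applied to the logical interface, only the second transition in this model would disturb the IKE and IPSec SAs.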

Access Link Fault Tolerance Summary

Our focus on fault-tolerant link access models has addressed one of the most common failure elements in a VPN: the access link. A variety of methods has been presented to support recovery from an access link failure. As each method was presented, the level of resiliency increased because of the inclusion of additional redundancy elements. Adding redundant elements forces the designer to accommodate more complex failure modes, which, in turn, forces a compromise between scale, convergence speed, simplicity, and performance. We briefly described scalability constraints such as duplication of IKE and IPSec SAs; convergence depends on the link-state management mechanisms in use, such as keepalive messages, link signaling, and resilient routing protocols. Although peer-to-peer recovery mechanisms provide resiliency to link failures, these mechanisms do not address the failure of a peer or the state of prefix reachability beyond a remote peer. The next section explores peer redundancy scenarios that are applicable to IPSec VPNs. We will address the reachability of prefixes beyond a remote peer in subsequent sections.

IPSec Peer Redundancy

In the “Access Link Fault Tolerance” section of this chapter, we discussed fault tolerance to access link failures. Access link redundancy is sufficient as long as the IPSec peer itself does not fail. In this section, you look into fault-tolerance mechanisms to recover from IPSec node failures. Node failures can be unintentionally induced due to hardware or software issues, or intentionally induced for software upgrades or maintenance. Whatever the reason, IPSec nodal failure affects the connectivity to IPSec peers connected to the affected node. To preserve the integrity of the VPN connections, various peer redundancy models are presented and each model’s advantages and disadvantages are highlighted. The review begins with the simple peer redundancy model.

Simple Peer Redundancy Model

The simplest technique to protect against IPSec node failure is adding a second node. Figure 6-9 shows this model.

Figure 6-9. Redundant IPSec Peer Gateways

The peer redundancy model depicted in Figure 6-9 shows two VPN-GW peers for the spoke SPOKE-2-WEST. At any given time, only one VPN-GW is active; the other is in standby mode. The VPN gateway that is currently supporting the IPSec processing is active. The standby gateway monitors the active gateway for proper operation; therefore, it has no IPSec state associated with its interfaces. As in the multiple IKE identities model, the order of the set peer configuration statements on the spoke determines the preferred VPN-GW peer. The SPOKE-2-WEST router will first attempt to build an IPSec SA to the VPN-GW1-WEST. Should the spoke’s connection attempt fail on this path, the SPOKE-2-WEST router attempts to build an IPSec SA using the same policies to VPN-GW2-WEST.

Note

A router with multiple set peer statements caches the peer address that successfully completed the last IPSec connection. Subsequent IPSec connections initiated from this peer use the cached peer as the first attempted peer regardless of the order of set peer statements.
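The caching behavior described in this note can be pictured with a short Python sketch. It is a simplified model rather than the IOS implementation: the class name PeerSelector and the is_reachable callback are hypothetical, and only the two peer addresses are taken from Example 6-6.

```python
from typing import Callable, List, Optional


class PeerSelector:
    """Ordered peer list with a 'last good peer' cache (simplified model)."""

    def __init__(self, peers: List[str]) -> None:
        self.peers = list(peers)            # order of the set peer statements
        self.cached: Optional[str] = None   # peer that completed the last connection

    def attempt_order(self) -> List[str]:
        """Return the order in which peers would be tried."""
        if self.cached is None:
            return list(self.peers)
        # The cached peer is tried first, then the rest in configured order.
        return [self.cached] + [p for p in self.peers if p != self.cached]

    def connect(self, is_reachable: Callable[[str], bool]) -> Optional[str]:
        """Try peers in order; remember and return the first that answers."""
        for peer in self.attempt_order():
            if is_reachable(peer):
                self.cached = peer
                return peer
        return None


selector = PeerSelector(["9.1.1.22", "9.1.1.10"])
first = selector.connect(lambda p: True)               # both gateways up
second = selector.connect(lambda p: p == "9.1.1.10")   # first gateway down
third = selector.connect(lambda p: True)               # first gateway back up
print(first, second, third)
```

In this model the third connection stays on 9.1.1.10 even though 9.1.1.22 is listed first, which is the behavior the note describes.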

Just as in the access link redundancy model, for this model to work, the IPSec policies and transforms applied using the crypto-map configuration should match on both VPN-GW peers. Example 6-6 shows the configuration of the spoke.

Example 6-6. SPOKE-2-WEST Ordered Set of Peers for Hub Site Peer Redundancy

crypto isakmp key cisco address 9.1.1.22
crypto isakmp key cisco address 9.1.1.10
crypto isakmp keepalive 10
!
crypto ipsec transform-set esp-tunnel-internet esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.22
 set peer 9.1.1.10
 set transform-set esp-tunnel-internet
 match address esp-tunnel-list
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.65.0 0.0.0.255 10.0.0.0 0.255.255.255
!

Because either VPN-GW may serve as the active gateway, an interesting problem must be resolved. Traffic returning to the spoke from the HQ-WEST router must be sent to the active VPN-GW; otherwise, asymmetric routing may occur. Let’s review the issues asymmetric IP routing may cause in this model. Asymmetric routing occurs when reachability for prefixes behind the VPN-GWs from the spoke is via VPN-GW1-WEST and reachability for prefixes behind the spoke is via VPN-GW2-WEST. The configuration of the spoke, as shown in the example, will cause IKE to be initiated to the VPN-GW1-WEST (9.1.1.22). The VPN-GW1-WEST responds to the spoke’s IKE message, IKE negotiation proceeds, and the IKE establishes IPSec SAs on the spoke and the VPN-GW1-WEST. So far, so good! Now, when data packets are sent from the spoke to the hub, the packets arrive into VPN-GW1-WEST, get decrypted, and are routed into the HQ-WEST router.

Packets returning to the spoke will be directed via VPN-GW2-WEST, but all the IKE and IPSec SAs were previously established on VPN-GW1-WEST! The crypto-map configuration on VPN-GW2-WEST causes it to trigger IKE messages to the spoke, and the spoke and VPN-GW2-WEST negotiate a new set of IKE and IPSec SAs. The spoke now has two sets of IKE and IPSec SAs: one associated with VPN-GW1-WEST and another associated with VPN-GW2-WEST. For traffic returning from the spoke side to the gateway side, the spoke chooses the SA from which it last received packets. If HQ-WEST load balances packets across both VPN-GW1-WEST and VPN-GW2-WEST, packets will alternate between both sets of SAs in both directions. You will look at resolving the asymmetric routing and redundant resource consumption issues in more detail later in this chapter, in the section “Peer Redundancy Using GRE.” First, however, you will explore the fault-tolerance aspects of the simple peer redundancy model.

Next, you’ll see how fault tolerance to a VPN-GW failure works in this model. Assume that the active gateway for the spoke is currently VPN-GW1-WEST and the IKE and IPSec SAs are established between the spoke and the active gateway. If the active gateway fails, the first thing the spoke must do is detect this failure. Failure of the active gateway to respond to three consecutive IKE keepalive messages signals the spoke that the active gateway has failed. Next, the spoke initiates a new set of IKE and IPSec SAs to the secondary gateway. Until the IKE and IPSec SAs are built, all traffic between sites will be lost. Another important point to bear in mind here is that there is no concept of pre-emption in IPSec fault tolerance. In other words, after the failure of VPN-GW1-WEST, the spoke now has IPSec sessions with VPN-GW2-WEST. When VPN-GW1-WEST recovers from a failure, the spoke may not revert to VPN-GW1-WEST. Instead, the return path is determined by HQ-WEST’s decision to route the packet to VPN-GW1-WEST or VPN-GW2-WEST. This means there may potentially be spokes whose current IPSec peer is VPN-GW1-WEST and others whose current active peer is VPN-GW2-WEST.
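The detection step can be sketched as a small counter in Python. This is a deliberately simplified model of the "three consecutive missed keepalives" rule stated above: the 10-second interval comes from Example 6-6, the class name KeepaliveMonitor is hypothetical, and retry timers that a real implementation may use between missed keepalives are not modeled.

```python
class KeepaliveMonitor:
    """Declare a peer dead after N consecutive unanswered keepalives."""

    def __init__(self, interval_s: int = 10, max_missed: int = 3) -> None:
        self.interval_s = interval_s
        self.max_missed = max_missed
        self.missed = 0

    def on_keepalive_result(self, answered: bool) -> bool:
        """Record one keepalive result; return True once the peer is declared dead."""
        # Any answer resets the count, so only consecutive misses accumulate.
        self.missed = 0 if answered else self.missed + 1
        return self.missed >= self.max_missed

    def worst_case_detection_s(self) -> int:
        """Upper bound on detection time in this simplified model."""
        return self.interval_s * self.max_missed


monitor = KeepaliveMonitor(interval_s=10, max_missed=3)
outcomes = (True, False, False, True, False, False, False)
results = [monitor.on_keepalive_result(answered) for answered in outcomes]
print(results, monitor.worst_case_detection_s())
```

Under these assumptions the spoke may take up to 30 seconds to notice the failure, and traffic is lost for that interval plus the time needed to negotiate new SAs with the secondary gateway.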

An interesting point of failure in the simple peer model is the failure of the private Ethernet interface on the active VPN-GW1-WEST. This failure may not be synchronized with the IPSec state because IKE keepalives will continue to be exchanged between the spoke and the active VPN-GW1-WEST. As far as the spoke is concerned, the VPN-GW1-WEST has not failed. Under this condition, all data packets that arrive from the spoke to the gateway will be sent into a black hole after decryption. A simple fix for this is to have redundant Ethernet links on the VPN-GW routers to the HQ-WEST router, an Ethernet link connecting the two VPN-GW routers together, or even better, a redundant link and a redundant HQ-WEST router, as shown in Figure 6-10.

Figure 6-10. Resiliencies Through Redundant VPN-GW

Although the addition of redundant links between the VPN-GWs and HQ router improves the resiliency of the peer redundancy, it does not reconcile the issue of asymmetric routing. Next, you will explore a couple of common methods that resolve the asymmetric routing issue and the IPSec black hole created when the VPN-GW Ethernet interface fails.

The asymmetric routing problem is reconciled fairly easily by routing mechanisms deployed on the VPN-GWs. One commonly used tool for simplifying route management is HSRP/VRRP. The HQ-WEST router may use a default route to a virtual IP address that is shared between the VPN-GWs on the private side. Because a VPN-GW’s HSRP and IPSec state may not be synchronized, the same router may not be active for both HSRP and IPSec. The router serving as the active HSRP node may be forced to quickly establish IPSec connections when traffic returns to the spoke such that HSRP and IPSec are synchronized. Synchronization of HSRP and IPSec may occur rather quickly for a few spokes; however, a VPN design supporting hundreds or thousands of spokes may experience a longer delay because the VPN-GW assuming the active IPSec role must rebuild all of the IKE and IPSec connections simultaneously. If fast recovery is required, this issue limits the scalability of our VPN design. Figure 6-11 demonstrates the use of HSRP on the VPN-GWs’ private interfaces.

Figure 6-11. Resilient VPN-GW Routers using HSRP

Caution

HSRP can be configured with or without pre-emption. Without HSRP pre-emption, the recovery of a preferred VPN-GW router will not force reconvergence. With HSRP pre-emption, the recovery of a preferred VPN-GW router will force reconvergence for all IPSec connections. Forcing pre-emption may cause unnecessary data loss while the preferred VPN-GW attempts to recover its role as the primary gateway. Unless there is a compelling reason for the preferred gateway to resume control of the active sessions, the designer should not use pre-emption.

Note

Dynamic IPSec crypto maps implemented on the VPN-GWs do not allow the creation of outbound IPSec SAs because the SPOKE peer identities are unknown. HSRP will not be able to force the synchronization of HSRP-active and IPSec-active VPN-GWs.

A second routing method that resolves the asymmetric routing problem is synchronizing the return routes for the spoke with the active IPSec VPN-GW. The implementation of Reverse Route Injection (RRI) on the VPN-GWs allows the active VPN-GW to propagate the spoke’s protected route to the HQ-WEST router. With the return routes and IPSec synchronized on the VPN-GWs, the asymmetric routing problem is solved while improving the stability of the network. Unfortunately, you still have the potential for dropping data at the VPN-GW routers because the failure of the Ethernet interface prevents the VPN-GW from propagating the reverse route to the HQ-WEST router. A simple solution for this problem is to use RRI in conjunction with HSRP. The more specific routes from RRI are propagated to HQ-WEST, and the HSRP (virtual) IP address is used as the default gateway for HQ-WEST. Should the failed Ethernet on VPN-GW1-WEST prevent the reverse routes from propagating to the HQ-WEST router, the HQ-WEST router will direct traffic to the default route that is associated with the active HSRP node, VPN-GW2-WEST. The active HSRP node will force the IPSec state to synchronize such that the VPN-GW2-WEST router becomes the active IPSec peer for the spokes. Unfortunately, you are still subject to data black holes at the VPN-GWs if dynamic crypto maps are used. The next section will demonstrate a viable solution to address the dynamic crypto map scenario.
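The interaction between the more specific RRI route and the HSRP default route is ordinary longest-prefix matching, which the following Python sketch illustrates. The routing table is a hypothetical view of HQ-WEST: the spoke prefix 10.0.65.0/24 comes from Example 6-6, while the next-hop labels are placeholders rather than real addresses.

```python
import ipaddress
from typing import Dict, Optional


def lookup(routes: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


hq_west_routes = {
    "0.0.0.0/0": "HSRP-VIP",           # default route to the HSRP virtual address
    "10.0.65.0/24": "VPN-GW1-WEST",    # spoke prefix learned through RRI
}

normal = lookup(hq_west_routes, "10.0.65.20")

# The private Ethernet on VPN-GW1-WEST fails, so the RRI route disappears.
del hq_west_routes["10.0.65.0/24"]
fallback = lookup(hq_west_routes, "10.0.65.20")
print(normal, fallback)
```

While the RRI route is present it wins over the default route; once it is withdrawn, the same destination falls back to the default route that points at the active HSRP node.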

In summary, the simple IPSec peer redundancy model provides a reasonably efficient method of providing fault tolerance. However, you must add many auxiliary functions such as RRI and HSRP in order to prevent asymmetric routing and black holes. In the next section, you’ll find out how to simplify the IPSec configuration and routing management.

Virtual IPSec Peer Redundancy Using HSRP

In the access redundancy section, we presented a method for using a single IKE identity to restore services when one of the VPN-GW’s access links failed. A similar single IKE model is also possible for peer redundancy. Figure 6-12 shows the Virtual IPSec Peer Redundancy model.

Figure 6-12. Virtual IPSec Peer Redundancy

In this model, the VPN-GW peers are on a common public Ethernet using HSRP. The HSRP configuration makes one of the VPN-GW routers active and places the other one in standby at any given time. From an IPSec perspective, the spoke has an IPSec peer relationship with the HSRP virtual IP address owned by the active HSRP router. The advantages of this virtual peer model as compared to the simple peer model are as follows:

  • Configuration of the remote peers is simpler. The remote peers now need only one set peer configuration; therefore, the order of the set peer statements is irrelevant.

  • HSRP groups can be used to distribute spoke IPSec security associations across redundant peers. Some spokes might use one HSRP group virtual IP address as the peer, whereas other spokes use a second HSRP group virtual IP address as the peer.

  • HSRP can track the state of another interface on the VPN-GW (specifically the private Ethernet interface) and decide if the router needs to be active or standby.

  • Reverse Route Injection (RRI) can be coupled with HSRP to provide a fault-tolerant IPSec VPN.

The important aspect to the virtual IPSec peer model is that the routing state and the crypto state are synchronized. This minimizes the potential for creating black holes where data is dropped during failures.

Example 6-7 and Example 6-8 show the configurations of the VPN-GWs in this model.

Example 6-7. VPN-GW1-EAST Configuration

crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.138
!
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.138
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface FastEthernet0/0
 ip address 10.1.1.1 255.255.255.0
!
interface FastEthernet0/1
 ip address 9.1.1.35 255.255.255.240
 standby 1 track FastEthernet0/0
 standby 1 priority 100 preempt
 standby 1 ip 9.1.1.37
 crypto map vpn
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.0.0 0.255.255.255 10.0.66.0 0.0.0.255
!

Example 6-8. VPN-GW2-EAST Configuration

crypto isakmp policy 10
 hash md5
 authentication pre-share
crypto isakmp key cisco address 9.1.1.138
!
!
crypto ipsec transform-set esp-tunnel esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.138
 set transform-set esp-tunnel
 match address esp-tunnel-list
!
interface Ethernet4/0
 ip address 10.1.1.2 255.255.255.0
!
interface Ethernet4/1
 ip address 9.1.1.36 255.255.255.240
 half-duplex
 standby 1 priority 50
 standby 1 track Ethernet4/0
 standby 1 ip 9.1.1.37
 crypto map vpn
!
ip access-list extended esp-tunnel-list
 permit ip 10.0.0.0 0.255.255.255 10.0.66.0 0.0.0.255
!

Notice the use of the standby track command, which refers to the private Ethernet interface configuration. With this command, the private interface can be tracked. If this interface goes down on the active VPN-GW, then the other VPN-GW automatically takes over, resulting in no black hole for data traffic. In fact, HSRP standby tracking can be used on both private and public Ethernet interfaces such that return routes and IPSec state may always be synchronized. If HSRP is only used on the public interfaces of the VPN-GWs, then RRI may be used to propagate the spoke’s route prefix to the HQ-WEST router. Fortunately, the HSRP implementation on the public interface may track the private interfaces such that traffic does not end up in a black hole when the private Ethernet interfaces fail on the VPN-GWs.
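The effect of interface tracking on the HSRP election can be sketched in Python. This is a simplified model with hypothetical names (HsrpRouter, active_after_event). The priorities of 100 and 50 come from Examples 6-7 and 6-8, but the decrement of 60 and pre-emption on the standby router are assumptions made for illustration; with the HSRP default decrement of 10, a priority of 100 would fall only to 90 and would remain above 50.

```python
from dataclasses import dataclass


@dataclass
class HsrpRouter:
    name: str
    priority: int
    preempt: bool
    track_decrement: int = 10    # HSRP default decrement for a tracked interface
    tracked_up: bool = True      # state of the tracked (private) interface

    def effective_priority(self) -> int:
        """Configured priority, reduced while the tracked interface is down."""
        if self.tracked_up:
            return self.priority
        return self.priority - self.track_decrement


def active_after_event(active: HsrpRouter, standby: HsrpRouter) -> HsrpRouter:
    """The standby takes over only if it pre-empts and now has the higher priority."""
    if standby.preempt and standby.effective_priority() > active.effective_priority():
        return standby
    return active


gw1 = HsrpRouter("VPN-GW1-EAST", priority=100, preempt=True, track_decrement=60)
gw2 = HsrpRouter("VPN-GW2-EAST", priority=50, preempt=True)

before_failure = active_after_event(gw1, gw2).name
gw1.tracked_up = False                      # private Ethernet fails on the active router
after_failure = active_after_event(gw1, gw2).name
print(before_failure, after_failure)
```

The sketch shows why the decrement must be large enough to push the active router's priority below that of the standby; otherwise the tracked failure is detected but no switchover occurs.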

IPSec Stateful Failover

We discussed the IPSec stateful failover mechanism in Chapter 3, “Enhanced IPSec Features.” An interesting extension to the virtual IPSec peer model is its ability to provide stateful failover. As was just demonstrated, the configuration of HSRP on the public and private interfaces of both VPN-GWs forces the synchronization of IP routing and IPSec. But, without stateful failover, convergence in a large VPN will take quite some time when the active IPSec gateway fails. The delay is due to the fact that new IKE and IPSec SAs must be established between all the spokes and the new VPN-GW. The critical element that is missing for stateful failover is the exchange of the IKE and IPSec SA information between the redundant peers. Recall that the single IKE identity model for redundant access links allowed the SA information to be replicated on any interface where the crypto map was applied. With stateful failover, a control channel is used to replicate the same SA information to public interfaces of the peer that is serving as the standby HSRP node. The replication of SA information ensures that the standby HSRP router and standby IPSec peer have the necessary information to immediately assume the role of an active HSRP and active IPSec peer. In the stateful failover case, the IKE and IPSec SAs do not need to be reestablished, as described previously when discussing the virtual IPSec peer redundancy model. Figure 6-13 demonstrates the relationship of the routers for stateful failover.

Figure 6-13. Stateful IPSec Failover Topology

Stateful failover allows VPNs built with the virtual IPSec peer redundancy architecture to scale with minimal configuration complexity. We have eliminated asymmetric routing and black holes and enabled rapid recovery of IKE and IPSec SAs. On the other hand, we have increased the memory requirements on the VPN-GWs because each router must hold all the IPSec SAs in the HSRP group even while it is serving as the standby router. The reason for implementing stateful failover is to improve the convergence interval, which then depends only on the time required for the standby VPN-GW to detect the loss of the active VPN-GW. Once the standby router assumes the active role, the communications path is restored in both directions. Example 6-9 and Example 6-10 show the configuration of the VPN-GWs supporting stateful failover.

Example 6-9. VPN-GW1-EAST Configuration

!
ssp group 1                                                                           
 remote 9.1.1.36                                                                      
 redundancy public-ipsec                                                              
!
crypto isakmp policy 1
 authentication pre-share
crypto isakmp key spoke address 9.1.1.154
crypto isakmp ssp 1                                                                   
!
crypto ipsec transform-set tunnel esp-des esp-sha-hmac
!
crypto map stateful ha replay-interval inbound 1000 outbound 1000
crypto map stateful 10 ipsec-isakmp
 set peer 9.1.1.154
 set transform-set tunnel
 match address remote
!

interface Ethernet2/0
 ip address 10.1.1.2 255.255.255.0
 standby delay minimum 0 reload 0
 standby 2 ip 10.1.1.254
 standby 2 timers msec 250 1
 standby 2 preempt
 standby 2 name ipsec-private
 standby 2 track Ethernet2/2
!
interface Ethernet2/2
 ip address 9.1.1.35 255.255.255.240
 standby delay minimum 5 reload 0
 standby ip 9.1.1.37
 standby timers msec 250 1
 standby preempt
 standby name public-ipsec                                                            
 standby track Ethernet2/0                                                            
 crypto map stateful ssp 1                                                            
!
ip access-list extended remote
 permit ip 10.1.1.0 0.0.0.255 10.0.70.0 0.0.0.255

Example 6-10. VPN-GW2-EAST Configuration

!
ssp group 1                                                                           
 remote 9.1.1.35                                                                      
 redundancy public-ipsec                                                              
!
crypto isakmp policy 1
 authentication pre-share
crypto isakmp key spoke address 9.1.1.154
crypto isakmp ssp 1                                                                   
!
crypto ipsec transform-set tunnel esp-des esp-sha-hmac
!
crypto map stateful ha replay-interval inbound 1000 outbound 1000
crypto map stateful 10 ipsec-isakmp
 set peer 9.1.1.154
 set transform-set tunnel
 match address remote
!
interface Ethernet1/2
 ip address 10.1.1.1 255.255.255.0
 standby delay minimum 0 reload 0
 standby 2 ip 10.1.1.254
 standby 2 timers msec 250 1
 standby 2 preempt
 standby 2 name ipsec-private
 standby 2 track Ethernet1/1
!
interface Ethernet1/1
 ip address 9.1.1.36 255.255.255.240
 standby delay minimum 5 reload 0
 standby ip 9.1.1.37
 standby timers msec 250 1
 standby preempt
 standby name public-ipsec                                                            
 standby track Ethernet1/2                                                            
 crypto map stateful ssp 1                                                            
!
ip access-list extended remote
 permit ip 10.1.1.0 0.0.0.255 10.0.70.0 0.0.0.255

The stateful failover configuration introduces quite a few new elements. You see that the ssp group is defined, which provides information about the node that you must share state with. You see that the crypto maps make reference to the ssp group such that you know which crypto state needs to be synchronized with the stateful failover members. Finally, you see that the HSRP elements of the interfaces are augmented with the ssp group such that you know which interfaces need to be monitored for fault detection.

Note

Example 6-10 shows the IPSec stateful failover configuration using SSP. An alternate mechanism for IPSec stateful switchover, Stateful Switchover (SSO), was illustrated in Chapter 3, “Enhanced IPSec Features” in Example 3-7. Functionally, both achieve the same result. The SSP mechanism was developed specifically for IPSec stateful failover, whereas the SSO mechanism was developed under the framework of a more generic High Availability infrastructure which is used for providing stateful failover mechanisms for many other protocols in Cisco IOS such as OSPF, BGP, IP, and others in addition to IPSec. SSO is the recommended configuration method for stateful switchover for IPSec.

The configuration for stateful failover associates the control channel by name (that is, public-ipsec) between the redundant nodes to the HSRP profile names. The HSRP synchronizes the state of the necessary interfaces on both the private and public interfaces of the VPN-GW and, finally, the crypto map is associated with the stateful IPSec control channel. The combination of these elements allows the VPN-GW routers to maintain the proper state and provide a stateful virtual IPSec peer for all of the spokes associated with the IKE address.
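The difference between stateless and stateful recovery can be pictured with a toy model in Python. It is not a representation of the SSP protocol: the classes Gateway and FailoverPair, the SA key format, and the SPI values are all hypothetical, and the control channel is reduced to a direct copy. Only the spoke address and the transform names are taken from Example 6-9.

```python
from typing import Dict, Tuple

SaKey = Tuple[str, int]   # (spoke address, SPI); hypothetical key format


class Gateway:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sa_db: Dict[SaKey, str] = {}


class FailoverPair:
    """Active gateway that optionally mirrors every new SA to the standby."""

    def __init__(self, active: Gateway, standby: Gateway, stateful: bool) -> None:
        self.active, self.standby, self.stateful = active, standby, stateful

    def install_sa(self, spoke: str, spi: int, transform: str) -> None:
        self.active.sa_db[(spoke, spi)] = transform
        if self.stateful:
            # Stand-in for the control channel between the redundant peers.
            self.standby.sa_db[(spoke, spi)] = transform

    def fail_over(self) -> int:
        """Swap roles; return how many SAs the new active must renegotiate."""
        missing = [key for key in self.active.sa_db if key not in self.standby.sa_db]
        self.active.sa_db.clear()                       # failed node loses its state
        self.active, self.standby = self.standby, self.active
        return len(missing)


def run(stateful: bool) -> int:
    pair = FailoverPair(Gateway("VPN-GW1-EAST"), Gateway("VPN-GW2-EAST"), stateful)
    for spi in (0x1001, 0x1002, 0x1003):                # hypothetical SPI values
        pair.install_sa("9.1.1.154", spi, "esp-des esp-sha-hmac")
    return pair.fail_over()


print(run(stateful=True), run(stateful=False))
```

With replication, the new active gateway has nothing to renegotiate; without it, every SA must be rebuilt, which is the delay the stateful model is designed to remove.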

The virtual IPSec peer redundancy model provides a robust means of fault tolerance. We have demonstrated how the VPN-GWs are able to present a single IPSec peer to the spokes, which simplifies the configuration of the spokes while enabling rapid recovery of failures associated with the VPN-GWs. One of the limitations with the IPSec model is the lack of support for multiple protocols. The next section describes an alternative peer redundancy method that enables robust fault tolerance while enabling multi-protocol support.

One of the more significant disadvantages of using HSRP is that it tracks only local failures (that is, interfaces). HSRP on the VPN-GWs is not able to track the viability of routing paths behind HQ-WEST. We will explore more comprehensive fault-tolerant methods when we associate routing protocols with the crypto state in subsequent sections. In addition, the use of the Gateway Load Balancing Protocol (GLBP) diminishes the value of using stateful IPSec failover because the load-balancing function directs traffic to both the active and the standby VPN-GW. The standby VPN gateway is not in a forwarding state, and therefore packets will be dropped on this VPN-GW.

Peer Redundancy Using GRE

Chapter 2, “IPSec Overview,” demonstrated that GRE tunnels may be used for site-to-site IPSec tunnels. One of the many advantages of implementing GRE protected by IPSec is the ability to reflect the state of the IPSec connection in the router’s routing table. The most common reason to use GRE for site-to-site IPSec connectivity is to provide dynamic routing capabilities between the sites. The routing decisions may be cached in the Cisco Express Forwarding (CEF) table for high-performance routing, tunneling, forwarding, and encryption. For this reason, a significant majority of enterprise customers use GRE protected by IPSec. You can leverage this capability to provide IPSec peer redundancy, as shown in Figure 6-14.

Figure 6-14. IPSec Peer Redundancy with GRE Tunnels

In this example, the SPOKE-2-WEST has two IPSec-protected GRE tunnels—one for each VPN-GW. At both the spoke and the VPN-GWs, IKE and IPSec SAs are created to protect the source and destination of the GRE tunnels. IPSec policies on both the VPN-GWs are configured to match protection policies for their respective GRE tunnel. The IP routing database on the spoke determines which GRE tunnel the traffic should take to reach the prefixes protected by the VPN-GWs. Meanwhile, the VPN-GWs learn of the spoke’s protected prefix and then propagate that route to the HQ-WEST router. In most cases, both GRE tunnels from the spoke will be operational all the time, and therefore it does not matter if traffic is routed asymmetrically across the peers.

Fault tolerance, in this model with dynamic routing, is straightforward. IP routing updates validate the GRE paths that require a valid IPSec SA. The routing algorithm calculates the best paths, detects path failure, and reroutes to alternate GRE paths for redundancy. A quick review of the process demonstrates how this will work:

  • If a destination at the VPN-GW site is reachable via both the GRE tunnels from the spoke, the spoke uses the best path calculated by the routing process. If both paths are equally weighted, the spoke will send traffic using CEF load balancing across both GRE tunnels.

  • If one of the VPN-GWs goes down, then the dynamic routing protocol will update the reachability information for these prefixes via the corresponding GRE tunnel. All traffic destined to the VPN-GW destination prefix would take the second available GRE path. The most significant advantage for this model is that no IKE and IPSec SAs need to be negotiated across the alternate path as they are already present and validated by the routing protocols, thus helping convergence.

  • By default, the GRE tunnel interface does not verify reachability of the remote GRE tunnel endpoint. Dynamic routing protocols serve to validate the path and reflect the viability of the path in the routing database.

  • If dynamic routing is not used and one of the VPN-GW peers goes down, then traffic will be sent into a GRE tunnel with no viable endpoint. GRE keepalives may be configured in such instances to validate reachability of the GRE tunnel endpoint and assist in the process of convergence.
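The path-selection behavior in the preceding list can be sketched in a few lines of Python. This is an abstract model rather than EIGRP itself: the metric values are hypothetical, and the loss of a routing neighbor is represented simply by removing the tunnel from the candidate set.

```python
from typing import Dict, List


def best_paths(tunnel_metrics: Dict[str, int]) -> List[str]:
    """Return every tunnel that shares the lowest metric (the equal-cost set)."""
    if not tunnel_metrics:
        return []
    lowest = min(tunnel_metrics.values())
    return sorted(t for t, metric in tunnel_metrics.items() if metric == lowest)


def withdraw(tunnel_metrics: Dict[str, int], tunnel: str) -> Dict[str, int]:
    """Model loss of the routing neighbor reached through one tunnel."""
    return {t: metric for t, metric in tunnel_metrics.items() if t != tunnel}


# Both GRE tunnels from the spoke advertise the hub prefix at the same cost.
paths = {"Tunnel0": 100, "Tunnel1": 100}      # hypothetical metric values

before = best_paths(paths)                        # load shared across both tunnels
after = best_paths(withdraw(paths, "Tunnel0"))    # VPN-GW1-WEST goes down
print(before, after)
```

Because the SAs for Tunnel1 already exist, moving traffic to the surviving path is purely a routing decision; no IKE negotiation is needed at the moment of failure.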

Example 6-11 and Example 6-12 show the configurations for the redundant pair of VPN-GWs, whereas Example 6-13 shows the configuration for the SPOKE.

Example 6-11. VPN-GW1-WEST Configuration with GRE Tunnel

crypto isakmp policy 10
 hash md5
 authentication pre-share
 lifetime 3600
crypto isakmp key cisco address 9.1.1.134
crypto ipsec transform-set esp-tunnel-internet esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.134
 set transform-set esp-tunnel-internet
 match address gre-tunnel-list
!
interface Tunnel0
 ip address 10.0.2.1 255.255.255.252
 tunnel source 9.1.1.22
 tunnel destination 9.1.1.134
 crypto map vpn
!
interface Serial1/0:0
 ip address 9.1.1.22 255.255.255.252
 crypto map vpn
!
interface Ethernet4/0
 ip address 10.1.0.1 255.255.255.0
!
router eigrp 100
 network 10.1.0.0 0.0.0.255
 network 10.0.2.0 0.0.0.255
 no auto-summary
!
ip access-list extended gre-tunnel-list
 permit gre host 9.1.1.22 host 9.1.1.134

Example 6-12. VPN-GW2-WEST Configuration with GRE Tunnel

crypto isakmp policy 10
 hash md5
 authentication pre-share
 lifetime 3600
crypto isakmp key cisco address 9.1.1.134
crypto ipsec transform-set esp-tunnel-internet esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.134
 set transform-set esp-tunnel-internet
 match address gre-tunnel-list
!
interface Tunnel0
 ip address 10.0.2.5 255.255.255.252
 tunnel source 9.1.1.10
 tunnel destination 9.1.1.134
 crypto map vpn
!
interface Serial1/0
 ip address 9.1.1.10 255.255.255.252
 crypto map vpn
!
interface Ethernet4/0
 ip address 10.1.0.2 255.255.255.0
!
router eigrp 100
 network 10.1.0.0 0.0.0.255
 network 10.0.2.0 0.0.0.255
 no auto-summary
!
ip access-list extended gre-tunnel-list
 permit gre host 9.1.1.10 host 9.1.1.134

Example 6-13. SPOKE-2-WEST with GRE Tunnels to Redundant GRE Peers

crypto isakmp policy 10
 hash md5
 authentication pre-share
 lifetime 3600
crypto isakmp key cisco address 9.1.1.22
crypto isakmp key cisco address 9.1.1.10
!
crypto ipsec transform-set esp-tunnel-internet esp-des esp-md5-hmac
!
crypto map vpn 10 ipsec-isakmp
 set peer 9.1.1.22
 set transform-set esp-tunnel-internet
 match address gre-tunnel-list
crypto map vpn 20 ipsec-isakmp
 set peer 9.1.1.10
 set transform-set esp-tunnel-internet
 match address gre-tunnel-list2
!
interface Tunnel0
 ip address 10.0.2.2 255.255.255.252
 tunnel source 9.1.1.134
 tunnel destination 9.1.1.22
 crypto map vpn
!
interface Tunnel1
 ip address 10.0.2.6 255.255.255.252
 tunnel source 9.1.1.134
 tunnel destination 9.1.1.10
 crypto map vpn
!
interface FastEthernet0
 ip address 10.0.65.1 255.255.255.0
!
interface Serial0
 ip address 9.1.1.134 255.255.255.252
 crypto map vpn
!
router eigrp 100
 network 10.0.65.0 0.0.0.255
 network 10.0.2.0 0.0.0.255
 no auto-summary
!
ip access-list extended gre-tunnel-list
 permit gre host 9.1.1.134 host 9.1.1.22
ip access-list extended gre-tunnel-list2
 permit gre host 9.1.1.134 host 9.1.1.10

Note

The assignment of an IP address to the tunnel interface is critical for the propagation of routes. In the example provided, each tunnel interface is assigned an address from the 10.0.2.0/24 range, which is within the scope of addresses covered by the EIGRP routing process. Therefore, routing updates are directed into the GRE tunnel. Alternatively, the tunnel interface’s IP address may be anchored to a loopback interface or to the public interface. In many cases, anchoring the tunnel to a loopback interface provides much more control and reliability over route distribution because the routing process is not associated with either the private or public interfaces.

Note

The requirement to apply the crypto map to both tunnel and physical interfaces in the configuration was removed in IOS versions 12.2(13)T, 12.3 mainline, and 12.3T. Subsequent releases only require configuring the crypto map on the physical interfaces.

The configuration for the spoke with redundant GRE peers highlights the fact that a unique IKE and IPSec SA is defined for each tunnel within the context of a single crypto map. The same will be true for a hub with hundreds of spokes attached. That is, the hub will have a unique sequence number for each spoke attached to the hub.
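To see how the per-spoke sequence numbers accumulate on a hub, the following Python sketch generates the crypto map entries and access lists for a list of spokes. The generator function hub_crypto_map and the access-list naming scheme are hypothetical; the command forms mirror those in Example 6-11, and the second spoke address is an arbitrary placeholder.

```python
from typing import List


def hub_crypto_map(hub_addr: str, spoke_addrs: List[str],
                   map_name: str = "vpn",
                   transform: str = "esp-tunnel-internet") -> str:
    """Build one crypto map entry and one GRE access list per spoke."""
    map_lines: List[str] = []
    acl_lines: List[str] = []
    for index, spoke in enumerate(spoke_addrs, start=1):
        seq = index * 10                       # unique sequence number per spoke
        acl = f"gre-tunnel-list{index}"        # hypothetical naming scheme
        map_lines += [
            f"crypto map {map_name} {seq} ipsec-isakmp",
            f" set peer {spoke}",
            f" set transform-set {transform}",
            f" match address {acl}",
        ]
        acl_lines += [
            f"ip access-list extended {acl}",
            f" permit gre host {hub_addr} host {spoke}",
        ]
    return "\n".join(map_lines + ["!"] + acl_lines)


config = hub_crypto_map("9.1.1.22", ["9.1.1.134", "9.1.1.138"])
print(config)
```

Each additional spoke adds one sequence number, one peer statement, and one access list, so the hub configuration grows linearly with the number of spokes.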

The support for dynamic routing on GRE tunnels allows you to design highly fault-tolerant IPSec-protected VPNs. Many features of the routing protocols may be leveraged to optimize traffic flows, convergence intervals, and fault detection. Of course, all this flexibility does not come for free. The IPSec gateway has to deal with routing protocol scalability limitations in addition to IPSec. This is especially true at the hub of a hub-and-spoke architecture.

In addition, you find that the routing processes may create additional demands on the IPSec processing functions. The most important aspect is the simultaneous creation and persistence of multiple IPSec sessions when routing updates are broadcast into the GRE tunnels. When the VPN-GW routers become viable endpoints, all the spokes will attempt to restore their GRE tunnels and IPSec SAs. Likewise, the VPN-GW will attempt to restore a GRE tunnel to each of its spokes. This process places a tremendous burden on the VPN-GW during periods of transition; therefore, scalability becomes a critical design factor. Nevertheless, the IPSec-protected GRE model is the most commonly deployed architecture due to its robustness, simplicity, and versatility. Remember that this model is only for site-to-site IPSec fault tolerance; remote access clients do not support GRE.

Virtual IPSec Peer Redundancy Using SLB

You have already seen a virtual IPSec peer redundancy model using HSRP in a previous section in this chapter. One of the disadvantages of the HSRP model is that, at any given time, only one of the VPN-GWs is an active IPSec peer. In this section, we will discuss an alternate scheme for virtual IPSec peer redundancy using the IOS Server Load Balancing (SLB) feature.

The concept behind this model is to load balance the incoming IPSec connections from the spokes or remote access clients across a farm of VPN-GWs. The spokes are only aware of a virtual IP address that represents the VPN-GW farm defined on the SLB. The load-balancing device (or SLB feature) distributes IPSec requests across the farm and, in the case of failure, the VPN-GWs are transparently removed from operation. There are different load-balancing algorithms that can be used to distribute VPN traffic, from a simple round robin up to more sophisticated algorithms, based on the concurrent number of connections, server load, and weight.

Server Load Balancing Concepts

As noted in the previous section, IOS SLB is an IOS feature that provides IP server load balancing. Using this feature, you can define a virtual VPN-GW that represents a cluster of real gateways known as a gateway farm. In this environment, the clients initiate IKE requests to the IP address of the virtual VPN-GW. When a client or a spoke initiates a connection, the IOS SLB function chooses a real gateway for the connection from among the gateways using the same virtual address. Which gateway is selected also depends on the configured load-balancing algorithm. SLB provides two load-balancing algorithms:

  • Weighted round robin—This algorithm assigns a weight (n) to each gateway. SLB assigns new incoming connections to any given real gateway n times before the next gateway is chosen.

  • Weighted least connections—This algorithm specifies that the next real gateway chosen from a gateway farm for a new connection is the one with the fewest active connections. Each real VPN-GW is assigned a weight; the gateway with the fewest connections is based on the number of active connections on each gateway and on the relative capacity of each gateway. The capacity of a given real gateway is calculated as the assigned weight of that gateway divided by the sum of the assigned weights of all of the real gateways associated with that virtual server, or n1/(n1+n2+n3...).
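The two selection algorithms can be sketched in a few lines of Python. This is only an illustration of the descriptions above, not the IOS SLB implementation: the function names, the weights of 2 and 1, and the choice to compare gateways by active connections divided by capacity are all assumptions made for the example.

```python
def weighted_round_robin(gateways):
    """Yield gateway addresses forever. Each gateway receives `weight`
    consecutive new connections before the next gateway is chosen."""
    while True:
        for address, weight in gateways:
            for _ in range(weight):
                yield address


def weighted_least_connections(gateways, active):
    """Return the gateway with the fewest active connections relative to
    its capacity, where capacity = weight / sum of all weights."""
    total_weight = sum(weight for _, weight in gateways)

    def relative_load(item):
        address, weight = item
        capacity = weight / total_weight      # n1 / (n1 + n2 + ...)
        return active[address] / capacity     # assumed comparison metric

    return min(gateways, key=relative_load)[0]


# Hypothetical farm: the first gateway has twice the weight of the second.
farm = [("9.1.1.35", 2), ("9.1.1.36", 1)]
chooser = weighted_round_robin(farm)
first_six = [next(chooser) for _ in range(6)]
busiest_first = weighted_least_connections(farm, {"9.1.1.35": 30, "9.1.1.36": 20})
```

With these weights, round robin hands two connections to the first gateway for every one given to the second. In the least-connections case the first gateway is still chosen despite holding more connections, because its capacity share is twice as large.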

Either algorithm can be used for choosing a real VPN-GW for each new connection request that arrives at the IOS SLB for IPSec termination. IOS SLB automatically detects failure of a real VPN-GW using pings and increments a failure counter for that gateway. If a gateway’s failure counter exceeds a configurable faildetect failure threshold, the gateway is considered out of service and is removed from the list of active real gateways.
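The failure-counter behavior can be modeled roughly as follows. The class and method names are hypothetical. The strict comparison follows the word "exceeds" in the text, and resetting the counter on a successful probe is an assumption; the exact boundary and recovery behavior should be confirmed against the IOS SLB documentation.

```python
class RealGateway:
    """Tracks probe failures for one real VPN-GW (illustrative model only)."""

    def __init__(self, address, faildetect=3):
        self.address = address
        self.faildetect = faildetect   # configurable failure threshold
        self.failures = 0              # failure counter for this gateway
        self.in_service = True

    def record_probe(self, reply_received):
        if reply_received:
            # Assumption: a successful ping clears the failure counter.
            self.failures = 0
            return
        self.failures += 1
        if self.failures > self.faildetect:
            # Counter exceeds the threshold: remove from the active list.
            self.in_service = False


def active_gateways(farm):
    """Return the addresses still eligible to receive new connections."""
    return [gw.address for gw in farm if gw.in_service]
```

Returning a gateway to service after it recovers is deliberately not modeled here.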

IOS SLB also allows configuration of maximum connections for a real VPN-GW. If the maximum number of connections is reached for a VPN-GW, IOS SLB automatically switches all further connection requests to other real VPN-GWs until the connection number drops below the specified limit.

One more key concept for SLB that is used for the IPSec fault-tolerant design is that of sticky connections. This concept assigns new connections from a client IP address or subnet to the same real VPN-GW as were previous connections from that address or subnet.
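A sticky table can be pictured as a map from client address to the real gateway chosen earlier. The sketch below is a simplified model under stated assumptions: entries are keyed by a single client address rather than a subnet, the timer is refreshed on every lookup, and the `StickyTable` name and its `choose_gateway` callback are invented for the example.

```python
import time


class StickyTable:
    """Remembers which real gateway served each client (illustrative)."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.entries = {}   # client address -> (gateway, last_seen)

    def lookup_or_assign(self, client, choose_gateway, now=None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(client)
        if entry is not None and now - entry[1] <= self.timeout:
            gateway = entry[0]           # reuse the earlier choice
        else:
            gateway = choose_gateway()   # fall back to the load balancer
        # Assumption: each lookup refreshes the sticky timer.
        self.entries[client] = (gateway, now)
        return gateway
```

Here `choose_gateway` stands in for whichever load-balancing algorithm is configured; it is consulted only when no unexpired sticky entry exists for the client.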

IPSec Peer Redundancy Using SLB

This section examines how the SLB concepts can be applied to the IPSec peer redundancy model. Figure 6-15 illustrates this model.

Figure 6-15. Architecture for Load-Balanced IPSec Connections

The two VPN-GWs in this model are connected to a Catalyst 6500 switch with an MSFC card running IOS that supports SLB functionality. The two VPN-GWs constitute the gateway farm and share a single IKE identity, which is the virtual IP address on the SLB. Example 6-14 shows the configuration of the SLB device and the VPN-GWs.

Example 6-14. Configuration of the Server Load Balancer

Current configuration : 7178 bytes
!
version 12.1
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname slb-east
!
boot system flash sup-bootflash:c6sup22-jk2o3sv-mz.121-11bFIE1.bin
enable password lab
!
redundancy
 main-cpu
  auto-sync standard
!
vlan 1
!
vlan 10   --- vlan to the inet-gw
!
vlan 11   --- vlan to the vpn-gws
!
!
no ip domain-lookup
!
ip slb probe SERVER-PROBE ping
 interval 30
 faildetect 3
!
ip slb serverfarm IPSEC
 failaction purge
 probe SERVER-PROBE
 !
 real 9.1.1.35
  weight 1
  maxconns 5000
  inservice
 !
 real 9.1.1.36
  weight 1
  maxconns 5000
  inservice
 !

ip slb vserver IPSEC-ESP
 virtual 9.1.0.37 esp
 serverfarm IPSEC
 sticky 6000 group 1
 idle 3650
 inservice
!
ip slb vserver IPSEC-ISAKMP
 virtual 9.1.0.37 udp isakmp
 serverfarm IPSEC
 sticky 6000 group 1
 idle 3650
 inservice
!
!
!
interface FastEthernet3/1
 description to VPN-gateway
 duplex full
 speed 100
 switchport
 switchport access vlan 10
!
interface FastEthernet3/5
 description to vpn-gw1-east
 no ip address
 duplex full
 speed 100
 switchport
 switchport access vlan 11
!
interface FastEthernet3/6
 description to vpn-gw2-east
 no ip address
 duplex full
 speed 100
 switchport
 switchport access vlan 11
!
!
interface Vlan1
 no ip address
!
interface Vlan10
 ip address 9.1.0.33 255.255.255.0
!
interface Vlan11
 ip address 9.1.1.33 255.255.255.0
!
router ospf 1
 log-adjacency-changes
 network 9.1.0.0 0.0.0.255 area 0


slb-east#show ip slb realserver

real                  farm name        weight  state          conns
-------------------------------------------------------------------
9.1.1.35              IPSEC            1       OPERATIONAL    2
9.1.1.36              IPSEC            1       OPERATIONAL    2

ILB-1#show ip slb conn

vserver         prot client                real             state     nat
-------------------------------------------------------------------------------
IPSEC-ESP       ESP  9.1.1.155:0           9.1.1.35         ESTAB     none
IPSEC-ISAKMP    UDP  9.1.1.155:500         9.1.1.35         ESTAB     none
IPSEC-ESP       ESP  9.1.1.154:0           9.1.1.36         ESTAB     none
IPSEC-ISAKMP    UDP  9.1.1.154:500         9.1.1.36         ESTAB     none

ILB-1#show ip slb vservers

slb vserver      prot  virtual                  state         conns
----------------------------------------------------------------------
IPSEC-ESP        ESP   9.1.0.37/32:0           OPERATIONAL   0
IPSEC-ISAKMP     UDP   9.1.0.37/32:500         OPERATIONAL   0
________________________________________________________________________________
vpn-gw1-east#
crypto isakmp policy 1
 encr 3des
 authentication pre-share
 group 2
crypto isakmp key cisco 9.1.1.154 255.255.255.255 no-xauth
crypto isakmp keepalive 300 5

crypto isakmp client configuration group cisco
 key coke123
 dns 15.15.15.15
 wins 16.16.16.16
 domain cisco.com
 pool cisco

crypto ipsec transform-set esp-tunnel-internet esp-3des esp-sha-hmac
!
crypto dynamic-map cisco 10
 set transform-set esp-tunnel-internet
 reverse-route
!
crypto map crypmap local-address Loopback0
crypto map crypmap 2 ipsec-isakmp dynamic cisco
!
!
interface Loopback0   ---- this address is the same as the virtual server address
 ip address 9.1.0.37 255.255.255.255

!
interface FastEthernet0/0
 description to public
 ip address 9.1.1.35 255.255.255.0
 crypto map crypmap
!
interface FastEthernet0/1
 description to corporate
 ip address 10.1.1.1 255.255.255.0
!
!
router ospf 10
 log-adjacency-changes
 network 10.1.1.0 0.0.0.255 area 0
 redistribute static subnets
 default-information originate
!
ip route 0.0.0.0 0.0.0.0 9.1.1.33

Note

The configuration of all the VPN-GWs behind the SLB will look exactly the same, except for the real and private IP addresses.

Note

Static crypto maps with set peer statements should not be used on the VPN-GWs because the SLB function operates only on connections initiated from the spokes and clients. The VPN-GWs must use dynamic crypto maps in order to receive IPSec connections from unknown remote peers; dynamic crypto maps also eliminate a significant configuration burden.

Notice that the configuration of the SLB has two instances of the slb vserver command, one for IKE traffic and another for Encapsulating Security Payload (ESP) traffic. It is important to bind both these vservers together to avoid asymmetric IKE and IPSec paths. In other words, we don’t want IKE negotiation to happen with VPN-GW1-EAST while IPSec traffic termination is directed to VPN-GW2-EAST. The concept of sticky connections binds these together: both vservers are configured with the same sticky group, so that IKE and ESP traffic from a given client is sent to the same real VPN-GW. A couple of other important aspects of this model are:

  • An extra VPN-GW in the gateway farm provides for redundancy. A failure of one VPN-GW in the gateway farm requires the load to be redistributed entirely among the remaining servers.

  • The maximum connections parameter (maxconns) on the SLB should take into account both IKE and IPSec connections; for example, if a real VPN-GW can terminate 1000 IPSec tunnels, maxconns should be configured as 2000 (1000 IKE SAs and 1000 incoming IPSec SAs).
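The two sizing rules in the list above reduce to simple arithmetic. The following sketch is only illustrative: the function names are invented, and the assumption that one spare gateway is added on top of the number needed to carry the full load is one reading of "an extra VPN-GW," not a documented formula.

```python
import math


def slb_maxconns(ipsec_tunnels):
    """One IKE connection plus one incoming ESP connection per tunnel."""
    return ipsec_tunnels * 2


def gateways_needed(total_tunnels, tunnels_per_gateway):
    """Gateways required to carry the load, plus one for redundancy, so the
    remaining gateways can absorb the load if a single gateway fails."""
    return math.ceil(total_tunnels / tunnels_per_gateway) + 1
```

For a gateway that terminates 1000 tunnels, `slb_maxconns(1000)` gives the 2000 figure quoted above.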

The spoke configuration uses the virtual IP address as the IKE identity of the gateway farm. When the spoke sends an IKE message to this virtual IP address, the SLB receives the IKE traffic and routes it to one of the real VPN-GWs based on the load-balancing algorithm configured on the SLB. For the message to terminate and be processed at the real VPN-GW, the SLB virtual IP address must be configured on the real VPN-GW. A loopback interface is typically used for this purpose. All the real VPN-GWs in the gateway farm should be configured with the same virtual IP address because the IKE and IPSec traffic for an IPSec session could potentially terminate on any VPN-GW. This overlapping IP address scheme violates general IP network design principles, and care should be taken to not advertise this IP address in the rest of the network.

Cisco VPN 3000 Clustering for Peer Redundancy

The Cisco VPN 3000 Concentrator supports a peer redundancy model that conceptually works just like the SLB scheme discussed previously. The VPN 3000 model of peer redundancy is known as clustering. This model is shown in Figure 6-16 and is implemented by grouping together logically, into a virtual cluster, two or more VPN 3000 Concentrators on the same private LAN-to-LAN network, private subnet, and public subnet. A virtual cluster is a set of Concentrators that all serve the same group of users. The remote clients are unaware of the fact that multiple Concentrators exist, because they connect to a virtual representation of the set of Concentrators.

Figure 6-16. IPSec Clustering for Peer Redundancy

All devices in the virtual cluster carry session loads. One device in the virtual cluster, the virtual cluster master, is responsible for directing incoming calls to the other devices, which are called secondary devices. The virtual cluster master monitors all the devices in the cluster and keeps track of how busy each one is, distributing the session load accordingly. The role of virtual cluster master is not tied to a physical device; it can shift among devices. For example, if the current virtual cluster master fails, one of the secondary devices in the cluster takes over that role and immediately becomes the new virtual cluster master.

Note

VPN clustering works only with Cisco VPN clients. It does not work for site-to-site connections.

The virtual cluster appears, to outside clients, as a single virtual cluster IP address. This IP address is not tied to a specific physical device—it belongs to the current virtual cluster master, and is, therefore, considered virtual. A VPN client attempting to establish a connection will connect first to this virtual cluster IP address. The virtual cluster master then returns the public IP address of an available, and least loaded, host in the cluster. In a second transaction (transparent to the user), the client connects directly to that host. In this way, the virtual cluster master directs traffic evenly and efficiently across resources. Example 6-15 shows the configuration of the VPN 3000 for clustering.
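The redirect decision made by the virtual cluster master can be sketched as picking the least-loaded available device. This is a conceptual model only: the load metric (fraction of session capacity in use), the use of `None` to mark an unavailable device, and every address other than 9.1.1.35 are assumptions for illustration.

```python
def redirect_target(device_loads):
    """Return the public IP address of the least-loaded available device.

    device_loads maps public IP -> load between 0.0 and 1.0, or None when
    the device is not currently available.
    """
    available = {ip: load for ip, load in device_loads.items() if load is not None}
    if not available:
        raise RuntimeError("no device in the cluster is available")
    return min(available, key=available.get)


# Hypothetical cluster state as seen by the virtual cluster master.
cluster = {"9.1.1.35": 0.60, "9.1.1.36": 0.25, "9.1.1.38": None}
target = redirect_target(cluster)   # address the client is told to connect to
```

The client first contacts the virtual cluster IP address, receives `target`, and then opens its actual IPSec session directly to that device.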

Example 6-15. VPN 3000 Configuration for Clustering

Configuration > Interface Configuration > Configure Ethernet #2 (Public) > Interface
 Setting > Enable using Static IP Addressing > Enter IP Address = 9.1.1.35
Configuration > Interface Configuration > Configure Ethernet #2 (Public) > Interface
 Setting > Enable using Static IP Addressing > Enter Subnet Mask = 255.255.255.240
Configuration > Policy Management > Traffic Management > Filters > Assign Rules to a
 Filter > Add a Rule to this Filter (Public) > VCA In
Configuration > Policy Management > Traffic Management > Filters > Assign Rules to a
 Filter > Add a Rule to this Filter (Public) > VCA Out
Configuration > System > Load Balancing > Cluster Configuration > VPN Virtual Cluster
 IP Address = 9.1.1.37
Configuration > System > Load Balancing > Cluster Configuration > Encryption = Enabled
Configuration > System > Load Balancing > Cluster Configuration > Load-Balancing Enable
 = Enable
Configuration > System > Load Balancing > Cluster Configuration > IPSec Shared Secret
 = cisco123
Configuration > System > Load Balancing > Device Configuration > Enable/Disable Load
 Balancing = Enable
Configuration > System > Load Balancing > Device Configuration > Device Priority = 10

If a VPN 3000 in the cluster fails, the client may close the IPSec session state and immediately reconnect to the virtual cluster IP address. The virtual cluster master then directs these connections to another active device in the cluster. Should the virtual cluster master fail, a secondary device in the cluster automatically takes over as the new virtual cluster master. Even if several devices in the cluster fail, users can continue to connect to the cluster as long as any one device in the cluster is up and available.

Peer Redundancy Summary

This section highlighted several peer redundancy models, emphasizing the advantages as well as the disadvantages for the various models. In particular, IPSec peer redundancy may be more appropriate for client-initiated connections. The deficiencies in the route management and the complex proxy statements limit the utility of native IPSec models for site-to-site connections. Conversely, we have demonstrated that the GRE peer redundancy model is more appropriate for static environments such as site-to-site connections in which complex routing adjacencies may be required between the two IPSec peers. In all cases, the state management of the IPSec security associations may affect the performance and scalability of the redundancy model.

Intra-Chassis IPSec VPN Services Redundancy

Thus far, we have discussed redundancy based on external attributes of the VPN router—that is, link redundancy and peer redundancy. Designers must also consider scenarios in which internal redundancy improves the reliability of the IPSec services. There are two options to consider:

  • Stateless Failover

  • Stateful Failover

Stateless IPSec Redundancy

The stateless failover model assumes that the state of the encryption processes is not synchronized across redundant hardware within the chassis. Designers are forced in that case to use an active/standby redundancy model, in which the standby hardware assumes the identity of the active hardware when the active hardware fails. Unfortunately, the standby hardware has no knowledge of the existing IPSec sessions; therefore, established IPSec sessions from sites and clients to the VPN router will be terminated. Subsequently, the remote sites and clients must reinitialize their connections to the VPN router. Effectively, the standby encryption hardware is equivalent to the redundant peer model described in the previous section.

The impact of a stateless failover varies based on the types of IPSec connections and the applications. For remote office sites, the peers may detect the loss of the active encryption hardware and reconnect to the standby hardware. Note that the remote hosts retain their originally assigned IP addresses and the applications may periodically check for the reestablishment of end-to-end communications. The impact may be minimized if the fault detection and connection reestablishment is successful before the application’s timeout due to loss of end-to-end connectivity. In contrast, remote client applications will likely be terminated because the reestablishment of an IPSec connection to a standby encryption process will inevitably lead to the assignment of a new IP address for client connections. In this case, the standby encryption hardware provides marginal value.

Stateful IPSec Redundancy

Now, let’s look at the stateful failover scenario. The stateful failover scenario may use an active/active model or an active/standby model; in either case, the state of IPSec connections is always synchronized. The primary difference is that the active/active model load balances IPSec connections across both sets of encryption hardware, whereas the active/standby model keeps the standby encryption hardware idle until the active hardware fails.

When designing an active/active redundancy model, network architects must ensure that the cumulative load of the active/active encryption hardware does not exceed 100 percent; it is advisable to keep it less than 70 percent as a reasonable design principle. Assuming perfect load balancing, the two encryption engines would each sustain approximately 35 percent of the total load. If one of the encryption engines fails, then the redundant engine would service the cumulative load of 70 percent.
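The headroom calculation can be checked with a short worked example. The function names are hypothetical, and perfect load balancing across identical engines is assumed, as in the text.

```python
def active_active_loads(total_load_percent, engines=2):
    """Return (load per engine in normal operation, load per surviving
    engine after one engine fails), both as percentages."""
    per_engine = total_load_percent / engines
    after_failure = total_load_percent / (engines - 1)
    return per_engine, after_failure


def survives_single_failure(total_load_percent, engines=2):
    """True if the surviving engines stay at or below 100 percent."""
    _, after_failure = active_active_loads(total_load_percent, engines)
    return after_failure <= 100


normal, degraded = active_active_loads(70)   # (35.0, 70.0)
```

A cumulative load of 70 percent leaves each engine at 35 percent in normal operation and the survivor at 70 percent after a failure. A cumulative load above 100 percent could not be carried by a single engine.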

The active/standby stateful failover model assumes that the active encryption hardware will sustain all IPSec connections until failure occurs. At that point, the standby encryption hardware sustains all of the IPSec connections. Obviously, the standby encryption hardware must have similar or better capabilities or it will have a negative impact on performance. The primary advantage of using the stateful failover model is that all the IPSec sessions are synchronized between the two IPSec encryption engines. The platform detects the hardware failure or removal and immediately transfers responsibility for the IPSec connections to the standby encryption hardware. As a result, remote sites and clients do not need to renegotiate IPSec state, and therefore the recovery period is much shorter. More importantly, remote IPSec clients retain their previously assigned IP address and thus are able to sustain their end-to-end application connections. The trade-off with the stateful IPSec failover is that a reliable control channel must be sustained between the two encryption engines because each IPSec session keeps track of anti-replay sequence numbers, counters, and timers. Not only must the state of the IPSec encryptions be synchronized, but any auxiliary functions that are tightly coupled with the encryption engine may also need to be synchronized, such as IP routing state and MAC adjacencies.
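To see why per-session state such as anti-replay sequence numbers must be synchronized, consider a simplified sliding-window replay check of the kind used for ESP. This is a sketch under stated assumptions: the 64-packet window, the class name, and the `snapshot`/`restore` helpers are illustrative and do not describe how any particular platform replicates state between encryption engines.

```python
class AntiReplayWindow:
    """Simplified sliding-window anti-replay check for one inbound SA."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence (highest - i) was seen

    def accept(self, seq):
        if seq < 1:
            return False                       # sequence numbers start at 1
        if seq > self.highest:
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False                       # too old, outside the window
        if self.bitmap & (1 << offset):
            return False                       # already seen: replay
        self.bitmap |= 1 << offset
        return True

    def snapshot(self):
        """State the active engine would have to send to the standby."""
        return self.highest, self.bitmap

    @classmethod
    def restore(cls, state, size=64):
        window = cls(size)
        window.highest, window.bitmap = state
        return window
```

A standby engine restored from the active engine's snapshot rejects a replayed packet and accepts the next new one, whereas an unsynchronized standby starting from an empty window would accept the replay.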

Summary

In this chapter, you looked at various IPSec fault-tolerant models that illustrate how to recover from access link failures and node failures. The assessment highlighted the pros and cons of each model. The combination of redundancy and large-scale VPNs creates additional design constraints that must be addressed; the next two chapters address the scalability issues of these models in greater detail.
