Chapter 22
Site Recovery Manager

This chapter covers the following topics:

  • What Is SRM?
  • Exploring the SRM Cmdlets
  • Connecting to the SRM Server
  • Information on SRM Recovery Plans
  • Protecting Virtual Machines
  • Unprotecting a Virtual Machine
  • Adding a Protection Group to a Recovery Plan
  • Testing an SRM Recovery Plan

VMware’s Site Recovery Manager (SRM) has quickly become the de facto standard for providing simple-to-use, easy-to-configure, and highly reliable recoverability from disasters of all sorts. Whether your systems experience an earthquake, hurricane, or administrator error, the ability to click a button to move your operations from the primary site to a backup site is a powerful motivator.

What Is SRM?

Site Recovery Manager is an additional software package for your vCenter server that enables you to add disaster recovery protection to your virtual machines. It is installed as a plug-in to vCenter and adds a new management window to your vCenter web interface. The SRM interface allows you to manage the different aspects of the recovery plan. There are multiple components to SRM that you must be aware of before attempting to automate any actions:

  1. Sites: To fail over your virtual machines, you must have a destination. That destination is known as a disaster recovery site. In the real world, these are separate datacenters that are unlikely to be affected by the same disaster. For example, if your primary site is in an earthquake-prone area, you would want your disaster recovery site to be far enough away to not be affected by the same earthquake. The primary site, where operations run during nonemergency conditions, is known as the protected site. The secondary site, where the virtual machines run during an emergency, is known as the recovery site.
  2. Storage: Virtual machine storage must be available at both sites in order to move operations from the protected site to the recovery site. This can be accomplished in two ways: using your storage vendor’s array-based replication, or using vSphere Replication. vSphere Replication (VR) is a VMware product that replicates virtual machine storage in a storage array–agnostic manner. For more information on VR, please see the VMware website. When viewing the SRM API you will see references to ABR and HBR. ABR refers to array-based replication and relies on software provided by your storage vendor, known as a storage replication adapter (SRA), to perform the necessary functions. HBR refers to host-based replication and is a reference to vSphere Replication.
  3. Protection Groups: A protection group is a group of virtual machines that should be recovered together. This most commonly maps to a datastore used as the basis for array-based replication, but it can also be organized around an application and its individual virtual machines when you are using vSphere Replication. For example, you may want to have all of your infrastructure servers recovered before application servers, or vice versa.
  4. Recovery Plans: A recovery plan details all the steps required to fail over the virtual machines in the protection groups. It also contains the set of steps used when testing failover for a site. The administrator adds protection groups to the recovery plan to assist with organizing the virtual machines, their storage, and their network requirements when performing disaster recovery operations.

Unfortunately, SRM configuration has been a manual process to this point. Many of the operations still require the use of the GUI, but VMware has expanded the functions that can be automated through the API with each new version. Beginning with PowerCLI 5.5, two cmdlets enable you to leverage the automation functions available via the SRM API: Connect-SrmServer and Disconnect-SrmServer. Some operations are dependent on SRM API v2.0, which is not available unless you are using SRM 5.8 or later, so please check which version you are using before executing any commands!

Exploring the SRM Cmdlets

The two cmdlets just mentioned, Connect-SrmServer and Disconnect-SrmServer, are the gateway to automating SRM tasks on your server. However, the SRM functionality is not yet exposed through cmdlets like most other tasks. Instead, you must call the API methods directly using PowerCLI. Here are a few examples showing how to explore the features and functionality exposed.

Connecting to the SRM Server

The first step to automating SRM tasks is, of course, connecting to the server. Before connecting to an SRM instance, you must connect to vCenter. After that, it’s as simple as calling the cmdlet, as shown in Listing 22-1.

Listing 22-1: Connecting to the SRM server instance

Connect-VIServer -Server $hostnameOrIp -Credential $credential
Connect-SrmServer -RemoteCredential $credential -Credential $credential

The Connect-SrmServer call connects your PowerCLI session to the SRM services on both the local and remote vCenter servers. Additionally, much like the way the vCenter connection is stored in a global variable, so is the SRM connection: $global:DefaultSrmServers. You can use this variable to address the SRM server whenever you need to.
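Because every later listing reads $global:DefaultSrmServers[0], it can help to confirm that the variable is populated before calling into the API. The following is a minimal sketch of such a guard. The function name Test-SrmConnection is our own, chosen for illustration; it is not a PowerCLI cmdlet.

```powershell
# Hypothetical helper (not a PowerCLI cmdlet): returns $true when at least
# one SRM connection is available.
function Test-SrmConnection {
    param(
        # Defaults to the list that Connect-SrmServer maintains
        [object[]]$SrmServers = $global:DefaultSrmServers
    )

    # @() turns $null (never connected) into an empty array
    @($SrmServers | Where-Object { $_ }).Count -ge 1
}

if (-not (Test-SrmConnection)) {
    Write-Warning "Not connected to an SRM server; run Connect-SrmServer first."
}
```

The functions later in this chapter perform an equivalent check inline and throw an error instead of writing a warning.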

Information on SRM Recovery Plans

Connecting to the SRM server is one thing, but let’s do something useful. The next bit of code (Listing 22-2) shows how to use the SRM API to view your recovery plans:

Listing 22-2: Listing SRM recovery plans

$srmApi = $global:DefaultSrmServers[0].ExtensionData
$recoveryPlans = $srmApi.Recovery.ListPlans()
$recoveryPlans | %{ $_.GetInfo().Name }

The code in Listing 22-2 prints the name of each recovery plan known to the SRM server. To get additional information, let’s look at some of the methods provided by the API. One of the most critical aspects of SRM is ensuring that the datastores are being replicated: the storage for a virtual machine must be replicated between the protected and recovery sites. This can be done using array-based replication (depending on your vendor) or using VMware’s vSphere Replication technology, which SRM refers to as host-based replication. Unless a virtual machine’s storage is mirrored, you cannot expect the virtual machine to be available at the remote location. Listing 22-3 shows how to use the SRM API to list the datastores that are being protected. Note that this code snippet works only with datastores that use array-based replication.

Listing 22-3: Showing protected datastores

$srmApi = $global:DefaultSrmServers[0].ExtensionData
$srmApi.Protection.ListProtectionGroups() | %{
    Write-Host "Datastores protected by group $($_.GetInfo().Name)"

    $_.ListProtectedDatastores() | %{
        try {
            $_.UpdateViewData("Name")
            Write-Host "  $($_.Name)"
        } catch {
            return
        }
    }
}

There are some interesting things about the SRM API that you won’t see elsewhere. Much of this is because the PowerCLI cmdlets normally mask complexity, but in this case, you don’t have cmdlets to use. The first thing to note is that you are executing methods directly in the API. To determine which methods are available, you can pipe the API object to the Get-Member cmdlet, as shown in Listing 22-4.

Listing 22-4: Showing available actions and values

$srmApi = $global:DefaultSrmServers[0].ExtensionData
$srmApi.Protection | Get-Member

   TypeName: VMware.VimAutomation.Srm.Views.SrmProtection

Name                               MemberType Definition
----                               ---------- ----------
CreateAbrProtectionGroup           Method     VMware.VimAutomat
CreateHbrProtectionGroup           Method     VMware.VimAutomat
Equals                             Method     bool Equals(Syste
GetHashCode                        Method     int GetHashCode()
GetProtectionGroupRootFolder       Method     VMware.VimAutomat
GetType                            Method     type GetType()
ListInventoryMappings              Method     VMware.VimAutomat
ListProtectedDatastores            Method     System.Collection
ListProtectedVms                   Method     System.Collection
ListProtectionGroups               Method     System.Collection
ListReplicatedDatastores           Method     System.Collection
ListUnassignedReplicatedDatastores Method     System.Collection
ListUnassignedReplicatedVms        Method     System.Collection
ToString                           Method     string ToString()
MoRef                              Property   VMware.Vim.Manage

Anything listed in the output is either a method that can be executed or a property that contains data. If your PowerShell window is wide enough, the Definition column shows the return type along with any parameters that need to be supplied. If your window isn’t wide enough and the definitions are truncated, you can force PowerShell to display the full listing using the Format-List cmdlet:

$srmApi.Protection | Get-Member | Format-List
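Another option, sketched below, is to keep only the Name and Definition columns for methods, so that each signature prints in full without the other fields. Get-MethodSignature is a hypothetical helper name; Get-Member and its -MemberType parameter are standard PowerShell.

```powershell
# Hypothetical helper: list the full signature of every method on an object.
function Get-MethodSignature {
    param(
        [Parameter(Mandatory=$true)]
        $InputObject
    )

    Get-Member -InputObject $InputObject -MemberType Method |
        Select-Object Name, Definition
}

# Works on any object; a string is used here so it runs without SRM.
Get-MethodSignature -InputObject "example" |
    Where-Object { $_.Name -eq "ToUpper" } |
    Format-List
```

With an SRM connection in place, you would pass $srmApi.Protection instead of the sample string.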

The second thing to notice about the code in Listing 22-3 is that the output does not contain full objects, but rather just framework objects with the managed object reference set. This means that you must update the data (using the UpdateViewData() method) to populate the fields and information that is important to you.
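The update-then-read pattern can be wrapped in a small pipeline function so that it does not need to be repeated in every loop. This is only a sketch: Get-SrmViewName is a name we made up, and it assumes the objects piped in expose UpdateViewData() and a Name property, as the datastore objects in Listing 22-3 do.

```powershell
# Hypothetical helper: populate and return the Name of each view object
# piped in (for example, the datastores from ListProtectedDatastores()).
function Get-SrmViewName {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        $View
    )
    process {
        try {
            # fetch only the Name property from the server
            $View.UpdateViewData("Name")
            $View.Name
        } catch {
            # skip objects whose data cannot be retrieved
            Write-Verbose "Skipping $($View.MoRef): $_"
        }
    }
}

# Usage, assuming $srmApi was set as in Listing 22-3:
# $srmApi.Protection.ListProtectionGroups() |
#     ForEach-Object { $_.ListProtectedDatastores() } |
#     Get-SrmViewName
```

Requesting a single property keeps the call light; objects that fail to update are skipped, and the reason is visible when you run with -Verbose.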

Now that you have a basic understanding of how the SRM API works, let’s look at some additional information about your protection groups. Listing 22-5 shows how to use the SRM API to list the protected virtual machines in each protection group.

Listing 22-5: Showing protected virtual machines

$srmApi = $global:DefaultSrmServers[0].ExtensionData
$srmApi.Protection.ListProtectionGroups() | %{
    Write-Host "Virtual machines protected by group $($_.GetInfo().Name)"

    $_.ListProtectedVms() | %{
        try {
            $_.Vm.UpdateViewData("Name")
            Write-Host "  $($_.Vm.Name)"
        } catch {
            return
        }
    }
}

The code in Listing 22-5 gives you a simple list of the VM names that are being protected by SRM. Although this is useful, it’s arguably more useful to see which VMs are being replicated but not protected, as demonstrated in Listing 22-6. Such VMs reside in replicated datastores but have not been added to a protection group. Note that this works only when using vSphere Replication.

Listing 22-6: Showing unprotected virtual machines

$srmApi = $global:DefaultSrmServers[0].ExtensionData

# a local protection group
$protectionGroup = $srmApi.Protection.ListProtectionGroups() | Where-Object {
    $_.GetInfo().Name -eq $localPlanName
}

# get the MoRef for each of the VMs
$VMs = (Get-Cluster $clusterName | Get-VM).ExtensionData.MoRef

# get the virtual machines which can be protected
$protectionGroup.QueryVmProtection($VMs) | Where-Object {
    $_.Status -ne "IsProtected"
} | Foreach-Object {
    $_.Vm.UpdateViewData("name")
    $_.Vm.Name
}

Protecting Virtual Machines

To gain value from using SRM, you want to make sure that your virtual machines are actually protected by a protection group. Even if a virtual machine is created in a datastore that is covered by host-based or array-based replication that SRM is aware of, SRM does not protect it until it has been added to a protection group.

To add the virtual machine to a protection group, you need to execute the ProtectVms method on the protection group object. We have created a function that wraps this call and accepts virtual machine objects from the pipeline in order to simplify the workflow. Listing 22-7 shows the function that adds a virtual machine to a protection group.

Listing 22-7: The Add-SrmProtection function

function Add-SrmProtection {
    <#  .SYNOPSIS
        Adds a virtual machine to an SRM protection group

        .EXAMPLE
        Get-VM $vmName | Add-SrmProtection -ProtectionGroup "Super Protected"
        Add a virtual machine to SRM

        .PARAMETER VM
        The virtual machine to protect.

        .PARAMETER ProtectionGroup
        The name of the protection group to add the VM to.

        .INPUTS
        [VMware.VimAutomation.ViCore.Impl.V1.Inventory.VirtualMachineImpl]

    #>
    [CmdletBinding()]
    param(
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true
        )]
        [VMware.VimAutomation.ViCore.Impl.V1.Inventory.VirtualMachineImpl]
        $VM
        ,


        [parameter(Mandatory=$true)]
        [String]$ProtectionGroup
    )
    process {
        if ($global:DefaultSrmServers.length -lt 1) {
            throw "Not connected to SRM server"
        }

        $srmApi = $global:DefaultSrmServers[0].ExtensionData

        # get the protection group object
        $oProtectionGroup = $srmApi.Protection.ListProtectionGroups() |
            ?{ $_.GetInfo().Name -eq $ProtectionGroup }

        # if you are using vSphere Replication,
        # which is referred to as Host Based Replication (HBR) by SRM,
        # then you will need to associate the VM with the protection
        # group before executing the ProtectVms operation.  To do this,
        # uncomment the following code block:
        #$associatedVms = $oProtectionGroup.ListAssociatedVms().MoRef
        #if ($associatedVms -notcontains $VM.Id) {
        #    Write-Verbose ("Associating VM '$($VM.Name)' with protection " +
        #        "group '$($oProtectionGroup.GetInfo().Name)'")
        #
        #    $oProtectionGroup.AssociateVms($VM.Id)
        #}

        # create the spec to specify the VM to be protected
        $protectionSpec = New-Object VMware.VimAutomation.Srm.Views.SrmProtectionGroupVmProtectionSpec
        $protectionSpec.Vm = $VM.ExtensionData.MoRef

        # update the group to add the Vm
        $oProtectionGroup.ProtectVms( @($protectionSpec) )
    }
}

You can take advantage of this by pipelining the virtual machines in a replicated datastore into the function.

Get-Datastore "Replicated" | Get-VM | Add-SrmProtection -ProtectionGroup $groupName

Unprotecting a Virtual Machine

Removing virtual machines that no longer require protection is important because it frees resources for the virtual machines that do need the extra protection. Fewer virtual machines to power on at the destination means faster recovery. Less data to replicate between sites means that replication may complete faster, which helps you achieve a lower recovery point objective (RPO) and recovery time objective (RTO).

Much like protecting a virtual machine, unprotecting a virtual machine involves passing the VM managed object references to the UnprotectVms method of the protection group. The code in Listing 22-8 allows you to automate the removal of SRM protection.

Listing 22-8: The Remove-SrmProtection function

function Remove-SrmProtection {
    <#  .SYNOPSIS
        Removes a virtual machine from an SRM protection group

        .EXAMPLE
        Get-VM $vmName | Remove-SrmProtection -ProtectionGroup "oldPlan"
        Remove a virtual machine from SRM protection group

        .PARAMETER VM
        The virtual machine to unprotect.

        .PARAMETER ProtectionGroup
        The name of the protection group to remove the VM from.

        .INPUTS
        [VMware.VimAutomation.ViCore.Impl.V1.Inventory.VirtualMachineImpl]

    #>
    [CmdletBinding()]
    param(
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true
        )]
        [VMware.VimAutomation.ViCore.Impl.V1.Inventory.VirtualMachineImpl]
        $VM
        ,

        [parameter(Mandatory=$true)]
        [String]$ProtectionGroup
    )
    process {
        if ($global:DefaultSrmServers.length -lt 1) {
            throw "Not connected to SRM server"
        }
        $srmApi = $global:DefaultSrmServers[0].ExtensionData

        # get the protection group object
        $oProtectionGroup = $srmApi.Protection.ListProtectionGroups() |
            Where-Object {$_.GetInfo().Name -eq $ProtectionGroup}

        # update the protection group to unprotect the Vm
        $oProtectionGroup.UnprotectVms($VM.Id)
    }
}

When you want to remove a virtual machine from an SRM protection group, you can simply use the previous function to update its status:

Get-VM $vmName | Remove-SrmProtection -ProtectionGroup $groupName

Adding a Protection Group to a Recovery Plan

Since a protection group by itself only contains the virtual machines that you wish to associate together as a group, you want to make sure that the protection group has been added to a recovery plan. A recovery plan is where the action actually takes place, as this is where things like recovery order, pre- and post-scripts, and other actions can be configured. Unfortunately, at this time, there is no way to configure these settings using the API and PowerCLI, so you must rely on the GUI to perform them. However, you can, at least, ensure that the protection group is properly protected by the recovery plan. A function to do this in a single step is shown in Listing 22-9.

Listing 22-9: The Add-SrmProtectionGroup function

function Add-SrmProtectionGroup {
    <#  .SYNOPSIS
        Adds a protection group to a recovery plan.

        .EXAMPLE
        Add-SrmProtectionGroup -RecoveryPlan $planName -ProtectionGroup $groupName
        Add a group to a plan.

        .PARAMETER RecoveryPlan
        The name of the recovery plan to modify.

        .PARAMETER ProtectionGroup
        The name of the protection group to add.

        .OUTPUTS
        VMware.VimAutomation.Srm.Views.SrmRecoveryPlanInfo
    #>

    [CmdletBinding()]
    param(
        [parameter(Mandatory=$true)]
        [String]
        $RecoveryPlan
        ,

        [parameter(Mandatory=$true)]
        [String]
        $ProtectionGroup
    )
    process {
        if ($global:DefaultSrmServers.length -lt 1) {
            throw "Not connected to SRM server"
        }

        $srmApi = $global:DefaultSrmServers[0].ExtensionData

        # get the recovery plan from the API
        $plan = $srmApi.Recovery.ListPlans() | Where-Object {
            $_.GetInfo().Name -eq $RecoveryPlan
        }
        # get the protection group
        $group = $srmApi.Protection.ListProtectionGroups() | Where-Object {
            $_.GetInfo().Name -eq $ProtectionGroup
        }

        if ($plan.GetInfo().ProtectionGroups.MoRef -contains $group.MoRef) {
            # build one message string; Write-Warning accepts a single argument
            Write-Warning ("Protection group $($ProtectionGroup) is " +
                "already a member of $($RecoveryPlan).")
        } else {
            $plan.AddProtectionGroup( $group.MoRef )
            $plan.GetInfo()
        }
    }
}

You can now, simply and quickly, add a protection group to a recovery plan:

Add-SrmProtectionGroup -RecoveryPlan $plan -ProtectionGroup $group

This will add the group using default settings, so it’s still a good idea to verify, using the GUI, that the recovery plan is configured for the needs of your business. But it is a convenient method of ensuring that a minimum amount of protection is available for your virtual machines.
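If you also want a quick check from the command line, the sketch below lists the names of the protection groups attached to a plan. Get-SrmPlanGroupName is a hypothetical helper, and it assumes that the entries in the plan’s ProtectionGroups property expose the same GetInfo() method as the objects returned by ListProtectionGroups().

```powershell
# Hypothetical helper: return the names of the protection groups that a
# recovery plan object (as returned by ListPlans()) currently contains.
function Get-SrmPlanGroupName {
    param(
        [Parameter(Mandatory=$true)]
        $Plan
    )

    $Plan.GetInfo().ProtectionGroups |
        Where-Object { $_ } |              # ignore an empty group list
        ForEach-Object { $_.GetInfo().Name }
}

# Usage, assuming an SRM connection and a plan name in $planName:
# $srmApi = $global:DefaultSrmServers[0].ExtensionData
# $plan = $srmApi.Recovery.ListPlans() |
#     Where-Object { $_.GetInfo().Name -eq $planName }
# Get-SrmPlanGroupName -Plan $plan
```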

Testing an SRM Recovery Plan

Having a protection group added to a recovery plan is a huge step toward ensuring that your infrastructure and organization are prepared in the event that recovery becomes necessary. However, having a plan in place is not the same as having a tested plan that you know works. Many nights of sleep have been lost by administrators who believed their disaster recovery plans were solid, only to find out during execution that something failed and required significant manual intervention.

SRM lets you test recovery plans at the click of a button, so you can verify that virtual machines, and the services they provide, recover effectively at the destination. The test can also be executed through the SRM API, which is accessible from PowerCLI, so you can automate tests whenever you want. The function in Listing 22-10 lets you invoke test and cleanup cycles on demand; combined with the function in Listing 22-9, it lets you add a protection group to a recovery plan and then test that plan right away.

Listing 22-10: The Invoke-SrmPlanAction function

function Invoke-SrmPlanAction {
    <#  .SYNOPSIS
        Executes an operation on the specified SRM plan.

        .EXAMPLE
        Invoke-SrmPlanAction -RecoveryPlan $planName -Failover
        Begin a failover operation

        .EXAMPLE
        Invoke-SrmPlanAction -RecoveryPlan $planName -Test -WaitMinutes 30
        Begin a test, wait a maximum of 30 minutes for completion

        .EXAMPLE
        Invoke-SrmPlanAction -RecoveryPlan $planName -Cleanup
        Cleanup after a test operation

        .PARAMETER RecoveryPlan
        The name of the plan to execute.

        .PARAMETER Test
        Used to indicate a test operation.

        .PARAMETER Cleanup
        Used to indicate a cleanup operation.

        .PARAMETER Failover
        Used to indicate a failover operation.  WARNING, this
        will cause SRM to move operations to your recovery
        site.

        .PARAMETER Reprotect
        This will reverse the protection after a failover
        operation, making the former recovery site (now
        the active site) the protected site.

        .PARAMETER WaitMinutes
        The maximum number of minutes to wait for the test to complete.

        .INPUTS
        [System.String]

        .OUTPUTS
        [System.String]

    #>
    [CmdletBinding()]
    param (
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true,
            ParameterSetName="Test"
        )]
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true,
            ParameterSetName="Cleanup"
        )]
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true,
            ParameterSetName="Failover"
        )]
        [parameter(
            Mandatory=$true,
            ValueFromPipeline=$true,
            ParameterSetName="Reprotect"
        )]
        [Alias('Name')]
        [String]
        $RecoveryPlan
        ,
        [parameter(
            ParameterSetName="Test",
            Mandatory=$false
        )]
        [Switch]
        $Test
        ,

        [parameter(
            ParameterSetName="Cleanup",
            Mandatory=$false
        )]
        [Switch]
        $Cleanup
        ,

        [parameter(
            ParameterSetName="Failover",
            Mandatory=$false
        )]
        [Switch]
        $Failover
        ,

        [parameter(
            ParameterSetName="Reprotect",
            Mandatory=$false
        )]
        [Switch]
        $Reprotect
        ,
        [int]$WaitMinutes = 5

    )
    process {
        if ($global:DefaultSrmServers.length -lt 1) {
            throw "Not connected to SRM server"
        }

        $srmApi = $global:DefaultSrmServers[0].ExtensionData

        # get the recovery plan from the API
        $plan = $srmApi.Recovery.ListPlans() | Where-Object {
            $_.GetInfo().Name -eq $RecoveryPlan
        }

        $expectedState = ""
        $mode = 0
        switch ($true) {
            $Test {
                $expectedState = "Protecting"
                $mode = 1
            }

            $Cleanup {
                $expectedState = "NeedsCleanup"
                $mode = 2
            }

            $Failover {
                $expectedState = "Protecting"
                $mode = 0
            }

            $Reprotect {
                $expectedState = "FailedOver"
                $mode = 3
            }
        }

        if ($plan.GetInfo().State -eq $expectedState) {
            Write-Host "Starting operation..." -NoNewline
            $plan.Start($mode);

            # wait for the operation to finish/fail
            $start = Get-Date
            while ((New-TimeSpan $start).TotalMinutes -le $WaitMinutes ) {
                if ($plan.GetInfo().State -ne "Running") {
                    break
                }

                Write-Host "." -NoNewline
                Start-Sleep -Seconds 5
            }

            Write-Host "DONE!"

            # get result
            $planHistory = $srmApi.Recovery.GetHistory($plan.MoRef)

            $planHistory.GetRecoveryResult(1)[0]

        } else {
            Write-Warning "SRM plan is not in expected state."
        }
    }
}

When used to test a plan and then clean up afterward, the output will look similar to the following:

# begin plan test
Invoke-SrmPlanAction -RecoveryPlan $planName -Test
Executing test......................DONE!
Last result was: Success

# cleanup after the test
Invoke-SrmPlanAction -RecoveryPlan $planName -Cleanup
Starting cleanup.......DONE!
Cleanup result: Success

You can use this function, with its different actions, to execute a test followed by a cleanup in one simple, concise action. When you combine this with Windows scheduled tasks, as documented in Chapter 25, “Running Scripts,” and a bit of extra code to send an email report, you can set up a planned test to occur at regular intervals, such as every Monday morning. This ensures that you always have a validated recovery plan. You can be confident that in the event of an emergency, your services will come back online and remain available for your users.
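The “bit of extra code” for the email report could look something like the sketch below. New-SrmTestReport is a hypothetical helper that turns the objects returned by the test and cleanup runs into plain text. It uses Out-String rather than assuming any particular property names on the result objects, and the variables in the usage comments ($planName, $smtpServer, $from, $to) are placeholders you would supply.

```powershell
# Hypothetical helper: build a plain-text report from the objects returned
# by the test and cleanup runs of Invoke-SrmPlanAction (Listing 22-10).
function New-SrmTestReport {
    param(
        [Parameter(Mandatory=$true)]
        [string]$PlanName,
        $TestResult,
        $CleanupResult
    )

    # Out-String renders whatever the API returned without assuming
    # any particular property names
    $lines = @(
        "SRM test report for plan '$PlanName'"
        "Generated: $(Get-Date -Format 'yyyy-MM-dd HH:mm')"
        ""
        "Test result:"
        ($TestResult | Out-String).Trim()
        ""
        "Cleanup result:"
        ($CleanupResult | Out-String).Trim()
    )
    $lines -join [Environment]::NewLine
}

# Usage sketch; all variables below are placeholders:
# $test    = Invoke-SrmPlanAction -RecoveryPlan $planName -Test -WaitMinutes 60
# $cleanup = Invoke-SrmPlanAction -RecoveryPlan $planName -Cleanup -WaitMinutes 30
# $body    = New-SrmTestReport -PlanName $planName -TestResult $test -CleanupResult $cleanup
# Send-MailMessage -SmtpServer $smtpServer -From $from -To $to `
#     -Subject "SRM test: $planName" -Body $body
```

Saving the usage lines as a script and registering that script as a scheduled task gives you the recurring test described above.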

We hope that this chapter has demonstrated some of the actions that are available using the SRM API in vSphere 5.5 and vSphere 6.0. As time goes on, VMware is sure to add more functionality, enabling further management of storage replication adapters, protection groups, and recovery plans. For now, it is a huge benefit to be able to automate adding and removing virtual machines from recovery plans, and testing those plans to ensure that, as your virtual empire grows, your infrastructure’s ability to survive a disaster remains intact.
