Matt works as a datacenter fabric engineer for Contoso, a premier datacenter service provider located in the southeast United States. Matt is a disgruntled employee who plans to resign from his position next week after performing some activities he’s planned for the past several months.
First, Matt logs into a rack-mounted Hyper-V host that contains virtual machines owned by a local health care firm. He targets a virtual domain controller, stops the VM, and copies its VHD to his trusty USB thumb drive. He figures, correctly, that the health care firm has several domain controllers and won’t notice this virtual DC being offline for one hour.
Second, Matt copies the VHD to his personal laptop, where he uses community hacking software to launch an offline attack on the client’s Active Directory database.
Third, Matt logs into one of the client’s file server VMs and injects some malware that scans the server’s file system for sensitive data and transmits it to Matt’s offshore FTP account.
After all is said and done, Matt “owns” sensitive data from one of Contoso’s most important clients, and he has a back door to their information systems, available for Matt’s use whenever he wishes.
This is a nightmarish scenario, isn’t it? The sad fact is that it reflects reality. In this chapter we examine new Windows Server 2016 features that address the problem of separating duties between fabric and workload administrators. Specifically, we dive into Microsoft’s Guarded Fabric solution, which protects Hyper-V VM workloads against virtualization host administrators. The days of the hardware host having full, “keys to the kingdom” access to all guest VMs are rapidly coming to an end.
Skills in this chapter:
In Microsoft nomenclature, a Guarded Fabric is a Hyper-V virtualization infrastructure that provides granular, delegated access to all guest virtual machines (VMs). The key to understanding Microsoft’s vision of the Guarded Fabric is to grasp the separation of concerns between fabric and workload administrators.
In information technology nomenclature, fabric refers not to cloth, but to computing hardware that functions together in order to accomplish a business goal. For example, a stack of Hyper-V hosts mounted in a datacenter server rack is a good example of a fabric.
More to that point, a fabric administrator is a systems administrator who is charged with the maintenance of the fabric’s constituent hardware and system software. In a Hyper-V context, this normally allows the fabric administrator to perform actions like:
Start, restart, and service the host hardware
Start and stop the host’s virtual machines (VMs)
Notice that a fabric administrator may well have no right to actually log into those hosted VMs. That job role is normally reserved for the workload administrator. In IT, a workload refers to the type of work a computer is given to do. For instance, a Hyper-V server is called a virtualization host. You have other physical or virtual servers that run Active Directory, SQL Server, SharePoint Server, and so forth.
It’s true that in some businesses the fabric administrators and the workload administrators are the same people. Guarded Fabric isn’t appropriate for those scenarios. Instead, let’s examine how to create a hard security boundary line between those administrative roles to help your business remain in compliance with any governmental and/or industry regulations and SLAs you could be subject to.
This section covers how to:
Install and configure the Host Guardian Service
Configure admin- and TPM-trusted attestation
Configure Key Protection Service using HGS
Migrate shielded VMs to other guarded hosts
Building a trusted fabric for Hyper-V services involves deploying a Host Guardian Service (HGS) cluster. HGS is a new server role in Windows Server 2016. Figure 2-1 demonstrates its general workflow.
In Figure 2-1, observe the following points:
The Host Guardian Service (HGS) cluster exists in a separate Active Directory forest called a safe harbor forest. This creates a strong security and isolation boundary between the HGS cluster and your production forest.
You need to manually create a one-way external trust between the HGS forest and the production forest. The “resources trust accounts” directionality of the trust signifies that the HGS forest is at least theoretically willing to trust user and computer accounts from the production forest (more on that later).
Technically, the HGS server role provides two services that enable guarded hosts (also known as HGS clients) to run shielded virtual machines. For now, consider a shielded VM to be a protected VM; we’ll formally delve into shielded VMs specifically later in the chapter.
What are the two services?
Attestation The Host Guardian Service allows a shielded VM to be unlocked only after the identity and integrity of the requesting guarded host have been verified.
Key protection This service stores and releases the encryption keys that enable the shielded VM to transition between the encrypted and unencrypted states.
Note Vocabulary Alert - Attestation
The verb attest means “to be a witness to or to certify formally.” The noun attestation appears in HGS discussions because we need some way or ways for our HGS cluster to certify that guarded (shielded) Hyper-V VMs are authorized for use.
We recommend that you deploy at least three physical or virtual HGS servers to provide high availability. The reason for this is simple: you won’t be able to work with any shielded VMs unless the HGS cluster is available.
As you’ll see in a moment, each HGS server becomes a domain controller in a new, separate, “safe harbor” Active Directory Domain Services (AD DS) forest. For this reason, your hosts need to be configured for workgroup networking and not be a member of any other AD domain.
We do all Host Guardian Service setup and configuration through Windows PowerShell, so make sure that you’re logged onto the prospective HGS server as a local administrator and that you’ve started an elevated Windows PowerShell console session.
You need only a single line of PowerShell to install the Host Guardian Service server role:
Install-WindowsFeature -Name HostGuardianServiceRole -IncludeManagementTools -Restart
The Restart switch parameter reboots the server automatically once Install-WindowsFeature finishes, and this role does require a restart. You’ll also observe that PowerShell installs the Active Directory Domain Services, Failover Clustering, Web Server, and BitLocker Drive Encryption server roles or features in the bargain.
After you verify that the new HGS node isn’t already a member of an AD domain, you’re ready to set up the safe harbor forest:
$pw = ConvertTo-SecureString -String 'P@$$word!' -AsPlainText -Force
Install-HgsServer -HgsDomainName 'safe.local' -SafeModeAdministratorPassword $pw -Restart
When the server is back from its second reboot, you’ll have a brand-new domain controller in a brand-new forest. Make sure you log into the HGS server by using the new credentials (in our case, the username is safe\administrator) and not local creds.
Recall that the Host Guardian Service involves two services: attestation and encryption key transport. Take a close look at Figure 2-2, in which we explain in further detail how the Guarded Fabric solution works in practice.
In Figure 2-2 you see the following:
A workload administrator attempts to access a shielded VM, typically by using a Remote Desktop Protocol (RDP) connection
Because the VM is shielded and resides on a guarded host, the guarded Hyper-V host requests (a) host attestation; and (b) decryption keys from the HGS cluster
Depending on how attestation is configured, the guarded host is either allowed or not allowed to unlock the shielded VM
If the shielded VM is approved for unlock, the HGS transmits the decryption keys
The workload administrator continues his or her work as necessary
Meanwhile, the fabric (Hyper-V host) administrator is limited to turning the shielded VMs on or off. In a Guarded Fabric environment, fabric administrators cannot use tools like Virtual Machine Connection or PowerShell Direct to interact with shielded VMs. The shielded VMs, including their BitLocker-encrypted VHD files, are strictly off-limits to fabric administrators.
The guarded host attestation method you choose depends upon how current your HGS hardware is, as well as your need for high security. In Table 2-1 we compare the admin-trusted attestation and TPM-trusted attestation options.
We need to create a one-way external trust relationship between our safe.local HGS forest and our contoso.local production forest. Let’s do that by using the trusty netdom command-line tool:
netdom trust safe.local /domain:contoso.local /userD:contoso\Administrator /passwordD:P@$$w0rd /add
Although netdom is considered by many Windows systems administrators to be a legacy tool and you can perform the previous action by using PowerShell, I personally adhere to the mantra “Whatever works best.” If you saw the cumbersome, non-intuitive PowerShell that’s required to create that trust relationship, you’d understand.
We also need to ensure DNS name resolution between the production and HGS forests. Run the following PowerShell command on a DNS server in your production forest:
Add-DnsServerConditionalForwarderZone -Name 'safe.local' -ReplicationScope 'Forest' -MasterServers 10.0.0.2
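Before moving on, it can be worth confirming that the forwarder actually works. The following is a minimal sketch, assuming the zone name used above; run it on the same production DNS server.

```powershell
# Confirm the conditional forwarder zone exists on this DNS server
Get-DnsServerZone -Name 'safe.local'

# Confirm that the production forest can resolve the HGS forest's domain name
Resolve-DnsName -Name 'safe.local' -Type A
```

If the second command returns no records, check that the address given to -MasterServers really belongs to a DNS server in the HGS forest.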
Okay, let’s go back to our elevated Windows PowerShell session. We need to generate signing and encryption digital certificates for the cluster. In production you’d most likely use an Active Directory Certificate Services (AD CS) public key infrastructure (PKI). However, for test/dev/study purposes, self-signed certificates are just fine.
$certificatePassword = ConvertTo-SecureString -AsPlainText 'P@$$w0rd' -Force
$signingCert = New-SelfSignedCertificate -DnsName 'signing.safe.local'
Export-PfxCertificate -Cert $signingCert -Password $certificatePassword -FilePath 'C:\signingCert.pfx'
$encryptionCert = New-SelfSignedCertificate -DnsName 'encryption.safe.local'
Export-PfxCertificate -Cert $encryptionCert -Password $certificatePassword -FilePath 'C:\encryptionCert.pfx'
Now, initialize the HGS server:
Initialize-HgsServer -LogDirectory C:\ -HgsServiceName 'HGS' -Http -TrustActiveDirectory `
    -SigningCertificatePath 'C:\signingCert.pfx' -SigningCertificatePassword $certificatePassword `
    -EncryptionCertificatePath 'C:\encryptionCert.pfx' -EncryptionCertificatePassword $certificatePassword
That’s a lot of code. Here are the main takeaways:
The HgsServiceName is the host name of the Host Guardian Service cluster. Therefore, my safe.local DNS includes an entry for hgs.safe.local that refers to the cluster itself
The TrustActiveDirectory switch signifies admin-trusted attestation; the TrustTPM switch signifies TPM-trusted attestation
The pfx files contain the private and public keys for the certificates the cluster uses to encrypt and decrypt shielded VMs
We’re using admin-trusted attestation in our test lab. Therefore, we need to work with Active Directory global security groups in both forests.
Production forest We created a global security group called GuardedHostGroup that contains the hostname of our hyperv1.contoso.local Hyper-V server.
HGS forest We ran the following PowerShell command to add the GuardedHostGroup to the HGS cluster’s attestation group. This means that only Hyper-V hosts that reside in this group are allowed to work with shielded VMs.
Add-HgsAttestationHostGroup -Name 'GuardedHostGroup' -Identifier S-1-5-21-2964496017-1673051062-3633127581-1603
You can obtain the security identifier (SID) of the group by running Get-ADGroup on one of your fabric domain controllers. Finally, you can run the following command to verify that your first HGS host is correctly set up:
Get-HgsTrace -RunDiagnostics
A result of “Pass” is what you want to see. We’ll cover troubleshooting in the “Troubleshoot guarded hosts” section later in this section. For reference, Figure 2-3 shows you an example forest trust and AD security group setup.
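The SID passed to Add-HgsAttestationHostGroup earlier doesn’t have to be copied by hand. Here is a small sketch, assuming the GuardedHostGroup name from our lab and that the ActiveDirectory module is available where you run it:

```powershell
# Run in the production (fabric) forest, for example on a domain controller
Import-Module ActiveDirectory
$group = Get-ADGroup -Identity 'GuardedHostGroup'

# This string is what Add-HgsAttestationHostGroup expects for -Identifier
$group.SID.Value

# Later, on an HGS node, list the host groups the attestation service trusts
Get-HgsAttestationHostGroup
```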
As we’ve seen, in admin-trusted attestation we verify only that the guarded Hyper-V host belongs to the appropriate Active Directory security group.
TPM-trusted attestation kicks Guarded Fabric security up several notches. Specifically, we take advantage of hardware TPM, UEFI, and Secure Boot to ensure that your guarded hosts are in a “healthy” state and they run only trusted code.
The TPM-trusted attestation setup process actually involves Device Guard, which we covered in Chapter 1. Specifically, you need to capture the following information from each guarded host:
TPM identifier This is an endorsement key (EK) that uniquely identifies each Hyper-V guarded host.
TPM baseline Measurements of boot environment. If a single bit falls out of compliance, the guarded host is not able to start shielded VMs.
Code integrity policies This is Device Guard, where we whitelist digitally signed software that the Hyper-V guarded host can run. Any software not in the whitelist simply cannot run.
As you’d expect, we use Windows PowerShell to capture guarded host state and transfer this data to the HGS cluster.
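To make the three artifacts above more concrete, here is a hedged sketch of the capture-and-register flow. The C:\attest folder, the file names, and the policy display names are our own assumptions for illustration. The cmdlets and parameters are given from memory of Microsoft’s deployment guide, so verify them against that guide and your build before relying on them.

```powershell
# --- On the guarded Hyper-V host (the C:\attest folder is assumed to exist) ---

# 1. TPM identifier: export the endorsement key information as XML
(Get-PlatformIdentifier -Name 'HYPERV1').InnerXml |
    Out-File -FilePath 'C:\attest\HYPERV1-ek.xml' -Encoding UTF8

# 2. TPM baseline: capture the boot environment measurements
Get-HgsAttestationBaselinePolicy -Path 'C:\attest\HYPERV1-baseline.tcglog'

# 3. Code integrity policy: scan the host, then convert the XML to binary form
New-CIPolicy -Level FilePublisher -Fallback Hash -FilePath 'C:\attest\CIPolicy.xml' -UserPEs
ConvertFrom-CIPolicy -XmlFilePath 'C:\attest\CIPolicy.xml' -BinaryFilePath 'C:\attest\CIPolicy.p7b'

# --- On an HGS node, after copying the three files across ---
Add-HgsAttestationTpmHost   -Path 'C:\attest\HYPERV1-ek.xml'          -Name 'HYPERV1' -Force
Add-HgsAttestationTpmPolicy -Path 'C:\attest\HYPERV1-baseline.tcglog' -Name 'HYPERV1 baseline'
Add-HgsAttestationCIPolicy  -Path 'C:\attest\CIPolicy.p7b'            -Name 'HYPERV1 code integrity'
```

The TPM identifier is unique to each host, whereas a baseline and a code integrity policy can typically be shared among hosts with identical hardware and software.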
Need More Review? Set Up HGS In Your Test Lab
The purpose of this book is certification exam prep, not comprehensive lab procedures. To that point, the procedures given here should be considered as overviews rather than comprehensive step-by-steps.
Microsoft has published an exhaustive step-by-step Guarded Fabric deployment guide that fully covers both attestation scenarios. Download the “Guarded Fabric Deployment Guide for Windows Server 2016” whitepaper at http://timw.info/gf.
The Key Protection Service (KPS) is installed automatically when you install the Host Guardian Service server role. Whereas the Attestation service is all about validating the identity and integrity of trusted hosts, KPS is concerned with storing and transmitting encryption keys for use with shielded virtual machines.
You can verify the presence of your signing and encryption keys by running a PowerShell query like the following:
Get-ChildItem -Path Cert:\LocalMachine\My -DnsName *enc*
Get-ChildItem -Path Cert:\LocalMachine\My -DnsName *sign*
The preceding statements work fine in our environment because the DNS names of our two certs are encryption.safe.local and signing.safe.local, respectively.
We’ve seen a trend thus far that if your server hardware fabric is more capable (if you’ve invested capital in the latest and greatest tech), then you can step up your security benefits tremendously.
Take the hardware security module (HSM), for example. Like the TPM, this is a dedicated hardware cryptoprocessor. Unlike the TPM, however, the HSM is typically an aftermarket purchase and installation rather than a native enhancement to the system motherboard.
Let’s assume you installed an HSM on an HGS host—how can you migrate your existing certificates to the HSM? You can just use the Add-HgsKeyProtectionCertificate cmdlet. You then use Set-HgsKeyProtectionCertificate to make your HSM-backed keys the default ones to be used by each node in the cluster.
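As a rough sketch of those two cmdlets in use, assuming the HSM-backed certificate is already installed on the HGS node (the thumbprint below is a placeholder, not a real value):

```powershell
# Placeholder: substitute the thumbprint of your HSM-backed encryption certificate
$thumbprint = '<thumbprint of HSM-backed certificate>'

# Register the certificate with the Key Protection Service
Add-HgsKeyProtectionCertificate -CertificateType Encryption -Thumbprint $thumbprint

# Promote it to be the primary (default) encryption certificate
Set-HgsKeyProtectionCertificate -CertificateType Encryption -Thumbprint $thumbprint -IsPrimary

# Review what the cluster now holds
Get-HgsKeyProtectionCertificate
```

You would repeat the same pair of commands with -CertificateType Signing for the signing certificate.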
You may be thinking, “You showed me the basics of setting up a Host Guardian Service cluster, but are you going to teach me how to set up the HGS clients?”
Yes, indeed! Let’s move over to my HYPERV1.contoso.local member server. Recall that in our lab we’re using admin-trusted attestation, and that our HYPERV1 host exists in a special security group considered to be trusted by the HGS cluster.
Begin by installing the Hyper-V server and Host Guardian Client server roles and rebooting the server.
Install-WindowsFeature -Name Hyper-V, HostGuardian -IncludeManagementTools -Restart
HGS makes use of Representational State Transfer (REST) application programming interfaces (APIs) to perform the attestation and key transfer operations. That’s why the IIS Web Server was installed when you set up your HGS cluster.
You can retrieve your HGS cluster’s attestation and key protection URLs by running Get-HgsServer on any of your HGS cluster nodes.
On the guarded host, run the following PowerShell command from an elevated console:
Set-HgsClientConfiguration -AttestationServerUrl 'http://hgs.safe.local/Attestation' `
    -KeyProtectionServerUrl 'http://hgs.safe.local/KeyProtection'
In the previous example, recall that hgs.safe.local is the cluster’s fully qualified domain name (FQDN). Just because we have only one HGS host in our fabric doesn’t mean you should do the same. According to Microsoft best practices, your HGS cluster should include at least three hosts.
Note Managing the Host Guardian Service in the Enterprise
If you think all this system-by-system PowerShell configuration is tedious, you’re certainly not alone. The good news is that Microsoft is building HGS-aware tooling into both System Center 2016 Virtual Machine Manager (SCVMM) and Azure Stack to manage HGS clusters and shielded VMs.
In case you are unaware, SCVMM is a core component of the Microsoft private cloud stack. Azure Stack is an on-premises version of the Azure public cloud.
To initiate an attestation attempt on the guarded host, run the following PowerShell command and take note of the output:
PS C:\> Get-HgsClientConfiguration
IsHostGuarded : True
Mode : HostGuardianService
KeyProtectionServerUrl : http://hgs.safe.local/KeyProtection
AttestationServerUrl : http://hgs.safe.local/Attestation
AttestationOperationMode : ActiveDirectory
AttestationStatus : Passed
AttestationSubstatus : NoInformation
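If you need to check many hosts, reading that output by eye gets old quickly. The following minimal sketch tests the same properties shown in the output above; the message text is our own.

```powershell
# Read the HGS client configuration once and evaluate the key properties
$config = Get-HgsClientConfiguration

if ($config.IsHostGuarded -and $config.AttestationStatus -eq 'Passed') {
    Write-Output 'Host is guarded and attestation passed.'
}
else {
    # Surface both status fields to help with troubleshooting
    Write-Warning "Attestation problem: $($config.AttestationStatus) / $($config.AttestationSubstatus)"
}
```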
Have you ever heard the age-old expression, “You have to learn how to crawl before you can walk?” That’s kind of how we feel about this particular exam objective. Why cover how to migrate shielded VMs to other guarded hosts when you haven’t formally been introduced to shielded VM setup yet?
At any rate, here we go. We’ll actually start from an exceedingly common scenario in which we have an unshielded Generation 2 VM running on a Hyper-V host that is presently not a guarded host, and we want the VM to move to the guarded host and take on shielding. Microsoft calls the process of shielding an existing VM grandfathering.
Here’s the high-level procedure; again, we’re taking these steps on a Hyper-V host that is not part of our Guarded Fabric. The VM in question is a Generation 2 Windows Server 2016 system named vs1.contoso.local:
1. Retrieve the HGS guardian metadata from the HGS server. The output, which allows us to create a key protector for the VM, is an extensible markup language (XML) file that needs to be copied to the non-guarded Hyper-V host. The command to do this is as follows:
Invoke-WebRequest http://hgs.safe.local/KeyProtection/service/metadata/2014-07/metadata.xml -OutFile C:\HGSGuardian.xml
2. The following PowerShell command sequence, which is annotated with in-line comments, performs the shielding operation on the vs1 VM. The Key Protector that is created contains one owner guardian, and one or more HGS guardians (thanks to the Microsoft Datacenter and Private Cloud Security team for authoring this script).
# vs1 is the VM name to be shielded
$VMName = 'vs1.contoso.local'
# Turn off the VM first. You can only shield a VM when it is powered off
Stop-VM -VMName $VMName
# Create an owner self-signed certificate
$Owner = New-HgsGuardian -Name 'Owner' -GenerateCertificates
# Import the HGS guardian
$Guardian = Import-HgsGuardian -Path 'C:\HGSGuardian.xml' -Name 'TestFabric' -AllowUntrustedRoot
# Create a Key Protector, which defines the fabric that’s allowed to run this shielded VM
$KP = New-HgsKeyProtector -Owner $Owner -Guardian $Guardian -AllowUntrustedRoot
# Enable shielding on the VM
Set-VMKeyProtector -VMName $VMName -KeyProtector $KP.RawData
# Set the security policy of the VM to be shielded
Set-VMSecurityPolicy -VMName $VMName -Shielded $true
# Enable vTPM on the VM
Enable-VMTPM -VMName $VMName
Note Generation 1 Need Not Apply
The Host Guardian Service can work with only Generation 2 virtual machines that use the .VHDX virtual hard disk file format. Generation 2 VMs are required to support UEFI firmware and virtual TPM, among other features.
Because Microsoft Azure supports only Generation 1 .VHD VMs, Host Guardian Service does not work in Azure. This is likely to change, given that the Azure development team ships several new features every single business day.
3. Because you won’t get console access to the VM once it becomes shielded, it’s crucial that you prepare the unshielded VM beforehand. This involves setting appropriate rules in Windows Firewall, configuring WSMan remote management, and (most importantly) enabling BitLocker Drive Encryption on the VM’s virtual hard drive.
4. Export the VM from the tenant host, and import it on a guarded Hyper-V host.
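Step 4 can be sketched as follows. The D:\Export folder is a hypothetical path, and the sketch assumes you copy the exported folder to the guarded host yourself between the two halves.

```powershell
# --- On the non-guarded (tenant) host: export the powered-off VM ---
Export-VM -Name 'vs1.contoso.local' -Path 'D:\Export'

# --- On the guarded host, after copying D:\Export\vs1.contoso.local across ---
# Locate the exported configuration file, whose name is the VM's GUID
$vmcx = Get-ChildItem -Path 'D:\Export\vs1.contoso.local\Virtual Machines' -Filter '*.vmcx' |
    Select-Object -First 1

# Import a copy of the VM onto this host
Import-VM -Path $vmcx.FullName -Copy
```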
The VM is now shielded and thereby protected against fabric administrators. Let’s test that so we can get the full experience!
Let’s pretend that we’re a fabric administrator and we’re poking around the host operating system on a guarded Hyper-V host. As you can see in Figure 2-4:
Shielded VMs don’t display a live thumbnail preview in Hyper-V Manager. Instead, you see a pale grey rectangle.
Shielded VMs don’t allow direct connection via the Hyper-V Virtual Machine Connection tool.
Likewise, you’ll receive an “access denied” error if you shut down the shielded VM, open Disk Management on the guarded host and attempt to mount a shielded VM’s virtual hard drive. This is BitLocker Drive Encryption at work, and the error is shown in Figure 2-5.
Here are the details that explain the annotations in Figure 2-5:
A As long as the shielded VM is powered off and the VHDX is thereby unlocked (figuratively speaking), you can attach it to your host system.
B The shielded VM disk volumes actually are listed, but notice the BitLocker Drive Encryption label.
C Sure enough, BitLocker isn’t going to allow any type of offline attack.
We’ll have much more to say about virtual TPM in the next section, but for now you should be able to view a shielded VM’s TPM data by opening the TPM Management console from within the shielded VM. As you can see in Figure 2-6, the TPM shows Microsoft as the manufacturer, and 2.0 as the specification version.
The great beauty of Guarded Fabric is that as long as you want to migrate a shielded VM from one guarded host to another within a single HGS cluster, you can use any of the standard VM migration methods, including:
Live migration (with or without shared storage)
Hyper-V replica
VM checkpoints
Hyper-V export/import
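As one example, a shared-nothing live migration might look like the sketch below. The destination host name and storage path are hypothetical, and the sketch assumes live migration is already enabled on both hosts and that both attest against the same HGS cluster.

```powershell
# Move the running shielded VM and its storage to a second guarded host
Move-VM -Name 'vs1.contoso.local' `
    -DestinationHost 'hyperv2.contoso.local' `
    -IncludeStorage `
    -DestinationStoragePath 'D:\VMs\vs1'
```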
Let’s get some of the preliminary queries out of the way first. You can determine whether a VM is shielded by running the Get-VMSecurity command on the guarded host:
PS C:\> Get-VMSecurity -VMName 'vs1.contoso.local'
TpmEnabled : True
Shielded : True
EncryptStateAndVmMigrationTraffic : True
CimSession : CimSession: .
ComputerName : HYPERV1
IsDeleted : False
You can also verify the VM’s status by examining the VM’s properties in Hyper-V Manager, as shown in Figure 2-7. Notice that the fabric administrator cannot disable shielding; all options on the page are unavailable.
You can determine whether a host is guarded by verifying that the Host Guardian Hyper-V Support feature is enabled on the system. Here’s the PowerShell and resulting output:
PS C:\> Get-WindowsFeature -Name HostGuardian
Display Name Name Install State
------------ ---- -------------
[X] Host Guardian Hyper-V Support HostGuardian Installed
Next, perform the final confirmation by inspecting the host’s HGS client configuration, paying particular attention to the IsHostGuarded property:
PS C:\> Get-HgsClientConfiguration
IsHostGuarded : True
Mode : HostGuardianService
KeyProtectionServerUrl : http://hgs1.safe.local/KeyProtection
AttestationServerUrl : http://hgs1.safe.local/Attestation
AttestationOperationMode : ActiveDirectory
AttestationStatus : Passed
AttestationSubstatus : NoInformation
Finally, we have the sticky situation of RDP sessions failing to a shielded VM and all local console access blocked. Add BitLocker to the equation, and you might think both the workload and fabric admins are hosed.
Not so fast. Shielded VM recovery is possible. Try to remove shielding from the VM on the guarded host:
Set-VMSecurityPolicy -VMName 'vs1.contoso.local' -Shielded $false
That’s going to fail, for obvious reasons. According to Microsoft’s “Guarded Fabric and Shielded VMs Troubleshooting Guide” white paper, you can try the following:
1. Export the shielded VM from the guarded host and import it on a host along with the owner’s guardian key.
2. On the second host, run the previous PowerShell command to disable shielding.
3. Make whatever modifications you need to repair the VM’s configuration.
Shielded VM recovery is a proverbial “sticky wicket” because any backdoor that defeats a security system can also be abused by bad actors. Later in the chapter we cover how you can use encryption-supported VMs to make troubleshooting and recovery a bit more flexible.
In covering the Host Guardian Service, we laid the groundwork for a trusted computing platform with Hyper-V shielded virtual machines. Now let’s turn our attention formally to this subject.
This section covers how to:
Determine requirements and scenarios for implementing shielded VMs
Create a shielded VM using the Hyper-V environment
Enable and configure vTPM to allow operating system and data disk encryption within a VM
Determine requirements and scenarios for implementing encryption-supported VMs
Shielded VM connections and recovery
First of all, we probably ought to formally define a shielded VM. A shielded virtual machine is a Generation 2 Hyper-V virtual machine running Windows Server 2012 R2, Windows Server 2016, or Linux that uses a variety of current-generation technologies, including virtualization based security (VBS) and BitLocker Drive Encryption, to protect its contents from fabric administrators. Workload administrators use RDP and PowerShell remoting to access the VM as they normally would.
Note Did You Say Windows Server 2012R2? And Linux?
Windows Server 2012 R2 supports Generation 2 VMs, so you can deploy Windows Server 2012 R2–based shielded virtual machines on Windows Server 2016 Hyper-V hosts.
Although the documentation is sketchy as of this writing, Windows Server 2016 supports Linux-based Hyper-V shielded VMs as well. Linux supports TPM, UEFI, and Secure Boot, but not BitLocker Drive Encryption. To that end, Microsoft plans to employ the dm-crypt disk encryption subsystem to provide whole-disk encryption for Linux-based shielded VMs.
We need to be clear here in stating that the problem of the untrusted fabric admin is in no way unique to Microsoft Hyper-V scenarios. Wherever we have a hypervisor, be it VMware, Citrix Xen, KVM, whatever, this issue exists. The difference is that the Host Guardian Service represents a Microsoft-and-Hyper-V-centric solution to the problem.
Moreover, remember that Microsoft’s current and future security focus is on embracing an “assume breach” posture. For example, let’s imagine that our Hyper-V hardware hosts have been compromised by malware. Let’s go further and conceive that the malware has elevated permissions to that of fabric administrator. Those virtual machines suddenly become pretty darned vulnerable, don’t they?
So what we’re saying is that the need for shielded virtual machines is just as much about protecting our VMs from the host itself as it is about protecting those resources from rogue fabric administrators.
The fact that Windows Server 2016 supports nested virtualization is a big deal. Recall that nested virtualization allows you to set up a Hyper-V VM as a virtualized Hyper-V host itself. That might seem like simply an intellectual exercise or a parlor trick, but not so fast.
Remember in the previous chapter when we discussed Credential Guard? That’s VBS, and it’s the technology behind virtual TPM (vTPM). We can take advantage of this Virtual Secure Mode (VSM, and yes, we’re dealing with far too many three-letter acronyms (TLAs)) both at the host and guest operating system level. At the guest level, VBS protects Windows Server 2016 VMs against “pass the hash” or “pass the ticket” memory attacks.
The purpose of shielded VMs is to ensure that their VHDX virtual hard disk files as well as their configuration data are protected (shielded) from fabric administrators. Fabric administrators fail at any of the following attempts to access a shielded VM from the host:
VM Connect console access
RDP access (unless they have guest operating system credentials)
PowerShell Direct
The guarded hosts are part of an HGS guarded fabric; Windows PowerShell Just Enough Administration (JEA) protects the endpoints on guarded hosts. But what about the workload administrators?
The good news is that workload administrators can use their standard methods for interacting with the shielded VMs to which they legitimately have access:
RDP
PowerShell remoting
Remote Server Administration Tools (RSAT)
Browser-based connectivity
The worst case scenario for a workload administrator is deploying a shielded VM without first ensuring that remote management is enabled and working on those VMs. We’ll cover options for remediating this very real problem later in this section.
As of this writing, fabric administrators have two methods for creating shielded VMs:
Converting an existing, non-shielded VM This process is called “grandfathering,” and we covered it earlier in the chapter
Using a template VHDX This is the preferred way to deploy shielded virtual machines because it subscribes to the “clean source” security principle and the VM is protected over its entire lifecycle
Let’s face it: performing any task manually is a pain in the neck, and prone to costly human error. If your company is on-board with virtualization, then adding administrative automation and orchestration to the mix is a no-brainer.
System Center 2016 Virtual Machine Manager is Microsoft’s primary tool for fabric administrators to centrally manage Hyper-V hosts and VM templates. As you can see in Figure 2-8, SCVMM 2016 is fully enlightened with regard to shielded VMs, and enables you to store pre-shielded templates for rapid deployment.
Azure Stack is a forthcoming Microsoft solution that packages the Azure public cloud services (mainly infrastructure-as-a-service (IaaS), but also some platform-as-a-service (PaaS)) for use on-premises. Once again, Azure Stack, whose release date is scheduled for late 2017 as of this writing, is fully shielded VM-aware and allows you to deploy and manage shielded VMs.
Let’s work through how we can deploy a new VM with shielding in the absence of a fabric-management tool such as SCVMM or Azure Stack.
We’ll work from a hyperv1.contoso.local guarded Hyper-V host; we can (and should) run Get-HgsClientConfiguration to make sure our link to the Host Guardian Service cluster is still active and problem-free.
Specifically, we’re going to deploy a new shielded VM by creating the following artifacts on our host:
signed template VHDX
shielding data file
Unattend.xml answer file
Don’t worry, you’ll understand those previous three items momentarily. The first thing we need to do on our Hyper-V host is to install the Shielded VM Tools server feature. As usual, we’ll use Windows PowerShell exclusively:
Install-WindowsFeature -Name RSAT-Shielded-VM-Tools
You’ll need to have a new Generation 2 VM ready to rock; the virtual hard disk file in our example is named template.vhdx.
You should perform the following actions in the guest VM in order to prepare it for shielding:
Enable and configure RDP and PowerShell remoting
Configure Windows Firewall in correspondence with network security policies
Encrypt the disk with BitLocker Drive Encryption
If you plan to reuse the VHDX template, you’ll want to sysprep and shut down the VM before proceeding.
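The remote management part of that preparation can be sketched as below, run inside the guest. This is a minimal example that assumes default firewall rule group names on an English-language installation. It deliberately leaves out the BitLocker step and any firewall tightening your network security policies call for.

```powershell
# Enable PowerShell remoting (WinRM listener plus its firewall exceptions)
Enable-PSRemoting -Force

# Allow inbound Remote Desktop connections
Set-ItemProperty -Path 'HKLM:\System\CurrentControlSet\Control\Terminal Server' `
    -Name 'fDenyTSConnections' -Value 0
Enable-NetFirewallRule -DisplayGroup 'Remote Desktop'

# Only if the disk will be reused as a template: generalize and shut down
& "$env:SystemRoot\System32\Sysprep\sysprep.exe" /generalize /oobe /shutdown
```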
Digital certificates remain the primary method for providing authentication, integrity, and confidentiality. To that end, we need to digitally sign our unshielded template disk. In production you’d use a valid certificate that’s trusted throughout your organization; here we’ll create a self-signed cert:
$cert = New-SelfSignedCertificate -DnsName 'signing.contoso.com'
Protect-ServerVHDX -Path 'C:\vms\template.vhdx' -TemplateName 'ServerOSTemplate' `
    -Version 1.0.0.1 -Certificate $cert
Note The Ever-Volatile Nature Of Windows
Don’t be surprised if you try the Protect-ServerVHDX command on your Windows Server 2016 server and receive an error from the PowerShell engine. This cmdlet could very well be renamed to Protect-TemplateDisk.
You could see a similar shift from the Protect-ShieldingDataFile cmdlet to its supposed, eventual New-ShieldingDataFile replacement.
We need to keep in mind that today’s rapid development/continuous delivery IT service model means that what’s true today with Microsoft technology can be renamed (at the least), or removed or replaced (at the worst), with minimal advance notice from the Microsoft development teams.
Now we get to the heart of the matter. If the (untrusted) fabric administrator is tasked with deploying a shielded VM for his or her client, then how can we prevent that fabric administrator from viewing VM-specific secrets such as the guest operating system administrator password? This seems like a “chicken vs. egg” proposition at first.
The answer lies in the shielding data file, which is also called a provisioning data file or PDK file. The PDK file is essentially an encrypted collection of secrets that allows you to shield the VM, link the VM to your HGS cluster, and keep sensitive data out of the reach of the fabric admin who is provisioning the VM in the first place.
The idea is that it is the workload administrators who create the shielding data; these admins can then hand off the encrypted PDK file to the fabric admins who actually provision the shielded VM.
The following annotated PowerShell sample walks you through the shielding data creation workflow. Thanks to the Microsoft Datacenter and Private Cloud Security team for providing the code for us to adapt:
# Create a volume signature catalog file for the template disk;
# this ensures the template disk is not being tampered with at deployment time
Save-VolumeSignatureCatalog -TemplateDiskPath '.\template.contoso.local.vhdx' `
    -VolumeSignatureCatalogPath '.\ServerOSTemplate.vsc'
# Create an owner certificate
$Owner = New-HgsGuardian -Name 'Owner' -GenerateCertificates
# Import the HGS guardian
$Guardian = Import-HgsGuardian -Path '.\HGSGuardian.xml' -Name 'TestFabric' -AllowUntrustedRoot
# Create the PDK file on the tenant host server
Protect-ShieldingDataFile -ShieldingDataFilePath 'template.contoso.local.pdk' `
    -Owner $Owner -Guardian $Guardian `
    -VolumeIDQualifier (New-VolumeIDQualifier -VolumeSignatureCatalogFilePath '.\ServerOSTemplate.vsc' -VersionRule Equals) `
    -WindowsUnattendFile '.\unattend.xml' -Policy Shielded
We want you to understand the PowerShell command workflow, but don’t expect that the code always runs exactly as recorded in this book. Instead, use the Microsoft TechNet documentation because the product teams regularly update it as technology evolves.
In the Protect-ShieldingDataFile statement you saw a WindowsUnattendFile parameter; this answer file contains important secrets that pertain to the new VM, including:
local administrator password
time zone
AD DS domain join metadata
RDP certificate thumbprint and .pfx password
If you’re not yet up-to-speed with Windows PowerShell (although you certainly must be if you’re to pass the 70-744 certification exam), the Shielded VM Tools feature does have a graphical Shielding Data File Wizard located at C:\Windows\System32\ShieldingDataFileWizard.exe. Figure 2-9 depicts the user interface.
At this point the fabric administrator completes the process by placing the signed and encrypted template VHDX file and encrypted shielding data PDK file on a guarded Hyper-V host, and then initializing the shielded VM by using the shielding data file.
What’s so cool about virtual Trusted Platform Module (vTPM) is that we can use TPM technology on our Hyper-V VMs even if the hardware host doesn’t have a physical TPM. Of course, the best-case scenario is that your Hyper-V hosts all have on-board TPMs and potential hardware security modules (HSMs) as well.
The “secret sauce” behind vTPM is what Microsoft calls Isolated User Mode (IUM). Take a look at Figure 2-10, and we’ll expand upon what we covered about IUM.
Notice in Figure 2-10 that historically the LSASS process stores credentials in unprotected memory space. This, of course, opens the system to memory attacks and credential theft. As long as you have Hyper-V running on your Windows Server 2016 servers, the operating system can store secrets in strongly-isolated memory space.
Exam Tip
Microsoft is infamous for spontaneously and repeatedly changing product and technology names. On the 70-744 exam you could see references to Virtual Secure Mode (VSM), Isolated User Mode (IUM), or Virtualization-Based Security (VBS). These acronyms all mean the same thing.
Use the following procedure to enable vTPM on a new unshielded Hyper-V VM.
1. Recall that the “guardian” in a Host Guardian Service context refers to the HGS cluster; specifically, its certificate-based key. We’ll store our guardian (which is, unfortunately, named Owner) in a variable named $owner.
$owner = Get-HgsGuardian -Name Owner
2. Next we’ll generate a key protector and then associate it with our VM:
$kp = New-HgsKeyProtector -Owner $owner -AllowUntrustedRoot
Set-VMKeyProtector -VMName 'server02.contoso.local' -KeyProtector $kp.RawData
Throughout this chapter we’ve used self-signed certificates for simplicity. In production you’d use your VM owner digital certificate and omit the -AllowUntrustedRoot switch parameter in your New-HgsKeyProtector statement. Recall that a key protector defines on which guarded fabrics a shielded VM is allowed to run.
3. Finally, we switch on vTPM in the VM:
Enable-VMTPM -VMName 'server02.contoso.local'
You can now toggle vTPM support in the Settings page of the VM in Hyper-V Manager, as shown in Figure 2-11.
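If you’d rather confirm the result from PowerShell than from Hyper-V Manager, a quick check along these lines should do. It assumes the same server02.contoso.local VM used in the procedure and that you run it on the Hyper-V host.

```powershell
# Show the security-related settings of the VM; TpmEnabled should read True
# after Enable-VMTPM has run
Get-VMSecurity -VMName 'server02.contoso.local' |
    Format-List -Property TpmEnabled, Shielded, EncryptStateAndVmMigrationTraffic
```

At this stage we’d expect Shielded to remain False, because we have enabled vTPM but haven’t applied a shielding policy to the VM.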
BitLocker Drive Encryption ensures that your shielded VM’s VHDX files are secure when at rest. The shielding data file that contains VM-specific secrets is encrypted as well.
But what about when shielded VMs (including their memory state, configuration, as well as virtual hard disks) are in transit, for example during a live migration?
The good news is that the VM’s vTPM is as portable as the rest of the VM. This means that your shielded VMs remain protected even when their data is transmitted over the network.
Recall that VM live migration can take place both in failover clusters and in stand-alone scenarios, and shielded VMs work perfectly fine in either one. However, the source and destination hosts need to be valid members of your guarded fabric.
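Here is a hedged sketch of a stand-alone (shared-nothing) live migration that first confirms the destination is a healthy guarded host. The destination host name hyperv2.contoso.local and the storage path are hypothetical, and we assume live migration and PowerShell remoting are already configured between the two hosts.

```powershell
$vmName      = 'server02.contoso.local'
$destination = 'hyperv2.contoso.local'   # hypothetical second guarded host

# Ask the destination host whether it is guarded and has passed attestation.
# The comparison runs remotely so that only a simple Boolean comes back.
$destinationReady = Invoke-Command -ComputerName $destination -ScriptBlock {
    $config = Get-HgsClientConfiguration
    $config.IsHostGuarded -and ($config.AttestationStatus -eq 'Passed')
}

if ($destinationReady) {
    # Move the running VM together with its storage
    Move-VM -Name $vmName -DestinationHost $destination `
        -IncludeStorage -DestinationStoragePath 'D:\VMs\server02'   # hypothetical path
}
else {
    Write-Warning "$destination has not passed HGS attestation; migration skipped."
}
```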
While we’re on the subject, each Hyper-V VM runs as a worker process named vmwp.exe on the host. Shielded VMs have “hardened” VM worker processes that prevent illicit tampering from fabric administrators; for instance, attaching a debugger to the process.
Everything we’ve done thus far in this chapter pertains to “classic” shielded VMs. We also noticed an Achilles’ heel in the technology: there is no built-in recovery method to provide console access to the VM if workload administrators forgot to configure remote management in the virtual machine before shielding it.
Microsoft offers the encryption-supported VM option for businesses who:
trust their fabric administrators
require console access to Hyper-V VMs
can meet their compliance requirements without full VM shielding
Let’s compare shielded vs. encryption-supported virtual machines with respect to how they relate to core Windows Server 2016 security features. Check out Table 2-2:
The anticlimax here is that the process of creating an encryption-supported VM is nearly identical to that of creating a shielded VM. Let’s run through the procedure of “grandfathering” an existing unshielded VM named server10.contoso.local on our guarded host named hyperv1.contoso.local.
1. First, we pull down the HGS guardian metadata to our workload (tenant) server. Recall that this is the HGSGuardian.xml file we worked with earlier.
2. We make sure the VM to be shielded is stopped.
3. We create a variable to hold our owner (workload administrator) certificate.
4. We import the HGS guardian metadata from the previously downloaded HGSGuardian.xml file.
5. We create the key protector, which links the owner and the guardian together.
6. We run Set-VMKeyProtector to assign the key protector to the VM.
7. Finally, we set the security policy (this is where the difference comes in):
Set-VMSecurityPolicy -VMName 'server10.contoso.local' -Shielded $false
Although this is in no way obvious or intuitive, it all comes down to the value of the Shielded parameter. If $true, then the VM is shielded. If $false, then the VM is encryption-supported.
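The seven steps can be sketched end to end as follows. Treat this as an outline under stated assumptions, not a definitive script: the HGS host name hgs.contoso.local is hypothetical, the guardian names Owner and TestFabric are the ones used earlier in the chapter, and -AllowUntrustedRoot is present only because we’ve been working with self-signed certificates.

```powershell
$vmName    = 'server10.contoso.local'
$hgsServer = 'hgs.contoso.local'   # hypothetical HGS cluster name

# Step 1: download the HGS guardian metadata
Invoke-WebRequest -Uri "http://$hgsServer/KeyProtection/service/metadata/2014-07/metadata.xml" `
    -OutFile '.\HGSGuardian.xml'

# Step 2: make sure the VM is stopped
Stop-VM -Name $vmName

# Step 3: retrieve the owner (workload administrator) guardian created earlier
$owner = Get-HgsGuardian -Name 'Owner'

# Step 4: import the HGS guardian metadata.
# If 'TestFabric' was already imported on this machine, use
# Get-HgsGuardian -Name 'TestFabric' instead.
$guardian = Import-HgsGuardian -Path '.\HGSGuardian.xml' -Name 'TestFabric' -AllowUntrustedRoot

# Step 5: create the key protector that links owner and guardian
$kp = New-HgsKeyProtector -Owner $owner -Guardian $guardian -AllowUntrustedRoot

# Step 6: assign the key protector to the VM
Set-VMKeyProtector -VMName $vmName -KeyProtector $kp.RawData

# Step 7: set the security policy; $false means encryption-supported
Set-VMSecurityPolicy -VMName $vmName -Shielded $false
```

Flip the last line to -Shielded $true and the very same workflow produces a fully shielded VM instead.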
As of this writing, Microsoft doesn’t have a solid story on recovering from an inaccessible shielded virtual machine. Perhaps a workload administrator accidentally disabled remote connections. Perhaps the firewall was misconfigured. The reasons stack up as to why a particular VM is no longer accessible. Of course, if this is an encryption-supported VM, a fabric administrator can gain host-level access to the VM as previously described. But fully shielded VMs are another story.
The Microsoft Datacenter and Private Cloud Security team put up a blog post called “Step by Step - Shielded VM Recovery” (http://timw.info/svm) that cleverly takes advantage of the nested virtualization feature of Windows Server 2016 Hyper-V to get around the console access problem.
Have a look at the illustration in Figure 2-12 and let’s make sense of the recovery approach.
In Figure 2-12, we start from the perspective of a Hyper-V hardware host that’s connected to an Internal Hyper-V switch. We create a dedicated, shielded recovery VM that has nested virtualization enabled. Incidentally, you can enable nested virtualization on a VM by running the following PowerShell command from the host:
Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $true
It’s important for you to know that you must disable dynamic memory on the virtual machine, and that you need to allocate enough host RAM to cover any nested VMs you plan to run on the virtual Hyper-V host.
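Putting those requirements together, a minimal sketch for preparing the recovery VM might look like this. The VM name recovery01 and the 8 GB memory figure are placeholders we made up; size the memory to cover whatever nested VMs you intend to run. Note that the VM must be powered off before you change its processor and memory settings.

```powershell
$recoveryVm = 'recovery01'   # hypothetical name for the recovery VM

# The VM must be off before these settings can be changed
Stop-VM -Name $recoveryVm

# Expose the processor's virtualization extensions to the guest
Set-VMProcessor -VMName $recoveryVm -ExposeVirtualizationExtensions $true

# Turn off dynamic memory and assign a fixed amount of RAM
Set-VMMemory -VMName $recoveryVm -DynamicMemoryEnabled $false -StartupBytes 8GB

Start-VM -Name $recoveryVm
```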
Note Other Uses for Nested Virtualization
Nested virtualization refers to the capacity of a virtual machine to become a virtualization host itself. This is a feature that customers asked Microsoft about for many years, and it’s great that we finally have it in Windows Server 2016.
With host hardware being so powerful nowadays, it makes sense to deploy virtualized Hyper-V hosts. Going further, shared storage has become much more affordable in Windows Server 2016, so it’s almost trivial to deploy highly available virtual machines that themselves spring from the nested virtualization scenario. Finally, in today’s age of rapid application development and continuous integration, developers appreciate being able to deploy “second level” VM pods from “first level” VMs to which they have access.
Okay. So we’ve created a shielded recovery VM (with nested virtualization enabled) that’s also connected to the aforementioned internal Hyper-V switch. Part of this scenario involves the understanding that the workload admins and fabric admins need to work cooperatively to enact this solution, and that the fabric admins don’t get to tap into the troubled VM.
The fabric admin is responsible for deploying the recovery VM and exporting the troubled VM’s VHDX file(s).
The workload admin then RDPs into the recovery VM and imports the troubled VM as a nested virtual machine. The workload admin then uses PowerShell to change the nested shielded VM’s security policy to encryption-supported.
The workload admin then establishes a VM Connect console session from the recovery VM to the nested, troubled VM and fixes whatever problems were present.
Finally, the fabric admin restores the previously troubled VM to the fabric and deletes the recovery VM.
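The workload admin’s portion of that workflow can be sketched roughly as shown here. Several things are assumptions on our part: the folder D:\Recovery is a hypothetical location where the fabric admin placed the exported VM files, the export includes the VM configuration (.vmcx) file, and the owner certificates needed to unlock the VM are already available inside the recovery VM. Consult the blog post for the authoritative procedure.

```powershell
# Run inside the recovery VM (the nested Hyper-V host), as the workload admin.

# Locate the exported VM configuration file under the hypothetical drop folder
$vmcx = Get-ChildItem -Path 'D:\Recovery' -Filter '*.vmcx' -Recurse |
    Select-Object -First 1
if (-not $vmcx) { throw 'No exported VM configuration file found under D:\Recovery.' }

# Register the troubled VM as a nested virtual machine
$vm = Import-VM -Path $vmcx.FullName

# Relax the policy from shielded to encryption-supported to regain console access
Set-VMSecurityPolicy -VM $vm -Shielded $false

# Start the nested VM and open a VM Connect console session to it
Start-VM -VM $vm
vmconnect.exe localhost $vm.Name
```

After fixing the problem, the workload admin would set the policy back to -Shielded $true before handing the VM back to the fabric admin.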
The Host Guardian Service (HGS) is a new role in Windows Server 2016 that allows for the creation and management of shielded virtual machines.
The need for HGS and shielded VMs is based in the separation of duties between workload (VM) administrators and fabric (Hyper-V host) administrators and least-privilege security.
HGS is deployed exclusively with PowerShell; Microsoft recommends at least three nodes per HGS cluster to support high availability.
HGS and shielded VMs rely upon various hardware and software features (physical and virtual TPM, UEFI, Secure Boot, Hardware Security Module (HSM), and more).
HGS has two main functions: attestation that a guarded host is healthy, and key transfer to lock and unlock shielded virtual machines.
Local console access is blocked for shielded virtual machines, making pre-shielding VM configuration crucial to allow for remote management.
Shielded VMs offer strong protection against fabric (host) administrators as well as compromised Hyper-V host servers themselves.
Shielded VM deployment is inextricably tied to the presence and availability of a Host Guardian Service (HGS) cluster.
The strong protections offered by shielded VMs have one potential downfall—no host console access could lead to connectivity and availability problems if the shielded VM isn’t correctly configured.
Encryption-supported VMs represent an approach that combines some of the shielded VM protections but preserves console access. However, this protection method involves trusting your fabric admins.
In this thought experiment, demonstrate your skills and knowledge of the topics covered in this chapter. You can find the answer to this thought experiment in the next section.
You are a datacenter administrator for Contoso Solutions, a managed service provider (MSP) located in Buffalo, NY. Your newest client, Woodgrove Bank, has strict regulatory requirements that limit access to their servers only to their own full-time information technology staff.
You installed four hardware Hyper-V hosts in a secure server rack for Woodgrove Bank; each host contains five virtual machines. All server hardware includes a TPM v2.0 chip and UEFI firmware and runs Windows Server 2016 Datacenter Edition with the full GUI.
You’ve outlined the new Hyper-V security features offered by Windows Server 2016. In reply, Woodgrove IT personnel have the following questions for you:
1. If we accidentally mess up RDP or WinRM access to our workload VMs, we need another way to access them. How can we accomplish this goal?
2. Woodgrove plans to virtualize more of its infrastructure over the coming years, and we need a way to automate (or at least make easier) shielded VM deployment. What’s possible?
3. What are the pros and cons of TPM-trusted vs. admin-trusted attestation?
This section contains the solution to the thought experiment.
1. Shielded VM recovery is very much a “version 1.0” technology as of this writing. We really have only one solution: to implement the “repair garage” scheme as we discussed earlier in the chapter. By this method, fabric admins would be allowed temporary access to the VMs to unlock them. Then, presumably Woodgrove staff would connect to the workload VMs, reconfigure, and then allow the fabric admin to re-lock the shielded VMs.
Configuring the VMs as encryption-supported would enable console access, but this option gives fabric administrators the ability to access the workload VM data permanently.
2. Microsoft discourages the approach of “grandfathering” existing, unshielded VMs into shielded state because this is a violation of the “clean source” principle. In other words, the best practice is to deploy new VMs in a guarded state to ensure integrity throughout the VM’s lifetime.
That said, System Center 2016 Virtual Machine Manager and Azure Stack both include in-box features that make it easier to store shielded VM templates. The big question for Woodgrove is who does the shielded VM deployment work; remember that SCVMM and Azure Stack are fabric management tools, and would be more suited for Contoso Solutions’ use rather than for Woodgrove workload administrators.
3. TPM-trusted attestation provides a much stronger set of protections for virtual machines running in a guarded fabric. Technically, we can use virtual TPM functionality in Hyper-V virtual machines even in the absence of a server physical TPM, but Woodgrove is fortunate enough to have host hardware that allows for TPM-trusted attestation.
Recall that in the TPM-trusted attestation scenario, we capture the startup and runtime environment of each guarded host. This means we need to perform the extra work of capturing a “golden image” of each host’s state and deploying a Code Integrity (CI) policy that whitelists the code that can run.
If Woodgrove’s security requirements aren’t this strict, then AD-trusted (admin-trusted) attestation is a much easier implementation approach. However, the only thing we’re attesting in this scenario is that a guarded host belongs to the appropriate AD security group. If the HGS cluster domain were to be compromised, that would defeat the entire attestation method and trust path.