Chapter 13. Protecting data through encryption, privacy and auditing

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 13
Protecting data through encryption, privacy and auditing

Introducing security principles and protocols

Auditing with SQL Server and Azure SQL Database

Securing Azure infrastructure as a service

Security has become incredibly important to private industry and the public sector. The news shows that the number of leaks and hacks of sensitive information is increasing almost daily, and we data professionals are at the forefront of securing this information. Along with the technical features built into SQL Server and Azure SQL Database, organizations should embrace the guidelines in privacy laws worldwide for handling and managing customer information.

Continuing on from Chapter 12, “Administering security and permissions,” which focuses on authorization, this chapter covers features in the Database Engine and the underlying operating system (OS) that help you to secure your server and the databases that reside on it.

We begin with privacy, and how it guides the responsibilities of the data professional. Then we get into the technical details of what it means to encrypt data, looking at the different features in SQL Server and Azure SQL Database that help you achieve and maintain a more secure environment. We also look at securing the network, the OS, and the database itself right down to the column and row level.

Throughout the chapter you should be thinking about defense in depth, by combining different features and strategies to protect your data and minimize the fallout if your data is stolen.

Sample scripts in this chapter, and all scripts for this book, are all available for download at https://www.MicrosoftPressStore.com/SQLServer2019InsideOut/downloads.

Privacy in the modern era

A chapter about protecting your data estate is incomplete without discussing external policies and procedures to ensure that protection. This section is provided as a reminder to keep abreast of current local and international legislation, using one particular regulation as a reference.

Caution

This section does not constitute legal advice. Consult your own legal counsel for more information.

General Data Protection Regulation (GDPR)

On May 25th, 2018, the General Data Protection Regulation (GDPR) came into effect in the European Union (EU). It provides for the protection of any personal data associated with data subjects (EU residents) located in the EU. Organizations across the world may be affected if they process personal data belonging to EU residents.

Any time your organization deals with an EU resident’s personal data, you are responsible for ensuring it is managed according to that legislation. The good news is, this is a business problem, not a technical one. Your organization must develop policies and procedures to enforce the requirements of the GDPR, and you as a data professional can create technical solutions to satisfy those procedures.

Responsibility of data professionals

What follows are highlights of these procedures as they might apply to data professionals. This is not an exhaustive list.

Pseudonymization

Problem statement. Personal data must be transformed in a way that the resulting data cannot be attributed to a data subject without additional information.
Proposed solution. This can be implemented through encryption. We cover this in detail in the next section, “Introducing security principles and protocols,” including the various methods to ensure that sensitive data is not made available to privileged users and administrators.

Right of access

Problem statement. A data subject may request access to their personal data.
Proposed solution. This can be achieved through standard Transact-SQL (T-SQL) queries, assuming that you have taken measures to handle encryption of the data and authentication of the data subject appropriately. We cover authentication and authorization in Chapter 2, “Introducing database server components.”

Right to erasure

Problem statement. A data subject may request erasure of their personal data.
Proposed solution. In order to erase personal data, you need to identify it. This can become arduous with more than a few databases. Organizations may not appreciate the numerous environments in which they store personal data, including scaled out applications, data warehouses, data marts, data lakes, and email. Fortunately, you can catalog and tag sensitive data in SQL Server and Azure SQL Database.

Introducing security principles and protocols

Information security is about finding a balance between the value of your data and the cost of protecting it. Ultimately the business and technical decisionmakers in your organization make this call, but at least you have the technical tools available to undertake these measures to protect yourself. In other words, you should not leave security solely to the IT department.

SQL Server implements a number of security principles through cryptography and other means, which you can use to build up layers of security to protect your environment.

Computer cryptography is implemented through some intense mathematics that use very large prime numbers, though we won’t delve deeply into specifics. This section explains various security principles and goes into some detail about encryption. It also covers network protocols and how cryptography works. This will aid your understanding of how SQL Server and network security protect your data. Keep in mind that encryption is not the only way to protect data.

Securing your environment with defense in depth

Securing a SQL Server or Azure SQL Database environment requires a number of protections that work together to make it difficult for an attacker to get in, snoop around, steal or modify data, and then get out.

Defense in depth is about building layers of protection around your data and environment. These measures might not be completely effective on their own, but work well as part of an overall strategy, because each layer helps weaken and isolate an attack long enough to allow you to respond.

Perimeter security should include logical and physical segmentation; for example, keeping sensitive servers and applications on a separate part of the network, perhaps off-premises in a separate datacenter or in the Azure cloud. You would then want to protect these connections; for example, by using a Virtual Private Network (VPN).

You should have a firewall and other network defenses to protect against external network attacks. From a physical aspect, don’t let just anyone plug a laptop into an unattended network point, or allow them to connect to your corporate wireless network and have access to the production environment.

From within the network, you need to implement authentication (who you are) and authorization (what you can do), preferably through Active Directory, which is available on both Windows and Linux.

On the servers themselves, you should ensure that the file system is locked down with a policy that does at least the following:

Enforce permissions and modern protocols for files, folders, and network shares
- SMB 1.0 (also known as CIFS) is a deprecated file sharing protocol which Microsoft recommends removing from your environment. For more information about disabling SMB 1, visit https://techcommunity.microsoft.com/t5/Storage-at-Microsoft/Stop-using-SMB1/ba-p/425858. SMB 2.0 and higher are still supported.
Deny access to unauthorized users
Deny access to untrusted storage devices
Ensure service accounts do not have system administrator privileges
Encrypt the file system (optional but recommended)

SQL Server permissions should be set correctly so that the service account does not have administrative privileges on the server, and database files, transaction logs, and backups cannot be accessed by unauthorized users.

On the application side, you can implement coding practices that protect against things like SQL injection attacks, and you can implement encryption in your database (and backup files).

The SQL Vulnerability Assessment tool in SQL Server Management Studio can help identify possible security vulnerabilities. Read more about it in Chapter 12, or on Microsoft Docs at https://docs.microsoft.com/sql/relational-databases/security/sql-vulnerability-assessment.

The difference between hashing and encryption

In a security context, data that is converted in a repeatable manner to an unreadable, fixed-length format using a cryptographic algorithm and that cannot be converted back to its original form is said to be hashed.

Data that is converted to an unreadable form that can be converted back to its original form using a cryptographic key is said to be encrypted.

Cryptographic algorithms can be defeated in certain ways, the most common being brute-force and dictionary attacks. Let’s take a quick look at each one:

Brute-force attack. In a brute-force attack, the attacking code checks every possible combination of a password, passphrase, or encryption key against the hashing or encryption service, until it finally arrives at the correct value. Depending on the type of algorithm and the length of the password, passphrase, or key, this can take anywhere from a few milliseconds to many years.
Dictionary attack. A dictionary attack is a lot faster to perform, so an attacker would attempt this first. Dictionary attacks take a list of words from a dictionary (which can include common words, passwords, and phrases) and use these against the hashing or encryption service. Dictionary attacks take advantage of the fact that we are bad at remembering passwords and tend to use common words.

As computers become more powerful and parallelized, the length of time to run a brute-force attack continues to decrease. Countermeasures do exist to protect against some of these attacks, and some encryption systems cannot be defeated by a brute-force attack. These countermeasures are beyond the scope of this book, but it is safe to say that sufficiently complex algorithms and long encryption keys will take several years to compromise.

Security expert Bruce Schneier suggests that an algorithm with a 512-bit block and key size, and 128 rounds, will be sufficiently complex for the foreseeable future. Schneier’s essay “Cryptography after the Aliens Land” is available at https://www.schneier.com/essays/archives/2018/09/cryptography_after_t.html.

Hashing

A cryptographic hash function (an algorithm) takes variable-length data (usually a password) and applies a mathematical formula to convert it to a fixed size, or hash value.

This is the recommended method of securing passwords. When a password has been hashed correctly, it cannot be decrypted into its original form. Used with a random salt (a random string applied along with the hash function), this results in passwords that are impossible to reconstruct, even if the same password is used by different people.

To validate a password, it must be hashed using the same hash function again, with the same salt, and compared against the stored hash value.

Because hash values have a fixed size (the length depends on the algorithm used), there is a possibility that two sets of data (two different passwords) can result in the same hash value. This is called a hash collision, and it is more likely to occur with shorter hash value lengths. This is why longer hashes are better.

Note

Make sure that you use passwords that are at least 15 characters in length and, preferably, more than 20 characters. You should use a password manager so that you don’t need to memorize multiple passwords. Brute-force attacks take exponentially longer for each additional character you choose, so don’t be shy about using phrases or sentences either (with spaces). Password length matters more than its complexity.

Encryption

Data encryption is the process of converting human-readable data, or plain text, into an encrypted form by applying a cryptographic algorithm called a key (the cipher) to the data. This process makes the encrypted data (the ciphertext) unreadable without the appropriate key to unlock it. Encryption facilitates both the secure transmission and storage of data.

Over the years, many ciphers have been created and subsequently defeated (cracked) because those algorithms were considered weak. In many cases, this is because both CPUs and Graphics Processing Units (GPUs) have become faster and more powerful, reducing the length of time it takes to perform brute-force and other attacks. In other cases, the implementation of the cryptographic function was flawed, and attacks on the implementation itself have been successful.

A primer on protocols and transmitting data

Accessing a database involves the transmission of data over a network interface, which you need to do in a secure manner. A protocol is a set of instructions for transmitting that information over a specific network port.

A port is one of 65,535 possible connections per protocol that can be made to a networked device. The most common protocol for SQL Server is Transmission Control Protocol (TCP). It is always associated with an IP address and a port number.

TCP ports 1433 and 1434 – SQL Server
TCP ports 2382 and 2383 – SQL Server Analysis Services
TCP port 80 – HTTP
TCP port 443 – HTTPS
TCP port 22 – Secure Shell (SSH)
UDP port 53 – Domain Name System (DNS)

The Internet protocol suite

To discuss security on a network, you need to understand cryptographic protocols. To discuss the network itself, you need to discuss the biggest network of them all: the Internet.

The Internet is a network of networks (it literally means “between networks”) that transmits data using a suite of protocols, including TCP, which sits on top of Internet Protocol (IP). TCP/IP is the most common network protocol stack in use today. Most of the services on the Internet, as well as local networks, rely on TCP/IP.

Note

The full Internet protocol suite comprises TCP, IP, Address Resolution Protocol (ARP), Internet Control Message Protocol (ICMP), UDP, and Internet Group Management Protocol (IGMP). All of these are required to implement the full TCP/IP stack.

IP is a connectionless protocol, meaning that each individual unit of transfer, also known as a network packet or datagram, contains the data itself (the payload) and a header that indicates where it came from and where it needs to go (the routing information).

IP network packets might be delivered out of order, with no delivery guarantee at all. This low overhead makes the protocol fast and allows packets to be sent to several recipients at once (multicast or broadcast).

To mitigate this, TCP provides the necessary instructions for reliability, sequencing (the order of packets), and data integrity. If a packet is not received by the recipient, or a packet is received out of order, TCP can resubmit the data again using IP as its delivery mechanism.

Versions of IP in use today

Version 4 of the Internet Protocol (IPv4) has a 32-bit address space, which provides nearly 4.3 billion addresses (2³², or approximately 4.3 x 10⁹). Unfortunately, when this version was first proposed in September 1981, very few people predicted that the Internet would be as large and important as it is today. With billions of humans online, and billions of devices connected, the available IPv4 address space is all but depleted.

You can read the Internet Protocol, Version 4 Specification, known as Internet Engineering Task Force Request For Comments #791, at https://tools.ietf.org/html/rfc791.

Tricks like Network Address Translation (NAT), which uses private IP addresses behind a router with a single valid public IP address representing that entire network, have held off the depletion over the years, but time and address space has run out.

Version 6 of the Internet Protocol (IPv6) has an address space of 128 bits which provides more than 340 undecillion (340 trillion trillion) addresses (2¹²⁸, or approximately 3.4 x 10³⁸). This number is so staggeringly huge that, even with networks and devices being added every minute, including the upward trend of the Internet of Things, each of these devices can have its own unique address on the Internet without ever running out of addresses.

You can read the Internet Protocol, Version 6 Specification, known as Internet Engineering Task Force Request For Comments #8200, at https://tools.ietf.org/html/rfc8200.

Making sense of an IP address

An IP address is displayed in a human-readable notation but is binary under the hood:

IPv4. The address is broken up into four subclasses of decimal numbers, each subclass ranging from 0 to 255, and separated by a decimal point. For example, 52.178.167.109 is a valid IPv4 address.
IPv6. The address is broken up into eight subclasses of hexadecimal numerals, each subclass being four digits wide, and separated by a colon. If a subclass contains all zeroes, it can be omitted. For example, 2001:d74f:e211:9840:0000:0000:0000:0000 is a valid IPv6 address that can be simplified to 2001:d74f:e211:9840:: with the zeroes omitted (note the double-colon at the end to indicate the omission).

You can read more about IPv6 address representation at http://www.ciscopress.com/articles/article.asp?p=2803866.

Note

Hexadecimal is a counting system that uses all the decimal numbers plus the first six letters of the Latin alphabet, to represent the sixteen values between 0 and 15 (10 = A, 11 = B, 12 = C, 13 = D, 14 = E, 15 = F). Hex is used as a convenient way to describe binary values that would otherwise take up a lot of space to display.

Adoption of IPv6 across the Internet is taking decades, so a hybrid solution is currently in place by which IPv4 and IPv6 traffic is shared across compatible devices. If that doesn’t sound like enough of a headache, let’s add routing into the mix.

Finding your way around the Internet

Routing between networks on the Internet is performed by the Border Gateway Protocol (BGP), which sits on top of TCP/IP.

BGP is necessary because there is no map of the Internet. Devices and entire networks appear and disappear all the time. BGP routes billions of network packets through millions of routers based on a best-guess scenario. Packets are routed based on trust: routers provide information to one another about the networks they control, and BGP implicitly trusts that information.

BGP is thus not secure, because it was designed solely to fix the scalability of the Internet, which was (and still is) growing exponentially. It was a “quick fix” that became part of the fabric of the infrastructure long before security was a concern.

Efforts to secure BGP have been slow. It is therefore critical to assume that your own Internet traffic will be hijacked at some point. When this happens, proper cryptography can prevent third parties from reading your data.

A brief overview of the World Wide Web

A lot of people conflate the World Wide Web (the web) with the Internet, but the web is a single component of the greater Internet, along with email and other services that are still very much in use today (such as File Transfer Protocol and Voice over IP).

Note

Based on publicly available information, Microsoft processes more than 500 billion emails per month through its various services.

The web uses the Hypertext Transport Protocol (HTTP), which sits on top of TCP/IP. A web server provides mixed media content (text, graphics, video, and other media) in Hypertext Markup Language (HTML) format, which is transmitted using HTTP and then interpreted and rendered by a web browser.

The web grew quickly for two reasons. First, the Internet became commercialized after originally being an academic and military project for several decades. The web itself then became wildly popular because of the introduction of the first graphical web browser, NCSA Mosaic, in the 1990s. The spiritual successors to Mosaic were Netscape Navigator and Microsoft Internet Explorer, during a period of internet history known as the “browser wars.”

You can learn more about the commercial beginnings of the web and the so-called “Internet Era,” in McCullough, B, How the Internet Happened: From Netscape to the iPhone (Liveright, 2018), or by listening to the Internet History Podcast, available at http://www.internethistorypodcast.com.

Modern web browsers include Microsoft Edge, Google Chrome, Mozilla Firefox, and Apple Safari.

Note

The modern web browser is hugely complex, doing a lot more than rendering HTML, but for the purposes of this discussion and in the interest of brevity, we gloss over those extras.

How does protocol encryption fit into this?

The explosive adoption of the web in the 1990s prompted public-facing organizations to start moving their sales online into electronic commerce, or e-commerce, ventures, which created the need for secure transactions. Consumers wanted to use their credit cards safely and securely so that they could shop and purchase goods without leaving the comfort of their homes.

Remember that the Internet is built on the Internet Protocol, which is stateless and has routing information in the header of every single packet. This means that anyone can place a hardware device (or software) in the packet stream, do something with the packet, and then pass it on (modified or not) to the destination without the sender or recipient having any knowledge of this interaction. Because this is a fundamental building block of a packet-switching network, it’s very difficult to secure properly.

As we discussed earlier, encryption transforms your data into an unreadable format. Now, if someone connected to the same network were to intercept encrypted packets, that person couldn’t see what you’re doing. The payload of each packet would appear garbled and unreadable, unless this person has the key to decrypt it.

A secure version of HTTP was created by Netscape Communications in 1994—called HTTPS—which stands for HTTP Secure, or HTTP over Secure Sockets Layer (SSL). Over the years, the moniker of HTTPS has remained, but it has come to be known as HTTP over Transport Layer Security (TLS) as standards improved.

When we talk about data moving over the network, that usually means TCP/IP is involved, and we need to transmit that data securely.

Symmetric and asymmetric encryption

You can encrypt data in two ways: symmetric and asymmetric. Each has its advantages and disadvantages.

Symmetric encryption (shared secret)

A secret key—usually a password, passphrase, or random string of characters—is used to encrypt data with a particular cryptographic algorithm. This secret key is shared between the sender and the recipient, and both parties can encrypt and decrypt all content by using this secret key.

If the key is accidentally leaked to a third party, the encrypted data could be intercepted, decrypted, modified, and re-encrypted again, without either the sender or recipient being aware of this. This type of attack is known as a man-in-the-middle attack.

Asymmetric encryption (public key)

Also known as public key encryption (PKE). A key–pair is generated, comprising a private key and a public key, and the public key can be widely distributed. The public key is used to encrypt data, and the private key is used to decrypt that data.

The advantage is that the private key never needs to be shared, which makes this method far more secure because only you can use your private key to decrypt the data. Unfortunately, asymmetric encryption requires a lot more processing power, plus both parties need their own key–pairs.

Digital certificates

Public keys require discoverability, which means that they need to be made publicly available. If a sending party wants to sign a message for the receiving party, the burden is on the sender to locate the recipient’s public key in order to sign a message.

For small-scale communications between two private entities, this might be done by sharing their public keys between each other.

For larger-scale communications with many senders and one recipient (such as a web or database server, for example), a certificate authority can provide the public key through a digital certificate, which the recipient (the website or database administrator) can install directly on the server.

This certificate serves as an electronic signature for the recipient, which includes its public key. The authority, known as a Certification Authority, is trusted by both the sender and the recipient, and the sender can verify that the recipient is indeed who it claims to be.

Digital certificates, also known as Public Key Certificates, are defined by the X.509 standard. Many protocols use this standard, including TLS and its predecessor, SSL.

You can read more about how digital certificates and TLS relate to SQL Server and Azure SQL Database later in this chapter. The X.509 standard is available at https://www.itu.int/rec/T-REC-X.509.

Certification Authority

A Certification Authority (CA) is an organization or entity that issues digital certificates, which include the name of the owner, the owner’s public key, and start and expiration dates.

The certificate is automatically revoked after it expires, and the CA can revoke any certificate before then.

For the certificate to be trusted, the CA itself must be trustworthy. It is the responsibility of the CA to verify the owner’s identity so that any certificates issued in that owner’s name can be trusted.

In recent years, several CAs have lost their trustworthy status, either because their verification process was flawed, or their signing algorithms were weak. Take care when choosing a CA for your digital certificates.

Protecting the data platform

You will use multiple layers of defense as you go down the stack of your data environment. Even if you are using virtual machines and containers in a cloud environment, the same principles apply.

Each layer protects the layer below it by using a combination of encryption keys (asymmetric and symmetric), certificates, and other obfuscation techniques. SQL Server provides features that protect sensitive data from unauthorized users, even if they manage the data. Azure SQL Database shares a lot in common with SQL Server, and it also has unique protections which we mention where applicable.

This section breaks down each layer into network, OS, the SQL Server instance, and finally the database itself, including columns and rows. Much of this hierarchy is encrypted by SQL Server, starting at the OS layer and working all the way down to individual cells in a table. Figure 13-1 shows this hierarchy.

This illustration shows the many layers of encryption, represented by labeled rectangles of various sizes, shapes and lengths. The base of the hierarchy is the encrypted data. A rectangle at the far right shows that at the very least, a password is used to access each layer. The next layer up from the bottom is the symmetric key. A symmetric key can protect the encrypted data. Above that, the symmetric key can be protected by an additional symmetric key, an asymmetric key, the extensible key module, or a certificate. The asymmetric key can be protected by the Database Master Key, which in turn is protected by the Service Master Key. On Windows Server, the Service Master Key is protected by the DPAPI. — **Figure 13-1** The SQL Server encryption hierarchy.

Securing the network with TLS

Data is in motion from the moment it is read from, or written to, the buffer pool in SQL Server or Azure SQL Database. Data in motion is data that the Database Engine provides over a network interface. Protecting data in motion requires a number of considerations, from perimeter security, to cryptographic protocols for the communication itself, and the authorization of the application or process accessing the data.

SQL Server protects data during transmission over a network connection using Transport Layer Security (TLS). Any network protocols and APIs involved must support encrypting and decrypting the data as it moves in and out of the buffer pool.

We touched briefly on TLS in the discussion about TCP/IP earlier in this chapter, but we did not go into much detail. TLS is a security layer on top of a transport layer, or cryptographic protocol. Most networks use the TCP/IP protocol stack, and TLS is designed to secure the traffic on TCP/IP-based networks.

Once you have opened Configuration Manager and expanded the SQL Server Network Configuration branch for your instance, right click on Protocols for <InstanceName> and select Properties. You will see an Import button, which will open the certificate import wizard. Follow the prompts and enter the appropriate credentials, confirm the addition of the certificate, and restart the instance.

Note

Remember that longer keys mean better security. Public keys of 1,024 bits (128 bytes) are considered short these days, so some organizations now prefer 2,048-bit, or even 4,096-bit public key certificates for TLS.

A brief history of TLS

Just as earlier cryptographic protocols have been defeated or considered weak enough that they will eventually be defeated, so too have SSL and TLS had their challenges:

The prohibition of SSL 2.0 is covered at https://tools.ietf.org/html/rfc6176.
Known attacks on TLS are available at https://tools.ietf.org/html/rfc7457.

TLS 1.2 was defined in 2008 and is the latest commonly available public version (TLS 1.3 was only approved in March 2018). TLS 1.2 is vulnerable to certain attacks like its predecessors, but as long as older encryption algorithms are not used (for instance 3DES, RC4, and IDEA), it is good enough while we wait for TLS 1.3 to propagate.

Where possible, you should be using TLS 1.2 everywhere. SQL Server ships with TLS 1.0, 1.1, and 1.2 support out of the box, so you will need to turn off 1.0 and 1.1 at the OS level to ensure that you use TLS 1.2.

You can see how to turn off older versions of TLS in the Microsoft Knowledge Base article at https://support.microsoft.com/help/3135244.

Data protection from the OS

At the top of the encryption hierarchy on a server, protecting everything below it, is the OS. Windows Server provides an Application Programming Interface (API) for system-level and user-level processes to take advantage of data protection (encryption) on the file system.

In other words, SQL Server and other applications can make use of this data protection API to have Windows automatically encrypt data on the drive without having to encrypt data through other means.

SQL Server uses the Data Protection API (DPAPI) for Transparent Data Encryption (TDE).

The encryption hierarchy in detail

Each layer of the hierarchy protects the layer below it by using a combination of keys (asymmetric and symmetric) and certificates (refer to Figure 13-1).

Individual layers in the hierarchy can be accessed by a password at the very least, unless an Extensible Key Management (EKM) module is being used. The EKM module is a standalone device that holds symmetric and asymmetric keys outside of SQL Server.

The Database Master Key (DMK) is protected by the Service Master Key (SMK), and both of these are symmetric keys. The SMK is created when you install SQL Server and is protected by the DPAPI.

If you want to use TDE on your database (see the “Configuring TDE on a user database” section later in this chapter), it requires a symmetric key called the Database Encryption Key (DEK), which is protected by an asymmetric key in the EKM module or by a certificate through the DMK.

Note

Although we do not recommend 3DES for TLS, you can still use 3DES lower in the SQL Server security hierarchy for securing DEKs because these are protected by the SMK, the DMK, and a Certificate, or entirely by an HSM/EKM module like Key Vault (see “Using EKM modules with SQL Server” in the next section).

This layered approach helps to protect your data from falling into the wrong hands.

There are two considerations when deciding how to secure a SQL Server environment, which you can implement independently.

Data at rest. In the case of TDE, this is decrypting the data on a drive as it is read into the buffer pool and encrypting the data as it is flushed to a drive from the buffer pool. (You can also encrypt your storage layer independently from SQL Server, but this does not form part of the encryption hierarchy.)
Data in motion. Protecting the data during transmission over a network connection. Any network protocols and APIs involved must support encrypting and decrypting the data as it moves in and out of the buffer pool.

As mentioned previously, data is in motion from the moment it is read from or written to the buffer pool. Between the buffer pool and the underlying storage, data is considered to be at rest.

Note

TDE encrypts database backup files along with the data and transaction log files. The TDE feature is available with SQL Server Enterprise and Standard editions and Azure SQL Database.

Using EKM modules with SQL Server

Organizations might choose to take advantage of a separate security appliance called a Hardware Security Module (HSM) or EKM device to generate, manage, and store encryption keys for the network infrastructure outside of a SQL Server environment.

SQL Server can make use of these keys for internal use. The HSM/EKM device can be a hardware appliance, a USB device, a smart card, or even software, as long as it implements the Microsoft Cryptographic Application Programming Interface (MCAPI) provider.

EKM is an advanced SQL Server setting and is turned off by default. To use the key or keys from an HSM/EKM device, you need to turn on EKM by using the sp_execute 'EKM provider enabled' command with the appropriate parameter. Then, the device must be registered as an EKM module for use by SQL Server.

After the HSM/EKM device creates a key for use by SQL Server (for TDE, for instance), the device exports it securely into SQL Server via the MCAPI provider.

The module might support different types of authentication (Basic or Other), but only one of these types can be registered with SQL Server for that provider.

If the module supports Basic authentication (a user name and password combination), SQL Server uses a credential to provide transparent authentication to the module.

Credentials that will be used by all databases can be created in the master database by using the CREATE CREDENTIAL command, or per individual database using the CREATE DATABASE SCOPED CREDENTIAL command.

To read more about EKM in SQL Server, visit Microsoft Docs at https://docs.microsoft.com/sql/relational-databases/security/encryption/extensible-key-management-ekm.

Cloud security with Azure Key Vault

You can use Azure Key Vault in addition to, or as a drop-in replacement of, a traditional HSM/EKM device. Your SQL Server instance (whether on-premises or on a VM in the cloud) would require Internet access to the Key Vault.

Key Vault is implemented as an EKM provider inside SQL Server, using the SQL Server Connector (a standalone Windows application) as a bridge between Key Vault and the SQL Server instance. To make use of Key Vault, you must create the vault and associate it with a valid Azure Active Directory (Azure AD).

Begin by registering the SQL Server service principal name in Azure AD. After the service principal name is registered, you can install the SQL Server Connector and turn on EKM in SQL Server.

You can read more about service principal names and Kerberos in Chapter 2.

You must then create a login that SQL Server will use for accessing Key Vault, and then map that login to a new credential that contains the Key Vault authentication information.

A step-by-step guide for this process is available on Microsoft Docs at https://docs.microsoft.com/sql/relational-databases/security/encryption/setup-steps-for-extensible-key-management-using-the-azure-key-vault.

Master keys in the encryption hierarchy

Since SQL Server 2012, both the SMK and DMK are symmetric keys encrypted using the Advanced Encryption Standard (AES) cryptographic algorithm. AES is faster and more secure than Triple Data Encryption Standard (3DES), which was used in SQL Server prior to 2012.

Note

When you upgrade from an older version of SQL Server that was encrypted using 3DES, you must regenerate both the SMK and DMK to upgrade them to AES.

The SMK

The SMK is at the top of the encryption hierarchy in SQL Server. It is automatically generated the first time the SQL Server instance starts, and it is encrypted by the DPAPI in combination with the local machine key (which itself is created when Windows Server is installed). The key is based on the Windows credentials of the SQL Server service account and the computer credentials. (On Linux, the local machine key is part of the PAL used by SQL Server.)

Note

You will get a new SMK if you change the service account that runs SQL Server, but it is considered “self-healing,” meaning that you don’t have to do anything else once it has changed.

If you need to restore or regenerate an SMK, you first must decrypt the entire SQL Server encryption hierarchy, which is a resource-intensive operation. You should perform this activity only in a scheduled maintenance window. If the key has been compromised, however, you shouldn’t wait for that maintenance window.

Caution

It is essential that you back up the SMK to a file and then copy it securely to an off-premises location. Losing this key will result in total data loss if you need to recover a database or environment.

To back up the SMK, you can use the T-SQL script shown that follows, but be sure to choose a randomly generated password. The password will be required for restoring or regenerating the key at a later stage. Keep the password separate from the SMK backup file so that they cannot be used together if your secure backup location is compromised. Ensure that the folder on the drive is adequately secured. After you back up the key, transfer and store it securely in an off-premises location.

Click here to view code image

BACKUP SERVICE MASTER KEY TO FILE = 'c:SecureLocationservice_master_key'

   ENCRYPTION BY PASSWORD = '<UseAReallyStrongPassword>';

GO

The DMK

The DMK is used to protect asymmetric keys and private keys for digital certificates stored in the database. A copy of the DMK is stored in the database for which it is used as well as in the master database. The copy is automatically updated by default if the DMK changes. This allows SQL Server to automatically decrypt information as required. A DMK is required for each user database that will make use of TDE.

Refer back to Figure 13-1 to see how the DMK is protected by the SMK.

Caution

Don’t forget to back up the DMK to a file, as well, and copy it securely to an off-premises location.

It is considered a security best practice to regenerate the DMK periodically to protect the server from brute-force attacks. The idea is that it will take longer for a brute-force attack to break the key than the length of time for which the key is in use.

For example, suppose that you encrypt your database with a DMK in January of this year. In the following July you regenerate the DMK, which will cause all keys for digital certificates to be re-encrypted with the new key. If anyone had begun a brute-force attack on data encrypted with the previous DMK, all results from that attack will be rendered useless by the new DMK.

You can back up the DMK by using the T-SQL script that follows. The same rules apply as with backing up the SMK (choose a random password, store the file off-premises, and keep the password and backup file separately). This script assumes that the master key exists.

Click here to view code image

USE WideWorldImporters;

GO

BACKUP MASTER KEY TO FILE = 'c:SecureLocationwwi_database_master_key'

    ENCRYPTION BY PASSWORD = '<UseAReallyStrongPassword>';

GO

You can read more about the SMK and DMK on Microsoft Docs at https://docs.microsoft.com/sql/relational-databases/security/encryption/sql-server-and-database-encryption-keys-database-engine.

Encrypting data by using TDE

Continuing with our defense-in-depth discussion, an additional way to protect your environment is to encrypt data at rest, namely the database files (and when TDE is turned on, all backups of that database).

Third-party providers, including storage vendors, provide excellent on-disk encryption for your Direct-Attached Storage (DAS) or Storage-Area Network (SAN), as a file system solution or at the physical storage layer. Provided that your data and backups are localized to this particular solution, and no files are copied to machines that are not encrypted at the file-system level, this might be an acceptable solution for you.

If you have either the Enterprise or Standard edition of SQL Server, you can use TDE, which encrypts the data, transaction log, and backup files at the file-system level by using a DEK.

If someone manages to acquire these files via a backup server, Azure Storage archive, or by gaining access to your production environment, that person will not be able to simply attach the files or restore the database without the DEK.

The DEK is a symmetric key (shared secret) that is secured by a certificate stored in the master database. If using HSM/EKM or Key Vault, the DEK is protected by an asymmetric key in the EKM module, instead. The DEK is stored in the boot record of the protected database (page 0 of file 1) so that it is easily available during the recovery process.

Note

TDE is invisible to any applications that use it. No changes are required in those applications to take advantage of TDE for the database.

In the data file, TDE operates at the page level, because all data files are stored as 8-KB pages. Before being flushed from the buffer pool, the contents of the page are encrypted, the checksum is calculated, and then the page is written to the drive. When reading data, the 8-KB page is read from the drive, decrypted, and then the contents are placed into the buffer pool.

Note

Even though encryption might to some degree increase the physical size of the data it is protecting, the size and structure of data pages is not affected. Instead, the number of pages in the data file might increase.

For log files, the contents of the log cache are also encrypted before writing to and reading from the drive.

To read more about checkpoint operations and active virtual log files (VLFs) in the transaction log, refer to Chapter 3, “Designing and implementing an on-premises database infrastructure.”

Backup files are simply the contents of the data file, plus enough transaction log records to ensure that the database restore is consistent (redo and undo records of active transactions when the backup is taken). Practically speaking, this means that the contents of new backup files are encrypted by default after TDE is turned on.

Note

Files associated with the buffer pool extension are not encrypted if you use TDE.

Configuring TDE on a user database

To use TDE on SQL Server, you need to create a DMK if you don’t already have one.

Verify that it is safely backed up and securely stored off-premises. If you have never backed up the DMK, you will be warned by the Database Engine after using it that it has not yet been backed up. If you don’t know where that backup is, back it up again. This is a crucial detail to using TDE (or any encryption technology).

Next, you will create a digital certificate or use one that you have acquired from a CA. In the next example, the certificate is created on the server directly.

Then, you create the DEK, which is signed by the certificate and encrypted using a cryptographic algorithm of your choice.

Although you do have a choice of algorithm, we recommend AES over 3DES for performance and security reasons, and you have a choice of three AES key sizes: 128, 192, or 256 bits. Remember that larger keys are more secure but will add additional overhead when encrypting data. If you plan to rotate your keys every few months, you can safely use 128-bit AES encryption because no brute-force attack (using current computing power) should be able to attack a 128-bit key in the months between key rotations.

After you create the DEK, you turn on encryption on the database. The command completes immediately, but the process will take place in the background because each page in the database will need to be read into the buffer pool, encrypted, and flushed to the drive.

Click here to view code image

USE master;

GO

-- Remember to back up this Database Master Key once it is created

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<UseAReallyStrongPassword>';

GO

CREATE CERTIFICATE WideWorldServerCert WITH SUBJECT = ' WWI DEK Certificate';

GO

USE WideWorldImporters;

GO

CREATE DATABASE ENCRYPTION KEY

   WITH ALGORITHM = AES_128

   ENCRYPTION BY SERVER CERTIFICATE WideWorldServerCert;

GO

ALTER DATABASE WideWorldImporters SET ENCRYPTION ON;

GO

Verifying whether TDE is turned on for a database

To determine which databases are encrypted with TDE, you can issue the following T-SQL query:

Click here to view code image

SELECT name, is_encrypted FROM sys.databases;

If a user database is encrypted, the is_encrypted column value for that database will be set to 1. TempDB will also show a value of 1 in this column.

Managing and monitoring the TDE scan

Enabling TDE on a database requires each data page to be read into the buffer pool before being encrypted and written back out to the drive. If the database instance is under a heavy workload, you can pause the scan and resume it at a later stage.

To pause the scan on the WideWorldImporters database, issue the following command:

Click here to view code image

ALTER DATABASE WideWorldImporters SET ENCRYPTION SUSPEND;

To resume the scan on the same database, issue the following command:

Click here to view code image

ALTER DATABASE WideWorldImporters SET ENCRYPTION RESUME;

You can also check the progress of the TDE scan using the new encryption_scan_state column in the sys.dm_database_encryption_keys DMV. To see when the state was last modified, refer to the encryption_scan_modify_date column in the same DMV.

Protecting sensitive columns with Always Encrypted

Although Transparent Data Encryption (TDE) is really useful for encrypting the entire database at the file-system level, it doesn’t prevent database administrators and other users from having access to sensitive information within the database.

The first rule of storing sensitive data is that you should avoid storing it altogether when possible. Credit card information makes sense in a banking system, but not for instance in a sales database.

Note

Many third-party systems can encrypt your data securely but are beyond the scope of this chapter. It is good to keep in mind that there is a small but inherent risk in storing encryption keys with data, as SQL Server does. Your organization must balance that risk against the ease of managing and maintaining those keys.

To learn about column-level encryption, which uses a symmetric key, visit https://docs.microsoft.com/sql/relational-databases/security/encryption/encrypt-a-column-of-data.

If you must store sensitive data, Always Encrypted protects how data is viewed at the column level. It works with applications that use particular connection types (client drivers; see the next section) to interact with SQL Server. These client drivers are protected by a digital certificate so that only specific applications can view the protected data.

Always Encrypted was introduced in SQL Server 2016 and has been available on all editions since SQL Server 2016 Service Pack 1. With SQL Server 2019, Always Encrypted can also be combined with secure enclaves to provide additional functionality. Secure enclaves leverage virtualization-based security (VBS) to isolate a region of memory inside the SQL Server process.

To use Always Encrypted, the database makes use of two types of keys: column encryption keys and column master keys (discussed shortly).

The encryption used by Always Encrypted is one of two types.

Deterministic encryption. This is the same as generating a hash value without a salt. The same encrypted value will always be generated for a given plain-text value. Without secure enclaves, this is useful for joins, indexes, searching, and grouping, but makes it possible for people to guess what the hash values represent.
Randomized encryption. This is the same as generating a hash value with a salt. No two of the same plain-text values will generate the same encrypted value. Without secure enclaves, this does not permit joins, indexes, searching, and grouping for those encrypted columns.

As you can see in Table 7-1, secure enclaves provide a much richer experience when protecting data with Always Encrypted.

Table 7-1 Functionality available with Always Encrypted encryption

Operation	Without enclave		With enclave
Operation	Randomized	Deterministic	Randomized	Deterministic
In-place encryption	No	No	Yes	Yes
Equality comparison	No	Yes (external)	Yes (internal)	Yes (external)
Beyond equality	No	No	Yes	No
LIKE predicate	No	No	Yes	No

Without secure enclaves, you can use randomized encryption for values that are not expected to participate in joins or searches, while deterministic encryption is useful for values like social security numbers and other government-issued values because it helps for searching and grouping. With secure enclaves, randomized encryption is useful for both of these scenarios.

Because the whole intent of Always Encrypted is to prevent unauthorized persons from viewing data (including database administrators), you should generate the keys elsewhere and store them in a trusted key store (in the operating system’s key store for the database server and the application server, or an EKM module such as Azure Key Vault), away from the database server. To maintain the chain of trust, the person who generates the keys should not be the same person who is administering the database.

Client application providers that support Always Encrypted

The following providers currently support Always Encrypted:

.NET Framework 4.6 or higher
Microsoft JDBC Driver 6.0 or higher
ODBC Driver 13.1 for SQL Server or higher
Microsoft Drivers 5.2 for PHP for SQL Server or higher

Caution

Neither .NET Core, nor the Microsoft OLE DB Driver for SQL Server, are currently supported.

The connection between the Database Engine and application is made by using a client-side encrypted connection. Each provider has its own appropriate method to control this setting:

.NET Framework. Set the Column Encryption Setting in the connection string to enabled, or configure the SqlConnectionStringBuilder.ColumnEncryption Setting property to SqlConnectionColumnEncryptionSetting.Enabled.
JDBC. Set the columnEncryptionSetting to Enabled in the connection string, or configure the SQLServerDataSource() object with the setColumnEncryptionSetting("Enabled") property.
ODBC. Set the ColumnEncryption connection string keyword to Enabled, use the SQL_COPT_SS_COLUMN_ENCRYPTION pre-connection attribute, or through the Data Source Name (DSN) using the SQL_COLUMN_ENCRYPTION_ENABLE setting.
PHP. Set the ColumnEncryption connection string keyword to Enabled. Note that the PHP drivers use the ODBC drivers for encryption.

Additionally, the application must have the VIEW ANY COLUMN MASTER KEY DEFINITION and VIEW ANY COLUMN ENCRYPTION KEY DEFINITION database permissions in order to view the Column Master Key and Column Encryption Key.

The Column Master Key and Column Encryption Key

The Column Master Key (CMK) protects one or more Column Encryption Keys (CEK).

The CEK is encrypted using AES encryption and is used to encrypt the actual column data. You can use the same CEK to encrypt multiple columns, or you can create a CEK for each column that needs to be encrypted.

Metadata about the keys (but not the keys themselves) is stored in the database’s system catalog views:

sys.column_master_keys
sys.column_encryption_keys

This metadata includes the type of encryption and location of the keys, plus their encrypted values. Even if a database is compromised, the data in the protected columns cannot be read without access to the secure key store.

To read more about considerations for key management on Microsoft Docs, visit https://docs.microsoft.com/sql/relational-databases/security/encryption/overview-of-key-management-for-always-encrypted.

Using the Always Encrypted Wizard

The easiest way to configure Always Encrypted is by using the Always Encrypted Wizard in SQL Server Management Studio (SSMS). As noted previously, you need to have the following permissions before you begin:

VIEW ANY COLUMN MASTER KEY DEFINITION
VIEW ANY COLUMN ENCRYPTION KEY

If you plan on creating new keys, you also need the following permissions:

ALTER ANY COLUMN MASTER KEY
ALTER ANY COLUMN ENCRYPTION KEY

In SSMS, in Object Explorer, right-click the name of the database that you want to configure. Click Tasks, then Encrypt Columns. This brings up the Always Encrypted wizard.

On the Column Selection page, choose the column that you want to encrypt, and then select the encryption type (deterministic or randomized). If you want to decrypt a previously encrypted column, you can choose Plaintext.

On the Master Key Configuration page, you can create a new key by using the local OS certificate store or by using a centralized store like Key Vault or an HSM/EKM device. If you already have a CMK in your database, you can use it instead. You must also choose the master key source, which is either Current User or Local Machine.

Note

Memory-optimized and temporal tables are not supported by this wizard, but you can still encrypt them by using Always Encrypted.

Limitations in Always Encrypted

Certain column types are not supported by Always Encrypted, including:

image, (n)text, xml, sql_variant, and timestamp / rowversion datatypes
string columns with non-BIN2 collations
FILESTREAM columns
columns with IDENTITY or ROWGUIDCOL properties
columns with default constraints or referenced by check constraints

You can read the full list of limitations and find out more about Always Encrypted on Microsoft Docs at https://docs.microsoft.com/sql/relational-databases/security/encryption/always-encrypted-database-engine.

Configuring Always Encrypted with secure enclaves

Always Encrypted with secure enclaves requires a fairly complex environment in order to guarantee the security it offers.

Host Guardian Service (HGS). Install HGS in a Windows Server 2019 failover cluster with three computers in its own Active Directory forest. These computers must not be connected to an existing Active Directory, and none of these computers must be used for the SQL Server installation. Microsoft suggests that this cluster be isolated from the rest of your network, and that different administrators manage this environment. The only access to HGS will be through HTTP (TCP port 80) and HTTPS (TCP port 443).
SQL Server 2019. Install SQL Server 2019 on Windows Server 2019, on its own physical hardware. Virtual machines do not support the recommended Trusted Platform Module (TPM) enclave attestation, which is hardware-based.
- You can read more about TPM attestation at https://docs.microsoft.com/windows-server/identity/ad-ds/manage/component-updates/tpm-key-attestation.
Tools for client and development. Install the requisite tools on your client machine:
- .NET Framework 4.7.2 or later
- SQL Server Management Studio (SSMS) 18.0 or later
- SQL Server PowerShell module version 21.1 or later
- Visual Studio 2017 or later
- Developer pack (SDK) for .NET Framework 4.7.2
- Microsoft.SqlServer.Management.AlwaysEncrypted.EnclaveProviders NuGet package
- Microsoft.SqlServer.Management.AlwaysEncrypted.AzureKeyVaultProvider NuGet package version 2.2.0 or later, if you plan to use Azure Key Vault for storing your column master keys
Configure the enclave. From the client/development machine, you will enable the enclave type using SSMS. Rich computations are disabled by default for performance reasons. If you wish to use this feature, you must enable it after every SQL Server instance restart using Trace Flag 127. Make sure you test the performance impact on your environment before enabling this feature in production.
Provision enclave-enabled keys. This works in a similar way to regular Always Encrypted keys, except that the column master keys provisioned through HGS are marked as enclave-enabled. You can use SSMS or PowerShell to provision these keys, and store them either in the Windows Certificate Store, or an Azure Key Vault.
Encrypt sensitive columns. Finally, you will use the same methods to encrypt data as before.

A step-by-step walkthrough of this process is available at https://docs.microsoft.com/sql/relational-databases/security/encryption/configure-always-encrypted-enclaves.

Row-level security

Protecting the network and database instance is good and proper, but this does not protect assets within the environment from, for example, curious people snooping on salaries in the Human Resources database. Or, perhaps you have a customer database and you want to restrict the data those customers can access.

Row-level security, which does not use encryption, operates at the database level to restrict access to a table through a security policy, based on group membership or execution context. It is functionally equivalent to a WHERE clause.

Access to the rows in a table is protected by an inline table-valued function, which is invoked and enforced by the security policy.

The function checks whether the user is allowed to access a particular row, while the security policy attaches this function to the table. So, when you run a query against a table, the security policy applies the predicate function.

There are two types of security policies supported by row-level security, both of which you can apply simultaneously:

Filter predicates, which limit the data that can be seen
Block predicates, which limits the actions a user can take on data

Hence, a user might be able to see rows, but cannot insert, update, or delete rows that look like rows they can see. This concept is covered in more detail in the next section.

Caution

There is a risk of information leakage if an attacker writes a query with a specially crafted WHERE clause and, for example, a divide-by-zero error, to force an exception if the WHERE condition is true. This is known as a side-channel attack. It is wise to limit the ability of users to run ad hoc queries when using row-level security.

Filtering predicates for read operations

You can silently filter rows that are available through read operations. The application then has no knowledge of the other data that is filtered out.

Filter predicates affect all read operations. This list is taken directly from the official documentation at https://docs.microsoft.com/sql/relational-databases/security/row-level-security:

SELECT. Cannot view rows that are filtered.
DELETE. Cannot delete rows that are filtered.
UPDATE. Cannot update rows that are filtered. It is possible to update rows that will be subsequently filtered. (The next section covers ways to prevent this.)
INSERT. No effect (inserting is not a read operation). Note, however, that a trigger could cause unexpected side effects in this case.

Blocking predicates for write operations

These predicates block access to write (or modification) operations that violate the predicate. Block predicates affect all write operations:

AFTER INSERT. Prevents inserting rows with values that violate the predicate. Also applies to bulk insert operations.
AFTER UPDATE. Prevents updating rows to values that violate the predicate. Does not run if no columns in the predicate were changed.
BEFORE UPDATE. Prevents updating rows that currently violate the predicate.
BEFORE DELETE. Blocks delete operations if the row violates the predicate.

Dynamic data masking

Data masking works on the principle of limiting exposure to data by obfuscation. It does not use encryption. Without requiring too many changes to the application or database, you can mask portions of columns to prevent lower-privilege users from seeing them, such as with full credit card numbers and other sensitive information.

The mask is defined in the column definition of the table, using MASKED WITH (FUNCTION = [type]) syntax, and you can add masking after table creation by using ALTER COLUMN syntax.

There are four types of masks that are available:

Default. The column is masked according to the data type (not its default value). Strings will use “XXXX” (fewer if the length is less than four characters); numerics will use a zero value; dates will use midnight on January 1st, 1900; and binary will use a single byte binary equivalent of zero. If a string is too short to complete the entire mask, part of the prefix or suffix will not be exposed.
Email. Only the first letter and the trailing domain suffix is not masked; for example, “[email protected].”
Random. This replaces a numeric data type with a random value within a range you specify.
Custom String. Only the first and last letters are not masked. There is a custom padding string in the middle, which you specify.

You can read more about dynamic data masking, including samples of how to set it up, at https://docs.microsoft.com/sql/relational-databases/security/dynamic-data-masking.

Limitations with masking data

Dynamic data masking has some limitations. It does not work on Always Encrypted columns, nor FILESTREAM or COLUMN_SET column types. Computed columns are also excluded, but if the computed column depends on a masked column, the computed column inherits that mask and returns masked data.

If a column has dependencies, you cannot perform dynamic data masking on it without removing the dependency first, adding the dynamic data mask, and then recreating the dependency.

Finally, a masked column cannot be a used as a FULLTEXT index key.

Caution

It is possible to expose masked data with carefully crafted queries. This can be performed by using a brute-force attack or by using inference based on the results. If you are using data masking, you should also limit the ability of the user to run ad hoc queries and ensure that their permissions are sound.

Protecting Azure SQL Database

All of the security features discussed thus far work equally on SQL Server and Azure SQL Database, namely TDE, Always Encrypted, row-level security and dynamic data masking.

That’s great if you’re just comparing SQL Server to Azure SQL Database, but there are some features unique to Azure SQL Database that are worth looking at, which we cover in the next section. Keep in mind that because Azure features and products are always changing, this is only a brief overview.

Azure SQL Database Advanced Threat Protection (ATP)

The risks of having a publicly accessible database in the cloud are numerous. To help protect against attacks, you can activate ATP, which runs 24 hours per day on each of your Azure SQL Database servers (called nodes) for a monthly fee. This service notifies you by email whenever atypical behavior is detected.

Some of the interesting threats include SQL injection attacks and potential vulnerabilities as well as unfamiliar database access patterns, including unfamiliar logins or access from unusual locations. Each notification includes possible causes and recommendations to deal with the event.

ATP ties into the Azure SQL Audit log (discussed in the next section); thus, you can review events in a single place and decide whether each one was expected or malicious.

Although this does not prevent malicious attacks (over and above your existing protections), you are given the necessary tools to mitigate and defend against future events. Given how prevalent attacks like SQL injection are, this feature is very useful in letting you know if that type of event has been detected.

You can turn on ATP using the Azure portal, PowerShell, or with Azure CLI.

To read more on configuring Azure SQL Database ATP with PowerShell, go to https://docs.microsoft.com/azure/sql-database/scripts/sql-database-auditing-and-threat-detection-powershell.

Managed instance security features

Managed instances provide similar security features to SQL Server Enterprise edition. However, because the instance is isolated on an independent virtual network, it can only be accessed through Azure Active Directory (Azure AD), and not Windows authentication.

Otherwise it provides the same protections as on-premises instance, including TDE, Always Encrypted, row-level security, dynamic data masking, and its own flavor of auditing.

Built-in firewall protection

Azure SQL Database is secure by default. All connections to your database environment pass through a firewall. No connections to the database are possible until you add a rule to the firewall to allow access.

To provide access to all databases on an Azure SQL server, you must add a server-level firewall rule through the Azure portal (or indeed using PowerShell or Azure CLI) with your IP address as a range. This does not apply to managed instances because you access the instance through a private IP address inside the virtual network.

To read more about protecting your Azure SQL Database, see Chapter 17, “Provisioning Azure SQL Database.”

Auditing with SQL Server and Azure SQL Database

Auditing is the act of tracking and recording events that occur in the Database Engine. Since SQL Server 2016 Service Pack 1, the Audit feature is available in all editions, as well as in Azure SQL Database. Chapter 17 covers configuring auditing in Azure SQL Database in depth.

SQL Server Audit

There is a lot going on in the Database Engine. SQL Server Audit uses extended events to give you the ability to track and record those actions at both the instance and database level.

Note

Although extended events carry minimal overhead, it is important that you carefully balance auditing against performance impact. Use targeted auditing by only capturing the events that are necessary to fulfil your audit requirements.

You can read more about extended events in Chapter 8, “Maintaining and monitoring SQL Server.”

Audits are logged to event logs or audit files. An event is initiated and logged every time the audit action is encountered, but for performance reasons, the audit target is written to asynchronously.

The permissions required for SQL Server auditing are complex and varied, owing to the different requirements for reading from and writing to the Windows Event Log, the file system, and SQL Server itself.

Requirements for creating an audit

To keep track of events (called actions), you need to define a collection, or audit. The actions you want to track are collected according to an audit specification. Recording those actions is done by the target (destination).

Audit. The SQL Server audit object is a collection of server actions or database actions (these actions might also be grouped together). Defining an audit creates it in the off state. After it is turned on, the destination receives the data from the audit.
Server audit specification. This audit object defines the actions to collect at the instance level or database level (for all databases on the instance). You can have multiple Server Audits per instance.
Database audit specification. You can monitor audit events and audit action groups. Only one database audit can be created per database per audit. Server-scoped objects must not be monitored in a database audit specification.
Target. You can send audit results to the Windows Security event log, the Windows Application event log, or an audit file on the file system. You must ensure that there is always sufficient space for the target. Keep in mind that the permissions required to read the Windows Application event log are lower than the Windows Security event log, if you are using the Windows Application event log.

An audit specification can be created only if an audit already exists.

To read more about audit action groups and audit actions, visit https://docs.microsoft.com/sql/relational-databases/security/auditing/sql-server-audit-action-groups-and-actions.

SQL Server can be shut down by a failure in the audit. You will find an entry in the log saying MSG_AUDIT_FORCED_SHUTDOWN. You can start SQL Server in single-user mode using the -m option at the command line, which will write an entry to the log saying MSG_AUDIT_SHUTDOWN_BYPASSED.

An audit initiation failure also can prevent SQL Server from starting. In this case, you can use the -f command-line option to start SQL Server with minimal configuration (which is also single-user mode).

Creating a server audit in SQL Server Management Studio (SSMS)

Verify that you are connected to the correct instance in SSMS. Then, in Object Explorer, expand the Security folder. Right-click the Audits folder, and then, on the shortcut menu that opens, select New Audit (see Figure 13-2).

This screenshot shows the dialog box for creating a new Audit. Many of the default values are unchanged, but an Audit name of “Sales_Security_Audit” has been set, and a file path value of “C:SalesAudit” has been selected. — **Figure 13-2** Creating an audit in SSMS.

In the Create Audit dialog box that opens, configure the settings to your requirements, or you can leave the defaults as is. Just be sure to enter in a valid file path if you select File in the Audit Destination list box. We also recommend that you choose an appropriate name to enter into the Audit Name box (the default name is based on the current date and time).

Remember to turn on the audit after it is created. It will appear in the Audit folder, which is within the Security folder in Object Explorer. To do so, right-click the newly created audit, and then, on the shortcut menu, click Enable Audit.

Create a server audit by using T-SQL

The server audit creation process can be quite complex, depending on the destination, file options, audit options, and predicates. As just demonstrated, you can configure a new audit by using SSMS, and then create a script of the settings before clicking OK, which produces a T-SQL script. You can also do this manually.

To read more about creating a server audit in T-SQL visit https://docs.microsoft.com/sql/t-sql/statements/create-server-audit-transact-sql.

To create a server audit in T-SQL, verify that you are connected to the appropriate instance, and then run the next code sample. (You’ll need to change the audit name and file path accordingly.) Note that the next example also sets the audit state to ON. It is created in the OFF state by default.

This audit will not have any effect until an audit specification and target are also created.

Click here to view code image

USE master;

GO

-- Create the server audit.

CREATE SERVER AUDIT Sales_Security_Audit

     TO FILE (FILEPATH = 'C:SalesAudit');

GO

-- Enable the server audit.

ALTER SERVER AUDIT Sales_Security_Audit

    WITH (STATE = ON);

GO

Create a server audit specification in SSMS

In Object Explorer, expand the Security folder. Right-click the Server Audit Specification folder, and then, on the shortcut menu, click New Server Audit Specification.

In the Create Server Audit Specification dialog box (Figure 13-3), in the Name box, type a name of your choosing for the audit specification. In the Audit list box, select the previously created server audit. If you type a different value in the Audit box, a new audit will be created by that name.

This screenshot shows Create Server Audit Specification dialog box, the Name has been set to “Server_Audit,” and for the Audit value, the previously created “Sales_Security_Audit,” has been selected from the dropdown. In the Audit Action Type column, three rows have been created, namely “SERVER_OPERATION_GROUP,” “LOGOUT_GROUP” and “DATABASE_OPERATION_GROUP.” There are no other values selected in the other columns, because they cannot be modified. — **Figure 13-3** Creating a Server Audit Specification in SSMS.

Now you can choose one or more audit actions, or audit action groups.

A full list of audit actions and audit action groups is available at https://docs.microsoft.com/sql/relational-databases/security/auditing/sql-server-audit-action-groups-and-actions.

Note

If you have selected an audit group action, you cannot select Object Class, Object Schema, Object Name, and Principal Name, because the group represents multiple actions.

Remember to turn on the server audit specification after you create it by using the context menu.

Create a server audit specification by using T-SQL

In much the same way as you create the audit itself, you can create a script of the configuration from a dialog box in SSMS, or you can create the specification manually, as shown in the script that follows. Note that the server audit specification refers to a previously created audit.

Click here to view code image

USE [master];

GO

-- Create the server audit specification.

CREATE SERVER AUDIT SPECIFICATION Server_Audit

FOR SERVER AUDIT Sales_Security_Audit

     ADD (SERVER_OPERATION_GROUP),

     ADD (LOGOUT_GROUP),

     ADD (DATABASE_OPERATION_GROUP)

WITH (STATE = ON);

GO

Creating a database audit specification in SSMS

As you would expect, the location of the database audit specification is under the database security context.

In Object Explorer, expand the database on which you want to perform auditing, and then expand the Security folder. Right-click the Database Audit Specifications folder, and then, on the shortcut menu, click New Database Audit Specification. Remember again to use the context menu to turn it on.

Figure 13-4 shows an example of capturing SELECT and INSERT operations on the Sales.CustomerTransactions table by the dbo user.

In this dialog box, the Name has been set to “Sales_Tables,” and for the Audit value, the previously created “Sales_Security_Audit” has been selected from the list box. In the Audit Action Type column, two rows have been created, namely “SELECT” and “INSERT”. The Object Class has been set to OBJECT, the Object Schema is “Sales”, and the Object Name is “CustomerTransactions” for both rows. The Principal Name is set to “dbo” for both, as well. — **Figure 13-4** Creating a database audit specification in SSMS.

Creating a database audit specification by using T-SQL

Again, verify that you are in the correct database context. Create the database audit specification by referring to the server audit that was previously created, and then specify which database actions you want to monitor, as demonstrated in the next example.

The destination is already specified in the server audit, so as soon as this is turned on, the destination will begin logging the events as expected.

Click here to view code image

USE WideWorldImporters;

GO

-- Create the database audit specification.

CREATE DATABASE AUDIT SPECIFICATION Sales_Tables

    FOR SERVER AUDIT Sales_Security_Audit

    ADD (SELECT, INSERT ON Sales.CustomerTransactions BY dbo)

    WITH (STATE = ON);

GO

Viewing an audit log

You can view audit logs either in SSMS or in the Security Log in the Windows Event Viewer. This section describes how to do it by using SSMS.

Note

To view any audit logs, you must have CONTROL SERVER permission.

In Object Explorer, expand the Security folder, and then expand the Audits folder. Right-click the audit log that you want to view, and then, on the shortcut menu, select View Audit Logs.

Note that the Event Time is in UTC format. This is to avoid issues regarding time zones and daylight saving time.

Figure 13-5 shows two audit events that have been logged. In the first, the audit itself has been changed (it was turned on). The second event is a SELECT statement that was run against the table specified in the database audit specification example presented earlier.

This screenshot shows the Log File Viewer dialog box for viewing an audit log. The Event Time is presented in UTC format. — **Figure 13-5** File Viewer dialog box for viewing a SQL Server audit.

There are many columns in the audit that you cannot see in Figure 13-5, notable among them are Server Principal ID (SPID), Session Server Principal Name (the logged-in user), and the Statement (the command that was run). The point here is that you can capture a wealth of information.

Note

You can also view the audit log in an automated manner by using the built-in T-SQL system function sys.fn_get_audit_file, though the data is not formatted the same way as it is through the File Viewer in SSMS. See more at https://docs.microsoft.com/sql/relational-databases/system-functions/sys-fn-get-audit-file-transact-sql.

Auditing with Azure SQL Database

With Azure SQL Database auditing, you can track database activity and write it to an audit log in a container in your Azure Storage account (you are charged for storage accordingly).

This helps you to remain compliant with auditing regulations as well as see anomalies (as discussed earlier in the section “Azure SQL Database Threat Detection”) to give you greater insight into your Azure SQL Database environment.

Auditing gives you the ability to retain an audit trail, report on activity in each database, and analyze reports, which includes trend analysis and security-related events. You can define server-level and database-level policies. Server policies automatically cover new and existing databases.

If you turn on server auditing, that policy applies to any databases on the server. Thus, if you also turn on database auditing for a particular database, that database will be audited by both policies. You should avoid this unless retention periods are different, or you want to audit for different event types.

You can read more about Azure SQL Database auditing in Chapter 17.

Securing Azure infrastructure as a service

Infrastructure as a service (IaaS), or SQL Server running on an Azure VM, is secured in much the same way as the on-premises product. Depending on the edition, you can use TDE, Always Encrypted, row-level security, and dynamic data masking.

With Azure IaaS, setting up a VM in a resource group is secure by default. If you want to allow connections from outside of your Azure virtual network, you need to allow not only the connection through the OS firewall (which is on by default in Windows Server), but you also can control connections through a Network Security Group.

In addition to that, you can control access through a network appliance, such as a firewall or NAT device. This provides finer-grained control over the flow of network traffic in your virtual network, which is needed to set up Azure ExpressRoute, for example (Chapter 3 covers this in some detail).

Network Security Group

A Network Security Group (NSG) controls the flow of traffic in and out of the entirety (or part) of an Azure virtual network subnet.

An NSG provides security for an entire subnet by default, which affects all the resources in that subnet (see Figure 13-6). If you require more control, you can associate the NSG with an individual network interface card (NIC), thus further restricting traffic.

In this illustration, an Azure Virtual Network called TestVNet (network ID 192.168.0.0/16) is separated into two subnets, labeled FrontEnd (network ID 192.168.1.0/24) and BackEnd (network ID 192.168.2.0/24). Each subnet is protected by a Network Security Group. The FrontEnd subnet contains two web servers, Web1 and Web2, whereas the BackEnd subnet contains two SQL Servers, SQL1 and SQL2. — **Figure 13-6** A typical virtual network, with each subnet secured by a Network Security Group.

Note

When creating a VM using the Azure Resource Manager, it will come with at least one virtual NIC, which you manage through an NSG. This is an important, because individual NICs can belong to different NSGs, which provides finer control over the flow of network traffic on individual VMs.

As with typical firewalls, the NSG has rules for incoming and outgoing traffic. When a packet hits a port on the virtual network or subnet, the NSG intercepts the packet and checks whether it matches one of the rules. If the packet does not qualify for processing, it is discarded (dropped).

Rules are classified according to source address (or range) and destination address (or range). Depending on the direction of traffic, the source address can refer to inside the network or outside on the public Internet.

This becomes cumbersome with more complex networks, so to simplify administration and provide flexibility, you can use service tags to define rules by service name instead of IP address.

You can also use default categories, namely VirtualNetwork (the IP range of all addresses in the network), AzureLoadBalancer (the Azure infrastructure load balancer), and Internet (IP addresses outside the range of the Azure Virtual Network).

You can read more about Azure Virtual Network security and get a full list of service tags at https://docs.microsoft.com/azure/virtual-network/security-overview.

User-defined routes and IP forwarding

As a convenience to Azure customers, all VMs in an Azure Virtual Network are able to communicate with one another by default, irrespective of the subnet in which they reside. This also holds true for virtual networks connected to your on-premises network by a VPN, and for Azure VMs communicating with the public Internet (including those running SQL Server).

You can read more about Virtual Private Networks in Chapter 3.

In a traditional network, communication across subnets like this requires a gateway to control (route) the traffic. Azure provides these system routes for you automatically.

You might decide that this free-for-all communication is against your network policy and that all traffic from your VMs should first be channeled through a network appliance (such as a firewall or NAT device). Virtual appliances are available in the Azure Marketplace at an additional cost, or you could configure a VM yourself to run as a firewall.

A user-defined route with IP forwarding makes this happen. With a user-defined route, you create a subnet for the virtual appliance and force traffic from your existing subnets or VMs through the virtual appliance.

You must enable IP forwarding for the VM in order for that VM to receive traffic addressed to other destinations. This is an Azure setting, not a setting in the guest OS. (https://docs.microsoft.com/azure/virtual-network/virtual-network-scenario-udr-gw-nva)

You may also have to enable IP forwarding in the VM itself in certain circumstances.

Caution

With user-defined routes, you cannot control how traffic enters the network from the public Internet. They only control how traffic leaves a subnet, which means that your virtual appliance must be in its own subnet. If you want to control traffic flow from the public Internet as it enters a subnet, use a Network Security Group.

Until you create a routing table (by user-defined route), subnets in your Virtual Network rely on system routes. A user-defined route adds another entry in the routing table, so a technique called Longest Prefix Match (LPM) kicks in to decide which is the better route to take, by selecting the most specific route (the one with the longest prefix). As seen earlier in Figure 13-6, a /24 prefix is longer than a /16 prefix, and a route entry with a higher prefix takes precedence.

If two entries have the same LPM match, the order of precedence is as follows:

User-defined route
BGP route
System route

Remember BGP? It’s used for ExpressRoute. As we mentioned in Chapter 3, ExpressRoute is a VPN service by which you can connect your Azure Virtual Network to your on-premises network, without going over the public Internet. You can specify BGP routes to direct traffic between your network and the Azure Virtual Network.

Additional security features in Azure networking

There are additional features for improving the management and security of an Azure Virtual Network, as it relates to SQL Server or Azure SQL Database, which are worth discussing here. As of this writing, some of these features are still in preview.

Virtual network service endpoints

Service endpoints make it possible for you to restrict access to certain Azure services that were traditionally open to the public Internet so that they are available only to your Azure Virtual Network, as illustrated in Figure 13-7.

In this illustration, an Azure virtual network is shown on the left (network ID 10.1.0.0/16), containing a subnet (network ID 10.1.1.0/24) with a virtual machine in it (IP address 10.1.1.4). On the right is an Azure Storage account, which is connected to the virtual machine by way of a service endpoint. This endpoint is configured to block access to the Storage account from the Internet but allow access from the virtual machine. — **Figure 13-7** A service endpoint protecting an Azure Storage account.

Configurable through the Azure portal, PowerShell, or Azure CLI, you can block public Internet access to your Azure Storage and Azure SQL Database resources. Additional service endpoints are available, and more will be introduced in the future.

To read more about Virtual Network service endpoints, go to https://docs.microsoft.com/azure/virtual-network/virtual-network-service-endpoints-overview.

Distributed-denial-of-service protection

Azure provides protection against distributed-denial-of-service (DDoS) attacks for Virtual Networks, which is helpful given that attacks against publicly accessible resources are increasing in number and complexity. The basic service included in your subscription provides real-time protection by using the scale and capacity of the Azure infrastructure to mitigate attacks (see Figure 13-8).

For an additional cost, you can take advantage of built-in machine learning algorithms to protect against targeted attacks, with added configuration, alerting, and telemetry.

You also can use the Azure Application Gateway web application firewall to help protect against more sophisticated attacks.

Combined with Azure SQL Database auditing and NSGs, these features provide a comprehensive suite of protection against the latest threats.

To read more about Azure DDoS protection, go to https://azure.microsoft.com/services/ddos-protection.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 13. Protecting data through encryption, privacy and auditing

Create new playlist

Sign In

Sign Up

Chapter 13Protecting data through encryption, privacy and auditing

Privacy in the modern era

General Data Protection Regulation (GDPR)

Responsibility of data professionals

Pseudonymization

Right of access

Right to erasure

Introducing security principles and protocols

Securing your environment with defense in depth

The difference between hashing and encryption

Hashing

Encryption

A primer on protocols and transmitting data

The Internet protocol suite

Versions of IP in use today

Making sense of an IP address

Finding your way around the Internet

A brief overview of the World Wide Web

How does protocol encryption fit into this?

Symmetric and asymmetric encryption

Symmetric encryption (shared secret)

Asymmetric encryption (public key)

Digital certificates

Certification Authority

Protecting the data platform

Securing the network with TLS

A brief history of TLS

Data protection from the OS

The encryption hierarchy in detail

Using EKM modules with SQL Server

Cloud security with Azure Key Vault

Master keys in the encryption hierarchy

The SMK

The DMK

Encrypting data by using TDE

Configuring TDE on a user database

Verifying whether TDE is turned on for a database

Managing and monitoring the TDE scan

Protecting sensitive columns with Always Encrypted

Client application providers that support Always Encrypted

The Column Master Key and Column Encryption Key

Using the Always Encrypted Wizard

Limitations in Always Encrypted

Configuring Always Encrypted with secure enclaves

Row-level security

Filtering predicates for read operations

Blocking predicates for write operations

Dynamic data masking

Limitations with masking data

Protecting Azure SQL Database

Azure SQL Database Advanced Threat Protection (ATP)

Managed instance security features

Built-in firewall protection

Auditing with SQL Server and Azure SQL Database

SQL Server Audit

Requirements for creating an audit

Creating a server audit in SQL Server Management Studio (SSMS)

Create a server audit by using T-SQL

Create a server audit specification in SSMS

Create a server audit specification by using T-SQL

Creating a database audit specification in SSMS

Creating a database audit specification by using T-SQL

Viewing an audit log

Auditing with Azure SQL Database

Securing Azure infrastructure as a service

Network Security Group

User-defined routes and IP forwarding

Additional security features in Azure networking

Virtual network service endpoints

Distributed-denial-of-service protection

Table of Contents for
Chapter 13. Protecting data through encryption, privacy and auditing

Chapter 13
Protecting data through encryption, privacy and auditing