Chapter 3
Designing and implementing an on-premises database infrastructure

This chapter covers the architecture of an on-premises database infrastructure, including the differences between data and transaction log files, and how certain features work to ensure durability and consistency even during unexpected events.

We cover what certain important configuration settings mean, both from a performance and best practice perspective. We also go into detail about the different kinds of data compression and file system settings that are most appropriate for your environment.

Finally, we discuss how to deploy SQL Server using containers and Kubernetes.

The sample scripts in this chapter, and all scripts for this book, are available for download at https://www.MicrosoftPressStore.com/SQLServer2019InsideOut/downloads.

Introduction to SQL Server database architecture

The easiest way to observe the implementation of a SQL Server database is by its files. Every SQL Server database comprises at least two main kinds of file:

  • Data. The data itself is stored in one or more filegroups. Each filegroup in turn comprises one or more physical data files.

  • Transaction Log. This is where all data modifications are saved until committed or rolled back and then hardened to a data file. There is usually only one transaction log file per database.

Note

There are several other file types used by SQL Server, including logs, trace files, and memory-optimized filegroups, which we discuss later in this chapter.

Data files and filegroups

When you initially create a user database, SQL Server uses the model database as a template, which provides your new database with its default configuration, including ownership, compatibility level, file growth settings, recovery model (full, bulk-logged, simple), and physical file settings.

By default, each new database has one transaction log file and one data filegroup. This data filegroup is known as the primary filegroup, comprising a single data file by default. It is known as the primary data file, which has the file extension .mdf (see Figure 3-1).

This graphic shows that a typical SQL Server database can have one or more filegroups, and each filegroup can have one or more data files. Each file is in turn composed of many 8-KB data pages. The primary data file in the primary filegroup has an MDF file extension and is the main data file. Non-primary data files have an NDF file extension.

Figure 3-1 The data files as they make up one or more filegroups in a database.

Note

The file extensions used for SQL Server data and transaction log files are a convention only and are not required.

You can have more than one file in a filegroup, which can provide better performance through parallel reads and writes (but please test this scenario before adding too many files). Secondary data files generally have the file extension .ndf.

The real benefit comes with adding new filegroups and splitting your logical data storage across those filegroups. This makes it possible for you to do things like piecemeal backups and online restore at a filegroup level in Enterprise edition.

Note

Offline filegroup restore is available in Standard edition.

You can also age-out data into a filegroup that is set to read-only and store it on slower storage than the current data, to manage storage costs better.
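
A minimal sketch of that approach follows; the database, filegroup, and file names (and the path) are hypothetical, so substitute your own:

ALTER DATABASE WideWorldImporters ADD FILEGROUP Archive2018;

ALTER DATABASE WideWorldImporters
ADD FILE (NAME = N'Archive2018_Data', FILENAME = N'E:\Data\Archive2018_Data.ndf')
TO FILEGROUP Archive2018;

-- After the aged data has been moved into the filegroup, mark it read-only
ALTER DATABASE WideWorldImporters MODIFY FILEGROUP Archive2018 READ_ONLY;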

If you use table partitioning (see the “Table partitioning” section later in the chapter), splitting partitions across filegroups makes even more sense.

Grouping data pages with extents

SQL Server data pages are 8 KB in size. A group of eight contiguous pages is called an extent, which is 64 KB in size.

There are two types of extents in a SQL Server data file:

  • Uniform Extent. All eight 8-KB pages per extent are assigned to a single object.

  • Mixed Extent. (Rare) Each page in the extent can be assigned to a different object, so up to eight objects can share a single extent.

Mixed extents were originally created to reduce storage requirements for small database objects, back when mechanical hard drives were much smaller and more expensive. Now that storage is faster and cheaper, this saving is no longer worth the cost: mixed-extent allocation causes contention (a hotspot) in the allocation pages near the beginning of a data file, especially if a lot of small objects are being created and deleted.

Mixed extents are turned off by default for TempDB and user databases, while they are turned on by default for system databases. If you want, you can configure mixed extents on a user database by using the following command:

ALTER DATABASE <dbname> SET MIXED_PAGE_ALLOCATION ON;
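
To check which databases currently allow mixed extents, you can query the is_mixed_page_allocation_on column:

SELECT name, is_mixed_page_allocation_on
FROM sys.databases;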

Contents and types of data pages

All data pages begin with a header of 96 bytes, followed by a body containing the data itself. At the end of the page is a slot array, which fills up in reverse order, beginning with the first row, as illustrated in Figure 3-2. It instructs the Database Engine where a particular row begins on that particular page. Note that the slot array does not need to be in any particular order after the first row.

This illustration shows that each 8-KB data page has a page header of 96 bytes at the start of the page, and a slot array at the end of the page, with the data taking up the rest of the space. The slot array is in reverse order and shows the order of the rows in the page.

Figure 3-2 A typical 8-KB data page, showing the header, the data, and slot array.

At certain points in the data file, there are system-specific data pages (also 8 KB in size). These help SQL Server recognize and manage the different data within each file.

Several types of pages can be found in a data file:

  • Data. Regular data from a heap, or clustered index at the leaf level (the data itself; what you would see when querying a table).

  • Index. Nonclustered index data at the leaf and non-leaf level as well as clustered indexes at the non-leaf level.

  • Large object data types. These include text, ntext, image, nvarchar(max), varchar(max), varbinary(max), Common Language Runtime (CLR) data types, xml, and sql_variant where it exceeds 8 KB. Overflow data can also be stored here (data that has been moved “off-page” by the Database Engine), with a pointer from the original page.

  • Global Allocation Map (GAM). Keeps track of all free extents in a data file. There is one GAM page for every GAM interval (64,000 extents, or roughly 4 GB).

  • Shared Global Allocation Map (SGAM). Keeps track of mixed extents that have at least one unused page. It has the same interval as the GAM.

  • Page Free Space (PFS). Keeps track of free space inside heap and large object pages. There is one PFS page for every PFS interval (8,088 pages, or roughly 64 MB).

  • Index Allocation Map (IAM). Keeps track of which extents in a GAM interval belong to a particular allocation unit (an allocation unit is a bucket of pages that belong to a partition, which in turn belongs to a table). It has the same interval as the GAM. There is at least one IAM for every allocation unit. If more than one IAM belongs to an allocation unit, it forms an IAM chain.

  • Bulk Changed Map (BCM). Keeps track of extents that were modified by bulk-logged operations since the last transaction log backup. It is used by transaction log backups in the bulk-logged recovery model to see which extents to back up.

  • Differential Changed Map (DCM). Sometimes called a differential bitmap. Keeps track of extents that were modified since the last full backup. It is used by differential backups to determine which extents to back up.

  • Boot Page. Only one per database; contains information about the database (see the sample query following this list).

  • File Header Page. One per data file and contains information about the file.
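
If you want to see these structures for yourself, SQL Server 2019 introduces the sys.dm_db_page_info dynamic management function, which returns the header contents of any page. The following sketch reads the boot page (page 9 in file 1); the database name is a placeholder:

SELECT *
FROM sys.dm_db_page_info(DB_ID(N'WideWorldImporters'), 1, 9, 'DETAILED');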

Verifying data pages by using a checksum

By default, when a data page is read into the buffer pool, a checksum is automatically calculated over the entire 8-KB page and compared to the checksum stored in the page header on the drive. This is how SQL Server keeps track of page-level corruption. If the checksum stored on the drive does not match the checksum in memory, corruption has occurred. A record of this suspect page is stored in the msdb database and you will see an error message when that page is accessed.

A checksum is also calculated whenever a modified page is written to the drive and stored in the page header, so that it can be verified the next time the page is read.

Although the PAGE_VERIFY property on new databases is set to CHECKSUM by default, it might be necessary to check databases that have been upgraded from previous versions of SQL Server, especially those created prior to SQL Server 2005 (compatibility level 90).

You can look at the checksum verification status on all databases by using the following query:

SELECT name, page_verify_option_desc
FROM sys.databases;
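
If an upgraded database still reports TORN_PAGE_DETECTION or NONE, you can switch it to checksum verification with the following command (the database name is a placeholder). Keep in mind that checksums are only added to pages as they are subsequently written; existing pages are not updated retroactively.

ALTER DATABASE <dbname> SET PAGE_VERIFY CHECKSUM;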

You can reduce the likelihood of data page corruption by using Error-Correcting Code (ECC) memory. Data page corruption on the drive is detected by using DBCC CHECKDB and other operations.

Recording changes in the transaction log

The transaction log is the most important component of a SQL Server database because it is where all units of work (transactions) performed on a database are recorded, before the data can be written (flushed) to the drive. The transaction log file usually has the file extension .ldf.

Note

Although it is possible to use more than one file to store the transaction logs for a database, we do not recommend this because there is no performance or maintenance benefit to using multiple files. To understand why and where it might be appropriate to have more than one, see the section “Inside the transaction log file with virtual log files” later in the chapter.

A successful transaction is said to be committed. An unsuccessful and completely reversed transaction is said to be rolled back.

In Chapter 2, we saw that when SQL Server needs an 8-KB data page from the data file, it usually copies it from the drive and stores a copy of this page in memory in an area called the buffer pool while that page is required. When a transaction needs to modify that page, it works directly on the copy of the page in the buffer pool. If the page is subsequently modified, a log record of the modification is created in the log buffer (also in memory), and that log record is then written to the drive.

By default, SQL Server uses a technique called Write-Ahead Logging (WAL), which ensures that no changes are written to the data file before the necessary log record is written to the drive in a permanent form (in this case, non-volatile storage).

However, you can use delayed durability (also known as lazy commit), which does not harden every change to the transaction log as it happens. Instead, it waits until the log buffer grows to a certain size (or sys.sp_flush_log runs) before flushing it to the drive.

Caution

If you turn on delayed durability on your database, the performance benefit comes with the downside of potential data loss if the underlying storage layer experiences a failure before the log can be saved. You should also run sys.sp_flush_log before shutting down SQL Server for all databases with delayed durability enabled.
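
Delayed durability is controlled at the database level (and can also be requested per transaction). A brief sketch, using a placeholder database name:

-- Allow individual transactions to request delayed durability
ALTER DATABASE <dbname> SET DELAYED_DURABILITY = ALLOWED;

-- Or force it for every transaction in the database
ALTER DATABASE <dbname> SET DELAYED_DURABILITY = FORCED;

-- Flush any outstanding log records to the drive on demand
EXEC sys.sp_flush_log;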

A transaction’s outcome is unknown until a commit or rollback occurs. An error might occur during a transaction, or the operator might decide to roll back the transaction manually because the results were not as expected. In the case of a rollback, changes to the modified data pages must be undone. SQL Server will make use of the saved log records to undo the changes for an incomplete transaction.

Only when the transaction log file is written to can the modified 8-KB page be saved in the data file, though the page might be modified several times in the buffer pool before it is flushed to the drive, using a checkpoint operation.

Our guidance, therefore, is to use the fastest storage possible for the transaction log file(s), because of the low-latency requirements.

Flushing data to the storage subsystem with checkpoints

Recall from Chapter 2 that any changes to the data are written to the database file asynchronously, for performance reasons. This process is controlled by a database checkpoint. As its name implies, a checkpoint operates at the database level. You can influence how often it occurs by changing the server-level recovery interval setting, or you can trigger one directly by running the CHECKPOINT command in the database context.

The checkpoint process takes all of the modified pages in the buffer pool, as well as transaction log information that is in memory, and writes that to the storage subsystem. This reduces the time it takes to recover a database because only the changes made after the latest checkpoint need to be rolled forward in the Redo phase (see the “Restarting with recovery” section later in the chapter).

Inside the transaction log file

A transaction log file is split into logical segments, called virtual log files (VLFs). These segments are dynamically allocated when the transaction log file is created and whenever the file grows. The size of each VLF is not fixed and is based on an internal algorithm, which depends on the version of SQL Server, the current file size, and file growth settings. Each VLF has a header containing a Minimum Log Sequence Number and whether it is active.
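
You can inspect the VLFs of the current database with the sys.dm_db_log_info dynamic management function, for example to count how many exist and how many are active:

SELECT COUNT(*) AS total_vlf_count,
       SUM(CASE WHEN vlf_active = 1 THEN 1 ELSE 0 END) AS active_vlf_count
FROM sys.dm_db_log_info(DB_ID());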

Every record in the transaction log is uniquely identified by a Log Sequence Number (LSN). LSNs are ordered, so a later LSN is always greater than an earlier LSN. LSNs are also used by database backups and restores.

Figure 3-3 illustrates how the transaction log is circular. When a VLF is first allocated by creation or file growth, it is marked inactive in the VLF header. Transactions can be recorded only in active portions of the log file, so the SQL Server engine looks for inactive VLFs sequentially, and as it needs them, marks them as active to allow transactions to be recorded.

This graphic shows a typical SQL Server transaction log file. The transaction log file has an LDF file extension. Each log file is composed of virtual log files (VLFs), which are not of a fixed size. The log file is circular, so once the end of the file is reached, SQL Server wraps around to the beginning of the file to find an inactive virtual log file to use.

Figure 3-3 The transaction log file, showing active and inactive VLFs.

Marking a VLF inactive is called log truncation, but this operation does not affect the size of the physical transaction log file. It just means that an active VLF has been marked inactive and can be reused.

There are several reasons why log truncation can be delayed. After the transactions that make use of an active VLF are committed or rolled back, what happens next depends on a number of factors:

  • The recovery model:

    • Simple. An automatic checkpoint is queued after the recovery interval timeout is reached or if the log becomes 70 percent full.

    • Full/bulk-logged. A transaction log backup must take place after a transaction is committed. A checkpoint is queued if the log backup is successful.

  • Other processes that can delay log truncation:

    • Active backup or restore. The transaction log cannot be truncated if it is being used by a backup or restore operation.

    • Active transaction. If another transaction is using an active VLF, it cannot be truncated.

    • Database mirroring. Mirrored changes must be synchronized before the log can be truncated. This occurs in high-performance mode or if the mirror is behind the principal database.

    • Replication. Transactions that have not yet been delivered to the distribution database can delay log truncation.

    • Database snapshot creation. This is usually brief, but creating snapshots (manually or through database consistency checks, for instance) can delay truncation.

    • Log scan. Usually brief, but this, too, can delay a log truncation.

    • Checkpoint operation. See the section “Flushing data to the storage subsystem with checkpoints” later in the chapter.

After the checkpoint is issued and the dependencies on the transaction log (as just listed) are removed, the log is truncated by marking those VLFs as inactive.
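
If the log is not truncating as expected, the log_reuse_wait_desc column in sys.databases reports which of the reasons listed above is currently holding truncation back:

SELECT name, log_reuse_wait_desc
FROM sys.databases;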

The log is accessed sequentially in this manner until it gets to the end of the file. At this point, the log wraps around to the beginning, and the Database Engine looks for an inactive VLF from the start of the file to mark active. If there are no inactive VLFs available, the log file must create new VLFs by growing in size according to the auto growth settings.

If the log file cannot grow, it will stop all operations on the database until VLFs can be reclaimed or created.

The Minimum Recovery LSN

When a checkpoint occurs, a log record is written to the transaction log stating that a checkpoint has commenced. After this, the Minimum Recovery LSN (MinLSN) must be recorded. This LSN is the minimum of either the LSN at the start of the checkpoint, the LSN of the oldest active transaction, or the LSN of the oldest replication transaction that hasn’t been delivered to the transactional replication distribution database.

In other words, the MinLSN “…is the log sequence number of the oldest log record that is required for a successful database-wide rollback.” (https://docs.microsoft.com/sql/relational-databases/sql-server-transaction-log-architecture-and-management-guide).

This way, crash recovery knows to start recovery only at the MinLSN and can skip over any older LSNs in the transaction log if they exist.

The checkpoint also records the list of active transactions that have made changes to the database. If the database is in the simple recovery model, the unused portion of the transaction log before the MinLSN is marked for reuse. All dirty data pages and information about the transaction log are written to the storage subsystem, the end of the checkpoint is recorded in the log, and (importantly) the LSN from the start of the checkpoint is written to the boot page of the database.

Note

In the full and bulk-logged recovery models, a successful transaction log backup issues a checkpoint implicitly.

Types of database checkpoints

Checkpoints can be activated in a number of different scenarios. The most common is the automatic checkpoint, which is governed by the recovery interval setting (see the Inside OUT sidebar that follows to see how to modify this setting) and typically takes place approximately once every minute for active databases (those databases in which a change has occurred at all).

Note

Infrequently accessed databases with no transactions do not require a frequent checkpoint, because nothing has changed in the buffer pool.

Other checkpoint events include the following:

  • Database backups (including transaction log backups)

  • Database shutdowns

  • Adding or removing files on a database

  • SQL Server instance shutdown

  • Minimally logged operations (for example, in a database in the simple or bulk-logged recovery model)

  • Explicit use of the CHECKPOINT command

There are four types of checkpoints that can occur:

  • Automatic. Issued internally by the Database Engine to meet the value of the recovery interval setting at the instance level. On SQL Server 2016 and higher, the default is one minute.

  • Indirect. Issued to meet a user-specified target recovery time at the database level, if the TARGET_RECOVERY_TIME has been set (see the example following this list).

  • Manual. Issued when the CHECKPOINT command is run.

  • Internal. Issued internally by various features, such as backup and snapshot creation, to ensure consistency between the log and the drive image.
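
For example, you can configure an indirect checkpoint target of 60 seconds at the database level (the database name is a placeholder), or issue a manual checkpoint in the current database context:

ALTER DATABASE <dbname> SET TARGET_RECOVERY_TIME = 60 SECONDS;

CHECKPOINT;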

Restarting with recovery

Whenever SQL Server starts, recovery (also referred to as crash recovery or restart recovery) takes place on every single database (on at least one thread per database, to ensure that it completes as quickly as possible) because SQL Server does not know for certain whether each database was shut down cleanly.

The transaction log is read from the latest checkpoint in the active portion of the log, or the LSN it gets from the boot page of the database (see the “The Minimum Recovery LSN” section earlier in the chapter), and scans all active VLFs looking for work to do.

All committed transactions are rolled forward (the Redo portion) and then all uncommitted transactions are rolled back (the Undo portion). The total number of rolled-forward and rolled-back transactions is recorded for each database, with a respective entry in the ERRORLOG file.

SQL Server Enterprise edition brings the database online immediately after the Redo portion is complete. Other editions must wait for the Undo portion to complete before the database is brought online.

  • See Chapter 8 for more information about database corruption and recovery.

The reason why we cover this in such depth in this introductory chapter is to help you to understand why drive performance is paramount when creating and allocating database files.

When a transaction log is first created or file growth occurs, the portion of the drive must be zeroed-out (the file system literally writes zeroes in every byte in that file segment).

Instant file initialization does not apply to transaction log files for this reason, so keep this in mind when growing or shrinking transaction log files. All activity in a database will stop until the file operation is complete.

As you can imagine, this can be time consuming for larger files, so you need to take care when setting file growth options, especially with transaction log files. You should measure the performance of the underlying storage layer and choose a fixed growth size that balances performance with a reduced VLF count. Consider setting file growth for transaction log files in multiples of 8 GB. At a sequential write speed of 200 MBps, an 8-GB growth would take under a minute to complete.

MinLSN and the active log

As mentioned earlier, each VLF contains a header that includes an LSN and an indicator as to whether the VLF is active. The portion of the transaction log, from the VLF containing the MinLSN to the VLF containing the latest log record, is considered the active portion of the transaction log.

All records in the active log are required in order to perform a full recovery if something goes wrong. The active log must therefore include all log records for uncommitted transactions, too, which is why long-running transactions can be problematic. Replicated transactions that have not yet been delivered to the distribution database can also affect the MinLSN.

Any type of transaction that does not allow the MinLSN to increase during the normal course of events affects overall health and performance of the database environment, because the transaction log file might grow uncontrollably.

When VLFs cannot be made inactive until a long-running transaction is committed or rolled back, or if a VLF is in use by other processes (including database mirroring, availability groups, and transactional replication, for example), the log file is forced to grow. Any log backups that include these long-running transaction records will also be large. The recovery phase can also take longer because there is a much larger volume of active transactions to process.

A faster recovery with Accelerated Database Recovery

In SQL Server 2019 and Azure SQL Database, you can enable Accelerated Database Recovery (ADR) at the database level. At a high level, ADR trades extra space in the data file for reduced space in the transaction log, and improved performance during manual transaction rollbacks and crash recovery, especially in environments where long-running transactions are common.

It introduces four components:

  • Persisted version store (PVS). This works in a similar way to Read Committed Snapshot Isolation (RCSI) by recording a previously committed version of a row until a transaction is committed. The main difference is that the PVS is stored in the user database and not in TempDB, which allows database-specific changes to be recorded in isolation from other instance-level operations.

  • Logical revert. If a long-running transaction is aborted, the versioned rows created by the PVS can be safely ignored by concurrent transactions. Additionally, upon rollback, the previous version of the row is immediately made available by releasing all locks.

  • Secondary log stream (sLog). The sLog is a low-volume in-memory log stream that records non-versioned operations (including lock acquisitions). It is persisted to the transaction log file during a checkpoint operation, and aggressively truncated when transactions are committed.

  • Cleaner. This asynchronous process periodically cleans page versions that are no longer required.

Where ADR shines is in the Redo and Undo phases of crash recovery. In the first part of the Redo phase, the sLog is processed first. Because it contains only uncommitted transactions since the last checkpoint, it is processed extremely quickly. The second part of the Redo phase begins from the last checkpoint in the transaction log, as opposed to the oldest committed transaction.

In the Undo phase, ADR is able to complete almost instantly by first undoing non-versioned operations recorded by the sLog, and then performing a logical revert on row-level versions in the PVS and releasing all locks.

Customers using this feature may notice faster rollbacks, a significant reduction in transaction log usage for long-running transactions, faster SQL Server startup times, and a small increase in the size of the data file for each database where this is enabled (on account of the storage required for the PVS). However, as with all features of SQL Server, we recommend that you do testing before enabling this on all production databases.
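
Enabling ADR is a single database-level command. You can optionally direct the PVS to a specific filegroup; the database and filegroup names below are placeholders:

ALTER DATABASE <dbname> SET ACCELERATED_DATABASE_RECOVERY = ON;

-- Or, with the persisted version store in its own filegroup
ALTER DATABASE <dbname> SET ACCELERATED_DATABASE_RECOVERY = ON
(PERSISTENT_VERSION_STORE_FILEGROUP = <filegroup_name>);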

Table partitioning

SQL Server allows you to break up a table into logical units, or partitions, for easier management and maintenance while still treating it as a single table. All tables in SQL Server are already partitioned if you look deep enough into the internals. It just so happens that there is one logical partition per table by default.

This concept is called horizontal partitioning. Suppose that a database table is growing extremely large, and adding new rows is time consuming. You might decide to split the table into groups of rows, based on a partitioning key (typically a date column), with each group in its own partition. In turn, you can store these in different filegroups to improve read and write performance.

Breaking up a table into logical partitions can also result in a query optimization called partition elimination, by which only the partition that contains the data you need is queried. However, it was not designed primarily as a performance feature. Partitioning tables will not automatically result in better query performance, and, in fact, performance might be worse due to other factors, specifically around statistics.

Even so, there are some major advantages to table partitioning, which benefit large datasets, specifically around rolling windows and moving data in and out of the table. This process is called partition switching, by which you can switch data into and out of a table almost instantly.

Assume that you need to load data into a table every month and then make it available for querying. With table partitioning, you put the data that you want to insert into a completely separate table in the same database, which has the same structure and clustered index as the main table. Then, a switch operation moves that data into the partitioned table almost instantly because no data movement is needed.

This makes it very easy to manage large groups of data or data that ages-out at regular intervals (sliding windows) because partitions can be switched out nearly immediately.
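
As a minimal sketch of these concepts, the following creates a partition function and scheme that split rows by year, and then switches staged data into an empty partition. All object, filegroup, and column names are hypothetical; it assumes that dbo.Sales was created on the ps_OrderYear scheme and that dbo.Sales_Staging has an identical structure (with appropriate constraints) on the filegroup that backs the target partition:

CREATE PARTITION FUNCTION pf_OrderYear (date)
AS RANGE RIGHT FOR VALUES ('2019-01-01', '2020-01-01');

CREATE PARTITION SCHEME ps_OrderYear
AS PARTITION pf_OrderYear TO (FG_Archive, FG_2019, FG_2020);

-- Switch the staged rows into partition 3 (the 2020 range) almost instantly
ALTER TABLE dbo.Sales_Staging SWITCH TO dbo.Sales PARTITION 3;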

Compressing data

SQL Server supports several types of data compression to reduce the amount of drive space required for data and backups, as a trade-off against higher CPU utilization.

In general, the amount of CPU overhead required to perform compression and decompression depends on the type of data involved, and in the case of data compression, the type of queries running against the database, as well. Even though the higher CPU load might be offset by the savings in I/O, we always recommend testing before implementing this feature.

Table and index compression

SQL Server provides two main ways to reduce the amount of storage required for data in tables and indexes. We discuss compressing rowstore data in this section. You can read more about columnstore indexes in Chapter 15, “Understanding and designing indexes.”

Note

SQL Server 2019 introduces a new collation type, UTF-8, which may improve storage of Latin-based strings. See Chapter 7 for more information.

Row compression

You turn on row compression at the table or index level. Each column in a row is evaluated according to the type of data and contents of that column, as follows:

  • Numeric data types (such as integer, decimal, floating point, datetime, money, and their derived types) are stored as variable length at the physical layer.

  • Fixed-length character data types are stored as variable length strings, where the blank trailing characters are not stored.

  • Variable length data types, including large objects, are not affected by row compression.

  • Bit columns actually take up more space due to associated metadata.

Row compression can be useful for tables with fixed-length character data types and where numeric types are overprovisioned (e.g., a bigint column that contains mostly int values). Unicode compression alone can save between 15 and 50 percent, depending on your locale and collation.
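
You can estimate the space savings before committing to compression, and then rebuild the table with row compression. The schema and table names are placeholders:

EXEC sys.sp_estimate_data_compression_savings
    @schema_name = N'dbo',
    @object_name = N'SalesOrders',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = N'ROW';

ALTER TABLE dbo.SalesOrders REBUILD WITH (DATA_COMPRESSION = ROW);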

Page compression

You turn on page compression at the table level, but it operates on all data pages associated with that table, including indexes, table partitions, and index partitions. Leaf-level pages (see Figure 3-4) are compressed using three steps:

  1. Row compression

  2. Prefix compression

  3. Dictionary compression

    This diagram shows a B+ tree structure, used by both clustered and non-clustered indexes internally. A root node, or level (Page 3), comprises one 8-KB page. It fans out downwards to distribute the leaf nodes (where the data is actually stored) as evenly as possible (Pages 1, 4, 5, and 7). An intermediate level (the non-leaf level) is also shown in this contrived example (Pages 2 and 6).

    Figure 3-4 A small clustered index with leaf and non-leaf levels, clustered on ColID.

Non-leaf-level pages are compressed using row compression only. This is for performance reasons.

Prefix compression works per column, by searching for a common prefix in each column. A row is created just below the page header, called the compression information (CI) structure, which stores the chosen prefix value for each column.

If any of a column’s rows on the page match the prefix, its value is replaced by a reference to that column’s prefix.

Dictionary compression then searches across the entire page, looking for repeating values, irrespective of the column, and stores these in the CI structure. When a match is found, the column value in that row is replaced with a reference to the compressed value.

If a data page is not full, it will be compressed using only row compression. If the size of the compressed page along with the size of the CI structure is not significantly smaller than the uncompressed page, no page compression will be performed on that page.
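
Page compression is applied in the same way, either to the whole table or to an individual index (the names are placeholders):

ALTER TABLE dbo.SalesOrders REBUILD WITH (DATA_COMPRESSION = PAGE);

ALTER INDEX IX_SalesOrders_OrderDate ON dbo.SalesOrders
REBUILD WITH (DATA_COMPRESSION = PAGE);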

Backup compression

Whereas page-level and row-level compression operate at the table level, backup compression applies to the backup file for the entire database.

Compressed backups are usually smaller than uncompressed backups, which means fewer drive I/O operations are involved, which in turn means a reduction in the time it takes to perform a backup or restore. For larger databases, this can have a dramatic effect on the time it takes to recover from a disaster.

Backup compression ratio is affected by the type of data involved, whether the database is encrypted, and whether the data is already compressed. In other words, a database making use of page and/or row compression might not gain any benefit from backup compression.

The CPU can be limited for backup compression in Resource Governor (you can read more about Resource Governor in the section “Configuration settings” later in the chapter, and in more detail in Chapter 8).

In most cases, we recommend turning on backup compression, keeping in mind that you might need to monitor CPU utilization.
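
You can request compression on an individual backup, or make it the default for the instance. The database name and backup path are placeholders:

BACKUP DATABASE <dbname>
TO DISK = N'D:\Backup\<dbname>.bak'
WITH COMPRESSION, CHECKSUM;

-- Make compression the default for all backups on this instance
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;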

Managing the temporary database

TempDB is the working area of every database on the instance, and there is only one TempDB per instance. SQL Server uses this temporary database for a number of things that are mostly invisible to you, including temporary tables, table variables, triggers, cursors, sorting, version data for snapshot isolation and read-committed snapshot isolation, index creation, user-defined functions, and many more.

Additionally, when performing queries with operations that don’t fit in memory (the buffer pool and the buffer pool extension), these operations spill to the drive, requiring the use of TempDB.

Storage options for TempDB

Every time SQL Server restarts, TempDB is cleared out. If the files don’t exist, they are recreated. If the files are configured at a size that is different from their last active size, they will automatically be resized. Like the database file structure described earlier, there is usually one TempDB transaction log file, and one or more data files in a single filegroup.

Performance is critical for TempDB, even more than with other databases, to the point that the current recommendation is to use your fastest storage for TempDB, before using it for user database transaction log files.

Where possible, use solid-state storage for TempDB. If you have a Failover Cluster Instance, have TempDB on local storage on each node.

Starting with SQL Server 2019, TempDB can store certain metadata in memory-optimized tables for increased performance.

Recommended number of files

As with every database, only one transaction log file should exist for TempDB.

For physical and virtual servers, the default number of TempDB data files recommended by SQL Server Setup should match the number of logical processor cores, up to a maximum of eight, keeping in mind that your logical core count includes symmetrical multithreading (for example, Hyper-Threading). Adding more TempDB data files than the number of logical processor cores rarely results in positive performance. Adding too many TempDB data files could in fact severely harm SQL Server performance.

  • You can read more about processors in Chapter 2.

Increasing the number of files to eight (along with other factors) reduces TempDB contention when allocating temporary objects. If the instance has more than eight logical processors allocated, you can test to see whether adding more files helps performance; the outcome is very much dependent on the workload.

You can allocate the TempDB data files together on the same volume (see the “Types of storage” section in Chapter 2), provided that the underlying storage layer is able to meet the low-latency demands of TempDB on your instance. If you plan to share the storage with other database files, keep latency and IOPS in mind.
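
You can review the current TempDB file layout and add a data file if required. The file name, path, and sizes below are placeholders; keep all TempDB data files the same size, with the same growth settings:

SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;

ALTER DATABASE tempdb
ADD FILE (NAME = N'tempdev5', FILENAME = N'T:\TempDB\tempdev5.ndf',
SIZE = 8GB, FILEGROWTH = 1GB);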

Configuration settings

SQL Server has scores of settings that you can tune to your particular workload. There are also best practices regarding the appropriate settings (such as file growth, memory settings, and parallelism). We cover some of these in this section.

Managing system usage with Resource Governor

Using Resource Governor, you can specify limits on resource consumption at the application-session level. You can configure these in real time, which allows for flexibility in managing workloads without affecting other workloads on the system.

  • You can find out more about Resource Governor in Chapter 8.

A resource pool represents the physical resources of an instance, which means that you can think of a resource pool itself as a mini SQL Server instance. To make the best use of Resource Governor, it is helpful to logically group similar workloads together into a workload group so that you can manage them under a specific resource pool.

This is done via classification, which looks at the incoming application session’s characteristics. That incoming session will be categorized into a workload group based on your criteria. This facilitates fine-grained resource usage that reduces the impact of certain workloads on other, more critical workloads.

Caution

There is a lot of flexibility and control in classification because Resource Governor supports user-defined functions (UDFs), allowing you to use system functions and even tables to classify sessions. This means that a poorly written UDF can render the system unusable. Always test classifier functions and optimize them for performance. If you need to troubleshoot a classifier function, use the Dedicated Administrator Connection (DAC) because it is not subject to classification.

For example, a reporting application might have a negative impact on database performance due to resource contention at certain times of the day, so by classifying it into a specific workload group, you can limit the amount of memory or disk I/O that reporting application can use, reducing its effect on, say, a month-end process that needs to run at the same time.
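
A minimal sketch of that reporting scenario follows. The pool, group, and application names are hypothetical, and the classifier function must be created in the master database:

CREATE RESOURCE POOL ReportingPool WITH (MAX_CPU_PERCENT = 30, MAX_MEMORY_PERCENT = 30);

CREATE WORKLOAD GROUP ReportingGroup USING ReportingPool;
GO

CREATE FUNCTION dbo.rg_classifier()
RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    -- Route the reporting application into its own workload group
    IF APP_NAME() LIKE N'MonthEndReports%'
        RETURN N'ReportingGroup';
    RETURN N'default';
END;
GO

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;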

Configuring the operating system page file

Operating systems use a portion of the storage subsystem for a page file (also known as a swap file, or swap partition on Linux) for virtual memory for all applications, including SQL Server, when available memory is not sufficient for the current working set. It does this by offloading (paging out) segments of RAM to the drive. Because storage is slower than memory (see Chapter 2), data that has been paged out is also slower when working from the system page file.

The page file also serves the role of capturing a system memory dump for crash forensic analysis, a factor that dictates its size on modern operating systems with large amounts of memory. This is why the general recommendation for the system page file is that it should be at least the same size as the server’s amount of physical memory.

Another general recommendation is that the page file should be managed by the operating system. For Windows Server, this should be set to System Managed, and, since Windows Server 2012, that guideline has functioned well. However, in servers with large amounts of memory, this can result in a very large page file, so be aware of that if the page file is located on your operating system (OS) volume. This is also why the page file is often moved to its own volume, away from the OS volume.

On a dedicated SQL Server instance, you can set the page file to a fixed size, relative to the amount of Max Server Memory assigned to SQL Server. In principle, the database instance will use as much RAM as you allow it, up to that Max Server Memory limit, so Windows should ideally not need to page SQL Server out of RAM. On Linux, the swap partition can be left at the default size, or reduced to 80% of the physical RAM, whichever is lower.

Note

If the Lock pages in memory policy is on (recommended on a physical Windows-based server only), SQL Server will not be forced to page out of memory, and you can set the page file to a smaller size. This can free up valuable space on the OS drive, which can be beneficial to the OS.

  • For more about Lock pages in memory, see the section by the same name later in this chapter.

Taking advantage of logical processors by using parallelism

SQL Server is designed to run on multiple logical processors (for more information, refer to the section “Central Processing Unit” in Chapter 2).

  • You can find out more about parallel query plans in Chapter 14.

In SQL Server, parallelism makes it possible for portions of a query (or the entire query) to run on more than one logical processor at the same time. This has certain performance advantages for larger queries, because the workload can be split more evenly across resources. There is an implicit overhead with running queries in parallel, however, because a controller thread must manage the results from each logical processor and then combine them when each thread is completed.

The SQL Server query optimizer uses a cost-based optimizer when coming up with query plans. This means that it makes certain assumptions about the performance of the storage, CPU and memory, and how they relate to different query plan operators. Each operation has a cost associated with it.

SQL Server will consider creating parallel plan operations, governed by two parallelism settings: Cost threshold for parallelism and Max degree of parallelism. These two settings can make a world of difference to the performance of a SQL Server instance if it is using default settings.

Query plan costs are recorded in a unitless measure. In other words, the cost bears no relation to resources such as drive latency, IOPS, number of seconds, memory usage, or CPU power, which can make query tuning difficult without keeping this in mind. What matters is the magnitude of this measure.

Cost threshold for parallelism

This is the minimum cost a query plan can be before the optimizer will even consider parallel query plans. If the cost of a query plan exceeds this value, the query optimizer will take parallelism into account when coming up with a query plan. This does not necessarily mean that every plan with a higher cost is run across parallel processor cores, but the chances are increased.

The default setting for cost threshold for parallelism is 5. Any query plan with a cost of 5 or higher will be considered for parallelism. Given how much faster and more powerful modern server processors are than when this setting was first created, many queries will run just fine on a single core, again because of the overhead associated with parallel plans.

Note

Certain query operations can force some or all of a query plan to run serially, even if the plan cost exceeds the cost threshold for parallelism. Paul White’s article “Forcing a Parallel Query Execution Plan” describes a few of these. You can read Paul’s article at https://www.sql.kiwi/2011/12/forcing-a-parallel-query-execution-plan.html.

It might be possible to write a custom process to tune the cost threshold for parallelism setting automatically, using information from the Query Store. Because the Query Store works at the database level, it helps identify the average cost of queries per database and find an appropriate setting for the cost threshold relative to your specific workload.

Until we get to an autotuning option, you can set the cost threshold for parallelism to 50 as a starting point for new instances, and then monitor the average execution plan costs, to adjust this value up or down (and you should adjust this value based on your own workload).

Cost threshold for parallelism is an advanced server setting; you can change it by using the command sp_configure 'cost threshold for parallelism'. You can also change it in SQL Server Management Studio by using the Cost Threshold For Parallelism setting, which can be found in the Advanced page of the Server Properties dialog.
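
For example, to raise the threshold to the suggested starting point of 50 (advanced options must be visible first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;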

Max degree of parallelism

SQL Server uses this value, also known as MAXDOP, to select the maximum number of logical processors to run a parallel query plan when the cost threshold for parallelism is reached.

The default setting for MAXDOP is 0, which instructs SQL Server to make use of all available logical processors to run a parallel query (taking processor affinity into account—see later in this chapter).

The problem with this default setting for most workloads is twofold:

  • Parallel queries can consume all resources, preventing smaller queries from running or forcing them to run slowly while they find time in the CPU scheduler.

  • If all logical processors are allocated to a plan, it can result in foreign memory access, which, as we explain in Chapter 2 in the “Non-Uniform Memory Access (NUMA)” section, carries a performance penalty.

Specialized workloads can have different requirements for the MAXDOP. For standard or Online Transaction Processing (OLTP) workloads, to make better use of modern server resources, the MAXDOP setting must take NUMA nodes into account:

  • Single NUMA node. With up to eight logical processors on a single node, the recommended value should be set to 0 or the number of cores. With more than eight logical processors, the recommended value should be set to 8.

  • Multiple NUMA nodes. With up to 16 logical processors on a single node, the recommended value should be set to 0 or the number of cores. With more than 16 logical processors, the recommended value should be set to 16.

MAXDOP is an advanced server setting; you can change it by using the command sp_configure 'max degree of parallelism'. You can also change it in SQL Server Management Studio by using the Max Degree Of Parallelism setting, in the Advanced page of the Server Properties dialog.
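
For example, on a server with more than eight logical processors in a single NUMA node, you could cap MAXDOP at 8 for the instance, or override it for a single database with a database scoped configuration:

-- Requires 'show advanced options', as shown earlier
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;

-- Or, per database
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 8;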

SQL Server memory settings

Since SQL Server 2012, the artificial memory limits imposed by the license for lower editions (Standard, Web, and Express) apply to the buffer pool only (see https://docs.microsoft.com/sql/sql-server/editions-and-components-of-sql-server-version-15).

This is not the same thing as the Max Server Memory, though. According to Microsoft Docs, the Max Server Memory setting controls all of SQL Server’s memory allocation, which includes, but is not limited to the buffer pool, compile memory, caches, memory grants, and CLR (Common Language Runtime, or .NET) memory (https://docs.microsoft.com/sql/database-engine/configure-windows/server-memory-server-configuration-options).

Additionally, limits to columnstore and memory-optimized object memory are over and above the buffer pool limit on non-Enterprise editions, which gives you a greater opportunity to make use of available physical memory.

This makes memory management for non-Enterprise editions more complicated, but certainly more flexible, especially taking columnstore and memory-optimized objects into account.

Max Server Memory

As noted in Chapter 2, SQL Server uses as much memory as you allow it. Therefore, you want to limit the amount of memory that each SQL Server instance can control on the server, ensuring that you leave enough system memory for the following:

  • The OS itself (see the algorithm below).

  • Other SQL Server instances installed on the server.

  • Other SQL Server features installed on the server; for example, SQL Server Reporting Services, SQL Server Analysis Services, or SQL Server Integration Services.

  • Remote desktop sessions and locally run administrative applications like SQL Server Management Studio (SSMS) and Azure Data Studio.

  • Antimalware programs.

  • System monitoring or remote management applications.

  • Any additional applications that might be installed and running on the server (including web browsers).

Caution

If you connect to your SQL Server instance via a remote desktop session, make sure that you have a secure VPN connection in place.

The appropriate Max Server Memory setting will vary from server to server. A good starting point for the reduction from the total server memory is 10 percent less, or 4 GB less than the server’s total memory capacity, whichever is the greater reduction (but see OS reservation later in this section for a more detailed recommendation). For a dedicated SQL Server instance and 16 GB of total memory, an initial value of 12 GB (or a value of 12288 in MB) for Max Server Memory is appropriate.

Note

SQL Server is supported on servers with as little as 4 GB of RAM, in which case a Max Server Memory value of 2,048 MB is recommended.

OS reservation

Jonathan Kehayias has published the following algorithm that can help with reserving the appropriate amount of RAM for the OS itself. Whatever remains can then be used for other processes, including SQL Server by means of Max Server Memory:

  • 1 GB of RAM for the OS

  • Add 1 GB for each 4 GB of RAM installed, from 4 to 16 GB

  • Add 1 GB for every 8 GB RAM installed, above 16 GB RAM

  • To learn more, read Kehayias, J. and Kruger, T., Troubleshooting SQL Server: A Guide for the Accidental DBA (Redgate Books, 2011).

Assuming that a server has 256 GB of available RAM, this requires a reservation of 35 GB for the OS. The remaining 221 GB can then be split between SQL Server and anything else that is running on the server.

Performance Monitor to the rescue

Ultimately, the best way to see if the correct value is assigned to Max Server Memory is to monitor the Memory\Available MBytes counter in Performance Monitor. This way, you can ensure that Windows Server has enough working set of its own and adjust Max Server Memory downward if this value drops below 300 MB.

  • Performance Monitor is covered in more detail in Chapter 14.

Max Server Memory is an advanced server setting; you can change it by using the command sp_configure 'max server memory'. You can also change it in SQL Server Management Studio by using the Max Server Memory setting, in the Server Properties section of the Memory node.

Max Worker Threads

Every process on SQL Server requires a thread, or time on a logical processor, including network access, database checkpoints, and user threads. Threads are managed internally by SQL Server schedulers, with one scheduler for each logical processor, and each scheduler processes only one thread at a time on its respective logical processor.

These threads consume memory, which is why it’s generally a good idea to let SQL Server manage the maximum number of threads allowed automatically.

However, in certain special cases, changing this value from the default of 0 might help performance tuning. The default of 0 means that SQL Server will dynamically assign a value when starting, depending on the number of logical processors and other resources.

To check whether your server is currently under CPU pressure, run the following query, which returns the average number of tasks waiting to run across all visible online schedulers (there is one scheduler per logical processor core):

SELECT AVG(runnable_tasks_count)
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';

If the number of tasks is consistently high (in the double digits), your server is under CPU pressure. You can mitigate this in a number of other ways that you should consider before increasing the number of Max Worker Threads.

In some scenarios, lowering the number of Max Worker Threads can improve performance.

Lock pages in memory

The Lock pages in memory policy prevents Windows from taking memory away from applications such as SQL Server in low memory conditions, which can cause instability if you use it incorrectly. You can mitigate the danger of OS instability by carefully aligning Max Server Memory capacity for any installed SQL Server features (discussed earlier) and reducing the competition for memory resource from other applications.

When reducing memory pressure in virtualized systems, it is also important to avoid over-allocating memory to guests on the virtual host. Meanwhile, locking pages in memory can still prevent the paging of SQL Server memory to the drive due to memory pressure, which is a significant performance hit.

  • For a more in-depth explanation of the Lock pages in memory policy, see Chapter 2.

Optimize for ad hoc workloads

Ad hoc queries are defined, in this context, as queries that are run only once. Applications and reports should be running the same queries many times, and SQL Server recognizes them and caches them over time.

By default, SQL Server caches the runtime plan for a query after the first time it runs, with the expectation of using it again and saving the compilation cost for future runs. For ad hoc queries though, these cached plans will never be reused yet will remain in cache.

When Optimize For Ad Hoc Workloads is set to True, a plan will not be cached until it is recognized to have been called twice (in other words, it will cache the full plan on the second execution). The third and all ensuing times it is run would then benefit from the cached runtime plan. Therefore, it is recommended that you set this option to True.

For most workloads, queries rarely run exactly twice and no more: plans are either run once (truly ad hoc) or reused many times, so the overhead of caching the full plan only on the second execution is negligible.

Note

Enabling Forced Parameterization at the database level can force query plans to be parameterized even if they are considered unique by the query optimizer, which then can reduce the number of unique plans. Provided you test this scenario, you can get better performance using this feature in combination with Optimize For Ad Hoc Workloads.

This is an advanced server setting; you can change it by using the command sp_configure 'optimize for ad hoc workloads'. You can also change it in SQL Server Management Studio by using the Optimize For Ad Hoc Workloads setting, in the Advanced page of the Server Properties dialog.

Allocating CPU cores with an affinity mask

It is possible to assign certain logical processors to SQL Server. This might be necessary on systems that are used for instance stacking (more than one SQL Server instance installed on the same OS) or when workloads are shared between SQL Server and other software.

SQL Server on Linux does not support instance stacking, and virtual consumers (virtual machines or containers) are probably a better way of allocating these resources, but there might be legitimate or legacy reasons for setting core affinity.

Suppose that you have a dual-socket NUMA server, with both CPUs populated by 16-core processors. Excluding simultaneous multithreading (SMT), this is a total of 32 cores, and SQL Server Standard edition is limited to 24 cores, or four sockets, whichever is lower.

When it starts, SQL Server will allocate all 16 cores from the first NUMA node, and eight from the second NUMA node. It will write an entry to the ERRORLOG stating this case, and that’s where it ends. Unless you know about the core limit, you will be stuck with unbalanced CPU core and memory access, resulting in unpredictable performance.

One way to solve this without using a VM or container is to limit SQL Server to 12 cores from each CPU by using an affinity mask (see Figure 3-5). This way, the cores are allocated evenly, and when combined with a reasonable MAXDOP setting of 8, foreign memory access is not a concern.

A screenshot showing the Server Properties dialog box in SQL Server Management Studio. The Processors page is selected.

Figure 3-5 Setting the affinity mask in SQL Server Management Studio.

Note

I/O affinity allows you to assign specific CPU cores to I/O operations, which may be beneficial on enterprise-level hardware with more than 16 CPU cores. You can read more in Microsoft Support at https://support.microsoft.com/help/298402.

By setting an affinity mask, you are instructing SQL Server to use only specific cores. The remaining unused cores are marked as offline to SQL Server. When SQL Server starts, it will assign a scheduler to each online core.

Caution

Affinity masking is not a legitimate way to circumvent licensing limitations with SQL Server Standard edition. If you have more cores than the maximum usable by a certain edition, all logical cores on that machine must be licensed.

Configuring affinity on Linux

For SQL Server on Linux, even when an instance is going to be using all of the logical processors, you should use the ALTER SERVER CONFIGURATION option to set the PROCESS AFFINITY value, which maintains efficient behavior between the Linux OS and the SQL Server Scheduler.

You can set the affinity by CPU or NUMA node, but the NUMA method is simpler.

Suppose that you have four NUMA nodes. You can use the configuration option to set the affinity to use all the NUMA nodes as follows:

ALTER SERVER CONFIGURATION SET PROCESS AFFINITY NUMANODE = 0 TO 3;

File system configuration

This section primarily deals with the default file system on Windows Server. Any references to other file systems, including Linux file systems are noted separately.

The NT File System (NTFS) was originally created for the first version of Windows NT, bringing with it more granular security than the older File Allocation Table (FAT)-based file system as well as a journaling file system (think of it as a transaction log for your file system). You can configure a number of settings that deal with NTFS in some way to improve your SQL Server implementation and performance.

Instant file initialization

As stated previously in this chapter, transaction log files need to be zeroed out at the file system in order for recovery to work properly. However, data files are different, and with their 8-KB page size and allocation rules, the underlying file might contain sections of unused space.

With instant file initialization (IFI), a feature enabled by the Active Directory policy Perform volume maintenance tasks, data files can be instantly resized without zeroing-out the underlying file. This adds a major performance boost.

The trade-off is a small, perhaps insignificant security risk: previously deleted data in the drive allocation now dedicated to a database’s data file might not have been fully erased before being reused. Because you can examine the underlying bytes of data pages using built-in tools in SQL Server, any of that older data that has not yet been overwritten inside the new allocation could be visible to a malicious administrator.

Note

It is important to control access to SQL Server’s data files and backups. When a database is in use by SQL Server, only the SQL Server service account and the local administrator have access. However, if the database is detached or backed up, there is an opportunity to view that deleted data on the detached file or backup file that was created with instant file initialization turned on.

Because this is a possible security risk, the Perform volume maintenance tasks policy is not granted to the SQL Server service by default, and a summary of this warning is displayed during SQL Server setup.

Without IFI, you might find that the SQL Server wait type PREEMPTIVE_OS_WRITEFILEGATHER is prevalent during times of data-file growth. This wait type occurs when a file is being zero-initialized; thus, it can be a sign that your SQL Server is wasting time that could be skipped with the benefit of IFI. Keep in mind that PREEMPTIVE_OS_WRITEFILEGATHER will also be generated by transaction log files, which do not benefit from IFI.

Note that SQL Server Setup takes a slightly different approach to granting this privilege than SQL Server administrators might take. SQL Server assigns Access Control Lists (ACLs) to automatically created security groups, not to the service accounts that you select on the Server Configuration Setup page. Instead of granting the privilege to the named SQL Server service account directly, SQL Server grants the privilege to the per-service security identifier (SID) for the SQL Server database service; for example, the NT SERVICE\MSSQLSERVER principal. This means that the SQL Server service will maintain the ability to use IFI even if its service account changes.

You can determine whether the SQL Server Database Engine service has been granted access to IFI by using the sys.dm_server_services dynamic management view via the following query:

SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE filename LIKE '%sqlservr.exe%';

If IFI was not configured during SQL Server setup, and you want to do so later, go to the Windows Start menu, and then, in the Search box, type Local Security Policy. Next, in the pane on the left, expand Local Policies (see Figure 3-6), and then click User Rights Assignment. Find the Perform Volume Maintenance Tasks policy, and then add the SQL Server service account to the list of objects with that privilege.

This screenshot shows the Windows Local Security Policy setup page. The Perform volume maintenance tasks item is selected. An entry has been added for an Active Directory account for the SQL Server service.

Figure 3-6 Turning on instant file initialization through the Local Security Policy setup page.

NTFS allocation unit size

SQL Server performs best with an allocation unit size of 64 KB.

Depending on the type of storage, the default allocation unit on NTFS might be 512 bytes, 4,096 bytes (also known as Advanced Format 4K sector size), or some other multiple of 512.

Because SQL Server deals with 64-KB extents (see the section “Mixed extents and uniform extents” earlier in the chapter), it makes sense to format a data volume with an allocation unit size of 64 KB, to align the extents with the allocation units. This applies to the Resilient File System (ReFS) on Windows Server, and XFS and ext4 file systems on Linux.
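To check the allocation unit size of an existing volume on Windows, you can use the fsutil utility; the value is reported as Bytes Per Cluster. To format a new, empty data volume with a 64-KB allocation unit, you can use the format command. The drive letter D: below is a placeholder, and formatting destroys any data already on the volume:

fsutil fsinfo ntfsinfo D:

format D: /FS:NTFS /A:64K /Q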

Note

We cover aligned storage in more detail in Chapter 4.

Container orchestration with Kubernetes

Kubernetes is a container orchestration system that was initially developed by Google. Processes like fault tolerance, workload scheduling, and even networking are all provided by the management layer of Kubernetes.

One of the reasons Kubernetes (also known as K8s, because there are eight letters between the K and the s) has become a staple in modern data centers is its flexibility in container orchestration and management. It provides enterprise-level infrastructure functionality to the container development process favored by most DevOps organizations, making it the “operating system for the data center” as coined by Google.

Let’s use the analogy of an actual orchestra, comprising a conductor, instrument sections, musicians, their instruments, and their sheet music. While no analogy is perfect, this might help you picture things more easily. At the bottom are the musical instruments. Each instrument is played by a musician, guided by their sheet music. Groups of musicians play together in a section. Finally, the conductor oversees the entire performance. In our analogy, Kubernetes is the conductor, and containers are the instruments. Clusters are the instrument sections, like the strings or the brass, and the pods are the musicians, each playing from their sheet music.

  • We introduced containers in the section on virtualization in Chapter 2.

Kubernetes relies on a software-defined infrastructure (the sheet music in our analogy). When you deploy your containers, you use a YAML (a recursive acronym standing for “YAML Ain’t Markup Language”) file that defines:

  • The container image you are using

  • Any storage that you are persisting

  • The container CPU and memory configuration of the pod

  • The networking configuration

  • Other metadata about the deployment

The deployment manifest is converted from YAML to JSON by kubectl and then submitted to the Kubernetes API, where it is parsed and stored in a key-value store (called etcd) that holds the cluster metadata. The objects in the manifest are deployed as their respective pods, services, and storage. The cluster controller (part of the control plane) ensures that the objects defined in the manifest are running and in a healthy application state, and redeploys them in the event of node failure or an unhealthy application state. The cluster will always attempt to maintain the desired state of the configuration, as defined in the deployed manifests.
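If you want to see this conversion for yourself, recent releases of kubectl can render a manifest as JSON locally without sending it to the cluster. The file name manifest.yaml is a placeholder:

# Render the manifest as JSON locally, without deploying it
kubectl apply -f manifest.yaml --dry-run=client -o json

# Submit the manifest to the cluster for real
kubectl apply -f manifest.yaml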

Kubernetes support for SQL Server

Microsoft introduced support for Kubernetes after the release of SQL Server 2017 (see Figure 3-7). Early releases of Kubernetes lacked support for persisted storage, which is an obvious problem for database containers. The implementation uses a Kubernetes service to act as a persisted front-end name and IP address for the container. In this scenario, if the pod fails, the service stays running, and a new copy of the pod is launched and then pointed at the persisted storage. This is nearly analogous to the architecture of a SQL Server Failover Cluster Instance (FCI).

This is an image of the architecture of SQL Server on Kubernetes. It shows a master node with three worker nodes. There is a database service spanning the three worker nodes and a SQL pod located on one of the worker nodes. The storage also spans the three Worker nodes and shows a database volume on it.

Figure 3-7 SQL Server on Kubernetes Architecture.

  • Refer to Chapter 2 for a more in-depth discussion on FCIs.

The SQL Server 2017 Kubernetes deployment provides for just a single container per SQL Server pod. SQL Server 2019 may include other services in the pod deployment. Services provide load balancing and persistent IP addressing, while Persistent Volume Claims ensure that storage is persisted across container failures or node movement. By defining a PersistentVolumeClaim, you are associating a specific disk volume with your pod deployment to persist data files.

Recent releases of both Kubernetes and Windows Server allow for Kubernetes to support both Windows nodes and Windows containers, but SQL Server Big Data Clusters (BDC) currently only support containers on Linux. Kubernetes is also much more broadly used on Linux, so community support will be much more prevalent on that platform.

  • You can learn more about Kubernetes from the following books: The Kubernetes Book (2019), by Nigel Poulton; and Kubernetes: Up and Running (2019), by Kelsey Hightower, Brendan Burns, et al.

BDC licensing offers some flexibility compared to traditional SQL Server core licensing. With Software Assurance on Enterprise edition, you can use those core licenses for your master instance node, and you will get eight BDC core node licenses for every master instance core.

  • You can find out more about RHEL in Chapter 5.

Deploying SQL Server in containers

As we mentioned previously, SQL Server runs on Windows, on Linux, and in Docker containers. When originally released with SQL Server 2017, container support was touted for use in development. After all, there was limited support in the container world for persisted storage at the time, and SQL Server lacked support for an enterprise orchestration framework like Kubernetes. While database containers still make for a fantastic development environment, the future support in SQL Server for availability groups, Active Directory authentication (not yet available at the time of writing), and Big Data Clusters means that container deployment is quickly becoming an option for production workloads as well.

  • You can read more about availability groups in Chapter 11.

Getting started with SQL Server in a Docker container

One of the biggest attractions of running SQL Server in a container is that your choice of operating system does not matter. While the container images of SQL Server use Linux as their base, your host machine can run Windows, Linux, or macOS. First, you will need to install Docker Desktop on your workstation.

After you have Docker installed, you can deploy a SQL Server container with the following steps. The first thing you should do is pull a copy of the container image from the Microsoft Container Registry (MCR) to your local environment. You should run these commands from either a bash shell on Linux or macOS, or an elevated PowerShell prompt on Windows.

sudo docker pull mcr.microsoft.com/mssql/server:2019-latest

After you have pulled down the image, you then need to deploy the container. The backslash in this command is a way to split a single bash command across multiple lines:

sudo docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<YourStrong!Passw0rd>' \
   -p 1433:1433 --name sql2019 \
   -v /users/randolph/mssql:/mssql \
   -d mcr.microsoft.com/mssql/server:2019-latest

Caution

There is currently no secure way to obfuscate the SA password in a Docker deployment. Microsoft recommends that you change your SA password after you have deployed your container.

You may be curious what these parameters (also called switches in Linux) mean:

  • The docker pull command downloads the container image to your local container repository.

  • The docker run command is where the deployment takes place.

  • The -e switch allows for an environmental variable to be passed into the deployment (we cover environment variables in Chapter 5). In this case, you are accepting the End-User License Agreement (EULA) for SQL Server, and providing a strong password for the SA account.

  • The -p (or --publish, note the double-dash before the parameter) switch publishes the container’s public (host) and private (container) ports. To run multiple SQL Server containers simultaneously, specify a TCP port other than 1433 for each subsequent container; see the example that follows this list.

  • The --name switch (note the double-dash before the parameter) specifies the name of your container. This is optional, but if you omit it, the container receives a system-generated name.

  • The -v switch is probably the most important in terms of database use. It allows a persistent volume to be mounted from your local machine to your container. In this case, the local directory /users/randolph/mssql will appear in the container as /mssql. Use this directory to store database backups, or data files to be mounted to the container.

  • The -d switch runs the container in detached mode; that is, in the background. The final argument is the container image you are deploying: in this case, a SQL Server 2019 image from the MCR.
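For example, a second SQL Server container can run alongside the first as long as it publishes a different host port and uses a different name. The following is a minimal sketch; the name sql2019b and the password placeholder are arbitrary choices for illustration:

sudo docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<AnotherStrong!Passw0rd>' \
   -p 1455:1433 --name sql2019b \
   -d mcr.microsoft.com/mssql/server:2019-latest

Inside the container, SQL Server still listens on TCP port 1433; only the host-side port differs.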

Note

Docker on macOS does not support persistent volumes. Microsoft recommends that you use separate data container volumes to persist data files that are stored in /var/opt/mssql/data. You can read the background to this issue at https://github.com/microsoft/mssql-docker/issues/12, and you can learn more about data container volumes at https://docs.docker.com/storage/volumes/.

After the container has been deployed, you can execute the docker ps command (which lists all of the running containers) to confirm that your container is running (in some environments you may need to run sudo docker ps). You can also connect to your container using SQL Server tools like SQL Server Management Studio (SSMS), Azure Data Studio, or sqlcmd, by connecting to localhost. This is possible because when you deployed the container, you configured it to run on TCP port 1433, which is the default SQL Server port.

This screen shot shows the output of the docker ps command that shows a single container named sql2019 running. Finally, a connection to the database container using the SQLCMD utility is shown.

Figure 3-8 A screenshot of docker ps output and sqlcmd connection.

Note

If you use a custom TCP port (or deploy multiple SQL Server containers) you can connect to localhost followed by the port number, separated with a comma. For example, localhost,1455.
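For instance, assuming you deployed a second container published on host port 1455 (as sketched earlier) and have the SQL Server command-line tools installed locally, the connection would look like this:

sqlcmd -S localhost,1455 -U SA -P '<AnotherStrong!Passw0rd>'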

You can also connect into your container with an interactive shell and execute sqlcmd. The first command will launch the bash shell within your container:

sudo docker exec -it sql2019 "/bin/bash"

After launching the interactive bash shell within your container, you will then call sqlcmd using the full path, since it is not in the path by default:

/opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P '<YourNewStrong!Passw0rd>'

Once your SQL Server container is deployed, you can execute T-SQL against it just as if it were any other SQL Server instance.
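For example, once connected with sqlcmd, a quick sanity check might look like the following (any T-SQL works; these two queries are just illustrative):

SELECT @@VERSION;
GO
SELECT name FROM sys.databases;
GO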

Getting started with SQL Server on Kubernetes

Although running SQL Server in a single Docker container is relatively easy, running SQL Server on a Kubernetes infrastructure is more challenging.

Kubernetes as part of Docker Desktop

You can install Kubernetes with Docker Desktop; however, as mentioned previously, persistent volumes are not supported on macOS. If you are using Docker on Windows and you are running Windows 10 Professional, you can configure Kubernetes after enabling Hyper-V.

Kubernetes using Minikube

Another commonly used option for development and testing of Kubernetes is Minikube, which runs on Windows, Linux, and macOS. Minikube is an open source project that allows you to deploy a small, local Kubernetes cluster on your workstation.

Kubernetes using the Azure Kubernetes Service

The recommended way to fully simulate a production environment is to deploy using Azure Kubernetes Service (AKS). AKS is a managed service that allows you to quickly deploy a Kubernetes cluster of anywhere from one node up to 100 nodes.

This shows the Microsoft Azure portal, displaying the scale options for an Azure Kubernetes Service cluster. It is currently set to three nodes with a slider bar to increase the scale.

Figure 3-9 A screenshot of Azure Portal showing AKS scale options.

AKS offers the benefit of hosting a highly available control plane for the cluster in Azure, as well as deploying the latest release of Kubernetes without installing software, worrying about dependencies, or finding hardware to build on. The other benefit of AKS is that the service itself is free: you are only charged for the underlying VM compute costs. Storage in AKS is provided by either Azure Managed Disks or the Azure Files service, which acts as a file share.

Deploying SQL Server on Kubernetes

Once you have a Kubernetes cluster or simulated cluster, you can get started with deploying SQL Server. First, you will need to create a secret in Kubernetes to store your SA password.

kubectl create secret generic mssql --from-literal=SA_PASSWORD="MyC0m9l&xP@ssw0rd"

If kubectl (the Kubernetes command line tool) is not installed on the machine where you are managing your cluster, you will need to install it in order to manage your deployment.
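If you are using AKS, the Azure CLI can install kubectl and merge the cluster credentials into your local kubeconfig for you. The resource group and cluster names below are placeholders for your own values:

az aks install-cli
az aks get-credentials --resource-group <resource-group> --name <cluster-name>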

Next, you will create a Persistent Volume Claim (PVC). As mentioned earlier, containers were originally designed to be ephemeral and not persist data across restarts or failures. A PVC will ask the cluster to provide a mapping to a Persistent Volume (PV). A PV can be statically or dynamically provisioned.

  • A statically-provisioned PV is defined by the cluster administrator and a PersistentVolumeClaim will be matched to that PV based on size and access mode.

  • A dynamically-provisioned PV is provisioned from a cluster-defined StorageClass. The claim references the storage class, which provisions the volume from the underlying storage subsystem of the cluster. This can be a cloud provider’s persistent storage, such as Azure Disks, or even an on-premises SAN.

If you are using Azure Kubernetes Services, save the following code to a file called pvc.yaml.

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: azure-disk
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Managed
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mssql-data
  annotations:
    volume.beta.kubernetes.io/storage-class: azure-disk
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

This code defines the Azure storage class and then defines an 8-GB persistent volume claim. This example uses Azure storage, which is what you would use on AKS. You will use slightly different code if you are using storage local to your cluster, as you do when using Minikube or Docker.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mssql-data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

Just like in the Azure example, save this file to pvc.yaml and then deploy using the kubectl apply command shown next:

kubectl apply -f C:\scripts\pvc.yaml
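As a quick sanity check, you can confirm that the claim was created and bound to a volume (the claim name in the output depends on which of the two YAML files you deployed):

kubectl get pvc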

Caution

ReadWriteOnce is one of three access modes available for persistent volumes. It is the only access mode that allows writes while restricting the volume to a single node. You will corrupt your databases if a volume is mounted by multiple writers.

The next step is to deploy the SQL Server service and the pod itself. In the following code, you are specifying the load balancer service, as well as the container running SQL Server. Kubernetes can use extensive metadata to help describe and categorize your environment, as you will note from the metadata and label fields in the YAML below. Much like in the Docker script earlier, you are defining a port, passing in the SA password you defined in the secret, and accepting the EULA. Finally, in the last section, you are defining the load balancer, which gives you a persistent IP address for your SQL Server instance.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        - name: MSSQL_SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql
              key: SA_PASSWORD
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
    - protocol: TCP
      port: 1433
      targetPort: 1433
  type: LoadBalancer

You can save this YAML as sql.yaml, and using the same kubectl apply -f command, you can deploy it from where you manage Kubernetes.

Congratulations! You now have SQL Server running on Kubernetes. You can run the kubectl get pods and kubectl get services commands, as shown in Figure 3-10, to see your deployment.

This shows a command line, showing the output of the kubectl get services and get pods commands. It shows a load balancer for the mssql-deployment, and a pod named mssql-deployment that is running.

Figure 3-10 A screen shot showing the load balancer and SQL Server pod in a Kubernetes deployment.
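For reference, the two commands shown in the figure are simply:

kubectl get pods
kubectl get services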

If you review the output of the kubectl get services command, you will see the external IP address of your SQL Server service. You can now use any SQL Server client tool to connect to that address with the SA password you created in the secret.

Caution

This configuration exposes port 1433 to the Internet and should only be used for demonstration purposes. In order to secure your cluster for production usage, review AKS networking best practices at https://docs.microsoft.com/azure/aks/best-practices.

Reviewing cluster health

Kubernetes provides a built-in dashboard for monitoring the overall health of your cluster and its resource utilization. If you are using AKS, you can view this by running the az aks browse command with the resource group and cluster names. Depending on the configuration and version of your cluster, you may need to create a cluster role binding in order to view the dashboard as seen in Figure 3-11.

This screen shot shows the Kubernetes dashboard with memory and CPU usage for the cluster, and the status of the workloads.

Figure 3-11 A screenshot of the Kubernetes web dashboard.
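What this looks like depends on your cluster, but a minimal sketch follows. The resource group and cluster names are placeholders, and the cluster role binding grants the dashboard’s service account cluster-admin rights, which is acceptable only on a demonstration cluster:

az aks browse --resource-group <resource-group> --name <cluster-name>

kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin \
  --serviceaccount=kube-system:kubernetes-dashboard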

Kubernetes deployments move all of your infrastructure into scripts. For some automation-focused administrators, this may be the holy grail they have been waiting for, but it is important to manage these scripts just as you would your application code. They should be version-controlled in a source control system like Azure DevOps or GitHub. If you are hosting your own repositories, you should ensure that they are backed up and highly available.
