Chapter 14
Performance tuning SQL Server

In this chapter, we review the database concepts and objects most associated with tuning the performance of queries and coded objects within Microsoft SQL Server and Azure SQL DB. We will not be looking at performance tuning queries that make use of the PolyBase feature, as tuning these queries will go beyond tuning SQL Server into tuning the external systems that they access.

In the first two sections of the chapter we will look at isolation levels and durability, which will touch on three of the ACID properties of an RDBMS (Relational Database Management System). These correspond to settings and configurations that let your code affect performance. The ACID properties are:

  • Atomicity. Transactions, which make multiple-step processes behave as one operation.

  • Isolation. Dealing with how multiple connections interact with one another.

  • Durability. When a transaction is committed, we have certain expectations about what happens if the server stops working.

  • Consistency. Making sure data meets business rules in the database at the end of a transaction.

Of these, we will cover the first three, but not consistency, as this is more of a database design task, which is covered in Chapter 7, “Understanding table features.”

Then, we explore how SQL Server executes queries, including the execution plans that the query processor creates to execute your query and how those plans are captured by the Query Store feature. We discuss execution plans in some detail, including what to look for when performance tuning, and how to control when plans go parallel, meaning SQL Server can use multiple processors to execute your query without the code changing at all.

The examples in this chapter behave identically on SQL Server instances and in Azure SQL Database unless otherwise noted. All sample scripts in this book are available for download at https://MicrosoftPressStore.com/SQLServer2019InsideOut/downloads.

Understanding isolation levels and concurrency

The fundamental problem when working on a multiuser computing system is that users frequently need access to the same resources. If there is a row, say Row X, and User 1 and User 2 both want to do something with it, what is the effect? If they both want to read the row, one set of concerns exists; if one wishes to read it and the other to write it, there is another set of issues. Finally, if both want to write to the row, still another set of concerns arises. This is where the concept of isolation comes in: how to isolate one connection from the other.

This is all related to the concept of atomicity, and as such to transactions containing one or more statements, because we need to isolate logical atomic operations from one another. Even a single statement in a declarative programming language like Transact-SQL translates into hundreds or thousands of steps behind the scenes.

Isolation isn't only a matter of physical access to resources (a disk drive is fetching data for one reader, so the next reader must wait for that to complete); that is a problem for the hardware. Instead, it is a matter of keeping other transactions as isolated as necessary from the data one transaction is using while it does its work. The performance implications are large: the more isolated operations need to be, the slower processing can become, but the less isolated transactions are, the greater the chance of losing data.

The types of things that can occur between two transactions are generally referred to as phenomena. They are:

  • Dirty Read. Reading data that another connection is in the process of changing. The problem is much like trying to read a post-it note that your boss is writing your new salary on. Read too early, and it looks like you are getting a $1 raise annually, or $100000, because the decimal point has not been added.

  • Non-Repeatable Read. Reading the same data over again that has changed or gone away. This problem is like when you check the box of doughnuts and see there is one left. While you are standing there, in control of the box, no one can take that last doughnut. But step away to get coffee, and when you come back it has a bite taken out of it. A repeatable read will always give you back rows with the same content as you first read (but may include more rows that did not exist when you first read the set of rows).

  • Phantom Read. A phantom read is when you read a set of data, but then come back and read it again and get new rows that did not previously exist. In the previous doughnut example, this is the happiest day of your life since there are now more doughnuts. However, this can be bad if your query needs to get back the same answer every time you ask the same question.

  • Reading a Previous Committed Version of Data. In some cases, you may be able to eliminate blocking by allowing connections to read a previously committed version of data that another connection is in the process of changing after your transaction started. A real-world example of this regularly happens in personal banking. You and your partner see you have $N in your account, and you both withdraw $N, not realizing the intentions of the other connection. Your ATM may even say you have $0 after both transactions, using stale information. This does not change the fees you will be receiving, but often it is perfectly acceptable for the task at hand to see how the data was at the start of the transaction.

Where this gets complicated is that many operations in a database system will be bundled into multi-step operations that need to be treated as one atomic operation. Reading data and getting back different results when executing the same query again, during what you expect to be an atomic operation, greatly increases the likelihood of returning incorrect results.

You can't choose which individual phenomena your transaction will be affected by; instead, they are bundled into isolation levels that allow certain effects to occur. For example, the default isolation level, READ COMMITTED, is subject to nonrepeatable reads and phantom rows, but not dirty reads. This provides adequate protection and performance in most situations, but definitely not all.

It is important to have a fundamental understanding of these effects because these aren’t just arcane keywords you study only when it is certification time; they can have a profound effect on application performance, stability, and absolutely the most important thing for any RDBMS: data integrity.

For example, consider you are writing software to control the trains that are using track T. Two trains wish to use track T, and both conductors ask if the track is vacant, so both see it is vacant and put their trains on the track heading toward each other. Not good.

Understanding the differing impact of isolation levels on locking and blocking, and therefore on concurrency, is the key to understanding when you should use an isolation level different from the default of READ COMMITTED. Table 14-1 presents the isolation levels available in SQL Server along with the phenomena that are allowed.

Table 14-1 Isolation levels and phenomena that can be incurred

Transaction isolation level       Dirty reads   Nonrepeatable reads   Phantom rows   Reading a previous committed version of data
READ UNCOMMITTED                  X             X                     X
READ COMMITTED                                  X                     X
REPEATABLE READ                                                       X
SERIALIZABLE
READ COMMITTED SNAPSHOT (RCSI)                  X                     X              X
SNAPSHOT                                                                             X

When you are choosing an isolation level for a transaction in an application, you should consider primarily the transactional safety and business requirements of the transaction in a highly concurrent multiuser environment. The performance of the transaction should be a distant second priority (yet still a priority) when choosing an isolation level.

Locking, which SQL Server uses for normal isolation of processes, is not a bad thing; it is the way every transaction in SQL Server cooperates with others when dealing with disk-based tables.

The default isolation level of READ COMMITTED is generally a safe isolation level because it only allows connections to access data that has been committed by other transactions. Dirty reads are the one phenomenon that is almost universally bad. With READ COMMITTED, modifications to a row block reads of that same row from other connections. This is especially important during multi-statement transactions, such as when parent and child rows in a foreign key relationship must be created in the same transaction. In that scenario, reads should not access either row in either table until both changes are committed.

Because the READ COMMITTED isolation level allows nonrepeatable reads and phantom rows, it does not ensure that row data and row counts won't change between two SELECT queries on the same data in a transaction. READ COMMITTED allows SQL Server to release shared locks on objects as soon as it has read them, letting other users access those objects, and holding locks only on resources that it has changed.

For some application scenarios, this might be acceptable or desired, but not for others. To avoid these two problematic scenarios (which we talk more about soon), you need to choose the proper, more stringent isolation level for the transaction.

For scenarios in which transactions must have a higher degree of isolation from other transactions, escalating the isolation level of a transaction is appropriate. For example, a transaction might write multiple rows, even across multiple tables and statements, and cannot allow other transactions to change data it has read during the transaction. Here are two examples.

In the first example, the REPEATABLE READ isolation level blocks other transactions from changing or deleting rows needed during a multistep transaction. Unlike READ COMMITTED, REPEATABLE READ holds locks on the resources it has read, preventing any other connection from changing them until it has completed, thus avoiding nonrepeatable reads.

In the second example, if the transaction needs to ensure that the exact same rows in a result set are returned throughout a multistep transaction, the SERIALIZABLE isolation level is necessary. It is the only isolation level that prevents other transactions from inserting new rows inside a range of rows it has read. It prevents other connections from adding new rows by locking not only the rows it has accessed, but also the ranges of rows it would have accessed had they existed. For example, say you queried for rows LIKE 'A%' in a SERIALIZABLE transaction and got back Apple and Annie. If another user tries to insert Aardvark, that insert is blocked until the LIKE 'A%' transaction completes.
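
The following is a minimal two-session sketch of that range protection, assuming a hypothetical dbo.Fruit table with a Name column; the INSERT in the second session waits until the first transaction commits or rolls back.

-- Session 1: read a range under SERIALIZABLE; key-range locks are taken
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
SELECT Name
FROM   dbo.Fruit
WHERE  Name LIKE 'A%';   -- returns Apple and Annie

-- Session 2: this INSERT is blocked until Session 1 completes
INSERT INTO dbo.Fruit (Name)
VALUES ('Aardvark');

-- Session 1: releasing the range locks unblocks Session 2
COMMIT TRANSACTION;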

Lastly, it is essential to understand that every statement is a transaction. UPDATE TableName SET SomeColumn = 1; operates in a transaction, as does a statement like SELECT 1;. When you do not manually start a transaction, each statement runs and commits on its own; this autocommit behavior is often loosely referred to as an implicit transaction. An explicit transaction is one that you start with BEGIN TRANSACTION and end with COMMIT TRANSACTION or ROLLBACK TRANSACTION. The REPEATABLE READ and SERIALIZABLE isolation levels can gather a lot of locks, more so in explicit transactions of multiple statements if they are not quickly closed. The more locks that are held, the more likely your connection is to end up stuck waiting on another.
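
As a minimal illustration of the difference, using a hypothetical dbo.TableName table and the @@TRANCOUNT function to show whether an explicit transaction is open:

-- Autocommit: this statement starts and commits its own transaction
UPDATE dbo.TableName SET SomeColumn = 1 WHERE SomeKey = 1;
SELECT @@TRANCOUNT;   -- returns 0; no explicit transaction is open

-- Explicit transaction: the statements commit or roll back as a unit
BEGIN TRANSACTION;
    UPDATE dbo.TableName SET SomeColumn = 1 WHERE SomeKey = 1;
    SELECT @@TRANCOUNT;   -- returns 1 while the explicit transaction is open
COMMIT TRANSACTION;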

The most complex of the phenomena concerns reading data that is not the committed version that was initially accessed. There are two main scenarios where this becomes a concern.

  • Reading previous versions of data. Using SNAPSHOT or READ COMMITTED SNAPSHOT (RCSI), your query sees data as it looked at the start of the statement (RCSI) or as of the first time the transaction accesses data in the database (SNAPSHOT). This means that the data you have may not match the data as it currently exists in the database.

    A side effect of this is that in SNAPSHOT isolation level if two transactions try to modify or delete the same row, you will get an update conflict, requiring you to restart the transaction.

  • Reading new versions of data. In any isolation level that allows phantoms and non-repeatable reads, running the same statement twice can return entirely different results. It is incumbent on the programmer to recognize whether this matters. For example, if you try to implement a foreign key-style existence check in the READ COMMITTED isolation level, another transaction could delete the row after you check that it exists; a sketch of this race, and one way to guard against it, follows this list.
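
This is a minimal sketch of that race, assuming hypothetical dbo.Payment and dbo.SalesOrder tables; the UPDLOCK and HOLDLOCK hints on the existence check are one way to keep the checked row stable until the transaction commits.

DECLARE @PaymentId int = 1;

-- Risky pattern in READ COMMITTED: the payment row can be deleted by another
-- transaction between the existence check and the INSERT
BEGIN TRANSACTION;
IF EXISTS (SELECT 1 FROM dbo.Payment WHERE PaymentId = @PaymentId)
    INSERT INTO dbo.SalesOrder (PaymentId) VALUES (@PaymentId);
COMMIT TRANSACTION;

-- Safer variant: hold a lock on the checked row for the duration of the transaction
BEGIN TRANSACTION;
IF EXISTS (SELECT 1 FROM dbo.Payment WITH (UPDLOCK, HOLDLOCK)
           WHERE PaymentId = @PaymentId)
    INSERT INTO dbo.SalesOrder (PaymentId) VALUES (@PaymentId);
COMMIT TRANSACTION;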

All the topics in this introduction will be covered in greater detail in the following sections. Isolation levels are very important and can be difficult to get right. This is mostly because it is very hard to test your code to see what happens when two connections try to make incompatible reads and modifications to data simultaneously.

Understanding how concurrent sessions become blocked

In this section, we review a series of examples of how concurrency works in a multiuser application interacting with SQL Server tables. First, let’s discuss how to diagnose whether a request is being blocked or if it is blocking another request. Note that in these initial examples, we will be assuming that SQL Server has been configured in the default manner for concurrency. We will be adjusting that later in the chapter to give you more ways to tune performance.

What causes blocking?

We have alluded to it already, and the answer is that when you use resources, they are locked. These locks can be on several different levels and types of resources, as seen in Table 14-2.

Table 14-2 Lockable Resources (Not every type of resource)

Type of Lock                  Granularity
Row or row identifier (RID)   A single row in a heap table
Key                           A single value in an index (a clustered table is represented as an index in all physical structures)
Key Range                     A range of key values (for example, to lock rows with values from A through M, even if no rows currently exist); used by the SERIALIZABLE isolation level
Extent                        A contiguous group of eight 8-KB pages
Page                          An 8-KB index or data page
HoBT                          An entire heap or B-tree structure
Object                        An entire table (including all rows and indexes), view, stored procedure, and so on
Application                   A special type of lock that is user defined
Metadata                      Metadata about the schema, such as catalog objects
Database                      An entire database
Allocation unit               A set of related pages that are used as a unit
File                          A data or log file in the database

Locks on a given resource have a mode. Table 14-3 lists the modes that a lock may take. Two of the most important are shared (indicating that a resource is only being read) and exclusive (indicating that a resource should not be accessible to any other connection).

Table 14-3 Lock Modes

Shared: Grants access for reads only. It's generally used when users are looking at but not editing the data. It's called "shared" because multiple processes can hold a shared lock on the same resource at once, allowing read-only access. However, shared locks prevent other processes from modifying the resource.

Exclusive: Gives exclusive access to a resource and is used during modification of data. Only one process may have an active exclusive lock on a resource.

Update: Informs other processes that you're planning to modify the data. Other connections may also issue shared locks, but not update or exclusive locks, while you're still preparing to do the modification. Update locks are used to prevent deadlocks (covered later in this section) by marking rows that a statement will possibly update, rather than upgrading directly from a shared lock to an exclusive one.

Intent: Communicates to other processes that taking one of the previously listed modes might be necessary. It establishes a lock hierarchy with the locks already taken, informing processes that try to take a lock on a higher-level resource (like a table) that other connections hold locks at a lower level, such as a page. You might see this mode as intent shared, intent exclusive, or shared with intent exclusive.

Schema: Locks the structure of a resource while it's in use, so that you cannot alter a structure (like a table) while a user is reading data from it. (Note that schema locks show up as part of the mode in many views.)

As queries perform different operations (reading data, modifying data, or changing objects), resources are locked in a given mode. Blocking occurs when one connection has a resource locked in a certain mode, and another connection needs to lock the same resource in an incompatible mode. Table 14-4 presents the compatibility of the different modes.

Table 14-4 Lock mode compatibility

Mode                            IS    S     U     IX    SIX   X
Intent shared (IS)              X     X     X     X     X
Shared (S)                      X     X     X
Update (U)                      X     X
Intent exclusive (IX)           X                 X
Shared Intent Exclusive (SIX)   X
Exclusive (X)

To read this table, pick a lock mode on one axis; an X appears under each mode on the other axis with which it is compatible. For example, an update lock is compatible with an intent shared lock and a shared lock, but not with another update lock or any of the exclusive variants.

If a connection is reading data, it takes a shared lock, allowing other readers to also take shared locks, which does not cause blocking. However, if another connection is modifying data, it gets an exclusive lock, which prevents the reading connection (and any other connections) from accessing the exclusively locked resources in any manner (other than ignoring the locks, which is discussed later in this section).

How to observe locks and blocking

It's easy to find out live whether a request is being blocked. The dynamic management objects (DMOs) sys.dm_exec_requests and sys.dm_exec_sessions, joined on the session_id column, provide data about blocking and the state of sessions on the server. This provides much more information than the legacy sp_who or sp_who2 commands, as you can see from this query:

--this query will return a plethora of information in addition to just the session that
--is blocked
SELECT r.session_id, r.blocking_session_id, *
FROM sys.dm_exec_sessions s
LEFT OUTER JOIN sys.dm_exec_requests r ON r.session_id = s.session_id;
--note: requests represent actions that are executing, sessions are connections, hence
--LEFT OUTER JOIN

You can see details of what objects are locked by using the sys.dm_tran_locks DMO, or what locks are involved in blocking using this query:

SELECT t1.resource_type,
       t1.resource_database_id,
       t1.resource_associated_entity_id,
       t1.request_mode,
       t1.request_session_id,
       t2.blocking_session_id
FROM sys.dm_tran_locks as t1
INNER JOIN sys.dm_os_waiting_tasks as t2
    ON t1.lock_owner_address = t2.resource_address;

In the output of this query, you can see the type of resource that is locked (the resources listed in Table 14-2) and the request mode (the modes listed in Table 14-3), with a few exceptions. Schema locks combine the type of schema lock with the mode; for example, while a stored procedure is executing, the object is locked with a request mode of Sch-S (schema stability), which is a shared mode indicating the object is in use. This prevents the schema of the procedure from being changed while it executes, but more than one connection can still execute it.

Now, let’s review some example scenarios to detail exactly why and how requests can block one another in the real world when using disk-based tables. This is the foundation of concurrency in SQL Server and helps you understand the reason why the NOLOCK query hint appears to make queries perform faster.

Changing the isolation level

As mentioned, by default, connections use the READ COMMITTED isolation level. If you need to change that for a session, there are two methods: using the SET TRANSACTION ISOLATION LEVEL statement, and using table hints. In this manner, the isolation level can be changed for an entire transaction, one statement, or one object in a statement.

Using the SET TRANSACTION ISOLATION LEVEL statement

You can change the isolation level of a connection at any time, even when already executing in the context of an uncommitted transaction. The exception is that you are not allowed to swap to and from the SNAPSHOT isolation level because, as we will discuss later in the chapter, this isolation level works very differently.

For example, the following code snippet is technically valid; it changes from READ COMMITTED to SERIALIZABLE and back. If one statement in your batch requires the protection of SERIALIZABLE, but the others do not, you can execute:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRAN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT...;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
COMMIT TRAN;

This code snippet is trying to change from READ COMMITTED to the SNAPSHOT isolation level:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRAN;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT...

Attempting this results in the following error:

Msg 3951, Level 16, State 1, Line 4
Transaction failed in database 'databasename' because the statement was run under snapshot
isolation but the transaction did not start in snapshot isolation. You cannot change the
isolation level of the transaction after the transaction has started.

In .NET applications, you should set the isolation level explicitly when each transaction is created, because the default is not necessarily READ COMMITTED (for example, the System.Transactions infrastructure defaults to SERIALIZABLE); choosing an appropriately less restrictive isolation level is far better for performance.
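
To confirm which isolation level a session is actually using, one option is to query sys.dm_exec_sessions; a minimal sketch:

SELECT session_id,
       CASE transaction_isolation_level
            WHEN 0 THEN 'Unspecified'
            WHEN 1 THEN 'READ UNCOMMITTED'
            WHEN 2 THEN 'READ COMMITTED'   -- also reported when RCSI is in effect
            WHEN 3 THEN 'REPEATABLE READ'
            WHEN 4 THEN 'SERIALIZABLE'
            WHEN 5 THEN 'SNAPSHOT'
       END AS isolation_level_name
FROM   sys.dm_exec_sessions
WHERE  session_id = @@SPID;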

Using table hints to change isolation

You also can use isolation level hints to change the isolation level at the individual object level. This is an advanced type of coding that you shouldn't use frequently, because it generally increases the complexity of maintenance and muddies architectural decisions with respect to enterprise concurrency. Just as in the previous section, however, we may wish to hold locks at a SERIALIZABLE level for one table, but not for the others in the query.

For example, you might have seen developers use NOLOCK at the end of a table, effectively (and dangerously) dropping access to that table into the READ UNCOMMITTED isolation level:

SELECT col1 FROM dbo.TableName (NOLOCK);

Note

Aside from the inadvisable use of NOLOCK in the preceding example, using a table hint without WITH is deprecated syntax (since SQL Server 2008). It should be written like this, if you need to ignore locks:

SELECT col1 FROM dbo.TableName WITH (READUNCOMMITTED);

Aside from the generally undesirable use of the NOLOCK query hint, there are 20-plus other table hints that can be useful, including the ability to force a query to use a certain index, to force a seek or scan on an index, or to override the query optimizer's locking strategy. For example, the HOLDLOCK hint applies SERIALIZABLE-style locking to a single table, and we discuss how to use UPDLOCK later in this chapter.
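
As a minimal sketch of per-object hinting, with hypothetical table names, the HOLDLOCK hint holds SERIALIZABLE-style range locks on one table only, while the other table in the query is still read at the session's isolation level; the held locks matter most when the statement runs inside an explicit transaction:

BEGIN TRANSACTION;

SELECT i.InvoiceId, p.PaymentAmount
FROM   dbo.Invoice AS i WITH (HOLDLOCK)   -- SERIALIZABLE semantics for this table only
       INNER JOIN dbo.Payment AS p        -- read at the session's isolation level
           ON p.InvoiceId = i.InvoiceId
WHERE  i.InvoiceId = 1;

-- ...do other work that relies on the Invoice row range not changing...

COMMIT TRANSACTION;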

In almost every case, table hints should be considered for temporary and/or highly situational purposes. They can make maintenance of these queries problematic in the future. For example, using the INDEX or FORCESEEK table hints could result in poor query performance or even cause the query to fail if the table’s indexes are changed.

Understanding and handling common concurrency scenarios

Here we look at some common concurrency scenarios and discuss/demonstrate how SQL Server will process the rows affected by the scenario.

Note

In Chapter 7, the concepts around memory-optimized tables are covered; their concurrency model is very different from that of disk-based tables, though similar to how row-versioned concurrency is implemented, particularly the SNAPSHOT isolation level. Later in this chapter, the differences for memory-optimized tables are discussed.

Understanding concurrency: two requests updating the same rows

Two users attempting to modify the same resource is possibly the most obvious concurrency issue. As an example, one user wants to add $100 to a total, and another wants to add $50. If both processes happen simultaneously without coordination, one of the modifications could be lost (or, if we take it to the absurd level, data corruption could occur in the physical structures holding the data if pointers are mixed up by simultaneous modifications).

Consider the following steps involving two writes, with each transaction coming from a different session. The transactions are explicitly declared using the BEGIN/COMMIT TRANSACTION syntax. In this example, the transactions are not overriding the default isolation level of READ COMMITTED.

All examples will include these two rows simply so the table contains more than a single row, though we will only manipulate the row where Type = 1. When testing more complex concurrency scenarios, it is best to have large quantities of data to work with, as indexing, server resources, and so on come into play. These examples illustrate simple, fundamental concurrency situations.

  1. A table contains only two rows with a column Type containing values of 0 and 1.

    CREATE SCHEMA Demo;
    
    GO
    
    CREATE TABLE Demo.RC_Test (Type int);
    
    INSERT INTO Demo.RC_Test VALUES (0),(1);
  2. Transaction 1 begins and updates all rows from Type = 1 to Type = 2.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.RC_Test SET Type = 2
    
    WHERE  Type = 1;
  3. Before Transaction 1 commits, Transaction 2 begins and issues a statement to update Type = 2 to Type = 3. Transaction 2 is blocked and will wait for Transaction 1 to commit.

    --Transaction 2
    
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.RC_Test SET Type = 3
    
    WHERE  Type = 2;
  4. Transaction 1 commits.

    --Transaction 1
    
    COMMIT;
  5. Transaction 2 is no longer blocked and processes its update statement. Transaction 2 then commits.

The resulting table will contain one row of Type = 3 and one of Type = 0, because the second transaction updated the row after the block ended. When Transaction 2 started, it waited for Transaction 1's exclusive lock to be released; once Transaction 1 committed, Transaction 2's WHERE clause matched the newly committed value of Type = 2 and changed it to 3.

Understanding concurrency: a write blocks a read

One of the most painful parts of blocking comes when users writing data block other users from reading that data. What can be even more problematic is that some modification statements may touch and lock rows in the table even if they don't change them (typically due to poorly written WHERE clauses or a lack of indexing causing full table scans).

Next, consider the following steps involving a write and a read, with each transaction coming from a different session. In this scenario, an uncommitted write in transaction 1 blocks a read in Transaction 2. The transactions are explicitly started using the BEGIN/COMMIT TRANSACTION syntax. In this example, the transactions are not overriding the default isolation level of READ COMMITTED:

  1. A table contains only two rows with a column Type value of 0 and 1.

    CREATE TABLE Demo.RC_Test_Write_V_Read (Type int);
    
    INSERT INTO Demo.RC_Test_Write_V_Read VALUES (0),(1);
  2. Transaction 1 begins and updates all rows from Type = 1 to Type = 2.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.RC_Test_Write_V_Read SET Type = 2
    
    WHERE  Type = 1;
  3. Before Transaction 1 commits, Transaction 2 begins and issues a SELECT statement for rows of Type = 2. Transaction 2 is blocked and waits for Transaction 1 to commit.

    --Transaction 2
    
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    
    SELECT Type
    
    FROM   Demo.RC_Test_Write_V_Read
    
    WHERE  Type = 2;
  4. Transaction 1 commits.

    --Transaction 1
    
    COMMIT;
  5. Transaction 2 is no longer blocked and processes its SELECT statement.

Transaction 2 returns one row where Type = 2. This is because Transaction 2 waited for committed data, and did not read the row until after Transaction 1 committed.

Understanding concurrency: nonrepeatable reads

There are certain scenarios where you need to have the same row values returned every time you issue a SELECT statement (or read data in any modification DML). A prime example is when you check for the existence of some data before allowing another action to occur, such as inserting an order row only if a payment exists. If that payment is changed or deleted while you are creating the order, free products may be shipped.

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this example, the transactions are not overriding the default isolation level of READ COMMITTED. In this scenario, Transaction 1 will suffer a nonrepeatable read when rows it has read are changed by a different connection, because READ COMMITTED does not offer any protection against phantom or nonrepeatable reads. The transactions are explicitly declared using the BEGIN/COMMIT TRANSACTION syntax.

  1. A table contains only two rows with a column Type value of 0 and 1.

    CREATE TABLE Demo.RR_Test (Type int);
    
    INSERT INTO Demo.RR_Test VALUES (0),(1);
  2. Transaction 1 starts and retrieves rows where Type = 1. One row is returned for Type = 1.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED
    
    BEGIN TRANSACTION
    
    SELECT Type
    
    FROM   Demo.RR_Test
    
    WHERE  Type = 1;
  3. Before Transaction 1 commits, Transaction 2 starts and issues an UPDATE statement, setting rows of Type = 1 to Type = 2. Transaction 2 is not blocked and is immediately processed.

    --Transaction 2
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.RR_Test
    
    SET  Type = 2
    
    WHERE Type = 1;
  4. Transaction 1 again selects rows where Type = 1 and is blocked.

    --Transaction 1
    
    SELECT Type
    
    FROM   Demo.RR_Test
    
    WHERE  Type = 1;
  5. Transaction 2 commits.

    --Transaction 2
    
    COMMIT;
  6. Transaction 1 is immediately unblocked. No rows are returned, which is different than the one row returned earlier, since no committed rows now exist where Type=1. Transaction 1 commits.

    --Transaction 1
    
    COMMIT;

The resulting table now contains a row where Type = 2, because the second transaction has modified the row. When Transaction 2 started, Transaction 1 had not placed any locks on the data, allowing for writes to happen. Because it is doing only reads, Transaction 1 would never have placed any exclusive locks on the data. Transaction 1 suffered from a nonrepeatable read: the same SELECT statement returned different data during the same multistep transaction.

Understanding concurrency: preventing a nonrepeatable read

Consider the following steps involving a read and a write, with each transaction coming from a different session. This time, we protect Transaction 1 from dirty reads and nonrepeatable reads by using the REPEATABLE READ isolation level. A read in the REPEATABLE READ isolation level will block a write. The transactions are explicitly declared by using the BEGIN/COMMIT TRANSACTION syntax:

  1. A table contains only rows with a column Type value of 0 and 1.

    CREATE TABLE Demo.RR_Test_Prevent (Type int);
    
    INSERT INTO Demo.RR_Test_Prevent VALUES (0),(1);
  2. Transaction 1 starts and selects rows where Type = 1 in the REPEATABLE READ isolation level. 1 row with Type = 1 is returned.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    
    BEGIN TRANSACTION;
    
    SELECT Type
    
    FROM   Demo.RR_Test_Prevent
    
    WHERE  TYPE = 1;
  3. Before Transaction 1 commits, Transaction 2 starts and issues an UPDATE statement, setting rows of Type = 1 to Type = 2. Transaction 2 is blocked by Transaction 1.

    --Transaction 2
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.RR_Test_Prevent
    
    SET  Type = 2
    
    WHERE Type = 1;
  4. Transaction 1 again selects rows where Type = 1. The same rows are returned as in step 2.

  5. Transaction 1 commits.

    --Transaction 1
    
    COMMIT TRANSACTION;
  6. Transaction 2 is immediately unblocked and processes its update. Transaction 2 commits.

    --Transaction 2
    
    COMMIT TRANSACTION;

The resulting table will now contain 2 rows, one where Type = 2 and the original row of Type = 0. This is because when Transaction 2 started, Transaction 1 had placed read locks on the data it was selecting, blocking writes from happening until it had committed. Transaction 1 returned the same rows each time and did not suffer a nonrepeatable read. Transaction 2 processed its update only when it could place exclusive locks on the rows it needed.

Understanding concurrency: experiencing phantom rows

Phantom rows cause issues for transactions when you expect the exact same result back from a query. For example, say you were writing a value to a table that sums up 100 other values (flouting the normalization rules that are fundamental to database design), such as a financial transaction ledger table that calculates the current balance. You sum the 100 rows, then write the value. If it is important that the total of the 100 rows matches perfectly, you cannot allow nonrepeatable reads, nor phantom rows.

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we describe a phantom read:

  1. A table contains only two rows, with Type values 0 and 1

    CREATE TABLE Demo.PR_Test (Type int);
    
    INSERT INTO Demo.PR_Test VALUES (0),(1);
  2. Transaction 1 starts and selects rows where Type = 1 in the REPEATABLE READ isolation level. The one row where Type = 1 is returned.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    
    BEGIN TRANSACTION;
    
    SELECT Type
    
    FROM   Demo.PR_Test
    
    WHERE  Type = 1;
  3. Before Transaction 1 commits, Transaction 2 starts and issues an INSERT statement, adding another row where Type = 1. Transaction 2 is not blocked by Transaction 1.

    --Transaction 2
    
    INSERT INTO Demo.PR_Test(Type)
    
    VALUES(1);
  4. Transaction 1 again selects rows where Type = 1. An additional row is returned compared to the first time the select was run in Transaction 1.

    --Transaction 1
    
    SELECT Type
    
    FROM   Demo.PR_Test
    
    WHERE  Type = 1;
  5. Transaction 1 commits.

    --Transaction 1
    
    COMMIT TRANSACTION;

Transaction 1 experienced a phantom read when it returned a different number of rows the second time it selected from the table inside the same transaction. Transaction 1 had not placed any locks on the range of data it needed, allowing writes in another transaction to happen within the same dataset. The phantom read would have occurred to Transaction 1 in any locking isolation level except SERIALIZABLE (or the row-versioned SNAPSHOT level, covered later). Let's look at SERIALIZABLE next.

It is important to understand that in some scenarios, a phantom row can have the same effect as a nonrepeatable read. For example, consider a query summing the payment amounts for a customer's invoice. In order to ship a product, the invoice needs to be paid in full. An initial read of the data could show the invoice to be paid. But before the order is processed, the same query could indicate the invoice had not been paid in full, because a new row arrived recording a reversal with a negative amount.
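
A minimal sketch of that payment check, assuming a hypothetical dbo.InvoicePayment table and invoice total; SERIALIZABLE locks the range of payment rows being summed, so a reversal row cannot sneak in between the check and the shipment decision:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

DECLARE @AmountPaid money;

SELECT @AmountPaid = SUM(Amount)
FROM   dbo.InvoicePayment
WHERE  InvoiceId = 1;   -- the range is locked; no phantom reversal rows can appear

IF @AmountPaid >= 100.00   -- hypothetical invoice total
BEGIN
    -- ship the product, record the fulfillment, and so on
    PRINT 'Invoice paid in full; releasing shipment.';
END;

COMMIT TRANSACTION;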

Understanding concurrency: preventing phantom rows

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we protect Transaction 1 from a phantom read.

  1. A table contains two rows with Type values 0 and 1

    CREATE TABLE Demo.PR_Test_Prevent (Type int);
    
    INSERT INTO Demo.PR_Test_Prevent VALUES (0),(1);
  2. Transaction 1 starts and selects rows where Type = 1 in the SERIALIZABLE isolation level. The one row where Type = 1 is returned.

    --Transaction 1
    
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    
    BEGIN TRANSACTION;
    
    SELECT Type
    
    FROM   Demo.PR_Test_Prevent
    
    WHERE  Type = 1;
  3. Before Transaction 1 commits, Transaction 2 starts and issues an INSERT statement, adding a row of Type = 1. Transaction 2 is blocked by Transaction 1.

    --Transaction 2
    
    INSERT INTO Demo.PR_Test_Prevent(Type)
    
    VALUES(1);
  4. Transaction 1 again selects rows where Type = 1. The same result set is returned as it was in step 2, the one row where Type = 1.

  5. Transaction 1 executes COMMIT TRANSACTION.

  6. Transaction 2 is immediately unblocked and processes its insert. Transaction 2 commits. Query the table again and there are now 2 rows where Type = 1.

Transaction 1 did not suffer from a phantom read the second time it selected from the table, because it had placed a lock on the range of rows it needed. The table now contains additional rows for Type = 1, but they were not inserted until after Transaction 1 had committed.

The case against READ UNCOMMITTED isolation level

This also pertains to using the NOLOCK hint on your queries.

If locks take time, ignoring those locks will make things go faster. This is true, but this logic is like saying: if the weight of airbags, seat belts, bumpers, and collapsible steering wheels brings down gas mileage, remove them…what could happen? Mix a little lie with the truth, it is easier to take. It is true that these things have weight, but they may save your life one day.

Locks are similar in nature. Locks coordinate access to resources, allowing multiple users to do multiple things in the database without crushing other users' changes. The READ COMMITTED isolation level (and an extension called READ COMMITTED SNAPSHOT, which we discuss in the section on the SNAPSHOT isolation level) strikes the best balance between locking and performance. Exclusive locks on dirty resources (data the transaction has changed) are still held until the transaction completes, but shared locks taken for reads are held only long enough to read a row and are then released. The basic process for a read is:

  • Grab a lock on a resource

  • Read that resource

  • Release the lock on the resource

  • Repeat until you are out of resources to read

No one can dirty (modify) the row we are reading because of the lock, but when we are done, we move on and release the lock. Locks on modifications to on-disk tables work the same way in all isolation levels, even READ UNCOMMITTED, and are held until the transaction is committed.

The effect of the table hint NOLOCK or the READ UNCOMMITTED isolation level is that no locks are taken inside the database for reads, save for schema stability locks. (A query using NOLOCK could still be blocked by Data Definition Language [DDL] commands, such as an offline indexing operation.) The basic thinking is that you turn on the READ UNCOMMITTED isolation level for your connection and things will go faster.

This is a strategy that many DBAs and programmers have tried before: "We had performance problems, but we've been putting NOLOCK in all our stored procedures to fix it." It will improve performance, but it can easily be detrimental to the integrity of your data.

The biggest issue is that a query may read data that doesn't even meet the constraints of the system. A transaction for $1,000,000 may be seen by a query, and then be rolled back later (perhaps because the payment details don't meet a constraint or trigger requirement); who knows what celebratory alarms may have gone off thinking we have a million dollars in sales today.
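
A minimal two-session sketch of that scenario, assuming a hypothetical dbo.SalesTotal table; the NOLOCK reader sees the $1,000,000 row even though it is later rolled back:

-- Session 1: the modification is ultimately rolled back
BEGIN TRANSACTION;
INSERT INTO dbo.SalesTotal (SaleAmount) VALUES (1000000.00);
-- ...a constraint or trigger failure is discovered later...
ROLLBACK TRANSACTION;

-- Session 2: run between the INSERT and the ROLLBACK, this query
-- counts the uncommitted $1,000,000 row in its total
SELECT SUM(SaleAmount) AS TotalSales
FROM   dbo.SalesTotal WITH (NOLOCK);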

The case against using the READ UNCOMMITTED isolation level is deeper than performance and goes beyond simply reading dirty data. The counterargument a developer might make is that data is rarely ever rolled back, or that the data is for reporting only. In production environments, these arguments are not enough to justify the potential problems.

A query in READ UNCOMMITTED isolation level could return invalid data in the following real-world, provable ways:

  • Read uncommitted data (dirty reads)

  • Read committed data twice

  • Skip committed data

  • Return corrupted data

  • Or the query could fail altogether with the error "Could not continue scan with NOLOCK due to data movement." This happens because, since you ignored locks, the structure that was to be scanned is no longer there when you reach it, such as when a page split occurs as you read the page. The solution to this problem is the solution to a lot of concurrency issues: be prepared to re-execute your batch on this failure.

One final caveat: in SQL Server, you cannot apply NOLOCK to the target table of a modification statement, and SQL Server ignores the READ UNCOMMITTED isolation level for the target of modification statements; for example:

INSERT INTO dbo.testnolock1 WITH (NOLOCK)

SELECT * FROM dbo.testnolock2;

SQL Server knows that it will use locks for the INSERT, and makes it clear by way of the following error being thrown:

Msg 1065, Level 15, State 1, Line 17

The NOLOCK and READUNCOMMITTED lock hints are not allowed for target tables of INSERT,

UPDATE, DELETE or MERGE statements.

This protection doesn't apply to the source of any writes, hence yet another danger. The following code is allowed and is dangerous because it could write invalid data:

INSERT INTO testnolock1

SELECT * FROM testnolock2 WITH (NOLOCK);

The bottom line is to not use the READ UNCOMMITTED isolation level unless you really understand the implications of reading dirty data and have an ironclad reason for doing so. (It is, for example, an invaluable tool for a DBA to be able to see the changes to data being made in another connection: SELECT COUNT(*) FROM dbo.TableName WITH (NOLOCK); can show you the count of rows being inserted into dbo.TableName by an in-flight process.) Using it for performance gains is rather short-sighted. Continue reading for the real way to increase performance without the chance of seeing dirty data, as we introduce version-based concurrency techniques in the next section.

Understanding the enterprise solution to concurrency: row version-based concurrency

In the interest of performance, application developers too often seek to solve concurrency concerns (reduce blocking, limit waiting on locked objects) by trying to avoid the problem with the tantalizingly named NOLOCK. At first, and at scale, the performance gains seem too large to consider other alternatives, and because the problems we have mentioned only happen "occasionally," they can consume 30 hours of meetings, coding, and testing to track down, since they seem random and non-repeatable. There is a far safer option, without the significant drawbacks and potential for invalid data and errors, which allows you to read a previously committed version of data. Using row versioning techniques that give the user a view of a version of the data that was properly committed, you can get tremendous gains without ever letting the user see dirty data.

Version-based concurrency is available in the SNAPSHOT isolation level, or by altering the implementation of READ COMMITTED. (The latter is often referred to as READ COMMITTED SNAPSHOT isolation (RCSI) as a shortcut, even in this book, but it is not a separate isolation level; rather, it is a setting at the database level.)

Row versioning allows queries to read from rows that might be locked by other queries, by storing previous version(s) of a row and reading those when the current row is locked. The SQL Server instance's TempDB keeps a copy of committed data, and this data can be served to concurrent requests. In this way, row versioning allows access only to committed data, without blocking on data locked by writes. At the cost of increased utilization and workload in TempDB for disk-based tables, performance is dramatically improved because concurrency increases without the dangers of accessing uncommitted data.

The SNAPSHOT isolation level works at the transaction level. Once you start a transaction, and access any data in the transaction, such as the third statement in the following snippet:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.Table1;
--Note: COMMIT or ROLLBACK this transaction if you execute this code, with a real table

Your queries will see a transactionally consistent view of the database. No matter what someone does to the data in dbo.Table1 (or any other table in the same database), you will always see the data as it looked at the start of the first statement that accessed the database in your transaction (in this case, the SELECT statement). This is great for some purposes, such as reporting. SNAPSHOT gives you the same level of consistency for the data you are reading as SERIALIZABLE does, except that other connections' work can continue and data can keep changing. It is not susceptible to nonrepeatable reads or phantom rows; the tradeoff is that you may be viewing expired data.

The SNAPSHOT isolation level can be problematic for certain types of code because if you need to check if a row exists to do some action, you can’t see if the row was created or deleted after you started your transaction context. And as discussed earlier, you can’t switch out of SNAPSHOT temporarily, then apply locks to prevent non-repeatable reads, and go back to seeing a consistent, yet possibly expired view of the database.

While SNAPSHOT isolation level works at the transaction level, READ COMMITTED SNAPSHOT works at the statement level. This means that while data is blocked, you will see the previous version of those rows as you read from the table. But if you execute the same statement again, you may get nonrepeatable reads and phantom rows. So, execute a batch such as this:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
SELECT * FROM dbo.Table1;
SELECT * FROM dbo.Table1;
SELECT * FROM dbo.Table1;

You may get back 1 row the first time, 1,000,000 the second (including a freshly modified version of the first row), and 0 the third, all depending on the committed rows at the moment each statement started.

Understanding concurrency: accessing data in SNAPSHOT isolation level

The beauty of the SNAPSHOT isolation level is the effect on readers of the data. If you want to go in and query the database, you generally want to see it in a consistent state. And you don’t want to block others. A typical problem comes in when you are writing an operational report, and you query a child table, and you get back 100 rows with distinct parentId values, but querying the parent table indicates there are only 50 parentId values (because between queries, another process deleted the other 50). When using SNAPSHOT isolation for read write processes, it is important to consider the techniques mentioned for optimistic locking in the Inside OUT sidebar in the “Understanding concurrency: two requests updating the same rows” section above.

Consider the following steps involving a read and a write, with each transaction coming from a different session. In this scenario, we see that Transaction 2 has access to previously committed row data, even though those rows are being updated concurrently.

  1. A table contains only rows with a column Type value of 0 and 1.

    CREATE TABLE Demo.SS_Test (Type int);
    
    INSERT INTO Demo.SS_Test VALUES (0),(1);
  2. Transaction 1 starts and updates rows where Type = 1 to Type = 2.

    --Transaction 1
    
    BEGIN TRANSACTION;
    
    UPDATE Demo.SS_Test
    
    SET  Type = 2
    
    WHERE Type = 1;
  3. Before Transaction 1 commits, Transaction 2 sets its session isolation level to SNAPSHOT and executes BEGIN TRANSACTION.

    --Transaction 2
    
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    
    BEGIN TRANSACTION;
  4. Transaction 2 issues a SELECT statement WHERE Type = 1. Transaction 2 is not blocked by Transaction 1. A row where Type = 1 is returned.

    --Transaction 2
    
    SELECT Type
    
    FROM   Demo.SS_Test
    
    WHERE  Type = 1;
  5. Transaction 1 executes a COMMIT TRANSACTION.

  6. Transaction 2 again issues a SELECT statement WHERE Type = 1. The same row from step 4 is returned. Even if all of the table's data is deleted, the results will always be the same for the same query while the SNAPSHOT level transaction is open. Once Transaction 2 is committed or rolled back, queries on that connection will see the changes that have occurred since the transaction started.

Transaction 2 was not blocked when it attempted to query rows that Transaction 1 was updating. It had access to previously committed data, thanks to row versioning.

Implementing row-versioned concurrency

You can implement row versioned isolation levels in a database in two different ways. Turning on SNAPSHOT isolation simply allows for the use of SNAPSHOT isolation and begins the process of row versioning. Alternatively, turning on RCSI changes the default isolation level to READ COMMITTED SNAPSHOT. You can implement both or either. It’s important to understand the differences between these two settings, because they are not the same:

  • READ COMMITTED SNAPSHOT configures optimistic concurrency for reads by overriding the default isolation level of the database. When turned on, all queries will use RCSI unless overridden.

  • SNAPSHOT isolation mode configures optimistic concurrency for reads and writes. You must then specify the SNAPSHOT isolation level for any transaction to use SNAPSHOT isolation level. It is possible to have update conflicts with SNAPSHOT isolation mode that will not occur with READ COMMITTED SNAPSHOT. The concept of an update conflict will be covered in the next section.

The statements to implement SNAPSHOT isolation in the database are simple enough, but they are not without consequence. Even if no transactions or statements use the SNAPSHOT isolation level, behind the scenes, TempDB begins storing row version data for disk-based tables, minimally for the length of the transaction that modifies the row. This way, if a row-versioned transaction starts while rows are being modified, the previous versions are available.

Note

Memory-optimized tables share properties with the SNAPSHOT isolation level but are implemented in an entirely different manner that is based completely on row versioning and does not use TempDB. They are covered in Chapter 15.

Here’s how to allow transactions in a database to start transactions in the SNAPSHOT isolation level:

ALTER DATABASE databasename SET ALLOW_SNAPSHOT_ISOLATION ON;

After executing only the above statement, all transactions will continue to use the default READ COMMITTED isolation level, but you now can specify the use of SNAPSHOT isolation level at the session level or in table hints, as shown in the following example:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

Using the SNAPSHOT isolation level on an existing database can be a lot of work, and as we cover in the next section, it changes some very important ways we handle executing queries, since what was once blocking on writes becomes an error message telling you to try again. Alternatively, if you want to apply the "go faster" solution that mostly works with existing code, you can alter the meaning of READ COMMITTED to read row versions instead of waiting for locks to clear.

You can use READ_COMMITTED_SNAPSHOT independently of ALLOW_SNAPSHOT_ISOLATION. Similarly, these settings are not tied to the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database setting to promote memory-optimized table access to SNAPSHOT isolation.

Here’s how to turn on RCSI:

ALTER DATABASE databasename SET READ_COMMITTED_SNAPSHOT ON;

Note

Changing the READ_COMMITTED_SNAPSHOT database option on a live database where you have memory-optimized tables set to DURABILITY = SCHEMA_ONLY will empty those tables. You need to move the contents of the table to a more durable table before changing READ_COMMITTED_SNAPSHOT to ON or OFF.

For either of the previous ALTER DATABASE statements to succeed, no other transactions can be open in the database. It might be necessary to close other connections manually or to put the database in SINGLE_USER mode. Either way, we do not recommend that you perform this change during production activity.
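
To confirm how a database is currently configured for row versioning, you can query sys.databases; a minimal sketch:

SELECT name,
       snapshot_isolation_state_desc,    -- ON when ALLOW_SNAPSHOT_ISOLATION is enabled
       is_read_committed_snapshot_on     -- 1 when READ_COMMITTED_SNAPSHOT is enabled
FROM   sys.databases
WHERE  name = DB_NAME();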

It is essential to be aware of and prepared for the increased utilization of TempDB, both in activity and in space requirements. Although you should try to avoid autogrowth events by sizing the TempDB data and log files yourself and monitoring their size, you should also verify that your TempDB file autogrowth settings are appropriate in case things grow larger than expected.

  • Images For more information on file autogrowth settings, see Chapter 8.

Should TempDB exhaust all available space on its volume, SQL Server will be unable to create row versions for transactions and will terminate them with SQL Server error 3958. SQL Server will also issue errors 3967 and 3966 as the oldest row versions are removed from TempDB to make room for new row versions needed by newer transactions.
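
To keep an eye on how much space the version store is consuming in TempDB, one option in recent versions of SQL Server is the sys.dm_tran_version_store_space_usage DMO; a minimal sketch:

SELECT DB_NAME(database_id) AS database_name,
       reserved_page_count,
       reserved_space_kb
FROM   sys.dm_tran_version_store_space_usage
ORDER BY reserved_space_kb DESC;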

Note

Prior to SQL Server 2016, READ COMMITTED SNAPSHOT and SNAPSHOT isolation levels were not supported with columnstore indexes. Beginning with SQL Server 2016, SNAPSHOT isolation and columnstore indexes are fully compatible.

Understanding update operations in SNAPSHOT isolation level

Transactions that read data in SNAPSHOT isolation or RCSI will have access to previously committed data, instead of being blocked, when the data they need is being changed. This is important to understand because it can result in an UPDATE statement experiencing a concurrency error when you start to change data. Update conflicts change how your system behaves; you need to understand them before deciding to implement your code in the SNAPSHOT isolation level.

One of the keys to understanding how modifying data in the SNAPSHOT isolation level behaves is to understand that you can only have one "dirty" version of a physical resource. So, if another connection modifies a row and you only read the row, you see previous versions. But if you change a row that another connection has also modified, your update was based on out-of-date information and will be rolled back.

For example, consider the following steps, with each transaction coming from a different session. In this example, Transaction 2 fails due to a concurrency conflict or “write-write error:”

  1. A table contains multiple rows, each with a unique Type value.

    CREATE TABLE Demo.SS_Update_Test
    
    (Type int CONSTRAINT PKSS_Update_Test PRIMARY KEY,
    
     Value nvarchar(10));
    
    INSERT INTO Demo.SS_Update_Test VALUES (0,'Zero'),(1,'One'),(2,'Two'),(3,'Three');
  2. Transaction 1 begins a transaction in the READ COMMITTED isolation level and performs an update on the row where Type = 1.

    --Transaction 1
    
    BEGIN TRANSACTION ;
    
    UPDATE Demo.SS_Update_Test
    
    SET  Value = 'Won'
    
    WHERE Type = 1;
  3. Transaction 2 sets its session isolation level to SNAPSHOT and issues a statement to update the row where Type = 1. This connection is blocked, waiting for the modification locks to clear.

    --Transaction 2
    
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    
    BEGIN TRANSACTION
    
    UPDATE Demo.SS_Update_Test
    
    SET  Value = 'Wun'
    
    WHERE Type = 1;
  4. Transaction 1 commits, using a COMMIT TRANSACTION statement. The update succeeds.

  5. Transaction 2 then immediately fails with SQL error 3960.

    Msg 3960, Level 16, State 2, Line 8
    
    Snapshot isolation transaction aborted due to update conflict. You cannot use
    snapshot isolation to access table 'dbo.AnyTable' directly or indirectly in database
    'DatabaseName' to update, delete, or insert the row that has been modified or deleted by
    another transaction. Retry the transaction or change the isolation level for the update/
    delete statement.

Note

SNAPSHOT isolation level with disk-based tables in SQL Server is not pure row-versioned concurrency, which is why in the previous example, Transaction 2 was blocked by Transaction 1. Using memory optimized tables, which are based on pure row-versioned concurrency, the transaction would have failed immediately, rather than having been blocked.

Transaction 2 was rolled back. Let’s try to understand why this error occurred, what to do, and how to prevent it.

In SQL Server, writes in SNAPSHOT isolation still take locks on disk-based tables and can be blocked (which is why Transaction 2 waited), but the isolation level does not prevent updates from colliding. It is possible for a statement to fail when committing changes from an UPDATE statement if another transaction has changed the data needed for that update during a transaction in the SNAPSHOT isolation level.

For disk-based tables, the update conflict error will look like the Msg 3960 that we saw a moment ago. For queries on memory-optimized tables, the update conflict error will look like this:

Msg 41302, Level 16, State 110, Line 8

The current transaction attempted to update a record that has been updated since this
transaction started. The transaction was aborted.

What this means for you is that if you decide to use SNAPSHOT isolation level for your modification queries, you must be ready to handle an error that isn’t really an error. It is just a warning to re-execute your statements, after checking to see whether anything has changed since you started your query. This is exactly the same approach as handling deadlock conditions, and it will be the same approach for handling all modification conflicts with memory-optimized tables.
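
As an illustration, the following is a minimal retry sketch, reusing the Demo.SS_Update_Test table from the earlier example; the retry count and the list of retryable error numbers are assumptions you would tune for your own workload.

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

DECLARE @Retry int = 1;
WHILE @Retry <= 3
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE Demo.SS_Update_Test
        SET    Value = 'Wun'
        WHERE  Type = 1;

        COMMIT TRANSACTION;
        BREAK; --success, so stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        --3960 = snapshot update conflict; 41302 = memory-optimized write conflict;
        --1205 = deadlock victim. Anything else is re-raised to the caller.
        IF ERROR_NUMBER() IN (3960, 41302, 1205)
            SET @Retry += 1;
        ELSE
            THROW;
    END CATCH;
END;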

Even though the optimistic concurrency of SNAPSHOT isolation level increases the potential for update conflicts, you can mitigate them by doing the following if you specifically need to avoid update conflicts:

  • Minimize the length of transactions that modify data. While it seems like this would be less of an issue because readers aren’t blocked, longer transactions do increase the likelihood of modification conflicts.

  • When running a modification in SNAPSHOT isolation level, try to avoid using any statements that place update locks on disk-based tables inside multistep explicit transactions.

  • Specifying the UPDLOCK table hint can have utility in preventing update conflict errors for long-running SELECT statements. The UPDLOCK table hint places locks on the rows needed for the multistep transaction to complete. The use of UPDLOCK on SELECT statements with SNAPSHOT isolation level is not a panacea for update conflicts, and it could in fact create them: frequent SELECT statements with UPDLOCK could increase the number of update conflicts with updates. Regardless, your application should handle errors and initiate retries when appropriate (see the sketch following this list).

    If two concurrent statements use UPDLOCK, with one updating and one reading the same data, even in implicit transactions, an update conflict failure is possible if not likely.

  • Consider avoiding writes altogether while in SNAPSHOT isolation mode. Use it only to do reads where you do not plan to write the data in the same transaction you have fetched it in.

Specifying table granularity hints such as ROWLOCK or TABLOCK can prevent update conflicts, although at the cost of concurrency. The second update transaction must be blocked while the first update transaction is running—essentially bypassing SNAPSHOT isolation for the write. If two concurrent statements are both updating the same data in SNAPSHOT isolation level, an update conflict failure is likely for the statement that started second.
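
Returning to the UPDLOCK suggestion in the list above, here is a minimal sketch, again assuming the Demo.SS_Update_Test table from earlier: the UPDLOCK hint on the initial read takes update locks on the rows, so the UPDATE later in the same transaction cannot hit an update conflict on those rows.

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;

--The UPDLOCK hint takes update locks on the rows read, blocking other writers
--to those rows until this transaction completes
SELECT Value
FROM   Demo.SS_Update_Test WITH (UPDLOCK)
WHERE  Type = 1;

UPDATE Demo.SS_Update_Test
SET    Value = 'Uno'
WHERE  Type = 1;

COMMIT TRANSACTION;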

Understanding on-disk versus memory-optimized concurrency

Queries using memory-optimized tables (also referred to as in-memory OLTP tables) can perform significantly faster than queries based on the same data in disk-based tables. Memory-optimized tables can improve the performance of frequently written-to tables by up to 40 times over disk-based tables.

However, this almost magical performance improvement comes at a price, not just in the need for extra memory, but also in that the way they implement concurrency controls is rather different from disk-based tables. In the concurrency scenarios previously introduced, all of the concurrency protections provided were based on locking; in other words, waiting until the other connection has completed, and then applying the changes. However, locking applies only to on-disk tables, not memory-optimized tables.

In the case of memory-optimized tables, locking isn’t the mechanism that ensures isolation. Instead, the in-memory engine uses pure row versioning to provide row content to each transaction. In pure row versioning, an UPDATE statement is an insert of a new row, and an update to the effective ending timestamp on the previous row. A DELETE statement only updates the effective ending timestamp on the current row. If you are familiar with the data warehousing concept of a Slowly Changing Dimension (SCD), this is similar to an SCD Type II. It is equally similar to how temporal tables work, though both the current and historical data are in the same physical structure.

  • For more explanation of temporal tables, see Chapter 7.

Previous versions hang around as long as they are needed by transactions and are then cleaned up as they can be (data is also hardened to what is referred to as the delta file for durability purposes, as well as the transaction log.)

If two transactions attempt to modify the same physical data resource at the same time, one transaction will immediately fail due to a concurrency error, rather than being blocked and waiting; only one transaction at a time can be in the process of modifying or removing a given row. The other will fail with a concurrency conflict (SQL error 41302). However, if two transactions insert the same value for the primary key, the failure will not occur until a transaction is committed, because it is not the same physical resource.

This is the key difference between the behavior of pessimistic and optimistic concurrency. Pessimistic concurrency uses locks to prevent write conflict errors, whereas optimistic concurrency uses row versions with acceptable risk of write conflict errors. On-disk tables offer isolation levels that use pessimistic concurrency to block conflicting transactions, forcing them to wait. Memory-optimized tables offer optimistic concurrency that will cause a conflicting transaction to fail.

Memory optimized tables allow you to use SNAPSHOT, REPEATABLE READ, and SERIALIZABLE isolation levels and provide the same types of protections by definition. In the case of a nonrepeatable read, SQL error 41305 will be raised. In the case of a phantom read, a SQL error 41325 will be raised. Because of these errors, applications that write to memory-optimized tables must include logic that gracefully handles and automatically retries transactions. They should already handle and retry in the case of deadlocks or other fatal database errors.

  • For more information on configuring memory-optimized tables, see Chapter 7.

  • We discuss more about indexes for memory-optimized tables in Chapter 15.

Understanding reading memory optimized data in other than SNAPSHOT isolation level

All data that is read by a statement on memory-optimized tables behaves as if in SNAPSHOT isolation level. Once your transaction starts and you access any memory-optimized data in the database, further reads from memory-optimized tables will be from a consistent view of those objects. (Note that memory-optimized and on-disk tables are in different “containers,” so your consistent view of the memory-optimized data doesn’t start if you have read only on-disk tables.)

However, what makes your work more interesting is that when in REPEATABLE READ or SERIALIZABLE isolation levels, the scan for phantom and non-repeatable read rows is done during commit rather than as they occur.

For example, consider the following steps, with each transaction coming from a different session. A typical scenario might be running a report of some data. You read in a set of data, perform some operation on it, and then read another set of rows, and your process requires the data to all stay the same.

  1. A table contains many rows, each with a unique ID. Transaction 1 begins a transaction in the SERIALIZABLE isolation level and reads all the rows in the table.

  2. Transaction 2 updates the row where ID = 1. This transaction commits successfully.

  3. Transaction 1 again reads rows in this same table. No error is raised, and rows are returned as normal.

  4. Transaction 1 commits. An error is raised (41305), alerting you to a non-repeatable read. (Even though this was in SERIALIZABLE isolation level, the check for a non-repeatable read is done first, since this is a reasonably easy operation, whereas the scan for phantom rows requires the engine to run a query on the data to see if extra rows are returned.)

Generally speaking, most use of the isolation levels other than SNAPSHOT with memory-optimized data should be limited to operations like data integrity checks, where you want to make sure that one row exists before you insert the next row.

Specifying isolation level for memory-optimized tables in queries

Isolation level is specified in a few, mildly confusing ways for memory-optimized tables. The first applies to an ad hoc query that is not in an existing transaction context: you can simply query the table as you always have, and it will be accessed in SNAPSHOT isolation level.

If you are in the context of a transaction, it will not automatically default to SNAPSHOT isolation level. Rather you need to specify the isolation level as a hint, such as:

BEGIN TRANSACTION;

SELECT *
FROM   dbo.MemoryOptimizedTable WITH (SNAPSHOT);

You can make it default to SNAPSHOT isolation level by turning on the MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT database option. This promotes access to all memory-optimized tables in the database up to SNAPSHOT isolation level if the current isolation level is not REPEATABLE READ or SERIALIZABLE. It will promote the isolation level to SNAPSHOT from isolation levels such as READ UNCOMMITTED and READ COMMITTED. This option is off by default, but you should consider it because you otherwise cannot use the READ UNCOMMITTED or SNAPSHOT isolation levels for a session including memory-optimized tables.
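
For example, to turn the option on, here is a sketch using the WideWorldImporters sample database name (substitute your own database):

ALTER DATABASE WideWorldImporters
    SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;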

If you need to use REPEATABLE READ or SERIALIZABLE, or your scenario will not meet the criteria for automatically choosing SNAPSHOT isolation level, you can specify the isolation level using table concurrency hints. (See the section “Using table hints to change isolation” earlier in this chapter). Note that only memory-optimized tables can use this SNAPSHOT table hint, not disk-based tables.

Finally, note that you cannot mix disk-based SNAPSHOT isolation level with memory-optimized SNAPSHOT isolation level. For example, you cannot include memory-optimized tables in a session that begins with SET TRANSACTION ISOLATION LEVEL SNAPSHOT, even if MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON or you specify the SNAPSHOT table hint.

  • For more information on configuring memory-optimized tables, see Chapter 7.

  • We discuss more about indexes for memory-optimized tables in Chapter 15.

Understanding durability settings for performance

One of the base requirements for a relational database management system is that data saved is durable, meaning that once you believe it is saved in the database, it is permanent and cannot be lost (short of losing the entire server and your backups). This is a very important requirement of relational databases, but it is certainly a detriment to performance. When data is saved to a table via a DML statement, by default, three steps are required, both for on-disk and in-memory tables. Data is synchronously saved to memory and to the transaction log, and then control is returned to the user. The third step is done asynchronously: the data is saved to a disk file, hardening it and releasing the corresponding transaction log records to be deleted.

For performance’s sake, there are two configurations that we can use to increase performance at the detriment of durability. The first is to use memory-optimized tables, setting the durability setting on the table to SCHEMA_ONLY. This creates a logless memory-based object that will be empty, with only the schema remaining when you restart the server. This is a tool that can be useful in certain scenarios and will provide amazing performance, even hundreds of times faster than on-disk tables. It is not, however, a universally applicable tool to increase application performance because it makes the table completely non-durable.
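
As an illustration, the following sketch creates a non-durable memory-optimized table; the table name and columns are hypothetical, and the database must already be configured with a memory-optimized filegroup (see Chapter 7).

CREATE TABLE dbo.SessionCache
(
    SessionId  int           NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    CacheValue nvarchar(100) NOT NULL
)
--SCHEMA_ONLY: rows are never logged or written to disk, so the table is empty after a restart
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);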

Note

The process of writing data to disk and memory is changing as new technologies arrive. In SQL Server 2019, a new feature called Hybrid Buffer Pool can use persistent memory modules (PMEM) to write data to disk and then access it directly from that device instead of copying it into RAM. For more details on Hybrid Buffer Pool, check Chapter 2, in the Hybrid Buffer Pool section, as well as: https://docs.microsoft.com/sql/database-engine/configure-windows/hybrid-buffer-pool.

In this section, we look at a second way you can alter how durability is handled in SQL Server, in a manner that can be useful on very small departmental servers as well as enterprise servers. That tool is delayed durability, which alters the durability of your data slightly, but enough to possibly make a tremendous difference in how long it takes to return control to your server’s clients, when it makes sense to use it.

Delayed durability database options

Delayed durability allows transactions to avoid synchronously committing to disk; instead, they commit synchronously only to memory, while the write to the transaction log on disk happens asynchronously. This opens the possibility of losing data in the event of a server shutdown before the log has been written, so it does have dangers. This engine feature was first introduced in SQL Server 2014 and works basically the same today.

While this sounds concerning, it is good to note that unless your SQL Server instance’s databases are running in a synchronous availability group, data loss is technically possible in some cases anyway (and even then, a chance exists for the databases to drop into asynchronous mode under pressure), so you already face the possibility of losing recently written rows in the event of a sudden server or drive failure in which you lose the log and data files.

Databases in Azure SQL Database also support delayed durability transactions, with the same caveat and expectations for data recovery. Some data loss is possible, and you should only use this feature if you can recreate important transactions on a server crash.

Note

Distributed (DTC) and cross-database transactions are always durable.

At the database level, you can set the DELAYED_DURABILITY option to DISABLED (default), ALLOWED, or FORCED.
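
For example, to allow (but not force) delayed durability transactions in a database, here is a sketch using the WideWorldImporters sample database name:

ALTER DATABASE WideWorldImporters
    SET DELAYED_DURABILITY = ALLOWED;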

ALLOWED permits any explicit transaction to optionally opt in to delayed durability by committing with:

COMMIT TRANSACTION WITH ( DELAYED_DURABILITY = ON );

Additionally, for natively compiled procedures, you can specify DELAYED_DURABILITY in the BEGIN ATOMIC block. Take, for example, this header of a procedure in the WideWorldImporters database:

CREATE PROCEDURE [Website].[RecordColdRoomTemperatures_DD]
@SensorReadings Website.SensorDataList READONLY
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
(
    TRANSACTION ISOLATION LEVEL = SNAPSHOT,
    LANGUAGE = N'English',
    DELAYED_DURABILITY = ON
)
    BEGIN TRY
     …

The FORCED option means that every transaction, regardless of what the person writing the COMMIT statement wishes, will have asynchronous log writes. This obviously has implications for every user of the database, and you should consider it carefully with existing applications.

The delayed durability options, implemented either at the database level or at the transaction level, are useful in very-high-performance workloads for which the bottleneck to write performance is the transaction log itself. By accepting the possibility that new rows may be written only to memory and lost in the event of a shutdown, you can gain a significant performance increase, especially with write-heavy workloads.

Even if you cannot employ delayed durability in your normal day-to-day operations, it can be a very useful setting when loading a database, particularly for test data. Because log writes are done asynchronously, instead of every transaction waiting for small log writes to complete, log writes are batched together in an efficient manner.

Note

While delayed durability does apply to memory-optimized tables, the DELAYED_DURABILITY database option is not related to the DURABILITY option used when creating memory-optimized tables.

A transaction that changes data under delayed durability will be flushed to the disk as soon as possible: whenever any other durable transaction commits in the same database, or whenever a threshold of delayed durability transactions builds up. You also can force a flush of the transaction log with the system stored procedure sp_flush_log. Otherwise, the transactions are written to a buffer in memory until a log flush event. SQL Server manages the buffer but makes no guarantees as to the amount of time a transaction can remain in the buffer.
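
For example, a maintenance job or an application checkpoint could force any pending delayed durability transactions to be hardened to the log with:

EXEC sys.sp_flush_log;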

It’s important to note that delayed durability is simply about reducing the I/O bottleneck of committing a massive quantity of writes to the transaction log. This has no effect on isolation (locking, blocking) or access to any data in the database that must be read to perform the write. Otherwise, delayed durability transactions follow the same rules as other transactions.

Note

Any SQL Server instance service shutdown, whether it be a planned restart or sudden failure, could result in delayed durability transactions being lost. This also applies to the failover of a failover cluster instance (FCI), availability group, or database mirror. Transaction log backups and log shipping will similarly contain only transactions made durable. You must be aware of this potential when implementing delayed durability.

Understanding how SQL Server executes your query

The internal process to execute a query in SQL Server is very complex. It is, however, something that the engineers at Microsoft are working very hard for you to be able to ignore one day. This day has not yet arrived, but in SQL Server 2019, this future is inching ever closer with new and smarter query optimization features. In the next major section, entitled “Understanding advanced engine features for tuning queries,” we will spend time looking at the kind of strides they are making to help the DBA deal with more important things.

However, as a DBA, it will remain essential to understand how queries are executed for many years to come, as this informs how you write queries, and how you know when SQL Server’s optimizer has made an incorrect choice. The reality is that they will likely never reach the place where every query runs perfectly regardless of how it is written, but the number of cases that need manual intervention decreases over time. What it also means is that instead of a lot of easy tuning issues and several complex ones, you will instead have only complex issues to work on.

Understanding the overall query execution process

A user writes a query, which could be 1 line of code like SELECT * FROM dbo.TableName; or it could be a statement that contains 1,000 lines. This query could be a batch of multiple statements, use temporary objects, other coded objects such as user defined functions, not to mention table variables, and perhaps a cursor thrown in for good measure. After writing a query, the user tries to execute it.

First, the code is parsed and translated into structures in the native language of the query processor. If the query is technically correct SQL syntax (well or poorly written, either way), it will pass this phase. If there are syntax errors, like SLECT * FROM dbo.TableName;, then it fails with a syntax error. If you reference objects that don’t exist, like SELECT * FROM dbr.TableName; (should have been dbo.), that will also fail.

Once the code is parsed and prepared for execution, the query optimizer tries to calculate the best way to run your code, and saves that choice, referred to as the query plan (or the execution plan/explain plan, depending on which tool you are using), so the code can be run again using the same process. Getting the right query plan for different parameters or server load is where the real rocket science comes in for the engineers. That engine work is why a person with no understanding of how computers work can write a query in less than 30 seconds that will access 100 million rows, aggregating the values of one or more columns and filtering out values that they don’t want to see, and get back results in just a few seconds.

There are three kinds of execution plans you will deal with: estimated, actual, and live. The estimated plan is something you can ask for, to show you what the execution plan will probably look like. The actual query plan has more details about what actually occurred, which can vary from the estimated plan for multiple reasons, including SQL Server’s Intelligent Query Processing features, which we will discuss more later in the chapter. There is another version of a plan you can see as well, referred to as the live execution plan, which basically lets you see the query plan that was chosen, with the rows of data flowing through as it is executing.

The execution plan that is created is a detailed blueprint of the query optimizer’s plan for processing any statement. Each time you run a statement, including batches with multiple statements, an execution plan is generated. Query plans share with you the plan of execution for queries, inform you of the steps the Database Engine will take to retrieve data, through the various transformation steps to sort, join, and filter data, and finally return or affect data. All statements you execute will create an execution plan, including Data Manipulation Language (DML) and Data Definition Language (DDL).

The plan will contain the estimated costs and other metadata of each piece that it takes to process a query and finally the DML or DDL operation itself. This data can be invaluable to developers and database administrators for tuning query performance. When you look at the query plan, you will be able to see some of the estimates that were made, compared to the actual values.

Execution plans are placed in the procedure cache, which is stored in active memory, that SQL Server uses when a statement is executed again. The procedure cache can be cleared manually, and it is reset when you restart a server. Plans from the procedure cache are reused for a query when that exact same query text is called again. Queries will reuse the same plan only if every character of the query statement matches, including capitalization, whitespace, line breaks, and text in comments. There is one exception to this rule of query reuse, and that is when SQL Server parameterizes a query or stored procedure statement. Parameterization will allow some values, like literals or variables to be treated as having a different value on each execution, a concept we will see later in the chapter.

Retrieving execution plans in SQL Server Management Studio

Let’s now look in some detail at how you can retrieve and view the different types of execution plans in SQL Server Management Studio.

Displaying the estimated execution plan

You can generate the estimated execution plan quickly and view it graphically from within SQL Server Management Studio by choosing the Display Estimated Execution Plan option in the Query menu, or by pressing Ctrl+L. An estimated execution plan will return for the highlighted region, or for the entire file if no text is selected.

You can also retrieve an estimated execution plan in T-SQL code by running the following statement. It will be presented in an XML format:

SET SHOWPLAN_XML ON;

In SSMS, in Grid mode, the results are displayed as a link as for any XML output, but SSMS knows that this is a plan, so click the link to open the plan graphically in SSMS. You can save the execution plan as a .sqlplan file by right-clicking in the neutral space of the plan window.

You can also configure the estimated text execution plan in code by running one of the following statements, which return the execution plan in one result set or two, respectively:

SET SHOWPLAN_ALL ON;

SET SHOWPLAN_TEXT ON;

The text plan of the query using one of these two statements can be useful if you want to send the plan to someone in an email in a compact manner.

Note

Be aware that when any of the aforementioned SET options are turned on for a connection, SQL Server will not run statements, it only returns estimated execution plans. Remember to turn off the SET SHOWPLAN_ option before you reuse the same session for other queries.

As expected, the estimated execution plan is not guaranteed to match the actual plan used when you run the statement, but it is usually a very reliable approximation you can use to see if a query looks like it will execute well enough. The query optimizer uses the same information for the estimate as it does for the actual plan when you run it.

To display information for individual steps, hover over a step in the execution plan. You can also click an object, and then open the Properties window by pressing F4 or, in the View menu, by clicking Properties Window. After you have a bit of experience with plans, you’ll notice the estimated execution plan is missing some information that the actual plan returns. The missing fields are generally self-explanatory in that they are values you would not have until the query has actually executed; for example, Actual Number Of Rows, Actual Number Of Batches, and Number of Executions.

As an example, consider the following query in the WideWorldImporters database:

SELECT *
FROM Sales.Invoices
     JOIN Sales.Customers
         ON Customers.CustomerId = Invoices.CustomerId
WHERE Invoices.InvoiceID like '1%';

In Figure 14-1, we have part of the plan for this query, which looks pretty simple, but the full query plan takes up a good bit more space, due to a security policy (Application.FilterCustomersBySalesTerritoryRole) that implements row-level security that you wouldn’t even notice without looking at the Query Plan.

Query plan for the previous query in the chapter. To the leftmost side of the query, there is a SELECT operator, followed by the Merge Join (Inner Join) operator, which is fed by a Nested Loops operator, and so on. The details of the query plan are not important here; what matters is that a query plan has operators and data that flows through them from right to left.

Figure 14-1 Example query plan, showing a portion of the plan for the query from the text

Displaying the actual execution plan

You can view the actual execution plan used to execute the query along with the statement’s result set from within SSMS by choosing the Include Actual Execution Plan option in the Query menu, or by pressing Ctrl+M to turn on the setting. After turning on this setting, when you run a statement, you will see an additional tab appear along with the execution results after the results of statements have completed.

Note

Turning on the actual execution plan feature will add extra time to the execution of your query. If you are comparing runs of a query, this could skew the results.

You can also return the actual execution plan as a result set using T-SQL code, returning XML that can be viewed graphically in SSMS, by running the following statement:

SET STATISTICS XML ON;

The actual execution plan is returned as an XML string. In SSMS, in Grid mode, the results display as a link which can be viewed graphically by clicking on the link. Remember to turn off the SET STATISTICS option before you reuse the same session, if you don’t want to get back the actual plan for every query you run on this connection.

You can save both estimated and actual execution plans as a .sqlplan file by right-clicking the neutral space of the plan window.

You might see that the total number of rows processed does not match the total estimated number of rows for that step; rather, it is the product of that step’s estimated number of rows and a preceding step’s estimated number of rows. Back in Figure 14-1, on the Merge Join (Inner Join) operator, you can see that 11111 of 6654 rows were processed. The 6654 was the estimate, and 11111 was the actual number of rows.

Displaying live query statistics

Live Query Statistics is an excellent feature that was introduced with SQL Server 2016 Management Studio. You can generate and display a “live” version of the execution plan by using SQL Server Management Studio. You can access live query statistics on versions of SQL Server starting with SQL Server 2014.

You turn on the Live Execution Statistics option for a connection via the Query menu of SQL Server Management Studio, as demonstrated in Figure 14-2.

Query menu from SQL Server Management Studio, with the Include Live Query Statistics highlighted.

Figure 14-2 The Include Live Query Statistics option

Executing the query we used back in the previous section, you will see something like the following in Figure 14-3.

Live query plan, showing the addition of the row counts for each operator. The image was taken during execution of the query from Figure 14-1, showing that, for example, the Merge Join (Inner Join) operator had processed 10321 rows out of an expected 6126 rows.

Figure 14-3 Example Live Query Statistics

In Figure 14-3, you can see that 10231 of the estimated 6126 rows have been processed by the Merge Join Operator, and 574 rows had been processed by the Nested Loops (Left Semi Join) operator.

The Live Query Statistics window initially displays the execution plan more or less like the estimated plan, but it fills in the details of how the query is being executed as it is processed. If your query runs quickly, you’ll miss the dotted, moving lines and the various progress metrics, including the duration for each step and the overall percentage of completion. The percentage is based on the actual rows processed so far versus the total estimated number of rows for that step.

Live Query Statistics contains more information than the estimated query plan, such as Actual Number Of Rows and Number Of Executions, but less than the actual query plan. Live Query Statistics does not display some data from the actual execution plan, such as Actual Execution Mode, Number Of Rows Read, and Actual Rebinds.

Returning a live execution plan will noticeably slow down the query processing, so be aware that the individual and overall execution durations measured will often be longer than when the query is run without the option to display Live Query Statistics. However, it can be worth it to see where a query is hung up in processing.

If your server is configured correctly to do so, you can see the live query plan of executing queries in action by using Activity Monitor, which is accessed by right-clicking on a server in SSMS. Then you can access the live execution plan by right-clicking any query in the Processes or Active Expensive Queries panes.

To understand the details of setting up your server’s Query Profiling Infrastructure, go to https://docs.microsoft.com/sql/relational-databases/performance/query-profiling-infrastructure. Just note that it is not free to add the ability to capture live query execution statistics, so determine if it is worth it first.

Permissions necessary to view execution plans

Not just anyone can view the execution plans of a query. There are two ways you can view plans, and they require different kinds of permissions.

If you wish to generate and view a query plan, you will first need permissions to execute the query, even to get the estimated plan. Additionally, retrieving the Estimated or Actual execution plan requires the SHOWPLAN permission in each database referenced by the query. The Live Query Statistics feature requires SHOWPLAN in each database, plus the VIEW SERVER STATE permission to see live statistics, so it cannot (and should not) be done by just any user.
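
For example, here is a sketch of granting those permissions; the role and login names are hypothetical placeholders for your own principals.

--In each database referenced by the queries being tuned
GRANT SHOWPLAN TO [DevTeamRole];

--At the server level (run in master), needed for Live Query Statistics
GRANT VIEW SERVER STATE TO [DevTeamLogin];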

It might be appropriate in your environment to grant SHOWPLAN and VIEW SERVER STATE permissions to developers. However, the permission to execute queries against the production database may not be appropriate in your environment. If that is the case, there are alternatives for providing valuable execution plan data to developers without production access:

  • Consider providing database developers with saved execution plan (.sqlplan) files for offline analysis.

  • Consider also configuring the dynamic data masking feature, which may already be appropriate in your environment for hiding sensitive or personally identifying information for users who are not sysadmins on the server. Do not provide UNMASK permission to developers; assign that only to application users.

  • Sometimes, an administrator may need to execute queries on a production server due to differences in environment/hardware, but be cautioned on all of what that means in terms of privacy, etc.

  • Images For more information on dynamic data masking, see Chapter 7.

Providing the ability to see already generated and used execution plans can be done using several tools. SQL Server Extended Events and Profiler have ways to capture execution plans. Activity Monitor, in query reports such as Recent Expensive Queries, will allow you to right-click and see the plan if it is still available in cache. Finally, the DMVs that provide access to queries that have executed (such as sys.dm_exec_cached_plans) or to current requests (sys.dm_exec_requests) have a column named plan_handle that you can pass to the sys.dm_exec_query_plan DMV to retrieve the plan. In order to access plans in this manner, the server principal will need the VIEW SERVER STATE permission or be a member of the sysadmin role.
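
For example, the following sketch returns the text and plan for statements currently in the procedure cache; any filtering by object, database, or use count is left as an assumption for your environment.

SELECT cp.usecounts, cp.objtype,
       st.text AS query_text,
       qp.query_plan
FROM   sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp;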

Finally, you can see execution plans in the Query Store, which is covered later in the chapter; for it, you need VIEW DATABASE STATE, or you must be a member of the db_owner fixed database role.

Understanding execution plans

At this point, we have established the basics of what an execution plan is, where to find it, and what permissions you need to see it. After you have a graphical execution plan in front of you, it is important to have a basic understanding of how to read how the statement was processed, and how future queries that use this same plan will operate.

Interpreting graphical execution plans

In the next list, we review some of the most common things to look for as you review execution plans in SSMS.

Note

You can also choose to review execution plans with a well-known third-party tool called Plan Explorer, which is a free download from https://www.sentryone.com/. It provides a number of additional features that are often helpful when working with very large query plans.

First, let’s generate an execution plan. For our purposes, a simple one will do, like the plan for the following query:

SELECT Invoices.InvoiceID
FROM Sales.Invoices
WHERE Invoices.InvoiceID like '1%';

Press Ctrl+L, and you will be presented with the estimated query plan shown in Figure 14-4.

Query plan for the SELECT query from Sales.Invoices with a filter on Invoices.InvoiceID like ‘1%’

Figure 14-4 Simple query plan

To display details for individual steps, position your pointer over a step in the execution plan. A detailed tooltip-style window will appear, much like the one you can see in Figure 14-5. An interesting detail should stand out immediately when you look at the plan: it doesn’t say it is scanning the Sales.Invoices table, but rather an index. This is because an index is set up not only for searching for data; if an index has all the data needed to execute the query, the index “covers” the needs of the query, and the table’s data structures are not touched.

Tool tip window that pops up when you hover over an operator. In this example, it is an Index Scan (NonClustered) that gives details about the operator, including estimated costs.

Figure 14-5 Simple query plan with statistics

You can also click an object, and then open the Properties window by pressing F4 or, in the View menu, clicking Properties Window. You will see the same estimated details in the properties pane. Now, let’s get the Actual Execution Plan, by using Ctrl-M, or one of the other methods discussed earlier. Execute the query, and you will see the actual plan, as seen in Figure 14-6.

Shows the actual query plan for the query. The important detail is that the Index Scan (NonClustered) operator includes a count showing that 11111 of 6346 rows were processed, or 175%.

Figure 14-6 Actual Execution Plan for Example Query

The first thing you should notice is that we have a few more details on our query plan. In particular, that 11111 rows were processed, of 6346. You can see that the 6346 is the estimated number of rows that would be returned by the query. This guess was based on statistics, which do not give perfect answers, and when you are tuning larger queries, very large disparities of such guesses need to be investigated.

Open the properties pane and you’ll notice the estimated and actual values returned for some metrics, including Number of Rows, Executions, IO Statistics, and so on. Look for differences between the estimates that were made and the actual numbers here; they can indicate an inefficient execution plan and the source of a poorly performing query. Your query might be suffering from a poorly chosen plan because of the impact of parameter sniffing or stale, inaccurate index statistics. (We discuss parameter sniffing later in this chapter and index statistics in Chapter 15.)

Notice that some values, like cost information, contain only estimated values, even when you are viewing the actual execution plan. This is because the operator costs aren’t sourced separately; they are generated the same way for both estimated and actual plans and do not change based on statement execution. Furthermore, cost is not composed entirely of duration. You might find that some statements far exceed others in terms of duration, but not in cost.

Start with the upper left operator

The upper-left operator will reflect the basic operation that the statement performed. For example, SELECT, DELETE, UPDATE, INSERT, or any of the DML statements (if your query created an index, you might see CREATE INDEX.) This operator might contain warnings or other items that require your immediate attention. These can show up with a small yellow triangle warning icon, with additional detail when you position your pointer on the operator.

For example, in our example plan, the SELECT operator was the farthest left, and it has a triangle over it. You can see in the ToolTip that it is due to: “Type conversion in expression (CONVERT_IMPLICIT(varchar(12),[WideWorldImporters].[Sales].[Invoices].[InvoiceID],0)) may affect "CardinalityEstimate" in query plan choice, Type conversion in expression (CONVERT_IMPLICIT(varchar(12),[WideWorldImporters].[Sales].[Invoices].[InvoiceID],0)>='1') may affect "SeekPlan" in query plan choice”.

In other words, we used a LIKE on an integer value, so the query plan is warning us that it cannot estimate the number of rows as well as it can if our WHERE clause employed integer comparisons to integers.

Click the upper-left operator, and then press F4 to open the Properties window, or open the Properties window from the View menu in SQL Server Management Studio. In this list are some other things to look for. You’ll see warnings repeated in here, along with additional aggregate information.

Note

Yellow triangles on an operator indicate something that should grab your attention. The alert could tip you off to an implicit conversion—a data type mismatch that could be costly. Investigate any warnings reported before moving on.

Look also for the Optimization Level, which typically says FULL. If the Optimization Level is TRIVIAL, the query optimizer bypassed optimization of the query altogether because it was straightforward enough; for example, if the plan for the query only needed to contain a simple Scan or Seek operation along with the statement’s root operator, like SELECT. If the Optimization Level is not FULL or TRIVIAL, this is something to investigate.

Look next for the presence of a value for Reason For Early Termination, which indicates the query optimizer did not spend as much time as it could have picking the best plan.

  • It may have determined that the plan it had picked was good enough to not need to keep optimizing, so a value of “Good Enough Plan Found” is returned.

  • If the reason is “Time Out,” the optimizer tried as many times as it could to find the best plan before taking the best plan available, which might not be good enough. If you see this case, consider simplifying the query, especially reducing the use of functions, and by potentially modifying the underlying indexes.

  • If the reason is “Memory Limit Exceeded,” this is a rare and critical error indicating severe memory pressure on the SQL Server instance.

Next, look right, then read from right to left

Graphical execution plans build from sources (rightmost objects), and apply operators to join, sort, and filter data from right to left, eventually arriving at the leftmost operator. In the rightmost objects, you’ll see Scans, Seeks, and Lookups of different types. You might find some quick, straightforward insight into how the query is using indexes.

Each of the items in the query plan is referred to as an operator. Each operator is a module of code that performs a certain task to process data. Reading the query plan, the rightmost operators acquire data, possibly filter it, and then pass it off to the left until it reaches the single leftmost operator that represents the action of the query. Two of the main types of operators for fetching data from a table or index are seeks and scans.

A seek operator is used to find a portion of a set of data, through the index structure. For example, if you wanted to find Chapter 7 in this book, you would go to the table of contents, find the page number, and go to that page. They are generally the most efficient operators to see and can rarely be improved by additional indexes. The seek operation will find the leaf page in the index, which contains the keys of the index, plus any included column data (which in the case of a clustered table, will be all the data for the row).

Once you have the leaf data, if it contains everything you need, the operation was covered by the index. However, when you need more data than the index contains, you will see a lookup operator joined to the seek operator using a join operator. This means that although the optimizer used a seek, it needed a second pass at the table in the form of a lookup on another object, typically the clustered index, using a second seek operator. Key lookups (on clustered indexes) and RID lookups (on heaps) are expensive and inefficient, particularly when a large number of rows are being accessed. These lookups can tally up to a very large percentage of the cost of a query.

If key lookup operators are needed frequently, they can usually be eliminated from the execution plan by modifying an existing nonclustered index to include the columns that needed to be looked up.

  • For an example, see the section “Designing nonclustered indexes” in Chapter 15.
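
To sketch that technique: if a plan showed an Index Seek on a SupplierID index of Purchasing.PurchaseOrders followed by a Key Lookup to retrieve OrderDate, an index like the following (the index name and the scenario are hypothetical) would let the seek cover the query and remove the lookup.

CREATE NONCLUSTERED INDEX IX_PurchaseOrders_SupplierID_Covering
    ON Purchasing.PurchaseOrders (SupplierID)
    INCLUDE (OrderDate);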

The other typical data source operator is a scan. Scan operations aren’t great unless your query is intentionally returning a large number of rows from a table or index. Scans read all rows from the table or index, which can be very inefficient when you need to return only a few rows, but efficient when you need a large number of rows. Without a nonclustered index with a well-designed key (if one can be found) to enable a seek for the query, a scan might be the query optimizer’s only option. Scans can be ordered if the source data is sorted, which is useful for some joins and aggregation operations.

Scans on nonclustered indexes are often better than scans of clustered indexes, in part due to what is likely a smaller leaf page size; however, if the scan must be paired with key lookups, it is generally worse, because the optimizer will likely expect to return many rows from the object. If you are expecting to return a low number of rows, test and compare the performance of a new or updated nonclustered index, created based on the predicates and outputs of Index Scans and Clustered Index Scans.

Note

Very few queries are important enough to deserve their own indexes. Think “big picture” when creating indexes. If you try to cover every query’s needs with an index, you may discover more and more issues when creating and modifying data. More than one query should benefit from any nonclustered index you create. Avoid redundant or overlapping nonclustered indexes. See Chapter 15 for more information on creating nonclustered indexes, including “missing” indexes.

Other types of scans include the following:

  • Table Scans. These indicate that the table has no clustered index. We discuss why this is probably not a good idea in Chapter 15.

  • Index Scans. These scan the rows of an index, including the included columns of the index, for values.

  • Remote Scans. This includes any object that is preceded by “remote,” which is the same operation but over a linked server connection. Troubleshoot them the same way, but potentially by making changes to the remote server instead. For more details on PolyBase, see Chapter 20, “Leveraging Big Data and Machine Learning” as an alternative to linked server connections that may be superior in many usages.

  • Constant Scans. These appear when the query optimizer deals with scalar values, repeated numbers, and other “constants.” These are necessary operators for certain tasks and generally not actionable from a performance standpoint.

  • Columnstore Index Scans. These are incredibly efficient operators when you are working with lots of rows, but relatively few columns, and likely will outperform a Clustered Index Scan or Index Seek where millions of rows, for example, must be aggregated. No need to create a nonclustered index to replace this operator, unless your query is searching for a few rows.

Note

Since SQL Server 2016, columnstore indexes are a viable option for read-write tables in a transactional system. In previous versions of SQL Server, nonclustered columnstore indexes did not allow writes to the table, and so couldn’t easily be adopted in transactional databases. If you aren’t using them already to optimize large row count queries, consider adding them to your toolbelt.

Furthermore, since SQL Server 2016 SP1, columnstore indexes are even available to all edition licenses of SQL Server, even Express edition, though editions below Enterprise edition have limits to the amount of columnstore cache in memory. For more information, see Chapter 15, as well as https://docs.microsoft.com/sql/sql-server/editions-and-components-of-sql-server-version-15.

The weight of the lines connecting operators tells part of the story, but isn’t the full story

SQL Server dynamically changes the thickness of the gray lines to reflect the actual number of rows. You can get a visual idea of where the bulk of data is coming from by observing the pipes, drawing your attention to the places where performance tuning could have the biggest impact. If you hover over the line in your query plan, you can see the rows transmitted in each step, as you can see in Figure 14-7.

Figure shows the weight of the line between the SELECT operator and the Index Scan (NonClustered) operator, along with the effect of hovering over the line between operators to see number of rows read, including actual and estimated.

Figure 14-7 Showing number of rows read and estimated in query plan.

The visual weight and the sheer number of rows does not, however, directly translate to cost. Look for where the pipe weight changes from light to heavy, or vice versa. Be aware of when thick pipes are joined or sorted.

Operator cost share isn’t the full story either

When you run multiple queries, the cost of each query relative to the batch is displayed in the query execution plan header, and within each plan, each operator’s cost relative to the rest of the operators in the statement is displayed. SQL Server uses a cost-based process to decide which query plan to use.

When optimizing a query, it is usually going to be useful to start with the costliest operators. But deciding to address only the highest-cost single operator in the execution plan might be a dead end, as there are definitely examples where this will not be the case.

In Figure 14-8, we can see an example of when operator cost might not align with the amount of data. You should investigate performance tuning this execution plan using all the information provided.

In this snippet from an execution plan, much of the cost is associated with the top operator, which has 45% of the cost of the entire query, but more rows are moving on from the bottom operator, which is only 1% of the cost of the query.

Figure 14-8 Plan snippet showing cost and relative data amounts

Look for Join operators and understand the different algorithms

As you read from right to left, in a query of any complexity, you’ll likely see the paths meet at a join operator. Two tables, indexes, or the output from another operator can be joined together. There are three types of join operators to be aware of, particularly as this is where you will often find a large portion of the cost of an execution plan.

The types are hash match, merge join, and nested loops, and each can be the fastest way to join two sets of data, depending on the size of the sets, whether they are indexed, and whether they are sorted already or it won’t be too costly to sort them with a sort operator. (There is also an adaptive join operator, introduced in SQL Server 2017, which allows the optimizer to situationally choose between hash match and nested loops. This will be mentioned again in the “Intelligent Query Processing” section later in the chapter.)

A merge join operator takes two large sorted sets of data and merges them together. The query processor can scan each set in order, matching rows from each input with a single pass through the sets. This can be quite fast, but the requirement for ordered sets is where the cost comes in. If you are joining two sets that are indexed on the join column, they are already sorted, so two ordered scans can be merged. The optimizer can also choose to sort one or both inputs to the merge join, but this is often costly.

Hash match is the join operator used to join two large sets to each other, generally when there is no index to use easily and sorting would be too costly. As such, it has the most overhead: a temporary “index” is created, based on an internal hashing algorithm that bucketizes values, to make it easier to match one row to another. This hash structure is kept in memory if possible but may spill onto disk. The presence of a hash join is not necessarily a bad thing, but know that it is the least efficient algorithm to join two data sets together and is chosen precisely because the inputs are not suited to the other two operators.

Note

Understand that when the optimizer does something that seems weird, like sorting sets of data for an internal operation such as a join, it is usually doing so because it has calculated that, for the design of your database, this is the best way to do it.

As we discussed earlier, sometimes complex queries take too long to find the absolute best way to process your query, and sometimes the plans are just horribly wrong, like where statistics are not up to date, for example. This is one reason to watch for plans that have full optimization.

  • Later in the section named “Analyzing cached execution plans” we will look at a technique to analyze cached plans for different scenarios like partial optimization.

Finally, the most efficient join algorithm is the least optimized-sounding one. The nested loops join algorithm is the basic row-by-row processing that people preach you should never do as a programmer. It takes one row from one input and searches the other input for values that meet the join criteria. When you are joining two sets together, and one is indexed on the join key and doesn’t need to fetch additional data using a key lookup, nested loops are very fast. The additional operators were implemented to support larger, ideally reporting-style workloads.

Each of the following could reduce the cost of a join operator.

  • There may be an opportunity to improve the indexing on the columns being joined, or perhaps, you have a join on a compound key that is incompletely defined.

  • In the case of a merge join, you may see a preceding sort operator. This can be an opportunity to present the data already sorted according to how the Merge Join requires the data to be sorted. If this is a composite key, perhaps changing the ASC/DESC property or the order of an index key column in a composite index could remove the sort operator. As usual, test to make sure that the original sort order isn’t used in other places as well.

  • Make sure that you are filtering at the lowest level possible. Perhaps a WHERE clause could exist in a subquery instead of at the top level of the query, or in the definition of a derived table or common table expression (CTE) instead of in the subsequent query.

  • Hash operators are the most expensive. Reducing the row counts going into a hash match or hash join could allow the query optimizer to use a less memory-intensive and less costly join operator.

  • Nested loops are often necessitated by key lookups and are sometimes quite costly. They may no longer be necessary if a new nonclustered index is added to address the Key Lookup or to make an accompanying Index Seek more capable.

Look for Parallel icons

The left-pointing pair of arrows in a yellow circle shown in Figure 14-9 indicate that your query has been run with a parallel-processing execution plan. We talk more about Parallelism later in this chapter, but the important thing here is to be aware that your query has gone parallel.

This figure shows a Clustered Index Scan that has a cost of 18% of a larger query. On the image for the operator are two arrows pointing left, showing that it is a parallel plan.

Figure 14-9 The parallel indicator on a Clustered Index Scan operator.

This doesn’t mean that multiple sources or pipes are being read in parallel; rather, the work for individual tasks has been broken up behind the scenes. The query optimizer decided it would be faster if your workload were split up and run as multiple parallel streams of rows.

You might see one of the three different parallelism operators—the distribute streams, gather streams, and repartition streams operators—each of which appear only for parallel execution plans.

Cardinality estimation

When a query plan is being generated, one of the most important factors you will deal with is the cardinality estimation. Cardinality is defined as the number of items in a set (hence, a cardinal number is a non-negative integer). The importance of cardinality estimation cannot be overstated and is analogous to how you might do a task. If you run a store and ship 3 products a week, you may just have a stack of products, and walk 2 miles to the post office and be efficient enough. If you have to ship 300,000 products a day, the net effect of each product being shipped needs to be the same, but the way you will achieve this must be far more optimized and include more than just one person.

SQL Server makes the same kinds of choices. Say you join table X with table Y on column Id. If X has 1,000 rows, and Y has 10, the solution is easy. If they each have a billion rows, and you are looking for a specific value in table X, say Value = 'Test', there are many choices to make. How many rows in X have a value of 'Test'? Then, once you know that, how many of the Id values in those rows of X will match Id values in Y?

This estimation is done in two ways. The first is with guesses based on histograms of the data. The table is scanned when creating statistics (for example, by executing UPDATE STATISTICS, or by the automatic statistics gathering that happens as data changes, based on the database setting AUTO_UPDATE_STATISTICS). Take the first case, where the tables are small. The engine stores a histogram that contains something like the following:

RANGE_HI_KEY    RANGE_ROWS    EQ_ROWS    DISTINCT_RANGE_ROWS    AVERAGE_RANGE_ROWS…
Smith           400           200        20                     10
Tests           200           120        23                     5

From this, the number of rows that are equal to 'Test' can be guessed to be < 400 (since Test is between Smith and Tests), less than 200, since approximately 200 rows matched Smith, and there are approximately 20 distinct values, so it might be guessed that there are 10 matches. Exactly how this estimation is done is proprietary, but it is definitely important to keep your statistics up to date using maintenance methods (see Chapter 8 for more details on proper upkeep of SQL Server).
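
You can view the histogram SQL Server has stored for a statistics object with DBCC SHOW_STATISTICS; for example (the table and statistics names here are only examples, so substitute your own):

DBCC SHOW_STATISTICS ('Sales.Invoices', 'FK_Sales_Invoices_CustomerID')
    WITH HISTOGRAM;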

For many versions of SQL Server, much the same cardinality estimation algorithms were used. However, starting in SQL Server 2014 and continuing in SQL Server 2019, Microsoft has been tweaking the cardinality estimation significantly with pretty much every release. Example changes in recent versions of SQL Server are that the cardinality estimator understands that the number of rows may have increased since the samples were taken, or that certain column values may be related (values commonly searched together, like automobile maker and model, go together). While SQL Server 2014 is now 3 releases and 5 years old, there are still very large portions of the user base on earlier versions (at this year’s PASS Summit, a big focus from Microsoft was to get people to upgrade from SQL Server 2008 and 2008 R2, which are no longer supported).

The problem with cardinality estimation is that it is an inexact science. The guesses made are frequently imperfect enough to hurt performance, sometimes due to the changes made from version to version, particularly with very large, very complex queries. There are a few methods to control which cardinality estimator is used. The first is the database compatibility level; for example, the following uses the SQL Server 2016 cardinality estimator (and limits everything else to SQL Server 2016 rules as well):

ALTER DATABASE database_name
    SET COMPATIBILITY_LEVEL = 130;

To use the legacy cardinality estimator (the model that dates back to SQL Server 7.0) while the database remains at compatibility level 120 (SQL Server 2014) or higher, you can use the following database-scoped configuration:

ALTER DATABASE SCOPED CONFIGURATION 
    SET LEGACY_CARDINALITY_ESTIMATION = ON;

In compatibility level 130 and higher (SQL Server 2016 SP1), there is a query hint you can use:

SELECT …
FROM …
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

Alternatively, you can use the compatibility level to use the cardinality estimator of that previous version in all your queries. For the most part, we suggest using the latest cardinality estimator possible, and addressing issues in your queries if possible, but realize this is not always a feasible solution.

Understanding parameterization and “parameter sniffing”

SQL Server parameterization occurs when the query optimizer detects values (such as the search criteria of a WHERE clause) that can be parameterized. For example, the statements in a stored procedure are parameterized based on the stored procedure's parameter declaration. A query can be parameterized when it meets certain conditions.

For example, a query may contain a literal value that could vary between executions, such as:

SELECT Value FROM TableName WHERE Value = 'X';

A simple query such as this can be parameterized and the literal 'X' replaced by a parameter, much like if you were writing a stored procedure. By default, a query where you reference more than one table will not be parameterized, but you can change the database setting of PARAMETERIZATION to FORCED and more complex queries will be parameterized. Finally, you can get a parameterized plan by using a variable:

DECLARE @Value varchar(10) = 'X';
SELECT Value FROM TableName WHERE Value = @Value;

This query will be parameterized no matter how complex it is. There are additional ways that a query will be parameterized through client APIs, but the best way to parameterize your queries is with stored procedures.

With parameterization, it’s possible that two potentially helpful or potentially problematic conditions can occur:

  • You can reuse a query plan for multiple queries for which the query text is exactly the same, except for parameterized values.

  • The same query could use the same execution plan for two different values of a parameter, resulting in vastly different performance.

For example, you might create the following stored procedure to fetch the orders that have been placed for goods from a certain supplier:

CREATE OR ALTER PROCEDURE Purchasing.PurchaseOrders_BySupplierId
      @SupplierId int
AS
SELECT PurchaseOrders.PurchaseOrderID,
       PurchaseOrders.SupplierID,
       PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE  PurchaseOrders.SupplierID = @SupplierId;

Note

The CREATE OR ALTER statement is a relatively recent improvement that either creates the object if it doesn’t exist, or alters it if it does. For more detail, see: https://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/11/17/create-or-alter-another-great-language-enhancement-in-sql-server-2016-sp1

Now, the plan for this procedure will depend on which value is passed in on the first compilation. For example, if the larger rowcount query (@SupplierID = 5) runs first and has its query plan cached, the plan will scan the clustered index of the table, because the value of 5 has a relatively high cardinality in the table. If the smaller rowcount query (@SupplierID = 1) runs first, its version of the plan, which uses an index seek and a key lookup, will be cached instead. That plan is different, less efficient for very large row counts, and will be used for all subsequent executions of the parameterized statement.
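
To see this behavior for yourself, you can execute the procedure with the two parameter values in different orders, clearing the plan from cache between tests. The sequence below is only a sketch of the experiment; the clear-cache command is covered later in this chapter and should be run only on a preproduction system.

-- Run the low-cardinality value first; the seek + key lookup plan is cached and reused
EXEC Purchasing.PurchaseOrders_BySupplierId @SupplierId = 1;
EXEC Purchasing.PurchaseOrders_BySupplierId @SupplierId = 5;

-- Clear this database's plan cache, then reverse the order to cache the scan plan instead
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
EXEC Purchasing.PurchaseOrders_BySupplierId @SupplierId = 5;
EXEC Purchasing.PurchaseOrders_BySupplierId @SupplierId = 1;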

Here are a few advanced troubleshooting avenues to alleviate this scenario:

  • You can use the OPTIMIZE FOR query hint to instruct the query optimizer to build the cached execution plan as though a specific provided value had been passed for the parameter (see the sketch after this list). You also can use OPTIMIZE FOR UNKNOWN, which instructs the optimizer to use the average selectivity from the statistics of the underlying data rather than any specific value.

  • The RECOMPILE query hint or procedure option prevents the reuse of a cached plan, forcing a fresh query plan to be generated each time the query is run.

  • You can use the Plan Guide feature (implemented via stored procedures) to guide the query analyzer to a plan currently in cache. You identify the plan via its plan_handle. For information on identifying and analyzing plans in sys.dm_exec_cached_plans, see the upcoming section.

  • You can use the Query Store feature (implemented with a GUI in SQL Server Management Studio and via stored procedures behind the scenes) to visually review plan performance and force a query to use a specific plan that Query Store has captured.

  • Use the USE PLAN query hint to provide the entire XML query plan for any statement execution. This obviously is the least convenient option, and like other approaches that override the query analyzer, you should consider it an advanced and temporary performance tuning technique.

  • Turn off parameter sniffing at the database level using: ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;. This will cause plans to all act like OPTIMIZE FOR UNKNOWN.
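
As a sketch of the first two options above, the hints below could be appended to the SELECT statement inside the earlier procedure; treat the parameter value as an example only.

-- Compile the plan as though the low-cardinality value 1 had been passed
SELECT PurchaseOrders.PurchaseOrderID, PurchaseOrders.SupplierID, PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE  PurchaseOrders.SupplierID = @SupplierId
OPTION (OPTIMIZE FOR (@SupplierId = 1));

-- Or compile a fresh plan on every execution, never reusing a cached one
SELECT PurchaseOrders.PurchaseOrderID, PurchaseOrders.SupplierID, PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE  PurchaseOrders.SupplierID = @SupplierId
OPTION (RECOMPILE);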

Understanding the Procedure Cache

The Procedure Cache is a portion of memory that contains query plans for statements that have been executed. New execution plans enter the Procedure Cache only when a statement is run. If the procedure cache already contains a plan matching a previous run of the current statement, the execution plan is reused, saving valuable time and resources.

This is one reason that complex statements can appear to run faster the second time they are run, other than the fact that data may be cached on a second execution.

The Procedure Cache is empty when the SQL Server service starts and grows from there. SQL Server manages plans in the cache, removing them as necessary under memory pressure. The size of the Procedure Cache is managed by SQL Server and falls within the memory space configured for the server by the Max Server Memory configuration setting. Plans are removed based on their cost and how recently they have been used. Smaller, older plans and single-use plans are the first to be cleared.

Clearing the Procedure Cache

You might find that manually clearing the Procedure Cache is useful when performance testing or troubleshooting. Typically, you want to reserve this activity for preproduction systems.

There are a few common reasons to clear out cached plans in SQL Server. For example, one common reason is to compare two versions of a query or the performance of a query with different indexes—you can clear the cached plan for the statement to allow for proper comparison.

Note

While this can be a good thing to try, what you are testing is not only your query, but also your hardware’s ability to fetch data from the disk. When you look at the output of SET STATISTICS IO ON, the Logical Reads measurement gives you an accurate comparison of the two query versions’ reads of data from memory. The presence of Physical Reads tells you that data the query needed was not in cache. Higher numbers of physical reads indicate that the server may not have enough RAM to hold everything the workload needs.

You can manually flush the entire Procedure Cache, or individual plans in cache, with the following database-scoped configuration command, which affects only the current database context, as opposed to the entire instance’s procedure cache:

ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;

This command was introduced in SQL Server 2016 and is effectively the same as running DBCC FREEPROCCACHE within the current database context. It works in both SQL Server and Azure SQL Database, whereas DBCC FREEPROCCACHE is not supported in Azure SQL Database; consider moving away from DBCC FREEPROCCACHE going forward in favor of the ALTER DATABASE SCOPED CONFIGURATION syntax.

Caution

Avoid clearing the Procedure Cache in a live production environment during normal business hours. Doing so will cause all new statements to have their execution plans compiled, dramatically increasing processor utilization and potentially dramatically slowing performance.

You can also remove a single plan from cache by identifying its plan_handle and then providing it as the parameter to the ALTER DATABASE SCOPED CONFIGURATION statement. Perhaps this is a plan you would like to remove for testing or troubleshooting purposes, identified with the script in the upcoming section:

ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE 0x06000700CA920912307B86
7DB701000001000000000000000000000000000000000000000000000000000000;

You can alternatively flush the cache by object type. This command clears cached execution plans that are the result of ad hoc statements and prepared statements (from applications, using sp_prepare, typically through an API):

DBCC FREESYSTEMCACHE ('SQL Plans');

The advantage of this statement is that it does not wipe the cached plans from “Programmability” database objects such as stored procedures, multi-statement table-valued functions, scalar user-defined functions, and triggers. The following command clears the cached plans from those types of objects:

DBCC FREESYSTEMCACHE ('Object Plans');

Note that DBCC FREESYSTEMCACHE is not supported in Azure SQL Database.

You can also use DBCC FREESYSTEMCACHE to clear cached plans associated with a specific Resource Governor resource pool, as follows:

DBCC FREESYSTEMCACHE ('SQL Plans', 'poolname');
Analyzing cached execution plans

You can analyze execution plans in aggregate starting with the dynamic management view sys.dm_exec_cached_plans, which contains a column named plan_handle. The plan_handle column contains a system-generated varbinary(64) string that can be used with a number of other dynamic management views. As seen in the code example that follows, you can use the plan_handle to gather information about aggregate plan usage, plan statement text, and to retrieve the graphical execution plan itself.

You might be used to viewing the graphical execution plan only after a statement is run in SSMS, but you can also analyze and retrieve plans for queries executed in the past by using the following query against a handful of dynamic management objects (DMOs). These DMOs return data for all databases in SQL Server instances, and for the current database in Azure SQL Database. The query can be used to analyze different aspects of cached execution plans. Note that it may take a considerable amount of time to run as written, so you may wish to pare down what is being output for your normal usage.

SELECT
    p.usecounts AS UseCount,
    p.size_in_bytes / 1024 AS PlanSize_KB,
    qs.total_worker_time/1000 AS CPU_ms,
    qs.total_elapsed_time/1000 AS Duration_ms,
    p.cacheobjtype + ' (' + p.objtype + ')' as ObjectType,
    db_name(convert(int, txt.dbid )) as DatabaseName,
    txt.ObjectID,
    qs.total_physical_reads,
    qs.total_logical_writes,
    qs.total_logical_reads,
    qs.last_execution_time,
    qs.statement_start_offset as StatementStartInObject,
    SUBSTRING (txt.[text], qs.statement_start_offset/2 + 1,
      CASE WHEN qs.statement_end_offset = -1
           THEN LEN (CONVERT(nvarchar(max), txt.[text]))
           ELSE qs.statement_end_offset/2 - qs.statement_start_offset/2 + 1 END)
      AS StatementText,
    qp.query_plan as QueryPlan,
    aqp.query_plan as ActualQueryPlan
FROM sys.dm_exec_query_stats AS qs
INNER JOIN sys.dm_exec_cached_plans p ON p.plan_handle = qs.plan_handle
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) AS txt
OUTER APPLY sys.dm_exec_query_plan (p.plan_handle) AS qp
OUTER APPLY sys.dm_exec_query_plan_stats (p.plan_handle) AS aqp
--tqp is used for filtering on the text version of the query plan
CROSS APPLY sys.dm_exec_text_query_plan(p.plan_handle, qs.statement_start_offset,
    qs.statement_end_offset) AS tqp
WHERE txt.dbid = db_id()
ORDER BY qs.total_worker_time + qs.total_elapsed_time DESC;

Note that the preceding query orders by a sum of the CPU time and duration, descending, returning the longest running queries first. You can adjust the ORDER BY and WHERE clauses in this query to hunt, for example, for the most CPU-intensive or most busy execution plans. Keep in mind that the Query Store feature, as detailed later in this chapter, will help you visualize the process of identifying the most expensive and longest running queries in cache.

As you will see after running the previous query, you can retrieve a wealth of information from these DMOs, including statistics for each statement within an object that generated a query plan. The query plan appears as a blue hyperlink in SQL Server Management Studio’s Results To Grid mode, opening the plan as a new .sqlplan file. You can save and store the .sqlplan file for later analysis. Note too that this query may take quite a long time to execute, as it returns a row for every statement in each cached query.

For more targeted analysis, you can add code to search only for queries that have certain details in the plan, for example, plans that have a Reason For Early Termination. In the execution plan XML, this appears as the StatementOptmEarlyAbortReason attribute. Simply add the search condition before the ORDER BY in the script, using the following logic:

and tqp.query_plan LIKE '%StatementOptmEarlyAbortReason%'

Included in the query is sys.dm_exec_query_plan_stats, which is new in SQL Server 2019 and will return the actual plan, with the details that are added to the plan after an execution.

Permissions required to access cached plan metadata

The only permission needed to run the previous query in SQL Server is the server-level VIEW SERVER STATE permission, which might be appropriate for developers to have access to in a production environment because it does not give them access to any data in user databases.

In Azure SQL Database, because of the differences between the Basic/Standard and Premium tiers, different permissions are needed. In the Basic/Standard tier, you must be the server admin or Azure Active Directory Admin to access objects that would usually require VIEW SERVER STATE. In Premium tier, you can grant VIEW DATABASE STATE in the intended database in Azure SQL Database to a user who needs permission to view the above DMVs.

Understanding parallelism

Parallelism in query processing (let alone in general computing) is a complex topic, but luckily much of the complexity of parallelism in SQL Server is hidden from the DBA and programmer. A query that uses parallelism and one that doesn’t can be the same query, with the same plan (other than allowing one or more operators to work in parallel). When SQL Server decides to split the data needed for a request into multiple streams, it uses more than one processor to get the job done. The number of parallel threads used for the query is called the degree of parallelism. Because parallelism can never exceed the number of logical processors, the maximum degree of parallelism (MAXDOP) is naturally capped.

The main job of the DBA is to tune the MAXDOP for the server, database, and individual queries when the defaults don’t behave well. On a server with a mixed load of OLTP and analytics workloads, some larger analytics queries can overpower other active users.

The max degree of parallelism is set at the server level using the server properties UI in SSMS, or more commonly using the sp_configure system stored procedure. In SQL Server 2019 Setup, there is a MaxDOP tab on the Database Engine Configuration page that calculates what MAXDOP probably ought to be for your server configuration. In previous versions, the default was 0 (allowing all processors to be used by a single statement).
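
A minimal sketch of setting the server-level value with sp_configure follows; the value of 8 is only an example, and the right number depends on your processor and NUMA layout.

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max degree of parallelism', 8;  -- example value only
RECONFIGURE;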

Parallelism is a seemingly magical way to make queries run faster (most of the time), but even seeming magic comes at a price. While a query may perform fastest in a vacuum when it goes massively parallel, the overuse of parallelism creates a multithreading bottleneck at scale with multiple users. Split into too many parts, queries slow down en masse as CPU utilization rises and SQL Server records increasing values in the CXPACKET wait type.

  • We talk about CXPACKET here, but for more about wait type statistics, see Chapter 8.

Until SQL Server 2016, MAXDOP was only a server-level setting, a setting enforced at the query level, or a setting applied to sessions selectively via Resource Governor, an Enterprise edition feature. Since SQL Server 2016, the MAXDOP setting is also available as a database-scoped configuration. You can also use the MAXDOP query hint in any statement to override the database- or server-level MAXDOP setting.
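
For example, the database-scoped configuration and the query hint look like the following; the values shown are placeholders rather than recommendations.

-- Database-scoped setting (SQL Server 2016 and later, Azure SQL Database)
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;

-- Query-level override
SELECT …
FROM …
OPTION (MAXDOP 1);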

Setting a reasonable value for MAXDOP determines how many CPUs can be used to execute a query, but there is another setting that determines which queries are allowed to use parallelism at all. This is the Cost Threshold for Parallelism (CTFP), which enforces a minimum estimated query cost before a query can use a parallel execution plan. The higher the threshold, the fewer queries go parallel. This setting is fairly low by default, but its proper value in your environment depends heavily on the workload and processor count. More expensive queries usually benefit from parallelism more than simpler queries, so limiting the use of parallelism to the most expensive queries in your workload can help. Conversely, setting the CTFP too high carries an opportunity cost: queries are executed serially and CPU cores go underutilized. Note that CTFP is a server-level setting only.
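
CTFP is also changed with sp_configure; the threshold of 50 below is just an illustrative starting point, not a recommendation for every workload.

EXEC sys.sp_configure 'cost threshold for parallelism', 50;  -- example value only
RECONFIGURE;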

If large queries are already a problem for performance and multiple large queries regularly run simultaneously, raising the CTFP might not solve the problem. In addition to the obvious solutions of query tuning and index changes, it may be worth it to include the use of columnstore indexes for analytic queries and using MAXDOP as a hint instead to limit some very large queries from taking over your server.

A possible indication that parallelism is an issue is when CXPACKET is a dominant wait type experienced over time by your SQL Server. You may need to adjust both MAXDOP and CTFP when performance tuning. You can also view the live and last wait types for a request using sys.dm_exec_requests. Make these changes in small, measured steps, and don’t overreact to performance problems with a small number of queries. Use Query Store to benchmark and trend the performance of high-value and high-cost queries as you change configuration settings.
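
A quick way to peek at what currently running requests are waiting on is sketched below; the columns used here all exist in sys.dm_exec_requests.

SELECT r.session_id, r.status, r.command, r.wait_type, r.last_wait_type, r.wait_time
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID;  -- exclude this query's own session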

Another flavor of CPU pressure, and in some ways the opposite of the CXPACKET wait type, is the SOS_SCHEDULER_YIELD wait type. SOS_SCHEDULER_YIELD is an indicator of CPU pressure, showing that SQL Server tasks had to share time or “yield” to other CPU tasks, which may be normal and expected on busy servers. Whereas CXPACKET is SQL Server complaining about too many threads working in parallel, SOS_SCHEDULER_YIELD is the acknowledgment that there were more runnable tasks than the available schedulers could serve at once. In either case, first take the strategy of reducing CPU-intensive queries and rescheduling or optimizing CPU-intensive maintenance operations. This is more economical than simply adding CPU capacity.

Forcing a parallel execution plan

Released with SQL Server 2017 (and also implemented in SQL Server 2016 CU2 and later) is a query hint that can force a statement to compile with a parallel execution plan. This can be valuable in troubleshooting, or to force a behavior in the query optimizer, but is not usually a necessary or recommended option for live code.

Appending the following hint to a query will force a parallel execution plan, which you can see using the Estimate or Actual execution plan output options:

OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));

Note

The presence of certain system variables or functions can force a statement to compile to be serial, that is, without any parallelism. This behavior will override the new ENABLE_PARALLEL_PLAN_PREFERENCE option.

The @@TRANCOUNT system variable will force a serial plan, as will any of the built-in error reporting functions, including ERROR_LINE(), ERROR_MESSAGE(), ERROR_NUMBER(), ERROR_PROCEDURE(), ERROR_SEVERITY(), or ERROR_STATE(). Note that this pertains only to using these objects in a query. Using them in the same batch, such as in a TRY...CATCH handler, will not affect the execution plans of other queries in the batch.

Understanding advanced engine features for tuning queries

In the past few versions of SQL Server and Azure SQL DB, the programmers building the core relational engine have been adding especially advanced features, moving from the same cost-based optimizations we have had for many years to tools that can sense when plans need to be adjusted before a DBA does.

None of these features will replace well-written code and a DBA who understands the architecture of how queries work, but as data needs explode, the more the engine can do for you, the better.

Plan Guides and Query Store

Plan Guides and Query Store are two complementary, and slightly overlapping, technologies that will help you tune your queries.

Plan Guides first appeared in SQL Server 2005, and allow you to influence or set a particular plan for a certain query string. You can influence a plan by simply adding a hint (a common example would be OPTION (RECOMPILE)) or by providing an entire plan for a query. This can be very useful if SQL Server is choosing a plan that doesn’t work for you and you have no way to change the query in question. A common example is parameter sniffing, which was discussed earlier in the chapter, but other scenarios exist, like complex queries that don’t work well with a new version of the query optimizer, even though the rest of your system is performing better.

The Query Store feature is a more complete tuning solution, which captures the plans from the plan cache in your local database and allows you to see how a query performs over time. Where it overlaps with Plan Guides is that you can override the query plan chosen when a previous plan worked well but the current one does not.

In this section, we will review aspects of both tools, to help guide you as to which tool to choose. Note however, that tools that force a plan to override what the optimizer has chosen are not considered the best approach to query tuning. If you have an application where you own the source code, forcing a plan might be good to do until you can make a change to code, but should not be your primary tuning tool. If it is a third-party application, you should work with your vendor on a proper solution, but these features will help you to get past a serious issue.

Using Plan Guides

Plan Guides are often maligned because they are complex to work with, and for good reason. There was no UI for them initially, and the UI that exists in SSMS does not really guide you in how they are used. However, once you get the basics down, they allow you to optimize queries when you cannot change the text of a query or coded object.

What makes them powerful is that while you can replace the entire plan of a query, you can also just influence the plan with a query hint, either in a coded object like a stored procedure, or in a textual query that executes outside of an object. You can also affect how an ad hoc query is parameterized.

There are three types of Plan Guides we will cover in the following sections: Object Plan Guides, SQL Plan Guides, and Template Plan Guides.

You can create a Plan Guide using the UI in the Programmability section of the database, or with the following stored procedures, which will be demonstrated.

  • sp_create_plan_guide – allows you to create the plan guide from the text of the query.

  • sp_create_plan_guide_from_handle – allows you to create the plan guide from the plan handle you have retrieved from another DMV or system object.

  • sp_control_plan_guide – used to drop, disable, and enable one or all plan guides in a database.

In the following examples, we tweak the plans of the queries with hints, but if you have a complete plan, you can use the @hints parameter to supply the entire query plan to use instead. However, we generally suggest doing that with Query Store, unless you are doing very advanced tuning and are taking an existing plan and tweaking it. Doing so is beyond the scope of this book.

Object Plan Guides

An Object Plan Guide lets you override the plan of a query inside a coded object. For example, consider the procedure we used in the parameterization section earlier:

CREATE OR ALTER PROCEDURE Purchasing.PurchaseOrders_BySupplierId
      @SupplierId int
AS
SELECT PurchaseOrders.PurchaseOrderID,
       PurchaseOrders.SupplierID,
       PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE  PurchaseOrders.SupplierID = @SupplierId;

In our example, this procedure is not changeable (it came as part of a third-party system), and you want to make sure it optimizes for a lower-cardinality value. Sometimes the procedure has been optimized for a parameter value that produced a poorly executing plan, and performance suffers because most of your executions would benefit from the lower-cardinality plan. That optimal plan uses an index seek and a key lookup, which is great for a few rows, though it is more costly on the few occasions when you retrieve the higher-cardinality set of rows.

You apply the hint using the following call, where you provide the query to be guided and the hint to add. The query text must match the statement in the object, though differences such as white space are ignored (for complete details, see: https://docs.microsoft.com/sql/relational-databases/system-stored-procedures/sp-create-plan-guide-transact-sql).

EXECUTE sp_create_plan_guide
    @name = N'SP-Purchasing.PurchaseOrders_BySupplierId_AddOption',
    @stmt = N'SELECT PurchaseOrders.PurchaseOrderID,
       PurchaseOrders.SupplierID,
       PurchaseOrders.OrderDate
 FROM   Purchasing.PurchaseOrders
WHERE  PurchaseOrders.SupplierID = @SupplierId',
    @type = N'OBJECT',
    @module_or_batch = N'Purchasing.PurchaseOrders_BySupplierId',
    @params = NULL,  --NULL for an object, since the parameter is declared on the procedure
    @hints = N'OPTION (OPTIMIZE FOR (@SupplierId = 13))';

To drop the plan guide, you can execute:

EXEC sp_control_plan_guide 'DROP', N'SP-Purchasing.PurchaseOrders_BySupplierId_AddOption';

You can check to see whether your Plan Guide is being used by looking at the properties of the query plan; if it is, you will see the PlanGuideDB and PlanGuideName properties.
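
You can also list the plan guides defined in the current database by querying the sys.plan_guides catalog view, as in this brief sketch.

SELECT pg.name, pg.scope_type_desc, pg.is_disabled, pg.hints
FROM sys.plan_guides AS pg;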

SQL Plan Guides

A SQL Plan Guide works the same way as an Object Plan Guide, except that it matches an ad hoc statement rather than a statement inside a coded object. For example, say the query from the stored procedure we just looked at was actually an ad hoc SQL call. If it is a call that a user types in by hand, it would be almost impossible to use a plan guide, since the query must match character for character, including white space.

But if it is an application, it will use a parameterized version of the query call, such as the following:

EXECUTE sp_executesql @stmt = N'SELECT PurchaseOrders.PurchaseOrderID,
                 PurchaseOrders.SupplierID,
                 PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE PurchaseOrders.SupplierID = @SupplierId',
@params = N'@SupplierId int',
@SupplierId = 2;

You can easily create a plan guide to handle this query, because the application will always be sending the query with the exact same text:

EXECUTE sp_create_plan_guide
      @name = N'SQL-Purchasing_PurchaseOrders$BySupplierId_NoParallelism',
      @stmt = 'SELECT PurchaseOrders.PurchaseOrderID,
                      PurchaseOrders.SupplierID,
                      PurchaseOrders.OrderDate
FROM   Purchasing.PurchaseOrders
WHERE PurchaseOrders.SupplierID = @SupplierId',
      @type = N'SQL',
      @module_or_batch = NULL,
      @params = '@SupplierId int',
      @hints = N'OPTION (OPTIMIZE FOR (@SupplierId = 2))';
Template Plan Guide

Finally, you can use a Template Plan Guide to apply a parameterization hint, making a single query form either parameterize or not parameterize, independent of the database's PARAMETERIZATION setting. In this example, assume forced parameterization is turned off, so the following query will not get a parameterized plan:

USE WideWorldImporters;
SELECT * FROM Sales.Orders AS o
INNER JOIN Sales.OrderLines AS ol
    ON ol.OrderID = o.OrderID
WHERE o.OrderId = 679;

You can make queries of this form parameterized by using the following code. The sp_get_query_template procedure formats and parameterizes the query to make it easier to match. Then we create a plan guide so that further calls issuing the same SELECT statement will be parameterized, instead of compiling an individual plan for each value of OrderId, and without making all complex statements parameterized.

DECLARE @stmt nvarchar(max);
DECLARE @params nvarchar(max);
EXEC sp_get_query_template
    N'SELECT * FROM Sales.Orders AS o
      INNER JOIN Sales.OrderLines AS ol
           ON ol.OrderID = o.OrderID
      WHERE o.OrderId = 679;  ',
    @stmt OUTPUT,
    @params OUTPUT;
EXEC sp_create_plan_guide N'TemplateGuide1',
    @stmt,
    N'TEMPLATE',
    NULL,
    @params,
    N'OPTION (PARAMETERIZATION FORCED)';
Using the Query Store feature

The Query Store provides a practical history of execution plan performance for a single database, which persists even when the server has been restarted (unlike the plan cache itself, which is cleared, eliminating all of the interesting data that one needs for tuning queries over time.) It can be invaluable for the purposes of investigating and troubleshooting sudden negative changes in performance, by allowing the administrator or developer to identify both high-cost queries and the quality of their execution plans, and especially where the same query has multiple plans, where one performs poorly, and the other well.

The Query Store is most useful for looking back in time at the history of statement execution. It can also assist in identifying and overriding execution plans by using a feature similar to, but different from, Plan Guides. As discussed in the previous section, Plan Guides are used to add guidance to a query’s optimization or parameterization, or even to replace the entire plan. The Query Store lets you find plans that are not working well, but only gives you the ability to force an entire plan that worked better from the history it has stored. The Query Store has a major benefit over Plan Guides, in that there is a user interface in SSMS to access it, see the benefits, and find places where you may need to force a plan.

You see live Query Store data as it happens, drawn from a combination of in-memory and on-disk sources. Query Store minimizes overhead and performance impact by capturing cached plan information to an in-memory data structure. The data is “flushed” to disk at an interval defined by the Query Store’s Data Flush Interval setting, 15 minutes by default. This interval also defines how much Query Store data can be lost in the event of an unexpected system shutdown.

Note

Queries are captured in the context of the database where the query is executed. In the following code example, the query’s execution is captured in the Query Store of the WideWorldImporters database.

USE WideWorldImporters;
GO
SELECT *
FROM AdventureWorks.[Purchasing].[PurchaseOrders];

The Query Store is a feature that Microsoft delivered to the Azure SQL Database platform first, and then to the SQL Server product. In fact, Query Store is at the heart of the Azure SQL Database Advisor feature that provides automatic query tuning. The Query Store feature’s overhead is quite manageable, tuned to avoid performance hits, and is already in place on millions of databases in Azure SQL Database.

The VIEW DATABASE STATE permission is all that is needed to view the Query Store data.

Initially configuring the query store

The Query Store feature is identical between Azure SQL DB and SQL Server in operation, but not in how you activate it. Query Store is turned on automatically in Azure SQL Database, but it is not automatically on for new databases in SQL Server. While it is possible to enable it on the model database so that all new databases inherit Query Store turned on, we suggest against this, because it is best to think about your needs and tailor the settings when you create each database.

When should you enable Query Store? Enabling Query Store on all databases you have in your environment is a generally acceptable practice, as it will be useful in discovering performance issues in the future when they arise. You can turn on Query Store via the database Properties dialog box, in which Query Store is a page on the menu on the left. Or, you can turn it on via T-SQL by using the following command:

ALTER DATABASE [DatabaseOne] SET QUERY_STORE = ON;

This will turn on Query Store with the defaults and you can adjust them using the UI.
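
If you prefer to adjust the options in T-SQL rather than the UI, the sketch below sets several of the settings discussed in this section; the values shown mirror the SQL Server 2019 defaults described later and are examples, not recommendations.

ALTER DATABASE [DatabaseOne] SET QUERY_STORE (
    OPERATION_MODE = READ_WRITE,
    QUERY_CAPTURE_MODE = AUTO,
    MAX_STORAGE_SIZE_MB = 1000,
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    SIZE_BASED_CLEANUP_MODE = AUTO
);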

Note

As with pretty much any configuration task, while it is OK to use the UI the first time or two, it is always better to have a script in T-SQL, PowerShell, or similar to capture any settings in a repeatable manner. Use the Script button found in most of the dialog boxes in SSMS to output a script of what has changed when you are setting new values.

Query Store begins collecting when you activate it. You will not have any historical data when you first turn on the feature on an existing database, but you will begin to immediately see data for live database activity. You can then view plans and statistics about the plan in the Query Store reports.

In versions of SQL Server prior to SQL Server 2019, the Query Store Capture Mode default setting was All, which captured every query that was executed. In SQL Server 2019, this default has been changed to Auto, and for earlier versions it is best to set it the same way, because the additional data from one-use plans might not be useful and can reduce the amount of historical data that can be retained.

Note

The Query Store data is stored in the user database. It is backed up and restored along with the database.

The Query Store retains data up to two limits: a Max Size (1000 MB by default in SQL Server 2019, 100 MB in earlier versions), and a Stale Query Threshold time limit in days (30 by default). If Query Store reaches its Max Size, it cleans up the oldest data. Because Query Store data is saved on disk, its historical data is not affected by the commands we looked at earlier in this chapter to clear the Procedure Cache, such as DBCC FREEPROCCACHE.

You should almost always keep the Size Based Cleanup Mode set to the default Auto. If not, when the Max Size is reached, Query Store will stop collecting data and enter Read Only mode, which does not collect new data. If you find that the Query Store is not storing more historical days of data than your Stale Query Threshold setting in days, increase the Max Size setting.
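
You can check the Query Store’s current state, size, and limits for a database with the sys.database_query_store_options catalog view, as in this sketch.

SELECT actual_state_desc, readonly_reason,
       current_storage_size_mb, max_storage_size_mb,
       query_capture_mode_desc, size_based_cleanup_mode_desc
FROM sys.database_query_store_options;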

Using query store data in your troubleshooting

Query Store has several built-in dashboards, shown in Figure 14-10, to help you examine query performance and overall performance over recent history.

A snippet of SQL Server Management Studio, showing the Object Explorer list of built-in Query Store dashboards, including: Regressed Queries, Overall Resource Consumption, Top Resource Consuming Queries, Queries With Forced Plans, Queries With High Variation, Query Wait Statistics, and Tracked Queries.

Figure 14-10 The SQL Server Object Explorer list of built-in dashboards available for Query Store in SQL Server Management Studio 18.

You can also write your own reports against the collection of system DMOs that present Query Store data to administrators and developers who hold the VIEW DATABASE STATE permission. You can view the schema of the well-documented views and their relationships at https://docs.microsoft.com/sql/relational-databases/performance/how-query-store-collects-data.
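
As a starting point, the following sketch joins the core Query Store views to list the plans with the highest average duration; the view and column names come from the documented schema, but tailor the aggregation and filters to your needs.

SELECT TOP (20)
       q.query_id,
       t.query_sql_text,
       p.plan_id,
       AVG(rs.avg_duration) AS avg_duration_microseconds
FROM sys.query_store_query AS q
INNER JOIN sys.query_store_query_text AS t ON t.query_text_id = q.query_text_id
INNER JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
INNER JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
GROUP BY q.query_id, t.query_sql_text, p.plan_id
ORDER BY AVG(rs.avg_duration) DESC;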

On many of the dashboards, there is a button with a crosshairs symbol, as depicted in Figure 14-11. If a query seems interesting, expensive, or is of high value to the business, you can click this button to view a new screen that tracks the query when it’s running as well as various plans identified for that query.

A snippet of the query store tool bar that shows you how to track a specific query in the Query Store

Figure 14-11 The Query Store tool bar at the top of the screen on many of the dashboards, in this example, the tool bar for the Regressed Queries report.

You can review the various plans for the same statement, compare the plans, and if necessary, force a plan that you believe is better than the optimizer will choose, into place. Compare the execution of each plan by CPU Time, Duration, Logical Reads, Logical Writes, Memory Consumption, Physical Reads, and several other metrics.

Most of all, the Query Store can be valuable for informing you when a query started using a new plan. You can see when a plan was generated and the nature of the plan; however, the cause of the plan’s creation and replacement is not easily answered, especially when you cannot correlate it to a specific DDL operation or system change. Query plans can become invalidated automatically because of large changes in statistics after data inserts or deletes, changes made to other statements in the stored procedure, changes to any of the indexes used by the plan, or manual recompilation due to the RECOMPILE option.

As discussed in the Plan Guides section, forcing a statement (see Figure 14-11) to use a specific execution plan is not a recommended everyday activity. If you have access to the source code, work on a code change quickly and use a forced plan only temporarily. For systems where you have no code access, you can use a forced plan or a plan guide for specific performance cases, problematic queries demanding unusual plans, and so on. Note that if the forced plan becomes invalid, such as when an index it relies on is changed or dropped, SQL Server will move on without the forced plan and without a warning or error, although Query Store will still show that the plan is being forced for that statement.
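
Forcing and un-forcing a plan can also be done in T-SQL with the Query Store stored procedures; the query_id and plan_id values below are placeholders that you would look up in the Query Store reports or views.

EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 17;   -- placeholder ids
EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 17; -- placeholder ids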

Image showing query performance in the Tracked Queries dialog from 7:49 PM to 8:05 PM. There are 3 executions that take over 140 ms, and 3 that are under 50 ms. Two of the calls that took 40 ms use a forced plan, shown with a circle that has a check over it.

Figure 14-12 The Query Store has recorded the execution results of the query.

Note that one plan has been Forced (using the Force Plan button) for this statement and is displayed with a check mark.

Automatic Plan Correction

Automatic Plan Correction is a tool capable of detecting and reverting plan regression. For example, a commonly executed query runs in 100 milliseconds, but then starts running in 2 minutes. Instead of waiting for complaints about slow performance, the engine can notice this regression and deal with it. SQL Server 2017 introduced this feature to the on-premises version of the relational engine; it was originally released for the Azure SQL Database platform, where the DBA is given considerably less control over tuning than in SQL Server.

In SQL Server 2016, you could use Query Store to manually identify a query that regressed in performance, and then force a past execution plan into use. With Automatic Plan Correction, the database can be configured to detect plan regression and take this action automatically. The sample syntax for enabling automatic plan correction is below:

ALTER DATABASE WideWorldImporters SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

In both SQL Server 2017 and 2019, FORCE_LAST_GOOD_PLAN is the only option for automatic plan tuning.

The DMO sys.dm_db_tuning_recommendations captures plan recommendations based on query performance regression. This doesn’t happen immediately; the feature has an algorithm that requires several executions before regression is identified. When a recommendation appears in sys.dm_db_tuning_recommendations, it includes a large amount of diagnostic data, including a plain-language “reason” explanation for the recommendation and a block of JSON data containing diagnostic information. A sample query to parse this data is available at https://docs.microsoft.com/sql/relational-databases/automatic-tuning/automatic-tuning.
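
If you just want a quick look at what, if anything, has been recommended, a minimal sketch such as the following works; the documentation link above shows how to shred the JSON in the details column to retrieve the forcing script.

SELECT name, reason, valid_since, state, details
FROM sys.dm_db_tuning_recommendations;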

Intelligent Query Processing

Intelligent Query Processing is an umbrella term for a set of features that make the processing of queries more efficient in some manner. Some features help pick or adapt a query plan to current conditions. Others are straightforward changes to how the SQL Server relational engine handles long-existing query constructs. In every case, though, the goal is to change the way queries are processed to give you better performance without much, if any, change to your code. All of these features are available in SQL Server 2019, but only some are available in SQL Server 2017 (as noted in each section).

Adaptive Query Processing

Adaptive Query Processing is a set of features that will allow SQL Server to adapt a given query to new conditions from when the query was originally compiled or last executed. The type of adaptations that can be made currently include:

  • Batch Mode Adaptive Joins – In a query that is executing in batch mode (that is, dealing with more than 1 row at a time), the query can use an adaptive join operator that lets the choice of hash or nested loops join be made after one of the inputs has been scanned. So, if it is a small, selective set, the nested loops join may work better, for example. Or if the two input sets are huge, hash match may be the better choice.

  • Memory Grant Feedback – When a query executes, it uses some amount of memory. The Memory Grant Feedback feature lets future executions of the same query know whether the memory granted for the execution was too much or too little, and adjusts future executions accordingly. This was available in SQL Server 2017 for batch mode executions, and in SQL Server 2019 it is also available for row mode executions.

    This feature can be turned on and off for batch mode queries without breaking compatibility using the following statements:

    -- SQL Server 2017
    
    ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_BATCH_MODE_MEMORY_GRANT_FEEDBACK = OFF;
    
    -- Azure SQL Database, SQL Server 2019
    
    ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = ON;

    For row mode queries, this feature is controlled using:

    ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;
  • Interleaved Execution – Introduced in SQL Server 2017, this feature currently helps with plans that make use of multi-statement table-valued functions (MSTVFs). Prior to SQL Server 2014, when a plan was generated for a query that used an MSTVF, the number of rows returned was estimated as the literal value of 1. Subsequent versions use 100 as the estimate, but neither estimate is very good unless you actually return very few rows. Interleaved execution lets the optimizer execute parts of the query during optimization to get better estimates, because if your MSTVF is actually going to output 100,000 rows, the plan needs to be considerably different from what the 100-row estimate would have produced.

    Interleaved execution can be controlled using the following statements:

    -- SQL Server 2017
    
    ALTER DATABASE SCOPED CONFIGURATION SET DISABLE_INTERLEAVED_EXECUTION_TVF = ON; --or OFF
    
    -- Azure SQL Database and SQL Server 2019
    
    ALTER DATABASE SCOPED CONFIGURATION SET INTERLEAVED_EXECUTION_TVF = OFF; --or ON
Table Variable Deferred Compilation

Much like interleaved execution of MSTVFs, Table Variable Deferred Compilation deals with the lack of statistics for a table variable at compile time, where the standard guess was that the table variable would contain 1 row. This produced poorly performing plans if the programmer had stored thousands (or more) of rows in the table variable.

Table Variable Deferred Compilation, instead of using the guess of 1 row to build the query plan, waits to complete the actual plan until the table variable has been loaded the first time, and then the rest of the plan is generated. This feature may be one of the most welcome for DBAs who have had performance problems with very large table variables, despite the fact that they sound like they should be lightweight, highly performant tools for programmers to use.
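
This behavior is on by default in compatibility level 150. If it causes a regression for a particular workload, it can be turned off without lowering the compatibility level using the database-scoped configuration sketched below (verify the option name against the documentation for your build).

ALTER DATABASE SCOPED CONFIGURATION SET DEFERRED_COMPILATION_TV = OFF;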

Note

Just because table variables are now optimized better does not make them the best choice for large numbers of rows. Table variables still lack column statistics, a key difference between them and temp tables (prefixed with # or ##) that can make temp tables far superior.

Batch Mode on Rowstore

One of the features added to SQL Server along with columnstore indexes was a type of processing known as batch mode. Typical processing in SQL Server had always used a method that sounds terrible: row by row, now referred to as row mode processing. Columnstore indexes were built to process large numbers of rows, so the engine was updated to work on batches of rows at a time when the heuristics tell the query processor it is worth it.

In SQL Server 2019, this feature is extended to work for certain types of queries against rowstore tables and indexes as well as columnstore tables and indexes. A few examples:

  • Queries that use large quantities of rows in a table, often in analytical queries touching hundreds of thousands of rows.

  • Systems that are CPU bound in nature. (I/O bottlenecks are best handled with a columnstore index).

The feature is enabled in SQL Server and Azure SQL DB when the compatibility level is 150, but if you find it is harming performance, you can turn it off using ALTER DATABASE:

ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

If you then want to allow this feature only for a specific query that touches a large number of rows, you can use the ALLOW_BATCH_MODE query hint:

SELECT …
FROM …
OPTION (RECOMPILE, USE HINT('ALLOW_BATCH_MODE'));
T-SQL scalar User Defined Functions (UDF) inlining

Of all the features that can make a negative impact on performance, T-SQL scalar UDFs have generally been the worst. Every programmer who has taken any class in programming instinctively wants to modularize their code. So if you have a scenario where you want to classify some data (say something as simple as CASE WHEN value = 1 THEN 'True' ELSE 'False' END), it makes sense to bundle this up into a coded module. So, when user-defined functions were introduced, there were shouts of joy, until performance was taken into consideration.

In WideWorldImporters, we create the following, extremely simple UDF to do exactly what was described:

USE WideWorldImporters;
GO
CREATE SCHEMA Tools;
GO
CREATE FUNCTION Tools.Bit_Translate
(
     @value bit
)
RETURNS varchar(5)
AS
 BEGIN
     RETURN (CASE WHEN @value = 1 THEN 'True' ELSE 'False' END);
 END;

Execute this function in the following query, once in compatibility level 140 (SQL Server 2017) and again in 150 (SQL Server 2019).

SET STATISTICS TIME ON;
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 140; --SQL Server 2017
GO
SELECT Tools.Bit_Translate(IsCompressed) AS CompressedFlag,
       CASE WHEN IsCompressed = 1 THEN 'True' ELSE 'False' END AS CompressedFlag_Desc
FROM   Warehouse.VehicleTemperatures;
GO
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 150; --SQL Server 2019
GO
SELECT Tools.Bit_Translate(IsCompressed) AS CompressedFlag,
       CASE WHEN IsCompressed = 1 THEN 'True' ELSE 'False' END
FROM   Warehouse.VehicleTemperatures;

On the 65,998 rows that are returned, you will likely not notice a difference in performance. Checking the output from SET STATISTICS TIME ON, on my test computer it was around 450 ms for the SQL Server 2017 compatibility level version, and 500 ms for the SQL Server 2019 compatibility level version.

Looking at the actual plan CPU used for the two executions in Figure 14-13, you can see an interesting difference.

Image shows the query plans for the two executions of the query in the two compatibility levels. The first plan shows the Compute Scalar operator with each row passing through it, while the second execution, in SQL Server 2019 compatibility level, does not.

Figure 14-13 Query plan output for the two runs: the first in SQL Server 2017 compatibility level, the second in SQL Server 2019.

The big thing to notice between these two executions is that the Compute Scalar in Query 2 appears as a typical Compute Scalar for any scalar expression that doesn't include a UDF. In Query 1, it shows rows passing through, and an amount of time spent calculating the scalar for each row that passes through. Even in this extremely simple case, we saved 0.245 seconds of plan CPU time, because instead of running the function for every row as a separate call, the query behaved as if you had written the scalar expression inline.

While this example function was extremely simple and did not access any tables, inlining is not limited to simple functions, nor must the function avoid accessing data in tables. There are limitations, such as not using time-dependent intrinsic functions like SYSDATETIME(), not changing security context using EXECUTE AS (only EXECUTE AS CALLER, the default, is allowed), and not referencing table variables or table-valued parameters.
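
If you want to verify whether your functions qualify, or switch the behavior off while testing, the following sketch uses the is_inlineable column added to sys.sql_modules in SQL Server 2019 and the database-scoped configuration for the feature.

-- List scalar UDFs and whether the engine considers them inlineable
SELECT OBJECT_SCHEMA_NAME(m.object_id) + '.' + OBJECT_NAME(m.object_id) AS FunctionName,
       m.is_inlineable
FROM sys.sql_modules AS m
INNER JOIN sys.objects AS o ON o.object_id = m.object_id
WHERE o.type = 'FN'; -- scalar user-defined functions

-- Turn scalar UDF inlining off (or back on) for the database
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;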

This feature will have immediate value for programmers who ignored the common advice to avoid scalar UDFs, but for most, it is a change where something you have long wanted to do is now perfectly acceptable. Formatting functions, or translation functions where writing a function is easier than creating a table, are now possible and will perform very well, as opposed to destroying your performance.

Approximate Query Processing

Approximate Query Processing is the one member of the Intelligent Query Processing family that does require code changes to work. A very costly aggregate operation is counting distinct values. Say you execute a query such as:

SELECT COUNT(DISTINCT(CustomerID))
FROM   Sales.Invoices;

This gives you the exact number of customers that have been invoiced. If this is a very large set (which it is not in this case), it may be a very costly query to execute, using a lot of memory. To combat this, SQL Server 2019 adds a new aggregate, APPROX_COUNT_DISTINCT, which cuts one operation out of the plan and, instead of giving you the exact answer, provides an answer that is close enough for many analytical purposes. The query can be rewritten as:

SELECT APPROX_COUNT_DISTINCT(CustomerID)
FROM Sales.Invoices;

If you execute these two queries in the WideWorldImporters database, the plan for APPROX_COUNT_DISTINCT actually shows as slightly more costly, 51 percent versus 49 percent, as shown in Figure 14-14.

Shows the difference in plans between COUNT(DISTINCT()) and APPROX_COUNT_DISTINCT(). The top half of the image shows that the COUNT(DISTINCT()) query is 49% of the cost and has, from right to left, an Index Scan (NonClustered) operator, a Stream Aggregate (Aggregate) operator, another Stream Aggregate (Aggregate) operator, and a Compute Scalar operator, flowing to the SELECT operator. The APPROX_COUNT_DISTINCT() version, in the lower half of the image, is 51% of the cost, and the primary difference is that it has only a single Stream Aggregate (Aggregate) operator.

Figure 14-14 Plans compared for COUNT(DISTINCT()) versus APPROX_COUNT_DISTINCT().

The first query returns the exact answer, 663, and the second a close value (in this case, 666 was returned). Execute the queries with SET STATISTICS TIME turned on and you will see an interesting cost savings, even on this smaller set. On the test machine, over several test runs:

  • COUNT(DISTINCT): CPU time = 15 ms, elapsed time = 38 ms

  • APPROX_COUNT_DISTINCT: CPU time = 16 ms, elapsed time = 18 ms.

Just as in previous examples, a 20 ms difference seems inconsequential, and it generally is. But when we are working with millions or billions of rows, the amount of memory used by the second query will be far less, and the time savings not inconsequential at all.

Note

While APPROX_COUNT_DISTINCT may very well provide a tremendous increase in performance on very large sets, you need to be very clear about the difference between the value returned by APPROX_COUNT_DISTINCT and the exact answer from COUNT(DISTINCT). It is largely inappropriate for most reporting queries, where an approximate answer may not be acceptable to your client.

However, it can be a useful performance improvement for scenarios counting very large numbers of rows (in the high millions or billions) where a good-enough answer is acceptable.
