Chapter 47. How to use Dynamic Management Views

Glenn Berry

This chapter covers how to use Dynamic Management Views (DMVs) to easily uncover detailed information about your SQL Server workload that can help you diagnose and correct many common SQL Server performance issues. This type of information is extremely valuable, especially in a high-volume production online transaction processing (OLTP) situation, where poor database performance can have a severe impact on your business.

Why should I use DMV queries?

Dynamic Management Views (and Dynamic Management Functions) were introduced in SQL Server 2005 and enhanced in SQL Server 2008. Back in the pre–SQL Server 2005 days, it was much more difficult to find out what was happening inside SQL Server when you encountered performance issues. Older versions of SQL Server were more like a “black box,” which made you work much harder to find bottlenecks. Fortunately, tracking down bottlenecks is much easier now.

In the past, many DBAs would use SQL Server Profiler to capture performance information, or they’d use tools such as Performance Monitor to get a high-level view of what was happening on a particular database server. One problem with using SQL Server Profiler was the significant load that it could add to an already busy system (although this could be minimized if you used some care in how you configured and captured the trace). Performance Monitor is useful, but it doesn’t let you drill down to the same level of SQL Server–specific detail that’s possible with DMV queries.

Setting up security to run DMV queries

To run most server-scoped DMV queries, you need VIEW SERVER STATE permission on the server, whereas database-specific DMV queries require VIEW DATABASE STATE permission on the database in question. It’s useful to remember that VIEW SERVER STATE permission implies VIEW DATABASE STATE permission for all databases.

A common best practice is to create a specific login that you use only for monitoring, add that login as a database user, and then grant VIEW SERVER STATE permission to that login. For example, listing 1 creates a login called SystemMonitor.

Listing 1. Creating a monitoring login with view server state permission
USE master
GO

CREATE LOGIN SystemMonitor WITH PASSWORD=N'RubyRoxy456#',
DEFAULT_DATABASE=[master], CHECK_EXPIRATION=OFF,
CHECK_POLICY=ON;
GO

USE AdventureWorks
GO

CREATE USER SystemMonitor FOR LOGIN SystemMonitor;
GO

USE master
GO

GRANT VIEW SERVER STATE TO SystemMonitor;
GO

This approach is preferable to using a higher-permission login (such as sa) for this purpose. Once you’re set up to monitor SQL Server 2005/2008 with DMVs, what should you be looking at?

Looking at top waits at the instance level

One of the first DMV queries that I like to run on a busy or unfamiliar database server looks at the top waits for the server instance. This gives me a much better idea of where to focus the rest of my information gathering, because I’d run different sets of DMV queries to further investigate CPU pressure, I/O pressure, memory pressure, index usage, and so on. The top waits DMV query is shown in listing 2.

Listing 2. Top waits query
-- Isolate top waits for server instance
WITH Waits AS
(SELECT wait_type, wait_time_ms / 1000. AS wait_time_s,
100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN( 'SLEEP_TASK', 'BROKER_TASK_STOP',
'SQLTRACE_BUFFER_FLUSH', 'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT',
'LAZYWRITER_SLEEP')) -- filter out additional irrelevant waits
SELECT W1.wait_type,
CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,
CAST(W1.pct AS DECIMAL(12, 2)) AS pct,
CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct
FROM Waits AS W1
INNER JOIN Waits AS W2
ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
HAVING SUM(W2.pct) - W1.pct < 95; -- percentage threshold

This query rolls up the cumulative waits that account for the top 95 percent of total wait time, grouped by wait type, since SQL Server 2005/2008 was last restarted (unless you’ve manually cleared the wait stats since then). The wait types that show up at the top of this query give you a good idea what to look at in more detail. For example, if you saw WRITELOG and ASYNC_IO_COMPLETION as your top wait types, that would be a definite clue that you’re seeing I/O bottlenecks. If you see a top wait type of SOS_SCHEDULER_YIELD, that indicates CPU pressure.

SQL Server 2005 has more than 200 different wait types, whereas SQL Server 2008 has more than 400 different wait types. You can get a fairly complete list of the wait types and what they mean by searching for sys.dm_os_wait_stats in SQL Server Books Online (BOL).

You can clear the wait stats by running this command:

DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

I wish you could just as easily reset other DMV measurements with a simple command, but unfortunately that’s not the case. Perhaps Microsoft will add that functionality in a future release of SQL Server. In the meantime, many DBAs like to capture DMV metrics in permanent tables or use the new Management Data Warehouse (data collector) feature in SQL Server 2008.
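
For example, here’s a minimal sketch of the permanent-table approach, assuming a history table named WaitStatsHistory that you create yourself (the table name and column choices are my assumptions, not a standard):

-- Hypothetical history table for periodic wait stats snapshots
CREATE TABLE dbo.WaitStatsHistory (
    capture_time datetime NOT NULL DEFAULT GETDATE(),
    wait_type nvarchar(60) NOT NULL,
    waiting_tasks_count bigint NOT NULL,
    wait_time_ms bigint NOT NULL,
    signal_wait_time_ms bigint NOT NULL);
GO

-- Run this INSERT on a schedule (such as a SQL Server Agent job)
INSERT INTO dbo.WaitStatsHistory
    (wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats;

With snapshots accumulating over time, you can diff consecutive rows to see how waits changed during a given interval, rather than only since the last restart.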

Looking for CPU pressure and what’s causing it

One common performance issue you might run into with SQL Server is CPU pressure. Even with modern, multi-core CPUs, this can still be a problem. How can you detect whether your SQL Server is under CPU pressure? An initial quick and dirty method is to look at the Performance tab in Task Manager, and look for sustained periods above 95 percent CPU utilization.

Going a little deeper, I like to monitor Processor\% Processor Time and System\Context Switches/sec in Performance Monitor to get a more complete high-level impression of CPU pressure.

Finally, drilling into SQL Server itself, here are two DMV queries that you can run to get a feel for what’s going on with the server, CPU-wise. The query in listing 3 is a good measure of CPU pressure. If the signal waits are above 20 percent, that’s a strong indicator of CPU pressure.

Listing 3. Checking CPU pressure
-- Total waits are wait_time_ms
-- (high signal waits indicates CPU pressure)
SELECT SUM(signal_wait_time_ms) AS 'signal_wait_time_ms',
CAST(100.0 * SUM(signal_wait_time_ms) /
SUM(wait_time_ms) AS NUMERIC(20,2)) AS '%signal (cpu) waits',
SUM(wait_time_ms - signal_wait_time_ms) AS 'resource_wait_time_ms',
CAST(100.0 * SUM(wait_time_ms - signal_wait_time_ms) /
SUM(wait_time_ms) AS NUMERIC(20,2)) AS '%resource waits'
FROM sys.dm_os_wait_stats

You should be aware that the results of this query are cumulative since SQL Server was started or since the statistics were cleared using this command:

DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

The query in listing 4 is another confirmation of CPU pressure. If the runnable_tasks_count is above single digits, you’re likely seeing some CPU pressure (although it could be something else, such as severe blocking).

Listing 4. Checking the runnable tasks count
-- Check SQL Server Schedulers to see if they are waiting on CPU
SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255

Once you have some solid evidence that you’re experiencing CPU pressure, you’ll want to narrow your focus to see which part of your workload is causing the most CPU pressure. The query in listing 5 will show you the cached stored procedures that are the most expensive from a CPU perspective.

Listing 5. Finding the most expensive stored procedures
-- Get Top 20 executed SP's ordered by total worker time
-- (CPU pressure)
SELECT TOP 20 qt.text AS 'SP Name',
qs.total_worker_time AS 'TotalWorkerTime',
qs.total_worker_time/qs.execution_count AS 'AvgWorkerTime',
qs.execution_count AS 'Execution Count',
ISNULL(qs.execution_count/NULLIF(DATEDIFF(Second, qs.creation_time,
GetDate()), 0), 0) AS 'Calls/Second', -- NULLIF avoids divide-by-zero
ISNULL(qs.total_elapsed_time/qs.execution_count, 0)
AS 'AvgElapsedTime',
qs.max_logical_reads, qs.max_logical_writes,
DATEDIFF(Minute, qs.creation_time, GetDate()) AS 'Age in Cache'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id() -- Filter by current database
ORDER BY qs.total_worker_time DESC

One thing to pay attention to when you look at the results from the total worker time query is the Age in Cache column, which shows how long the plan for that stored procedure has been in the cache. If the stored procedure has been recompiled recently for any reason, its old plan (and the cumulative statistics for it) will have been flushed from the cache, which will obviously lower the total worker time and skew the query results. Total worker time is the overall CPU cost of a particular query since it has been in the cache. Depending on your workload, you might have queries that are relatively inexpensive for individual executions but are called very frequently (which makes them expensive in aggregate), or you may have individual queries that are more expensive CPU-wise but aren’t called as often. Looking at total worker time is a reliable method for finding the most expensive queries from an overall CPU perspective.
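
If recently recompiled plans are skewing your results, one option is to ignore plans that haven’t been in the cache very long. Here’s a minimal sketch of that idea (the 30-minute threshold is an arbitrary assumption; tune it for your workload):

-- Variation on listing 5 that ignores plans cached less than 30 minutes
SELECT TOP 20 qt.text AS 'SP Name',
qs.total_worker_time AS 'TotalWorkerTime',
qs.execution_count AS 'Execution Count',
DATEDIFF(Minute, qs.creation_time, GetDate()) AS 'Age in Cache'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id()
AND DATEDIFF(Minute, qs.creation_time, GetDate()) > 30 -- assumed threshold
ORDER BY qs.total_worker_time DESC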

Another similar DMV query, shown in listing 6, sorts by average worker time. This will let you find expensive CPU queries that may be easier to improve at the database level with standard database tuning techniques.

Listing 6. Finding expensive stored procedures, sorted by average worker time
-- Get Top 20 executed SP's ordered by Avg worker time (CPU pressure)
SELECT TOP 20
qt.text AS 'SP Name', qs.total_worker_time/qs.execution_count
AS 'AvgWorkerTime',
qs.total_worker_time AS 'TotalWorkerTime', qs.execution_count
AS 'Execution Count',
ISNULL(qs.execution_count/
NULLIF(DATEDIFF(Second, qs.creation_time, GetDate()), 0), 0)
AS 'Calls/Second',
ISNULL(qs.total_elapsed_time/qs.execution_count, 0)
AS 'AvgElapsedTime',
qs.max_logical_reads, qs.max_logical_writes,
DATEDIFF(Minute, qs.creation_time, GetDate())
AS 'Age in Cache'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id() -- Filter by current database
ORDER BY qs.total_worker_time/qs.execution_count DESC

Finding I/O pressure in SQL Server

Most large-scale SQL Server 2005/2008 deployments sooner or later run into I/O bottlenecks. This happens for several reasons. First, systems engineers often just think about CPU and RAM when sizing “big” database servers, and neglect the I/O subsystem.

Second, many DBAs are unable to completely tune the SQL Server workload to minimize excessive I/O requirements. Finally, there are often budgetary issues that prevent the acquisition of enough I/O capacity to support a large workload.

Whatever your situation, it helps if you know how to recognize and measure the signs of I/O pressure on SQL Server 2005/2008. One thing you can do to help reduce I/O pressure in general is to make sure you’re not under memory pressure, which causes added I/O pressure. We’ll look at how to detect memory pressure a little later.

For large SQL Server 2005/2008 deployments, you should make sure you’re running a 64-bit edition of SQL Server (so you can better use the RAM that you have), and you should try to get as much RAM as you can afford or will fit into the database server. Having sufficient RAM installed will reduce I/O pressure for reads, and will allow SQL Server to issue checkpoints less frequently (which will tend to minimize write I/O pressure).

The DMV queries in listings 7 through 11 are useful for measuring signs of I/O pressure.

You’ll want to run the query in listing 7 multiple times, because the results will rapidly change on a busy system. Don’t get too excited by a single high number. If you see consistently high numbers over time, then you have evidence of I/O pressure.

Listing 7. Checking for I/O pressure
-- Check for Pending I/O (lower is better)
SELECT pending_disk_io_count
FROM sys.dm_os_schedulers
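
One way to sample this over time is a simple polling loop that you leave running in a query window. Here’s a minimal sketch (the five samples and ten-second delay are arbitrary assumptions):

-- Sample total pending disk I/O five times, ten seconds apart
-- (sample count and delay are arbitrary; adjust for your needs)
DECLARE @i int;
SET @i = 0;
WHILE @i < 5
BEGIN
    SELECT GETDATE() AS 'sample_time',
    SUM(pending_disk_io_count) AS 'total_pending_io'
    FROM sys.dm_os_schedulers
    WHERE scheduler_id < 255; -- ignore hidden schedulers
    WAITFOR DELAY '00:00:10';
    SET @i = @i + 1;
END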

The query in listing 8 can help you identify which data and log files are causing the highest I/O waits. For example, perhaps you have a transaction log file on a slower RAID 5 array or LUN (which isn’t a good idea). This query will help prove that the log file is causing user waits.

Listing 8. Identifying the highest I/O waits
-- Avg I/O Stalls (Lower is better)
SELECT database_id, file_id , io_stall_read_ms, num_of_reads,
CAST(io_stall_read_ms/(1.0 + num_of_reads)
AS NUMERIC(10,1)) AS 'avg_read_stall_ms',
io_stall_write_ms, num_of_writes,
CAST(io_stall_write_ms/(1.0 + num_of_writes) AS NUMERIC(10,1))
AS 'avg_write_stall_ms',
io_stall_read_ms + io_stall_write_ms
AS io_stalls, num_of_reads + num_of_writes AS total_io,
CAST((io_stall_read_ms + io_stall_write_ms)
/(1.0 + num_of_reads + num_of_writes) AS NUMERIC(10,1))
AS 'avg_io_stall_ms'
FROM sys.dm_io_virtual_file_stats(null,null)
-- This can be filtered by database and file id
ORDER BY avg_io_stall_ms DESC

The query in listing 9 lets you focus on the read/write activity for each file in a particular database. It shows you the percentage of reads and writes, both in number of reads and writes and in actual bytes read and written. This can help you analyze and size your disk I/O subsystem.

Listing 9. Checking I/O statistics for a database
-- I/O Statistics for a single database
SELECT
file_id
, num_of_reads
, num_of_writes
, (num_of_reads + num_of_writes) AS 'Writes + Reads'
, num_of_bytes_read
, num_of_bytes_written
, CAST(100. * num_of_reads/(num_of_reads + num_of_writes)
AS DECIMAL(10,4)) AS '# Reads Pct'
, CAST(100. * num_of_writes/(num_of_reads + num_of_writes)
AS DECIMAL(10,4)) AS '# Write Pct'
, CAST(100. * num_of_bytes_read
/(num_of_bytes_read + num_of_bytes_written)
AS DECIMAL(10,4)) AS 'Read Bytes Pct'
, CAST(100. * num_of_bytes_written
/(num_of_bytes_read + num_of_bytes_written)
AS DECIMAL(10,4)) AS 'Written Bytes Pct'
FROM sys.dm_io_virtual_file_stats(DB_ID(N'yourdatabasename'), NULL);

The query in listing 10 will help you find the stored procedures that are causing the most physical read I/O pressure (which means that the data has to be read from your disk subsystem instead of being found in memory). Obviously, adding more RAM to the server will help here, but standard query and index tuning can make a big difference also.

Listing 10. Locating physical read I/O pressure
-- Get Top 20 executed SP's ordered by physical reads (read I/O pressure)
SELECT TOP 20 qt.text AS 'SP Name', qs.total_physical_reads,
qs.total_physical_reads/qs.execution_count AS 'Avg Physical Reads',
qs.execution_count AS 'Execution Count',
ISNULL(qs.execution_count/NULLIF(DATEDIFF(Second, qs.creation_time,
GetDate()), 0), 0) AS 'Calls/Second',
qs.total_worker_time/qs.execution_count AS 'AvgWorkerTime',
qs.total_worker_time AS 'TotalWorkerTime',
qs.total_elapsed_time/qs.execution_count AS 'AvgElapsedTime',
qs.max_logical_reads, qs.max_logical_writes,
DATEDIFF(Minute, qs.creation_time, GetDate()) AS 'Age in Cache'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id() -- Filter by current database
ORDER BY qs.total_physical_reads DESC

The query in listing 11 will help you find the stored procedures that are causing the most write activity. Query and index tuning can help here. You can also talk to your developers about middle-tier write caching or other application changes to reduce writes if possible.

Listing 11. Finding stored procedures with the most write activity
-- Get Top 20 executed SP's ordered by logical writes/minute (write I/O pressure)
SELECT TOP 20 qt.text AS 'SP Name', qs.total_logical_writes,
qs.total_logical_writes/qs.execution_count AS 'AvgLogicalWrites',
ISNULL(qs.total_logical_writes/NULLIF(DATEDIFF(Minute, qs.creation_time,
GetDate()), 0), 0) AS 'Logical Writes/Min',
qs.execution_count AS 'Execution Count',
ISNULL(qs.execution_count/NULLIF(DATEDIFF(Second, qs.creation_time,
GetDate()), 0), 0) AS 'Calls/Second',
qs.total_worker_time/qs.execution_count AS 'AvgWorkerTime',
qs.total_worker_time AS 'TotalWorkerTime',
qs.total_elapsed_time/qs.execution_count AS 'AvgElapsedTime',
qs.max_logical_reads, qs.max_logical_writes, qs.total_physical_reads,
DATEDIFF(Minute, qs.creation_time, GetDate()) AS 'Age in Cache',
qs.total_physical_reads/qs.execution_count AS 'Avg Physical Reads'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id() -- Filter by current database
ORDER BY qs.total_logical_writes DESC

SQL Server memory pressure

SQL Server 2005/2008 loves memory: the more, the better. If you’re running on a 64-bit version of Windows Server 2003 or newer, with a 64-bit Enterprise Edition version of SQL Server 2005/2008, you can and should take advantage of as much memory as will fit in your server. In the commodity-level server space, the sweet spot for installed memory has moved up from 32 GB to 64 GB to 128 GB over the past several years, which makes it even more important to make sure you’re running a 64-bit version of SQL Server.

Once you’ve loaded up your 64-bit database server with as much memory as possible, it’s important that you do two things to make sure that SQL Server 2005/2008 will play nicely with all of the available memory. First, you should grant the Lock Pages in Memory Windows right (using gpedit.msc) to the SQL Server Service account. Second, you should set the Max Server Memory setting in SQL Server to a value that will leave sufficient available memory for the operating system (and anything else that’s running on the server) while SQL Server is under a load. This is typically anywhere from about 2 GB to 4 GB available, depending on how much RAM you have installed and what else is running on the server.
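
For example, here’s a minimal sketch of setting Max Server Memory with sp_configure (the 60000 MB value is an assumption for a dedicated server with 64 GB of RAM; size it for your own hardware):

-- Leave roughly 4 GB of a 64 GB server for the operating system
-- (the value shown is an assumption; adjust for your server)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'max server memory (MB)', 60000;
RECONFIGURE;
GO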

After you have SQL Server properly configured for memory, you’ll want to monitor how SQL Server is handling the memory that it has to work with. With Performance Monitor, I like to keep track of these three counters:

  • SQLServer:Memory Manager\Memory Grants Pending (lower is better)
  • SQLServer:Buffer Manager\Buffer Cache Hit Ratio (higher is better)
  • SQLServer:Buffer Manager\Page Life Expectancy (higher is better)

The trend for these values is more important than their absolute value at any given time. If the average Page Life Expectancy is dropping over time, that’s significant (and not in a good way). Still, people typically want to know what values are “good” and what values are “bad” for these counters, so here’s what I look for (a quick way to check these counters from T-SQL follows the list):

  • Memory Grants Pending above 1 is bad
  • Buffer Cache Hit Ratio below 95 percent is bad
  • Page Life Expectancy below 300 is bad
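
These same counters are also exposed inside SQL Server through sys.dm_os_performance_counters, so you can check them without opening Performance Monitor. Here’s a minimal sketch for two of them (Buffer Cache Hit Ratio is omitted because it has to be divided by its companion base counter to be meaningful):

-- Check Page Life Expectancy and Memory Grants Pending from T-SQL
SELECT [object_name], [counter_name], cntr_value
FROM sys.dm_os_performance_counters
WHERE ([object_name] LIKE '%Buffer Manager%'
AND [counter_name] = 'Page life expectancy')
OR ([object_name] LIKE '%Memory Manager%'
AND [counter_name] = 'Memory Grants Pending');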

There are many good DMV queries that give you much more detail about how SQL Server 2005/2008 is using memory. We’ll cover some of the more useful ones. Listing 12 lists the top 10 memory consumers in your buffer pool.

Listing 12. Top 10 consumers of memory from buffer pool
SELECT TOP (10) type, SUM(single_pages_kb) AS [SPA Mem, KB]
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY SUM(single_pages_kb) DESC

Here are some of the common types you’ll see when you run the query in listing 12:

  • CACHESTORE_SQLCP—SQL plans (dynamic or prepared SQL)
  • CACHESTORE_OBJCP—Object plans (stored procedures, functions, and triggers)
  • CACHESTORE_PHDR—Bound Trees
  • USERSTORE_TOKENPERM—The User and Token permissions cache that caused so many performance issues with early builds of SQL Server 2005

In listing 13, we’ll obtain the use counts for each query plan.

Listing 13. Getting query mix and use counts for each plan
SELECT usecounts, cacheobjtype, objtype, bucketid
FROM sys.dm_exec_cached_plans AS cp
WHERE cacheobjtype = 'Compiled Plan'
ORDER BY objtype, usecounts DESC

You want to see plans with high use counts. Avoiding ad hoc queries whose WHERE clauses are built by string concatenation can help here, because each distinct query string gets its own plan. You can also take advantage of table-valued parameters in SQL Server 2008. Using the new Optimize for Ad Hoc Workloads instance setting in SQL Server 2008 is also beneficial here.
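
For reference, enabling that setting is a one-time sp_configure change (SQL Server 2008 only; this sketch assumes you want it on instance-wide):

-- Enable Optimize for Ad Hoc Workloads (SQL Server 2008 only)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
GO

With this setting enabled, SQL Server caches only a small plan stub the first time an ad hoc query runs, and stores the full plan only if the query is executed again.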

The query in listing 14 will tell you which tables and indexes are taking up the most buffer space.

Listing 14. Finding indexes and tables that use the most buffer space
-- Breaks down buffers by object (table, index) in the buffer cache
SELECT OBJECT_NAME(p.object_id) AS 'ObjectName', p.object_id,
p.index_id, COUNT(*)/128 AS 'buffer size(MB)', COUNT(*)
AS 'buffer_count'
FROM sys.allocation_units AS a
INNER JOIN sys.dm_os_buffer_descriptors AS b
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p
ON a.container_id = p.hobt_id
WHERE b.database_id = db_id()
GROUP BY p.object_id, p.index_id
ORDER BY buffer_count DESC

In listing 15, we’ll find the largest ad hoc queries sitting in the plan cache.

Listing 15. Finding ad hoc queries that are bloating the plan cache
SELECT TOP(100) [text], size_in_bytes
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE cacheobjtype = 'Compiled Plan'
AND objtype = 'Adhoc' AND usecounts = 1
ORDER BY size_in_bytes DESC

Again, using the new Optimize for Ad Hoc Workloads instance setting in SQL Server 2008 can really help if you have problems here.

The query in listing 16 will show you your 25 most expensive queries from a logical reads perspective (which equates to memory pressure).

Listing 16. Finding your 25 most expensive queries
-- Get Top 25 executed SP's ordered by logical reads (memory pressure)
SELECT TOP 25 qt.text AS 'SP Name', total_logical_reads,
total_logical_reads/qs.execution_count AS 'AvgLogicalReads',
qs.execution_count AS 'Execution Count',
ISNULL(qs.execution_count/NULLIF(DATEDIFF(Second, qs.creation_time,
GetDate()), 0), 0) AS 'Calls/Second',
qs.total_worker_time/qs.execution_count AS 'AvgWorkerTime',
qs.total_worker_time AS 'TotalWorkerTime',
qs.total_elapsed_time/qs.execution_count AS 'AvgElapsedTime',
qs.total_logical_writes,
qs.max_logical_reads, qs.max_logical_writes, qs.total_physical_reads,
DATEDIFF(Minute, qs.creation_time, GetDate()) AS 'Age in Cache'
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.dbid = db_id() -- Filter by current database
ORDER BY total_logical_reads DESC

The query in listing 17 will help you find tables with the most reads. (User scans are much more expensive than user seeks or lookups.)

Listing 17. Finding tables with the most reads
SELECT object_name(s.object_id) AS 'Tablename',
SUM(user_seeks) AS 'User Seeks', SUM(user_scans) AS 'User Scans',
SUM(user_lookups) AS 'User Lookups',
SUM(user_seeks + user_scans + user_lookups) AS 'Total Reads',
SUM(user_updates) AS 'Total Writes'
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = db_id()
GROUP BY object_name(s.object_id)
ORDER BY 'Total Reads' DESC

The query in listing 18 will help you find tables with the most writes.

Listing 18. Finding tables with the most writes
SELECT object_name(s.object_id) AS 'Tablename',
SUM(user_updates) AS 'Total Writes',
SUM(user_seeks) AS 'User Seeks', SUM(user_scans) AS 'User Scans',
SUM(user_lookups) AS 'User Lookups',
SUM(user_seeks + user_scans + user_lookups) AS 'Total Reads'
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = db_id()
GROUP BY object_name(s.object_id)
ORDER BY 'Total Writes' DESC

SQL Server index usage

As you’re probably aware, having proper indexes in place to support your workload is critical with SQL Server 2005/2008 (as with any relational database). Generally speaking, you’ll want more indexes with a reporting or DSS workload, and fewer indexes with an OLTP workload. Regardless of your workload type, you should be aware of whether your indexes are being used and whether you’re missing any indexes that would be useful for SQL Server.

In the dark ages before SQL Server 2005, it was difficult to discover this critical information, but with DMV queries, you can easily discover what’s going on with your indexes. You can find indexes that aren’t being used and you can find missing indexes. As Microsoft’s Rico Mariani says, “If you aren’t measuring, you’re not engineering.”

The DMV queries that you see in this section will tell you this information. If you see an index that has millions of writes, with zero or very few reads, then that means that you’re getting little to no benefit from maintaining the index, and you should strongly consider dropping that index. As the number of reads goes up, it becomes more of a judgment call. That’s why being familiar with your workload is important.

One caveat with the “missing index” query is that it can return results based on ad hoc queries or maintenance job–related work, which can make the results harder to interpret. You always want to look at the last_user_seek and user_seeks columns to see when, and how often, SQL Server thinks it wanted the index that it thinks is “missing.” If you see a row with a high index advantage and a last_user_seek from a few seconds or minutes ago, it’s probably from your regular workload, so you should seriously consider adding that index. You should also be aware that this query won’t recommend adding any clustered indexes.
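
When you do decide to act on a row, the missing index columns translate directly into an index definition: the equality_columns come first, then the inequality_columns as additional key columns, with the included_columns going into the INCLUDE clause. A minimal sketch, using hypothetical table, column, and index names:

-- Hypothetical index built from missing index DMV output:
-- key columns from equality_columns then inequality_columns,
-- INCLUDE list from included_columns
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerID_OrderDate
ON dbo.SalesOrder (CustomerID, OrderDate)
INCLUDE (TotalDue);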

One quirk I’ve discovered over time is that if you add a new index of any sort to a table, or if you delete an index, all of the missing index stats for that table are cleared out. This may lead you to believe that there are no more missing indexes on the table, which is probably not true. Wait a little while, and then run the missing index query again to confirm whether there are any more missing indexes for the table.

The DMV queries in listings 19 through 21 will show bad indexes and missing indexes, and then let you concentrate on an individual table to determine whether you should make any index changes for that table based on your workload.

Listing 19. Finding bad indexes
-- Possible Bad Indexes (writes > reads)
SELECT object_name(s.object_id) AS 'Table Name',
i.name AS 'Index Name', i.index_id,
user_updates AS 'Total Writes',
user_seeks + user_scans + user_lookups AS 'Total Reads',
user_updates - (user_seeks + user_scans + user_lookups) AS 'Difference'
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = db_id() -- Filter for current database
AND user_updates > (user_seeks + user_scans + user_lookups )
ORDER BY 'Difference' DESC, 'Total Writes' DESC, 'Total Reads' ASC;

Indexes that have many more writes than reads are possible candidates for elimination. You should be aware that user scans are much more expensive than user seeks or user lookups. If you see lots of user scans, you may be missing some important indexes or you could be doing lots of aggregate queries or have small tables.

Listing 20. Looking at Index Advantage to find missing indexes
-- Missing Indexes by Index Advantage (make sure to also look at last user seek time)
SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) AS index_advantage, migs.last_user_seek,
mid.statement AS 'Database.Schema.Table',
mid.equality_columns, mid.inequality_columns, mid.included_columns,
migs.unique_compiles, migs.user_seeks, migs.avg_total_user_cost, migs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
ON mig.index_handle = mid.index_handle
ORDER BY index_advantage DESC;

Listing 21. Looking at Last User Seek to find missing indexes
-- Missing Indexes by Last User Seek
SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) AS index_advantage,
migs.last_user_seek, mid.statement AS 'Database.Schema.Table',
mid.equality_columns, mid.inequality_columns, mid.included_columns,
migs.unique_compiles, migs.user_seeks, migs.avg_total_user_cost, migs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
ON mig.index_handle = mid.index_handle
ORDER BY migs.last_user_seek DESC;

After running the DMV queries in listings 20 and 21, you may see the same table showing up with multiple missing indexes. If I see this, I start to examine that table more closely.

Once you’ve narrowed your focus to a particular table, you can gather more specific index information about that table with the queries in listings 22 and 23.

Listing 22. Getting statistics for a table
-- Index Read/Write stats for a single table
SELECT object_name(s.object_id) AS 'TableName',
i.name AS 'IndexName', i.index_id,
SUM(user_seeks) AS 'User Seeks', SUM(user_scans) AS 'User Scans',
SUM(user_lookups)AS 'User Lookups',
SUM(user_seeks + user_scans + user_lookups)AS 'Total Reads',
SUM(user_updates) AS 'Total Writes'
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = db_id()
AND object_name(s.object_id) = 'YourTableName'
GROUP BY object_name(s.object_id), i.name, i.index_id
ORDER BY 'Total Writes' DESC, 'Total Reads' DESC;

Listing 23. Missing indexes for a single table
SELECT user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) AS index_advantage, migs.last_user_seek,
mid.statement AS 'Database.Schema.Table',
mid.equality_columns, mid.inequality_columns, mid.included_columns,
migs.unique_compiles, migs.user_seeks, migs.avg_total_user_cost, migs.avg_user_impact
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
ON mig.index_handle = mid.index_handle
WHERE statement = '[databasename].[dbo].[yourtablename]' -- Specify one table
ORDER BY index_advantage DESC;

-- Show existing indexes for this table (does not show included columns)
EXEC sp_helpindex 'yourtablename';
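
Because sp_helpindex doesn’t show included columns, here’s a minimal sketch of an alternative that does, built on the catalog views (the table name is a placeholder):

-- Show index key columns and included columns for one table
SELECT i.name AS 'Index Name', c.name AS 'Column Name',
ic.key_ordinal, ic.is_included_column
FROM sys.indexes AS i
INNER JOIN sys.index_columns AS ic
ON i.object_id = ic.object_id AND i.index_id = ic.index_id
INNER JOIN sys.columns AS c
ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.object_id = OBJECT_ID(N'yourtablename')
ORDER BY i.name, ic.is_included_column, ic.key_ordinal;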

As you consider making index changes to a large, busy table, you need to consider your workload characteristics. You should be more reluctant to add additional indexes if you have an OLTP workload. If you’re running Enterprise Edition, you should take advantage of online index operations whenever possible in order to avoid locking and blocking issues during index builds. You should also consider using the MAXDOP option during index builds to prevent SQL Server from using all of the CPU cores for an index build. This may mean that the index takes longer to build, but the rest of your workload won’t be starved for CPU during the index-creation process.
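
For example, here’s a minimal sketch of an index build that uses both options (hypothetical table and index names, and an arbitrary MAXDOP value):

-- Online index build limited to four CPU cores
-- (ONLINE = ON requires Enterprise Edition)
CREATE NONCLUSTERED INDEX IX_SalesOrder_OrderDate
ON dbo.SalesOrder (OrderDate)
WITH (ONLINE = ON, MAXDOP = 4);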

Detecting blocking in SQL Server

Especially with OLTP workloads, you may run into blocking issues with SQL Server. Listing 24 shows one quick DMV query that’s a good, indirect early warning signal of blocking.

Listing 24. Checking SQL Server schedulers to see if you may have blocking
SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255

If you see the current_tasks_count above 10 for a sustained period, it’s likely that you have blocking. When you have severe blocking, the current_tasks_count for each scheduler tends to go up quickly. The two DMV queries in listings 25 and 26 will give you more direct confirmation of blocking.

Listing 25. Detecting blocking
SELECT blocked_query.session_id AS blocked_session_id,
blocking_query.session_id AS blocking_session_id,
sql_text.text AS blocked_text, sql_btext.text AS blocking_text,
waits.wait_type AS blocking_resource
FROM sys.dm_exec_requests AS blocked_query
INNER JOIN sys.dm_exec_requests AS blocking_query
ON blocked_query.blocking_session_id = blocking_query.session_id
CROSS APPLY
(SELECT * FROM sys.dm_exec_sql_text(blocking_query.sql_handle))
AS sql_btext
CROSS APPLY
(SELECT * FROM sys.dm_exec_sql_text(blocked_query.sql_handle)) AS sql_text
INNER JOIN sys.dm_os_waiting_tasks AS waits
ON waits.session_id = blocking_query.session_id

Listing 26. Detecting blocking (a more accurate and complete version)
SELECT t1.resource_type AS 'lock type', db_name(resource_database_id) AS 'database',
t1.resource_associated_entity_id AS 'blk object', t1.request_mode AS 'lock req', -- lock requested
t1.request_session_id AS 'waiter sid', t2.wait_duration_ms AS 'wait time',
(SELECT [text] FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle)
WHERE r.session_id = t1.request_session_id) AS 'waiter_batch',
(SELECT SUBSTRING(qt.text, (r.statement_start_offset/2) + 1,
(CASE WHEN r.statement_end_offset = -1
THEN LEN(CONVERT(nvarchar(max), qt.text)) * 2
ELSE r.statement_end_offset END - r.statement_start_offset)/2 + 1)
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS qt
WHERE r.session_id = t1.request_session_id) AS 'waiter_stmt',
t2.blocking_session_id AS 'blocker sid',
(SELECT [text] FROM sys.sysprocesses AS p
CROSS APPLY sys.dm_exec_sql_text(p.sql_handle)
WHERE p.spid = t2.blocking_session_id) AS 'blocker_stmt'
FROM sys.dm_tran_locks AS t1
INNER JOIN sys.dm_os_waiting_tasks AS t2
ON t1.lock_owner_address = t2.resource_address

The query in listing 27 will give you additional useful information about locks and blocking.

Listing 27. Looking at locks that are causing problems
SELECT t1.resource_type, t1.resource_database_id,
t1.resource_associated_entity_id,t1.request_mode,
t1.request_session_id,t2.blocking_session_id
FROM sys.dm_tran_locks AS t1
INNER JOIN sys.dm_os_waiting_tasks AS t2
ON t1.lock_owner_address = t2.resource_address;

Summary

The queries that I’ve shown you in this chapter are just the beginning. If you have responsibility for production database servers and databases, you really should be putting together a library of these types of DMV queries that you can use, both in normal usage and during a crisis. I have an extensive and growing collection of DMV queries that I’ve written and found over the past several years, and they’ve been extremely valuable for detecting and correcting performance problems.

DMV queries have much less server impact than SQL Profiler traces, and they provide much more granular SQL Server–specific detail than Performance Monitor traces. They also give you more control and detail than SQL Server Activity Monitor. DMV queries are oriented toward gathering snapshot-type information about current activity and aggregate activity since SQL Server was last started. As such, they’re less useful for historical reporting and trend analysis unless you take steps to write their output to permanent tables.

You can easily set up a dedicated SQL Server login that your operations staff can use to run these queries to do first-level troubleshooting on a database server or database. It’s also easy to convert any of these queries to stored procedures and call them from an application.
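
For example, here’s a minimal sketch that wraps the scheduler query from listing 24 in a stored procedure (the procedure name is an assumption):

-- Hypothetical wrapper so operations staff can call one procedure
CREATE PROCEDURE dbo.usp_CheckSchedulers
AS
BEGIN
    SET NOCOUNT ON;
    SELECT scheduler_id, current_tasks_count, runnable_tasks_count
    FROM sys.dm_os_schedulers
    WHERE scheduler_id < 255;
END
GO

EXEC dbo.usp_CheckSchedulers;

DMV queries were one of the more useful features added in SQL Server 2005. If you use them wisely, you can quickly and easily detect and diagnose many SQL Server ailments, truly becoming Dr. DMV.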

About the author

Glenn works as a database architect at NewsGator Technologies in Denver, Colorado. He’s a SQL Server MVP, and he has a whole collection of Microsoft certifications, including MCITP, MCDBA, MCSE, MCSD, MCAD, and MCTS, which proves that he likes to take tests.

He is also an adjunct faculty member at University College, University of Denver, where he’s been teaching since 2000.
