Shaun J Stuart

Just another SQL Server weblog

You are an awesome DBA. You have complete control over your dominion. You have all your backups automated and the testing of those backups is automated. You have alerts properly configured on all your servers. You have your index maintenance plans running regularly. You ride a unicorn to the office and George Hamilton asks you for suntanning tips. You are da bomb.

Ola's morning commute

When everything is running smoothly, it can be easy to lose sight of the little details because they are being handled automatically by your routines. But if you take the time to peek under the covers, you may discover some wasteful behavior that your perfect maintenance routines are camouflaging.

I'm specifically talking about index defragmentation. I use Ola Hallengren's index maintenance script because, well, why wouldn't I? It rocks. Ola doesn't ride a unicorn to the office - he rides a flying unicorn. Because his routine works so well, it can be easy to just set it up and forget about it. I was surfing the web the other day and came across this blog post by Tim Ford and I immediately realized he was writing about my situation. Although I had not spent the time to investigate, I just knew I had databases whose indexes were stuck in an endless cycle of defragmenting: they get used for a week, they get defragged over the weekend, they get used the next week, they get defragged again. Rinse and repeat. This is a waste of resources. Index rebuilds and reorgs generate a lot of transaction log entries, for one thing. They also chew up time in your maintenance window that could be used for other tasks.

So how can you stop this? One way is to change the fill factor on your indexes. By default, SQL Server tries to completely fill index pages, and that can lead to heavy fragmentation as rows get inserted and modified. But if you set your fill factor to something lower than the default of 0 (which is equivalent to 100%), SQL Server will leave some free space on each page, so newly inserted or modified rows are more likely to fit without splitting the page. Fewer page splits = less fragmentation.
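
As a concrete example, here is the kind of statement involved (the table and index names are made up for illustration):

-- Hypothetical names; substitute your own table and index.
-- FILLFACTOR = 90 leaves 10% free space on each leaf page.
ALTER INDEX [IX_Orders_CustomerID] ON [dbo].[Orders]
REBUILD WITH (FILLFACTOR = 90);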

But you probably don't want to change the fill factor on all your indexes across the board. Specifically targeting the indexes that are most frequently defragmented will allow you to only make changes where you will reap the most benefit.
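
If you want to see where your indexes stand today, a quick query against sys.indexes will show the current settings (run it in the database you care about):

SELECT  OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName
       ,OBJECT_NAME(i.object_id) AS TableName
       ,i.name AS IndexName
       ,i.fill_factor       -- 0 = the default, i.e. pages packed 100% full
FROM    sys.indexes i
WHERE   i.index_id > 0      -- skip heaps
ORDER BY SchemaName
        ,TableName
        ,IndexName;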

And this is what Tim's code helps you do. It scans the log table that Ola's index maintenance routine creates and identifies indexes that are frequently rebuilt or reorganized. I've taken Tim's script and made some additions to it. I've turned it into a stored procedure that will send out an email if it finds any indexes that are being defragged often. It will also look at the current fill factor and create a rebuild statement to rebuild the index with a lower fill factor to help reduce fragmentation. I've parameterized almost everything. Here's how it works:

You call the routine and specify several parameters:

  • @DatabaseToInvestigate - This is the database you want to scan for frequently defragged indexes
  • @CommandLogDatabase - The database where the CommandLog table from Ola's routine is located
  • @CountThreshold - Indexes that are defragged equal to or more than this number of times are flagged as being excessively defragged
  • @OverRebuildThresholdNewFF - This is the new fill factor recommendation for indexes that are being rebuilt more than @CountThreshold times. It is only used when the current fill factor is 0 or 100.
  • @OverReorgThresholdNewFF - This is the new fill factor recommendation for indexes that are being reorganized more than @CountThreshold times. It is only used when the current fill factor is 0 or 100.
  • @MailProfileName - the name of the mail profile this routine can use to send out the email notification
  • @MailRecipients - the list of email addresses the notification will be sent to. Must be in the form xxx@xxx.xxx. Multiple addresses should be separated with a semicolon
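
For example, a call might look like this (the database names, mail profile, and address below are placeholders):

EXEC dbo.up_IndexDefragFrequencyCheck
     @DatabaseToInvestigate = 'MyAppDB'
    ,@CommandLogDatabase = 'DBAUtility'
    ,@CountThreshold = 4
    ,@OverRebuildThresholdNewFF = 70
    ,@OverReorgThresholdNewFF = 80
    ,@MailProfileName = 'Default'
    ,@MailRecipients = 'dba@example.com';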

The procedure works by scanning the CommandLog table for indexes that are either rebuilt or reorganized more than @CountThreshold times. If it finds any, it checks the current fill factor of those indexes. If the fill factor is 0 or 100, it will create an ALTER INDEX statement with the new fill factor specified in the @OverRebuildThresholdNewFF or @OverReorgThresholdNewFF parameter. (There are two parameters because indexes that are reorganized typically have a lower fragmentation value than those that are rebuilt, so they may not need to have their fill factor adjusted quite so much.) If the current fill factor is not 0 or 100, it will create an ALTER INDEX statement where the fill factor is 10% less than the current value. IT WILL NOT CREATE A STATEMENT WITH A FILL FACTOR LESS THAN 50!
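
If you just want to eyeball the problem before deploying anything, the heart of the scan boils down to a query like this against Ola's CommandLog table (the threshold of 4 here is hard-coded just for illustration):

SELECT  DatabaseName
       ,SchemaName
       ,ObjectName
       ,IndexName
       ,COUNT(*) AS DefragCount
FROM    dbo.CommandLog
WHERE   CommandType = 'ALTER_INDEX'
        AND Command LIKE '%REBUILD%'
GROUP BY DatabaseName
       ,SchemaName
       ,ObjectName
       ,IndexName
HAVING  COUNT(*) >= 4
ORDER BY DefragCount DESC;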

If you have indexes that are being repeatedly defragged and the fill factor is set to 50% or less, you've got a heck of a busy table. That may be perfectly normal and a lower fill factor may be exactly what is needed, but I wanted to raise a red flag in this case and bring this to your attention rather than just provide some code that might get run without being looked at. To that end, if the procedure does find this case, it will add some additional text to the email bringing this to your attention and will list the tables and indexes involved.

One very important note on running this procedure: You cannot just mindlessly plug values into these parameters and run the code whenever you want. You need to take into account how often you perform index maintenance and how long you retain records in the CommandLog table. These two factors will determine how you pick your @CountThreshold value and how often you run this routine.

For example, suppose you perform index maintenance each weekend and you retain 30 days worth of log records. In this case, you'd probably choose @CountThreshold to be 3 or 4 and you'd run this procedure once every 30 days. Because you are only retaining 30 days worth of records in the log table, the maximum possible number of log entries an index could have would be 4. (Typically. You may get 5 depending on how your run schedule coincides with your log purging schedule in months that have 5 weekends, such as November 2013.) You only want to run the procedure again after all your logs have had a chance to roll out of the table since the last time this procedure was run. Why is this?

Let's look at what would happen if you defrag weekly, keep 30 days of log records and you run this routine daily. The first time you run it, it may tell you to reduce the fill factor of some indexes. You do that. Then the next day, the routine runs again. But your defrag routine has not run again and you still have all those old log records in your log table. This routine doesn't know you changed the fill factor yesterday, so it's going to recommend you reduce the fill factor by another 10 percent over what you reduced it to yesterday! (Remember, the routine will look at the current fill factor and make suggestions based on that.)

So your @CountThreshold and the frequency of running this monitoring routine depend on other factors. Suppose you defrag indexes weekly and you retain 60 days of log records. In this case, you might want to pick @CountThreshold to be 6 or 7 and you would run this monitoring routine every 60 days. Likewise, if you perform index maintenance nightly and you retain 30 days of log records, you might set @CountThreshold to be somewhere between 20 and 30 (depending on how you want to define "frequently") and you would run this routine every 30 days.
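
As a rough rule of thumb (my own shorthand, not anything baked into Ola's or Tim's code): divide the log retention period by the defrag interval to get the maximum possible defrag count per index, then pick @CountThreshold a little below that ceiling.

-- Rule-of-thumb sketch; the values are examples only.
DECLARE @LogRetentionDays INT = 60;   -- how long rows stay in CommandLog
DECLARE @DefragIntervalDays INT = 7;  -- weekly index maintenance

SELECT @LogRetentionDays / @DefragIntervalDays AS MaxPossibleDefragCount;
-- Returns 8 here; picking @CountThreshold of 6 or 7 flags indexes that are
-- defragged on most, though not necessarily all, maintenance runs.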

Here is a screenshot of the email the routine sends out:

[Screenshot of the notification email - click to embiggen]

The routine defaults to generating commands that rebuild the index with SORT_IN_TEMPDB = ON and ONLINE = ON. The command will fail for XML and spatial indexes because they cannot be rebuilt online. Also, online rebuilds are an Enterprise-only feature, so if you are not running Enterprise edition, you'll want to change that option to OFF. The commands it prints out are meant to help a DBA, not replace one. They should be evaluated for appropriateness before being run. (This is why I wrote the routine to just print the statements instead of actually executing them.)
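
If you're not sure which edition a server is running, SERVERPROPERTY can tell you before you decide whether to leave ONLINE = ON:

SELECT  SERVERPROPERTY('Edition') AS Edition
       ,SERVERPROPERTY('EngineEdition') AS EngineEdition;
-- EngineEdition = 3 means Enterprise (or Developer / Evaluation,
-- which also allow online rebuilds).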

The code for this procedure is below. As always, give any code you get from the internet a good once-over before running it. Let me know if you have any suggestions for improvements. Or if you know of a place to get cheap unicorn food.

CREATE PROCEDURE [dbo].[up_IndexDefragFrequencyCheck]
       (
        @DatabaseToInvestigate VARCHAR(100) = NULL
       ,@CommandLogDatabase VARCHAR(100) = NULL
       ,@CountThreshold TINYINT = 4
       ,@OverRebuildThresholdNewFF TINYINT = 70
       ,@OverReorgThresholdNewFF TINYINT = 80
       ,@MailProfileName SYSNAME
       ,@MailRecipients VARCHAR(MAX) = NULL
	   )
AS /*

	This procedure scans the CommandLog table of Ola Hallengren's index defrag script
		(http://ola.hallengren.com/) and identifies indexes that are being rebuilt or
		reorganized regularly. The original code comes courtesy of Tim Ford, who blogged
		this at http://thesqlagentman.com/2013/10/whats-the-frequency-kenneth/. This routine
		expands his work. The procedure is meant to be run on a regular basis and will send
		an email when it finds indexes being defragged more than a specified number of times.
		It will also suggest rebuilding the index with a new fill factor to reduce the number
		of rebuilds or reorgs. The new fill factor recommended is one you specify, if the current
		fill factor of the index is 0 or 100%, otherwise it is 10% less than the current
		fill factor. It will not recommend anything less than 50%.

		IMPORTANT!!! This routine looks at the log table and the _current_ fill factor setting
		of the indexes. Therefore, in order to get appropriate fill factor recommendations,
		this routine should NOT be run more often than the log table is cleared. FOR EXAMPLE,
		IF YOU KEEP 30 DAYS WORTH OF LOG RECORDS IN YOUR TABLE, THIS ROUTINE SHOULD NOT BE RUN
		MORE OFTEN THAN EVERY 30 DAYS. This is because if you change the fill factor of an index
		it may result in stopping rebuilds, but the log table will still have the records of the
		prior rebuilds, which may trigger another recommendation to reduce the fill factor, which
		would then be 10% less than what you just changed it to.

		Do not blindly run the commands in the email notification! They are meant to make it easier
		for a DBA to change the fill factor, but they should be evaluated first to ensure they
		are appropriate for your situation.

		Input parameters:

		@DatabaseToInvestigate - The database to scan for excessive defrags
		@CommandLogDatabase - Database holding the CommandLog table of Ola's routine
		@CountThreshold - indexes being defragged equal to or greater than this number of times
							are flagged as being excessively defragged
		@OverRebuildThresholdNewFF - Initial starting fill factor suggestion for indexes that
							are being rebuilt more than @CountThreshold times. Only used when
							current fill factor is 0 or 100
		@OverReorgThresholdNewFF - Initial starting fill factor suggestion for indexes that
							are being reorganized more than @CountThreshold times. Only used when
							current fill factor is 0 or 100
		@MailProfileName - name of mail profile the routine can use to send email
		@MailRecipients - list of email addresses the notification will be sent to. Must be of
							the form xxx@xxx.xxx  Multiple addresses should be separated by a ;

		Shaun J. Stuart
		shaunjstuart.com

		v1.0 - 10/8/13
		v1.1 - 10/13/13 - SJS - Added support for case where current fill factor between 100 and the
								min thresholds passed in.

*/

       DECLARE @SQLstr VARCHAR(2000)
       DECLARE @EmailBody VARCHAR(MAX)
       DECLARE @CRLF CHAR(2)
       DECLARE @EmailSubject VARCHAR(200)

       SET @CRLF = CHAR(13) + CHAR(10)

/* Data validation */

       IF @DatabaseToInvestigate IS NULL
          OR @DatabaseToInvestigate = ''
          BEGIN
                SELECT  'A database to investigate must be specified.'
                RETURN
          END

       IF @CommandLogDatabase IS NULL
          OR @CommandLogDatabase = ''
          BEGIN
                SELECT  'A database holding the CommandLog table must be specified.'
                RETURN
          END

       IF @MailRecipients IS NULL
          OR @MailRecipients = ''
          BEGIN
                SELECT  'At least one email recipient must be specified.'
                RETURN
          END

       IF @MailRecipients NOT LIKE '%@%.%'
          BEGIN
                SELECT  'Email recipient not valid address format.'
                RETURN
          END

--============================================================================
-- INDEX REBUILD COUNT
--============================================================================

       IF EXISTS ( SELECT   name
                   FROM     tempdb.sys.objects
                   WHERE    name = '##Recommendations' )
          BEGIN
                DROP TABLE ##Recommendations
          END

       IF EXISTS ( SELECT   name
                   FROM     tempdb.sys.objects
                   WHERE    name = '##Index_History' )
          BEGIN
                DROP TABLE ##Index_History
          END

       IF EXISTS ( SELECT   name
                   FROM     tempdb.sys.objects
                   WHERE    name = '##Indexes' )
          BEGIN
                DROP TABLE ##Indexes
          END

       CREATE TABLE ##Recommendations
              (
               CommandLine VARCHAR(MAX)
              );

       CREATE TABLE ##Index_History
              (
               the_database SYSNAME
              ,the_schema SYSNAME
              ,the_object SYSNAME
              ,the_index SYSNAME
              ,index_type VARCHAR(13)
              ,fill_factor TINYINT
              ,rebuild_count INT NULL
              ,reorg_count INT NULL
              );

       CREATE TABLE ##Indexes
              (
               the_schema SYSNAME
              ,the_object SYSNAME
              ,the_index SYSNAME
              ,fill_factor TINYINT
              );

       SET @SQLstr = 'INSERT  INTO ##Indexes
        (the_schema
        ,the_object
        ,the_index
        ,fill_factor)
        SELECT  OBJECT_SCHEMA_NAME(SO.object_id, DB_ID('''
           + @DatabaseToInvestigate + ''')) AS the_schema
               ,SO.name AS the_object
               ,SI.name AS the_index
               ,SI.fill_factor
        FROM    ' + @DatabaseToInvestigate + '.sys.objects SO
                INNER JOIN ' + @DatabaseToInvestigate
           + '.sys.indexes SI ON SO.object_id = SI.object_id
        WHERE   SI.index_id > 0;'
		--PRINT @SQLstr
       EXEC (@SQLstr)

       SET @SQLstr = 'INSERT  INTO ##Index_History
        (the_database
        ,the_schema
        ,the_object
        ,the_index
        ,index_type
        ,fill_factor
        ,rebuild_count
        ,reorg_count)
        SELECT  C.DatabaseName
               ,C.SchemaName
               ,C.ObjectName
               ,C.IndexName
               ,CASE C.IndexType
                  WHEN 1 THEN ''Clustered''
                  ELSE ''Non-Clustered''
                END AS IndexType
               ,IX.fill_factor
               ,COUNT(C.ID) AS rebuild_count
               ,0 AS reorg_count
        FROM    ' + @CommandLogDatabase + '.dbo.CommandLog C
                LEFT JOIN ##Indexes IX ON C.SchemaName = IX.the_schema
                                          AND C.ObjectName = IX.the_object
                                          AND C.IndexName = IX.the_index
        WHERE   C.CommandType = ''ALTER_INDEX''
                AND C.ObjectType = ''U''
                AND C.Command LIKE ''%REBUILD%''
                AND C.DatabaseName = ''' + @DatabaseToInvestigate + '''
        GROUP BY C.DatabaseName
               ,C.SchemaName
               ,C.ObjectName
               ,C.IndexName
               ,IndexType
               ,IX.fill_factor;'
		--PRINT @SQLstr
       EXEC (@SQLstr)
--============================================================================
-- INDEX REORGANIZE (ONLY) COUNT
--============================================================================
       SET @SQLstr = 'INSERT  INTO ##Index_History
        (the_database
        ,the_schema
        ,the_object
        ,the_index
        ,index_type
        ,fill_factor
        ,rebuild_count
        ,reorg_count)
        SELECT  C.DatabaseName
               ,C.SchemaName
               ,C.ObjectName
               ,C.IndexName
               ,CASE C.IndexType
                  WHEN 1 THEN ''Clustered''
                  ELSE ''Non-Clustered''
                END AS IndexType
               ,IX.fill_factor
               ,0 AS rebuild_count
               ,COUNT(C.ID) AS reorg_count
        FROM    ' + @CommandLogDatabase
           + '.dbo.CommandLog C
                LEFT JOIN ##Indexes IX ON C.SchemaName = IX.the_schema
                                          AND C.ObjectName = IX.the_object
                                          AND C.IndexName = IX.the_index
                LEFT JOIN ##Index_History IH ON C.DatabaseName = IH.the_database
                                                AND C.SchemaName = IH.the_schema
                                                AND C.ObjectName = IH.the_object
                                                AND C.IndexName = IH.the_index
        WHERE   C.CommandType = ''ALTER_INDEX''
                AND C.ObjectType = ''U''
                AND C.Command LIKE ''%REORGANIZE%''
                AND C.DatabaseName = ''' + @DatabaseToInvestigate + '''
                AND IH.the_database IS NULL
        GROUP BY C.DatabaseName
               ,C.SchemaName
               ,C.ObjectName
               ,C.IndexName
               ,IndexType
               ,IX.fill_factor;'
		--PRINT @SQLstr
       EXEC (@SQLstr)
--========================================================
-- ACCOUNT FOR INDEXES BOTH REBUILT AND REORGANIZED
--========================================================
       SET @SQLstr = 'UPDATE  ##Index_History
					  SET     reorg_count = C2.reorganize_count
					  FROM    ##Index_History IH
					  INNER JOIN (SELECT  C.DatabaseName
										 ,C.SchemaName
										 ,C.ObjectName
										 ,C.IndexName
										 ,COUNT(C.ID) AS reorganize_count
                      FROM    ' + @CommandLogDatabase + '.dbo.CommandLog C
                      WHERE   C.CommandType = ''ALTER_INDEX''
                              AND C.ObjectType = ''U''
                              AND C.Command LIKE ''%REORGANIZE%''
                              AND C.DatabaseName = ''' + @DatabaseToInvestigate
           + '''
			          GROUP BY C.DatabaseName
                            ,C.SchemaName
                            ,C.ObjectName
                            ,C.IndexName
                    ) C2 ON IH.the_database = C2.DatabaseName
                            AND IH.the_schema = C2.SchemaName
                            AND IH.the_object = C2.ObjectName
                            AND IH.the_index = C2.IndexName
					  WHERE   IH.rebuild_count > 0'
--print @SQLstr
       EXEC (@SQLstr)
--============================================================================
-- RETURN THE RESULTS
--============================================================================

       SELECT   the_database
               ,the_schema
               ,the_object
               ,the_index
               ,index_type
               ,ISNULL(CONVERT(VARCHAR(20), fill_factor), 'No longer exists') AS fill_factor
               ,rebuild_count
               ,reorg_count
       FROM     ##Index_History
       WHERE    rebuild_count >= @CountThreshold
                OR reorg_count >= @CountThreshold
       ORDER BY the_database
               ,rebuild_count DESC
               ,reorg_count DESC
               ,the_object
               ,the_index

       INSERT   INTO ##Recommendations
                (CommandLine)
                SELECT  'USE [' + the_database + ']; ' + @CRLF
                        + 'ALTER INDEX [' + the_index + '] ON [' + the_schema
                        + '].[' + the_object + '] ' + @CRLF
                        + 'REBUILD PARTITION = ALL WITH (PAD_INDEX = OFF, '
                        + 'STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, ONLINE = ON, '
                        + @CRLF
                        + 'ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = '
                        + CASE WHEN fill_factor = 0
                               THEN CAST(@OverRebuildThresholdNewFF AS VARCHAR(3))
                               WHEN fill_factor = 100
                               THEN CAST(@OverRebuildThresholdNewFF AS VARCHAR(3))
                               WHEN fill_factor <= @OverRebuildThresholdNewFF
                               THEN CAST(fill_factor - 10 AS VARCHAR(3))
                               WHEN (fill_factor > @OverRebuildThresholdNewFF
                                     AND fill_factor < 100)
                               THEN CAST(@OverRebuildThresholdNewFF AS VARCHAR(3))
                               ELSE NULL /* unknown case */
                          END + '); ' + @CRLF AS RebuildCommand
                FROM    ##Index_History
                WHERE   rebuild_count >= @CountThreshold
                        AND (fill_factor = 0
                             OR fill_factor >= 60)
                UNION ALL
                SELECT  'USE [' + the_database + ']; ' + @CRLF
                        + 'ALTER INDEX [' + the_index + '] ON [' + the_schema
                        + '].[' + the_object + '] ' + @CRLF
                        + 'REBUILD PARTITION = ALL WITH (PAD_INDEX = OFF, '
                        + 'STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = ON, ONLINE = ON, '
                        + @CRLF
                        + 'ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = '
                        + CASE WHEN fill_factor = 0
                               THEN CAST(@OverReorgThresholdNewFF AS VARCHAR(3))
                               WHEN fill_factor = 100
                               THEN CAST(@OverReorgThresholdNewFF AS VARCHAR(3))
                               WHEN fill_factor <= @OverReorgThresholdNewFF
                               THEN CAST(fill_factor - 10 AS VARCHAR(3))
                               WHEN (fill_factor > @OverReorgThresholdNewFF
                                     AND fill_factor < 100)
                               THEN CAST(@OverReorgThresholdNewFF AS VARCHAR(3))
                               ELSE NULL /* unknown case */
                          END + '); ' + @CRLF AS RebuildCommand
                FROM    ##Index_History
                WHERE   reorg_count >= @CountThreshold
                        AND (fill_factor = 0
                             OR fill_factor >= 60)

       IF EXISTS ( SELECT   1
                   FROM     ##Recommendations )
          BEGIN
                SET @EmailBody = @CRLF
                    + 'Analysis of the index defrag log table has been performed and it appears '
                    + 'there are indexes that are being repeatedly defragged. Repeated defragging may '
                    + 'indicate the indexes need a lower fill factor to reduce page splits. Repeated '
                    + 'defragging also wastes resources and generates large amounts of potentially '
                    + 'unnecessary transaction log entries. The parameters used to generate this report '
                    + 'are: ' + @CRLF + @CRLF + 'Server: ' + @@SERVERNAME
                    + @CRLF + 'Database: ' + @DatabaseToInvestigate + @CRLF
                    + 'Min number of rebuilds / reorgs flagged as excessive: '
                    + CAST(@CountThreshold AS VARCHAR(3)) + '. ' + @CRLF
                    + 'Starting suggested fill factor for excessive rebuilds: '
                    + CAST(@OverRebuildThresholdNewFF AS VARCHAR(3)) + '. '
                    + @CRLF
                    + 'Starting suggested fill factor for excessive reorgs: '
                    + CAST(@OverReorgThresholdNewFF AS VARCHAR(3)) + '. '
                    + @CRLF + @CRLF + @CRLF
                    + 'Below are the suggested commands to change the fill factor on the indexes in question. '
                    + 'These commands should be carefully evaluated before being run to ensure they are '
                    + 'appropriate for the situation! In particular, if this routine is run more frequently '
                    + 'than the defrag log table is cleared, it may result in inappropriate fill factor '
                    + 'recommendations.' + @CRLF + @CRLF + @CRLF

                IF EXISTS ( SELECT  1
                            FROM    ##Index_History
                            WHERE   (rebuild_count >= @CountThreshold
                                     OR reorg_count >= @CountThreshold)
                                    AND fill_factor <= 50
                                    AND fill_factor > 0 )
                   BEGIN
                         SET @EmailBody = @EmailBody + @CRLF + @CRLF
                             + 'Note: Some indexes were found that are being defragged regularly and have a fill factor '
                             + 'setting of 50 or less. Reducing the fill factor further is not generally recommended '
                             + 'and further investigation is warranted.'
                             + @CRLF + @CRLF

                         INSERT INTO ##Recommendations
                                (CommandLine)
                                SELECT  '/* Database [' + the_database
                                        + '], index [' + the_index + ']'
                                        + ' on [' + the_schema + '].['
                                        + the_object + '] '
                                        + 'is being repeatedly rebuilt and its fill factor '
                                        + 'is already set to 50 or less. This likely is worth a deeper '
                                        + 'investigation. */' AS RebuildCommand
                                FROM    ##Index_History
                                WHERE   rebuild_count >= @CountThreshold
                                        AND fill_factor <= 50
                                        AND fill_factor > 0
                                UNION ALL
                                SELECT  '/* Database [' + the_database
                                        + '], index [' + the_index + ']'
                                        + ' on [' + the_schema + '].['
                                        + the_object + '] '
                                        + 'is being repeatedly reorganized and its fill factor '
                                        + 'is already set to 50 or less. This likely is worth a deeper '
                                        + 'investigation. */' AS RebuildCommand
                                FROM    ##Index_History
                                WHERE   reorg_count >= @CountThreshold
                                        AND fill_factor <= 50
                                        AND fill_factor > 0

                   END
                SET @EmailSubject = 'Excessive Index Defragmenting found on '
                    + @@Servername

                EXEC msdb.dbo.sp_send_dbmail
                    @profile_name = @MailProfileName
                   ,@recipients = @MailRecipients
                   ,@subject = @EmailSubject
                   ,@body = @EmailBody
                   ,@query = 'SELECT * FROM ##Recommendations'
                   ,@query_result_width = 32767
                   ,@query_no_truncate = 1

          END

--============================================================================
-- CLEANUP THE MESS
--============================================================================

IF EXISTS ( SELECT  name
            FROM    tempdb.sys.objects
            WHERE   name = '##Index_History' )
   BEGIN
         DROP TABLE ##Index_History
   END

IF EXISTS ( SELECT  name
            FROM    tempdb.sys.objects
            WHERE   name = '##Indexes' )
   BEGIN
         DROP TABLE ##Indexes
   END

IF EXISTS ( SELECT  name
            FROM    tempdb.sys.objects
            WHERE   name = '##Recommendations' )
   BEGIN
         DROP TABLE ##Recommendations
   END


To celebrate the release of my first published work, I'm giving away a copy of Tribal SQL. To have a chance of winning, just leave a comment on this post before November 11, 2013 6:00 AM Mountain Standard Time. I'll assign a number to each comment in the order they were received and use www.random.org to randomly pick the winning number. Be sure to enter your email address when leaving a comment so I can contact you! If you don't want to leave your email address, be sure to check back on this post or follow my Twitter account to see if you won!


If you don't win, you can always order the book from Amazon. It makes a great Christmas gift!

Authors and editors of the book are not eligible. Due to shipping costs, I can only ship to addresses in the United States. If you are outside of the U.S., feel free to enter, but if you win, you must provide a U.S. shipping address for me to ship to. Only one entry per person please. Winner has 20 days to contact me to claim their prize before an alternate winner is chosen.

As an added bonus, the winner will also receive one free enrollment to my online SQL Server 101 course at Udemy.com! If you are already a student of that course, you'll receive one free enrollment to my next course, which should be available soon.

Good luck!

 

UPDATE: The contest is closed. I have sent email to the winner and am waiting to hear back.


Note: For more details about this script and the variables and outputs, see my post here.

I've made three changes to my script that checks for new database objects to compress. First, it now encloses table and index names in square brackets to prevent errors with object names that are also keywords or that include dashes (such as those in Sharepoint databases). Second, I added some code to handle situations where a table or index was scanned initially but had been deleted by the time the procedure got around to performing the compression. (Because the script is designed to be run over multiple days, a system that creates and deletes tables on a regular basis may run into this situation.) Finally, I fixed a bug where the scan to determine the compression estimates was not limited to the database specified, which was causing some objects to be incorrectly tagged as no longer existing.
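
The bracket change amounts to wrapping each name before it gets spliced into the dynamic SQL. My script concatenates the brackets directly; the built-in QUOTENAME function does the same job and also escapes any closing brackets inside the name (shown here with a made-up name):

SELECT QUOTENAME('2013-Q4 Results') AS BracketedName;
-- Returns [2013-Q4 Results], which is safe to splice into dynamic SQL
-- even though the name contains a dash and a space.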

As always, don't run any code from the internet unless you understand what it is doing. Let me know if you have any other suggestions for improvements.


CREATE PROC [dbo].[up_CompressDatabase]
      (
       @minCompression FLOAT = .25 /* e.g. .25 for minimum of 25% compression */
      ,@MaxRunTimeInMinutes INT = 60
      ,@ExecuteCompressCommand BIT = 0	/* 1 to execute command */
      ,@DBToScan SYSNAME /* database to compress */
      )
AS
      SET nocount ON;

/*
	Original script by Paul Nielsen www.SQLServerBible.com March 13, 2008
	Modified by Shaun J. Stuart www.shaunjstuart.com February 27, 2013

  Sets compression for all objects and indexes in the database needing adjustment
  If estimated gain is equal to or greater than min compression parameter
    then enables row or page compression, whichever gives greater space savings
  If row and page have same gain then, enables row compression
  If estimated gain is less than min compression parameter, then compression is set to none

  - SJS 2/27/13
  - Added @MaxRunTimeInMinutes to limit run length (checked after each command, so
		may run longer) Note: This timer only applies to the actual compression process.
		It does not limit the time it takes to perform a scan of all the tables / indexes
		and estimate compression savings.
  - Sorted compression cursor to compress smallest tables first
  - Added flag to execute compression command or not
  - Added sort in tempdb option (always)
  - Re-wrote slightly to persist initial scan results. Will only re-scan after all tables
		have been processed

	- SJS 7/17/13
	- Modified to only look at data that is not already compressed
	- Modified so that any items with null none_size, page_size, or row_size are no longer set as
		AlreadyProcessed and now have their sizes calculated (i.e. if process was cancelled before
		initial scan was completed)
	- Modified so that time limit now applies to compression estimate process as well as actual
		compression process
	- Only look at partitions that have at least one row

	- SJS 8/6/13
	- Changed name of dbEstimate table to dbCompressionEstimates for clarity
	- Added StatusText column to dbCompressionEstimates and added status messages
	- Modified to take database name as input parameter to allow use in utility database
		instead of the DB being compressed (Table dbCompressionEstimates, which stores sizes and
		compression estimates, is still created in the current database.)
	- Note: Compression estimates are only made for the database supplied as an input
		parameter. However, that database name is stored in the dbCompressionEstimates table and
		will be read from there and used when actually performing the compression. This allows you to
		create estimates only for multiple databases (using @ExecuteCompressCommand=0), then perform
		the compression	across multiple databases later (with @ExecuteCompressCommand=1).
	- Removed all references to code that checked already compressed data since this routine now only
		looks at uncompressed data.

	- SJS 8/21/13
	- Put []s around table and index names in compression commands.

	- SJS 10/7/13
	- Added check to make sure table / index still exists before estimating or performing compression.
	- Fixed bug in cursor to determine compression estimates (left out db name in where clause)

*/

      IF ISNULL(@DBToScan, '') NOT IN (SELECT   [name]
                                       FROM     sys.databases)
         BEGIN
               SELECT   'Database ' + ISNULL(@DBToScan, 'NULL')
                        + ' not found on server.'
               RETURN
         END

      DECLARE @CompressedCount INT;
      SET @CompressedCount = 0;

      DECLARE @SQL NVARCHAR(MAX);
      DECLARE @ParmDefinition NVARCHAR(100);
      DECLARE @TestResult NVARCHAR(20);
      DECLARE @CheckString NVARCHAR(1000);

      DECLARE @StartTime DATETIME2;
      SET @StartTime = CURRENT_TIMESTAMP;

      DECLARE @CurrentDatabase SYSNAME;
      SET @CurrentDatabase = DB_NAME()

      IF OBJECT_ID('tempdb..##ObjEst', 'U') IS NOT NULL
         DROP TABLE ##ObjEst

      CREATE TABLE ##ObjEst
             (
              PK INT IDENTITY
                     NOT NULL
                     PRIMARY KEY
             ,object_name VARCHAR(250)
             ,schema_name VARCHAR(250)
             ,index_id INT
             ,partition_number INT
             ,size_with_current_compression_setting BIGINT
             ,size_with_requested_compression_setting BIGINT
             ,sample_size_with_current_compression_setting BIGINT
             ,sample_size_with_requested_compression_setting BIGINT
             );

      IF NOT EXISTS ( SELECT    1
                      FROM      sys.objects
                      WHERE     object_id = OBJECT_ID(N'[dbo].[dbCompressionEstimates]')
                                AND type IN (N'U') )
         BEGIN
               CREATE TABLE dbo.dbCompressionEstimates
                      (
                       PK INT IDENTITY
                              NOT NULL
                              PRIMARY KEY
                      ,DatabaseName SYSNAME
                      ,schema_name VARCHAR(250)
                      ,object_name VARCHAR(250)
                      ,index_id INT
                      ,ixName VARCHAR(255)
                      ,ixType VARCHAR(50)
                      ,partition_number INT
                      ,data_compression_desc VARCHAR(50)
                      ,None_Size INT
                      ,Row_Size INT
                      ,Page_Size INT
                      ,AlreadyProcessed BIT
                      ,StatusText VARCHAR(75)
                      );
         END

  /*
 If all objects have been processed, rescan and start fresh. Useful for catching
  added objects since last scan. But beware - this is I/O intensive and can take a while.
*/

      IF NOT EXISTS ( SELECT    1
                      FROM      dbo.dbCompressionEstimates
                      WHERE     AlreadyProcessed = 0
                                AND DatabaseName = @DBToScan )
         BEGIN
               DELETE   FROM dbo.dbCompressionEstimates
               WHERE    DatabaseName = @DBToScan;

               SET @SQL = 'USE [' + @DBToScan + '];
						INSERT   INTO [' + @CurrentDatabase
                   + '].dbo.dbCompressionEstimates
								(DatabaseName
								,schema_name
								,object_name
								,index_id
								,ixName
								,ixType
								,partition_number
								,data_compression_desc
								,AlreadyProcessed
								,StatusText)
                        SELECT  ''' + @DBToScan + '''
								,S.name
                               ,o.name
                               ,I.index_id
                               ,I.name
                               ,I.type_desc
                               ,P.partition_number
                               ,P.data_compression_desc
                               ,0 AS AlreadyProcessed
							   ,''Initial load'' AS StatusText
                        FROM    [' + @DBToScan + '].sys.schemas AS S
                                JOIN [' + @DBToScan
                   + '].sys.objects AS O ON S.schema_id = O.schema_id
                                JOIN [' + @DBToScan
                   + '].sys.indexes AS I ON o.object_id = I.object_id
                                JOIN [' + @DBToScan
                   + '].sys.partitions AS P ON I.object_id = P.object_id
                                                            AND I.index_id = p.index_id
                        WHERE   O.TYPE = ''U''
                                AND P.data_compression_desc = ''NONE''
								AND P.rows > 0;'		-- only look at objects with data
               EXEC (@SQL)
         END
			-- Determine Compression Estimates
      DECLARE @PK INT
             ,@DatabaseName SYSNAME
             ,@Schema VARCHAR(150)
             ,@object VARCHAR(150)
             ,@DAD VARCHAR(25)
             ,@partNO VARCHAR(3)
             ,@indexID VARCHAR(3)
             ,@ixName VARCHAR(250)
             ,@ixType VARCHAR(50)
             ,@Recommended_Compression VARCHAR(10);

      DECLARE cCompress CURSOR FAST_FORWARD
      FOR
              SELECT    schema_name
                       ,object_name
                       ,index_id
                       ,partition_number
                       ,data_compression_desc
                       ,ixName
              FROM      dbo.dbCompressionEstimates
              WHERE     (None_size IS NULL
                         OR Row_Size IS NULL
                         OR Page_size IS NULL)
                        AND DatabaseName = @DBToScan;

      OPEN cCompress;

      FETCH cCompress INTO @Schema, @object, @indexID, @partNO, @DAD, @ixName;

      WHILE @@Fetch_Status = 0
            BEGIN
							/* evaluate objects with no compression */
                  IF DATEDIFF(mi, @StartTime, CURRENT_TIMESTAMP) < @MaxRunTimeInMinutes
                     BEGIN
							/* First, make sure the table / index still exists (in case this
								process is run over multiple days) */

                           IF @indexID = 0
                              BEGIN /* heaps */
                                    SET @CheckString = 'IF object_ID('''
                                        + @DBToScan + '.' + @Schema + '.'
                                        + @object
                                        + ''') IS NULL
									BEGIN
										SELECT @TestResultOUT = ''Does Not Exist''
									END
									ELSE
									BEGIN
										SELECT @TestResultOUT = ''Exists''
									END';
                              END
                           ELSE
                              BEGIN /* indexes - OBJECT_ID() does not work on index
										names, so check sys.indexes on the parent
										table instead */
                                    SET @CheckString = 'IF NOT EXISTS (SELECT 1
										FROM [' + @DBToScan + '].sys.indexes
										WHERE name = ''' + @ixName + '''
										AND object_id = OBJECT_ID(''' + @DBToScan
                                        + '.' + @Schema + '.' + @object + '''))
									BEGIN
										SELECT @TestResultOUT = ''Does Not Exist''
									END
									ELSE
									BEGIN
										SELECT @TestResultOUT = ''Exists''
									END';
                              END

                           SET @ParmDefinition = '@TestResultOUT varchar(20) OUTPUT';
                           EXECUTE sp_executesql
                            @CheckString
                           ,@ParmDefinition
                           ,@TestResultOUT = @TestResult OUTPUT;
                           IF @TestResult = 'Exists'
                              BEGIN

                                    IF @DAD = 'none'
                                       BEGIN
									/* estimate Page compression */
                                             SET @SQL = 'USE [' + @DBToScan
                                                 + '];
										INSERT  ##ObjEst
												(object_name
												,schema_name
												,index_id
												,partition_number
												,size_with_current_compression_setting
												,size_with_requested_compression_setting
												,sample_size_with_current_compression_setting
												,sample_size_with_requested_compression_setting)
												EXEC sp_estimate_data_compression_savings
													@Schema_name = ' + @Schema
                                                 + '
												   ,@object_name = ' + @object
                                                 + '
												   ,@index_id = ' + @indexID
                                                 + '
												   ,@partition_number = '
                                                 + @partNO
                                                 + '
												   ,@data_compression = ''page'';'
                                             EXEC (@SQL)

                                             UPDATE dbo.dbCompressionEstimates
                                             SET    none_size = O.size_with_current_compression_setting
                                                   ,page_size = O.size_with_requested_compression_setting
                                                   ,StatusText = 'Compression estimate 50% complete'
                                             FROM   dbo.dbCompressionEstimates D
                                                    JOIN ##ObjEst O ON D.Schema_name = O.Schema_Name
                                                              AND D.Object_name = O.object_name
                                                              AND D.index_id = O.index_id
                                                              AND D.partition_number = O.partition_number;

                                             DELETE ##ObjEst;

									-- estimate Row compression
                                             SET @SQL = 'USE [' + @DBToScan
                                                 + '];
										INSERT  ##ObjEst
												(object_name
												,schema_name
												,index_id
												,partition_number
												,size_with_current_compression_setting
												,size_with_requested_compression_setting
												,sample_size_with_current_compression_setting
												,sample_size_with_requested_compression_setting)
												EXEC sp_estimate_data_compression_savings
													@Schema_name = ' + @Schema
                                                 + '
												   ,@object_name = ' + @object
                                                 + '
												   ,@index_id = ' + @indexID
                                                 + '
												   ,@partition_number = '
                                                 + @partNO
                                                 + '
												   ,@data_compression = ''ROW'';'
                                             EXEC (@SQL)

                                             UPDATE dbo.dbCompressionEstimates
                                             SET    row_size = O.size_with_requested_compression_setting
                                                   ,StatusText = 'Compression estimate 100% complete'
                                             FROM   dbo.dbCompressionEstimates D
                                                    JOIN ##ObjEst O ON D.Schema_name = O.Schema_Name
                                                              AND D.Object_name = O.object_name
                                                              AND D.index_id = O.index_id
                                                              AND D.partition_number = O.partition_number;

                                             DELETE ##ObjEst;
                                       END /* end evaluating objects with no compression */
                              END
                           ELSE /* table or index no longer exists */
                              BEGIN
                                    UPDATE  dbo.dbCompressionEstimates
                                    SET     AlreadyProcessed = 1
                                           ,StatusText = 'Object no longer exists at compression estimate stage'
                                    WHERE   schema_name = @Schema
                                            AND object_name = @object
                                            AND index_id = @indexID
                                            AND partition_number = @partNO
                                            AND data_compression_desc = @DAD
                                            AND DatabaseName = @DBToScan;
                              END

                           FETCH NEXT FROM cCompress INTO @Schema, @object,
                                 @indexID, @partNO, @DAD, @ixName
                     END -- end time check block
                  ELSE
                     BEGIN
                           PRINT 'Max runtime reached. No compression performed. Compression estimate still processing. Exiting...';
                           CLOSE cCompress
                           DEALLOCATE cCompress
                           DROP TABLE ##ObjEst
                           RETURN
                     END
            END -- end while loop

      CLOSE cCompress
      DEALLOCATE cCompress

      PRINT 'Initial scan complete.'
         --END
  /* End evaluating compression savings. Now do the actual compressing. */

      PRINT 'Beginning compression.';

  /* Do not process objects that do not meet our compression criteria */

      UPDATE    dbo.dbCompressionEstimates
      SET       AlreadyProcessed = 1
               ,StatusText = 'Best compression method less than minCompression threshold'
      WHERE     (1 - (CAST(Row_Size AS FLOAT) / None_Size)) < @minCompression
                AND (Row_Size <= Page_Size)
                AND None_Size > 0
				AND AlreadyProcessed = 0;

      UPDATE    dbo.dbCompressionEstimates
      SET       AlreadyProcessed = 1
               ,StatusText = 'Best compression method less than minCompression threshold'
      WHERE     (1 - (CAST(Page_Size AS FLOAT) / None_Size)) < @minCompression
                AND (Page_Size <= Row_Size)
                AND None_Size > 0
				AND AlreadyProcessed = 0;

  /* Do not set compression on empty objects */

      UPDATE    dbo.dbCompressionEstimates
      SET       AlreadyProcessed = 1
               ,StatusText = 'No data in table to compress'
      WHERE     None_size = 0
				AND AlreadyProcessed = 0;

 -- set the compression
      DECLARE cCompress CURSOR FAST_FORWARD
      FOR
              SELECT    DatabaseName
                       ,schema_name
                       ,object_name
                       ,partition_number
                       ,ixName
                       ,ixType
                       ,CASE WHEN (1 - (CAST(Row_Size AS FLOAT) / none_Size)) >= @minCompression
                                  AND (Row_Size <= Page_Size) THEN 'Row'
                             WHEN (1 - (CAST(page_Size AS FLOAT) / none_Size)) >= @minCompression
                                  AND (Page_Size <= row_Size) THEN 'Page'
                             ELSE 'None'
                        END AS Recommended_Compression
                       ,PK
              FROM      dbo.dbCompressionEstimates
              WHERE     None_Size <> 0
                        AND (CASE WHEN (1 - (CAST(Row_Size AS FLOAT)
                                             / none_Size)) >= @minCompression
                                       AND (Row_Size <= Page_Size) THEN 'Row'
                                  WHEN (1 - (CAST(page_Size AS FLOAT)
                                             / none_Size)) >= @minCompression
                                       AND (Page_Size <= row_Size) THEN 'Page'
                                  ELSE 'None'
                             END <> data_compression_desc)
                        AND AlreadyProcessed = 0
              ORDER BY  None_Size ASC;		/* start with smallest tables first */

      OPEN cCompress

      FETCH cCompress INTO @DatabaseName, @Schema, @object, @partNO, @ixName,
            @ixType, @Recommended_Compression, @PK  -- prime the cursor;

      WHILE @@Fetch_Status = 0
            BEGIN

                  IF @ixType = 'Clustered'
                     OR @ixType = 'heap'
                     BEGIN
                           SET @SQL = 'USE [' + @DatabaseName + '];
								ALTER TABLE [' + @Schema + '].[' + @object
                               + '] Rebuild with (data_compression = '
                               + @Recommended_Compression
                               + ', SORT_IN_TEMPDB=ON)';

                           SET @CheckString = 'IF object_ID('''
                               + @DatabaseName + '.' + @Schema + '.' + @object
                               + ''') IS NULL
								BEGIN
									SELECT @TestResultOUT = ''Does Not Exist''
								END
								ELSE
								BEGIN
									SELECT @TestResultOUT = ''Exists''
								END';
                     END

                  ELSE
                     BEGIN
                           SET @SQL = 'USE [' + @DatabaseName + '];
								ALTER INDEX [' + @ixName + '] on [' + @Schema
                               + '].[' + @object
                               + '] Rebuild with (data_compression = '
                               + @Recommended_Compression
                               + ',SORT_IN_TEMPDB=ON)';

                           /* OBJECT_ID() does not work on index names, so check
								sys.indexes on the parent table instead */
                           SET @CheckString = 'IF NOT EXISTS (SELECT 1
								FROM [' + @DatabaseName + '].sys.indexes
								WHERE name = ''' + @ixName + '''
								AND object_id = OBJECT_ID(''' + @DatabaseName + '.'
                               + @Schema + '.' + @object + '''))
								BEGIN
									SELECT @TestResultOUT = ''Does Not Exist''
								END
								ELSE
								BEGIN
									SELECT @TestResultOUT = ''Exists''
								END';
                     END

                  IF DATEDIFF(mi, @StartTime, CURRENT_TIMESTAMP) < @MaxRunTimeInMinutes
                     BEGIN
                           IF @ExecuteCompressCommand = 1
                              BEGIN

							/* verify that table / index still exists before doing anything */

                                    SET @ParmDefinition = '@TestResultOUT varchar(20) OUTPUT';
                                    EXECUTE sp_executesql
                                        @CheckString
                                       ,@ParmDefinition
                                       ,@TestResultOUT = @TestResult OUTPUT;
                                    IF @TestResult = 'Exists'
                                       BEGIN

                                             UPDATE dbo.dbCompressionEstimates
                                             SET    StatusText = 'Compressing data...'
                                             WHERE  PK = @PK;

                                             PRINT 'Compressing table/index: '
                                                   + @Schema + '.' + @object
                                                   + ' in database '
                                                   + @DatabaseName;
                                             EXEC sp_executesql
                                                @SQL;

                                             UPDATE dbo.dbCompressionEstimates
                                             SET    AlreadyProcessed = 1
                                                   ,StatusText = 'Compression complete'
                                             WHERE  PK = @PK;

                                             SET @CompressedCount = @CompressedCount
                                                 + 1;
                                       END
                                    ELSE
                                       BEGIN
                                             UPDATE dbo.dbCompressionEstimates
                                             SET    AlreadyProcessed = 1
                                                   ,StatusText = 'Object no longer exists at compression stage'
                                             WHERE  PK = @PK;
                                       END
                              END
                           ELSE
                              BEGIN
                                    PRINT 'Command execution not enabled. Command is:'
                                          + @SQL;
                              END
                     END
                  ELSE
                     BEGIN
                           PRINT 'Max runtime reached. Some compression performed. Exiting...';
                           BREAK	/* cursor and ##ObjEst are cleaned up below the loop */
                     END

                  FETCH cCompress INTO @DatabaseName, @Schema, @object,
                        @partNO, @ixName, @ixType, @Recommended_Compression,
                        @PK;
            END

      CLOSE cCompress;
      DEALLOCATE cCompress;

      PRINT 'Objects compressed: ' + CONVERT(VARCHAR(10), @CompressedCount);
      DROP TABLE ##ObjEst
      RETURN
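
A typical usage pattern, given the parameters above, is a scan-only pass first and an execute pass later (the database name here is a placeholder):

-- Pass 1: build / refresh the compression estimates without changing anything
EXEC dbo.up_CompressDatabase
     @minCompression = 0.25
    ,@MaxRunTimeInMinutes = 60
    ,@ExecuteCompressCommand = 0
    ,@DBToScan = 'MyAppDB';

-- Pass 2 (later): actually compress, smallest objects first
EXEC dbo.up_CompressDatabase
     @minCompression = 0.25
    ,@MaxRunTimeInMinutes = 60
    ,@ExecuteCompressCommand = 1
    ,@DBToScan = 'MyAppDB';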


Sharepoint and SQL Server have a somewhat rocky relationship. Sharepoint likes to use GUIDs everywhere and likes to handle database maintenance on its own. It sometimes seems like the database design was specifically concocted as an example of how NOT to design databases for best performance. On the other hand, Sharepoint is really good about automatically leveraging features of SQL Server if they are available. For example, if Sharepoint detects it is using SQL Server Enterprise edition, it will automatically use Enterprise-only features. One such feature is partitioned tables and, if you are not aware of this, it can really mess up your backup and restore strategies and scripts.

In my case, I have a Sharepoint server that is configured to use Web Analytics. This Sharepoint feature provides lots of reports regarding the way people use your website. If Sharepoint is using SQL Server Enterprise Edition to store its data, Web Analytics will automatically enable table partitioning for its tables. It aggregates data on a weekly basis and stores each week's worth of data in its own partition. This provides improved reporting performance.

From a DBA standpoint, this means your web analytics database will get a new .NDF data file added to it each week. If your Sharepoint admin did not set up a data retention policy, you can end up, as time passes, with hundreds of database files for your web analytics database. This may cause problems with any backup and restore scripts you have or any automated processes that have to manipulate database files. In fact, this is how I discovered the behavior. I have a routine that randomly selects backups to restore to a test server to check their validity. It parses the header of each backup file and builds the command to restore the database to a test server. When it encountered my Sharepoint web analytics database, the number of .NDF files had grown so large that the restore command exceeded the length of the variable I used to construct it.

This Sharepoint behavior is documented here.

This is how I tune queries.

I don't spend a lot of my time tuning queries. Many of the databases in my company are used by third party products and we have limited control over the changes we can make to those. However, we do have several databases that are used by home-grown applications and these we have much more latitude to modify.

Last week, my monitoring software started sending me alerts that a job was blocking other processes. This went on for days, with the length of blocking increasing each day. By Friday, the job was blocking for 2.5 hours. I decided to take a look.

My alert identified the statement that was causing the blocking and it was a stored procedure. I opened it up and started looking at the code. The first 100 lines or so weren't too bad. I found one query that was non-SARGable. The WHERE clause was

WHERE datediff(day, h.Transaction_Date, getdate()) < 180 )

I changed it to the SARGable (and, give or take time-of-day rounding, equivalent)

WHERE h.Transaction_Date > DATEADD(DAY, -180, GETDATE()) )

But that couldn't have been what was causing the procedure to run for so long, especially considering the table it references only had about 10,000 rows. I kept reading and got to the final statement in the procedure. As soon as I saw it, I knew I had found my culprit:

SELECT DISTINCT
        MemberNumber
       ,Account_Number
       ,KeyDate = dbo.fnFormatDate(GETUTCDATE(), 'M/D/YYYY')
       ,Branch_Name
       ,Employee_Name
       ,Transaction_Date
       ,Transaction_Description
       ,Call_Type
       ,Engagement_Number
       ,Engagement_Type
       ,Address_Name
       ,Supervisor
       ,Call_Duration
FROM    dbo.Member_Data
WHERE   MemberNumber + Account_Number + dbo.fnFormatDate(GETDATE(), 'M/D/YYYY')
           NOT IN (SELECT  MemberNumber + Account_Number + dbo.fnFormatDate(GETDATE(),
                                                          'M/D/YYYY')
        FROM    dbo.Member_Survey_Results)

I didn't even need to look at existing indexes on the table to see if this would be a problem. There are no indexes in the world that would help this query. The WHERE clause is concatenating two columns and then appending the current date, which SQL must first format by calling a user defined function! Yikes!

So I started looking into this to find out how I could make it better. The first thing I realized was that appending the current date in the WHERE clause doesn't do anything. It's adding identical characters to the end of the values being compared. These will always match, so we can toss that out. Not only will that make what we are comparing shorter, but it will also eliminate the call to the user defined function, which would need to happen for each row in each table.

Looking at the remaining terms, we can see the query is looking for all records in the Member_Data table that do not exist in the Member_Survey_Results table, and the comparison is being done using the MemberNumber and Account_Number columns. The same selection can be accomplished with a left outer join. With a left outer join, records not present in the right-hand table are returned with NULL as the value for that table's columns. So to find the records that are not present in the right-hand table, we just need to filter for one (or more) of that table's columns being NULL. (Be careful here. If your right-hand table has columns where NULL is a valid value, you would not want to use those columns in your WHERE clause.)

We can re-write this query as follows:

SELECT DISTINCT
        mmd.MemberNumber
       ,mmd.Account_Number
       ,KeyDate = dbo.fnFormatDate(GETUTCDATE(), 'M/D/YYYY')
       ,mmd.Branch_Name
       ,mmd.Employee_Name
       ,mmd.Transaction_Date
       ,mmd.Transaction_Description
       ,mmd.Call_Type
       ,mmd.Engagement_Number
       ,mmd.Engagement_Type
       ,mmd.Address_Name
       ,mmd.Supervisor
       ,mmd.Call_Duration
FROM    dbo.Member_Data mmd
        LEFT OUTER JOIN dbo.Member_Survey_Results mtsvr ON mmd.MemberNumber = mtsvr.MemberNumber
                                                              AND mmd.account_Number = mtsvr.Account_Number
WHERE   mtsvr.Account_number IS NULL

This is a much easier query for SQL Server to optimize. The WHERE clause is SARGable and we are performing a simple join on two columns. Just to make sure things are as easy as possible for SQL, I verified the MemberNumber and Account_Number columns in both tables had indexes.
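
Incidentally, if NULLs ever become valid in the join columns, a NOT EXISTS formulation sidesteps the issue entirely and typically gets the same anti-join plan (a sketch using the same tables, with the column list abbreviated):

SELECT DISTINCT
        mmd.MemberNumber
       ,mmd.Account_Number
       /* ...remaining columns as above... */
FROM    dbo.Member_Data mmd
WHERE   NOT EXISTS (SELECT 1
                    FROM   dbo.Member_Survey_Results mtsvr
                    WHERE  mtsvr.MemberNumber = mmd.MemberNumber
                           AND mtsvr.Account_Number = mmd.Account_Number);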

I ran both queries and verified they returned the same records. The performance improvement I saw with these changes was astounding. The original query took 1.5 hours to run, had a cost of 55.59, and required 878,224 logical reads. By contrast, the re-written version ran in less than 1 second, had a cost of 0.74, and required just 441 logical reads. That's almost a 2,000-times improvement! Amazing! And what's even more amazing is that one table only has 450 rows and the other has just under 20,000 rows. Clearly, as those tables grow, the performance of the original query would degrade very quickly.
