How SQL Query Plans Can Change Between Versions

I’ve been working lately on creating some automated check scripts, similar to Brent Ozar’s great sp_Blitz script. I’m intending my script to be able to be run against SQL 2005, 2008, and 2008 R2 servers (I just got my hands on SQL 2012, so my next pass will be to make sure everything works on that version too.) During this process, I ran across a query that worked on SQL 2008 R2 but failed on 2005.

The query was one I wrote to check the last known successful DBCC CHECKDB times for the databases on the server. The query used a common table expression. The basis for the CTE was the output of the DBCC LOGINFO command, part of which looks like this:

Id ParentObject Object Field Value
4 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_dbname master
5 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_maxDbTimestamp 4000
6 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_version 611
7 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_createVersion 611
8 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_ESVersion 0
9 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_nextseqnum 1900-01-01 00:00:00.000
10 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_crdate 1900-01-01 00:00:00.000
11 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_filegeneration 0
12 DBINFO @0x000000000B51CE00 dbi_checkptLSN m_fSeqNo 405
13 DBINFO @0x000000000B51CE00 dbi_checkptLSN m_blockOffset 272
14 DBINFO @0x000000000B51CE00 dbi_checkptLSN m_slotId 65
15 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_RebuildLogs 0
16 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_dbccFlags 0
17 DBINFO STRUCTURE: DBINFO @0x000000000B51CE00 dbi_dbccLastKnownGood 2011-03-15 14:12:22.927

Of note here are the varying datatypes in the last column. Well, technically, they are all varchars, but for my use, I need the date values to be able to be implicitly converted to datetime and I also need to pull out the database name, which is a varchar. My first attempt was this:

WITH    DBs
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbccLastKnownGood'
                        OR field = 'dbi_dbname'
             )
     SELECT *
     FROM   DBs AS DB1
            INNER MERGE JOIN DBs AS DB2 ON DB1.DBID = DB2.DBID
     WHERE  DB1.Field = 'dbi_dbname'
            AND DB2.Field = 'dbi_dbccLastKnownGood'
            AND CAST(DB2.Value AS DATETIME) < DATEADD(DD, -60,
                                                      CURRENT_TIMESTAMP)

(Somewhat simplified for presentation here.) So the data in the table above comes from the #DBs field, which is used in the CTE for DBs. I am using the CTE table joined to itself – once to get the database names and once to get the last known good DBCC dates. In SQL 2008 R2, this works. When I ran it in SQL 2005, I got the error “unable to convert character data to datetime.”

I spent some time on the 2005 server verifying that the data in #DB was, indeed, valid. It was. That meant the only difference was that SQL 2005 was running this query differently than 2008 R2. (I should note here that I do not have a SQL 2008 box to test on, so I don’t know if this change happened in2008 or 2008 R2.)

I obtained the query plans from both the 2008 R2 and 2005 servers. Not surprisingly, they were different.

SQL 2008 R2:

|--Merge Join(Inner Join, MANY-TO-MANY MERGE:([Expr1007])=([Expr1003]), RESIDUAL:([Expr1007]=[Expr1003]))</strong>
 |--Filter(WHERE:([Expr1009]&lt;dateadd(day,(-60),getdate()) AND [Expr1010]=(1)))
 |    |--Compute Scalar(DEFINE:([Expr1009]=CONVERT(datetime,[tempdb].[dbo].[#DBs].[Value],0), [Expr1010]=isdate(CONVERT_IMPLICIT(nvarchar(255),[tempdb].[dbo].[#DBs].[Value],0))))
 |         |--Sequence Project(DEFINE:([Expr1007]=row_number))
 |              |--Segment
 |                   |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood') ORDERED FORWARD)
 |--Sequence Project(DEFINE:([Expr1003]=row_number))
 |--Segment
 |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbname') ORDERED FORWARD)

SQL 2005:

|--Hash Match(Inner Join, HASH:([Expr1007])=([Expr1003]), RESIDUAL:([Expr1007]=[Expr1003]))
|--Filter(WHERE:([Expr1009]&lt;dateadd(day,(-60),getdate()) AND [Expr1010]=(1) AND [tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood'))
|    |--Compute Scalar(DEFINE:([Expr1009]=CONVERT(datetime,[tempdb].[dbo].[#DBs].[Value],0), [Expr1010]=isdate(CONVERT_IMPLICIT(nvarchar(255),[tempdb].[dbo].[#DBs].[Value],0))))
|         |--Sequence Project(DEFINE:([Expr1007]=row_number))
|              |--Compute Scalar(DEFINE:([Expr1012]=(1)))
|                   |--Segment
|                        |--Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|                             |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood' OR [tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))
|--Filter(WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))
|--Sequence Project(DEFINE:([Expr1003]=row_number))
|--Compute Scalar(DEFINE:([Expr1014]=(1)))
|--Segment
|--Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood' OR [tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))

The obvious difference is that 2008 R2 is using a merge join instead of the hash match in 2005. But if we look at these a little closer, we can see exactly why the query is failing in 2005 but not 2008 R2. Compare these lines:

2008 R2:

Sequence Project(DEFINE:([Expr1007]=row_number))
|              |–Segment
|                   |–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’) ORDERED FORWARD)
|–Sequence Project(DEFINE:([Expr1003]=row_number))
|–Segment
|–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’) ORDERED FORWARD)

2005:

Sequence Project(DEFINE:([Expr1007]=row_number))
|              |–Compute Scalar(DEFINE:([Expr1012]=(1)))
|                   |–Segment
|                        |–Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|                             |–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’ OR [tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’))

Compute Scalar(DEFINE:([Expr1014]=(1)))
|–Segment
|–Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’ OR [tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’))

In the 2008 R2 query plan, the join is performed how I envisioned it in my mind – it took the rows containing the last known good DBCC date and joined them to the rows containing the database names. You can see how the lines in red nicely separate the data.

Now look at the 2005 plan. From the lines in blue, you can see the rows containing the database name and the last known good DBCC date are combined with an OR clause. Thus, when the query analyzer tries to implicitly convert the Value field into a datetime value, it gets to the rows where the database name is stored in this column and throws the error.

The solution was to rewrite the query to use two distinct CTE tables, one that only has the database name rows and one that only has the last known good DBCC date rows.

WITH    DB1
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbname'
             ),
        DB2
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbccLastKnownGood'
             )
     SELECT *
     FROM   DBs AS DB1
            INNER MERGE JOIN DBs AS DB2 ON DB1.DBID = DB2.DBID
     WHERE  DB1.Field = 'dbi_dbname'
            AND DB2.Field = 'dbi_dbccLastKnownGood'
            AND CAST(DB2.Value AS DATETIME) < DATEADD(DD, -60,
                                                      CURRENT_TIMESTAMP)

This query will work on both 2005 and 2008 R2.

Leave a Reply

Your email address will not be published. Required fields are marked *

I am a real person and can prove it by doing this math problem: (required) Time limit is exhausted. Please reload CAPTCHA.

This site uses Akismet to reduce spam. Learn how your comment data is processed.