How SQL Query Plans Can Change Between Versions

I’ve been working lately on creating some automated check scripts, similar to Brent Ozar’s great sp_Blitz script. I’m intending my script to be able to be run against SQL 2005, 2008, and 2008 R2 servers (I just got my hands on SQL 2012, so my next pass will be to make sure everything works on that version too.) During this process, I ran across a query that worked on SQL 2008 R2 but failed on 2005.

The query was one I wrote to check the last known successful DBCC CHECKDB times for the databases on the server. The query used a common table expression. The basis for the CTE was the output of the DBCC LOGINFO command, part of which looks like this:

Id	ParentObject	Object	Field	Value
4	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_dbname	master
5	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_maxDbTimestamp	4000
6	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_version	611
7	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_createVersion	611
8	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_ESVersion	0
9	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_nextseqnum	1900-01-01 00:00:00.000
10	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_crdate	1900-01-01 00:00:00.000
11	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_filegeneration	0
12	DBINFO @0x000000000B51CE00	dbi_checkptLSN	m_fSeqNo	405
13	DBINFO @0x000000000B51CE00	dbi_checkptLSN	m_blockOffset	272
14	DBINFO @0x000000000B51CE00	dbi_checkptLSN	m_slotId	65
15	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_RebuildLogs	0
16	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_dbccFlags	0
17	DBINFO STRUCTURE:	DBINFO @0x000000000B51CE00	dbi_dbccLastKnownGood	2011-03-15 14:12:22.927

Of note here are the varying datatypes in the last column. Well, technically, they are all varchars, but for my use, I need the date values to be able to be implicitly converted to datetime and I also need to pull out the database name, which is a varchar. My first attempt was this:

WITH    DBs
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbccLastKnownGood'
                        OR field = 'dbi_dbname'
             )
     SELECT *
     FROM   DBs AS DB1
            INNER MERGE JOIN DBs AS DB2 ON DB1.DBID = DB2.DBID
     WHERE  DB1.Field = 'dbi_dbname'
            AND DB2.Field = 'dbi_dbccLastKnownGood'
            AND CAST(DB2.Value AS DATETIME) < DATEADD(DD, -60,
                                                      CURRENT_TIMESTAMP)

(Somewhat simplified for presentation here.) So the data in the table above comes from the #DBs field, which is used in the CTE for DBs. I am using the CTE table joined to itself – once to get the database names and once to get the last known good DBCC dates. In SQL 2008 R2, this works. When I ran it in SQL 2005, I got the error “unable to convert character data to datetime.”

I spent some time on the 2005 server verifying that the data in #DB was, indeed, valid. It was. That meant the only difference was that SQL 2005 was running this query differently than 2008 R2. (I should note here that I do not have a SQL 2008 box to test on, so I don’t know if this change happened in2008 or 2008 R2.)

I obtained the query plans from both the 2008 R2 and 2005 servers. Not surprisingly, they were different.

SQL 2008 R2:

|--Merge Join(Inner Join, MANY-TO-MANY MERGE:([Expr1007])=([Expr1003]), RESIDUAL:([Expr1007]=[Expr1003]))</strong>
 |--Filter(WHERE:([Expr1009]&lt;dateadd(day,(-60),getdate()) AND [Expr1010]=(1)))
 |    |--Compute Scalar(DEFINE:([Expr1009]=CONVERT(datetime,[tempdb].[dbo].[#DBs].[Value],0), [Expr1010]=isdate(CONVERT_IMPLICIT(nvarchar(255),[tempdb].[dbo].[#DBs].[Value],0))))
 |         |--Sequence Project(DEFINE:([Expr1007]=row_number))
 |              |--Segment
 |                   |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood') ORDERED FORWARD)
 |--Sequence Project(DEFINE:([Expr1003]=row_number))
 |--Segment
 |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]),  WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbname') ORDERED FORWARD)

SQL 2005:

|--Hash Match(Inner Join, HASH:([Expr1007])=([Expr1003]), RESIDUAL:([Expr1007]=[Expr1003]))
|--Filter(WHERE:([Expr1009]&lt;dateadd(day,(-60),getdate()) AND [Expr1010]=(1) AND [tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood'))
|    |--Compute Scalar(DEFINE:([Expr1009]=CONVERT(datetime,[tempdb].[dbo].[#DBs].[Value],0), [Expr1010]=isdate(CONVERT_IMPLICIT(nvarchar(255),[tempdb].[dbo].[#DBs].[Value],0))))
|         |--Sequence Project(DEFINE:([Expr1007]=row_number))
|              |--Compute Scalar(DEFINE:([Expr1012]=(1)))
|                   |--Segment
|                        |--Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|                             |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood' OR [tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))
|--Filter(WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))
|--Sequence Project(DEFINE:([Expr1003]=row_number))
|--Compute Scalar(DEFINE:([Expr1014]=(1)))
|--Segment
|--Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]='dbi_dbccLastKnownGood' OR [tempdb].[dbo].[#DBs].[Field]='dbi_dbname'))

The obvious difference is that 2008 R2 is using a merge join instead of the hash match in 2005. But if we look at these a little closer, we can see exactly why the query is failing in 2005 but not 2008 R2. Compare these lines:

2008 R2:

Sequence Project(DEFINE:([Expr1007]=row_number))
| |–Segment
| |–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’) ORDERED FORWARD)
|–Sequence Project(DEFINE:([Expr1003]=row_number))
|–Segment
|–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’) ORDERED FORWARD)

2005:

Sequence Project(DEFINE:([Expr1007]=row_number))
|              |–Compute Scalar(DEFINE:([Expr1012]=(1)))
|                   |–Segment
|                        |–Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|                             |–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’ OR [tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’))

…

Compute Scalar(DEFINE:([Expr1014]=(1)))
|–Segment
|–Sort(ORDER BY:([tempdb].[dbo].[#DBs].[Field] ASC, [tempdb].[dbo].[#DBs].[Id] ASC))
|–Clustered Index Scan(OBJECT:([tempdb].[dbo].[#DBs]), WHERE:([tempdb].[dbo].[#DBs].[Field]=’dbi_dbccLastKnownGood’ OR [tempdb].[dbo].[#DBs].[Field]=’dbi_dbname’))

In the 2008 R2 query plan, the join is performed how I envisioned it in my mind – it took the rows containing the last known good DBCC date and joined them to the rows containing the database names. You can see how the lines in red nicely separate the data.

Now look at the 2005 plan. From the lines in blue, you can see the rows containing the database name and the last known good DBCC date are combined with an OR clause. Thus, when the query analyzer tries to implicitly convert the Value field into a datetime value, it gets to the rows where the database name is stored in this column and throws the error.

The solution was to rewrite the query to use two distinct CTE tables, one that only has the database name rows and one that only has the last known good DBCC date rows.

WITH    DB1
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbname'
             ),
        DB2
          AS (SELECT    Field
                       ,Value
                       ,DBID = ROW_NUMBER() OVER (PARTITION BY Field ORDER BY ID)
              FROM      #DBs
              WHERE     Field = 'dbi_dbccLastKnownGood'
             )
     SELECT *
     FROM   DBs AS DB1
            INNER MERGE JOIN DBs AS DB2 ON DB1.DBID = DB2.DBID
     WHERE  DB1.Field = 'dbi_dbname'
            AND DB2.Field = 'dbi_dbccLastKnownGood'
            AND CAST(DB2.Value AS DATETIME) < DATEADD(DD, -60,
                                                      CURRENT_TIMESTAMP)

This query will work on both 2005 and 2008 R2.

Shaun J Stuart

Just another SQL Server weblog

How SQL Query Plans Can Change Between Versions

Leave a Reply Cancel reply