The Importance Of Trust

Recently, I was going through all my servers and performing some basic health checks. One of these checks was to look for foreign keys and constraints that are not trusted. I figured this would be something of a rare occurrence and was completely surprised when I found out that roughly 75% of my servers had at least one database where foreign keys and / or constraints were not trusted.

I’m telling the truth. Trust me.

Why is this important? When these items are trusted, SQL can make some assumptions about the data in the tables and can use those assumptions to create more efficient query plans. If, however, the constraints are not trusted, SQL can’t make any assumptions and must construct a query plan that may be more computationally intensive.

Before I get started, let me first explain what an untrusted foreign key or constraint is. (From here on, I’ll use the term constraint to include both constraints and foreign keys.) When you define a constraint on a table, you are telling SQL Server to only allow certain data in certain columns. In the case of a foreign key, you are telling SQL Server that the value in a column in Table A must exist in Table B as a primary key. If you try to enter a value that is not in Table B, the insert will fail.

However, you can tell SQL Server to cheat and allow you to insert the value anyway. You can do this in a couple of ways. The most obvious is to disable the constraint and insert the data. You can then re-enable the constraint. Another way is to perform a bulk insert operation without specifying the CHECK_CONSTRAINTS option. This is often done to speed imports of large amounts of data.

Unfortunately, once you do this, SQL Server marks the constraint as “not trusted”. Simply re-enabling the constraint will not change this. The constraint remains untrusted, even after being re-enabled. Re-enabling will prevent bad data from being inserted into the table again, but it does not validate the data that was inserted while the constraint was disabled. In order to make the constraint trusted, you need to tell SQL to validate the constraint against all the data that is currently in the table. I’ll show how to do this later.

So who cares? If I know the data I am importing is valid, why not go ahead and disable the constraint, load the data, then re-enable the constraint? The problem is you know the data is valid, but SQL Server doesn’t. And that can lead to sub-optimal performance.

Let me give a very simplified example. The following script will create two tables: Orders and Customers. There is a foreign key constraint on the Orders table that requires the value in Orders.CustomerNumber to be in the Customers table.


CREATE TABLE [dbo].[Orders]
       (
        [OrderNumber] [int] IDENTITY(1, 1)
                            NOT NULL
       ,[CustomerNumber] [int] NOT NULL
       ,[ProductNumber] [varchar](100) NOT NULL
       ,[Qty] [int] NOT NULL
       ,CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED ([OrderNumber] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
                  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
                  ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
       )
ON     [PRIMARY]

GO

CREATE TABLE [dbo].[Customers]
       (
        [CustomerNumber] [int] IDENTITY(1, 1)
                               NOT NULL
       ,[CustomerName] [varchar](100) NOT NULL
       ,[Address] [varchar](100) NOT NULL
       ,[City] [varchar](50) NOT NULL
       ,[State] [char](2) NOT NULL
       ,[ZipCode] [varchar](10) NOT NULL
       ,CONSTRAINT [PK_Customer] PRIMARY KEY CLUSTERED ([CustomerNumber] ASC)
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
                  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
                  ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
       )
ON     [PRIMARY]

GO

ALTER TABLE [dbo].[Orders]  WITH CHECK ADD  CONSTRAINT [FK_Orders_Customers]
FOREIGN KEY([CustomerNumber])
REFERENCES [dbo].[Customers] ([CustomerNumber])
GO

ALTER TABLE [dbo].[Orders] CHECK CONSTRAINT [FK_Orders_Customers]
GO

Now, let’s insert a few rows of data. First, we’ll create a customer record, then two order records that link to that customer.


INSERT  INTO Customers
        (CustomerName
        ,Address
        ,City
        ,State
        ,ZipCode)
VALUES  ('Big Spender'
        ,'123 Main Street'
        ,'Gotham'
        ,'NY'
        ,'10111')

INSERT  INTO Orders
        (CustomerNumber
        ,ProductNumber
        ,Qty)
VALUES  (1
        ,'ABC123'
        ,'10')

INSERT  INTO Orders
        (CustomerNumber
        ,ProductNumber
        ,Qty)
VALUES  (1
        ,'ABC123'
        ,'11')

Now, let’s say we want to execute the following query:


SELECT  *
FROM    Orders
WHERE   orders.CustomerNumber IN (SELECT    CustomerNumber
                                  FROM      Customers)

Not that great of a query, but this is just an example. Let’s now run the query with the Include Actual Execution Plan option and see what query plan SQL came up with:

Notice that SQL doesn’t even touch the Customers table. This is because the constraint we defined guarantees that every value in Orders.CustomerNumber exists in the Customers table.

Now, let’s disable the constraint and re-run the same query:

ALTER TABLE Orders NOCHECK CONSTRAINT FK_Orders_Customers

Here’s is the query plan now:

Because the constraint is not trusted, SQL must construct a query that accesses the Customer table. This will obviously require SQL Server to do more work than the plan we got when the constraint was trusted.

Now let’s re-enable the constraint and see what happens:

ALTER TABLE Orders CHECK CONSTRAINT FK_Orders_Customers

When we run the query again, here’s the plan SQL generates:

The plan is exactly the same as the one we got when the constraint was disabled! This is because the constraint is still untrusted. Even though we did not add any data to the tables while the constraint was untrusted, the query engine does not know that and SQL leaves the constraint marked as untrusted. Therefore, the query optimizer cannot use the additional information the constraint provides when it optimizes the query.

So how to we get the constraint trusted again? By running the ALTER TABLE command to tell SQL to verify the constraint:

ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Customers

Note the double CHECK. This is required. Now when we run our query, we get our initial execution plan again:

What happens if, while the constraint was disabled, someone did insert invalid data? In that case, the above statement would fail with an error message. If this happens, you need to fix the problem before the constraint can be re-trusted.

Now this example was a bit contrived. We’re dealing with two very simple tables with a total of three rows of data. Performance will not be an issue no matter what which query plan we end up with. But imagine you have a large data warehouse with millions of records. Each week, there is a new bulk load of data and someone forgot to code the import process to use the CHECK_CONSTRAINTS option. Queries against that data warehouse could end up taking much longer than they should.

How can you tell if you have any tables in your databases that have untrusted constraints? The sys.foreign_keys table contains a column named is_trusted. If the value in that column is 1, the foreign key is not trusted. For constraints, the sys.check_constraints table contains a column with the same name and functionality.

Below is some code that will search through all the foreign keys in a database and attempt to make them trusted.  Note that this will only look at foreign keys that are enabled but not trusted. If any are disabled, this will not try to enable them.

DECLARE @CorrectedCount INT
DECLARE @FailedCount INT
DECLARE UntrustedForeignKeysCursor CURSOR
FOR
        SELECT  '[' + s.name + '].' + '[' + o.name + ']' AS TableName
               ,i.name AS FKName
        FROM    sys.foreign_keys i
                INNER JOIN sys.objects o ON i.parent_object_id = o.OBJECT_ID
                INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
        WHERE   i.is_not_trusted = 1
                AND i.is_not_for_replication = 0
                AND i.is_disabled = 0
        ORDER BY o.name

DECLARE @TableName AS VARCHAR(200)
DECLARE @FKName AS VARCHAR(200)

SET @CorrectedCount = 0
SET @FailedCount = 0

OPEN UntrustedForeignKeysCursor
FETCH NEXT FROM UntrustedForeignKeysCursor INTO @TableName, @FKName
WHILE @@FETCH_STATUS = 0
      BEGIN
			/* SELECT 'ALTER TABLE ' + @TableName + ' WITH CHECK CHECK CONSTRAINT [' + @FKName + ']' */

            BEGIN TRY
				/*
					This try-catch will allow the process to continue when a constaint fails to get re-trusted
				*/
                  EXECUTE('ALTER TABLE ' + @TableName + ' WITH CHECK CHECK CONSTRAINT [' + @FKName + ']')
                  SET @CorrectedCount = @CorrectedCount + 1
            END TRY
            BEGIN CATCH
                  SET @FailedCount = @FailedCount + 1
            END CATCH

            FETCH NEXT FROM UntrustedForeignKeysCursor INTO @TableName,
                  @FKName
      END

CLOSE UntrustedForeignKeysCursor
DEALLOCATE UntrustedForeignKeysCursor
SELECT  CAST(@CorrectedCount AS VARCHAR(10)) + ' constraints re-trusted.'
SELECT  CAST(@FailedCount AS VARCHAR(10))
        + ' constraints unable to be re-trusted.'

This code will only look for foreign keys that are untrusted. If you want to also check for untrusted constraints, change the table in the cursor definition from sys.foreign_keys to sys.check_constraints. Everything else can stay the same. The code will report a count of constraints it has fixed and was unable to fix.

As I said before, I was completely surprised by the number of databases I had that contained untrusted foreign keys and constraints. I recommend taking a look at your systems to see how many there are in your environment.

(Standard code disclaimers apply – do not run unless you understand what the code is doing. This code has been tested against SQL 2005 and SQL 2008 R2 servers.)

Share

One thought on “The Importance Of Trust

Leave a Reply

Your email address will not be published. Required fields are marked *

I am a real person and can prove it by doing this math problem: (required) Time limit is exhausted. Please reload CAPTCHA.