Shaun J Stuart

Just another SQL Server weblog

I submitted a script to SQLServerCentral.com and it's been published today. The script automatically re-trusts foreign keys or constraints. It will also report any that could not be re-trusted. Discussion about the script can be found here. I have post here about how your foreign keys and constraints might become untrusted and why it is important for performance that they are trusted.

Note that the script I submitted to SQLServerCentral.com is a nicer version of  the script I give at the end of my post linked to above. The difference is that the version at SQLServerCentral.com will report out the names of any foreign keys / constraints that could not be re-enabled whereas my earlier version only reported a count.

Share

I think most DBAs by now know that heaps, tables without a clustered index, are bad and should be avoided, but the word hasn't seemed to have trickled down to software developers yet. I still come across to many vendor databases that contain hundreds, if not thousands, of heaps. In fact, what prompted me to write this was one database I have to support that has 335,000 total tables, 295,000 of which are heaps. Yes, this is a 1.2 terabyte database that is 88% heaps. (Although, to be fair, most of those heaps have zero records in them.)

I'm in the process of moving the database from SQL 2005 to SQL 2008 R2 so we can implement data compression. I've had to get the vendor involved for some parts of the move and they mentioned the database size and that they wanted to run some scripts to purge some old data and reduce the size. (They also wanted to run a shrink, but I put a stop to that. We'll do a shrink and defrag after the compression has been implemented.)

But given that the vendor took note of the size of the database, I thought this might be a good opportunity to educate them to the perils of heaps. It's easy to tell them about the performance issues heaps can cause and those are fairly well documented. However, a lesser-known side effect of heaps is that they can unnecessarily increase the size of your database.

If you have a heap that has variable length columns, SQL will stuff as many rows on a page as it can. But then if you go back and modify that data such that the total row length increases, that row will no longer fit in the same space on the data page that it used to. If this happens, SQL creates a forwarding record that 151822725_4cc38cd14c_mpoints to a new page, where the updated, longer row is written. The new row also has a return pointer pointing back to the original position (which is used if the row is updated again). Paul Randal has a great write up about forwarding and back-pointers here, with code to demonstrate exactly what is happening. The upshot of all this is that, due to the forwarding and back pointers, each forwarded record in a heap adds 21 bytes to the table - 12 for the forwarding record pointer and 9 for the return pointer.

That may not seem like a lot, but when you are dealing with a 1.2 terabyte database with 295,000 heaps, I suspected this might add up to a significant chunk of space. So I wrote the below code to see exactly how much space forwarding pointers are using in a database with heaps.

Before getting to the code, let me point out that not all heaps will have forwarding pointers. If your heap is used in in a read-only manner, such as a data warehouse application for example, the records in the heap won't be updated and no forwarding records would need to be created. Or, if your heap contains only columns of fixed length datatypes, you also won't have forwarding records. But if your heap is a table that has frequent updates and contains variable length data that changes size, you can have lots of them.

The dynamic management function sys.dm_db_index_physical_stats will tell you how many forwarded records a heap has, provided you run it with the DETAILED option. The code below will scan your database for heaps and write them to a table. It will then run the sm_db_index_physical_stats function against each one to determine the total number of rows and the total number or forwarded records. The last step gives a total, in megabytes, of the total space used by the pointers. It is my hope that providing this data to vendors, along with other information about the other performance issues with heaps, we can start to impress upon the developers the need for clustered indexes.

FYI, for my 1.2 TB database, I had 55 MB of pointer data. I don't know if this is enough to make any developers look into eliminating heaps in their designs, but it's worth a try.

Note: Enter your database ID where noted in line 40. Also note, depending on the size of your database, this can take a long time to run. (It took 6 hours on my big database.) I suggest running it during off peak hours or during a maintenance window.

CREATE TABLE ForwardPointers
       (
        ObjectName VARCHAR(50)
       ,ObjectID INT
       ,TotalRows INT
       ,ForwardedRecords INT
       )

/* Load list of heaps into table */

INSERT  INTO ForwardPointers
        (ObjectName)
        SELECT  SCH.name + '.' + TBL.name AS TableName
        FROM    sys.tables AS TBL
                INNER JOIN sys.schemas AS SCH ON TBL.schema_id = SCH.schema_id
                INNER JOIN sys.indexes AS IDX ON TBL.object_id = IDX.object_id
                                                 AND IDX.type = 0 /* Heaps */
/* Get the Object IDs for the heaps */

UPDATE  ForwardPointers
SET     ObjectID = OBJECT_ID(ObjectName)
FROM    ForwardPointers

/* Run each heap through sys.dm_db_index_physical_stats to get the required detais */

DECLARE ObjectCursor CURSOR
FOR
        SELECT  ObjectID
        FROM    ForwardPointers

DECLARE @ObjectID INT

OPEN ObjectCursor
FETCH NEXT FROM ObjectCursor INTO @ObjectID
WHILE @@fetch_status = 0
      BEGIN
            UPDATE  ForwardPointers
            SET     TotalRows = record_count
                   ,ForwardedRecords = forwarded_record_count
            FROM    sys.dm_db_index_physical_stats(EnterYourDatabaseIDHere, @ObjectID, 0, 1, 'DETAILED')
            WHERE   ObjectId = @ObjectID

            FETCH NEXT FROM ObjectCursor INTO @ObjectID
      END

CLOSE ObjectCursor
DEALLOCATE ObjectCursor

/* Calculate the total size of forwarding pointers.
	See http://www.sqlskills.com/blogs/paul/forwarding-and-forwarded-records-and-the-back-pointer-size/
	for details */

SELECT SUM(ForwardedRecords*21) / 1024.0 / 1024.0 AS ForwardPointersSizeInMB
FROM ForwardPointers

--DROP TABLE ForwardPointers

Share

Back in December, a couple of checks I wrote were included in Brent Ozar Unlimited's sp_Blitz script. I was chagrined to see that some people discovered some bugs in my code and submitted fixes. To be fair, one was a bug in SQL 2008 & 2008 R2 where DBCC DBINFO WITH TABLERESULTS returned the dbccLastKnownGood entry twice. (Microsoft has said they aren't going to fix this, unfortunately.) The other was an issue when systems were using the British date format. I guess this just goes to show how hard it is to test every possible use case to discover bugs.

Even though there were only problems with one of the three tests I submitted, I still felt the need to redeem myself and submit another check. I had recently read a blog post about monitoring the values of identity columns in tables to make sure you aren't approaching the maximum value for the data type used. Surprisingly, such a check wasn't already in sp_Blitz. I felt this was something I could tackle.

My first attempt worked, but it did so by cycling through all the tables in a database, finding any identity columns, and comparing the maximum value used to the maximum value the data type can accommodate (255 for TINYINT, 2147483648 for INT, etc.) I submitted this and the feedback from Brent was that he wasn't sure if this should be included because of the performance implications. Although I tested the code on all my servers without issue, Brent mentioned that some databases, especially those for SAP, may have 10,000 to 50,000 tables. Running code to look at each of those tables could have a severe impact on performance or, at the very least, greatly increase the time it took the script to complete. That's a valid point (and again goes to show how hard it is to test for all use cases). Apparently, he spoke to Kendra Little about this and she brought up a system table that both Brent and I didn't know about (or more likely, knew about, but forgot): sys.identity_columns. Using that, I can get the same information by querying one system table instead of each individual user table in the database.

So I re-wrote my script to use that and re-submitted it to sp_Blitz. It's in the queue for evaluation, so I don't know when (or if) it will be included, but I thought I post the test here. (And of course, once I had the name of this table, I discovered a Google search returns lots of scripts that provide the same functionality.)

A quick note: The script assumes the identity increment is positive and the identity values are positive. The script will issue a warning when the maximum value is more than 90% of the maximum value of the data type used.

  DECLARE @SchemaName sysname
  DECLARE @TableName sysname
  DECLARE @ColumnName sysname

  DECLARE @Name sysname
  DECLARE DBNameCursor CURSOR
  FOR
          SELECT    name
          FROM      sys.databases
          WHERE     source_database_id IS NULL	/* no database snapshots */
                    AND is_read_only = 0		/* no read-only dbs (log shipping) */
                    AND database_id <> 2		/* skip tempdb */
                    AND state = 0	/* online databases */
          ORDER BY  name;

  CREATE TABLE #IdentityCheck
         (
          PK INT IDENTITY(1, 1)
         ,DatabaseName sysname
         ,SchemaName sysname
         ,TableName sysname
         ,ColumnName sysname
         ,ColumnType VARCHAR(8)
         ,MaxValue BIGINT
         )

  OPEN DBNameCursor
  FETCH NEXT FROM DBNameCursor INTO @Name
  WHILE @@fetch_status = 0
        BEGIN

              EXEC('USE [' + @Name + '];  INSERT  INTO #IdentityCheck
              SELECT ''' + @Name + ''' as DatabaseName
              ,SCHEMA_NAME(o.SCHEMA_ID) AS SchemaName
              ,OBJECT_NAME(o.OBJECT_ID) AS TableName
              ,c.[name] AS ColumnName
              ,t.name AS ColumnType
              ,CAST(last_value AS bigint) AS MaxValue
              FROM    sys.identity_columns c
              ,sys.types t
              ,sys.objects o
              WHERE   c.is_identity = 1
              AND t.system_type_id = c.system_type_id
              AND o.object_id = c.object_id
              AND o.type = ''u''
              AND t.is_user_defined = 0 /* only look at system defined types */
              AND t.name IN (''int'', ''bigint'', ''smallint'', ''tinyint'')')
              FETCH NEXT FROM DBNameCursor INTO @Name
        END

  CLOSE DBNameCursor
  DEALLOCATE DBNameCursor

  SELECT    'Table [' + SchemaName + '].[' + TableName + '], column ['
            + ColumnName + '] in database [' + DatabaseName
            + '] is an identity column that has reached over 90% of the maximum value for that datatype.' AS Details
           ,'' AS ChangeRequirements
  FROM      #IdentityCheck
  WHERE     1 = CASE WHEN ColumnType = 'bigint'
                          AND (CONVERT(DECIMAL(19, 0), MaxValue)
                               / 9223372036854775807.0) > .90 THEN 1
                     WHEN ColumnType = 'int'
                          AND (CONVERT(DECIMAL(10, 0), MaxValue)
                               / 2147483648.0) > .90 THEN 1
                     WHEN ColumnType = 'smallint'
                          AND (CONVERT(DECIMAL(5, 0), MaxValue) / 32767.0) > .90
                     THEN 1
                     WHEN ColumnType = 'tinyint'
                          AND (CONVERT(DECIMAL(3, 0), MaxValue) / 255.0) > .90
                     THEN 1
                END

  DROP TABLE #IdentityCheck

So what do you do if you get this warning? Well, a quick fix that Michelle Ufford suggested, is to reseed your identity value to the lowest negative value the data type supports. For example, if your identity column is an integer, reseed the column to start at -2147483648 and increment by a positive number. (This assumes you started your initial identity value at zero and used a positive increment to begin with.)

One more note: you'll notice my script uses a cursor to cycle through the databases while many other scripts you find on the internet use ms_foreachdb. Either method works. I prefer the cursor because it's 100% supported. (For the cursors-give-sucky-performance crowd, the undocumented command uses a cursor behind the scenes anyway, so that's a moot point.) Ms_foreachdb is an unsupported command and can change at any time. There is also evidence that it can skip some databases and even completely lock up SQL 2005 servers in some situations. Given the issue I ran into above with the DBCC DBINFO command, which is also an unsupported command, and Microsoft's statement that they are not going to fix it, I am committed to not using any undocumented commands in any scripts I pan to re-use. I'll use them for any one-off work I might be doing, but for something that I want to save and re-use as part of my SQL toolkit, I will pass.

Share

My co-worker showed me a SQL Server requirements document that he found online somewhere and modified at a previous company. He would give this document to vendors who said they needed a SQL Server and they would need to select the appropriate answers before he would let them install their product.  I think this is a great idea and thought I'd share it.

SQL Server Host Requirements (OS, minimum RAM, Number of CPUs, VM support, etc.)

Version of SQL Server Supported

  • SQL 2012 SP2
  • SQL 2012 SP1
  • SQL 2012 RTM
  • SQL 2008 R2 SP2
  • SQL 2008 R2 SP1

Edition Of SQL Server Supported

  • Enterprise
  • Standard

32-bit vs. 64-bit

  • 32-bit
  • 64-bit

Exclusivity For The Hosted Database

  • Shared SQL Server Environment
  • Dedicated SQL Server required

Maximum Concurrent User Count

IOPS (I/O per second) Requirement

Initial Database Size - Please provide for each database, if multiple. Specify in MB or GB.

Projected Annual Database Growth Rate. Specify in MB or GB.

Special SAN Requirements, if any.

Dedicated Test, Development, and / or Training Environment Required (Include Details)

  • Auxiliary environments to be delete / removed post-go live
  • Auxiliary environments to remain accessible for duration of project use
  • Auxiliary environments will need to be recreated for future product upgrades / enhancements

Database Security

  • Single Windows (Active Directory) network login (connection pooling)
  • Active Directory gorup login
  • Individual or AD Group network login for each user
  • Single SQL login ID and password (Login and password must be customizable and not identical to other product installations at other clients)
  • Individual SQL login and password for each user

Maximum Level Of Permissions Required for Users / Administators / Service Account (Please provide explanation for any response greater than Database role, db_owner)

  • Database role, less than db_owner
  • Database role, db_owner
  • Aliased as dbo for database
  • Some level of server-level permissions less than system administrator (sa)
  • SA role membership
  • Local administrator rights on SQL Server host machine

Database Connection / Session Handling

  • Application maintains persisted connections / sessions to database
  • Application established and releases connections / sessions as requested during normal operation
  • Other

Vendor Access Required To Support Database Issues

  • Vendor requires local login rights to SQL Server
  • Vendor requires remote-connection via the application
  • Vendor requires VPN access to our network to connect to the database via SSMS
  • No remote support capabilities. Database backup will need to be sent via secure FTP to vendor for analysis
  • No remote support capabilities. Secure method of transferring database backup not available.

Application and Database Support Exists for the Following High Availability Architectures

  • Failover clustering
  • Mirroring
  • Log shipping
  • Replication

How Frequently Do You Certify SQL Server Patches, Service Packs, and Release Updates?

  • As released by Microsoft
  • Quarterly
  • Annually
  • Other

How Frequently Do You Issue Database Updates?

  • Quarterly
  • Annually
  • Other

Performance Tuning Requirements

  • Index tuning changes (adding or disabling indexes, etc.) must be submitted to vendor for approval
  • Index tuning changes (adding or disabling indexes, etc.) do not require vendor approval
  • Index changes by customer technical staff are not permitted

Data Retention / Archiving and Growth Management Support

  • Vendor does not support database archiving
  • Vendor has database archiving process in place or available
  • Database archiving solutions created by customer must be submitted to vendor for approval
  • Database archiving solutions created by customer do not require vendor approval
  • Vendor has tested and supports SQL Server data compression

Database Installation Process

  • TSQL scripts or database backup file to be submitted to customer DBA for installation
  • Installation accomplished via executable or other method run by vendor / customer on non-SQL Server host (e.g., application server or client workstation). Any additional permission requirements needed for installation should be noted here.
  • Other

Note any Non-Standard Collation, Server-Level, or Instance-Level Settings Required

Identify Tables with Sensitive Data (Personally Identifiable Information, Financial Data, Passwords, etc.)

Identify Any Custom SQL Agent Jobs, Proprietary Backup or Maintenance Processes, or Similar Items.

Vendor Technical Contact ____________________________
Vendor Technical Contact Email _______________________
Vendor Technical Contact Phone _______________________

I would be surprised if many vendors knew the answer to all these questions right off the bat. However, this questionnaire will at least bring these issues to the forefront where they can start being addressed in a pro-active manner, before the product is installed. It will also raise any red flags. (The application needs sa access on a shared SQL Server? I don't think so! Your maintenance plan shrinks the database weekly? I don't think so!)

 

Share

Pop Quiz, hotshot. A vendor installed a maintenance plan on one of your SQL Servers. It looks like this:

Once the free space rises above 10%, the plan is armed.

Once the free space rises above 10%, the plan is armed.

What do you do?

 

Share