One of the challenges of my new job is working with a couple of very large databases (VLDBs) and some very large tables. One database in particular has a table with 61 million rows and another with 78 million rows, and they are growing by about 80,000 rows per day. My job is to purge data older than 1 year from these tables. Did I mention that these tables support a 24 x 7 financial system? And that blocking of more than a few seconds sets off all kinds of alarms? This presents something of a challenge.
Unfortunately, the tables are not partitioned. If they were (and if they were partitioned correctly), this task would be simple, fast, and easy. Although I won’t go into the details of partitioning here, suffice it to say that deleting or moving all the data in a partition is just a meta-data change that SQL Server can perform in seconds, without moving any actual rows of data. Going forward, we will need to partition these tables, but for now, I needed to come up with a way to purge vast amounts of data (roughly 2/3 of each table) without causing excessive blocking or transaction log growth.
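To see why partitioning would make this trivial, here’s a rough sketch of what the purge could look like on a partitioned table. The table and staging names are hypothetical (our real tables aren’t partitioned), and SWITCH requires a staging table with an identical structure on the same filegroup:

```sql
-- Hypothetical sketch: if the table were partitioned by month on create_time,
-- an entire month of old data could be aged out as a metadata-only operation.
ALTER TABLE dbo.v_t_u_logs
    SWITCH PARTITION 1 TO dbo.v_t_u_logs_staging;  -- instant: no rows are moved

TRUNCATE TABLE dbo.v_t_u_logs_staging;             -- then discard the switched-out data
```

The SWITCH itself is the meta-data change mentioned above; no data pages are touched, so there is nothing to block on and almost nothing written to the transaction log.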
I was handed a script that was developed by a previous DBA here as a starting point. This script worked, but it was slow and caused lots of blocking. Here is a portion of it:
DECLARE @DeletePriorToDate DATETIME
SET @DeletePriorToDate = DATEADD(yy, -1, CURRENT_TIMESTAMP)

/* Get the TOP x PKs to delete */
DECLARE @PKsToDelete TABLE (id INT)

INSERT INTO @PKsToDelete (id)
SELECT TOP 1000 user_node_log_id
FROM v_t_u_logs WITH (NOLOCK)
WHERE create_time <= @DeletePriorToDate

DELETE vtul
FROM v_t_u_logs vtul WITH (ROWLOCK)
INNER JOIN @PKsToDelete ptd
    ON vtul.user_node_log_id = ptd.id
That removed records from the smaller of the two tables. The next step in the script was to delete records from the larger table. The smaller table is related to the larger table via a field called node_id, and there is a many-to-many relationship between the two, so you can’t simply take the list of node_ids removed from the smaller table and delete them all from the larger table; some might need to be retained. So the script I was given did the following:
CREATE TABLE #NodeIDs (Node_Id NUMERIC(18, 0))

INSERT INTO #NodeIDs (Node_Id)
SELECT DISTINCT vtn.NODE_ID
FROM dbo.v_t_NODE vtn WITH (NOLOCK)
LEFT JOIN dbo.v_t_u_logs vtul WITH (NOLOCK)
    ON vtn.NODE_ID = vtul.NODE_ID
WHERE vtul.NODE_ID IS NULL

CREATE INDEX tmpidxNodeID ON #NodeIDs (Node_ID ASC)

DELETE v
FROM dbo.v_t_n_history v
INNER JOIN #NodeIDs nid
    ON v.Node_Id = nid.Node_ID
Kudos for throwing an index on the temporary table to speed things up, but this still didn’t work well. In words, the process worked like this: find the first 1000 records from vtul that are older than 1 year and store their primary key values in a table variable. Then, delete those records from vtul by joining to that table variable. Next, look through vtn and find all the records that do not have a corresponding node_id left in vtul. Store those node_ids in a temp table. Delete records with those node_ids from vtnh.
This approach has one benefit: if the query is canceled at any time (due to blocking issues), subsequent runs will pick up any records that still need to be deleted, thanks to the left join on vtul and selecting records where vtul.node_id is null. But it has multiple problems. First, the temp table is populated by joining the 61 million row table with vtn, a 3.5 million row table. That’s a lot of work and creates blocking, even with the NOLOCK hint. (Much of the blocking came simply from disk I/O waits, as all those rows needed to be read.) Second, the delete from vtnh, the table with 78 million rows, also caused blocking. This one was harder to control because there was no way to limit how many records might end up in the temp table. If you had canceled the script partway through several times, the temp table might contain 10,000 or more node_ids that needed to be deleted. Clearly, this process needed to be reworked.
My first attempt was to limit the number of rows being deleted with a TOP X clause and looping. This was marginally successful. Blocking still occurred, and to eliminate it completely, I had to drop down to deleting the TOP 5 rows, which is pretty close to RBAR (row by agonizing row). This surprised me, as I felt the TOP clause surely would have helped here. But in doing some research, I found a post on Robert Davis’ (twitter | blog) blog about archiving and purging from large tables. His post contained this statement: “One more side note: simply adding TOP to the delete statement will NOT minimize the amount of locks taken initially. Please don’t do this!!” Yup. I could confirm that. (My guess is that SQL Server performs the whole join first anyway and then simply picks the top X rows to delete, which is why you don’t see a locking benefit.)
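For reference, my looped TOP delete looked roughly like this (a reconstruction, not the exact script; the batch size shown is illustrative, and I ultimately had to drop it all the way to 5 to avoid blocking):

```sql
-- First attempt (abandoned): loop a TOP-limited delete until no rows qualify.
-- TOP caps rows per statement but does NOT cap the locks taken up front.
DECLARE @DeletePriorToDate DATETIME
SET @DeletePriorToDate = DATEADD(yy, -1, CURRENT_TIMESTAMP)

WHILE 1 = 1
BEGIN
    DELETE TOP (1000)
    FROM v_t_u_logs
    WHERE create_time <= @DeletePriorToDate

    IF @@ROWCOUNT = 0 BREAK   -- nothing left older than the cutoff
END
```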
Reading the rest of his post showed me I was at least heading in the right direction: I was selecting records to delete based on the primary key and then doing a targeted delete of those records. But I needed to make some modifications to make this script truly useful. My first target for optimization was the join of two big tables used to determine which node_ids needed to be kept. That join existed to let the routine pick up any records that should have been deleted in prior runs but weren’t, due to the query being cancelled. I knew this delete process was a one-time task. Once we got caught up, we were going to implement a scheduled job to keep up with deleting old records on a daily basis. So I decided not to worry about records that might be “orphaned” by canceling a query. My thinking was to get most of the records deleted first; then, once we were caught up, I could run a query to find all the stragglers that were missed. The tables would be much smaller by then, so that query would run quicker.

The other purpose of that join was to retain any records that might still have node_ids in the vtn table (which was in a many-to-many relationship, remember). My solution was to grab my list of primary keys of records to delete along with the associated node_ids. I would delete the records identified by the primary keys, then scan the table again to see if any of the node_ids I had grabbed were still present in the table. Those needed to be retained, so I removed them from my table of ids to delete.
Finally, I needed some way to control the number of records deleted at once, both to minimize or eliminate blocking and to prevent the transaction log from growing out of control with millions of deletions. I did this by using two TOP statements. The first limits the number of records selected from vtul for deletion. The second limits the number of records deleted at a time. I already knew that TOP didn’t really limit locking when joining tables in a DELETE statement, so instead, I inserted the node_ids to delete into a second table variable, which the DELETE then joins on. By using TOP to load only a subset of my total records into this table, I could effectively limit the number of records deleted at one time, and by looping through the DELETE statement multiple times, I could process all the records selected from vtul. The looping also helps eliminate blocking by periodically freeing up the table for use by other processes, a technique also known as “trickle delete.” Here is the code I ended up using (which also includes some stuff for reporting):
DECLARE @DeletePriorToDate DATETIME  -- Delete all records prior to this date (usually 1 year ago)
DECLARE @RowsDeletedInTable INT      -- Counter variable for reporting
DECLARE @StartTime DATETIME          -- For tracking execution time
DECLARE @EndTime DATETIME            -- For tracking execution time
DECLARE @CycleLoopCounter TINYINT
DECLARE @MaxCycleLoops TINYINT       -- Number of times to loop through this whole routine
DECLARE @WorkTable TABLE (id INT)    -- Table containing small set of PKs of records to delete
DECLARE @LastPKDeleted INT           -- Variable used to load @WorkTable with values
DECLARE @PKsToDelete TABLE
    (
      id INT,
      node_id NUMERIC(18, 0) NOT NULL
    )

SET @MaxCycleLoops = 40
SET @CycleLoopCounter = 1

WHILE @CycleLoopCounter <= @MaxCycleLoops
BEGIN
    PRINT '=========================='
    PRINT ' Starting pass ' + CONVERT(VARCHAR(2), @CycleLoopCounter)
        + ' of ' + CONVERT(VARCHAR(2), @MaxCycleLoops)
    PRINT '=========================='

    SET @StartTime = CURRENT_TIMESTAMP
    SET @DeletePriorToDate = DATEADD(yy, -1, CURRENT_TIMESTAMP)
    SET @RowsDeletedInTable = 0
    SET @LastPKDeleted = 0   -- Reset each pass so the batching loop starts at the first id

    -- Get PKs to delete from vtul.
    INSERT INTO @PKsToDelete (id, node_id)
    SELECT TOP 5000 user_node_log_id, node_id
    FROM v_t_u_logs WITH (NOLOCK)
    WHERE create_time <= @DeletePriorToDate

    PRINT 'Deleting data from v_t_u_LOGS (1/4)..'

    /* Only delete in small batches to prevent blocking. Note, using the
       TOP keyword in a delete statement does not prevent blocking.
       So take a subset of the primary keys, put those in their own table
       and delete by joining on that */
    WHILE EXISTS ( SELECT vtul.*
                   FROM dbo.v_t_u_logs vtul WITH (NOLOCK)
                   INNER JOIN @PKsToDelete pk2d
                       ON vtul.user_node_log_id = pk2d.id )
    BEGIN
        INSERT INTO @WorkTable (id)
        SELECT TOP 100 id
        FROM @PKsToDelete
        WHERE id > @LastPKDeleted
        ORDER BY id ASC

        SELECT @LastPKDeleted = MAX(id) FROM @WorkTable

        DELETE vtul
        FROM v_t_u_logs vtul WITH (ROWLOCK)
        INNER JOIN @WorkTable wt
            ON vtul.user_node_log_id = wt.id

        SET @RowsDeletedInTable = @RowsDeletedInTable + @@ROWCOUNT

        DELETE FROM @WorkTable
    END

    PRINT 'Total rows deleted in table: ' + CONVERT(VARCHAR(10), @RowsDeletedInTable)
    PRINT 'Deletion completed for v_t_u_LOGS.'

    /* Some records with node_ids created more than 1 year ago might still
       be in other tables due to updates. We need to keep these records.
       So purge any node_ids that need to remain */
    DELETE FROM @PKsToDelete
    WHERE node_id IN (SELECT node_id FROM dbo.v_t_u_logs WITH (NOLOCK))

    PRINT 'Starting Purging for v_t_n_HISTORY (2/4)..'
    SET @RowsDeletedInTable = 0

    -- trickle delete from vtnh to prevent blocking
    SET @LastPKDeleted = 0
    WHILE EXISTS ( SELECT v.*
                   FROM dbo.v_t_n_history v WITH (NOLOCK)
                   INNER JOIN @PKsToDelete pk2d
                       ON v.Node_Id = pk2d.node_id )
    BEGIN
        INSERT INTO @WorkTable (id)
        SELECT TOP 100 node_id
        FROM @PKsToDelete
        WHERE node_id > @LastPKDeleted
        ORDER BY node_id ASC

        SELECT @LastPKDeleted = MAX(id) FROM @WorkTable

        DELETE v
        FROM dbo.v_t_n_history v WITH (ROWLOCK)
        INNER JOIN @WorkTable wt
            ON wt.id = v.Node_Id

        SET @RowsDeletedInTable = @RowsDeletedInTable + @@ROWCOUNT

        DELETE FROM @WorkTable
    END

    PRINT 'Total rows deleted in table: ' + CONVERT(VARCHAR(10), @RowsDeletedInTable)
    PRINT 'Purging completed for v_t_n_HISTORY.'
    PRINT 'Process completed.'

    SET @EndTime = CURRENT_TIMESTAMP
    PRINT 'Elapsed time for one delete cycle: '
        + CONVERT(VARCHAR(5), DATEDIFF(s, @StartTime, @EndTime)) + ' seconds.'

    DELETE FROM @PKsToDelete
    SET @CycleLoopCounter = @CycleLoopCounter + 1
END
(I should note that in actuality, I have to delete from 4 tables altogether, not just the two I have shown here, but the others are handled the same way. They were left out for clarity and this is why some of the messages say 1/4, 2/4, etc.)
One other thing to note: even though the DELETE statements specify the ROWLOCK hint, it is only a hint, and SQL Server will ignore it when it wants to. Once a single statement accumulates enough row locks (roughly 5,000), SQL Server will attempt to escalate them to a table-level lock. This is because each lock takes up memory and resources to manage, and SQL Server will opt for one coarse lock over thousands of fine-grained ones to conserve resources. This is one reason why I was seeing blocking with larger delete batches.
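If you want to watch escalation happening, or rein it in on SQL Server 2008 and later, something like the following can help. This is a sketch; treat the LOCK_ESCALATION change as a last resort, not a substitute for small batches:

```sql
-- Observe the locks currently held in this database; if escalation has occurred,
-- you will see an OBJECT-level X lock instead of many KEY/PAGE locks.
SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
GROUP BY resource_type, request_mode

-- SQL Server 2008+ only: forbid escalation on this table.
-- This trades lock memory for concurrency, so use with care.
ALTER TABLE dbo.v_t_n_history SET (LOCK_ESCALATION = DISABLE)
```

Keeping the batch size at 100 stays comfortably under the escalation threshold, which is why the batching approach avoids the problem without needing to touch table settings at all.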
After some experimentation, I discovered that selecting the TOP 5000 records to delete from vtul and then deleting from the other tables in batches of 100 works well. This is a huge improvement over deleting in batches of 5 rows from my first attempt. I’ve added a loop counter so that I can run the whole script multiple times. I’ve now been running this script continuously for a couple of days and have not encountered any blocking issues. The deletes also happen slowly enough that our regular transaction log backups can keep pace, and the transaction log does not grow at all. (If this were not the case, I would have to add a transaction log backup as part of the script.)
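Had the log backups not kept up, I would have added something like this between delete cycles. The database name and backup path here are hypothetical, and this assumes the database is in the FULL recovery model:

```sql
-- Truncate the inactive portion of the log between delete cycles
-- so purge activity cannot fill the log file.
BACKUP LOG [MyFinancialDB]
TO DISK = N'G:\LogBackups\MyFinancialDB_purge.trn'
```

Placed just before the `SET @CycleLoopCounter = @CycleLoopCounter + 1` at the bottom of the outer loop, this would cap log growth at roughly one cycle’s worth of deletions.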
When working with very large tables, partitioning is clearly the way to go. Unfortunately, DBAs often find themselves dealing with large tables that were not set up with partitions initially and are stuck with purging data the “old-fashioned” way. However, once you have caught up on your purging, the first thing you should do is implement partitioning so that going forward, you won’t have to go through this tedious process again.