After watching Brent Ozar’s webinar on backups and virtual machines, I was inspired to take a deeper look at our company’s overall backup plans. I had already gone through the code that was actually doing the backup, found some crazy stuff, and fixed it, but this time through I wanted to take a closer look at our schedules and retention periods. I’m glad I did.
One other DBA and I maintain approximately 55 SQL Servers here. We’re both fairly new to the company and therefore were not involved in setting up most of the current environment. And you know how things go – if it ain’t broke, don’t fix it. So there are some things we haven’t gotten around to reviewing. Backups were one of them.
I went through and noted the backup schedules on all our servers and how long they took to complete, with the goal of staggering the times to ease network traffic, per Brent’s webinar. What I found was a mess of inconsistency and duplication. I found several servers that would take a full backup of all databases each night at midnight, then take differential backups 30 minutes later – and there was no other activity on these systems during those 30 minutes! I found retention times for transaction log backups set to 8 days but the differential and full backup retention times set to 4 days. So half of those transaction log backups are worthless: the full backups they would have to be applied to have already been deleted, so they can never be restored. The amount of disk space and processor resources being wasted was huge!
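To make the retention mismatch concrete, here is a minimal Python sketch of the kind of check I mean. It is only an illustration: the `RetentionPolicy` class, the function names, and the server name `SQL01` are all made up for this example, and it assumes you have already collected each server’s retention settings (in days) by hand or from your own job definitions. It also treats retention as whole days and ignores the exact time of day each backup runs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RetentionPolicy:
    """Retention periods, in days, for one server (hypothetical structure)."""
    server: str
    full_days: int
    diff_days: int
    log_days: int


def unrestorable_log_days(policy: RetentionPolicy) -> int:
    """Days of log backups kept longer than the oldest retained full backup."""
    return max(0, policy.log_days - policy.full_days)


def retention_problems(policy: RetentionPolicy) -> list[str]:
    """Return a human-readable list of retention mismatches for one server."""
    problems = []
    wasted = unrestorable_log_days(policy)
    if wasted:
        problems.append(
            f"{policy.server}: {wasted} of {policy.log_days} days of log "
            "backups have no full backup left to restore from"
        )
    if policy.diff_days > policy.full_days:
        problems.append(
            f"{policy.server}: differentials are kept {policy.diff_days} days "
            f"but fulls only {policy.full_days} days"
        )
    return problems


if __name__ == "__main__":
    # The settings described above: logs kept 8 days, fulls and diffs kept 4.
    example = RetentionPolicy("SQL01", full_days=4, diff_days=4, log_days=8)
    for problem in retention_problems(example):
        print(problem)
```

Running something like this over a list of all your servers gives a quick report of where log or differential backups outlive the full backups they depend on.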
It’s easy to see how this came about. In an environment with a large number of servers, no centralized or standardized backup functionality, and multiple DBAs, this happens. To help prevent this, I’ve created an internal wiki page that documents our standard jobs, their schedules, and our standard server configuration. While my manager is impressed because he sees this as me helping the company and future DBAs, the real reason I do it is that I can’t remember everything. (But don’t tell him that!) This is the same reason airline pilots go through a written checklist before takeoff – there are just too many little things that are easy to forget.
So if you have been in your position long enough to have seen the number of SQL Servers grow significantly, or you are in a new position and have inherited a bunch of machines, it would probably be worth your while to revisit the backup jobs across your company and make sure things are set up the way you intend them to be.