“Honey, your phone’s been going crazy.” My wife stood next to the bed holding my work iPhone. I groaned. This is not a particularly good way to be woken up, especially when you are suffering from pneumonia and are all drugged up on codeine-based cough syrup, as I was. It was 10:30 PM on Thursday night. As she stood there, the iPhone buzzed again. Email notification. I took the phone and, with some trepidation, checked my email. There was a job failing on one of the SQL Servers at work. A failure message was being sent out every minute. Ugh.
I dragged myself out of bed and tried to focus my blurry mind. I got to my computer, plopped down in the chair, and logged in to the server remotely to see what was going on. Because I am the only DBA at my company, I have set up the SQL alerts to go to the IT department mailing list. That way, if I am unavailable for some reason, at least someone else will be notified that there is a problem and can hopefully take action or, at the very least, find some way to contact me. A couple of my coworkers had responded to the alerts asking me if I was getting them. I replied that I was on the case.
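For anyone curious how that routing works, here is a minimal sketch of the kind of setup I mean: a SQL Agent operator pointing at a group mailbox, with the job configured to email that operator on failure. The operator name, email address, and job name below are placeholders, not the actual ones at my company.

```sql
USE msdb;
GO

-- Create an operator that points at the team mailing list (hypothetical address)
EXEC dbo.sp_add_operator
    @name          = N'IT Department',
    @email_address = N'it-dept@example.com';
GO

-- Have a job (hypothetical name) email that operator whenever it fails
EXEC dbo.sp_update_job
    @job_name                   = N'SharePoint cleanup',
    @notify_level_email         = 2,              -- 2 = notify on failure
    @notify_email_operator_name = N'IT Department';
GO
```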
My first order of business was to shut off the alerts spewing out every minute. The culprit turned out to be a job that was scheduled to run every minute. I disabled it, then took a look at the job history to see what was going on. The job was some sort of housecleaning job someone had set up before I was employed at the company. It ran some DELETE statements against a SharePoint database. Looking at the history, I could see it had been running fine up to 10:16 PM. At 10:17 PM, it started failing. The error message was “Unable to connect to database.” Hmm. My next step was to check the database – was it corrupt, inaccessible, offline, detached, what? Turns out, it was just plain missing! The database did not show up in SSMS. Uh-oh. I was thinking this was getting more complicated than I wanted it to be, given my drug-induced stupor. I checked the location of the .MDF and .LDF files. Those files were missing too. OK, that ruled out a detached database. I checked the error logs for both SQL Server and SQL Agent. Nothing out of the ordinary, apart from the errors about the job failing.
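In practice, that step boils down to a couple of calls against msdb, roughly like the sketch below. The job name here is a placeholder; the real cleanup job was named something else.

```sql
USE msdb;
GO

-- Disable the noisy job so it stops firing a failure alert every minute
EXEC dbo.sp_update_job
    @job_name = N'SharePoint cleanup',   -- hypothetical name
    @enabled  = 0;
GO

-- Review recent run history to see exactly when the failures started
SELECT TOP (50)
    j.name,
    h.run_date,
    h.run_time,
    h.run_status,     -- 0 = failed, 1 = succeeded
    h.message
FROM dbo.sysjobhistory AS h
JOIN dbo.sysjobs       AS j ON j.job_id = h.job_id
WHERE j.name = N'SharePoint cleanup'
ORDER BY h.instance_id DESC;
```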
I could only think of one thing that would cause the .MDF and .LDF files to be deleted – the database had been dropped. Whether by accident or on purpose, I had no way of knowing. I also had no way of knowing who did it. It was now about 20 minutes after the first alert was sent out. I checked to see who was connected to the server, but didn’t see any unusual connections – mostly a lot of other computers and service accounts. This server was external-facing, so there was the possibility we had been hacked. According to the emails flying around that night, no one in the IT department was doing anything with SharePoint. I did get a report that we had an off-site contractor who was doing some SharePoint work for us. Maybe he was responsible. I didn’t have a phone number, so I fired off an email and waited. Ten minutes went by. I had no idea what precisely this database was used for, other than SharePoint in general, so I had no idea what effect the loss of the database was having on the company.
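The connection check itself is a quick query against sys.dm_exec_sessions (or good old sp_who2), something along these lines:

```sql
-- List current user connections: who is logged in, from which machine,
-- and with which application. System sessions are filtered out.
SELECT
    session_id,
    login_name,
    host_name,
    program_name,
    login_time,
    last_request_end_time
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
ORDER BY login_time DESC;
```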
With still no response from the contractor, I decided to go ahead and restore the database. As the saying goes, you never have a good backup until you have restored it, so this was a good test of our backup strategy, although one I would have preferred to run under more controlled circumstances. I restored the weekly full backup and the latest daily differential backup and brought the database back online. (It was in the simple recovery model, so no transaction log backups had to be restored.) I spot-checked some tables and was able to access the data without problems. I re-enabled the job that had been failing and it ran happily. It seemed like we were good to go.
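The restore was the standard full-plus-differential sequence. The database name and backup paths below are made up for illustration; the structure is what matters.

```sql
-- Restore the weekly full backup, leaving the database in a restoring
-- state so the differential can be applied on top of it
RESTORE DATABASE [SharePoint_Cleanup]               -- hypothetical name
FROM DISK = N'D:\Backups\SharePoint_Cleanup_full.bak'
WITH NORECOVERY;

-- Apply the most recent daily differential and bring the database online
RESTORE DATABASE [SharePoint_Cleanup]
FROM DISK = N'D:\Backups\SharePoint_Cleanup_diff.bak'
WITH RECOVERY;
```

The NORECOVERY on the first step is what keeps the database in a restoring state; had the database been using the full recovery model, any log backups taken after the differential would have been restored next, before the final WITH RECOVERY.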
Except I still didn’t know who dropped the database or why. I forced myself to stay awake a bit longer. I figured if the database had been dropped on purpose, whoever did it might see that it was restored and try to drop it again. I thought about putting some DDL triggers in place to prevent that from happening, but in my current state, the last thing I wanted to do was start messing around with triggers. After about 10 minutes, nothing else happened, so I went back to bed, determined to investigate more in the morning.
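For what it’s worth, the trigger I had in mind would have looked something like the sketch below: a server-level DDL trigger that rolls back any DROP DATABASE and names the login that tried it. I did not actually deploy this that night.

```sql
-- Block DROP DATABASE server-wide and report who attempted it
CREATE TRIGGER trg_block_drop_database
ON ALL SERVER
FOR DROP_DATABASE
AS
BEGIN
    DECLARE @login sysname = ORIGINAL_LOGIN();
    RAISERROR('DROP DATABASE attempted by %s and was rolled back. Contact the DBA.', 16, 1, @login);
    ROLLBACK;   -- undo the drop
END;
GO
```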
On the drive in the following day, my mind was going through possible ways to identify who dropped the database. Without the codeine, my thinking was a bit clearer and I could come up with some possible plans of attack. I knew Paul Randal had a method for reading the transaction log to determine who dropped a table. Perhaps I could modify this to find who dropped a database. Except the database was in the simple recovery model. And the log file was gone. And in which log file would the drop of a database be recorded anyway? I had already checked the server logs and didn’t see anything there. I could check the default trace that runs on SQL Server 2005 and 2008 (this was a 2008 server I was dealing with) or use the Schema Changes History report from the Standard Reports option. (Both these methods are described here.) I was also thinking about how I could implement DDL triggers to either prevent someone from dropping a database again or, at the very least, log who did it.
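The default trace check is straightforward: find the current trace file from sys.traces and read it with fn_trace_gettable, filtering on event class 47 (Object:Deleted) – the same kind of event the Schema Changes History report is built on. Roughly:

```sql
-- Locate the default trace file for this instance
DECLARE @tracefile nvarchar(260);

SELECT @tracefile = path
FROM sys.traces
WHERE is_default = 1;

-- Pull the Object:Deleted events, along with who issued them and from where
SELECT
    t.StartTime,
    t.LoginName,
    t.HostName,
    t.DatabaseName,
    t.ObjectName
FROM sys.fn_trace_gettable(@tracefile, DEFAULT) AS t
WHERE t.EventClass = 47      -- Object:Deleted
ORDER BY t.StartTime DESC;
```

One caveat: the default trace rolls over, so if the event is old enough you may need to point fn_trace_gettable at an earlier rollover file in the same directory.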
When I got into the office and checked my email, I was relieved to see I would not need to do much digging. I had received an email from the SharePoint contractor. He was, in fact, the one who dropped the database. It turns out it was an unused database and he was cleaning things up. He didn’t think any alarms would be set off if he dropped it. Given that knowledge, I went ahead and disabled the job again, archived the database with a final full backup, then dropped it. Although the whole incident created a bit of stress, it was good to know our alerts and backup plans functioned as they should.
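The final cleanup was nothing fancy, roughly along these lines (database name and archive path are placeholders):

```sql
-- Take one last full backup for the archive, then drop the unused database
BACKUP DATABASE [SharePoint_Cleanup]
TO DISK = N'D:\Archive\SharePoint_Cleanup_final.bak'
WITH INIT, CHECKSUM;

DROP DATABASE [SharePoint_Cleanup];
```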