Last week, I spent some time testing database mirroring as part of our disaster recovery strategy. I have to say, I am very impressed with the concept. It was incredibly easy to set up and worked like a charm. It is so much easier than log shipping or replication. And mirroring failovers are faster than a cluster failover. The drawback is that mirroring works on single databases, not an entire server. You can set up mirroring for more than one database on a server, but if you are using a 32 bit version of SQL, it is recommended that you mirror no more than 10 databases. (Mirroring consumes 5 threads per database mirrored and the number of threads you can have is limited by the number of CPUs you have. Because each thread takes up memory, the larger 64 bit memory address space allows for more threads to be used. This article gives a bit more information. And the number of 10 databases max in 32 bit systems is really just a suggestion. If you need to mirror more, try testing performance in your environment. You may find you can handle more. There are some good detailed articles here and here about the factors that affect mirroring performance.)
I’m not going to go into a full explanation of database mirroring. That’s beyond the scope of this post. You can get a good overview from this Microsoft TechNet article. I also recommend reading this Microsoft Knowledge Base article about things to consider when setting up mirroring. (Note that some of these articles, and other ones I link to in this post, refer to SQL 2005. This was the version when mirroring was first released. The details still hold for SQL 2008 and 2008 R2.) I am going to talk about setting up and configuring mirroring.
First, watch this video on setting up mirroring.
It’s really that easy!
My topology was the same as the person who made the above video – a principal server, a mirror server, and a witness server. The configuration supports automatic failover and is referred to as High Availability mode. My setup is: Principal server is 32 bit SQL 2008 R2 CU1 running on a virtual machine. The mirror server is a 64 bit 2008 R2 CU1 two node active/passive cluster. The monitor server is 32 bit SQL 2008 R2 CU1 Express. Note that all versions of SQL are the same on all three machines. This is a best practices recommendation. But also note that the principal server is running the 32 bit version of SQL and the mirror server is running the 64 bit version of SQL. This is ok. Another best practices recommendation is that the directory structure on the principal and mirror servers be the same. I also made sure that the SQL Server service used the same account on all servers.
Once you have your systems set up, actually getting mirroring up and running is easy. Simply make a full backup of the database to be mirrored on the principal server, then make a transaction log backup. Restore those backups onto the mirror server. As mentioned in the video, be sure to leave the database on the mirror server in the restoring state. Then, on the principal server, right click on the database to be mirrored, choose Tasks –> Mirror… and go through the wizard. In the video, the user left the service account fields blank. When I set it up, I actually inserted the network account that was used by the various server services (which was the same account for all). That’s it! When you finish the wizard, you’ll see the database on the principal listed as “(Principal, Synchronized).” On the mirror server, the database will be listed as “(Mirror, Synchronized / Restoring…)” Since it is in the restoring state, it is not accessible on that server.
The video shows the user testing a failover by actually stopping the SQL service on the principal server. That’s rather drastic (although it does simulate a server failure). You can manually failover a mirrored database by right-clicking the database, choosing Tasks –> Mirror… and then clicking the Failover button.
One thing I did run into in setting this up was the old orphaned login problem. I restored the database onto the principal server from a backup made on our production server. Then the backup from the principal server was restored on the mirror server. So I had orphaned logins on both servers. I fixed that by using this script. I first ran it on the principal server to fix the logins there. Then, I failed over the database to the mirror server (which I had to do because the database would be in the restoring state otherwise), and ran the script on the mirror server. I verified the logins worked on the mirror server, failed the database over again and then verified the logins worked on the original principal server.
So that was all it took to get mirroring up and running. Fairly easy and much easier than log shipping.
Things To Watch For
Time to point out a few more things to watch out for.
- Database mirroring works at the database level. If your application uses just one database, this is fine. But if it uses multiple databases, you have to watch out for the situation where not all the databases fail over. In this case, you can set an alert for a database failover and then manually failover the other databases via a job that runs when the alert is raised. Details on setting this up can be found here.
- Another thing to watch out for is combining mirroring with clustering. This is allowable, but you need to be aware of one fact: the default timeout for the monitor server to detect a database failure and initiate a mirroring failover is 10 seconds. It typically takes longer than 10 seconds for a cluster to failover from one node to another. Therefore, if your principal server is a cluster, you should change the mirror monitor timeout to a larger value to prevent false failovers. Read Paul Randal’s blog post on this. Keep in mind that by increasing the timeout value, you are increasing the time the database may be inaccessible before it fails over. You’ll need to balance this timeout setting with your business requirements. In my case, the cluster server is the mirror server and I’m not too concerned about this timing issue. Our mirror server (cluster) is located in a different physical location and, should a failover occur, we will have a slower network connection for the users. We will still have functionality, but it will just be slower than normal. This is a business decision we had made, but what it means is, should a failover occur, we will most likely be manually failing from the cluster back to the previous principal as soon as possible, so we are not too concerned about an automatic failover in this direction.
- Consider your network bandwidth. You want to have a good connection between your two servers and your monitoring server to prevent false failovers. The monitor server will ping the principal and mirror server one each second. If you are running in High Availability mode, you’ll also want a good network connection between the principal and the mirror server. Because mirroring is based on synchronizing the transaction log between the principal and mirror server, if that cannot be done, the transaction log on the principal cannot be truncated and will continue to grow. The Knowledge Base article I linked to in the second paragraph gives some figures and details which Perfmon counters to monitor. SQL will compress the transaction log as it sends it to the mirror server, so that will help improve throughput. I had previously found a site that did some real world testing of the amount of data that could be sent over various networks before mirroring had problems, but I can’t find it again – sorry. In my situation, we are dealing with traffic volume an order of magnitude less than what the testers were dealing with, so I wasn’t too concerned with this.
My next post will talk about monitoring for mirroring events and why you should do that, as well as what you need to do to support mirroring on the client side of things.