We had a request from our internal audit team recently to come up with a process to identify which SQL logins had not been used in 30 days so we could deactivate them. The driving force behind this was to make sure that a user’s access was disabled when they left the company. Now most of our access is granted via Windows Active Directory accounts, so when that account is deactivated, their SQL Server access is automatically cut off. But we have some third party programs that require SQL logins and we had to come up with a method for dealing with those. Our audit team was worried about someone forgetting to disable a SQL account and wanted to have a method of detecting old accounts that might still be active but not used.
Out of the box, SQL Server has no way to track the last time a login was used. (Nor the last time a database was accessed. These are two features I’d very much like to see implemented in the future.) I decided to use a logon trigger to log each time an account connected to a table in our DBA Utilities database that we have installed on each server. We could then compare those active logins to all logins on the server and see which ones hadn’t been used in X number of days. That information would then be passed on to the appropriate managers, who would give us feedback as to which accounts we could delete.
So I asked my co-DBA to develop a trigger that would log this information to a simple table that contained two columns – the login account and a date field indicating the time of the login. Because we didn’t need to keep historical logon data, he wrote the routine so that if there was not already a row for the existing login, it was added. If there was a row, the date field was updated to the current time. Thus, our table would have a most one row for each login on the server and not require any sort of maintenance to keep from growing over time.
It all seemed very straightforward. Then we installed it onto our test system and ran into problems right away.
One very thing to note when working with logon triggers is that they execute in the security context of the login being used. So if User A is logging in, the trigger will execute with the permissions assigned to User A. In our case, User A (and all users other than sysadmins) doesn’t have access to our DBA Utility database, so the trigger was not able to read the login table. What happens then? SQL doesn’t let that account log in. You will get a login failed error with the message “Logon failed for login UserA due to trigger execution.”
We quickly discovered a corollary to this – when working with logon triggers, it is very easy to lock yourself out of your server! If you have some sort of error in your trigger code that causes it to fail, you will not be able to login, even using an account with sysadmin rights. Fortunately, you can get around this by connecting via the DAC or by starting the server in minimum configuration mode (using the -f command line option). Once you’ve done that, you can connect and disable or delete the trigger.
So, back to our permissions issue. We overcame that problem by changing our trigger to use the EXECUTE AS clause to switch to a security context that had permissions in the DBA Utilities database. Now, because we want to log the account that is connecting, we also had to take care to store the original login into a variable BEFORE issuing the EXECUTE AS command. Otherwise, the SUSER_NAME() function returns the account specified in the EXECUTE AS clause.
Once all those changes were made, our trigger worked well. Except we noticed some strange behavior. We started seeing a lot of severity 20 alerts stating “The client was unable to reuse a session with SPID XX.” What’s more, these alerts were being fired at times when no one was using the system – 2 AM, 5 AM, etc. and usually at the top of the hour. That pointed to some sort of scheduled job.
After some investigating, my fellow DBA came across some posts that indicated this problem might be caused when a logon trigger encounters a deadlock. This made sense. We have several SQL Agent jobs that kick off at the same time, so the trigger was trying to update the same row in our login audit table.
We had a couple different ways to work around this. We could schedule our jobs to not start at the same time – move the start time of one by 1 minute forward or backwards, for example. This is not the best solution as I imagine in the future, someone (most likely me) would forget about this and schedule a job at the same time as another again. We went with a more robust method and ended up making a logging exception table that contained a list of logins that the trigger would not log and we added the SQL Server Agent account to that table. Now, the trigger checks the exception table before trying to log a connection. This solved our problem. (Although it could still happen if two users tried to login using the same account at exactly the same time, but that scenario is highly unlikely.)
Logon triggers provide a powerful means to audit server access, but you need to be careful when setting them up, as you can easily lock yourself and others out of the server and / or cause other problems.