Post by Chris Shaw
Introducing Availability Groups into your SQL Server environment isn’t an overly complex technical task, and with the wizard that is included in SQL Server Management Studio, most organizations can count on the deployment of the availability groups being successful. However, there are some pitfalls that may be encountered, where inexperience could leave an organization flustered with the technology. In this series, we will review a few of the common pitfalls and provide solutions to increase your confidence that your databases will be online when they are needed the most.
Seasoned IT professionals think of traditional clustering as being implemented on a server level, when one server in the cluster has a qualifying event the entire SQL Server instance fails over to the other server. One of the key distinctions between availability groups and traditional Windows clusters is what happens during the failover and what components are included in the failover.
With Availability Groups the core technology is database based and not server based. When a failover occurs only the databases in the availability group move over to the other server. Anything that is not stored in a user database will not failover to the secondary nodes. By comparison, traditional clustering is an all or nothing solution while availability groups are more granular with more flexibility. There are many other differences that are out of the scope for this post, that we can review later.
One of the crucial areas where the Availability Group wizard falls short is ensuring that the SQL Server authenticated accounts that users and applications utilize for data access exist everywhere they need to be. If the accounts are not set up manually on each of the replicate nodes, a failover could occur and, even though the databases may be online and operational, access will be denied. The outage was caused by the user becoming orphaned upon failover. To avoid this situation, we need to understand the basic principle of accessing SQL Server.
Understanding how a user authenticates within SQL Server to gain access the data required is critical to understand when working with SQL Server Availability Groups. If accounts are not set up correctly when an Availability Group is configured, your experience with a failover will leave you with the impression the failover didn’t work. The most frustrating aspect of this is if you create SQL Server accounts using the standard method and simply try to repeat these steps on secondary servers it’s likely that the accounts on the secondary nodes will still fail.
Understand the basics of SQL Server Security
An account in SQL Server can be one of two basic types, Windows authenticated accounts (active directory accounts) and SQL Server authenticated accounts. A SQL Server authenticated account is created in the SQL Server system database (master) with the password stored in an encrypted state. When either account type is added to SQL Server, the account is required to have a unique ID that DBA’s call a SID. With Windows authenticated accounts the SID mathes the one assigned by AD, however with SQL Server Authenticated accounts the SQL Server creates the SID. Once the account has a SID assigned, the user now has a login. Understand at this point the login doesn’t have a lot that it can do, there are a handful of system tables it has access to along with some stored procedures in the system databases but that is about it. For this login to get access to the data it needs, it needs to be associated with a database user.
A database user is what SQL Server uses to grant access to use database objects. When a login needs access to something that is inside a database, a user needs to be created along with the permissions that the user has access to. These security details are stored in the individual database where the database user has access. SQL Server links logins to database users based of the SID that was created with the login was created.
The key point to understand about security in regards to Availability Groups is, to have access to a database the database user must have a link to a server login. The login is server based where the user is database based, and this relationship is based on the SID.
Failovers and Logins
When an availability group failover occurs all the information that is contained in the databases doesn’t really move anywhere (this includes the database users), the primary database simply changes mode from primary to secondary, and depending on your configuration it may enter a read only mode otherwise it will not be readable (the only thing that moves is the traffic, and this requires a listener). The replicate (secondary) becomes the primary node and because the databases are kept in sync all the database users already exist in the database. This is because the users of the database are stored in the databases.
This is where many organizations have issues. The SQL authenticated logins on the primary node were created with a specific SID. If that login does not exist on the secondary node with that exact SID, the link between the user and the login cannot be made (this is also referred to as orphaned), therefore access is going to be denied. The SQL Server database failover may have occurred with no negative impacts to the data yet the users are unable to log in giving the appearance the failover didn’t work.
Here is where experience and planning come into the picture. There are a few ways to solve this problem so this type of outage does not occur:
- Use the stored procedure sp_help_revlogin to script out the logins from your primary server to generate a script to be executed on the secondary servers. This will ensure the password for the logins is the same and that the SID’s match. One thing to consider when using Windows Authenticated accounts the SID is assumed from Active Directory, therefore, the login simply needs to exist on the replicates.
- Schedule a process to compare logins and SIDS between the replicates, this will ensure that all the logins exist where they should be.
- SQL Server has introduced contained logins. These can be powerful and encapsulate all the login information inside the user database, these will avoid the issue by removing the possibility of non-matching SIDS between replicates.
When my spouse is waiting for a web based game to load or looking at a new purchase online, she is quick to point out that when something doesn’t work correctly, it is the cause of the DBA. She makes this claim even if our internet connection is down, or any other non-database related issues. I use to have fun with this until it occurred to me, as far as she is concerned, if she can’t get to the data, the database is down. Just because an Availability group failsover successfully does not mean the applications are online. If users cannot connect to the database, their confidence in the database plummets.
Availability Groups were added to SQL Server giving organizations options when it comes to the resiliency for their databases. The technology is sound and offers Enterprise availability when implemented correctly. If you need assistance in architecting or implementing your SQL Server Availability Groups contact us.