When a cluster has an Availability Group installed on it, there is an optional listener which is used to manage connections through fail-over events. This listener has an IP address assigned to it and, when the Availability Group fails over within the same sub-net, the listener simply comes online on the new node. That node will now own the IP address and all connections going to the virtual network name, of the listener, will go to the correct node. This behavior is the same as using a SQL Server Fail-over Cluster Instance. There are, however, two features which the Availability Group listener has, which complicate things to a small degree.
The first feature is the ability to off-load read operations to a read-only replica. To coordinate this, there is a read-only routing list configured in SQL Server and the Availability Group Listener will accept a new connection parameter called ApplicationIntent. You can read more on that parameter here. For now, I will simply state that passing in READONLY or, the default, READWRITE, will tell the listener to route you to different nodes of the cluster to meet your connection needs.
The second feature is the ability to use a single virtual network name in multiple sub-nets. An Availability Group Listener can have multiple IP addresses, but only one active at any given time. When you fail-over to a different sub-net, the active IP is disabled and the other sub-net’s IP will come online. This is wonderful, but causes some issues due to the behaviors of your’s or your server’s local DNS cache and time-to-live settings. To cut down time during a fail-over, the MultiSubnetFailover parameter was created. Once again, please read here for more detail. This parameter allows the client to attempt a connection on multiple IP addresses at once. The first one to connection successfully wins. This works because the listener cannot have more than one IP address online at a time, so you will always get the active IP without any complications from an out-of-date DNS cache.
The SSMS challenge
So you have all of your applications and services using the ApplicationIntent and MultiSubnetFailover parameters. They’ve run successfully and you’ve conducted a few successful fail-over tests. So what is the problem?
- Sometimes, when I use SSMS to connect to my listener, it hangs for 15 seconds and then fails to connect. Sometimes it works, though.
- When I connect to the listener, I always hit the READ/WRITE node and the other DBAs have been upset with the read load I’m putting on the primary node.
- I can’t save these parameters anywhere… I’ll explain this problem more down below.
Solving problems #1 and #2
To guarantee you will connect on the correct IP address and get routed to the correct node, you need to pass in the parameters mentioned above. In SSMS, you will need to follow these steps to connect.
Enter your listener name and then select Options.
Setting your Connection time-out to 30 seconds can be helpful here. It is not required to connect to an Availability Group listener but, if you forget to pass in the MultiSubnetFailover parameter, it will allow for a successful connection, even if a bit delayed.
Next, navigate to the Additional Connection Parameters tab and enter the connection parameters as seen below.
Dealing with problem #3
Using the above method will connect you to your listener as expected. Problem #3 is the fact that these connection parameters cannot be saved anywhere. SQL Server Central Management Server and local Registered Servers do not have a means of storing this information. In my opinion, this is a major over-sight by the SSMS development / project management team and the feature has not been added in SSMS 2014 either. Please take the time to vote up this connect.microsoft.com ticket so that Microsoft will see our desire for this feature.
With that being said, I have no solution for making this process easier, except for the connection time-out trick, that I mentioned above. After about 20 seconds of trying to hit the wrong IP address, the connection will find the correct one succeed. This means that, if you are OK hitting the primary replica, you don’t need these parameters to make a connection. That time-out setting is persisted, so you won’t have to change it each time you try to connect.
UPDATED: AS OF 3/10/15 WE HAVE A SOLUTION!
Microsoft responded to our connect ticket and this issue will be fixed in the version after SQL Server 2014. Until then there is a work around, check out my new post on storing optional parameters in SSMS.