Availability Group Architecture – Adding a Second Site

Last time, I discussed the simplest Availability Group architecture, with only a single site. In this post I will expand on that architecture by adding a second site, which gets us more than just high availability.

[Figure: Cross sub-net availability group (AG-HA-DR)]

In the architecture above, replicas A and B are in the primary data center while replicas C and D are in the disaster recovery (DR) site. As in the previous architecture, the disks are shown as local, but what matters most is that they are physically separate. SANs are wonderful systems with a lot of redundancy, but they can also be a single point of failure. Keep your Availability Group disks separate.

The Windows Server Failover Cluster (WSFC) spans both data centers and sub-nets. When the nodes of a WSFC are not all in the same network, you should customize the heartbeat thresholds so that normal cross-site latency is not mistaken for a failed node. You will also have to configure your quorum. Quorum is a much larger topic than can be covered here, but I recommend researching it before you begin using Availability Groups.
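To show why quorum deserves that research in a two-site cluster, here is a minimal Python sketch of plain vote-majority arithmetic. It assumes one vote per node plus an optional witness vote, and it deliberately ignores dynamic quorum and per-node vote weights, which a real WSFC can use. The function name and the vote counts are illustrative, not part of any cluster API.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Simple majority rule: strictly more than half of all votes must be online."""
    return votes_online * 2 > total_votes


# Four nodes, two per site, no witness: losing either site leaves 2 of 4 votes.
four_nodes_one_site_down = has_quorum(votes_online=2, total_votes=4)

# Same four nodes plus one witness vote that stays reachable: 3 of 5 votes.
with_witness_reachable = has_quorum(votes_online=3, total_votes=5)

# Same five votes, but the witness goes down together with a site: 2 of 5 votes.
with_witness_lost = has_quorum(votes_online=2, total_votes=5)

print(four_nodes_one_site_down)  # False
print(with_witness_reachable)    # True
print(with_witness_lost)         # False
```

Under these simplified assumptions, an even split of votes across two sites cannot survive the loss of either site, and where the extra vote lives decides which site can stay online on its own.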

The connecting lines represent the data flow between replicas. The streams originate at the primary replica, replica A in this case, and go to each secondary. For this reason, each secondary replica has minimal information about the health of the other replicas. Ideally, the synchronization mode between replicas A and B would be synchronous, which is required for automatic failover, while the modes from A to C and A to D would be asynchronous to prevent performance issues due to network latency. After a manual failover to the DR site, the synchronization modes should be reconfigured so that synchronous commit is again used only within the new primary's local data center.
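The rule "synchronous inside the primary's data center, asynchronous across sites" can be sketched in a few lines of Python. This is only an illustration: the replica names, site names, and the availability group name MyAg are hypothetical, and the helper functions are mine, not part of any SQL Server tooling. The generated statements use the T-SQL ALTER AVAILABILITY GROUP ... MODIFY REPLICA syntax. The sketch does not touch failover modes, so keep in mind that a replica set to automatic failover has to be switched to manual failover before it can be made asynchronous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str  # hypothetical server name
    site: str  # hypothetical data center label


def plan_commit_modes(replicas, primary_name):
    """Synchronous commit for replicas in the primary's site, asynchronous elsewhere."""
    primary = next(r for r in replicas if r.name == primary_name)
    return {
        r.name: "SYNCHRONOUS_COMMIT" if r.site == primary.site else "ASYNCHRONOUS_COMMIT"
        for r in replicas
    }


def render_tsql(ag_name, plan):
    """Render one MODIFY REPLICA statement per replica (availability mode only)."""
    return [
        f"ALTER AVAILABILITY GROUP [{ag_name}] MODIFY REPLICA ON N'{name}' "
        f"WITH (AVAILABILITY_MODE = {mode});"
        for name, mode in plan.items()
    ]


replicas = [
    Replica("A", "primary"),
    Replica("B", "primary"),
    Replica("C", "dr"),
    Replica("D", "dr"),
]

# Plan for the situation after a manual failover that made replica C the primary.
after_failover = plan_commit_modes(replicas, primary_name="C")
for statement in render_tsql("MyAg", after_failover):
    print(statement)
```

With replica A as primary the plan keeps A and B synchronous and C and D asynchronous. With replica C as primary the plan flips, which is exactly the reconfiguration step that is easy to forget after a DR failover.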

The Availability Group Listener is not represented in the diagram, but you will need a listener to get the fastest failovers and to use features such as read-only routing. The listener is a virtual network name (VNN) that follows the primary replica wherever it fails over.

Pros

  • High availability is achieved! This is true for all four of the architectures that I will be covering.
  • Disaster recovery is achieved! Now that we have added a remote site, we can recover from a disaster.
  • Fast cross data center failovers! When using the Availability Group Listener, your applications connect to a single virtual network name, and that name handles the change of sub-net without you manually updating DNS aliases or changing application connection strings. Set MultiSubnetFailover=True in your connection strings.
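The connection-string advice in the last bullet can be sketched as follows. This minimal Python example only builds an ODBC-style connection string and does not open a connection. The listener name, database name, and driver version are assumptions for illustration. Note that the ODBC driver spells the value MultiSubnetFailover=Yes, while .NET connection strings use MultiSubnetFailover=True.

```python
def build_connection_string(listener: str, database: str, read_only: bool = False) -> str:
    """Build an ODBC connection string that targets an Availability Group Listener."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",  # assumed driver version
        "Server": f"tcp:{listener},1433",
        "Database": database,
        "Trusted_Connection": "Yes",
        # Lets the client try the listener's IP addresses in every sub-net in parallel.
        "MultiSubnetFailover": "Yes",
    }
    if read_only:
        # Declares read intent so read-only routing, if configured, can apply.
        parts["ApplicationIntent"] = "ReadOnly"
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical listener and database names.
print(build_connection_string("ag-listener.example.com", "SalesDb"))
print(build_connection_string("ag-listener.example.com", "SalesDb", read_only=True))
```

Because the application only ever names the listener, the same string keeps working after a failover to the DR site.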

Cons

  • EVEN MORE data duplication. Now that you have four replicas, you have four complete copies of your data. This means that your DR site must be able to support the same storage capacity as your primary site (matching hardware is recommended even when you are not using AGs).
  • Server objects. You now need to synchronize your server objects across four nodes.
  • Complications. Non-default configurations, such as the cluster heartbeat settings and quorum configuration, are beginning to come into play.

When to choose this

This is my preferred architecture of the four that will be covered in this blog series. It strikes a good balance between complexity and features. Most importantly, both high availability and disaster recovery are satisfied. I also like that local disks are still an option. The thought of having local, dedicated, solid state drives (if they can be afforded) to support a database server makes me drool.

Next time

In the next post of this blog series I will add SQL Server Failover Cluster Instances into the mix. This adds to the overall complexity but reduces disk space requirements and the number of cluster nodes that server objects must be synchronized between.
