Availability Group Architecture – DR on the Cheap

Last time, I discussed how to limit data duplication between data centers as a cost-effective option with an Availability Group. In this post I will take cost reduction one step further by short-changing the disaster recovery site.

Disaster recovery on the cheap

This architecture is the most cost-effective way to use an Availability Group and still have both high-availability (in the primary site) and disaster recovery. The first level of cost savings comes from having only two copies of your database(s). Remember, in my preferred architecture there were four copies of the database(s). The second level of cost savings comes from having only one server in the disaster recovery site, which means shared storage is not required there. Not needing shared storage might mean that you do not have to own a SAN in your DR site.

Aside from the reduction in hardware, this architecture follows all the same concepts previously covered in this blog series. A SQL Server Failover Cluster Instance handles automatic failover within the primary site, and an Availability Group connects the two data centers asynchronously.

It is important to point out that high-availability only exists until a disaster strikes. Once the Availability Group fails over to the DR site, there is nowhere else to go if that server has any issues. This is less protection than either of the multi-site architectures we already covered.

Pros

  • Reduced data duplication! 2x data duplication instead of 4x.
  • Reduced server object maintenance! Since FCIs use shared storage, even your system databases will fail over within their local site. This means that you will only have to synchronize your server objects between data centers.
  • High-availability is achieved, but only for the primary site.
  • Disaster Recovery achieved! The Availability Group synchronizes the data and handles manual failover between sites.
  • Fast cross data center failovers! When using the Availability Group Listener, your applications can connect to a single virtual network name, and that name will handle changing subnets without manually updating DNS aliases or changing application connection strings. Set MultiSubnetFailover=True in your connection strings so that clients try the listener's IP addresses in parallel.
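To make the listener point concrete, here is a minimal sketch of a connection string that targets an Availability Group Listener with multi-subnet failover enabled. The listener name, database name, port, and ODBC driver version are all hypothetical placeholders, not values from this series. Note that the ODBC driver spells the option `MultiSubnetFailover=Yes`, while ADO.NET connection strings use `MultiSubnetFailover=True`.

```python
def build_listener_connection_string(listener: str, database: str, port: int = 1433) -> str:
    """Build an ODBC connection string aimed at an Availability Group Listener.

    Every value passed in is a placeholder; substitute your own listener
    name, port, and database.
    """
    parts = {
        # Assumed driver version; use whichever ODBC driver you have installed.
        "Driver": "{ODBC Driver 17 for SQL Server}",
        # Always connect to the listener name, never to an individual server.
        "Server": f"tcp:{listener},{port}",
        "Database": database,
        # Windows authentication, assumed here for brevity.
        "Trusted_Connection": "Yes",
        # Try all of the listener's IP addresses in parallel after a failover.
        "MultiSubnetFailover": "Yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical listener and database names, for illustration only.
conn_str = build_listener_connection_string("ag-listener.example.com", "SalesDb")
print(conn_str)
```

The resulting string could be handed to an ODBC client library such as pyodbc (assuming it is installed). Because the application only ever knows the listener name, nothing in this string has to change when the Availability Group moves to the DR site.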

Cons

  • No high-availability in DR site! With only one server in your DR site, the stability of that server becomes the linchpin of your DR plan. A successful failover followed by a complete outage might be a résumé generating event.
  • Complications. The same non-default configurations recommended before still apply, such as the cluster heartbeat settings and quorum configuration.
  • Troubleshooting challenges. By mixing two features together, your troubleshooting process becomes much more complicated. There are more variables to consider around how and why a failover could occur, and many more moving parts to worry about.
  • Shared disks, in the primary site. The remote nature of shared storage makes connectivity a concern. In addition to network stability, you have to be concerned with storage system uptime. The lack of physical separation between disks is a risk.

When to choose this

This architecture can be used when your organization does not value its secondary data center the same as the primary. It is a best practice to have matched or similar hardware between your primary and disaster recovery sites, but that is not always possible. When costs need to be reduced, it is better to have one failover server that you know can handle the workload than two servers which are underpowered. Underpowered hardware can easily become an effective outage if it causes timeouts as soon as a production workload is placed on it.

Wrap-up

In this four-part blog series I covered Availability Group architectures which meet the needs of different types of organizations. These are the same architectures that I reference in my presentation, Architecting Availability Groups. I urge you to check out the presentation materials and look me up at your local SQL Saturday.
