How to: Configure SQL Server 2012 AlwaysOn – Part 6 of 7

How to: Configure SQL Server 2012 AlwaysOn

How to: Fail-over

In this part of the blog series we will go over the various methods of manually failing over an Availability Group. Naturally if you are using automatic fail-over then there are no steps required for this to happen in an emergency but you might find a reason to conduct a planned manual fail-over when there is no actual problem (for example, fail-over testing).

What Happens During a Fail-over

Database Component:

A fail-over occurs by sending what amounts to a “make primary” command to a secondary replica. This happens because, of course, the primary database might not be available to receive commands.

Once the command to make primary comes through, the database will become transactional (in a read/write mode) and begin accepting connections of all types. If you were using synchronous commit then there will be no data loss and this new primary replica will begin stock piling transactions in the transaction log until the previous primary comes back online to receive the transactions or data transfer is manually severed. When the primary replica comes online there are no user actions required to resume the data transfer but a fail-back would require user interaction if it was desired.

If you are using asynchronous commit (like we are in our architecture) then the secondary replica will come back online but the data transfer will be paused. At this time you can use the database as a read-only replica to identify what data loss occurred or you can resume transfer. By resuming the transfer you lose the ability to identify what data might have been lost during the fail-over.

Cluster resource / AG listener

At the same time that the database component is failing over the cluster resource is also working on its fail-over.

Which ever secondary replica is becoming the new primary now needs to run the cluster resource for the Availability Group. The most important part of this fail-over is the AG listener. When crossing within the same sub-net the move causes a momentary outage of the IP address associated to the listener (no longer than the outage incurred by the database fail-over). If the resource is moving to a node where the listener must change sub-nets it will take the former sub-net IP offline and bring online the new sub-net’s IP address.

Availability Group Fail-over Wizard

The availability group fail-over wizard can be run from either the primary or secondary replica but you will have to be able to connect to the secondary replica in order to be successful.

There are two ways to access the fail-over wizard:

Availability group dashboard

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Dashboard

The Start Failover Wizard link is available from both the primary and secondary replica dashboards.

SSMS Object explorer

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Object-Explorer

To access the fail-over wizard from the Object Explorer; right click on the availability group and select Failover…

Wizard Walk-through

Upon opening the availability group fail-over wizard for the first time you will meet the Introduction page. If you’d like, check the Do not show this page again then click Next.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Failover-Wizard-1

On the Select New Primary Replica page you will see a list of all available replicas with some status information. I find the most important column of this page to be the Failover Readiness column which will let you know whether you will have data loss by choosing this replica for fail-over.

NOTE: The Failover Readiness column is indicating data loss based on configuration not based on an actual representation of what has been synchronized. If you created a database inserted one record, verified that the secondary has the record, and then ran this wizard you would see that data loss is listed for asynchronous connections. It does not matter that there will actually not be any data loss in this situation.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Failover-Wizard-2

Once you have selected your replica to fail-over to; click Next.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Failover-Wizard-3

The Summary page will provide a list of the actions and affected components. This is where you can script out the process or go ahead and click Finish. 

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Fail-over-Page-Failover-Wizard-4

Finally, you reach the Results page and you are complete.

T-SQL Fail-over

Microsoft did a really great job at keeping the fail-over scripts simple for moving an Availability Group.

These commands must be run on the secondary replica that you wish to become the primary replica.

When no data will be lost by the fail-over

ALTER AVAILABILITY GROUP [AG_SQLTest] FAILOVER;

When data might be lost by the fail-over

ALTER AVAILABILITY GROUP [AG_SQLTest] FORCE_FAILOVER_ALLOW_DATA_LOSS;

Wrap-up:

It is easy to see that both the Availability Group Fail-over Wizard and the T-SQL methods are both very straight forward and easy to use. That is one characteristic of AlwaysOn that is very appealing, especially to junior DBAs.

Follow the links below to move on through this tutorial.

How to: Configure SQL Server 2012 AlwaysOn – Part 5 of 7

How to: Configure SQL Server 2012 AlwaysOn

Explanation of Components

Now that we have our AlwaysOn configuration setup I’d like to go over some of the components. Some are very self explanatory such as the Availability Replicas, Databases, and Group Listeners folders in your object explorer. These we dealt with directly when setting up the Availability Group (AG). Others such as the cluster resources which were implicitly created, the AG dashboards, and certain status information in the object explorer have not yet been covered.

Cluster Resources / Objects

As we have stated in previous parts of this blog series, AlwaysOn can only be used on SQL Server instances which are a member of a Windows Failover Cluster. Up until now we haven’t seen any real reason for this requirement.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Explanation-of-Components-Page-1

Above you can see the cluster resource that was implicitly created when you created your Availability Group (AG). There are two important components to this resource and one very important note.

NOTE: Do not alter this cluster resource through the Server Manager or Fail-over Cluster Manager. Microsoft mentions it in their documentation and I’ve experienced it; you can really mess up your AG if you don’t use SSMS and/or T-SQL to handle your AG changes.

Back to the componets…

When you create an AG it will create the cluster resource (AG_SQLTest as seen in the screenshot) and it will be mostly empty. Unlike the screenshot above you would only have the Other Resources section with the AG_SQLTest  component. When you create your first AG Listener the top section of the resource will be added. As you can see it is under the Server Name category and, in this case, named AGTest_Listener. 

Within the AGTest_Listener there are currently two IP addresses registered. In this case one is in the sub-net of our data center 1 and the other is in the sub-net of data center two. The listener will always live in only one sub-net at a time which is why there is one marked as offline and the other marked as online. When a server outage occurs the entire AG will attempt to fail-over to an available node. If that fail-over happens to cross sub-nets then the current IP address for the listener will go offline and the other IP address will start up.

Availability Group Dashboards

MSDB AlwaysOn Dashboard Description

One nice feature Microsoft included with AlwaysOn is a dashboard for viewing the status of your Availability Group (AG). To open the dashboard you need to:

  1. Open SSMS and, in the object explorer, right click the name of your AG. This can be found in the AlwaysOn High Availability > Availability Groups folder (as seen in the screenshot below).
  2. Select Show Dashboard.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Explanation-of-Components-Page-2

Now that the dashboard is open I’d like to bring to your attention that each replica in the AG does not have the same amount of information available to it regarding status. Only the Primary replica will be able to see the whole picture which each individual replica will only be able to speak about themselves. For this reason there are actually two different sets of information you will see from this dashboard depending upon whether you are accessing a primary or secondary replica. The first image I am presenting will be the primary replica. I will explain all the information displayed and then present the secondary replica screenshot with references to what you no longer are capable of seeing from that server.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Explanation-of-Components-Page-3

 

Orange box

This is your indicator of which replica type you are viewing for this dashboard. In the screenshot above you will notice that we are viewing the Primary replica’s dashboard.

Brown box

Availability group state: High level indicator of the health of the replica.

Primary Instance: name of the primary replica.

Failover Mode: This will be manual or automatic indicating what actions would occur if the primary replica went down.

Cluster state: This is the name of the Windows Cluster and Quorum type.

Red box

Listing of the replicas, their failover modes, and synchronization state. The synchronization state can be Not Synchronizing, Synchronizing or Synchronized. In transactional systems it is likely that this will always say Synchronizing.

Blue box

Similar to the data in the RED BOX but as you can see this area can be filtered and sorted for easy viewing. In addition, the Failover Readiness indicates whether or not data loss would occur if you failed over to that replica immediately.

Green box

At the top of this box you will notice a last updated and an auto refresh toggle. The date is the time the dashboard last refreshed (pause and unpause the dashboard to force a refresh).

Next we have three very important and useful links:

Start Failover Wizard - this is very self explanatory and the wizard itself is very straight forward to follow. Needless to say, if you want to failover manually or have to failover manually this is what I recommend you use. You also have the option of scripting out the failover from the wizard and you’ll notice that the T-SQL is actually very simple also. If there is a chance of data loss during a failover this requires you to confirm the failover one extra time which is similar to the T-SQL where they make you send a FORCE_FAILOVER_ALLOW_DATA_LOSS switch to the ALTER AVAILABILITY GROUP command.

View AlwaysOn Health Events - once again, these links are all very self explanatory. This will give you a historical break down of events with your AG.

View Cluster Quorum Information - View cluster information (see below).

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Explanation-of-Components-Page-4

Moving on to our final dashboard view. Below is a screenshot of the Secondary replica dashboard.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-5-of-7-Explanation-of-Components-Page-5

As you can see above this is the exact same setup and information as the Primary replica’s dashboard but with less available to you.

  • Your high level health indicators are grey because the secondary replica doesn’t know the state of the entire AG.
  • The replica lists only include the current replica because this replica is not aware of any other replica statuses. The only awareness comes from the Primary Instance line at the top. The replica know what instance is currently primary but does not know its status.
  • Health events and cluster quorum information links still function because this data is stored at the Windows Cluster level which this replica does have access to rather than the local instance level of the other information.

Object Explorer Status Information

In general I don’t find the Object Explorer extremely useful for monitoring Availability Groups. There are a couple of indicators which give you a very high level view of the status but the dashboard feature is far easier to use and provides far more information. I recommend utilizing the dashboard heavily and possibly look into some of the T-SQL queries for your monitoring solution.

For a measure of completeness below is a reference to the Object Explorer information and to the T-SQL monitoring.

Wrap-up:

Thank you all for following this blog series this far. By now you would know how to setup an AlwaysOn High Availability Cluster and monitor its status. Next section will focus on how to failover and the different methods to do so.

Follow the links below to move on through this tutorial.

SQL Server 2008 MCM Readiness Videos

The other day I was doing some research on the Transaction Log of SQL Server and stumbled upon the MS SQL Server MCM Readiness videos, many of which are made by Paul Randal.

After watching just one of these videos I quickly found out how useful they can be for general knowledge, even if you aren’t trying to ramp up for the MCM certification.

Below is a link to over 70 videos about all areas of SQL Server from a detailed perspective. I have found them very useful in growing my own knowledge base and recommend that others pick and chose the topics from this list that relate to them and give them a watch.

http://technet.microsoft.com/en-us/sqlserver/ff977043

How to: Configure SQL Server 2012 AlwaysOn – Part 4 of 7

How to: Configure SQL Server 2012 AlwaysOn

Setting up an Availability Group

The AlwaysOn Availability Group is a logical group of databases that you want to fail-over to your secondary site together. It is also a Windows Cluster Resource and should always be manipulated via the AlwaysOn wizards in SSMS or via T-SQL. Microsoft explicitly states that you should not treat the cluster resources the same as you would other resources and that they do not support altering these resources in any way with exception of via SSMS or T-SQL.

In each Availability Group (AG) you can have N number of databases and you may have N number of AGs per cluster. Microsoft has extensively tested configurations with up to 10 AGs with up to 100 databases per AG but states that larger numbers are supported. For larger AGs you will want to be cautious with your system resources. The following is a quoted warning from Microsoft about going beyond those tested measures.

Signs of overloaded systems can include, but are not limited to, worker thread exhaustion, slow response times for AlwaysOn system views and DMVs, and/or stalled dispatcher system dumps. Please make sure to thoroughly test your environment with a production-like workload to ensure it can handle peak workload capacity within your application SLAs. When considering SLAs be sure to consider load under failure conditions as well as expected response times.” (MSDN Citation)

Methods

There are three methods to creating a fully functional Availability Group (AG). You can use T-SQL, the AG Creation Wizard, or create each piece with SSMS. The below instructions outlines the SSMS method without using the wizard. This is done to demonstrate how each piece is configured individually in case only certain pieces are needed in different situations. The wizard would take you through all components at once and very closely matches the process of doing them individually with only minor differences.

T-SQL Steps are not being covered here but all of the same concepts apply. In addition, if you chose to use the Availability Group Wizard; at the end you are capable of clicking the Script button to export the entire configuration as T-SQL.

Group Creation

  1. Open SQL Server Management Studio.
  2. Drill down into the AlwaysOn High Availability folder under your instance in the object explorer.
  3. Right click Availability Groups in the object explorer and select New Availability Group.
  4. Configure all options on the General page. Each element will be discussed in detail below.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-4-of-7-New-Availability-Group-General-page

Availability Group Name

This is the name of your AG. This will be the name which shows up as a cluster resource in your fail-over cluster manager and in your SSMS object explorer.

Your AG name must be 15 characters or less. If you are using the GUI then it will constrain you to this limit. If you are using T-SQL you will be constrained to a 24 character limit. Even though the T-SQL scripts will allow 24 characters and your AG will be successfully created; if you ever create another cluster resource on that same cluster with identical first 15 characters then you will have conflicts. When the 24 character names are displayed they are actually being referenced as 15 character truncated names.

Availability Databases

At this time you can add databases to the AG but we are going to cover this step in the Adding Databases to the Availability Group section (below). Part of the reason for this is so that the steps for adding a database to an existing AG can be covered and the other part is because with the other wizard we will use, it is easier to see what databases are available and whether they meet the AlwaysOn pre-requisites.

You can leave this section blank if you want to follow along with this walk through step by step.

Availability Mode

There are two options for your availability mode; synchronous commit and asynchronous commit. When using SQL Server Failover Cluster Instances (FCI) like we are here you aren’t expecting to have any replicas in the local data center so asynchronous commit will be selected to prevent a large performance hit generated by large geographic distances.

If you are using this blog post as a point of reference and not seeking our Target Architecture here is a brief description of the two choices.

Synchronous Commit

This setting will commit transactions synchronously. What this means is that when a transaction is opened (whether implicit or explicit) on the primary replica; the same transaction will open on the secondary replica. Once the primary is finished; it will then wait for the secondary to send back confirmation messages indicating that the same operations were performed at the secondary and then both replicas will commit their transaction.

This adds over head and will always have some degree of performance degradation. It could be minimal for local servers but I do not recommend using this for servers which reside outside of your primary site. A large geographic distance increases the performance hit.

This setting is the only setting which supports automatic database level fail-over. What this means is that, with our target architecture, we won’t be using it because we already have automatic instance level fail-over with our SQL Server FCI.

Asynchronous Commit

This setting will transmit transactions asynchronously. This means that when a transaction begun on the primary replica; it will begin on the secondary replica as well. The difference between this and synchronous commit and that the primary replica will not wait for the secondary to finish. The primary replica will commit its transaction as soon as it is finished and there will be a send queue with virtual log files held as active until AlwaysOn finished transmitting all data to the secondary replica. The secondary replica will commit on it’s own, once the transaction is completed.

This feature does not support automatic database level fail-over and is often used for cross site manual fail-over.

Failover Mode

Your options are automatic or manual. Automatic is not supported in a SQL Server FCI and you want the instance level failover with shared storage to handle that process. Select manual so that all automated failovers are handled by the cluster and manual failovers can be triggered across data centers. In addition, automatic is not supported for the asynchronous commit setting due to potential data loss.

Connections in Primary Role

The options include allow all connections and allow read/write connections only. Here you will select allow read/write connections only. The reason for this is because, if you have data center 1 up and data center 2 also up, you don’t want to off load your read-intent only connections to the other data center. This will produce large performance problems. Allow all connections should only be used if there are local replicas available to fail-over to; thus making this replica capable of accepting read-intent only connections when it is a secondary.

Readable Secondary

For this setting you can select yes, no, or read-intent only. Here we will choose No. This is because we don’t want to allow anyone to send read-intent only connections to the replica secondary site. This is, again, because we don’t want unintentional performance issues due to hitting a remote site unnecessarily.

When a fail-over occurs; the replica will become primary and this setting will not apply until it becomes a secondary again.

Session Timeout

This is the setting which dictates when a replica is considered offline. This is mostly used for automated fail-over. Play with this setting only if you have a reason to expect that your sessions could be non-responsive for longer than the default 10 seconds.

5. Next we will select the Backup Preferences page.

6. Select Primary: All automated backups for this availability group must occur on the current primary replica. Similar to our Connections in Primary Role setting; we are choosing this only because we have a SQL Server FCI and a dual data-center architecture. We don’t want to off-load backups to a secondary site due to performance hindrances. If you are ever in a situation where you have local replica(s) then you can select Prefer Secondary and ensure that your Backup Priority is weighted properly in the Replica backup priorities group box.

7. Click OK.

Adding a Listener Across Multiple Sub-nets

  1. With SSMS object explorer open; drill down into AlwaysOn High Availability –> Availability Groups –> “your AG name”.
  2. Right click on Availability Group Listeners and select Add Listener.
  3. Set the Listener DNS Name (15 Character Limit)
  4. Set Port to anything you want noting that this is the port your application connection strings will have to use.
  5. Set Network Mode to Static IP.
  6. Click Add to add a new IP address to the listener (a minimum of one is required but in our architecture you would need two, one per subnet).
  7. Select the appropriate subnet from the drop down box.
  8. Enter your IPv4 static IP address and click OK.
  9. Repeat steps 6-8 for each subnet in the cluster.
  10. Click OK to create your new AG listener.

Adding Databases to the Availability Group

  1. With SSMS object explorer; drill down into AlwaysOn High Availability -> Availability Groups -> “Your AG Name”.
  2. Right click on Availability Databases and select Add Database.
  3. Click Next on the Introduction page.
  4. Ensure that next to the database you wish to add, the Status column indicates meets prerequisites. The most common message here for new databases is “full backup required”. Correct any issues in the status and then check the check box next to the database(s) you want to add.
  5. Click Next.
  6. Select the Full data synchronization method and specify a network share that both the primary instance and all replica instances can access using their SQL Service domain account. Do not put this share on the data or log drives for the instance in use.
  7. Click Next.
  8. Upon successful completion of the Validation click Next.
  9. On the Summary page; click Finish.
  10. View your results and click Close.

Adding a Replica to a Database

  1. With SSMS object explorer; drill down into AlwaysOn High Availability -> Availability Groups -> “Your AG Name”.
  2. Right click on Availability Replicas and select Add Replica.
  3. Click Next on the Introduction page.
  4. On the Replicas tab; click Add Replica.
  5. How-to-Configure-SQL-Server-2012-AlwaysOn-Part-4-of-7-New-Replica-Wizard-Replica-Page
  6. If this replica is in your primary data center change the Readable Secondary field to Yes else leave it to default which is No. Automatic Failover and Synchronous Commit should be unchecked for remote replicas but checked for local replicas.
  7. In the Backup Preferences tab; ensure backup priorities are set with the primary data center replicas weighed higher than remote data centers.
  8. How-to-Configure-SQL-Server-2012-AlwaysOn-Part-4-of-7-New-Replica-Wizard-Backup-Preferences-Page
  9. Click Next.
  10. Select Full on the Select Data Synchronization page since this is likely an initial creation of the replica. Change the default share listed to one on the backup storage (not the clustered or local storage for the serve) and to one where the service accounts for SQL Server for both primary and secondary instances can access the share.
  11. How-to-Configure-SQL-Server-2012-AlwaysOn-Part-4-of-7-New-Replica-Wizard-Data-Sync-Page
  12. Click Next.
  13. Upon completion of the Validation checks; click Next.
  14. Click Finish on the Summary page.
  15. Click Close once completed on the Results page.

Wrap-up:

At this point you have a functioning Windows cluster with two SQL Server FCIs using AlwaysOn for multiple sub-net fail-over. In the coming three parts we will get into the couple of components that you’ll see in the SSMS object explorer and Windows Fail-over Cluster Manager and then discuss fail-over procedures and how to use the AG listener to your full advantage.

Follow the links below to move on through this tutorial.

April 2013 – Updates for SQL Server 2008 R2 / 2012

SQL Server 2012 RTM Cumulative Update # 7

  • KB Article: KB #2823247
  • Build number is 11.0.2405
  • Applicable to @@VERSION 11.0.2100 through 11.0.2401

SQL Server 2008 R2 Service Pack 2 Cumulative Update # 6

  • KB Article: KB #2830140
  • Build number is 10.50.4279
  • Applicable to @@VERSION 10.50.4000 through 10.50.4276

SQL Server 2008 R2 Service Pack 1 Cumulative Update # 12

  • KB Article: KB #2828727
  • Build number is 10.50.2874
  • Applicable to @@VERSION 10.50.2500 through 10.50.2869

How to: Configure SQL Server 2012 AlwaysOn – Part 3 of 7

How to: Configure SQL Server 2012 AlwaysOn

Enabling the AlwaysOn Feature

Quick post on how to enable the AlwaysOn feature at the database instance level. This needs to be accomplished for each instance involved in the architecture. In our target architecture this will need to be accomplished once per node for the SQL Server Failover Cluster Instance (FCI) in data center 1 and once per node for the SQL Server FCI in data center 2.

Prerequisites

MSDN has an exhaustive set of Prerequisites, Restrictions, and Recommendations. What I’ve done here are list some of the highlights which will apply to most setups.

  • System Requirements
    • Windows Server 2008 SP2, Windows Server 2008 R2 SP1, or Windows Server 2012.
    • x86 or x64 only.
    • Each computer must be a node in a Windows Failover Cluster (WFC)
    • Each node must have a drive with the same letter as the other nodes.
  • SQL Server Requirements.
    • For Kerberos to work – all SQL instances need to run as the same domain account and SPNs must be manually registered.
    • Enterprise Edition is required.
    • All instances must use the same collation.
    • Availability Group Listeners use only TCP/IP. All clients connecting must use TCP/IP.

Enabling AlwaysOn Feature

First step in our process is to enable the AlwaysOn feature for your instance. Repeat the following steps for each of your nodes and each instance which will be involved in AlwaysOn. This will require a service restart.

  1. Open the SQL Server Configuration Manager.
  2. Select SQL Server Services on your navigation panel on the left.
  3. Right click on the SQL Service and select Properties.
  4. Once the Properties window is open; select the AlwaysOn High Availability tab.
  5. Check Enable AlwaysOn Availability Groups.
  6. Click OK.
  7. Accept the warning message which pops up.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-3-of-7-Enable-AlwaysOn-Feature

Wrap-up:

That’s it! Configuring the feature is the fastest and simplest part of this process. After reviewing my blog series plan, I probably should have merged this with my next part. Since I’ve already announced a 7 part series, though, I’ll just make sure to get part 4 out to you guys sooner than I had planned to make up for the lack of content here.

Follow the links below to move on through this tutorial.

sp_Blitz v18 – Lots of Bug Fixes

Brent Ozar’s sp_Blitz v18 has been released. For those of you unfamiliar with sp_blitz I highly recommend not only downloading it but also watching Brent’s video explanation and reading all of his documentation on it.

sp_Blitz is a great way to get a fairly complete overview of a server that you might be unfamiliar with especially from a configuration and performance perspective.

See his blog post and download link below.

How to: Configure SQL Server 2012 AlwaysOn – Part 2 of 7

How to: Configure SQL Server 2012 AlwaysOn

Building the Environment

When building our environment there is an assumption of basic familiarity with Windows Failover Clustering. The following will begin with a Windows Failover Cluster configured with 2 nodes, a shared disk added to the cluster, and 1 SQL Server Failover Cluster Instance already installed in data center 1. Also, we have 2 nodes added to the WFC for data center 2. If you need to get up to speed on how to set these up please see the references below.

What we will be doing in this part of our series is adding a new shared storage from our secondary data center (cross sub-net), setting cluster timeout settings, and then installing a second SQL Server Failover Cluster Instance in our secondary data center. We will get into detail on everything except the SQL Server installation, which we will breeze over, as it is generic and no special steps are needed for this architecture.

References:

Adding a Shared Disk to the WFC from Data Center 2:

When adding a shared disk to a WFC (normally) a validation process will occur which checks to see if the disk can be owned by all nodes currently in the cluster before allowing it to be added. This is a problem for our target architecture because when using multiple sub-nets in a WFC your disk located at one site cannot be owned by the other sites in the configuration. For that reason, we won’t be able to use the Windows Failover Cluster Manager GUI as seen below.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-2-of-7-Add-Storage-GUI

Instead we have to go to a lower-level and hit up the command line but we are getting just slightly ahead of ourselves here. First lets add a new shared disk to the nodes in data center 2. The steps to accomplish this will be more specific to your architecture because there are more than one set of steps to accomplish this depending upon whether you are using iSCSI, Fiber, etc. and what SAN software you are using.

Once the disk is added to Windows;

  1. Open up the Windows Disk Manager to initialize the disk.
    1. Right click on the disk that you want and select Initialize Disk.
    2. In the Initialize Disk dialog box, select the disk(s) to initialize. You can select whether to use the master boot record (MBR) or GUID partition table (GPT) partition style.
  2. Format the disk.
    1. NOTE: For disks containing SQL Server files it is recommended to format the disk as 64k. This provides a performance enhancement due to the fact that SQL Server processes everything in 64k extents.
    2. Using Windows Disk Manager; Right click on the drive you just initialized and select Format…
    3. Click Yes if a warning pops up.
    4. Set your format settings and then click OK.
  3. Assign a drive letter, also in Windows Disk Manager.
    1. Right click on your drive and select Change Drive Letter and Paths.
    2. Click Add…
    3. Select the drive letter you want to use. IMPORTANT NOTE: File paths must be identical for SQL Server AlwaysOn Availability Groups to Failover correctly. This means that your drive letter must be the same as the shared disk already setup in data center 1.
    4. Click OK.

Now that we have our disk ready to go, we need to add it into the WFC as available storage. This is where the command line comes into play because we need to skip the Failover Cluster Manager’s disk validation checks.

  1. Run command prompt as administrator. (To use Poweshell instead, see John Toner’s great blog post)
  2. Execute the following to create the clustered disk:
    1. In the command below; note that the “Disk Z:” is not the actual drive letter you want to use. It is going to be the name of the disk in the Failover Cluster Manager. This can be whatever you like. In the screenshot below you will see Cluster Disk 1 which is the GUI default name and Disk E: which is the one I added with these commands. Drive letter will be referenced in the next command.
    2. cluster res "Disk E:" /create /type:"Physical Disk" /group:"Available Storage"
  3. Execute the following to set the DiskPath property:
    1. Alter the DiskPath property to the letter previously selected when adding the disk to Windows.
    2. Alter the clustered disk name (in this case “Disk E:”) to whatever you named it in step 2.B.
    3. cluster res "Disk E:" /priv DiskPath="E:"
  4. Once that is complete you can close command prompt and navigate to your Failover Cluster Manager.
  5. Drill down into your WFC and navigate to the Storage folder.
  6. Right click on your new cluster disk (should be under Available Storage).
  7. Select Properties.
  8. Navigate to the Advanced Policies tab.
  9. Ensure that only data center 2′s nodes are selected checked in the Possible Owners section.
  10. Click OK.
    1. NOTE: If your disk is currently offline or in a failed state that might be because when you added the disk it tried to be owned by a node which it cannot be owned by (such as in data center 1). Now that you’ve set the possible owners correctly you will have to reboot which ever node is currently owning the failed disk so the disk can come up on a node which it can be owned by. From this point forward you will not have this issue.

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-2-of-7-Cluster-Disk-Possible-Owners

*** Our target architecture includes 2 nodes in data center 2 but in the above screenshot you will notice that only 3 nodes are present. This screenshot is only for demonstration of the possible owners properties and is not an exact indication of what your environment should look like. What is important is that you see only 1 node is checked off as a possible owner. This is because in the screenshot that is the only node in data center 2. The others cannot own the node because they are on a different sub-net. ***

Setting Cluster Heart Beat Timeout Settings:

With multi-site clustering there is always a major concern of latency. Naturally, geographic distance affects data transfer times and in some cases our companies might not want to spend the VERY LARGE amount of money to maintain a large pipe to your secondary data centers. In any case, we probably don’t want our cluster timeout settings to be default because those defaults were designed for connections local to a single data center.

By default your WFC connection will fail when 5 pings are lost (1 ping per second for a total of 5 seconds). The below cluster.exe commands are run on only one of the clustered nodes and will apply to all of them. In my example, the thresholds will be changed to a threshold of 10 pings lost with pings being sent every 2 seconds for same site and every 4 seconds for cross site (totals of 20 seconds and 40 seconds) but you can configure them to whatever you feel is appropriate to the latency and stability of your cross data-center connection.

  1. Run command prompt as administrator.
  2. Execute the following to change the same-site heart beat to 2 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetDelay=2000:DWORD
  3. Execute the following to change the cross-site heart beat to 4 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetDelay=4000:DWORD
  4. Execute the following to change the same-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetThreshold=10:DWORD
  5. Execute the following to change the cross-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetThreshold=10:DWORD

Install an Instance of SQL Server 2012 in Data Center 2:

As mentioned above, we are only going to touch on this section because there is no special steps necessary to setup this SQL Server FCI on the secondary site.

We are going to be using database level failover (AlwaysOn Availability Groups) to manually fail between data center 1 and data center 2. Since this is a database level failover as opposed to an instance level failover we need to install a second SQL Server FCI in data center 2 because the existing SQL Server FCI in data center 1 will not move during a failover.

IMPORTANT NOTE: AlwaysOn requires that your databases exist with the exact same file path on all instances. So when you setup your next instance make sure to use the disk we just configured (the one who’s drive letter is the same as the one used in data center 1) and use the same user database paths as you did for data center 1. If you don’t install with the correct path, don’t worry too much about it, you’ll just have to create an identical path on that drive to match data center 1. The identical path does not have to be the instance’s default data/log paths; it just needs to be available with property security settings when initializing the replica.

See these references for help with installing a SQL Server FCI:

Wrap-up:

We’ve now covered the abnormal steps necessary to add a shared disk when joining a multi-site Windows Failover Cluster and the configuration of heart beat thresholds to protect against latency induced by geographic distance. Please feel free to comment or ask questions and I would be more than happy to clarify any steps or provide additional detail were necessary.

Follow the links below to move on through this tutorial.

Set Windows Failover Cluster Timeout Thresholds

Purpose:

When dealing with a single site the default Windows Failover Cluster timeout settings have always done a good job for me but, once you branch out into multiple geographic sites, maintaining clusters across those sites often requires tweaking these settings.

NOTE: These changes are generic to any cluster but they are most notably used when you have an off-site DAG for Microsoft Exchange.

Setting Cluster Heart Beat Timeout Settings:

By default your WFC connection will fail when 5 pings are lost (1 ping per second for a total of 5 seconds). The below cluster.exe commands are run on only one of the clustered nodes and will apply to all of them. In my example, the thresholds will be changed to a threshold of 10 pings lost with pings being sent every 2 seconds for same site and every 4 seconds for cross site (totals of 20 seconds and 40 seconds) but you can configure them to whatever you feel is appropriate to the latency and stability of your cross data-center connection.

With Command Prompt:

  1. Run command prompt as administrator.
  2. Execute the following to change the same-site heart beat to 2 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetDelay=2000:DWORD
  3. Execute the following to change the cross-site heart beat to 4 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetDelay=4000:DWORD
  4. Execute the following to change the same-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetThreshold=10:DWORD
  5. Execute the following to change the cross-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetThreshold=10:DWORD

With Powershell:

  1. Run PowerShell as administrator.
  2. Import the FailoverClusters module.
    1. Import-Module FailoverClusters
  3. Execute the following to change the same-site heart beat to 2 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetDelay=2000:DWORD
  4. Execute the following to change the cross-site heart beat to 4 seconds (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetDelay=4000:DWORD
  5. Execute the following to change the same-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop SameSubnetThreshold=10:DWORD
  6. Execute the following to change the cross-site ping loss threshold to 10 lost pings (replace clustername with the actual WFC name).
    1. cluster /cluster:clustername /prop CrossSubnetThreshold=10:DWORD

How to: Configure SQL Server 2012 AlwaysOn – Part 1 of 7

How to: Configure SQL Server 2012 AlwaysOn

Scope and Architecture

This is the first post in a series to provide step-by-step instructions on how to setup and configure a SQL Server 2012 AlwaysOn in a dual data-center Windows clustered environment using shared storage in multiple subnets.

When I say, step-by-step, what I mean is that I will cover the steps which are specific to using AlwaysOn in a dual data-center Windows clustered environment using shared storage in multiple subnets. This does not include generic setup procedures for Windows Failover Clusters and installation of SQL Server Failover Cluster Instances. The basics will be touched on with references but only the steps specific to this architecture and AlwaysOn will be presented in detail.

Scope:

There is a large amount of information already out on the internet about AlwaysOn and I will attempt to link out to these sources as much as possible when it comes to background information and concepts of AlwaysOn. The scope of this blog series is to provide step-by-step instructions on how to setup a particular architecture (architecture explanation is below). I have chosen an architecture which includes Windows clustering (required for AlwaysOn), two data-centers, and a combination of AlwaysOn database level fail-over and Windows clustering instance level fail-over. The most desirable aspect of this architecture is the ability to use shared storage to minimize disk space usage but still provide multiple data-center capabilities. The steps provided here were tested only with the target architecture but in most cases apply to all OS and SQL versions that AlwaysOn is available to.

Background:

Target Architecture:

How-to-Configure-SQL-Server-2012-AlwaysOn-Part-1-of-7-Target-Architecture

In this setup we are looking to produce two SQL Server Fail-over Cluster Instances (FCI), one in Data Center 1 and one in Data Center 2. Each data center has two physical nodes to the cluster but all four nodes are technically a member of the same Windows Fail-over Cluster (WCF).

Before we get into too much more detail I want to explain why I’m doing this. AlwaysOn has the ability to build this same architecture without using FCIs; instead it just uses normal standalone instances on each node but it produces a replica copy on each instance for up to four total replicas. There are two situations I can think of where you would want to use the FCIs as I’m targeting. The first is if you have a large cluster such as five nodes in Data Center 1 alone and you still want to add more to fail-over to Data Center 2. With a limit on replica copies you might constrain yourself to smaller clusters. The second scenario I can think of is if you have disk space limitations with very large databases (VLDB). This second reason was why I set this up at my organization, I had a 7 TB database that I didn’t want multiple copies of. I accepted the need for two copies (one in each data center) but I didn’t want to go any further than I had to while still maintaining automatic fail-over locally and manual fail-over cross data centers.

In addition, this step-by-step guide will cover the entire range of options that you could use to setup your AlwaysOn architecture. That makes this architecture a good tool for anyone.

Back to the details; below is a list of the hardware and software specs used during this demonstration.

Hardware

  • Nodes A and B
    • 4 CPUs
    • 4 GBs of RAM
    • 60 GB local storage (C:)
    • 20 GB ISCSI SAN attached storage (E:)
    • Located in Data Center 1
  • Nodes C and D
    • 4 CPUs
    • 4 GBs of RAM
    • 60 GB local storage (C:)
    • 20 GB ISCSI SAN attached storage (also named E:)
    • Located in Data Center 2

Software

  • Windows Server 2008 R2 Enterprise SP1
  • 64-bit
  • SQL Server 2012 Enterprise w/ CU2 (11.0.2325)

Wrap-up:

That should set the basis for what we will accomplish with this blog series. Follow the links below to move on through this tutorial.