Writing code for a database is more of a challenge than writing it for an application. Before the developers bite my head off: I do not mean that languages such as T-SQL are more difficult, or that applications are any less sophisticated than database management systems (DBMS). An important difference, however, is that the state of the database is absolutely critical when running any code, whereas with most applications you can simply replace assemblies or script files. Applications only exist while they are running; otherwise they are simply files. When it comes to data, we are not updating a feature in the DBMS, we are modifying the data itself.
Desired state development
The state-sensitive nature of a database means that code you tested five minutes ago might still blow up in production if, during those five minutes, the state of the database changed in a way that is significant to your script. This is why we are seeing products, like Microsoft’s SQL Server Data Tools, which focus on desired state development. The concept is that the product compares the database to the desired state and generates a non-breaking script to move from state A to state B, even though you did not know what state A would be until the script was executed.
I have a lot of positive comments that I could make about this concept, but I maintain that it will never be complete. There are state migration scenarios which require custom logic to handle properly, and any tool is going to err on the side of caution when it comes across them. It is safest to put the decision into the developer’s or DBA’s hands, and I completely agree with that design choice.
Understanding the problem, and accepting that we will need to deal with these challenges ourselves for the foreseeable future, we need to learn to always write repeatable code.
I have often seen what I refer to as lazy resilience. The best example of this is when trying to create a table.
```sql
USE [Demo]

DROP TABLE dbo.Demo1

CREATE TABLE dbo.Demo1
(
    id INT IDENTITY(1,1) NOT NULL,
    myCol NVARCHAR(10) NULL
)
```
This method technically works. Hopefully you are already getting angry with the author of this snippet. Some problems with this script:
- The DROP TABLE will throw an error if this is a first-time creation. Just because SSMS defaults to XACT_ABORT OFF does not mean that you should rely upon the script continuing and creating your table. Maybe this script ends up being run with XACT_ABORT ON.
- If the table does exist, you have just set yourself up for potential data loss. Read that again: data loss.
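To illustrate the first bullet, here is a sketch of how the same snippet should behave when the batch happens to run with XACT_ABORT ON (assuming dbo.Demo1 does not exist yet):

```sql
--Illustration only: with XACT_ABORT ON, the run-time error from the
--failed DROP should abort the entire batch, so the table never gets created.
SET XACT_ABORT ON;

DROP TABLE dbo.Demo1;   --error 3701: table does not exist; batch aborts here

CREATE TABLE dbo.Demo1  --never reached
(
    id INT IDENTITY(1,1) NOT NULL,
    myCol NVARCHAR(10) NULL
);
```

Under the SSMS default of XACT_ABORT OFF, the same error is statement-terminating rather than batch-terminating, which is exactly the behavior the lazy script is silently depending on.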
To be fair, I did choose the worst-case scenario here. I can hear whispers in my mind saying, “we would use an IF EXISTS check to prevent the error,” and, “well of course we would not do this for a table, but for a stored procedure or view, that is ok.”
Let us take a look at a stored procedure example with those whispers in mind. Taking an existing stored procedure, dbo.spDemo, we will glance at its current permissions.
```sql
SELECT dpals.name,
       OBJECT_NAME(dperms.major_id) [ObjectName],
       dperms.permission_name,
       dperms.state_desc
FROM sys.database_principals dpals
INNER JOIN sys.database_permissions dperms
    ON dperms.[grantee_principal_id] = dpals.[principal_id]
WHERE dpals.name = 'DemoUser'
  AND OBJECT_NAME(dperms.major_id) = 'spDemo'
```
Now we will update our stored procedure with our “repeatable” script. I even included the schema name in the IF statement, which I often see omitted.
```sql
USE [Demo]

IF EXISTS ( SELECT *
            FROM sys.objects o
            INNER JOIN sys.schemas s ON s.schema_id = o.schema_id
            WHERE [type] = 'P'
              AND s.name = 'dbo'
              AND o.name = 'spDemo' )
BEGIN
    DROP PROCEDURE dbo.spDemo
END
GO

CREATE PROCEDURE dbo.spDemo
AS
BEGIN
    SELECT 'My updated procedure' [Message]
END
```
Executing our permissions script again shows us that we have just revoked access to this procedure from DemoUser.
You could include the permissions at the bottom of your create script each time, but that becomes problematic when you are not the person who manages security on all of the servers.
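If you did go that route, the re-grant itself is a one-liner; the hard part is knowing what to grant on each server. A sketch, assuming DemoUser previously held EXECUTE:

```sql
--Hypothetical re-grant appended after the CREATE PROCEDURE.
--Only safe if you actually know, and are authorized to manage,
--the permissions on every server this script will target.
GRANT EXECUTE ON dbo.spDemo TO DemoUser;
```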
My preference is to do what you actually intend. If you are looking to create or alter an object then CREATE or ALTER it, do not DROP anything. This is what I use for stored procedures and is equally relevant for other SQL Server object types.
```sql
USE [Demo]

IF NOT EXISTS ( SELECT *
                FROM sys.objects o
                INNER JOIN sys.schemas s ON s.schema_id = o.schema_id
                WHERE [type] = 'P'
                  AND s.name = 'dbo'
                  AND o.name = 'spDemo' )
BEGIN
    EXEC ('CREATE PROC dbo.spDemo AS SELECT ''Stubbed''');
END
GO

ALTER PROCEDURE dbo.spDemo
AS
BEGIN
    SELECT 'My updated procedure' [Message]
END
```
Using the CREATE/ALTER method you preserve metadata such as object level permissions, extended properties, etc. You also have a means of quickly detecting problems, if proper error handling did not already bubble up an exception for you.
```sql
--sys.syscomments is a deprecated compatibility view;
--sys.sql_modules.definition is the modern equivalent.
SELECT OBJECT_NAME(id) [ObjectName]
FROM sys.syscomments
WHERE [text] LIKE '%stubbed%'
```
This query will find all of the objects which failed to create properly. Remember, though: when altering, you never dropped anything, so a failed ALTER simply means the object was not updated, leaving you with no change. You will never be left with the object dropped and no new version to replace it.
Creating repeatable DDL scripts is easy compared to DML scripts. With DDL, you can design a pattern, such as the one I recommend above, and reuse it for everything. With DML, your code has to be heavily customized to what you are trying to do. To demonstrate this, I will make up a scenario which could exist in reality and then show how I would make the script repeatable.
You are working for a company that provides tax preparation services. This company has the following table.
```sql
USE Demo

CREATE TABLE dbo.TaxReturn
(
    taxReturnId UNIQUEIDENTIFIER NOT NULL DEFAULT(NEWID()),
    taxReturnXml XML NULL,
    originalReturnId UNIQUEIDENTIFIER NULL,
    isSupportCopy BIT NOT NULL DEFAULT(0)
)

INSERT INTO dbo.TaxReturn
    (taxReturnId, taxReturnXml, originalReturnId, isSupportCopy)
VALUES
    ('55304D5A-3ED5-4208-853A-2BA90F32FD0E', '<Return>data</Return>', NULL, 0),
    ('FAD116FF-D269-4C48-BC05-32649E419BC0', '<Return>data</Return>', '55304D5A-3ED5-4208-853A-2BA90F32FD0E', 1),
    ('88C705B5-0640-4D2B-AEF6-7963C622798F', '<Return>orig</Return>', NULL, 0),
    ('FFA528DB-0124-4799-A722-AD93EAFDA72C', '<Return>changed</Return>', '88C705B5-0640-4D2B-AEF6-7963C622798F', 0),
    ('E803EA1F-2C48-4922-ACBE-BCCB57C29480', '<Return>orig</Return>', NULL, 0)

SELECT * FROM dbo.TaxReturn
```
A performance analysis was conducted, and it was noticed that originalReturnId was only ever used to determine whether the tax return was the original. A decision is made to convert the UNIQUEIDENTIFIER column to a BIT.
Here is the script that you want to use to make the change. It will run, but it is not repeatable because the ALTER TABLE statement will throw an exception on the second run. The UPDATE statement can be run multiple times but, best case, it would perform unnecessary reads and writes and, worst case, it would reset data that you had already made further changes to.
```sql
USE Demo

ALTER TABLE dbo.TaxReturn
    ADD isOriginal BIT NULL

UPDATE dbo.TaxReturn
SET isOriginal = CASE WHEN originalReturnId IS NOT NULL THEN 0
                      ELSE 1
                 END

ALTER TABLE dbo.TaxReturn
    ALTER COLUMN isOriginal BIT NOT NULL

--Update stored procs and views as well.

SELECT * FROM dbo.TaxReturn
```
The below version of the script performs the exact same actions but can be safely executed over and over regardless of the state of the database.
- The column is only added if it does not exist already.
- The UPDATE will only write any data if the isOriginal column is NULL.
- The isOriginal column is altered only if the column is still nullable.
```sql
USE Demo

IF NOT EXISTS ( SELECT *
                FROM sys.tables t
                INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
                INNER JOIN sys.columns c ON c.object_id = t.object_id
                WHERE s.name = 'dbo'
                  AND t.name = 'TaxReturn'
                  AND c.name = 'isOriginal' )
BEGIN
    ALTER TABLE dbo.TaxReturn
        ADD isOriginal BIT NULL
END

UPDATE dbo.TaxReturn
SET isOriginal = CASE WHEN originalReturnId IS NOT NULL THEN 0
                      ELSE 1
                 END
WHERE isOriginal IS NULL

--Note: IF EXISTS, not IF NOT EXISTS, so the column is only
--converted to NOT NULL while it is still nullable.
IF EXISTS ( SELECT *
            FROM sys.tables t
            INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
            INNER JOIN sys.columns c ON c.object_id = t.object_id
            WHERE s.name = 'dbo'
              AND t.name = 'TaxReturn'
              AND c.name = 'isOriginal'
              AND c.is_nullable = 1 )
BEGIN
    ALTER TABLE dbo.TaxReturn
        ALTER COLUMN isOriginal BIT NOT NULL
END

--Update stored procs and views as well,
--with the recommended pattern above.

SELECT * FROM dbo.TaxReturn
```
This certainly is a lot more code than the non-repeatable example. The resilience to database state change is very powerful, though. As a final remark, I would like to point out something that is not easy to see in the script. I am creating a NULL column, populating it, then converting it to NOT NULL. One could point out that I would use less code if I made the column NOT NULL with a default constraint before updating the values to what is true. I did not do this for two reasons. First, I do not like inaccurate data to exist in my database any longer than it absolutely has to. NULL works because we understand that it loosely means unknown, and any queries looking for valid values would not include these records. Second, you would end up putting all of your code into the first IF NOT EXISTS block. This is not bad, assuming you add a transaction around all of the commands inside the IF, but it makes a block of code dependent upon a base condition rather than making each statement flexible to any variation of the database state.
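For comparison, the shorter alternative I decided against would look roughly like this (the constraint name is illustrative, not part of the original scenario):

```sql
--Alternative (not used above): add the column NOT NULL with a default.
--Every existing row is briefly marked as an original (1), which is
--inaccurate data, and all of the logic is forced into one IF block
--that should also be wrapped in a transaction.
IF NOT EXISTS ( SELECT *
                FROM sys.columns c
                WHERE c.object_id = OBJECT_ID('dbo.TaxReturn')
                  AND c.name = 'isOriginal' )
BEGIN
    ALTER TABLE dbo.TaxReturn
        ADD isOriginal BIT NOT NULL
            CONSTRAINT DF_TaxReturn_isOriginal DEFAULT(1)

    UPDATE dbo.TaxReturn
    SET isOriginal = 0
    WHERE originalReturnId IS NOT NULL
END
```

Fewer statements, but if anything fails between the ALTER and the UPDATE, the table is left claiming every return is an original until the whole block is retried.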
What is important, though, is that you think through all of the possible failure and partial-completion scenarios and account for them, even if you believe the script will only ever be executed once. I have spent many off-hours troubleshooting production deployments that went bad. The deployment window can become the Wild West in the blink of an eye, and you get to play the part of the cowboy.