It seems that, of the DBAs who use FILESTREAM, it is common to ask, “why do I have more file system objects then I have rows with BLOB columns?” This question inevitably bring us to the topic of the garbage collection process. Paul Randal, as always, did a great job of explaining the garbage collection in his post here. The purpose of this post, however, is a bit more specialized. In Paul’s post he explains everything but, when you use Availability Groups, it is not clear how the logic relates to the differences induced by database mirroring. We are going to analyze what happens when you add FILESTREAM to a database in an availability group and how this can cause file system bloat and prevent the garbage collection from deleting files that you might think should be getting deleted.
To summarize the garbage collecting process; when a CHECKPOINT occurs it will kick off and read a tombstone table to find what files to delete. The ones deleted will only be ones that relate to log sequence numbers (LSN) available to be truncated. Specifically, the VLF that they were written to has to have been marked inactive. You can also kick off the garbage collecting process with sp_filestream_force_garbage_collection but it still will only delete files that relate to inactive VLFs.
Typically the condition of VLFs being inactive would be satisfied by log backups and CHECKPOINTs. When using availability groups, however, you can take as many log backups as you’d like and kick off numerous CHECKPOINTs but still be left with large quantities of files stuck in the tombstone table. This is because of the lazy log truncation but I’m getting ahead of myself, we’ll get to that.
Establish a database in an availability group
Disclaimer: there is an assumption that the reader has a basic understanding of availability groups. This is not intended to be an instructional post on how to setup and configure them. Rather, the script is included as a convenience to anyone who is running this on a development box while following along.
USE [master]; --Create database CREATE DATABASE filestream_garbage_collection_test; GO --Backup to remove it from PSEUDO-SIMPLE recovery. BACKUP DATABASE filestream_garbage_collection_test TO DISK = '\\shareserver\backups\filestream_garbage_collection_test_FULL.bak' WITH COMPRESSION, STATS=10; GO --Configure AlwaysOn Availability Group CREATE ENDPOINT AG_FS_Endpoint AS TCP (LISTENER_PORT = 5022) FOR DATA_MIRRORING (ROLE = ALL, ENCRYPTION = REQUIRED ALGORITHM AES) GO ALTER ENDPOINT AG_FS_Endpoint STATE = STARTED GO CREATE LOGIN [DOMAIN\Acct] FROM WINDOWS GRANT CONNECT ON ENDPOINT::AG_FS_Endpoint TO [DOMAIN\Acct] GO CREATE AVAILABILITY GROUP [AG_FILESTREAM] WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY) FOR REPLICA ON N'Server' WITH (ENDPOINT_URL = N'TCP://Server.libtax.com:5022' , FAILOVER_MODE = MANUAL, AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT , SESSION_TIMEOUT = 10, BACKUP_PRIORITY = 50 , PRIMARY_ROLE(ALLOW_CONNECTIONS = ALL) , SECONDARY_ROLE(ALLOW_CONNECTIONS = ALL)); GO --Wait 5 secs to give the AG time to come online WAITFOR DELAY '00:00:05' GO --Add db to AG ALTER AVAILABILITY GROUP [AG_FILESTREAM] ADD DATABASE [filestream_garbage_collection_test]; GO
Setup FILESTREAM objects
--Create FILESTREAM FILEGROUP ALTER DATABASE filestream_garbage_collection_test ADD FILEGROUP filestream_filegroup CONTAINS FILESTREAM GO ALTER DATABASE filestream_garbage_collection_test ADD FILE ( NAME= 'filestream_filegroup', FILENAME = 'M:\MSSQL11.SITCLDB017D\MSSQL\DATA\filestream.ndf' ) TO FILEGROUP filestream_filegroup GO --Create FILESTEAM table USE filestream_garbage_collection_test GO IF EXISTS (SELECT TOP 1 1 FROM sys.tables WHERE name = 'FS_Table') DROP TABLE FS_Table; GO CREATE TABLE FS_Table (ID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE, BLOB VARBINARY(MAX) FILESTREAM NULL); GO
At this point we have our test case prepared for data. Before we move forward we need to take a look at DBCC LOGINFO. DBCC LOGINFO will give us a list of all of the VLFs in the transaction log and the status column indicates whether it’s active (status 2) or inactive (status 0). Notice that we have one active VLF and two inactive.
Now we will fill up these VLFs with transaction data.
--INSERT the blobs SET NOCOUNT ON INSERT INTO FS_Table (ID, BLOB) VALUES (NEWID(),CAST('unnecessary test data 1' AS VARBINARY(MAX))) ; GO 40 INSERT INTO FS_Table (ID, BLOB) VALUES (NEWID(),CAST('unnecessary test data 2' AS VARBINARY(MAX))) ; GO 40 INSERT INTO FS_Table (ID, BLOB) VALUES (NEWID(),CAST('unnecessary test data 3' AS VARBINARY(MAX))) ; GO 40 --Verify data USE filestream_garbage_collection_test SELECT ID, CAST(BLOB AS VARCHAR(100)) AS [String_Data] , BLOB, BLOB.PathName() AS [Blob_PathName] FROM FS_Table
We now have 120 records in our table which, if you will check your file system now, will equate to 120 files. Next we need to produce files for the tombstone table. We will UPDATE BLOBs because partial updates of FILESTREAM columns is not supported. SQL Server will create a new file, related to the LSN, and mark the old file for delete. We will also delete BLOB records which will mark those files for deletion.
--Update some records to create versions for garbage collection UPDATE FS_Table SET BLOB = CAST('unnecessary test data - updated' AS VARBINARY(MAX)) WHERE BLOB = CAST('unnecessary test data 1' AS VARBINARY(MAX)); GO --Delete some records to create versions for garbage collection DELETE FROM FS_Table WHERE BLOB = CAST('unnecessary test data 2' AS VARBINARY(MAX)); GO --Verify data USE filestream_garbage_collection_test SELECT ID, CAST(BLOB AS VARCHAR(100)) AS [String_Data] , BLOB, BLOB.PathName() AS [Blob_PathName] FROM FS_Table
You can now see that we are left with 80 records in the table. However, there are now 160 files in the file system. Now we want those files to be deleted so we know to see why our VLFs aren’t being cleared.
DBCC LOGINFO SELECT log_reuse_wait_desc, name FROM sys.databases WHERE name = DB_NAME()
Our first three VLFs are now active and we’ve gained one more VLF via auto growth. We also see that we need a log backup to clear the VLFs. So, let’s do that.
--Backup log in attempt to mark VLFs inactive. BACKUP LOG filestream_garbage_collection_test TO DISK = '\\share\backups\filestream_garbage_collection_test_LOG_1.trn' WITH COMPRESSION; DBCC LOGINFO SELECT log_reuse_wait_desc, name FROM sys.databases WHERE name = DB_NAME()
Our log_reuse_wait_desc is NOTHING but our VLFs have not converted their status from 2 (active) to 0 (inactive). We know we need the VLFs to be inactive but maybe we can force things with another log backup followed by a forced call to the FILSTREAM garbage collector.
--Backup log in attempt to mark VLFs inactive. BACKUP LOG filestream_garbage_collection_test TO DISK = '\\share\backups\filestream_garbage_collection_test_LOG_2.trn' WITH COMPRESSION; --Force FS garbage collection CHECKPOINT EXEC sp_filestream_force_garbage_collection @dbname = N'filestream_garbage_collection_test';
Even after all of that, when checking the file system, you will still see 160 files instead of the intended 80. In addition, you’ll notice that the sp_filestream_force_garbage_collection output showed counts in the num_marked_for_collection_items and/or num_unprocessed_items. This is a representation of items in the tombstone table that have not been deleted. To wrap up the demo we will now remove the database from the availability group and induce another call to the garbage collection.
--Drop AG USE [master] GO DROP AVAILABILITY GROUP [AG_FILESTREAM]; GO DBCC LOGINFO SELECT log_reuse_wait_desc, name FROM sys.databases WHERE name = DB_NAME() --Backup log in attempt to mark VLFs inactive. BACKUP LOG filestream_garbage_collection_test TO DISK = '\\share\backups\filestream_garbage_collection_test_LOG_3.trn' WITH COMPRESSION; --Force FS garbage collection CHECKPOINT EXEC sp_filestream_force_garbage_collection @dbname = N'filestream_garbage_collection_test';
Now that we’ve removed the availability group from the equation we are back to the expected one VLF marked as active and, after checking the file system, we have the expected 80 files to relate to our 80 existing BLOB records.
When you have a database in an availability group (AG) the lazy log truncator takes effect. That lazy log truncator will only mark a VLF inactive (ready for truncation/clearing) when that VLF is needed for transactional processing. This means that once you’ve used all of your VLFs, they will all stay in the active status for as long as you are in an AG. As the transation log round-robins the VLFs it will clear one VLF at a time as needed. This is a feature intended to aid in synchronizing new replicas added to the AG. The more data in the log, the wider your window for synchronization is.
Due to this behavior and the logic in the garbage collection for FILESTREAM, your tombstone files will not be deleted until the VLF that they were written to is cleared by the round-robin affect of log writing. This can be a big problem if you do what one development team I know did, which was to create a Storage database and put all their BLOBs there because they wanted it segregated from their primary transactional database. Since FILESTREAM only logs the non-BLOB data this made their log filled very slowly and large numbers of transactions would have to go by before they would move on to a different VLF and, in affect, clear a single VLF. This left them in a situation where their file system was three times larger than the number of BLOBs in their database which was wasting hundreds of GBs of space.