Clustered Indexes in SQL Server

In the last post of this blog series, I discussed what a heap is and how it impacts query performance.

In this post, I will describe clustered indexes and how they differ from a heap. They are indexes but also tables. Like a heap, the clustered index stores the data pages and is indistinguishable from that physical table.

When should you use clustered indexes?

Clustered indexes should be your default choice for structuring your table data. Many people will demonize heaps and declare clustered indexes as the only choice. I disagree with that mentality, instead I embrace the DBA’s it depends.

Heaps can outperform clustered indexes in some scenarios but the scenarios are less common. Review the last post for an explanation of the few cases where heaps can be better.

I recommend that you create a clustered index for all tables unless you can prove that the clustered index is a burden or that a heap will outperform it. I will discuss the performance grains of the clustered index below.

Clustered index structure

A clustered index is a B+-Tree as described in the first post of this series. It is a hierarchical tree which has a root level, zero to many intermediate levels, and a leaf level. The leaf level is where all of the table data is stored with pointers to the leaf pages which comes before and after in the index key sort order.

Clustered-Index-Structure

Each root node is an index allocation map (IAM) page. These pages are allocated for each 4 GB chunk of the below three different allocation units, at least one per unit type.

  1. IN_ROW_DATA allocation unit
  2. LOB_DATA allocation unit
  3. ROW_OVERFLOW_DATA allocation unit

The intermediate nodes contains pointers to a sub-set of the leaf level pages. The number of intermediate levels and pages is determined by the size of the table. A small table might not have any intermediate levels while a large table may have many.

DBCC IND ('AdventureWorks2012','Person.Address',1)

clustered-iam-page

Above you can see three types of index pages. The first row, PageType 10, is an IAM page. As seen below, the IAM page records page allocations.

DBCC PAGE ('AdventureWorks2012',1,836,3);

IAM-page

The second row is an intermediate page, PageType 2. Only pointers to the leaf level are found here.

DBCC PAGE ('AdventureWorks2012',1,840,3);

intermediate-page

Third, we have the first of many of our data pages, PageType 1. Below is the bitmap and column data.

DBCC PAGE ('AdventureWorks2012',1,20847,3);

data-page

Sort order

The data in a clustered index is logically sorted but does not guarantee that it will be physically sorted. The physical sorting is simply a common misconception. In fact, the rows on a given page are not sorted even though all rows contained on that page will be appropriate to its place in the logical sort order. Also, the pages on disk are not guaranteed to be sorted by the logical key either.

The most likely time where you will have a clustered index that is physically sorted is immediately after an index rebuild operation. If you are trying to optimize for sequential reads, setting a fill factor to leave free space on your pages will help limit how often you have pages physically out of order at the expense of disk space.

What makes a good clustered index?

Clustered indexes impact your read and write performance of every query and every data manipulation operation. There are key characteristics of a good clustered index.

In order of importance…

1.      Keep it skinny

Your clustered indexes should be as narrow as possible.

Clustered indexes include all columns of your table, because it is the table. This means that each page you read will have the smallest number of rows possible for that table, making each logical read cost more than a narrower non-clustered index would.

The lack of a suitable non-clustered index will force SQL Server to read the clustered index. In addition, the clustered index will be used for key lookups. All non-clustered index rows include the clustered key. More on non-clustered indexes in the next post. This makes it possible for SQL Server to use a non-clustered index and then look up the corresponding rows in the clustered index to complete the query.

Performance impact

In the query below, there is a non-clustered index which covers the person’s name but SQL Server must refer back to the clustered index to pull the AdditionalContactInfo field, since it is not included in the non-clustered index. These key lookups are costly.

SELECT   [LastName] ,
[FirstName] ,
[MiddleName] ,
[AdditionalContactInfo]
FROM   [Person].[Person]
WHERE  lastname = 'Duffy' 

Key Lookup

The clustered key being stored in the non-clustered index is the main reason that heaps are chosen for certain designs. In the heap post you learned that a row ID (RID) is 8 bytes. A RID is included in every non-clustered index instead of a clustered key. The size of your clustered key depends upon the data types of the columns which make up the key. The maximum size of a clustered key is 900 bytes. This is why you should strive to keep your clustered keys very narrow. You do not want a large key bloating all of your non-clustered indexes. This will cost disk space and limit the amount of rows you can store on each non-clustered index page.

To show how many rows fit in each of our pages I have barrowed a query from Michelle Ufford’s Estimating Rows per Page post.

Select object_name(i.object_id) As 'tableName'
, i.name As 'indexName'
, i.type_desc
, Max(p.partition_number) As 'partitions'
, Sum(p.rows) As 'rows'
, Sum(au.data_pages) As 'dataPages'
, Sum(p.rows) / Sum(au.data_pages) As 'rowsPerPage'
From sys.indexes As i
Join sys.partitions As p
On i.object_id = p.object_id
And i.index_id = p.index_id
Join sys.allocation_units As au
On p.hobt_id = au.container_id
Where au.type_desc = 'IN_ROW_DATA'
AND object_name(i.object_id) = 'Person'
Group By object_name(i.object_id)
, i.name
, i.type_desc

The original clustered index on the Person table was a single INT, 4 bytes wide. It can only hold 5 rows per page and the associated indexes held 307 and 186 each.

Person-ClusterKey-1

By adding the rowguid column (16 bytes) and the ModifiedDate column (8 bytes) the clustered key becomes a total of 28 bytes, that is 7 times the size of the original. This did not affect our clustered index but it did make our non-clustered indexes 22% and 35% less effective.

Person-ClusterKey-2

Primary keys create a clustered index by default which matches their key but this is not a requirement. Often data architects will choose to use a different clustered key so that they can keep it small. One common strategy is to create an integer identity column and use it as the clustered key. There are additional benefits to this method, aside from being very narrow, keep reading.

2.      Static

Your clustered keys should not be updated.

Your non-clustered indexes reference table data by storing the clustered key. This causes a clustered key update to be very expensive because all non-clustered indexes have to be updated to reflect the change to the modified clustered index row. It is possible to create up to 999 non-clustered indexes on a table and it is common to see 5-10 on OLTP database tables or several dozen on OLAP database tables.

Keeping your key static removes the problem completely. There are two ways to achieve a static key. The first is to understand how your system works with the table and simply select a set of columns which make sense and you have no intention of ever including in an update statement. The second is to create an identity column. A column which stores a small data type satisfies the narrow recommendation from above and, if it has no direct link to reality there will be no reason to update it.

Performance impact

To demonstrate the write impact of updating a clustered index key, I have a table with five indexes, one of which is the clustered index. The two indexes in yellow include the SalesOrderId field. The two indexes in orange include the CustomerId field. I will be updating these two fields to represent the difference between two nonclustered indexes being updated and what would seem to be only two other indexes being updated but one is the clustered index.

Index-setup-static

First I backup the log to truncate it, preventing old transactions from slipping into our numbers. In addition, I have named my transactions to make parsing the log file easier. Updating a single row’s clustered key results in 11 log records written.

--/* Backup log to verify that it has been logically cleared
BACKUP DATABASE AdventureWorks2014
TO DISK = 'C:\Backups\AdventureWorks2014_Log.trn'
WITH INIT;
--*/

BEGIN TRANSACTION Take1

UPDATE t
SET SalesOrderId = 6
FROM dbo.SalesOrderHeader t
WHERE SalesOrderId = 7

COMMIT

--Show log records to indicate writes
SELECT [Transaction ID],
SUM([Log Record Length]) TotalLogSize,
COUNT(*) [TotalLogRecords]
FROM sys.fn_dblog(NULL,NULL)
WHERE [Transaction ID] = '0000:0000bfe4'
GROUP BY [Transaction ID]

Clustered-update

Next I will update a single record’s CustomerId field. Two indexes will be updated but only 7 log records are written. This is because the clustered key update forced updates of all of the non-clustered indexes while the CustomerId update only forced writes for two of the five indexes.

--/* Backup log to verify that it has been logically cleared
BACKUP DATABASE AdventureWorks2014
TO DISK = 'C:\Backups\AdventureWorks2014_Log.trn'
WITH INIT;
--*/

BEGIN TRANSACTION Take2

UPDATE t
SET CustomerId = 7
FROM dbo.SalesOrderHeader t
WHERE SalesOrderID = 6

COMMIT

--Show log records to indicate writes
SELECT [Transaction ID],
SUM([Log Record Length]) TotalLogSize,
COUNT(*) [TotalLogRecords]
FROM sys.fn_dblog(NULL,NULL)
WHERE [Transaction ID] = '0000:0000bff6'
GROUP BY [Transaction ID]

Nonclustered-update

Should you care about four additional writes during your update? No, not if it is only four. However, this performance hit scales up rather quickly. Below I updated the entire table with the same queries as above, just no WHERE clause. The SalesOrderId update is 44% slower than the CustomerId update which makes a large difference at only 35,000 rows.

large-scale-index-update

3.      Sequential

Your clustered keys should be inserted in the order that your clustered key is sorted.

One term I have heard when referring to clustered indexes is that they should be ever increasing. I phrased my recommendation specifically because this depends upon the sort order of the columns in your key. If you ever choose to use a descending key instead of the default ascending key then your inserts should have a decreasing value.

The sequential nature of your inserts is important because of page splits. If you used a GUID as your clustered key, please do not do that, the random nature of something like the NEWID() function would cause a lot of page splits. This is assuming that you are using the default fill factor of 100. You can limit the page splits by using a lower fill factor such as 80 but your table would then be 20% larger. That can end up being a huge waste of space. It will further reduce the number of rows that can fit on each page thus increasing the expense of each read operation.

Not only do page splits allocate a new page, but if that page is not at the end of the index it will need to update the intermediate and/or adjacent pages as well.

Performance impact

The below tables will be used to demonstrate the page splits between sequential key inserts and non-sequential key inserts.

DROP TABLE dbo.PageSplits_Sequential
CREATE TABLE dbo.PageSplits_Sequential
(
id INT IDENTITY(1,1) NOT NULL,
txt VARCHAR(30) NOT NULL,
txt1 VARCHAR(30) NOT NULL,
txt2 VARCHAR(30) NOT NULL,
txt3 VARCHAR(30) NOT NULL,
txt4 VARCHAR(30) NOT NULL
)

INSERT INTO dbo.PageSplits_Sequential
(
txt ,txt1 ,txt2 ,txt3 ,txt4
)
SELECT TOP 10000
REPLICATE('a',20) ,REPLICATE('a',20)
,REPLICATE('a',20) ,REPLICATE('a',20)
,REPLICATE('a',20)
FROM sys.columns c
CROSS JOIN sys.columns c1

ALTER TABLE dbo.PageSplits_Sequential
ADD CONSTRAINT PK_PageSplits_Sequential
PRIMARY KEY CLUSTERED (id);

CREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt
ON dbo.PageSplits_Sequential (txt)
CREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt1
ON dbo.PageSplits_Sequential (txt1)
CREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt2
ON dbo.PageSplits_Sequential (txt2)
CREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt3
ON dbo.PageSplits_Sequential (txt3)
CREATE NONCLUSTERED INDEX IX_PageSplits_Sequential_txt4
ON dbo.PageSplits_Sequential (txt4)

DROP TABLE dbo.PageSplits_NonSequential
CREATE TABLE dbo.PageSplits_NonSequential
(
id INT NOT NULL,
txt VARCHAR(30) NOT NULL,
txt1 VARCHAR(30) NOT NULL,
txt2 VARCHAR(30) NOT NULL,
txt3 VARCHAR(30) NOT NULL,
txt4 VARCHAR(30) NOT NULL
)

INSERT INTO dbo.PageSplits_NonSequential
SELECT id * 2
,txt,txt1,txt2,txt3,txt4
FROM dbo.PageSplits_Sequential

ALTER TABLE dbo.PageSplits_NonSequential
ADD CONSTRAINT PK_PageSplits_NonSequential
PRIMARY KEY CLUSTERED (id);

CREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt
ON dbo.PageSplits_NonSequential (txt)
CREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt1
ON dbo.PageSplits_NonSequential (txt1)
CREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt2
ON dbo.PageSplits_NonSequential (txt2)
CREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt3
ON dbo.PageSplits_NonSequential (txt3)
CREATE NONCLUSTERED INDEX IX_PageSplits_NonSequential_txt4
ON dbo.PageSplits_NonSequential (txt4)

Since the tables and indexes have just been created, they are fresh and show no page splits in sys.dm_db_index_operational_stats. Below query taken from this article.

SELECT
IOS.INDEX_ID,
O.NAME AS OBJECT_NAME,
I.NAME AS INDEX_NAME,
IOS.LEAF_ALLOCATION_COUNT AS PAGE_SPLIT_FOR_INDEX,
IOS.NONLEAF_ALLOCATION_COUNT PAGE_ALLOCATION_CAUSED_BY_PAGESPLIT
FROM SYS.DM_DB_INDEX_OPERATIONAL_STATS(DB_ID(N'DB_NAME'),NULL,NULL,NULL) IOS
INNER JOIN SYS.INDEXES I ON IOS.INDEX_ID=I.INDEX_ID
AND IOS.OBJECT_ID = I.OBJECT_ID
INNER JOIN SYS.OBJECTS O ON IOS.OBJECT_ID=O.OBJECT_ID
WHERE O.TYPE_DESC='USER_TABLE'
AND O.Name LIKE 'PageSplits%'
ORDER BY o.name DESC, i.name ASC 

PageSplits-Fresh

By adding equal number of rows to the two tables we can see that the non-sequential inserts produce more page splits. It is important to note that not all page splits are bad. In the case of the sequential inserts, a page split is registered for each time a new page is allocated at the end of the index. These are considered to be good page splits. On the non-sequential inserts there will be good page splits and bad page splits. Bad page splits are when the index splits in the middle of the index. These are bad because they fragment the index and might require splitting intermediate level pages which increases the number of writes even further. The intermediate level splits are indicated in the PAGE_ALLOCATION_CAUSED_BY_PAGESPLIT. This table is rather small so the count will be low. These splits can become rather large, however.

INSERT INTO dbo.PageSplits_Sequential
(
txt,txt1,txt2,txt3,txt4
)
SELECT TOP 20000
REPLICATE('a',20),REPLICATE('a',20)
,REPLICATE('a',20),REPLICATE('a',20)
,REPLICATE('a',20)
FROM sys.columns c
CROSS JOIN sys.columns c1

INSERT INTO dbo.PageSplits_NonSequential
SELECT id - 1
,txt,txt1,txt2,txt3,txt4
FROM dbo.PageSplits_NonSequential

INSERT INTO dbo.PageSplits_NonSequential
SELECT TOP 10000 id + 50000
,txt,txt1,txt2,txt3,txt4
FROM dbo.PageSplits_NonSequential

PageSplits-AfterInserts

4.      Uniqueness

When possible, your clustered keys should be unique.

SQL Server needs to uniquely identify each row within the clustered index. Uniqueness is not a requirement for a clustered index, however. Instead, SQL Server will append an uniqueifier to each duplicate row. The uniqueifier is 4 bytes. The 4 bytes alone can cause a lot of bloat if you have a lot of duplicates, especially when accounting for the bloat to all of the non-clustered indexes. However, there is an additional penalty if the table does not already contain any variable length columns. In that case, there is 4 bytes of additional overhead to manage the variable length uniqueifier, resulting in a total of 8 bytes of bloat per duplicate row.

To put this in perspective, if you were to use a single non-unique INT column (4 bytes in size) the potential uniqueifier cost could triple the size of your clustered key.

Performance impact

I created two versions of the Person.Person table, dbo.Person_Unique and dbo.Person_WithDups. They have the exact same amount of rows in them but the dbo.Person_WithDups includes around 9000 duplicate BusinessEntityIDs.

Select object_name(i.object_id) As 'tableName'
, i.name As 'indexName'
, i.type_desc
, Max(p.partition_number) As 'partitions'
, Sum(p.rows) As 'rows'
, Sum(au.data_pages) As 'dataPages'
, Sum(p.rows) / Sum(au.data_pages) As 'rowsPerPage'
From sys.indexes As i
Join sys.partitions As p
On i.object_id = p.object_id
And i.index_id = p.index_id
Join sys.allocation_units As au
On p.hobt_id = au.container_id
Where au.type_desc = 'IN_ROW_DATA'
AND (object_name(i.object_id) = 'Person_WithDups'
OR object_name(i.object_id) = 'Person_Unique')
Group By object_name(i.object_id)
, i.name
, i.type_desc 

Person-Uniqueness

With identical row counts, the Person_Unique table has 40% fewer data pages. Instead of 5 rows per page in the Person_Unique table, the Person_WithDups table can only hold 3 rows per page. This is because of the additional overhead induced by the clustered index uniqueifier.

Next time

The clustered index is the recommended default choice for structuring your table data. You learned the best practices for designing your clustered keys and how they impact performance of your system and resulting non-clustered indexes. In the next part of this series I will cover SQL Server’s easiest go-faster button, non-clustered indexes.

References

[1] About SQL Server Clustered Indices (Solarwinds)

[2] Effective Clustered Indexes (Simple Talk)

[3] Expert Performance Indexing (Book)

[4] Clustered Index Do Not Guarantee Physically Ordering or Sorting of Rows (SQL with Manoj)

This article has 5 comments

  1. SQL Server Performance Tune
    Thursday 30 June 2016, 8:35 am

    “This is assuming that you are using the default fill factor of 100. You can limit the page splits by using a lower fill factor such as 80 but your table would then be 20% larger.”

    This kind of mistake is made very often. If you stretch the 80% to become the new 100%, you easily see that the 20% left before extends to 25%. That is, you fill your pages 80/100 of what they can contain and you are going to need 25 pages for the 20/100 data that “remains” with the same 80/100 ratio.

    If something is twice as big as something else, could you say this something else is double as small as something?
    If something is moving twice as fast as something else, is that something else double as slow as something?
    I find it hard to tell, because speed is something else than slowness.

    If something takes up twice the time of something else, if that something else double as short as something?
    No, that something else just takes up half the time of something.

    • I am not certain that I am interpreting your comment correctly. If you believe that I was recommending a fill factor of 80 as a default best practice, that was not my intent. I was just giving a simple example of how you could reduce page splits with the trade off of increased disk space and reduced row counts per page.

      • SQL Server Performance Tune
        Monday 4 July 2016, 4:45 am

        I sure do understand you do not recommend using that fillfactor. It’s just your remark that by applying that fillfactor of 80 the table would grow 20% larger. That is not correct. It would be 25% larger. If you had 20 records on 4 pages (5 on each, fillfactor 100) and apply a fillfactor of 80, you will get 4 records on each page. To store those 20 records, you now need 5 pages. That is an increase of 25% and not simply by the same number as the decrease of filling is. It’s just math, nothing else.

  2. Hi Derik, related to the performance tuning, I’ve my database running on SQL 2012 Enterprise where couple tables have high read/write access (5000+/10 sec) causing bottlenecks. I’m not a seasoned DBA (actually there is no DBA), just a developer but this thing has come to my plate to look into. Do you suggest upgrading 2012 to 2016 for the sake of in-memory tables and other features is a reason worth looking into? Or this is something that can be tuned to gain better performance.

    • Unfortunately, there is no where near enough information in your comment for me to give you an answer. It sounds like what you need is a consulting engagement. In-memory table is nice but is not simply a go-faster button. There are large effects on design when you use them.

Leave a Reply

%d bloggers like this: