I recently ran into SQL Server’s page compression being applied to a heap, and I figured I’d cover why that won’t work how some folks expect.
SQL Server’s page compression is really neat when applied on tables and indexes that are good candidates for it.
Even more so in cloud environments where storage costs can quickly add up.
Kendra Little has this great post where she explains why she opts to default to compression for Azure SQL MI.
But some folks use page compression on heaps, which, depending on a few factors, might not end up fully taking advantage of it.
Setting up the demo
First, I need to create a database and 3 tables for this demo.
```sql
IF EXISTS (SELECT 1 FROM sys.[databases] WHERE name = N'oh_my_heap')
    DROP DATABASE [oh_my_heap];
GO

CREATE DATABASE [oh_my_heap];
GO

USE [oh_my_heap];
GO

IF OBJECT_ID(N'uncompressed_heap', N'U') IS NOT NULL
BEGIN
    DROP TABLE [uncompressed_heap];
END;
GO

CREATE TABLE [uncompressed_heap]
(
    [id] INT NOT NULL IDENTITY(1, 1),
    [short_char] CHAR(45),
    [long_char] CHAR(300),
    [medium_nvarchar] NVARCHAR(180),
    [long_nvarchar] NVARCHAR(300),
    [time_stamp] DATETIME DEFAULT GETDATE()
);
GO

IF OBJECT_ID(N'row_compressed_heap', N'U') IS NOT NULL
BEGIN
    DROP TABLE [row_compressed_heap];
END;
GO

CREATE TABLE [row_compressed_heap]
(
    [id] INT NOT NULL IDENTITY(1, 1),
    [short_char] CHAR(45),
    [long_char] CHAR(300),
    [medium_nvarchar] NVARCHAR(180),
    [long_nvarchar] NVARCHAR(300),
    [time_stamp] DATETIME DEFAULT GETDATE()
)
WITH (DATA_COMPRESSION = ROW);
GO

IF OBJECT_ID(N'page_compressed_heap', N'U') IS NOT NULL
BEGIN
    DROP TABLE [page_compressed_heap];
END;
GO

CREATE TABLE [page_compressed_heap]
(
    [id] INT NOT NULL IDENTITY(1, 1),
    [short_char] CHAR(45),
    [long_char] CHAR(300),
    [medium_nvarchar] NVARCHAR(180),
    [long_nvarchar] NVARCHAR(300),
    [time_stamp] DATETIME DEFAULT GETDATE()
)
WITH (DATA_COMPRESSION = PAGE);
GO
```
Notice how each heap table has a different compression level.
Now, to also populate them with data.
```sql
SET NOCOUNT ON;

DECLARE
    @i INT = 0,
    @rows INT = 660000,
    @string1 VARCHAR(45) = REPLICATE('abc', 15),
    @string2 VARCHAR(300) = REPLICATE('def', 100),
    @string3 NVARCHAR(180) = REPLICATE(N'ghi', 60),
    @string4 NVARCHAR(300) = REPLICATE(N'jkl', 100);

BEGIN TRAN;

WHILE @i < @rows
BEGIN
    INSERT INTO [uncompressed_heap]
        ([short_char], [long_char], [medium_nvarchar], [long_nvarchar])
    VALUES (@string1, @string2, @string3, @string4);

    INSERT INTO [row_compressed_heap]
        ([short_char], [long_char], [medium_nvarchar], [long_nvarchar])
    VALUES (@string1, @string2, @string3, @string4);

    INSERT INTO [page_compressed_heap]
        ([short_char], [long_char], [medium_nvarchar], [long_nvarchar])
    VALUES (@string1, @string2, @string3, @string4);

    SET @i += 1;
END;

COMMIT TRAN;
```
So at this point I have 3 heaps, one uncompressed, one using row compression, and one using page compression. And each heap is populated with 660K rows.
Table sizes
I’m using my table sizes script, with a small modification to also show the 8KB page count.

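For reference, this is not the exact script linked above, but a minimal sketch of an equivalent size check using `sys.dm_db_partition_stats` (table names are the ones from this demo):

```sql
-- Hedged sketch: compare size and 8KB page count for the three demo heaps
-- via sys.dm_db_partition_stats (not the exact script referenced in the post).
SELECT
    OBJECT_NAME([ps].[object_id]) AS [table_name],
    SUM([ps].[used_page_count])   AS [used_8kb_pages],
    SUM([ps].[used_page_count]) * 8 / 1024 AS [used_mb]
FROM sys.dm_db_partition_stats AS [ps]
WHERE [ps].[object_id] IN
(
    OBJECT_ID(N'uncompressed_heap'),
    OBJECT_ID(N'row_compressed_heap'),
    OBJECT_ID(N'page_compressed_heap')
)
GROUP BY [ps].[object_id];
```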
Notice how both compressed tables have the same size (630MB) and page count (82508)?
Pretty weird considering page compression has an even better compression ratio than row compression due to the additional prefix and dictionary compression it adds on top of row compression.
But why?
As with pretty much every technical thing, it’s caused by folks not reading the documentation.
And I get it, I also get overly excited about a shiny new thing and opt to mess with it before properly reading the user manual.
In this case, the documentation mentions some very important points for page compression on heaps.
New rows added to a page compressed heap will only receive page compression in the following cases:
- They are added via bulk import with bulk optimizations enabled.
- They're inserted with `INSERT INTO … WITH (TABLOCK)` syntax and the table doesn't have a nonclustered index. Some folks tend to miss that very important last part.
- A table rebuild is executed via `ALTER TABLE ... REBUILD`.
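To illustrate the second bullet, here's a minimal sketch of an insert that qualifies for page compression on the way in (assuming the heap has no nonclustered indexes; the source table is just the demo's uncompressed heap):

```sql
-- Hedged sketch: TABLOCK on the target heap allows the new pages to be
-- page compressed as they're allocated, but only while the heap has
-- no nonclustered indexes.
INSERT INTO [page_compressed_heap] WITH (TABLOCK)
    ([short_char], [long_char], [medium_nvarchar], [long_nvarchar])
SELECT [short_char], [long_char], [medium_nvarchar], [long_nvarchar]
FROM [uncompressed_heap];
```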
At the time of writing this post, the docs say that the PAGE data compression option needs to be specified, but, as per my tests so far, it isn’t necessary if the heap already had page compression applied previously.
New pages allocated to a heap as a result of normal DML operations (e.g. single-row inserts like in this example) will only be page compressed after a table rebuild; until then they'll be stuck with row compression.
Hence why both row_compressed_heap and page_compressed_heap tables have the same size and page count, even if the latter is created with page compression.
The extra work
To fix this, I have to rebuild the page_compressed_heap table.
```sql
ALTER TABLE [page_compressed_heap] REBUILD WITH (MAXDOP = 0, ONLINE = OFF);
GO
```
And then I use the same table size script to check what changed.

After the rebuild, the page_compressed_heap table only takes up 9MB and consists of 1035 8KB pages.
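If you want to confirm how many of a heap's pages are actually page compressed, rather than just what the table's compression setting claims, a sketch like this against `sys.dm_db_index_physical_stats` should do it (the `compressed_page_count` column is only populated in `DETAILED` mode):

```sql
-- Hedged sketch: count the heap's pages that are actually page compressed.
-- index_id 0 targets the heap itself; DETAILED mode is required for
-- compressed_page_count to be populated.
SELECT
    [index_type_desc],
    [page_count],
    [compressed_page_count]
FROM sys.dm_db_index_physical_stats
(
    DB_ID(N'oh_my_heap'),
    OBJECT_ID(N'page_compressed_heap'),
    0,
    NULL,
    'DETAILED'
);
```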
But this only fixes this batch of data. Any rows added afterwards will again use row compression until a table rebuild.
Side-note: the extreme compression ratio here is due to me not being very creative with the inserted data and not having any variation between each row. The main point still applies.
What about clustered indexes?
Clustered indexes automatically apply page compression to newly added data regardless of how the data is added, so there are no special considerations or requirements.
Conclusion
If you really want to apply SQL Server's page compression to a heap, you need to be mindful of how the data is loaded into it and take into account the fact that a table rebuild might be required.
Also, if you really want to page compress a heap that’s active (SELECT/UPDATE/DELETE), consider a clustered index instead.
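For completeness, converting the demo heap into a page compressed clustered index is a one-liner (index name and key column are my choice here, not prescriptive):

```sql
-- Hedged sketch: turn the heap into a page compressed clustered index
-- so newly inserted rows get page compression regardless of how they arrive.
CREATE CLUSTERED INDEX [cx_page_compressed_heap]
    ON [page_compressed_heap] ([id])
    WITH (DATA_COMPRESSION = PAGE);
GO
```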
Keep in mind that heaps have some other hidden drawbacks which I’ve covered in-depth in this post.