🚀 SQL Server Data Compression: The Complete DBA Guide
Data compression is one of the most powerful yet underutilized features in SQL Server. When implemented correctly, it can reduce storage costs by 50-90%, improve I/O performance, and even accelerate query execution. This comprehensive guide covers everything you need to know about implementing data compression in production environments.
📊 Understanding SQL Server Compression Types
SQL Server offers several compression options, each designed for specific use cases and workload patterns.
Row Compression
Row compression is the lighter option that stores fixed-length data types using variable-length storage formats. It's the foundation upon which page compression builds.
-- Enable row compression on an existing table
ALTER TABLE Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = ROW);
-- Enable row compression on a specific partition
ALTER TABLE Sales.OrderHistory
REBUILD PARTITION = 5 WITH (DATA_COMPRESSION = ROW);
-- Create a new table with row compression
CREATE TABLE Sales.CompressedOrders
(
OrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME2 NOT NULL,
TotalAmount DECIMAL(18,2) NOT NULL,
Status VARCHAR(20) NOT NULL,
CONSTRAINT PK_CompressedOrders PRIMARY KEY CLUSTERED (OrderID)
)
WITH (DATA_COMPRESSION = ROW);
Page Compression
Page compression includes row compression plus additional algorithms: prefix compression and dictionary compression. This typically achieves 50-80% compression ratios.
-- Enable page compression on an existing table
ALTER TABLE Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = PAGE);
-- Enable page compression on an index
ALTER INDEX IX_OrderDetails_CustomerID
ON Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = PAGE);
-- Create index with page compression
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON Sales.Orders (OrderDate)
INCLUDE (CustomerID, TotalAmount)
WITH (DATA_COMPRESSION = PAGE);
Columnstore Compression
Columnstore indexes use a highly efficient compression algorithm designed for analytics workloads. They can achieve 10x or better compression compared to row-store tables.
-- Create clustered columnstore index
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON DW.FactSales
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);
-- Create nonclustered columnstore index for hybrid workloads
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Analytics
ON Sales.Orders (OrderDate, CustomerID, TotalAmount, Quantity)
WITH (DATA_COMPRESSION = COLUMNSTORE);
💡 Estimating Compression Savings
Before implementing compression, always estimate the potential savings using the built-in stored procedure.
Using sp_estimate_data_compression_savings
-- Estimate row compression savings
EXEC sp_estimate_data_compression_savings
@schema_name = 'Sales',
@object_name = 'OrderDetails',
@index_id = NULL,
@partition_number = NULL,
@data_compression = 'ROW';
-- Estimate page compression savings
EXEC sp_estimate_data_compression_savings
@schema_name = 'Sales',
@object_name = 'OrderDetails',
@index_id = NULL,
@partition_number = NULL,
@data_compression = 'PAGE';
Comprehensive Compression Analysis Script
-- Analyze all tables for compression opportunities
SET NOCOUNT ON;
CREATE TABLE #CompressionAnalysis
(
SchemaName SYSNAME,
TableName SYSNAME,
IndexName SYSNAME NULL,
IndexID INT,
PartitionNumber INT,
CurrentSizeKB BIGINT,
RowCompressedSizeKB BIGINT,
PageCompressedSizeKB BIGINT,
RowSavingsPercent DECIMAL(5,2),
PageSavingsPercent DECIMAL(5,2)
);
DECLARE @SchemaName SYSNAME, @TableName SYSNAME;
DECLARE @SQL NVARCHAR(MAX);
DECLARE table_cursor CURSOR FOR
SELECT
s.name AS SchemaName,
t.name AS TableName
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.is_ms_shipped = 0
AND t.type = 'U'
ORDER BY s.name, t.name;
OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @SchemaName, @TableName;
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
-- Get current size
DECLARE @CurrentSize BIGINT;
SELECT @CurrentSize = SUM(ps.reserved_page_count) * 8
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.tables t ON ps.object_id = t.object_id
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE s.name = @SchemaName AND t.name = @TableName;
-- Estimate row compression
CREATE TABLE #RowEst
(
object_name SYSNAME,
schema_name SYSNAME,
index_id INT,
partition_number INT,
size_with_current_compression_setting BIGINT,
size_with_requested_compression_setting BIGINT,
sample_size_with_current_compression_setting BIGINT,
sample_size_with_requested_compression_setting BIGINT
);
INSERT INTO #RowEst
EXEC sp_estimate_data_compression_savings
@schema_name = @SchemaName,
@object_name = @TableName,
@index_id = NULL,
@partition_number = NULL,
@data_compression = 'ROW';
-- Estimate page compression
CREATE TABLE #PageEst
(
object_name SYSNAME,
schema_name SYSNAME,
index_id INT,
partition_number INT,
size_with_current_compression_setting BIGINT,
size_with_requested_compression_setting BIGINT,
sample_size_with_current_compression_setting BIGINT,
sample_size_with_requested_compression_setting BIGINT
);
INSERT INTO #PageEst
EXEC sp_estimate_data_compression_savings
@schema_name = @SchemaName,
@object_name = @TableName,
@index_id = NULL,
@partition_number = NULL,
@data_compression = 'PAGE';
-- Insert results
INSERT INTO #CompressionAnalysis
SELECT
@SchemaName,
@TableName,
i.name,
r.index_id,
r.partition_number,
r.size_with_current_compression_setting,
r.size_with_requested_compression_setting,
p.size_with_requested_compression_setting,
CASE WHEN r.size_with_current_compression_setting > 0
THEN (1.0 - (CAST(r.size_with_requested_compression_setting AS DECIMAL) / r.size_with_current_compression_setting)) * 100
ELSE 0 END,
CASE WHEN p.size_with_current_compression_setting > 0
THEN (1.0 - (CAST(p.size_with_requested_compression_setting AS DECIMAL) / p.size_with_current_compression_setting)) * 100
ELSE 0 END
FROM #RowEst r
INNER JOIN #PageEst p ON r.index_id = p.index_id AND r.partition_number = p.partition_number
LEFT JOIN sys.indexes i ON i.object_id = OBJECT_ID(@SchemaName + '.' + @TableName) AND i.index_id = r.index_id;
DROP TABLE #RowEst;
DROP TABLE #PageEst;
END TRY
BEGIN CATCH
-- Skip tables that can't be analyzed
PRINT 'Error analyzing: ' + @SchemaName + '.' + @TableName;
END CATCH
FETCH NEXT FROM table_cursor INTO @SchemaName, @TableName;
END
CLOSE table_cursor;
DEALLOCATE table_cursor;
-- Display results sorted by potential savings
SELECT
SchemaName,
TableName,
IndexName,
CurrentSizeKB / 1024 AS CurrentSizeMB,
RowCompressedSizeKB / 1024 AS RowCompressedMB,
PageCompressedSizeKB / 1024 AS PageCompressedMB,
RowSavingsPercent,
PageSavingsPercent,
(CurrentSizeKB - PageCompressedSizeKB) / 1024 AS PotentialSavingsMB
FROM #CompressionAnalysis
WHERE CurrentSizeKB > 10240 -- Only show tables > 10MB
ORDER BY (CurrentSizeKB - PageCompressedSizeKB) DESC;
DROP TABLE #CompressionAnalysis;
⚠️ When to Use Each Compression Type
Row Compression Best Candidates
| Characteristic | Why It Helps |
|---|---|
| Fixed-length columns (INT, CHAR, DATETIME) | Maximum storage savings |
| OLTP workloads with frequent updates | Lower CPU overhead |
| Tables with high insert rates | Minimal performance impact |
| Mixed workloads | Good balance of compression and speed |
Page Compression Best Candidates
| Characteristic | Why It Helps |
|---|---|
| Read-heavy workloads | Decompression overhead acceptable |
| Tables with repetitive data patterns | Dictionary compression excels |
| Large tables (> 1GB) | Significant storage savings |
| Archive/historical data | Rarely accessed, maximum compression |
💡 Pro Tip: Start with row compression for OLTP tables and page compression for reporting/archive tables. Monitor CPU impact before expanding.
🔧 Implementing Compression in Production
Online vs Offline Compression
-- Online index rebuild with compression (Enterprise Edition)
ALTER INDEX PK_Orders ON Sales.Orders
REBUILD WITH (
DATA_COMPRESSION = PAGE,
ONLINE = ON,
MAXDOP = 4,
SORT_IN_TEMPDB = ON
);
-- Offline compression (all editions)
ALTER TABLE Sales.OrderHistory
REBUILD WITH (DATA_COMPRESSION = PAGE);
Compression During Index Creation
-- Create compressed indexes from the start
CREATE NONCLUSTERED INDEX IX_Customer_Email
ON Sales.Customers (Email)
INCLUDE (FirstName, LastName, Phone)
WITH (
DATA_COMPRESSION = PAGE,
FILLFACTOR = 90,
ONLINE = ON
);
Partition-Level Compression Strategy
-- Different compression for hot vs cold partitions
-- Hot partitions (recent data): Row compression for write performance
-- Cold partitions (historical data): Page compression for storage savings
-- Create partition function
CREATE PARTITION FUNCTION PF_OrderDate (DATETIME2)
AS RANGE RIGHT FOR VALUES (
'2023-01-01', '2023-04-01', '2023-07-01', '2023-10-01',
'2024-01-01', '2024-04-01', '2024-07-01', '2024-10-01',
'2025-01-01'
);
-- Create partition scheme
CREATE PARTITION SCHEME PS_OrderDate
AS PARTITION PF_OrderDate
ALL TO ([PRIMARY]);
-- Create table with mixed compression
CREATE TABLE Sales.PartitionedOrders
(
OrderID BIGINT NOT NULL,
OrderDate DATETIME2 NOT NULL,
CustomerID INT NOT NULL,
TotalAmount DECIMAL(18,2) NOT NULL,
CONSTRAINT PK_PartitionedOrders PRIMARY KEY CLUSTERED (OrderID, OrderDate)
)
ON PS_OrderDate (OrderDate);
-- Apply different compression to each partition
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = PAGE); -- Oldest, archive
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 2 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 3 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 4 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 5 WITH (DATA_COMPRESSION = ROW); -- Recent, active
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 6 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 7 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 8 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 9 WITH (DATA_COMPRESSION = NONE); -- Current, hottest
📊 Monitoring Compression Performance
Check Current Compression Status
-- View compression settings for all objects
SELECT
SCHEMA_NAME(t.schema_id) AS SchemaName,
t.name AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
p.partition_number,
p.data_compression_desc AS CompressionType,
ps.reserved_page_count * 8 / 1024 AS SizeMB,
ps.row_count AS RowCount
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.dm_db_partition_stats ps ON p.partition_id = ps.partition_id
WHERE t.is_ms_shipped = 0
ORDER BY ps.reserved_page_count DESC;
Monitor Compression CPU Impact
-- Track CPU usage for compression operations
SELECT
qs.total_worker_time / 1000 AS TotalCPU_ms,
qs.execution_count,
qs.total_worker_time / qs.execution_count / 1000 AS AvgCPU_ms,
SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
((CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset)/2) + 1) AS QueryText
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.text LIKE '%COMPRESSION%'
ORDER BY qs.total_worker_time DESC;
Compression Savings Report
-- Generate compression savings report
WITH CompressionStats AS (
SELECT
SCHEMA_NAME(t.schema_id) AS SchemaName,
t.name AS TableName,
p.data_compression_desc AS CompressionType,
SUM(ps.reserved_page_count) * 8 / 1024 AS CurrentSizeMB,
SUM(ps.row_count) AS TotalRows
FROM sys.tables t
INNER JOIN sys.partitions p ON t.object_id = p.object_id
INNER JOIN sys.dm_db_partition_stats ps ON p.partition_id = ps.partition_id
WHERE t.is_ms_shipped = 0
GROUP BY SCHEMA_NAME(t.schema_id), t.name, p.data_compression_desc
)
SELECT
SchemaName,
TableName,
CompressionType,
CurrentSizeMB,
TotalRows,
CASE
WHEN TotalRows > 0
THEN CAST(CurrentSizeMB AS DECIMAL(18,2)) / TotalRows * 1000000
ELSE 0
END AS BytesPerRow
FROM CompressionStats
ORDER BY CurrentSizeMB DESC;
✅ Best Practices for Data Compression
Pre-Implementation Checklist
Analyze workload patterns
-- Check read vs write ratio SELECT OBJECT_NAME(s.object_id) AS TableName, SUM(user_seeks + user_scans + user_lookups) AS Reads, SUM(user_updates) AS Writes, CASE WHEN SUM(user_updates) > 0 THEN CAST(SUM(user_seeks + user_scans + user_lookups) AS FLOAT) / SUM(user_updates) ELSE 999999 END AS ReadToWriteRatio FROM sys.dm_db_index_usage_stats s WHERE database_id = DB_ID() GROUP BY s.object_id HAVING SUM(user_seeks + user_scans + user_lookups) + SUM(user_updates) > 0 ORDER BY ReadToWriteRatio DESC;Test compression impact on CPU
-- Baseline CPU before compression SELECT record_id, SQLProcessUtilization, SystemIdle, 100 - SystemIdle - SQLProcessUtilization AS OtherProcesses FROM ( SELECT record.value('(./Record/@id)[1]', 'int') AS record_id, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS SystemIdle, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS SQLProcessUtilization FROM ( SELECT CONVERT(XML, record) AS record FROM sys.dm_os_ring_buffers WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' AND record LIKE '%<SystemHealth>%' ) AS x ) AS y ORDER BY record_id DESC;Schedule compression during maintenance windows
-- Automated compression job DECLARE @StartTime TIME = '02:00:00'; -- Start at 2 AM DECLARE @EndTime TIME = '06:00:00'; -- End by 6 AM IF CAST(GETDATE() AS TIME) BETWEEN @StartTime AND @EndTime BEGIN ALTER TABLE Sales.OrderHistory REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON); END
Compression Decision Matrix
| Scenario | Recommended Compression | Rationale |
|---|---|---|
| High-volume OLTP | Row | Minimal CPU overhead |
| Data warehouse fact tables | Page or Columnstore | Maximum compression |
| Archive tables | Page | Rarely accessed |
| Temp tables | None or Row | Short-lived data |
| Indexes on OLTP | Row | Balance of performance |
| Reporting indexes | Page | Read-heavy access |
🚀 Advanced Compression Scenarios
Backup Compression (Separate from Data Compression)
-- Enable backup compression at server level
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;
-- Compress individual backup
BACKUP DATABASE AdventureWorks
TO DISK = 'D:\Backups\AdventureWorks.bak'
WITH COMPRESSION, STATS = 10;
-- Compare compressed vs uncompressed backup size
SELECT
database_name,
backup_start_date,
backup_size / 1024 / 1024 AS BackupSizeMB,
compressed_backup_size / 1024 / 1024 AS CompressedSizeMB,
CAST(100.0 - (compressed_backup_size * 100.0 / backup_size) AS DECIMAL(5,2)) AS CompressionPercent
FROM msdb.dbo.backupset
WHERE database_name = 'AdventureWorks'
ORDER BY backup_start_date DESC;
Unicode Compression (SQL Server 2019+)
-- Unicode compression is automatic in page compression
-- Verify Unicode compression is working
SELECT
OBJECT_NAME(object_id) AS TableName,
name AS ColumnName,
system_type_name,
max_length,
collation_name
FROM sys.dm_exec_describe_first_result_set(
N'SELECT * FROM Sales.Customers', NULL, 0
)
WHERE collation_name IS NOT NULL;
📈 Real-World Compression Results
Case Study: E-Commerce Order History
| Metric | Before | After (Page) | Improvement |
|---|---|---|---|
| Table Size | 450 GB | 125 GB | 72% reduction |
| Backup Size | 180 GB | 45 GB | 75% reduction |
| Backup Time | 4.5 hours | 1.2 hours | 73% faster |
| I/O per Query | 125,000 | 42,000 | 66% reduction |
| CPU Impact | Baseline | +8% | Acceptable |
⚠️ Warning: Always test compression in a non-production environment first. Monitor CPU utilization closely during initial implementation.
🔧 Troubleshooting Compression Issues
Common Problems and Solutions
-- Issue: Compression taking too long
-- Solution: Use MAXDOP to control parallelism
ALTER INDEX ALL ON Sales.LargeTable
REBUILD WITH (
DATA_COMPRESSION = PAGE,
ONLINE = ON,
MAXDOP = 2 -- Limit to 2 CPUs
);
-- Issue: High CPU during business hours
-- Solution: Use Resource Governor
CREATE RESOURCE POOL CompressionPool
WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP CompressionGroup
WITH (REQUEST_MAX_CPU_TIME_SEC = 0)
USING CompressionPool;
-- Issue: Log file growing during compression
-- Solution: Rebuild in batches
DECLARE @PartitionNumber INT = 1;
WHILE @PartitionNumber <= 12
BEGIN
ALTER TABLE Sales.PartitionedTable
REBUILD PARTITION = @PartitionNumber
WITH (DATA_COMPRESSION = PAGE);
-- Allow log backup between partitions
WAITFOR DELAY '00:01:00';
SET @PartitionNumber = @PartitionNumber + 1;
END
Conclusion
SQL Server data compression is a powerful feature that can dramatically reduce storage costs and improve I/O performance. The key to success is:
- Analyze before implementing - Use sp_estimate_data_compression_savings
- Choose the right type - Row for OLTP, Page for analytics/archive
- Monitor CPU impact - Compression trades CPU for I/O
- Implement incrementally - Start with largest, least-active tables
- Combine with partitioning - Different compression for hot vs cold data
With proper planning and implementation, you can achieve 50-80% storage savings while maintaining or even improving query performance.
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in SQL Server, SQL optimization, and enterprise database solutions.
Related Articles
- SQL Server Columnstore Indexes Explained
- SQL Server Performance Tuning Guide
- SQL Server Indexing Strategies
Need help with SQL Server optimization? Contact us for expert database consulting.