Name: EVE AI Core
Author: Jamaurice Holt

🚀 SQL Server Data Compression: The Complete DBA Guide

Data compression is one of the most powerful yet underutilized features in SQL Server. When implemented correctly, it can reduce storage costs by 50-90%, improve I/O performance, and even accelerate query execution. This comprehensive guide covers everything you need to know about implementing data compression in production environments.

📊 Understanding SQL Server Compression Types

SQL Server offers several compression options, each designed for specific use cases and workload patterns.

Row Compression

Row compression is the lighter option that stores fixed-length data types using variable-length storage formats. It's the foundation upon which page compression builds.

-- Enable row compression on an existing table
ALTER TABLE Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = ROW);

-- Enable row compression on a specific partition
ALTER TABLE Sales.OrderHistory
REBUILD PARTITION = 5 WITH (DATA_COMPRESSION = ROW);

-- Create a new table with row compression
CREATE TABLE Sales.CompressedOrders
(
    OrderID INT NOT NULL,
    CustomerID INT NOT NULL,
    OrderDate DATETIME2 NOT NULL,
    TotalAmount DECIMAL(18,2) NOT NULL,
    Status VARCHAR(20) NOT NULL,
    CONSTRAINT PK_CompressedOrders PRIMARY KEY CLUSTERED (OrderID)
)
WITH (DATA_COMPRESSION = ROW);

Page Compression

Page compression includes row compression plus additional algorithms: prefix compression and dictionary compression. This typically achieves 50-80% compression ratios.

-- Enable page compression on an existing table
ALTER TABLE Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Enable page compression on an index
ALTER INDEX IX_OrderDetails_CustomerID
ON Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Create index with page compression
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON Sales.Orders (OrderDate)
INCLUDE (CustomerID, TotalAmount)
WITH (DATA_COMPRESSION = PAGE);

Columnstore Compression

Columnstore indexes use a highly efficient compression algorithm designed for analytics workloads. They can achieve 10x or better compression compared to row-store tables.

-- Create clustered columnstore index
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON DW.FactSales
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

-- Create nonclustered columnstore index for hybrid workloads
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Analytics
ON Sales.Orders (OrderDate, CustomerID, TotalAmount, Quantity)
WITH (DATA_COMPRESSION = COLUMNSTORE);

💡 Estimating Compression Savings

Before implementing compression, always estimate the potential savings using the built-in stored procedure.

Using sp_estimate_data_compression_savings

-- Estimate row compression savings
EXEC sp_estimate_data_compression_savings
    @schema_name = 'Sales',
    @object_name = 'OrderDetails',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'ROW';

-- Estimate page compression savings
EXEC sp_estimate_data_compression_savings
    @schema_name = 'Sales',
    @object_name = 'OrderDetails',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

Comprehensive Compression Analysis Script

-- Analyze all tables for compression opportunities
SET NOCOUNT ON;

CREATE TABLE #CompressionAnalysis
(
    SchemaName SYSNAME,
    TableName SYSNAME,
    IndexName SYSNAME NULL,
    IndexID INT,
    PartitionNumber INT,
    CurrentSizeKB BIGINT,
    RowCompressedSizeKB BIGINT,
    PageCompressedSizeKB BIGINT,
    RowSavingsPercent DECIMAL(5,2),
    PageSavingsPercent DECIMAL(5,2)
);

DECLARE @SchemaName SYSNAME, @TableName SYSNAME;
DECLARE @SQL NVARCHAR(MAX);

DECLARE table_cursor CURSOR FOR
SELECT
    s.name AS SchemaName,
    t.name AS TableName
FROM sys.tables t
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.is_ms_shipped = 0
  AND t.type = 'U'
ORDER BY s.name, t.name;

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @SchemaName, @TableName;

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        -- Get current size
        DECLARE @CurrentSize BIGINT;
        SELECT @CurrentSize = SUM(ps.reserved_page_count) * 8
        FROM sys.dm_db_partition_stats ps
        INNER JOIN sys.tables t ON ps.object_id = t.object_id
        INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
        WHERE s.name = @SchemaName AND t.name = @TableName;

        -- Estimate row compression
        CREATE TABLE #RowEst
        (
            object_name SYSNAME,
            schema_name SYSNAME,
            index_id INT,
            partition_number INT,
            size_with_current_compression_setting BIGINT,
            size_with_requested_compression_setting BIGINT,
            sample_size_with_current_compression_setting BIGINT,
            sample_size_with_requested_compression_setting BIGINT
        );

        INSERT INTO #RowEst
        EXEC sp_estimate_data_compression_savings
            @schema_name = @SchemaName,
            @object_name = @TableName,
            @index_id = NULL,
            @partition_number = NULL,
            @data_compression = 'ROW';

        -- Estimate page compression
        CREATE TABLE #PageEst
        (
            object_name SYSNAME,
            schema_name SYSNAME,
            index_id INT,
            partition_number INT,
            size_with_current_compression_setting BIGINT,
            size_with_requested_compression_setting BIGINT,
            sample_size_with_current_compression_setting BIGINT,
            sample_size_with_requested_compression_setting BIGINT
        );

        INSERT INTO #PageEst
        EXEC sp_estimate_data_compression_savings
            @schema_name = @SchemaName,
            @object_name = @TableName,
            @index_id = NULL,
            @partition_number = NULL,
            @data_compression = 'PAGE';

        -- Insert results
        INSERT INTO #CompressionAnalysis
        SELECT
            @SchemaName,
            @TableName,
            i.name,
            r.index_id,
            r.partition_number,
            r.size_with_current_compression_setting,
            r.size_with_requested_compression_setting,
            p.size_with_requested_compression_setting,
            CASE WHEN r.size_with_current_compression_setting > 0
                 THEN (1.0 - (CAST(r.size_with_requested_compression_setting AS DECIMAL) / r.size_with_current_compression_setting)) * 100
                 ELSE 0 END,
            CASE WHEN p.size_with_current_compression_setting > 0
                 THEN (1.0 - (CAST(p.size_with_requested_compression_setting AS DECIMAL) / p.size_with_current_compression_setting)) * 100
                 ELSE 0 END
        FROM #RowEst r
        INNER JOIN #PageEst p ON r.index_id = p.index_id AND r.partition_number = p.partition_number
        LEFT JOIN sys.indexes i ON i.object_id = OBJECT_ID(@SchemaName + '.' + @TableName) AND i.index_id = r.index_id;

        DROP TABLE #RowEst;
        DROP TABLE #PageEst;
    END TRY
    BEGIN CATCH
        -- Skip tables that can't be analyzed
        PRINT 'Error analyzing: ' + @SchemaName + '.' + @TableName;
    END CATCH

    FETCH NEXT FROM table_cursor INTO @SchemaName, @TableName;
END

CLOSE table_cursor;
DEALLOCATE table_cursor;

-- Display results sorted by potential savings
SELECT
    SchemaName,
    TableName,
    IndexName,
    CurrentSizeKB / 1024 AS CurrentSizeMB,
    RowCompressedSizeKB / 1024 AS RowCompressedMB,
    PageCompressedSizeKB / 1024 AS PageCompressedMB,
    RowSavingsPercent,
    PageSavingsPercent,
    (CurrentSizeKB - PageCompressedSizeKB) / 1024 AS PotentialSavingsMB
FROM #CompressionAnalysis
WHERE CurrentSizeKB > 10240  -- Only show tables > 10MB
ORDER BY (CurrentSizeKB - PageCompressedSizeKB) DESC;

DROP TABLE #CompressionAnalysis;

⚠️ When to Use Each Compression Type

Row Compression Best Candidates

Characteristic	Why It Helps
Fixed-length columns (INT, CHAR, DATETIME)	Maximum storage savings
OLTP workloads with frequent updates	Lower CPU overhead
Tables with high insert rates	Minimal performance impact
Mixed workloads	Good balance of compression and speed

Page Compression Best Candidates

Characteristic	Why It Helps
Read-heavy workloads	Decompression overhead acceptable
Tables with repetitive data patterns	Dictionary compression excels
Large tables (> 1GB)	Significant storage savings
Archive/historical data	Rarely accessed, maximum compression

💡 Pro Tip: Start with row compression for OLTP tables and page compression for reporting/archive tables. Monitor CPU impact before expanding.

🔧 Implementing Compression in Production

Online vs Offline Compression

-- Online index rebuild with compression (Enterprise Edition)
ALTER INDEX PK_Orders ON Sales.Orders
REBUILD WITH (
    DATA_COMPRESSION = PAGE,
    ONLINE = ON,
    MAXDOP = 4,
    SORT_IN_TEMPDB = ON
);

-- Offline compression (all editions)
ALTER TABLE Sales.OrderHistory
REBUILD WITH (DATA_COMPRESSION = PAGE);

Compression During Index Creation

-- Create compressed indexes from the start
CREATE NONCLUSTERED INDEX IX_Customer_Email
ON Sales.Customers (Email)
INCLUDE (FirstName, LastName, Phone)
WITH (
    DATA_COMPRESSION = PAGE,
    FILLFACTOR = 90,
    ONLINE = ON
);

Partition-Level Compression Strategy

-- Different compression for hot vs cold partitions
-- Hot partitions (recent data): Row compression for write performance
-- Cold partitions (historical data): Page compression for storage savings

-- Create partition function
CREATE PARTITION FUNCTION PF_OrderDate (DATETIME2)
AS RANGE RIGHT FOR VALUES (
    '2023-01-01', '2023-04-01', '2023-07-01', '2023-10-01',
    '2024-01-01', '2024-04-01', '2024-07-01', '2024-10-01',
    '2025-01-01'
);

-- Create partition scheme
CREATE PARTITION SCHEME PS_OrderDate
AS PARTITION PF_OrderDate
ALL TO ([PRIMARY]);

-- Create table with mixed compression
CREATE TABLE Sales.PartitionedOrders
(
    OrderID BIGINT NOT NULL,
    OrderDate DATETIME2 NOT NULL,
    CustomerID INT NOT NULL,
    TotalAmount DECIMAL(18,2) NOT NULL,
    CONSTRAINT PK_PartitionedOrders PRIMARY KEY CLUSTERED (OrderID, OrderDate)
)
ON PS_OrderDate (OrderDate);

-- Apply different compression to each partition
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = PAGE);  -- Oldest, archive
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 2 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 3 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 4 WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 5 WITH (DATA_COMPRESSION = ROW);   -- Recent, active
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 6 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 7 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 8 WITH (DATA_COMPRESSION = ROW);
ALTER TABLE Sales.PartitionedOrders
REBUILD PARTITION = 9 WITH (DATA_COMPRESSION = NONE);  -- Current, hottest

📊 Monitoring Compression Performance

Check Current Compression Status

-- View compression settings for all objects
SELECT
    SCHEMA_NAME(t.schema_id) AS SchemaName,
    t.name AS TableName,
    i.name AS IndexName,
    i.type_desc AS IndexType,
    p.partition_number,
    p.data_compression_desc AS CompressionType,
    ps.reserved_page_count * 8 / 1024 AS SizeMB,
    ps.row_count AS RowCount
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.dm_db_partition_stats ps ON p.partition_id = ps.partition_id
WHERE t.is_ms_shipped = 0
ORDER BY ps.reserved_page_count DESC;

Monitor Compression CPU Impact

-- Track CPU usage for compression operations
SELECT
    qs.total_worker_time / 1000 AS TotalCPU_ms,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count / 1000 AS AvgCPU_ms,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset)/2) + 1) AS QueryText
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.text LIKE '%COMPRESSION%'
ORDER BY qs.total_worker_time DESC;

Compression Savings Report

-- Generate compression savings report
WITH CompressionStats AS (
    SELECT
        SCHEMA_NAME(t.schema_id) AS SchemaName,
        t.name AS TableName,
        p.data_compression_desc AS CompressionType,
        SUM(ps.reserved_page_count) * 8 / 1024 AS CurrentSizeMB,
        SUM(ps.row_count) AS TotalRows
    FROM sys.tables t
    INNER JOIN sys.partitions p ON t.object_id = p.object_id
    INNER JOIN sys.dm_db_partition_stats ps ON p.partition_id = ps.partition_id
    WHERE t.is_ms_shipped = 0
    GROUP BY SCHEMA_NAME(t.schema_id), t.name, p.data_compression_desc
)
SELECT
    SchemaName,
    TableName,
    CompressionType,
    CurrentSizeMB,
    TotalRows,
    CASE
        WHEN TotalRows > 0
        THEN CAST(CurrentSizeMB AS DECIMAL(18,2)) / TotalRows * 1000000
        ELSE 0
    END AS BytesPerRow
FROM CompressionStats
ORDER BY CurrentSizeMB DESC;

✅ Best Practices for Data Compression

Pre-Implementation Checklist

Analyze workload patterns

-- Check read vs write ratio
SELECT
    OBJECT_NAME(s.object_id) AS TableName,
    SUM(user_seeks + user_scans + user_lookups) AS Reads,
    SUM(user_updates) AS Writes,
    CASE
        WHEN SUM(user_updates) > 0
        THEN CAST(SUM(user_seeks + user_scans + user_lookups) AS FLOAT) / SUM(user_updates)
        ELSE 999999
    END AS ReadToWriteRatio
FROM sys.dm_db_index_usage_stats s
WHERE database_id = DB_ID()
GROUP BY s.object_id
HAVING SUM(user_seeks + user_scans + user_lookups) + SUM(user_updates) > 0
ORDER BY ReadToWriteRatio DESC;

Test compression impact on CPU

-- Baseline CPU before compression
SELECT
    record_id,
    SQLProcessUtilization,
    SystemIdle,
    100 - SystemIdle - SQLProcessUtilization AS OtherProcesses
FROM (
    SELECT
        record.value('(./Record/@id)[1]', 'int') AS record_id,
        record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS SystemIdle,
        record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS SQLProcessUtilization
    FROM (
        SELECT CONVERT(XML, record) AS record
        FROM sys.dm_os_ring_buffers
        WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
          AND record LIKE '%<SystemHealth>%'
    ) AS x
) AS y
ORDER BY record_id DESC;

Schedule compression during maintenance windows

-- Automated compression job
DECLARE @StartTime TIME = '02:00:00';  -- Start at 2 AM
DECLARE @EndTime TIME = '06:00:00';    -- End by 6 AM

IF CAST(GETDATE() AS TIME) BETWEEN @StartTime AND @EndTime
BEGIN
    ALTER TABLE Sales.OrderHistory
    REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
END

Compression Decision Matrix

Scenario	Recommended Compression	Rationale
High-volume OLTP	Row	Minimal CPU overhead
Data warehouse fact tables	Page or Columnstore	Maximum compression
Archive tables	Page	Rarely accessed
Temp tables	None or Row	Short-lived data
Indexes on OLTP	Row	Balance of performance
Reporting indexes	Page	Read-heavy access

🚀 Advanced Compression Scenarios

Backup Compression (Separate from Data Compression)

-- Enable backup compression at server level
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Compress individual backup
BACKUP DATABASE AdventureWorks
TO DISK = 'D:\Backups\AdventureWorks.bak'
WITH COMPRESSION, STATS = 10;

-- Compare compressed vs uncompressed backup size
SELECT
    database_name,
    backup_start_date,
    backup_size / 1024 / 1024 AS BackupSizeMB,
    compressed_backup_size / 1024 / 1024 AS CompressedSizeMB,
    CAST(100.0 - (compressed_backup_size * 100.0 / backup_size) AS DECIMAL(5,2)) AS CompressionPercent
FROM msdb.dbo.backupset
WHERE database_name = 'AdventureWorks'
ORDER BY backup_start_date DESC;

Unicode Compression (SQL Server 2019+)

-- Unicode compression is automatic in page compression
-- Verify Unicode compression is working
SELECT
    OBJECT_NAME(object_id) AS TableName,
    name AS ColumnName,
    system_type_name,
    max_length,
    collation_name
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT * FROM Sales.Customers', NULL, 0
)
WHERE collation_name IS NOT NULL;

📈 Real-World Compression Results

Case Study: E-Commerce Order History

Metric	Before	After (Page)	Improvement
Table Size	450 GB	125 GB	72% reduction
Backup Size	180 GB	45 GB	75% reduction
Backup Time	4.5 hours	1.2 hours	73% faster
I/O per Query	125,000	42,000	66% reduction
CPU Impact	Baseline	+8%	Acceptable

⚠️ Warning: Always test compression in a non-production environment first. Monitor CPU utilization closely during initial implementation.

🔧 Troubleshooting Compression Issues

Common Problems and Solutions

-- Issue: Compression taking too long
-- Solution: Use MAXDOP to control parallelism
ALTER INDEX ALL ON Sales.LargeTable
REBUILD WITH (
    DATA_COMPRESSION = PAGE,
    ONLINE = ON,
    MAXDOP = 2  -- Limit to 2 CPUs
);

-- Issue: High CPU during business hours
-- Solution: Use Resource Governor
CREATE RESOURCE POOL CompressionPool
WITH (MAX_CPU_PERCENT = 20);

CREATE WORKLOAD GROUP CompressionGroup
WITH (REQUEST_MAX_CPU_TIME_SEC = 0)
USING CompressionPool;

-- Issue: Log file growing during compression
-- Solution: Rebuild in batches
DECLARE @PartitionNumber INT = 1;
WHILE @PartitionNumber <= 12
BEGIN
    ALTER TABLE Sales.PartitionedTable
    REBUILD PARTITION = @PartitionNumber
    WITH (DATA_COMPRESSION = PAGE);

    -- Allow log backup between partitions
    WAITFOR DELAY '00:01:00';

    SET @PartitionNumber = @PartitionNumber + 1;
END

Conclusion

SQL Server data compression is a powerful feature that can dramatically reduce storage costs and improve I/O performance. The key to success is:

Analyze before implementing - Use sp_estimate_data_compression_savings
Choose the right type - Row for OLTP, Page for analytics/archive
Monitor CPU impact - Compression trades CPU for I/O
Implement incrementally - Start with largest, least-active tables
Combine with partitioning - Different compression for hot vs cold data

With proper planning and implementation, you can achieve 50-80% storage savings while maintaining or even improving query performance.

About the Author

Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in SQL Server, SQL optimization, and enterprise database solutions.

Need help with SQL Server optimization? Contact us for expert database consulting.

SQL Server Data Compression Techniques

🚀 SQL Server Data Compression: The Complete DBA Guide

📊 Understanding SQL Server Compression Types

Row Compression

Page Compression

Columnstore Compression

💡 Estimating Compression Savings

Using sp_estimate_data_compression_savings

Comprehensive Compression Analysis Script

⚠️ When to Use Each Compression Type

Row Compression Best Candidates

Page Compression Best Candidates

🔧 Implementing Compression in Production

Online vs Offline Compression

Compression During Index Creation

Partition-Level Compression Strategy

📊 Monitoring Compression Performance

Check Current Compression Status

Monitor Compression CPU Impact

Compression Savings Report

✅ Best Practices for Data Compression

Pre-Implementation Checklist

Compression Decision Matrix

🚀 Advanced Compression Scenarios

Backup Compression (Separate from Data Compression)

Unicode Compression (SQL Server 2019+)

📈 Real-World Compression Results

Case Study: E-Commerce Order History

🔧 Troubleshooting Compression Issues

Common Problems and Solutions

Conclusion

About the Author

Related Articles

🚀 SQL Server Data Compression: The Complete DBA Guide

📊 Understanding SQL Server Compression Types

Row Compression

Page Compression

Columnstore Compression

💡 Estimating Compression Savings

Using sp_estimate_data_compression_savings

Comprehensive Compression Analysis Script

⚠️ When to Use Each Compression Type

Row Compression Best Candidates

Page Compression Best Candidates

🔧 Implementing Compression in Production

Online vs Offline Compression

Compression During Index Creation

Partition-Level Compression Strategy

📊 Monitoring Compression Performance

Check Current Compression Status

Monitor Compression CPU Impact

Compression Savings Report

✅ Best Practices for Data Compression

Pre-Implementation Checklist

Compression Decision Matrix

🚀 Advanced Compression Scenarios

Backup Compression (Separate from Data Compression)

Unicode Compression (SQL Server 2019+)

📈 Real-World Compression Results

Case Study: E-Commerce Order History

🔧 Troubleshooting Compression Issues

Common Problems and Solutions

Conclusion

About the Author

Related Articles

Continue Reading

Leadership in Motion: Technical Leadership Lessons from Database Operations

Database Performance Tuning: Lessons from the Trenches

AI Governance and Hardware: Why Sovereign Refusal Belongs in Silicon