Master PostgreSQL Point-in-Time Recovery (PITR) to protect your data with the ability to restore your database to any specific moment, minimizing data loss from accidents or disasters.
What is Point-in-Time Recovery?
PITR allows you to restore a database to any point in time by combining a base backup with Write-Ahead Log (WAL) archives.
💡 Key Benefit: Unlike regular backups that restore to backup time, PITR can recover to the exact moment before a disaster, minimizing data loss to seconds.
How PITR Works
Base Backup (Sunday 00:00)
↓
WAL Archives: 0001 → 0002 → 0003 → 0004 → 0005 → ...
↓ ↑
└─────── Replay to target time ─────┘
(Tuesday 14:30)
🔧 Prerequisites Setup
Enable WAL Archiving
# postgresql.conf
wal_level = replica # Required for PITR
archive_mode = on # Enable archiving
archive_command = 'cp %p /archive/%f' # Archive command
archive_timeout = 300 # Force archive every 5 minutes
# Optional: Increase WAL retention
max_wal_size = 2GB
min_wal_size = 1GB
⚠️ Important:
wal_level = replica(orlogical) is required. The defaultminimaldoes not support PITR.
Configure Archive Storage
# Create archive directory
sudo mkdir -p /archive
sudo chown postgres:postgres /archive
# For remote storage (recommended)
# Use rsync, S3, or network storage
archive_command = 'rsync -a %p backup_server:/archive/%f'
# For AWS S3
archive_command = 'aws s3 cp %p s3://my-bucket/wal-archive/%f'
Verify Configuration
-- Check WAL level
SHOW wal_level;
-- Check archive status
SHOW archive_mode;
SHOW archive_command;
-- View archive stats
SELECT * FROM pg_stat_archiver;
Creating Base Backups
Using pg_basebackup
# Basic backup
pg_basebackup -D /backups/base_backup -Fp -Xs -P -h localhost -U replication_user
# With compression
pg_basebackup -D /backups/base_backup -Ft -z -Xs -P -h localhost -U replication_user
# Options explained:
# -D: Target directory
# -Fp: Plain format (directory)
# -Ft: Tar format
# -Xs: Stream WAL during backup
# -z: Compress output
# -P: Show progress
Backup with Checkpoint
# Force immediate checkpoint before backup
pg_basebackup -D /backups/base_backup -Fp -Xs -P --checkpoint=fast
Verify Backup
# Check backup label
cat /backups/base_backup/backup_label
# Verify PostgreSQL can start from backup (test only!)
# pg_ctl -D /backups/base_backup start -o "-p 5433"
📊 WAL Archiving Management
Monitor Archive Status
-- Check archiver statistics
SELECT
archived_count,
last_archived_wal,
last_archived_time,
failed_count,
last_failed_wal,
last_failed_time,
stats_reset
FROM pg_stat_archiver;
-- Calculate WAL generation rate
SELECT
pg_current_wal_lsn() AS current_lsn,
pg_walfile_name(pg_current_wal_lsn()) AS current_wal_file;
Verify WAL Continuity
# List archived WAL files
ls -la /archive/ | head -20
# Check for gaps in WAL sequence
ls /archive/*.gz | sort | awk -F'/' '{print $NF}' | \
awk '{prev=$1; getline; if ($1 != prev+1) print "Gap after: " prev}'
Clean Old Archives
# Find oldest required WAL file
pg_controldata /var/lib/postgresql/data | grep "REDO WAL file"
# Remove WAL files older than base backup
# Use pg_archivecleanup
pg_archivecleanup /archive 000000010000000000000010
🚀 Performing Point-in-Time Recovery
Scenario: Recover from Accidental DELETE
Accidental deletion occurred at 2025-06-15 14:35:00. We want to recover to 14:34:00.
Step 1: Stop PostgreSQL
sudo systemctl stop postgresql
Step 2: Preserve Current Data
# Rename current data directory
mv /var/lib/postgresql/16/main /var/lib/postgresql/16/main_corrupted
Step 3: Restore Base Backup
# Extract base backup
cp -r /backups/base_backup /var/lib/postgresql/16/main
# Or for tar format
mkdir /var/lib/postgresql/16/main
tar -xvf /backups/base_backup.tar -C /var/lib/postgresql/16/main
Step 4: Configure Recovery
Create postgresql.conf (or recovery.conf for PG < 12):
# postgresql.conf - Add these recovery settings
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2025-06-15 14:34:00'
recovery_target_action = 'promote'
Create recovery.signal file:
touch /var/lib/postgresql/16/main/recovery.signal
Step 5: Set Permissions
chown -R postgres:postgres /var/lib/postgresql/16/main
chmod 700 /var/lib/postgresql/16/main
Step 6: Start PostgreSQL
sudo systemctl start postgresql
# Monitor recovery progress
tail -f /var/log/postgresql/postgresql-16-main.log
Step 7: Verify Recovery
-- Check recovery status
SELECT pg_is_in_recovery();
-- Verify data
SELECT COUNT(*) FROM important_table;
-- Check recovery target
SELECT pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
Recovery Target Options
Recovery to Specific Time
recovery_target_time = '2025-06-15 14:34:00'
Recovery to Specific Transaction
recovery_target_xid = '12345678'
Recovery to Named Restore Point
-- Create restore point before risky operation
SELECT pg_create_restore_point('before_migration');
recovery_target_name = 'before_migration'
Recovery to LSN Position
recovery_target_lsn = '0/1A000060'
Recovery Actions
# What to do when recovery target is reached
recovery_target_action = 'pause' # Pause and allow inspection
recovery_target_action = 'promote' # Promote to primary
recovery_target_action = 'shutdown' # Shut down cleanly
Advanced Recovery Techniques
Inclusive vs Exclusive Recovery
# Include the target transaction (default)
recovery_target_inclusive = true
# Stop just before target
recovery_target_inclusive = false
Recovery in New Timeline
# Create new timeline after recovery (default)
recovery_target_timeline = 'latest'
# Stay on specific timeline
recovery_target_timeline = '1'
Parallel Recovery
# Speed up recovery with parallel workers (PG 15+)
max_parallel_workers = 8
max_parallel_recovery_workers = 4
🔧 Automated PITR Setup
Backup Script
#!/bin/bash
# /opt/scripts/pitr_backup.sh
BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_PATH="${BACKUP_DIR}/base_${DATE}"
RETENTION_DAYS=7
# Create base backup
echo "Starting base backup at $(date)"
pg_basebackup -D "$BACKUP_PATH" -Ft -z -Xs -P \
-h localhost -U replication_user
if [ $? -eq 0 ]; then
echo "Backup completed successfully: $BACKUP_PATH"
# Record backup info
echo "$DATE" > "${BACKUP_DIR}/latest_backup"
# Clean old backups
find "$BACKUP_DIR" -name "base_*" -mtime +${RETENTION_DAYS} -exec rm -rf {} \;
# Clean old WAL archives
OLDEST_BACKUP=$(ls -td ${BACKUP_DIR}/base_* 2>/dev/null | tail -1)
if [ -n "$OLDEST_BACKUP" ]; then
OLDEST_WAL=$(cat "${OLDEST_BACKUP}/backup_label" | grep "START WAL" | awk '{print $3}')
pg_archivecleanup /archive "$OLDEST_WAL"
fi
else
echo "Backup FAILED!"
exit 1
fi
Cron Configuration
# Daily base backup at 2 AM
0 2 * * * /opt/scripts/pitr_backup.sh >> /var/log/pitr_backup.log 2>&1
# Verify archive every hour
0 * * * * /opt/scripts/verify_archive.sh >> /var/log/archive_verify.log 2>&1
Archive Verification Script
#!/bin/bash
# /opt/scripts/verify_archive.sh
# Check archive status
FAILED=$(psql -t -c "SELECT failed_count FROM pg_stat_archiver")
if [ "$FAILED" -gt 0 ]; then
echo "WARNING: Archive failures detected: $FAILED"
# Send alert
fi
# Check archive lag
LAST_ARCHIVE=$(psql -t -c "SELECT last_archived_time FROM pg_stat_archiver")
LAG_MINUTES=$(psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - '$LAST_ARCHIVE'::timestamp))/60")
if [ "${LAG_MINUTES%.*}" -gt 10 ]; then
echo "WARNING: Archive lag: $LAG_MINUTES minutes"
# Send alert
fi
📊 Monitoring PITR Health
Archive Status Dashboard
-- Comprehensive archive health check
SELECT
'Archive Mode' AS metric,
current_setting('archive_mode') AS value
UNION ALL
SELECT
'WAL Level',
current_setting('wal_level')
UNION ALL
SELECT
'Archived Count',
archived_count::text
FROM pg_stat_archiver
UNION ALL
SELECT
'Failed Count',
failed_count::text
FROM pg_stat_archiver
UNION ALL
SELECT
'Last Archived',
last_archived_time::text
FROM pg_stat_archiver
UNION ALL
SELECT
'Current WAL',
pg_walfile_name(pg_current_wal_lsn());
Calculate Recovery Window
-- Estimate recovery capability
WITH archive_info AS (
SELECT
pg_walfile_name(pg_current_wal_lsn()) AS current_wal,
last_archived_wal,
last_archived_time
FROM pg_stat_archiver
)
SELECT
current_wal,
last_archived_wal,
last_archived_time,
now() - last_archived_time AS archive_lag
FROM archive_info;
⚠️ Common Issues and Solutions
Issue 1: Archive Command Failing
-- Check archive status
SELECT * FROM pg_stat_archiver;
Solution:
# Test archive command manually
cp /var/lib/postgresql/16/main/pg_wal/000000010000000000000001 /archive/test
# Check permissions
ls -la /archive
# Check disk space
df -h /archive
Issue 2: WAL Files Accumulating
# Check for archive backlog
ls /var/lib/postgresql/16/main/pg_wal/ | wc -l
# If archive is failing, WAL will accumulate
# Fix archive command first, then:
# Files will be archived automatically
Issue 3: Recovery Not Finding WAL Files
LOG: could not open file "pg_wal/000000010000000000000010": No such file
Solution:
# Ensure restore_command is correct
restore_command = 'cp /archive/%f %p'
# Test restore command manually
cp /archive/000000010000000000000010 /tmp/test
Issue 4: Recovery Target Not Reached
Solution:
# Check target specification
# Ensure time is in correct timezone
recovery_target_time = '2025-06-15 14:34:00 America/New_York'
# Or use UTC explicitly
recovery_target_time = '2025-06-15 18:34:00 UTC'
✅ Best Practices
Backup Strategy
| Component | Frequency | Retention |
|---|---|---|
| Base Backup | Daily | 7-30 days |
| WAL Archives | Continuous | Match base backups |
| Full Backup | Weekly | 4-12 weeks |
Testing Requirements
- Test recovery monthly - Verify backups actually work
- Document procedures - Step-by-step recovery runbooks
- Train team members - Multiple people should know the process
- Time the recovery - Know your RTO (Recovery Time Objective)
Storage Recommendations
# Separate storage for archives
# - Different physical disk
# - Different server (ideally)
# - Cloud storage (S3, GCS, Azure Blob)
# Example S3 archive command
archive_command = 'aws s3 cp %p s3://my-bucket/wal/%f --storage-class STANDARD_IA'
# With wal-g (recommended)
archive_command = 'wal-g wal-push %p'
Quick Reference Commands
# Create base backup
pg_basebackup -D /backup -Fp -Xs -P
# Check archive status
psql -c "SELECT * FROM pg_stat_archiver"
# Create restore point
psql -c "SELECT pg_create_restore_point('my_point')"
# Start recovery
touch /data/recovery.signal
systemctl start postgresql
# Promote from recovery
psql -c "SELECT pg_promote()"
# Clean old archives
pg_archivecleanup /archive 000000010000000000000010
Conclusion
Point-in-Time Recovery is essential for data protection in PostgreSQL. By properly configuring WAL archiving and regularly testing your recovery procedures, you can minimize data loss and recover from almost any disaster scenario.
Key Takeaways
- ✅ Enable WAL archiving with proper storage
- ✅ Take regular base backups
- ✅ Test recovery procedures regularly
- ✅ Monitor archive health continuously
- ✅ Document and train on recovery procedures
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.
Related Articles
- PostgreSQL Backup Strategies: pg_dump vs pg_basebackup
- PostgreSQL High Availability: Master-Slave Replication
- PostgreSQL Logical Replication Setup
Need help with PostgreSQL optimization? Contact us for expert database consulting.