Discover expert insights on PostgreSQL backup strategies comparing pg_dump and pg_basebackup to ensure your data is always protected and recoverable.
Overview
Choosing the right backup strategy is critical for database reliability and disaster recovery. PostgreSQL offers two primary backup tools: pg_dump for logical backups and pg_basebackup for physical backups. Understanding when to use each is essential for every DBA.
Critical reminder: Backups are useless unless you regularly test your restore procedures!
Why Backup Strategy Matters
In today's data-driven world, data loss can be catastrophic. A robust backup strategy ensures:
- Business continuity during disasters
- Recovery from human errors (accidental deletions, bad updates)
- Compliance requirements (data retention policies)
- Development and testing environments
pg_dump: Logical Backups
What is pg_dump?
pg_dump creates logical backups by extracting database objects as SQL statements. It's ideal for selective backups, database migrations, and version upgrades.
💡 Key Features
- Selective backups: Backup specific tables, schemas, or databases
- Platform independent: Restore to different OS or PostgreSQL versions
- Human readable: Plain SQL output (with
-Fpformat) - Online backups: No downtime required
- Flexible restore: Cherry-pick specific objects during restoration
Usage Examples
Basic Database Backup
# Backup entire database to custom format
pg_dump -U postgres -d mydb -Fc -f mydb_backup.dump
# Backup to SQL file (human readable)
pg_dump -U postgres -d mydb -Fp -f mydb_backup.sql
Selective Backups
# Backup specific table
pg_dump -U postgres -d mydb -t users -Fc -f users_backup.dump
# Backup specific schema
pg_dump -U postgres -d mydb -n public -Fc -f public_schema.dump
# Exclude specific tables
pg_dump -U postgres -d mydb -T logs -T temp_data -Fc -f mydb_without_logs.dump
Backup All Databases
# Use pg_dumpall for cluster-wide backup
pg_dumpall -U postgres -f cluster_backup.sql
# Backup only global objects (roles, tablespaces)
pg_dumpall -U postgres -g -f globals.sql
Restore from pg_dump
# Restore from custom format
pg_restore -U postgres -d mydb -v mydb_backup.dump
# Restore from SQL file
psql -U postgres -d mydb -f mydb_backup.sql
# Restore with parallel jobs (faster)
pg_restore -U postgres -d mydb -j 4 mydb_backup.dump
pg_dump Formats
| Format | Flag | Use Case | Compression | Parallel Restore |
|---|---|---|---|---|
| Plain | -Fp | Human readable, version control | No | No |
| Custom | -Fc | Flexible restore, best for production | Yes | Yes |
| Directory | -Fd | Large databases, parallel dump | Yes | Yes |
| Tar | -Ft | Archive format | No | No |
✅ When to Use pg_dump
- Database migrations across PostgreSQL versions
- Selective backups of specific tables/schemas
- Development/testing environment refreshes
- Small to medium databases (< 100GB)
- Cross-platform migrations
- Schema-only or data-only backups
⚠️ Limitations
- Slower for large databases: Dumps and restores can take hours
- Not point-in-time recovery: Only captures state at backup time
- Requires more storage: Plain text SQL is less efficient
- No binary data optimization: Large BLOBs can be slow
pg_basebackup: Physical Backups
What is pg_basebackup?
pg_basebackup creates physical backups by copying the entire PostgreSQL data directory. It's the foundation for Point-in-Time Recovery (PITR) and standby server setup.
💡 Key Features
- Complete cluster backup: Copies entire data directory
- Point-in-Time Recovery: Combined with WAL archiving
- Fast backup and restore: Binary copy is faster than logical
- Standby server setup: Used for streaming replication
- Consistent snapshots: Ensures transactional consistency
Usage Examples
Basic Physical Backup
# Backup to directory
pg_basebackup -h localhost -U replicator -D /backup/pgdata -Fp -P
# Backup with compression (tar format)
pg_basebackup -h localhost -U replicator -D /backup -Ft -z -P
Backup for PITR Setup
# Configure WAL archiving in postgresql.conf
# archive_mode = on
# archive_command = 'cp %p /archive/%f'
# Create base backup
pg_basebackup -h localhost -U replicator -D /backup/base -Fp -P -R
# -R flag creates recovery configuration for standby
Streaming Replication Setup
# Create standby server from backup
pg_basebackup -h primary_server -U replicator \
-D /var/lib/postgresql/data \
-P --wal-method=stream -R
Command Options Explained
Common flags:
-D /path: Target directory for backup-Fp: Plain format (directory copy)-Ft: Tar format (compressed archives)-z: Enable gzip compression-P: Show progress-R: Create standby configuration files--wal-method=stream: Stream WAL during backup
Restore from pg_basebackup
# Stop PostgreSQL service
systemctl stop postgresql
# Replace data directory
rm -rf /var/lib/postgresql/data
cp -r /backup/pgdata /var/lib/postgresql/data
# Set proper permissions
chown -R postgres:postgres /var/lib/postgresql/data
chmod 700 /var/lib/postgresql/data
# Start PostgreSQL
systemctl start postgresql
🚀 Point-in-Time Recovery (PITR)
Combine pg_basebackup with WAL archiving for PITR capabilities:
Step 1: Configure WAL Archiving
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
archive_timeout = 300
Step 2: Create Base Backup
pg_basebackup -D /backup/base -Fp -P
Step 3: Restore to Specific Point in Time
Create recovery.conf (PostgreSQL < 12) or recovery.signal (PostgreSQL 12+):
# recovery.conf (PostgreSQL < 12)
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2025-08-05 14:30:00'
recovery_target_action = 'promote'
For PostgreSQL 12+, add to postgresql.conf:
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2025-08-05 14:30:00'
Then create recovery.signal file:
touch /var/lib/postgresql/data/recovery.signal
✅ When to Use pg_basebackup
- Large databases (> 100GB)
- Point-in-Time Recovery requirements
- Disaster recovery planning
- Standby server setup for replication
- Fast backup and restore needs
- Complete cluster backup including all databases
⚠️ Limitations
- Version specific: Must restore to same PostgreSQL major version
- Platform dependent: OS and architecture must match
- All or nothing: Cannot selectively restore individual tables
- Larger storage: Includes all databases in cluster
- Requires replication user: Need proper permissions
Comparison: pg_dump vs pg_basebackup
| Feature | pg_dump | pg_basebackup |
|---|---|---|
| Backup Type | Logical | Physical |
| Speed | Slower | Faster |
| Size | Larger (text) | Smaller (binary) |
| Selective Restore | ✅ Yes | ❌ No |
| Cross-Version | ✅ Yes | ❌ No |
| PITR Support | ❌ No | ✅ Yes |
| Online Backup | ✅ Yes | ✅ Yes |
| Best For | Small DBs, migrations | Large DBs, DR |
Hybrid Backup Strategy
💡 Best Practice: Use Both!
A robust backup strategy combines both methods:
Daily: pg_basebackup + WAL Archiving
- Full physical backup weekly
- Continuous WAL archiving for PITR
- Retention: 30 days
Weekly: pg_dump
- Logical backup of critical databases
- Selective backups of important tables
- Retention: 90 days
Example Backup Schedule
#!/bin/bash
# Daily physical backup
0 1 * * * pg_basebackup -D /backup/daily/$(date +\%Y\%m\%d) -Fp -P
# Weekly logical backup (Sunday)
0 2 * * 0 pg_dump -U postgres -d production -Fc -f /backup/weekly/prod_$(date +\%Y\%m\%d).dump
# Monthly full cluster backup
0 3 1 * * pg_dumpall -U postgres -f /backup/monthly/cluster_$(date +\%Y\%m\%d).sql
Backup Best Practices
✅ Essential Guidelines
- Test restores regularly: Schedule quarterly restore drills
- Automate backups: Use cron jobs or backup tools
- Monitor backup status: Alert on failures
- Encrypt backups: Protect sensitive data
- Off-site storage: Use cloud storage or remote servers
- Document procedures: Maintain runbooks
- Validate backups: Check backup integrity
- Set retention policies: Balance storage costs and recovery needs
🚀 Performance Tips
- Use compression: Save storage space (
-zflag) - Parallel operations: Use
-jflag for pg_dump/pg_restore - Network bandwidth: Consider local backups for large databases
- Backup windows: Schedule during low-traffic periods
- Incremental backups: Use WAL archiving between full backups
⚠️ Common Mistakes to Avoid
- Not testing restores: Backups without restore tests are unreliable
- Single backup location: Always have off-site copies
- Ignoring WAL files: Essential for PITR and consistency
- Insufficient retention: Keep backups long enough for compliance
- Missing global objects: Remember to backup roles and tablespaces
- No monitoring: Silently failing backups are useless
Backup Automation Example
Complete Backup Script
#!/bin/bash
# PostgreSQL Backup Script
# Combines pg_dump and pg_basebackup
BACKUP_DIR="/var/backups/postgresql"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# Logical backup with pg_dump
echo "Starting logical backup..."
pg_dump -U postgres -d production \
-Fc -f "${BACKUP_DIR}/logical/prod_${DATE}.dump"
# Physical backup with pg_basebackup (weekly)
if [ $(date +%u) -eq 7 ]; then
echo "Starting weekly physical backup..."
pg_basebackup -D "${BACKUP_DIR}/physical/${DATE}" -Fp -P
fi
# Compress old backups
find "${BACKUP_DIR}/logical" -name "*.dump" -mtime +7 -exec gzip {} \;
# Delete old backups
find "${BACKUP_DIR}" -mtime +${RETENTION_DAYS} -delete
# Upload to S3 (optional)
aws s3 sync "${BACKUP_DIR}" s3://my-db-backups/ --storage-class GLACIER
echo "Backup completed successfully!"
Conclusion
Both pg_dump and pg_basebackup are essential tools in a PostgreSQL DBA's arsenal. Use pg_dump for flexibility and migrations, and pg_basebackup for speed and disaster recovery.
Quick Decision Guide
Choose pg_dump if you need:
- Selective table/schema backups
- Cross-version database migrations
- Human-readable backup format
- Development environment refreshes
Choose pg_basebackup if you need:
- Fast backup of large databases
- Point-in-Time Recovery (PITR)
- Standby server setup
- Complete disaster recovery solution
Remember: The best strategy uses both methods to maximize protection and flexibility!
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.
Related Articles
- Database Performance Tuning Best Practices
- Cloud Database Migration Strategies
- High Availability Database Setup
Need help with PostgreSQL optimization? Contact us for expert database consulting.