Master PostgreSQL's built-in statistics views to monitor database health, identify performance bottlenecks, and proactively prevent issues before they impact users.
Overview of pg_stat Views
PostgreSQL provides extensive statistics collectors that gather information about database activity. These statistics are exposed through pg_stat_* and pg_statio_* views.
📊 Key Insight: The statistics collector runs as a separate process and updates statistics asynchronously. Data may be slightly delayed but provides accurate trends.
Statistics Categories
| Category | Views | Purpose |
|---|---|---|
| Database | pg_stat_database | Overall database activity |
| Tables | pg_stat_user_tables | Table access patterns |
| Indexes | pg_stat_user_indexes | Index usage |
| I/O | pg_statio_user_tables | Buffer cache hits/misses |
| Background | pg_stat_bgwriter | Background writer stats |
| Replication | pg_stat_replication | Replication lag |
| Activity | pg_stat_activity | Current connections |
📊 Database-Level Monitoring
pg_stat_database
Monitor overall database health:
SELECT
datname AS database,
numbackends AS connections,
xact_commit AS commits,
xact_rollback AS rollbacks,
blks_read,
blks_hit,
ROUND(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_ratio,
tup_returned,
tup_fetched,
tup_inserted,
tup_updated,
tup_deleted,
conflicts,
deadlocks,
temp_files,
pg_size_pretty(temp_bytes) AS temp_bytes,
stats_reset
FROM pg_stat_database
WHERE datname = current_database();
Key Metrics to Watch
-- Cache hit ratio (should be > 99%)
SELECT
datname,
ROUND(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname NOT LIKE 'template%';
-- Transaction commit/rollback ratio
SELECT
datname,
xact_commit,
xact_rollback,
ROUND(100.0 * xact_rollback / NULLIF(xact_commit + xact_rollback, 0), 2) AS rollback_pct
FROM pg_stat_database
WHERE datname NOT LIKE 'template%';
-- Deadlock count (should be 0 or very low)
SELECT datname, deadlocks
FROM pg_stat_database
WHERE deadlocks > 0;
⚠️ Alert Thresholds:
- Cache hit ratio < 95%: Need more shared_buffers or query optimization
- Rollback ratio > 5%: Application issues
- Deadlocks > 0: Lock contention problems
📋 Table Statistics
pg_stat_user_tables
Analyze table access patterns:
SELECT
schemaname,
relname AS table_name,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
n_tup_ins AS inserts,
n_tup_upd AS updates,
n_tup_del AS deletes,
n_tup_hot_upd AS hot_updates,
n_live_tup AS live_rows,
n_dead_tup AS dead_rows,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_row_pct,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze,
vacuum_count,
autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
Tables Needing Indexes
-- Tables with high sequential scan ratio (candidates for indexes)
SELECT
schemaname || '.' || relname AS table_name,
seq_scan,
idx_scan,
ROUND(100.0 * seq_scan / NULLIF(seq_scan + idx_scan, 0), 2) AS seq_scan_pct,
n_live_tup AS rows
FROM pg_stat_user_tables
WHERE seq_scan > 1000
AND n_live_tup > 10000
AND ROUND(100.0 * seq_scan / NULLIF(seq_scan + idx_scan, 0), 2) > 50
ORDER BY seq_scan DESC;
Tables Needing Vacuum
-- Tables with high dead tuple ratio
SELECT
schemaname || '.' || relname AS table_name,
n_live_tup,
n_dead_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup, 0), 2) AS dead_pct,
last_autovacuum,
last_vacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 20;
HOT Update Ratio
-- HOT updates ratio (higher is better, reduces index maintenance)
SELECT
schemaname || '.' || relname AS table_name,
n_tup_upd AS total_updates,
n_tup_hot_upd AS hot_updates,
ROUND(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) AS hot_ratio
FROM pg_stat_user_tables
WHERE n_tup_upd > 1000
ORDER BY n_tup_upd DESC;
💡 HOT Updates: Heap-Only Tuple updates avoid index updates when indexed columns aren't modified. A low HOT ratio may indicate need for FILLFACTOR tuning.
🔍 Index Statistics
pg_stat_user_indexes
Monitor index usage:
SELECT
schemaname || '.' || relname AS table_name,
indexrelname AS index_name,
idx_scan,
idx_tup_read,
idx_tup_fetch,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
Unused Indexes
-- Indexes that haven't been used (candidates for removal)
SELECT
schemaname || '.' || relname AS table_name,
indexrelname AS index_name,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey'
AND indexrelname NOT LIKE '%_unique%'
ORDER BY pg_relation_size(indexrelid) DESC;
Duplicate Indexes
-- Find potentially duplicate indexes
SELECT
pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
(array_agg(idx))[1] AS idx1,
(array_agg(idx))[2] AS idx2,
(array_agg(idx))[3] AS idx3,
(array_agg(idx))[4] AS idx4
FROM (
SELECT
indexrelid::regclass AS idx,
(indrelid::text || E'\n' || indclass::text || E'\n' ||
indkey::text || E'\n' || coalesce(indexprs::text, '') ||
E'\n' || coalesce(indpred::text, '')) AS key
FROM pg_index
) sub
GROUP BY key
HAVING count(*) > 1
ORDER BY sum(pg_relation_size(idx)) DESC;
🔄 I/O Statistics
pg_statio_user_tables
Track buffer cache effectiveness:
SELECT
schemaname || '.' || relname AS table_name,
heap_blks_read,
heap_blks_hit,
ROUND(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS table_hit_ratio,
idx_blks_read,
idx_blks_hit,
ROUND(100.0 * idx_blks_hit / NULLIF(idx_blks_hit + idx_blks_read, 0), 2) AS index_hit_ratio,
toast_blks_read,
toast_blks_hit
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;
Tables with Poor Cache Performance
-- Tables with low cache hit ratio
SELECT
schemaname || '.' || relname AS table_name,
heap_blks_read AS disk_reads,
heap_blks_hit AS cache_hits,
ROUND(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio
FROM pg_statio_user_tables
WHERE heap_blks_read > 1000
AND ROUND(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) < 95
ORDER BY heap_blks_read DESC;
👥 Connection Monitoring
pg_stat_activity
Monitor active connections:
-- Current connection summary
SELECT
state,
COUNT(*) AS count,
MAX(EXTRACT(EPOCH FROM (now() - query_start)))::int AS max_duration_sec
FROM pg_stat_activity
WHERE pid != pg_backend_pid()
GROUP BY state;
-- Active queries with duration
SELECT
pid,
usename,
datname,
state,
query_start,
EXTRACT(EPOCH FROM (now() - query_start))::int AS duration_sec,
wait_event_type,
wait_event,
LEFT(query, 100) AS query_preview
FROM pg_stat_activity
WHERE state = 'active'
AND pid != pg_backend_pid()
ORDER BY query_start;
Long-Running Queries
-- Queries running longer than 5 minutes
SELECT
pid,
usename,
datname,
state,
EXTRACT(EPOCH FROM (now() - query_start))::int AS duration_sec,
wait_event_type,
wait_event,
query
FROM pg_stat_activity
WHERE state != 'idle'
AND query_start < now() - INTERVAL '5 minutes'
AND pid != pg_backend_pid()
ORDER BY query_start;
Blocked Queries
-- Find blocked and blocking queries
SELECT
blocked.pid AS blocked_pid,
blocked.usename AS blocked_user,
blocking.pid AS blocking_pid,
blocking.usename AS blocking_user,
blocked.query AS blocked_query,
blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
Idle Connections
-- Idle connections holding resources
SELECT
pid,
usename,
datname,
state,
state_change,
EXTRACT(EPOCH FROM (now() - state_change))::int AS idle_seconds,
query
FROM pg_stat_activity
WHERE state = 'idle'
AND state_change < now() - INTERVAL '10 minutes'
ORDER BY state_change;
🔧 Background Writer Statistics
pg_stat_bgwriter
Monitor background writer and checkpoint activity:
SELECT
checkpoints_timed,
checkpoints_req,
checkpoint_write_time / 1000 AS checkpoint_write_sec,
checkpoint_sync_time / 1000 AS checkpoint_sync_sec,
buffers_checkpoint,
buffers_clean,
maxwritten_clean,
buffers_backend,
buffers_backend_fsync,
buffers_alloc,
stats_reset
FROM pg_stat_bgwriter;
Checkpoint Analysis
-- Checkpoint frequency and performance
SELECT
checkpoints_timed AS scheduled,
checkpoints_req AS requested,
ROUND(100.0 * checkpoints_req / NULLIF(checkpoints_timed + checkpoints_req, 0), 2) AS pct_requested,
ROUND(checkpoint_write_time / NULLIF(checkpoints_timed + checkpoints_req, 0) / 1000, 2) AS avg_write_sec,
ROUND(checkpoint_sync_time / NULLIF(checkpoints_timed + checkpoints_req, 0) / 1000, 2) AS avg_sync_sec
FROM pg_stat_bgwriter;
⚠️ Warning Signs:
- High
checkpoints_req: checkpoint_completion_target too low or heavy write loadmaxwritten_clean > 0: bgwriter not keeping upbuffers_backend_fsync > 0: Backends doing fsync (bad for performance)
📡 Replication Monitoring
pg_stat_replication
Monitor streaming replication:
-- Replication status on primary
SELECT
pid,
usename,
application_name,
client_addr,
state,
sync_state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes,
pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS lag_pretty,
write_lag,
flush_lag,
replay_lag
FROM pg_stat_replication;
Replication Lag Alert
-- Check if replication lag exceeds threshold
SELECT
application_name,
client_addr,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS lag_bytes,
CASE
WHEN pg_wal_lsn_diff(sent_lsn, replay_lsn) > 100000000 THEN 'CRITICAL'
WHEN pg_wal_lsn_diff(sent_lsn, replay_lsn) > 10000000 THEN 'WARNING'
ELSE 'OK'
END AS status
FROM pg_stat_replication;
🚀 Query Performance Statistics
pg_stat_statements
Track query performance (requires extension):
-- Enable extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Top queries by total time
SELECT
LEFT(query, 80) AS query_preview,
calls,
ROUND(total_exec_time::numeric, 2) AS total_ms,
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
ROUND(stddev_exec_time::numeric, 2) AS stddev_ms,
rows,
shared_blks_hit,
shared_blks_read,
ROUND(100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Slowest Queries
-- Queries with highest average execution time
SELECT
LEFT(query, 100) AS query,
calls,
ROUND(mean_exec_time::numeric, 2) AS avg_ms,
ROUND(min_exec_time::numeric, 2) AS min_ms,
ROUND(max_exec_time::numeric, 2) AS max_ms,
rows / NULLIF(calls, 0) AS avg_rows
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC
LIMIT 20;
Most I/O Intensive Queries
-- Queries with most disk reads
SELECT
LEFT(query, 80) AS query,
calls,
shared_blks_read AS disk_reads,
shared_blks_hit AS cache_hits,
ROUND(100.0 * shared_blks_read / NULLIF(shared_blks_read + shared_blks_hit, 0), 2) AS disk_read_pct,
temp_blks_read,
temp_blks_written
FROM pg_stat_statements
WHERE shared_blks_read > 1000
ORDER BY shared_blks_read DESC
LIMIT 20;
📈 Creating a Monitoring Dashboard
Comprehensive Health Check Query
-- Database health snapshot
WITH db_stats AS (
SELECT
ROUND(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_ratio,
deadlocks,
conflicts,
temp_bytes
FROM pg_stat_database
WHERE datname = current_database()
),
connection_stats AS (
SELECT
COUNT(*) AS total_connections,
COUNT(*) FILTER (WHERE state = 'active') AS active,
COUNT(*) FILTER (WHERE state = 'idle') AS idle,
COUNT(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_tx,
MAX(EXTRACT(EPOCH FROM (now() - query_start))) FILTER (WHERE state = 'active') AS longest_query_sec
FROM pg_stat_activity
WHERE pid != pg_backend_pid()
),
table_stats AS (
SELECT
SUM(n_dead_tup) AS total_dead_tuples,
COUNT(*) FILTER (WHERE n_dead_tup > 10000) AS tables_need_vacuum
FROM pg_stat_user_tables
)
SELECT
ds.cache_hit_ratio,
ds.deadlocks,
ds.conflicts,
pg_size_pretty(ds.temp_bytes) AS temp_space_used,
cs.total_connections,
cs.active AS active_queries,
cs.idle AS idle_connections,
cs.idle_in_tx AS idle_in_transaction,
ROUND(cs.longest_query_sec::numeric, 2) AS longest_query_sec,
ts.total_dead_tuples,
ts.tables_need_vacuum
FROM db_stats ds, connection_stats cs, table_stats ts;
Automated Alert Queries
-- Create function for health alerts
CREATE OR REPLACE FUNCTION get_health_alerts()
RETURNS TABLE(
alert_type TEXT,
severity TEXT,
message TEXT
) AS $$
BEGIN
-- Cache hit ratio alert
RETURN QUERY
SELECT
'CACHE_HIT'::TEXT,
CASE WHEN ratio < 90 THEN 'CRITICAL' ELSE 'WARNING' END,
'Cache hit ratio: ' || ratio || '%'
FROM (
SELECT ROUND(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS ratio
FROM pg_stat_database WHERE datname = current_database()
) sub
WHERE ratio < 95;
-- Long running queries alert
RETURN QUERY
SELECT
'LONG_QUERY'::TEXT,
'WARNING'::TEXT,
'Query running for ' || duration || ' seconds: ' || LEFT(query, 50)
FROM (
SELECT
EXTRACT(EPOCH FROM (now() - query_start))::int AS duration,
query
FROM pg_stat_activity
WHERE state = 'active'
AND query_start < now() - INTERVAL '5 minutes'
) sub;
-- Deadlock alert
RETURN QUERY
SELECT
'DEADLOCKS'::TEXT,
'CRITICAL'::TEXT,
'Deadlocks detected: ' || deadlocks
FROM pg_stat_database
WHERE datname = current_database()
AND deadlocks > 0;
-- Bloat alert
RETURN QUERY
SELECT
'TABLE_BLOAT'::TEXT,
'WARNING'::TEXT,
'Table ' || relname || ' has ' || n_dead_tup || ' dead tuples'
FROM pg_stat_user_tables
WHERE n_dead_tup > 100000;
END;
$$ LANGUAGE plpgsql;
-- Run alerts
SELECT * FROM get_health_alerts();
🔄 Resetting Statistics
-- Reset statistics for specific table
SELECT pg_stat_reset_single_table_counters('tablename'::regclass);
-- Reset all database statistics
SELECT pg_stat_reset();
-- Reset background writer statistics
SELECT pg_stat_reset_shared('bgwriter');
-- Reset pg_stat_statements
SELECT pg_stat_statements_reset();
⚠️ Caution: Resetting statistics loses historical data. Consider exporting important metrics before reset.
✅ Best Practices
Monitoring Checklist
| Metric | Target | Frequency |
|---|---|---|
| Cache hit ratio | > 99% | Every 5 min |
| Dead tuple ratio | < 10% | Every 15 min |
| Replication lag | < 1MB | Every 1 min |
| Long queries | < 5 min | Every 1 min |
| Deadlocks | 0 | Every 5 min |
| Connection count | < 80% max | Every 1 min |
Recommended Tools
- pg_stat_statements - Query performance tracking
- auto_explain - Automatic query plan logging
- pgBadger - Log analysis and reporting
- Prometheus + Grafana - Real-time dashboards
- PgHero - Quick database insights
Conclusion
PostgreSQL's built-in statistics views provide comprehensive monitoring capabilities without external tools. By understanding and regularly checking these views, you can proactively identify and resolve performance issues.
Key Takeaways
- ✅ Monitor cache hit ratio (target > 99%)
- ✅ Track dead tuple ratio for vacuum needs
- ✅ Watch replication lag on replicas
- ✅ Identify unused indexes for cleanup
- ✅ Set up automated alerts for critical metrics
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.
Related Articles
- PostgreSQL Vacuum and Autovacuum Optimization
- PostgreSQL Query Planner Deep Dive
- PostgreSQL Performance Tuning Configuration
Need help with PostgreSQL optimization? Contact us for expert database consulting.