Master PostgreSQL Full-Text Search (FTS) to build powerful, fast, and flexible search functionality directly in your database without external search engines like Elasticsearch.
Why PostgreSQL Full-Text Search?
PostgreSQL's built-in full-text search provides enterprise-grade search capabilities without additional infrastructure complexity:
💡 Key Advantage: Keep your search logic close to your data, reducing latency and eliminating synchronization issues with external search engines.
Benefits of Native FTS
| Feature | PostgreSQL FTS | External Search (Elasticsearch) |
|---|---|---|
| Setup Complexity | Simple | Complex |
| Data Sync | Real-time | Requires pipeline |
| Infrastructure | Single database | Additional servers |
| Cost | Included | Additional licensing |
| Consistency | ACID compliant | Eventually consistent |
| Learning Curve | SQL-based | New query language |
Core FTS Concepts
Understanding tsvector and tsquery
PostgreSQL FTS relies on two fundamental data types:
tsvector: Document Representation
A tsvector is a sorted list of normalized lexemes (words) with position information:
-- Convert text to tsvector
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
Result:
'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
💡 Notice: Common words like "the" and "over" are removed (stop words), and words are normalized ("jumps" becomes "jump", "lazy" becomes "lazi").
tsquery: Search Query Representation
A tsquery represents the search terms and operators:
-- Convert search terms to tsquery
SELECT to_tsquery('english', 'quick & fox');
Result:
'quick' & 'fox'
Setting Up Full-Text Search
Step 1: Create a Sample Table
-- Create articles table for demonstration
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
author VARCHAR(100),
published_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
category VARCHAR(50),
tags TEXT[]
);
-- Insert sample data
INSERT INTO articles (title, content, author, category, tags) VALUES
('Introduction to PostgreSQL',
'PostgreSQL is a powerful, open source object-relational database system with over 35 years of active development.',
'John Smith', 'Database', ARRAY['postgresql', 'database', 'sql']),
('Advanced SQL Queries',
'Learn advanced SQL techniques including window functions, CTEs, and recursive queries for complex data analysis.',
'Jane Doe', 'SQL', ARRAY['sql', 'advanced', 'queries']),
('Database Performance Tuning',
'Optimize your PostgreSQL database performance with indexing strategies, query optimization, and configuration tuning.',
'John Smith', 'Performance', ARRAY['performance', 'tuning', 'optimization']);
Step 2: Add Full-Text Search Column
-- Add a generated tsvector column for efficient searching
ALTER TABLE articles
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(content, '')), 'B') ||
setweight(to_tsvector('english', coalesce(author, '')), 'C')
) STORED;
🔧 Configuration Note: We use
setweight()to assign different importance levels:
- A (highest): Title matches
- B: Content matches
- C: Author matches
- D (lowest): Other fields
Step 3: Create GIN Index
-- Create GIN index for fast full-text search
CREATE INDEX idx_articles_search ON articles USING GIN(search_vector);
-- Verify index creation
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'articles';
🚀 Performance Tip: GIN (Generalized Inverted Index) indexes are optimized for full-text search and provide significant performance improvements for large tables.
Basic Search Queries
Simple Text Search
-- Search for articles containing "postgresql"
SELECT id, title, author
FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgresql');
Multi-Term Search
-- Search for articles containing both "sql" AND "queries"
SELECT id, title, author
FROM articles
WHERE search_vector @@ to_tsquery('english', 'sql & queries');
-- Search for articles containing "sql" OR "database"
SELECT id, title, author
FROM articles
WHERE search_vector @@ to_tsquery('english', 'sql | database');
-- Search for articles containing "database" but NOT "performance"
SELECT id, title, author
FROM articles
WHERE search_vector @@ to_tsquery('english', 'database & !performance');
Advanced Search Techniques
Phrase Search
-- Search for exact phrase "PostgreSQL database"
SELECT id, title
FROM articles
WHERE search_vector @@ phraseto_tsquery('english', 'PostgreSQL database');
-- Phrase with distance (words within 2 positions of each other)
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgresql <2> database');
Prefix Search (Autocomplete)
-- Search for words starting with "post"
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'post:*');
-- Combine prefix with other terms
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'post:* & sql');
Ranking Search Results
Using ts_rank
-- Rank search results by relevance
SELECT
id,
title,
ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgresql | database') query
WHERE search_vector @@ query
ORDER BY rank DESC;
Using ts_rank_cd (Cover Density)
-- Cover density ranking considers proximity of matching terms
SELECT
id,
title,
ts_rank_cd(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'sql & queries') query
WHERE search_vector @@ query
ORDER BY rank DESC;
Weighted Ranking
-- Custom weights for different fields
SELECT
id,
title,
ts_rank(
'{0.1, 0.2, 0.4, 1.0}', -- D, C, B, A weights
search_vector,
query
) AS rank
FROM articles, to_tsquery('english', 'postgresql') query
WHERE search_vector @@ query
ORDER BY rank DESC;
💡 Tip: The weight array
{0.1, 0.2, 0.4, 1.0}means:
- D weight: 0.1
- C weight: 0.2
- B weight: 0.4
- A weight: 1.0 (highest priority for title matches)
Highlighting Search Results
Using ts_headline
-- Highlight matching terms in content
SELECT
id,
title,
ts_headline(
'english',
content,
to_tsquery('english', 'postgresql'),
'StartSel=<mark>, StopSel=</mark>, MaxWords=50, MinWords=25'
) AS highlighted_content
FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgresql');
Custom Highlight Options
-- Advanced highlighting with more options
SELECT
title,
ts_headline(
'english',
content,
query,
'StartSel=<b class="highlight">,
StopSel=</b>,
MaxWords=35,
MinWords=15,
ShortWord=3,
MaxFragments=3,
FragmentDelimiter=" ... "'
) AS snippet
FROM articles, to_tsquery('english', 'database & performance') query
WHERE search_vector @@ query;
Text Search Configurations
Available Configurations
-- List all available text search configurations
SELECT cfgname FROM pg_ts_config;
Common configurations:
english- English languagesimple- No language-specific processingspanish,french,german- Other languages
Creating Custom Configuration
-- Create custom configuration based on English
CREATE TEXT SEARCH CONFIGURATION my_english (COPY = english);
-- Add custom dictionary for technical terms
CREATE TEXT SEARCH DICTIONARY tech_terms (
TEMPLATE = simple,
STOPWORDS = english
);
-- Alter configuration to use custom dictionary
ALTER TEXT SEARCH CONFIGURATION my_english
ALTER MAPPING FOR word, asciiword
WITH tech_terms, english_stem;
🔧 Practical Implementation Examples
Building a Search Function
-- Create a reusable search function
CREATE OR REPLACE FUNCTION search_articles(
search_query TEXT,
result_limit INTEGER DEFAULT 10,
result_offset INTEGER DEFAULT 0
)
RETURNS TABLE(
id INTEGER,
title VARCHAR,
author VARCHAR,
snippet TEXT,
rank REAL
) AS $$
BEGIN
RETURN QUERY
SELECT
a.id,
a.title,
a.author,
ts_headline('english', a.content, query,
'MaxWords=50, MinWords=25, StartSel=<mark>, StopSel=</mark>'
) AS snippet,
ts_rank(a.search_vector, query) AS rank
FROM articles a, plainto_tsquery('english', search_query) query
WHERE a.search_vector @@ query
ORDER BY rank DESC
LIMIT result_limit
OFFSET result_offset;
END;
$$ LANGUAGE plpgsql;
-- Use the search function
SELECT * FROM search_articles('postgresql database', 5, 0);
Fuzzy Search with Trigrams
-- Enable pg_trgm extension for fuzzy matching
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Create trigram index
CREATE INDEX idx_articles_title_trgm ON articles
USING GIN(title gin_trgm_ops);
-- Fuzzy search combining FTS and trigrams
SELECT
id,
title,
similarity(title, 'postgre') AS sim_score,
ts_rank(search_vector, to_tsquery('english', 'postgresql')) AS fts_rank
FROM articles
WHERE title % 'postgre' -- Trigram similarity
OR search_vector @@ to_tsquery('english', 'postgresql')
ORDER BY sim_score DESC, fts_rank DESC;
📊 Monitoring and Performance
Analyze Search Performance
-- Explain search query execution
EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgresql & database');
Check Index Usage
-- Verify GIN index is being used
SELECT
indexrelname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE indexrelname = 'idx_articles_search';
Monitor FTS Statistics
-- Check tsvector statistics
SELECT
tablename,
attname,
n_distinct,
most_common_vals
FROM pg_stats
WHERE tablename = 'articles'
AND attname = 'search_vector';
✅ Best Practices
Do's
- Use generated columns for tsvector to keep search data in sync
- Create GIN indexes on tsvector columns for large tables
- Weight important fields (title > content > metadata)
- Use plainto_tsquery for user input to handle special characters
- Combine with trigrams for typo tolerance
- Limit results and use pagination for large result sets
Don'ts
- Don't skip indexing - unindexed FTS is very slow
- Don't use raw user input in to_tsquery (use plainto_tsquery)
- Don't ignore NULL handling - use COALESCE
- Don't forget ANALYZE after bulk data loads
⚠️ Common Pitfalls
Handling Special Characters
-- Safe way to handle user input with special characters
SELECT * FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'C++ programming');
-- websearch_tsquery for web-like search syntax
SELECT * FROM articles
WHERE search_vector @@ websearch_tsquery('english', '"exact phrase" OR alternative');
NULL Value Handling
-- Always use COALESCE to handle NULL values
SELECT to_tsvector('english', COALESCE(title, '') || ' ' || COALESCE(content, ''));
🚀 Performance Optimization
Index Maintenance
-- Reindex if index becomes bloated
REINDEX INDEX idx_articles_search;
-- Check index size
SELECT pg_size_pretty(pg_relation_size('idx_articles_search'));
Partial Indexes
-- Create partial index for published articles only
CREATE INDEX idx_published_articles_search
ON articles USING GIN(search_vector)
WHERE published_at IS NOT NULL;
Query Optimization
-- Use covering index for common queries
CREATE INDEX idx_articles_search_covering
ON articles USING GIN(search_vector)
INCLUDE (id, title, author);
Comparison: FTS vs LIKE vs Regex
| Feature | Full-Text Search | LIKE | Regular Expression |
|---|---|---|---|
| Performance | Excellent (indexed) | Poor | Poor |
| Language Support | Yes (stemming) | No | No |
| Ranking | Built-in | No | No |
| Typo Tolerance | With trigrams | No | Possible |
| Phrase Search | Yes | Yes | Yes |
| Complexity | Medium | Simple | Complex |
Conclusion
PostgreSQL Full-Text Search provides a powerful, integrated solution for building search functionality. By leveraging tsvector, tsquery, GIN indexes, and ranking functions, you can create sophisticated search experiences without external dependencies.
Key Takeaways
- ✅ Use generated tsvector columns for automatic synchronization
- ✅ Create GIN indexes for optimal performance
- ✅ Apply field weighting to prioritize important content
- ✅ Combine with pg_trgm for fuzzy matching
- ✅ Use ts_headline for user-friendly result snippets
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.
Related Articles
- PostgreSQL Index Types: When to Use Each
- PostgreSQL Query Planner Deep Dive
- PostgreSQL Performance Tuning Configuration
Need help with PostgreSQL optimization? Contact us for expert database consulting.