Master PostgreSQL JSON and JSONB data types to efficiently store, query, and index semi-structured data while maintaining optimal database performance.
JSON vs JSONB: Understanding the Difference
PostgreSQL offers two JSON data types with distinct characteristics:
| Feature | JSON | JSONB |
|---|---|---|
| Storage | Text (as-is) | Binary (parsed) |
| Write Speed | Faster | Slower (parsing overhead) |
| Read Speed | Slower (re-parsing) | Faster |
| Indexing | Limited | Full GIN support |
| Duplicate Keys | Preserved | Last value wins |
| Key Order | Preserved | Not preserved |
| Whitespace | Preserved | Removed |
💡 Rule of Thumb: Use JSONB for most use cases. Use JSON only when you need to preserve exact formatting or duplicate keys.
When to Use JSON/JSONB
✅ Good Use Cases
- Flexible schemas - User preferences, configurations
- External API data - Store third-party responses
- Event payloads - Logging, audit trails
- Document storage - When NoSQL-like flexibility is needed
- Metadata fields - Extensible attributes
⚠️ Avoid When
- Data is highly structured and relational
- Frequent updates to specific JSON fields
- Heavy aggregation on JSON values
- Data requires foreign key relationships
Creating Tables with JSONB
Basic Table Structure
-- Create table with JSONB column
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
sku VARCHAR(50) UNIQUE NOT NULL,
attributes JSONB DEFAULT '{}',
metadata JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert sample data
INSERT INTO products (name, sku, attributes, metadata) VALUES
('Laptop Pro 15', 'LP-15-2025',
'{"brand": "TechCorp", "specs": {"ram": 16, "storage": 512, "cpu": "i7"}, "tags": ["electronics", "computers"]}',
'{"source": "import", "version": 2}'),
('Wireless Mouse', 'WM-100',
'{"brand": "PeripheralCo", "specs": {"dpi": 1600, "buttons": 5, "wireless": true}, "tags": ["electronics", "accessories"]}',
'{"source": "manual", "version": 1}'),
('USB-C Hub', 'UCH-7P',
'{"brand": "TechCorp", "specs": {"ports": 7, "power_delivery": true}, "tags": ["electronics", "accessories", "usb"]}',
'{"source": "api", "version": 3}');
JSONB Operators and Functions
Extraction Operators
-- -> returns JSONB (preserves type)
SELECT attributes -> 'brand' FROM products;
-- Result: "TechCorp"
-- ->> returns TEXT
SELECT attributes ->> 'brand' FROM products;
-- Result: TechCorp
-- #> path extraction (returns JSONB)
SELECT attributes #> '{specs, ram}' FROM products;
-- Result: 16
-- #>> path extraction (returns TEXT)
SELECT attributes #>> '{specs, ram}' FROM products;
-- Result: "16"
Containment Operators
-- @> contains (is superset)
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';
-- <@ is contained by (is subset)
SELECT * FROM products
WHERE '{"brand": "TechCorp"}' <@ attributes;
-- ? key exists
SELECT * FROM products
WHERE attributes ? 'brand';
-- ?| any key exists
SELECT * FROM products
WHERE attributes ?| array['brand', 'manufacturer'];
-- ?& all keys exist
SELECT * FROM products
WHERE attributes ?& array['brand', 'specs'];
🔧 Advanced Query Techniques
Querying Nested Objects
-- Access nested values
SELECT
name,
attributes -> 'specs' ->> 'ram' AS ram_gb,
attributes -> 'specs' ->> 'storage' AS storage_gb
FROM products
WHERE (attributes -> 'specs' ->> 'ram')::int >= 16;
-- Filter by nested value
SELECT * FROM products
WHERE attributes #>> '{specs, cpu}' = 'i7';
-- Multiple conditions on nested data
SELECT * FROM products
WHERE attributes @> '{"specs": {"wireless": true}}'
AND attributes -> 'specs' ->> 'dpi' IS NOT NULL;
Querying Arrays in JSONB
-- Check if array contains value
SELECT * FROM products
WHERE attributes -> 'tags' ? 'electronics';
-- Alternative: use @> with array
SELECT * FROM products
WHERE attributes @> '{"tags": ["usb"]}';
-- Get array length
SELECT
name,
jsonb_array_length(attributes -> 'tags') AS tag_count
FROM products;
-- Unnest array elements
SELECT
p.name,
tag.value AS tag
FROM products p,
jsonb_array_elements_text(p.attributes -> 'tags') AS tag;
🚀 Indexing JSONB for Performance
GIN Index (General Inverted Index)
-- Create GIN index for containment queries
CREATE INDEX idx_products_attributes ON products
USING GIN (attributes);
-- This enables fast @>, ?, ?|, ?& queries
EXPLAIN ANALYZE
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';
GIN with jsonb_path_ops
-- More compact index for @> queries only
CREATE INDEX idx_products_attributes_path ON products
USING GIN (attributes jsonb_path_ops);
-- Smaller index, faster for containment queries
-- Does NOT support ?, ?|, ?& operators
💡 Performance Tip: Use
jsonb_path_opswhen you only need@>containment queries - the index is 2-3x smaller.
Expression Indexes for Specific Paths
-- B-tree index on specific JSON path
CREATE INDEX idx_products_brand ON products
((attributes ->> 'brand'));
-- Enables fast equality searches
SELECT * FROM products
WHERE attributes ->> 'brand' = 'TechCorp';
-- Index on nested value (cast to proper type)
CREATE INDEX idx_products_ram ON products
(((attributes #>> '{specs, ram}')::int));
-- Enables range queries
SELECT * FROM products
WHERE (attributes #>> '{specs, ram}')::int > 8;
Partial Indexes
-- Index only products with specific attributes
CREATE INDEX idx_products_wireless ON products
USING GIN (attributes)
WHERE attributes @> '{"specs": {"wireless": true}}';
JSONB Modification Operations
Updating JSONB Values
-- Update entire JSONB column
UPDATE products
SET attributes = '{"brand": "NewBrand", "specs": {"ram": 32}}'
WHERE id = 1;
-- Concatenate/merge JSONB (|| operator)
UPDATE products
SET attributes = attributes || '{"new_field": "value"}'
WHERE id = 1;
-- Set nested value using jsonb_set
UPDATE products
SET attributes = jsonb_set(
attributes,
'{specs, ram}',
'32'::jsonb
)
WHERE id = 1;
-- Set deeply nested value (create path if not exists)
UPDATE products
SET attributes = jsonb_set(
attributes,
'{specs, display, resolution}',
'"4K"'::jsonb,
true -- create_missing
)
WHERE id = 1;
Removing JSONB Elements
-- Remove key using - operator
UPDATE products
SET attributes = attributes - 'old_field'
WHERE id = 1;
-- Remove nested key using #-
UPDATE products
SET attributes = attributes #- '{specs, old_spec}'
WHERE id = 1;
-- Remove array element by index
UPDATE products
SET attributes = attributes #- '{tags, 0}' -- Remove first tag
WHERE id = 1;
-- Remove array element by value
UPDATE products
SET attributes = jsonb_set(
attributes,
'{tags}',
(SELECT jsonb_agg(elem)
FROM jsonb_array_elements(attributes -> 'tags') elem
WHERE elem != '"electronics"')
)
WHERE id = 1;
Insert Into Arrays
-- Append to array
UPDATE products
SET attributes = jsonb_set(
attributes,
'{tags}',
(attributes -> 'tags') || '["new_tag"]'::jsonb
)
WHERE id = 1;
-- Insert at specific position
UPDATE products
SET attributes = jsonb_insert(
attributes,
'{tags, 0}',
'"first_tag"'::jsonb
)
WHERE id = 1;
📊 Performance Optimization Strategies
Denormalization Patterns
-- Extract frequently queried fields to columns
ALTER TABLE products ADD COLUMN brand VARCHAR(100);
ALTER TABLE products ADD COLUMN ram_gb INTEGER;
-- Create trigger to keep in sync
CREATE OR REPLACE FUNCTION sync_product_fields()
RETURNS TRIGGER AS $$
BEGIN
NEW.brand := NEW.attributes ->> 'brand';
NEW.ram_gb := (NEW.attributes #>> '{specs, ram}')::integer;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_sync_product_fields
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION sync_product_fields();
-- Now you can use regular B-tree indexes
CREATE INDEX idx_products_brand_col ON products (brand);
CREATE INDEX idx_products_ram_col ON products (ram_gb);
Query Optimization Tips
-- Bad: Casting in WHERE clause (no index usage)
SELECT * FROM products
WHERE (attributes ->> 'price')::decimal > 100;
-- Good: Use containment with indexed column
SELECT * FROM products
WHERE attributes @> '{"category": "electronics"}'
AND (attributes ->> 'price')::decimal > 100;
-- Better: Create expression index on price
CREATE INDEX idx_products_price ON products
(((attributes ->> 'price')::decimal));
Monitoring JSONB Index Usage
-- Check if indexes are being used
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';
-- View index statistics
SELECT
indexrelname,
idx_scan,
idx_tup_read,
idx_tup_fetch,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'products';
Working with JSON Path (PostgreSQL 12+)
JSONPath Expressions
-- Basic path query
SELECT jsonb_path_query(attributes, '$.specs.ram')
FROM products;
-- Filter with path query
SELECT * FROM products
WHERE jsonb_path_exists(attributes, '$.specs.ram ? (@ > 8)');
-- Get all matching elements
SELECT jsonb_path_query_array(
attributes,
'$.tags[*] ? (@ like_regex "^elec")'
)
FROM products;
JSONPath Operators
-- Comparison operators
SELECT * FROM products
WHERE jsonb_path_exists(
attributes,
'$.specs.storage ? (@ >= 256 && @ <= 1024)'
);
-- Pattern matching
SELECT * FROM products
WHERE jsonb_path_exists(
attributes,
'$.brand ? (@ like_regex "Tech" flag "i")'
);
-- Arithmetic in path
SELECT jsonb_path_query(
attributes,
'$.specs.ram.double() * 1024' -- Convert GB to MB
)
FROM products;
Aggregation and Reporting
Building JSON from Queries
-- Aggregate rows into JSON array
SELECT jsonb_agg(
jsonb_build_object(
'name', name,
'brand', attributes ->> 'brand'
)
) AS products_json
FROM products
WHERE attributes @> '{"brand": "TechCorp"}';
-- Create key-value object from rows
SELECT jsonb_object_agg(sku, name) AS sku_name_map
FROM products;
Extracting Statistics
-- Count by JSON field value
SELECT
attributes ->> 'brand' AS brand,
COUNT(*) AS product_count
FROM products
GROUP BY attributes ->> 'brand';
-- Average of nested numeric value
SELECT
AVG((attributes #>> '{specs, ram}')::int) AS avg_ram
FROM products
WHERE attributes #>> '{specs, ram}' IS NOT NULL;
-- Distinct values from JSON field
SELECT DISTINCT attributes ->> 'brand' AS brands
FROM products
ORDER BY 1;
✅ Best Practices
Schema Design
- Don't overuse JSONB - Use relational columns for frequently queried, indexed data
- Define default values -
DEFAULT '{}'prevents NULL handling issues - Validate JSON structure - Use CHECK constraints or application validation
- Document your schema - JSONB flexibility can lead to inconsistency
-- Add CHECK constraint for required keys
ALTER TABLE products ADD CONSTRAINT chk_attributes_brand
CHECK (attributes ? 'brand');
-- Validate JSON structure with function
CREATE OR REPLACE FUNCTION validate_product_attributes(attr JSONB)
RETURNS BOOLEAN AS $$
BEGIN
RETURN attr ? 'brand'
AND attr ? 'specs'
AND jsonb_typeof(attr -> 'tags') = 'array';
END;
$$ LANGUAGE plpgsql IMMUTABLE;
ALTER TABLE products ADD CONSTRAINT chk_attributes_valid
CHECK (validate_product_attributes(attributes));
Indexing Strategy
-- 1. GIN index for containment queries (most common)
CREATE INDEX idx_jsonb_gin ON products USING GIN (attributes);
-- 2. Expression indexes for specific path queries
CREATE INDEX idx_jsonb_brand ON products ((attributes ->> 'brand'));
-- 3. Partial indexes for filtered queries
CREATE INDEX idx_jsonb_premium ON products USING GIN (attributes)
WHERE (attributes ->> 'price')::decimal > 1000;
Query Performance
-- ✅ Good: Uses GIN index
SELECT * FROM products WHERE attributes @> '{"brand": "Tech"}';
-- ✅ Good: Uses expression index
SELECT * FROM products WHERE attributes ->> 'brand' = 'Tech';
-- ⚠️ Slow: No index support for LIKE on JSONB
SELECT * FROM products WHERE attributes ->> 'brand' LIKE '%Tech%';
-- ✅ Better: Use pg_trgm for text search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_jsonb_brand_trgm ON products
USING GIN ((attributes ->> 'brand') gin_trgm_ops);
⚠️ Common Pitfalls
Pitfall 1: Ignoring NULL vs Missing Keys
-- These behave differently!
SELECT * FROM products WHERE attributes ->> 'color' IS NULL;
-- Returns rows where 'color' key doesn't exist OR value is null
SELECT * FROM products WHERE NOT (attributes ? 'color');
-- Returns only rows where 'color' key doesn't exist
SELECT * FROM products WHERE attributes -> 'color' = 'null'::jsonb;
-- Returns rows where 'color' explicitly equals JSON null
Pitfall 2: Type Coercion Issues
-- Wrong: Comparing text to number
SELECT * FROM products
WHERE attributes ->> 'price' > 100; -- String comparison!
-- Correct: Cast to proper type
SELECT * FROM products
WHERE (attributes ->> 'price')::decimal > 100;
Pitfall 3: Update Performance
-- Slow: Full JSONB rewrite for each update
UPDATE products
SET attributes = jsonb_set(attributes, '{views}',
((attributes ->> 'views')::int + 1)::text::jsonb)
WHERE id = 1;
-- Better: Use separate counter column for frequent updates
ALTER TABLE products ADD COLUMN view_count INTEGER DEFAULT 0;
UPDATE products SET view_count = view_count + 1 WHERE id = 1;
Performance Benchmarks
Index Size Comparison
| Index Type | Size (1M rows) | Query Time |
|---|---|---|
| No index | - | 850ms |
| GIN (default) | 180 MB | 2.1ms |
| GIN (jsonb_path_ops) | 85 MB | 1.8ms |
| Expression B-tree | 45 MB | 0.5ms |
Query Performance by Operation
| Operation | Without Index | With GIN | With Expression |
|---|---|---|---|
@> containment | 850ms | 2ms | N/A |
? key exists | 750ms | 1.5ms | N/A |
->> equality | 900ms | 650ms | 0.5ms |
| Nested path | 1200ms | 800ms | 1ms |
Conclusion
PostgreSQL's JSONB provides powerful semi-structured data capabilities with excellent performance when properly indexed. By understanding operators, indexing strategies, and best practices, you can leverage JSONB effectively while maintaining query performance.
Key Takeaways
- ✅ Use JSONB over JSON for most use cases
- ✅ Create GIN indexes for containment queries
- ✅ Use expression indexes for specific path queries
- ✅ Denormalize frequently accessed fields to columns
- ✅ Validate JSONB structure with constraints
- ⚠️ Avoid storing highly relational data in JSONB
- ⚠️ Watch for type coercion issues in queries
About the Author
Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.
Related Articles
- PostgreSQL Index Types: When to Use Each
- PostgreSQL Array and HSTORE Usage
- PostgreSQL Full-Text Search Implementation
Need help with PostgreSQL optimization? Contact us for expert database consulting.