Name: EVE AI Core
Author: Jamaurice Holt

Master PostgreSQL JSON and JSONB data types to efficiently store, query, and index semi-structured data while maintaining optimal database performance.

JSON vs JSONB: Understanding the Difference

PostgreSQL offers two JSON data types with distinct characteristics:

Feature	JSON	JSONB
Storage	Text (as-is)	Binary (parsed)
Write Speed	Faster	Slower (parsing overhead)
Read Speed	Slower (re-parsing)	Faster
Indexing	Limited	Full GIN support
Duplicate Keys	Preserved	Last value wins
Key Order	Preserved	Not preserved
Whitespace	Preserved	Removed

💡 Rule of Thumb: Use JSONB for most use cases. Use JSON only when you need to preserve exact formatting or duplicate keys.

When to Use JSON/JSONB

✅ Good Use Cases

Flexible schemas - User preferences, configurations
External API data - Store third-party responses
Event payloads - Logging, audit trails
Document storage - When NoSQL-like flexibility is needed
Metadata fields - Extensible attributes

⚠️ Avoid When

Data is highly structured and relational
Frequent updates to specific JSON fields
Heavy aggregation on JSON values
Data requires foreign key relationships

Creating Tables with JSONB

Basic Table Structure

-- Create table with JSONB column
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    sku VARCHAR(50) UNIQUE NOT NULL,
    attributes JSONB DEFAULT '{}',
    metadata JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Insert sample data
INSERT INTO products (name, sku, attributes, metadata) VALUES
('Laptop Pro 15', 'LP-15-2025',
 '{"brand": "TechCorp", "specs": {"ram": 16, "storage": 512, "cpu": "i7"}, "tags": ["electronics", "computers"]}',
 '{"source": "import", "version": 2}'),

('Wireless Mouse', 'WM-100',
 '{"brand": "PeripheralCo", "specs": {"dpi": 1600, "buttons": 5, "wireless": true}, "tags": ["electronics", "accessories"]}',
 '{"source": "manual", "version": 1}'),

('USB-C Hub', 'UCH-7P',
 '{"brand": "TechCorp", "specs": {"ports": 7, "power_delivery": true}, "tags": ["electronics", "accessories", "usb"]}',
 '{"source": "api", "version": 3}');

JSONB Operators and Functions

Extraction Operators

-- -> returns JSONB (preserves type)
SELECT attributes -> 'brand' FROM products;
-- Result: "TechCorp"

-- ->> returns TEXT
SELECT attributes ->> 'brand' FROM products;
-- Result: TechCorp

-- #> path extraction (returns JSONB)
SELECT attributes #> '{specs, ram}' FROM products;
-- Result: 16

-- #>> path extraction (returns TEXT)
SELECT attributes #>> '{specs, ram}' FROM products;
-- Result: "16"

Containment Operators

-- @> contains (is superset)
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';

-- <@ is contained by (is subset)
SELECT * FROM products
WHERE '{"brand": "TechCorp"}' <@ attributes;

-- ? key exists
SELECT * FROM products
WHERE attributes ? 'brand';

-- ?| any key exists
SELECT * FROM products
WHERE attributes ?| array['brand', 'manufacturer'];

-- ?& all keys exist
SELECT * FROM products
WHERE attributes ?& array['brand', 'specs'];

🔧 Advanced Query Techniques

Querying Nested Objects

-- Access nested values
SELECT
    name,
    attributes -> 'specs' ->> 'ram' AS ram_gb,
    attributes -> 'specs' ->> 'storage' AS storage_gb
FROM products
WHERE (attributes -> 'specs' ->> 'ram')::int >= 16;

-- Filter by nested value
SELECT * FROM products
WHERE attributes #>> '{specs, cpu}' = 'i7';

-- Multiple conditions on nested data
SELECT * FROM products
WHERE attributes @> '{"specs": {"wireless": true}}'
  AND attributes -> 'specs' ->> 'dpi' IS NOT NULL;

Querying Arrays in JSONB

-- Check if array contains value
SELECT * FROM products
WHERE attributes -> 'tags' ? 'electronics';

-- Alternative: use @> with array
SELECT * FROM products
WHERE attributes @> '{"tags": ["usb"]}';

-- Get array length
SELECT
    name,
    jsonb_array_length(attributes -> 'tags') AS tag_count
FROM products;

-- Unnest array elements
SELECT
    p.name,
    tag.value AS tag
FROM products p,
     jsonb_array_elements_text(p.attributes -> 'tags') AS tag;

🚀 Indexing JSONB for Performance

GIN Index (General Inverted Index)

-- Create GIN index for containment queries
CREATE INDEX idx_products_attributes ON products
USING GIN (attributes);

-- This enables fast @>, ?, ?|, ?& queries
EXPLAIN ANALYZE
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';

GIN with jsonb_path_ops

-- More compact index for @> queries only
CREATE INDEX idx_products_attributes_path ON products
USING GIN (attributes jsonb_path_ops);

-- Smaller index, faster for containment queries
-- Does NOT support ?, ?|, ?& operators

💡 Performance Tip: Use jsonb_path_ops when you only need @> containment queries - the index is 2-3x smaller.

Expression Indexes for Specific Paths

-- B-tree index on specific JSON path
CREATE INDEX idx_products_brand ON products
    ((attributes ->> 'brand'));

-- Enables fast equality searches
SELECT * FROM products
WHERE attributes ->> 'brand' = 'TechCorp';

-- Index on nested value (cast to proper type)
CREATE INDEX idx_products_ram ON products
    (((attributes #>> '{specs, ram}')::int));

-- Enables range queries
SELECT * FROM products
WHERE (attributes #>> '{specs, ram}')::int > 8;

Partial Indexes

-- Index only products with specific attributes
CREATE INDEX idx_products_wireless ON products
USING GIN (attributes)
WHERE attributes @> '{"specs": {"wireless": true}}';

JSONB Modification Operations

Updating JSONB Values

-- Update entire JSONB column
UPDATE products
SET attributes = '{"brand": "NewBrand", "specs": {"ram": 32}}'
WHERE id = 1;

-- Concatenate/merge JSONB (|| operator)
UPDATE products
SET attributes = attributes || '{"new_field": "value"}'
WHERE id = 1;

-- Set nested value using jsonb_set
UPDATE products
SET attributes = jsonb_set(
    attributes,
    '{specs, ram}',
    '32'::jsonb
)
WHERE id = 1;

-- Set deeply nested value (create path if not exists)
UPDATE products
SET attributes = jsonb_set(
    attributes,
    '{specs, display, resolution}',
    '"4K"'::jsonb,
    true  -- create_missing
)
WHERE id = 1;

Removing JSONB Elements

-- Remove key using - operator
UPDATE products
SET attributes = attributes - 'old_field'
WHERE id = 1;

-- Remove nested key using #-
UPDATE products
SET attributes = attributes #- '{specs, old_spec}'
WHERE id = 1;

-- Remove array element by index
UPDATE products
SET attributes = attributes #- '{tags, 0}'  -- Remove first tag
WHERE id = 1;

-- Remove array element by value
UPDATE products
SET attributes = jsonb_set(
    attributes,
    '{tags}',
    (SELECT jsonb_agg(elem)
     FROM jsonb_array_elements(attributes -> 'tags') elem
     WHERE elem != '"electronics"')
)
WHERE id = 1;

Insert Into Arrays

-- Append to array
UPDATE products
SET attributes = jsonb_set(
    attributes,
    '{tags}',
    (attributes -> 'tags') || '["new_tag"]'::jsonb
)
WHERE id = 1;

-- Insert at specific position
UPDATE products
SET attributes = jsonb_insert(
    attributes,
    '{tags, 0}',
    '"first_tag"'::jsonb
)
WHERE id = 1;

📊 Performance Optimization Strategies

Denormalization Patterns

-- Extract frequently queried fields to columns
ALTER TABLE products ADD COLUMN brand VARCHAR(100);
ALTER TABLE products ADD COLUMN ram_gb INTEGER;

-- Create trigger to keep in sync
CREATE OR REPLACE FUNCTION sync_product_fields()
RETURNS TRIGGER AS $$
BEGIN
    NEW.brand := NEW.attributes ->> 'brand';
    NEW.ram_gb := (NEW.attributes #>> '{specs, ram}')::integer;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_product_fields
    BEFORE INSERT OR UPDATE ON products
    FOR EACH ROW EXECUTE FUNCTION sync_product_fields();

-- Now you can use regular B-tree indexes
CREATE INDEX idx_products_brand_col ON products (brand);
CREATE INDEX idx_products_ram_col ON products (ram_gb);

Query Optimization Tips

-- Bad: Casting in WHERE clause (no index usage)
SELECT * FROM products
WHERE (attributes ->> 'price')::decimal > 100;

-- Good: Use containment with indexed column
SELECT * FROM products
WHERE attributes @> '{"category": "electronics"}'
  AND (attributes ->> 'price')::decimal > 100;

-- Better: Create expression index on price
CREATE INDEX idx_products_price ON products
    (((attributes ->> 'price')::decimal));

Monitoring JSONB Index Usage

-- Check if indexes are being used
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM products
WHERE attributes @> '{"brand": "TechCorp"}';

-- View index statistics
SELECT
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'products';

Working with JSON Path (PostgreSQL 12+)

JSONPath Expressions

-- Basic path query
SELECT jsonb_path_query(attributes, '$.specs.ram')
FROM products;

-- Filter with path query
SELECT * FROM products
WHERE jsonb_path_exists(attributes, '$.specs.ram ? (@ > 8)');

-- Get all matching elements
SELECT jsonb_path_query_array(
    attributes,
    '$.tags[*] ? (@ like_regex "^elec")'
)
FROM products;

JSONPath Operators

-- Comparison operators
SELECT * FROM products
WHERE jsonb_path_exists(
    attributes,
    '$.specs.storage ? (@ >= 256 && @ <= 1024)'
);

-- Pattern matching
SELECT * FROM products
WHERE jsonb_path_exists(
    attributes,
    '$.brand ? (@ like_regex "Tech" flag "i")'
);

-- Arithmetic in path
SELECT jsonb_path_query(
    attributes,
    '$.specs.ram.double() * 1024'  -- Convert GB to MB
)
FROM products;

Aggregation and Reporting

Building JSON from Queries

-- Aggregate rows into JSON array
SELECT jsonb_agg(
    jsonb_build_object(
        'name', name,
        'brand', attributes ->> 'brand'
    )
) AS products_json
FROM products
WHERE attributes @> '{"brand": "TechCorp"}';

-- Create key-value object from rows
SELECT jsonb_object_agg(sku, name) AS sku_name_map
FROM products;

Extracting Statistics

-- Count by JSON field value
SELECT
    attributes ->> 'brand' AS brand,
    COUNT(*) AS product_count
FROM products
GROUP BY attributes ->> 'brand';

-- Average of nested numeric value
SELECT
    AVG((attributes #>> '{specs, ram}')::int) AS avg_ram
FROM products
WHERE attributes #>> '{specs, ram}' IS NOT NULL;

-- Distinct values from JSON field
SELECT DISTINCT attributes ->> 'brand' AS brands
FROM products
ORDER BY 1;

✅ Best Practices

Schema Design

Don't overuse JSONB - Use relational columns for frequently queried, indexed data
Define default values - DEFAULT '{}' prevents NULL handling issues
Validate JSON structure - Use CHECK constraints or application validation
Document your schema - JSONB flexibility can lead to inconsistency

-- Add CHECK constraint for required keys
ALTER TABLE products ADD CONSTRAINT chk_attributes_brand
    CHECK (attributes ? 'brand');

-- Validate JSON structure with function
CREATE OR REPLACE FUNCTION validate_product_attributes(attr JSONB)
RETURNS BOOLEAN AS $$
BEGIN
    RETURN attr ? 'brand'
       AND attr ? 'specs'
       AND jsonb_typeof(attr -> 'tags') = 'array';
END;
$$ LANGUAGE plpgsql IMMUTABLE;

ALTER TABLE products ADD CONSTRAINT chk_attributes_valid
    CHECK (validate_product_attributes(attributes));

Indexing Strategy

-- 1. GIN index for containment queries (most common)
CREATE INDEX idx_jsonb_gin ON products USING GIN (attributes);

-- 2. Expression indexes for specific path queries
CREATE INDEX idx_jsonb_brand ON products ((attributes ->> 'brand'));

-- 3. Partial indexes for filtered queries
CREATE INDEX idx_jsonb_premium ON products USING GIN (attributes)
WHERE (attributes ->> 'price')::decimal > 1000;

Query Performance

-- ✅ Good: Uses GIN index
SELECT * FROM products WHERE attributes @> '{"brand": "Tech"}';

-- ✅ Good: Uses expression index
SELECT * FROM products WHERE attributes ->> 'brand' = 'Tech';

-- ⚠️ Slow: No index support for LIKE on JSONB
SELECT * FROM products WHERE attributes ->> 'brand' LIKE '%Tech%';

-- ✅ Better: Use pg_trgm for text search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_jsonb_brand_trgm ON products
    USING GIN ((attributes ->> 'brand') gin_trgm_ops);

⚠️ Common Pitfalls

Pitfall 1: Ignoring NULL vs Missing Keys

-- These behave differently!
SELECT * FROM products WHERE attributes ->> 'color' IS NULL;
-- Returns rows where 'color' key doesn't exist OR value is null

SELECT * FROM products WHERE NOT (attributes ? 'color');
-- Returns only rows where 'color' key doesn't exist

SELECT * FROM products WHERE attributes -> 'color' = 'null'::jsonb;
-- Returns rows where 'color' explicitly equals JSON null

Pitfall 2: Type Coercion Issues

-- Wrong: Comparing text to number
SELECT * FROM products
WHERE attributes ->> 'price' > 100;  -- String comparison!

-- Correct: Cast to proper type
SELECT * FROM products
WHERE (attributes ->> 'price')::decimal > 100;

Pitfall 3: Update Performance

-- Slow: Full JSONB rewrite for each update
UPDATE products
SET attributes = jsonb_set(attributes, '{views}',
    ((attributes ->> 'views')::int + 1)::text::jsonb)
WHERE id = 1;

-- Better: Use separate counter column for frequent updates
ALTER TABLE products ADD COLUMN view_count INTEGER DEFAULT 0;
UPDATE products SET view_count = view_count + 1 WHERE id = 1;

Performance Benchmarks

Index Size Comparison

Index Type	Size (1M rows)	Query Time
No index	-	850ms
GIN (default)	180 MB	2.1ms
GIN (jsonb_path_ops)	85 MB	1.8ms
Expression B-tree	45 MB	0.5ms

Query Performance by Operation

Operation	Without Index	With GIN	With Expression
`@>` containment	850ms	2ms	N/A
`?` key exists	750ms	1.5ms	N/A
`->>` equality	900ms	650ms	0.5ms
Nested path	1200ms	800ms	1ms

Conclusion

PostgreSQL's JSONB provides powerful semi-structured data capabilities with excellent performance when properly indexed. By understanding operators, indexing strategies, and best practices, you can leverage JSONB effectively while maintaining query performance.

Key Takeaways

✅ Use JSONB over JSON for most use cases
✅ Create GIN indexes for containment queries
✅ Use expression indexes for specific path queries
✅ Denormalize frequently accessed fields to columns
✅ Validate JSONB structure with constraints
⚠️ Avoid storing highly relational data in JSONB
⚠️ Watch for type coercion issues in queries

About the Author

Jamaurice Holt is a Cloud Database Expert and AWS Solutions Architect with over 10 years of experience in database optimization, high availability, and cloud migrations. Specializing in PostgreSQL, SQL optimization, and enterprise database solutions.

Need help with PostgreSQL optimization? Contact us for expert database consulting.