postgres-optimization

Unconventional PostgreSQL optimization techniques

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

To install, copy this command and send it to your AI assistant:

npx skills add simhacker/moollm/simhacker-moollm-postgres-optimization

🐘 PostgreSQL Optimization

"Beyond 'just add an index' — creative solutions for real performance problems."

Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.

Purpose

When conventional approaches fall short — query rewrites, adding indexes, VACUUM, ANALYZE — these techniques offer creative solutions:

  • Eliminate impossible query scans with constraint exclusion
  • Reduce index size with function-based indexes
  • Enforce uniqueness with hash indexes instead of B-Trees

When to Use

  • Ad-hoc query environments where users make mistakes
  • Large indexes approaching table size
  • Uniqueness constraints on large text values (URLs, documents)
  • Timestamp columns queried at coarser granularity

Technique 1: Constraint Exclusion

The Problem

Check constraints prevent invalid data, but PostgreSQL doesn't use them to optimize queries by default.

CREATE TABLE users (
    id INT PRIMARY KEY,
    username TEXT NOT NULL,
    plan TEXT NOT NULL,
    CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);

An analyst writes:

SELECT * FROM users WHERE plan = 'Pro';  -- Note: capital P

Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.

The Solution

SET constraint_exclusion TO 'on';

With constraint exclusion enabled:

EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
Result  (cost=0.00..0.00 rows=0 width=0)
  One-Time Filter: false
Execution Time: 0.008 ms

PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.
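Pruning is not limited to IN lists: the planner refutes any predicate that contradicts a CHECK constraint. A minimal sketch on a hypothetical payments table (assumes constraint_exclusion is 'on'; the table and constraint names are illustrative, not from this skill):

```sql
-- Hypothetical table; the CHECK makes negative amounts impossible
CREATE TABLE payments (
    id INT PRIMARY KEY,
    amount INT NOT NULL,
    CONSTRAINT amount_positive CHECK (amount >= 0)
);

-- Contradicts the constraint, so the plan collapses to
-- Result ... One-Time Filter: false (no table scan)
EXPLAIN SELECT * FROM payments WHERE amount < 0;
```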

When to Enable

Environment            Recommendation
OLTP production        Leave as 'partition' (default)
BI / Data Warehouse    Set to 'on'
Ad-hoc query tools     Set to 'on'
Reporting databases    Set to 'on'
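These recommendations can be applied without a global change: SET is session-local, and the setting can be pinned per role or per database so OLTP traffic keeps the default. A sketch (the role and database names are hypothetical):

```sql
-- Current session only
SET constraint_exclusion TO 'on';

-- Every future session of a reporting role (hypothetical role name)
ALTER ROLE reporting_user SET constraint_exclusion TO 'on';

-- Every future session against a BI database (hypothetical database name)
ALTER DATABASE analytics SET constraint_exclusion TO 'on';
```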

Tradeoffs

  • Benefit: Eliminates impossible query scans
  • Cost: Extra planning overhead evaluating constraints against conditions
  • Default: 'partition' — only used for partition pruning

Technique 2: Function-Based Indexes for Lower Cardinality

The Problem

You have a sales table with timestamps:

CREATE TABLE sale (
    id INT PRIMARY KEY,
    sold_at TIMESTAMPTZ NOT NULL,
    charged INT NOT NULL
);

Analysts query by day:

SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;

You add a B-Tree index on sold_at — 214 MB for a 160 MB table. The index is larger than the table itself!

The Solution

Index only what queries need:

CREATE INDEX sale_sold_at_date_ix 
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);

Index                              Size
sale_sold_at_ix (full timestamp)   214 MB
sale_sold_at_date_ix (date only)   66 MB

The function-based index is 3x smaller because:

  • Dates are 4 bytes vs 8 bytes for timestamptz
  • Fewer distinct values enable deduplication

The Discipline Problem

Function-based indexes require exact expression match:

-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

Solution: Virtual Generated Columns (PostgreSQL 18+)

ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);

Now queries use the virtual column:

SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;

Benefits:

  • Smaller index
  • Faster queries
  • No discipline required — column guarantees correct expression
  • No ambiguity about timezones

Limitation: PostgreSQL 18 doesn't support indexes directly on virtual columns (yet).
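If an indexable generated column is needed before that limitation is lifted, one workaround is a STORED generated column (supported since PostgreSQL 12), which materializes the value on write and can be indexed; the column and index names below are illustrative:

```sql
-- STORED writes the computed date into the heap, so a plain index works
ALTER TABLE sale ADD sold_at_day DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date) STORED;

CREATE INDEX sale_sold_at_day_ix ON sale (sold_at_day);
```

The tradeoff is extra heap storage per row, which a virtual column avoids.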


Technique 3: Hash Index for Uniqueness

The Problem

You have a table with large URLs:

CREATE TABLE urls (
    id INT PRIMARY KEY,
    url TEXT NOT NULL,
    data JSON
);

You add a unique B-Tree index:

CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);

Object         Size
Table          160 MB
B-Tree index   154 MB

The index is almost as large as the table because B-Tree stores actual values in leaf blocks.

The Solution

Use an exclusion constraint with a hash index:

ALTER TABLE urls 
ADD CONSTRAINT urls_url_unique_hash 
EXCLUDE USING HASH (url WITH =);

Index    Size
B-Tree   154 MB
Hash     32 MB

The hash index is 5x smaller because it stores hash values, not the actual URLs.

Uniqueness Is Enforced

INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint

Queries Still Fast

EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms  -- Faster than B-Tree's 0.046 ms!

Limitations

Feature                     B-Tree Unique   Hash Exclusion
Foreign key reference       ✓               ✗
ON CONFLICT (column)        ✓               ✗
ON CONFLICT ON CONSTRAINT   ✓               ✓ (DO NOTHING only)
ON CONFLICT DO UPDATE       ✓               ✗
MERGE                       ✓               ✓

Workaround: Use MERGE

Instead of INSERT ... ON CONFLICT DO UPDATE:

MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);

Quick Reference

Diagnostic Queries

Check index sizes:

\di+ table_*

Compare index to table size:

SELECT 
    relname AS name,
    pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class 
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;
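A variation on the query above flags indexes that approach their table's size — the symptom both Technique 2 and Technique 3 target. This is a sketch against the standard pg_stat_user_indexes view (replace 'your_table' with your table name):

```sql
-- Index size as a percentage of its table's size
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       round(100.0 * pg_relation_size(indexrelid)
                   / NULLIF(pg_relation_size(relid), 0), 1) AS pct_of_table
FROM pg_stat_user_indexes
WHERE relname = 'your_table'
ORDER BY pg_relation_size(indexrelid) DESC;
```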

Check constraint_exclusion setting:

SHOW constraint_exclusion;

Decision Tree

Is the query scanning for a condition a constraint rules out?
├── Yes → Enable constraint_exclusion
└── No
    ↓
Is index nearly as large as table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine

Commands

Command             Action
ANALYZE [table]     Analyze query performance
CHECK-CONSTRAINTS   Evaluate constraint exclusion opportunity
LOWER-CARDINALITY   Find function-based index opportunities
HASH-UNIQUE         Evaluate hash index for large values
COMPARE-INDEXES     Compare index sizes and performance

Integration

Skill               Relationship
debugging           Query debugging leads here
plan-then-execute   Systematic optimization

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
