query-optimizer

Optimize SQL queries for better performance through indexing, rewriting, and analysis.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "query-optimizer" with this command: npx skills add armanzeroeight/fastagent-plugins/armanzeroeight-fastagent-plugins-query-optimizer

Query Optimizer

Optimize SQL queries for better performance through indexing, rewriting, and analysis.

Quick Start

Use EXPLAIN to analyze queries, add indexes on WHERE/JOIN columns, avoid SELECT *, limit results.

Instructions

Query Analysis with EXPLAIN

Basic EXPLAIN:

EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';

EXPLAIN ANALYZE (actual execution):

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

Key metrics to check:

  • Seq Scan (bad) vs Index Scan (good)

  • Rows: Estimated vs actual

  • Cost: Lower is better

  • Execution time

Common Performance Issues

  1. Missing Indexes

Problem:

-- Seq Scan on users (cost=0.00..1234.56) SELECT * FROM users WHERE email = 'user@example.com';

Solution:

CREATE INDEX idx_users_email ON users(email); -- Now: Index Scan using idx_users_email

**2. SELECT ***

Problem:

SELECT * FROM posts; -- Fetches all columns

Solution:

SELECT id, title, created_at FROM posts; -- Only needed columns

  1. N+1 Queries

Problem:

-- Fetches posts SELECT * FROM posts; -- Then for each post: SELECT * FROM users WHERE id = ?;

Solution:

-- Single query with JOIN SELECT posts.*, users.name FROM posts JOIN users ON posts.user_id = users.id;

  1. No LIMIT

Problem:

SELECT * FROM posts ORDER BY created_at DESC; -- Returns all rows

Solution:

SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;

Indexing Strategies

Single column index:

CREATE INDEX idx_users_email ON users(email);

Composite index (order matters):

-- For: WHERE user_id = ? AND created_at > ? CREATE INDEX idx_posts_user_created ON posts(user_id, created_at);

Covering index (includes all needed columns):

-- For: SELECT id, title FROM posts WHERE user_id = ? CREATE INDEX idx_posts_user_id_title ON posts(user_id) INCLUDE (title);

Partial index (filtered):

CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

Index on expressions:

CREATE INDEX idx_users_lower_email ON users(LOWER(email)); -- For: WHERE LOWER(email) = 'user@example.com'

Query Rewriting

Use EXISTS instead of IN for large sets:

-- Slow SELECT * FROM users WHERE id IN (SELECT user_id FROM posts);

-- Faster SELECT * FROM users u WHERE EXISTS ( SELECT 1 FROM posts p WHERE p.user_id = u.id );

Use JOIN instead of subquery:

-- Slow SELECT * FROM posts WHERE user_id IN ( SELECT id FROM users WHERE is_active = true );

-- Faster SELECT p.* FROM posts p JOIN users u ON p.user_id = u.id WHERE u.is_active = true;

Avoid functions on indexed columns:

-- Bad: Can't use index SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- Good: Can use index SELECT * FROM users WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

Use UNION ALL instead of UNION:

-- Slow: Removes duplicates SELECT id FROM posts UNION SELECT id FROM drafts;

-- Fast: No duplicate removal SELECT id FROM posts UNION ALL SELECT id FROM drafts;

JOIN Optimization

Order matters - filter early:

-- Bad: Large intermediate result SELECT * FROM posts p JOIN users u ON p.user_id = u.id WHERE p.created_at > '2024-01-01';

-- Good: Filter first SELECT * FROM posts p WHERE p.created_at > '2024-01-01' JOIN users u ON p.user_id = u.id;

Use appropriate JOIN type:

-- INNER JOIN: Only matching rows SELECT * FROM posts p INNER JOIN users u ON p.user_id = u.id;

-- LEFT JOIN: All posts, even without user SELECT * FROM posts p LEFT JOIN users u ON p.user_id = u.id;

Index JOIN columns:

CREATE INDEX idx_posts_user_id ON posts(user_id); CREATE INDEX idx_users_id ON users(id); -- Usually PK already indexed

Pagination Optimization

Offset pagination (slow for large offsets):

-- Slow for page 1000 SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 20000;

Cursor pagination (faster):

-- First page SELECT * FROM posts ORDER BY created_at DESC, id DESC LIMIT 20;

-- Next page (using last created_at and id) SELECT * FROM posts WHERE (created_at, id) < ('2024-01-01 12:00:00', 12345) ORDER BY created_at DESC, id DESC LIMIT 20;

Aggregation Optimization

Use indexes for GROUP BY:

CREATE INDEX idx_posts_user_id ON posts(user_id);

SELECT user_id, COUNT(*) FROM posts GROUP BY user_id;

Filter before aggregating:

-- Good SELECT user_id, COUNT(*) FROM posts WHERE created_at > '2024-01-01' GROUP BY user_id;

Use HAVING for aggregate filters:

SELECT user_id, COUNT() as post_count FROM posts GROUP BY user_id HAVING COUNT() > 10;

Subquery Optimization

Correlated subqueries (slow):

-- Bad: Runs subquery for each row SELECT * FROM users u WHERE (SELECT COUNT(*) FROM posts WHERE user_id = u.id) > 10;

JOIN instead:

-- Good: Single query SELECT u.* FROM users u JOIN ( SELECT user_id, COUNT() as post_count FROM posts GROUP BY user_id HAVING COUNT() > 10 ) p ON u.id = p.user_id;

Caching Strategies

Materialized views:

CREATE MATERIALIZED VIEW user_post_counts AS SELECT user_id, COUNT(*) as post_count FROM posts GROUP BY user_id;

-- Refresh periodically REFRESH MATERIALIZED VIEW user_post_counts;

Query result caching (application level):

Cache expensive queries

@cache(ttl=300) def get_popular_posts(): return db.query("SELECT * FROM posts ORDER BY views DESC LIMIT 10")

Common Patterns

Full-text Search

PostgreSQL:

-- Add tsvector column ALTER TABLE posts ADD COLUMN search_vector tsvector;

-- Update with trigger CREATE INDEX idx_posts_search ON posts USING GIN(search_vector);

-- Search SELECT * FROM posts WHERE search_vector @@ to_tsquery('postgresql & optimization');

Use dedicated search engine for complex needs:

  • Elasticsearch

  • Algolia

  • Meilisearch

Batch Operations

Bulk insert:

-- Bad: Multiple inserts INSERT INTO users (name) VALUES ('User 1'); INSERT INTO users (name) VALUES ('User 2');

-- Good: Single insert INSERT INTO users (name) VALUES ('User 1'), ('User 2'), ('User 3');

Bulk update:

-- Use CASE for conditional updates UPDATE posts SET status = CASE WHEN views > 1000 THEN 'popular' WHEN views > 100 THEN 'normal' ELSE 'new' END;

Connection Pooling

Use connection pool

from sqlalchemy import create_engine

engine = create_engine( 'postgresql://user:pass@localhost/db', pool_size=20, max_overflow=10 )

Performance Monitoring

Check slow queries:

-- PostgreSQL: Enable slow query log ALTER DATABASE mydb SET log_min_duration_statement = 1000; -- 1 second

-- View pg_stat_statements SELECT query, calls, total_time, mean_time FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;

Check index usage:

SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_user_indexes WHERE idx_scan = 0 -- Unused indexes ORDER BY pg_relation_size(indexrelid) DESC;

Check table statistics:

SELECT schemaname, tablename, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch FROM pg_stat_user_tables ORDER BY seq_scan DESC;

Best Practices

Always:

  • Use EXPLAIN ANALYZE for slow queries

  • Index foreign keys

  • Index WHERE/JOIN columns

  • Limit result sets

  • Use prepared statements

Avoid:

  • SELECT *

  • Functions on indexed columns in WHERE

  • Correlated subqueries

  • Large OFFSET values

  • Over-indexing

Monitor:

  • Slow query log

  • Index usage

  • Table statistics

  • Connection pool

Troubleshooting

Query still slow after indexing:

  • Check if index is being used (EXPLAIN)

  • Verify index column order for composite indexes

  • Consider covering index

  • Check for stale statistics (ANALYZE table)

Too many indexes:

  • Remove unused indexes

  • Combine similar indexes

  • Monitor write performance

High memory usage:

  • Reduce work_mem

  • Optimize sort operations

  • Use streaming instead of loading all data

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

gcp-cost-optimizer

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

schema-designer

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

api-documentation-generator

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

aws-cost-optimizer

No summary provided by upstream source.

Repository SourceNeeds Review