zero-downtime-migration

Zero-Downtime Migration ()

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "zero-downtime-migration" with this command: npx skills add yonatangross/orchestkit/yonatangross-orchestkit-zero-downtime-migration

Zero-Downtime Migration ()

Database migration patterns that ensure continuous service availability during schema changes.

Overview

  • Deploying schema changes to production systems with uptime requirements

  • Renaming or removing columns without breaking existing application code

  • Adding NOT NULL constraints to existing columns with data

  • Creating indexes on large tables without locking

  • Migrating data between columns or tables during live traffic

  • Using pgroll for automated expand-contract migrations

Quick Reference

Expand-Contract Overview

┌─────────────────────────────────────────────────────────────────────────┐ │ EXPAND-CONTRACT PATTERN │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ Phase 1: EXPAND Phase 2: MIGRATE Phase 3: CONTRACT│ │ ───────────────── ────────────────── ──────────────── │ │ Add new column Backfill data Remove old column │ │ (nullable) Update app to use new (after app migrated)│ │ Both versions work │ │ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ │old_col │ ───────────────>│old_col │ ─────────────> │new_col │ │ │ │ │ │new_col │ │ │ │ │ └─────────┘ └─────────┘ └─────────┘ │ │ │ │ Rollback: Drop new Rollback: Use old Rollback: N/A │ │ (dual-write in app) (commit) │ │ │ └─────────────────────────────────────────────────────────────────────────┘

pgroll: Automated Expand-Contract

Install pgroll ( recommended tool)

brew install xataio/pgroll/pgroll

or

go install github.com/xataio/pgroll@latest

Initialize pgroll in your database

pgroll init --postgres-url "postgres://user:pass@localhost/db"

Create a migration file (migrations/001_add_email_verified.json)

{ "name": "001_add_email_verified", "operations": [ { "add_column": { "table": "users", "column": { "name": "email_verified", "type": "boolean", "default": "false", "nullable": false }, "up": "false" } } ] }

Start migration (creates versioned schema)

pgroll start migrations/001_add_email_verified.json

App v1 uses: schema "public_001_add_email_verified"

App v2 uses: schema "public" (new version)

After verification, complete migration

pgroll complete

Rollback if issues

pgroll rollback

Manual Expand Phase (Add New)

-- Step 1: Add new column (nullable, no default constraint yet) ALTER TABLE users ADD COLUMN display_name VARCHAR(200);

-- Step 2: Create trigger for dual-write (if app can't dual-write) CREATE OR REPLACE FUNCTION sync_display_name() RETURNS TRIGGER AS $$ BEGIN NEW.display_name := CONCAT(NEW.first_name, ' ', NEW.last_name); RETURN NEW; END; $$ LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_display_name BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE FUNCTION sync_display_name();

-- Step 3: Backfill existing data (in batches) UPDATE users SET display_name = CONCAT(first_name, ' ', last_name) WHERE display_name IS NULL AND id IN (SELECT id FROM users WHERE display_name IS NULL LIMIT 1000);

Manual Contract Phase (Remove Old)

-- Step 1: Verify no readers of old column (check query logs) SELECT * FROM pg_stat_statements WHERE query LIKE '%first_name%' OR query LIKE '%last_name%';

-- Step 2: Drop trigger (if used) DROP TRIGGER IF EXISTS trg_sync_display_name ON users; DROP FUNCTION IF EXISTS sync_display_name();

-- Step 3: Drop old columns ONLY after app fully migrated ALTER TABLE users DROP COLUMN first_name; ALTER TABLE users DROP COLUMN last_name;

-- Step 4: Make new column NOT NULL if required ALTER TABLE users ALTER COLUMN display_name SET NOT NULL;

CONCURRENTLY Index Creation

-- Create index without locking table (PostgreSQL) CREATE INDEX CONCURRENTLY idx_orders_customer_date ON orders (customer_id, created_at DESC);

-- Drop index without locking (if recreation needed) DROP INDEX CONCURRENTLY IF EXISTS idx_orders_customer_date;

-- IMPORTANT: CONCURRENTLY cannot run inside transaction block -- Run outside of Alembic transaction or use raw connection

NOT VALID Constraint Pattern

-- Step 1: Add constraint without validating existing rows (instant) ALTER TABLE orders ADD CONSTRAINT chk_amount_positive CHECK (amount > 0) NOT VALID;

-- Step 2: Validate constraint (scans table but allows writes) ALTER TABLE orders VALIDATE CONSTRAINT chk_amount_positive;

Foreign Key Safe Addition

-- Step 1: Add FK without validation (instant) ALTER TABLE orders ADD CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users(id) NOT VALID;

-- Step 2: Validate FK (scans but allows writes) ALTER TABLE orders VALIDATE CONSTRAINT fk_orders_user;

Key Decisions

Decision Recommendation Rationale

Tool choice pgroll for automation Handles dual-writes via triggers automatically

Column Rename Add new + copy + drop old Direct RENAME blocks reads

Constraint Timing Add NOT VALID first, VALIDATE separately NOT VALID is non-blocking

Rollback Window Keep old schema 24-72 hours Allows safe rollback if issues

Backfill Batch Size 1000-10000 rows per batch Prevents lock escalation

Index Strategy CONCURRENTLY always Standard CREATE INDEX locks table

Verification Check pg_stat_statements Ensure no queries use old columns

Monitoring During Migration

-- Check for locks during migration SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state FROM pg_stat_activity WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes' AND state != 'idle';

-- Check replication lag (if using replicas) SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn, (sent_lsn - replay_lsn) AS replication_lag FROM pg_stat_replication;

-- Monitor backfill progress SELECT COUNT() FILTER (WHERE display_name IS NOT NULL) as migrated, COUNT() FILTER (WHERE display_name IS NULL) as remaining, ROUND(100.0 * COUNT() FILTER (WHERE display_name IS NOT NULL) / COUNT(), 2) as pct_complete FROM users;

Anti-Patterns (FORBIDDEN)

-- FORBIDDEN: Single-step ALTER that locks table ALTER TABLE users RENAME COLUMN name TO full_name; -- Impact: Blocks ALL queries during metadata lock

-- FORBIDDEN: Add NOT NULL to existing column directly ALTER TABLE orders ADD COLUMN org_id UUID NOT NULL; -- Impact: Fails immediately if table has data

-- FORBIDDEN: Regular CREATE INDEX on large table CREATE INDEX idx_big_table_col ON big_table(col); -- Impact: Locks table for minutes/hours

-- FORBIDDEN: Drop column without verification period ALTER TABLE users DROP COLUMN legacy_field; -- Impact: No rollback if application still references it

-- FORBIDDEN: Constraint validation in same transaction as creation ALTER TABLE orders ADD CONSTRAINT fk_org FOREIGN KEY (org_id) REFERENCES orgs(id); -- Impact: Full table scan with exclusive lock

-- FORBIDDEN: Backfill without batching UPDATE users SET new_col = old_col; -- Impact: Locks entire table, fills transaction log

Related Skills

  • alembic-migrations

  • Python migration framework with expand-contract support

  • database-schema-designer

  • Schema design patterns and normalization principles

  • database-versioning

  • Version control and change management for schemas

Capability Details

expand-contract

Keywords: expand contract, zero downtime, online migration, safe deploy, pgroll Solves:

  • How do I rename a column without downtime?

  • Safe production schema changes

  • Rolling deployments with schema changes

online-index

Keywords: concurrent index, non-blocking index, large table index Solves:

  • Create index without locking

  • Index creation on production

  • PostgreSQL CONCURRENTLY pattern

constraint-migration

Keywords: not valid constraint, foreign key migration, check constraint safe Solves:

  • Add constraints without downtime

  • Foreign key on existing data

  • Validate constraints safely

pgroll-automation

Keywords: pgroll, versioned schema, automatic dual-write, schema versioning Solves:

  • Automate expand-contract pattern

  • Multiple app versions during migration

  • Automatic rollback support

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

responsive-patterns

No summary provided by upstream source.

Repository SourceNeeds Review
General

domain-driven-design

No summary provided by upstream source.

Repository SourceNeeds Review
General

dashboard-patterns

No summary provided by upstream source.

Repository SourceNeeds Review
General

rag-retrieval

No summary provided by upstream source.

Repository SourceNeeds Review