db-seed

Database Seed Generator

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "db-seed" with this command: npx skills add jezweb/claude-skills/jezweb-claude-skills-db-seed

Database Seed Generator

Generate seed scripts that populate databases with realistic, domain-appropriate sample data. Reads your schema and produces ready-to-run seed files.

Workflow

  1. Find the Schema

Scan the project for schema definitions:

Source Location pattern

Drizzle schema src/db/schema.ts , src/schema/*.ts , db/schema.ts

D1 migrations drizzle/.sql , migrations/.sql

Raw SQL schema.sql , db/*.sql

Prisma prisma/schema.prisma

Read all schema files. Build a mental model of:

  • Tables and their columns

  • Data types and constraints (NOT NULL, UNIQUE, DEFAULT)

  • Foreign key relationships (which tables reference which)

  • JSON fields stored as TEXT (common in D1/SQLite)

  1. Determine Seed Parameters

Ask the user:

Parameter Options Default

Purpose dev, demo, testing dev

Volume small (5-10 rows/table), medium (20-50), large (100+) small

Domain context "e-commerce store", "SaaS app", "blog", etc. Infer from schema

Output format TypeScript (Drizzle), raw SQL, or both Match project's ORM

Purpose affects data quality:

  • dev: Varied data, some edge cases (empty fields, long strings, unicode)

  • demo: Polished data that looks good in screenshots and presentations

  • testing: Systematic data covering boundary conditions, duplicates, special characters

  1. Plan Insert Order

Build a dependency graph from foreign keys. Insert parent tables before children.

Example order for a blog schema:

  1. users (no dependencies)
  2. categories (no dependencies)
  3. posts (depends on users, categories)
  4. comments (depends on users, posts)
  5. tags (no dependencies)
  6. post_tags (depends on posts, tags)

Circular dependencies: If table A references B and B references A, use nullable foreign keys and insert in two passes (insert with NULL, then UPDATE).

  1. Generate Realistic Data

Do NOT use generic placeholders like "test123", "foo@bar.com", or "Lorem ipsum". Generate data that matches the domain.

Data Generation Patterns (no external libraries needed)

Names: Use a hardcoded list of common names. Mix genders and cultural backgrounds.

const firstNames = ['Sarah', 'James', 'Priya', 'Mohammed', 'Emma', 'Wei', 'Carlos', 'Aisha']; const lastNames = ['Chen', 'Smith', 'Patel', 'Garcia', 'Kim', 'O'Brien', 'Nguyen', 'Wilson'];

Emails: Derive from names — sarah.chen@example.com . Use example.com domain (RFC 2606 reserved).

Dates: Generate within a realistic range. Use ISO 8601 format for D1/SQLite.

const randomDate = (daysBack: number) => { const d = new Date(); d.setDate(d.getDate() - Math.floor(Math.random() * daysBack)); return d.toISOString(); };

IDs: Use crypto.randomUUID() for UUIDs, or sequential integers if the schema uses auto-increment.

Deterministic seeding: For reproducible data, use a seeded PRNG:

function seededRandom(seed: number) { return () => { seed = (seed * 16807) % 2147483647; return (seed - 1) / 2147483646; }; } const rand = seededRandom(42); // Same seed = same data every time

Prices/amounts: Use realistic ranges. (rand() * 900 + 100).toFixed(2) for $1-$10 range.

Descriptions/content: Write 3-5 realistic variations per content type and cycle through them. Don't generate AI-sounding prose — write like real user data.

  1. Output Format

TypeScript (Drizzle ORM)

// scripts/seed.ts import { drizzle } from 'drizzle-orm/d1'; import * as schema from '../src/db/schema';

export async function seed(db: ReturnType<typeof drizzle>) { console.log('Seeding database...');

// Clear existing data (reverse dependency order) await db.delete(schema.comments); await db.delete(schema.posts); await db.delete(schema.users);

// Insert users const users = [ { id: crypto.randomUUID(), name: 'Sarah Chen', email: 'sarah@example.com', ... }, // ... ];

// D1 batch limit: 10 rows per INSERT for (let i = 0; i < users.length; i += 10) { await db.insert(schema.users).values(users.slice(i, i + 10)); }

// Insert posts (references users) const posts = [ { id: crypto.randomUUID(), userId: users[0].id, title: '...', ... }, // ... ];

for (let i = 0; i < posts.length; i += 10) { await db.insert(schema.posts).values(posts.slice(i, i + 10)); }

console.log(Seeded: ${users.length} users, ${posts.length} posts); }

Run with: npx tsx scripts/seed.ts

For Cloudflare Workers, add a seed endpoint (remove before production):

app.post('/api/seed', async (c) => { const db = drizzle(c.env.DB); await seed(db); return c.json({ ok: true }); });

Raw SQL (D1)

-- seed.sql -- Run: npx wrangler d1 execute DB_NAME --local --file=./scripts/seed.sql

-- Clear existing (reverse order) DELETE FROM comments; DELETE FROM posts; DELETE FROM users;

-- Users INSERT INTO users (id, name, email, created_at) VALUES ('uuid-1', 'Sarah Chen', 'sarah@example.com', '2025-01-15T10:30:00Z'), ('uuid-2', 'James Wilson', 'james@example.com', '2025-02-01T14:22:00Z');

-- Posts (max 10 rows per INSERT for D1) INSERT INTO posts (id, user_id, title, body, created_at) VALUES ('post-1', 'uuid-1', 'Getting Started', 'Welcome to...', '2025-03-01T09:00:00Z');

  1. Idempotency

Seed scripts must be safe to re-run:

// Option A: Delete-then-insert (simple, loses data) await db.delete(schema.users); await db.insert(schema.users).values(seedUsers);

// Option B: Upsert (preserves non-seed data) for (const user of seedUsers) { await db.insert(schema.users) .values(user) .onConflictDoUpdate({ target: schema.users.id, set: user }); }

Default to Option A for dev/testing, Option B for demo (where users may have added their own data).

D1-Specific Gotchas

Gotcha Solution

Max ~10 rows per INSERT Batch inserts in chunks of 10

No native BOOLEAN Use INTEGER (0/1)

No native DATETIME Use TEXT with ISO 8601 strings

JSON stored as TEXT JSON.stringify() before insert

Foreign keys always enforced Insert parent tables first

100 bound parameter limit Keep batch size × columns < 100

Quality Rules

  • Match the domain — an e-commerce seed has products with real-sounding names and prices, not "Product 1"

  • Vary the data — don't make every user "John Smith" or every price "$9.99"

  • Include edge cases (for testing seeds) — empty strings, very long text, special characters, maximum values

  • Reference real IDs — foreign keys must point to actually-inserted parent rows

  • Print what was seeded — always log counts so the user knows it worked

  • Document the run command — put it in a comment at the top of the file

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

tailwind-v4-shadcn

No summary provided by upstream source.

Repository SourceNeeds Review
2.7K-jezweb
General

tanstack-query

No summary provided by upstream source.

Repository SourceNeeds Review
2.5K-jezweb
General

fastapi

No summary provided by upstream source.

Repository SourceNeeds Review
General

zustand-state-management

No summary provided by upstream source.

Repository SourceNeeds Review
1.2K-jezweb