Image Generation

Generate, edit, and upscale images with standardized quality tiers and embedded best practices.

Quick Start

Need image?
├─ Text/Logo     → bun scripts/gen.ts "..." --text [-t tier]
├─ Photo/Art     → bun scripts/gen.ts "..." [-t tier]
├─ Edit existing → bun scripts/edit.ts <img> "..." [-t tier]
├─ Upscale       → bun scripts/upscale.ts <img> [-t tier]
├─ Vectorize     → bun scripts/svg.ts <img> ($0.01/img)
└─ Remove BG     → bun scripts/rembg.ts <img> (FREE)

Tier selection:
├─ iterate → FREE drafts (~96/day via Cloudflare)
├─ default → Daily driver ($0.008/MP)
├─ premium → Final assets ($0.03/MP)
└─ max     → Critical work, SOTA ($0.06-0.07/MP)
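To make the per-megapixel prices concrete, here is a minimal cost estimator. It is a sketch, not part of the skill's scripts: `estimateCost` and `RATE_PER_MP` are hypothetical names, and it assumes 1 MP means 1,000,000 pixels and that billing is strictly proportional to pixel count.

```typescript
type Tier = "iterate" | "default" | "premium" | "max";

// USD per megapixel as [low, high], copied from the tier list above.
// "max" is quoted as a range, so both ends are kept.
const RATE_PER_MP: Record<Tier, [number, number]> = {
  iterate: [0, 0],
  default: [0.008, 0.008],
  premium: [0.03, 0.03],
  max: [0.06, 0.07],
};

// Returns the [low, high] estimated cost in USD for `count` images.
// Assumption: 1 MP = 1,000,000 pixels, no rounding or minimum charge.
function estimateCost(
  tier: Tier,
  width: number,
  height: number,
  count: number = 1,
): [number, number] {
  const megapixels = (width * height) / 1e6;
  const [low, high] = RATE_PER_MP[tier];
  return [low * megapixels * count, high * megapixels * count];
}

const premiumBatch = estimateCost("premium", 1024, 1024, 5);
console.log(`5 premium images at 1024x1024: about $${premiumBatch[0].toFixed(4)}`);
```

The iterate tier always returns zero here; its real limit is the daily quota, not price.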

Entry Points

| Script | Purpose |
| --- | --- |
| bun scripts/gen.ts | Text → Image |
| bun scripts/edit.ts | Image + Instruction → Image |
| bun scripts/upscale.ts | Image → Larger Image |
| bun scripts/svg.ts | Image → SVG ($0.01/img) |
| bun scripts/rembg.ts | Remove background (FREE) |

Prompting Best Practices

CRITICAL: Good prompts are the difference between unusable output and production-ready assets.

The Universal Prompt Structure

[Subject] + [Action/Pose] + [Environment] + [Style/Medium] + [Lighting] + [Camera/Composition]
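One way to keep prompts in this order is to assemble them from named parts. The sketch below is illustrative only: `PromptParts` and `buildPrompt` are hypothetical names, and the choice to merge subject, action, and environment into one sentence is an assumption, not something the scripts require.

```typescript
interface PromptParts {
  subject: string;
  action?: string;
  environment?: string;
  style?: string;
  lighting?: string;
  camera?: string;
}

// Builds a prompt in the order Subject, Action, Environment, Style,
// Lighting, Camera. Subject/action/environment form the first sentence;
// each remaining part becomes its own sentence. Empty parts are skipped.
function buildPrompt(parts: PromptParts): string {
  // Trim and drop trailing periods so sentences are not double-terminated.
  const clean = (s?: string): string => (s ?? "").trim().replace(/[.\s]+$/, "");
  const scene = [parts.subject, parts.action, parts.environment]
    .map(clean)
    .filter((s) => s.length > 0)
    .join(" ");
  const sentences = [scene, clean(parts.style), clean(parts.lighting), clean(parts.camera)]
    .filter((s) => s.length > 0);
  return sentences.map((s) => s + ".").join(" ");
}

const owlPrompt = buildPrompt({
  subject: "A cybernetic owl",
  action: "perched on a neon sign",
  environment: "in a rain-soaked alley",
  lighting: "Cinematic lighting with teal and orange highlights",
  camera: "Shot on 35mm film, shallow depth of field, hyper-detailed textures",
});
console.log(owlPrompt);
```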

Example:

"A cybernetic owl perched on a neon sign in a rain-soaked alley. Cinematic lighting with teal and orange highlights. Shot on 35mm film, shallow depth of field, hyper-detailed textures."

DO: Effective Prompting

| Technique | Example |
| --- | --- |
| Be specific | "middle-aged man with salt-and-pepper hair wearing charcoal turtleneck" NOT "a man" |
| Describe the result | "person with clear eyes" NOT "remove glasses" |
| Use camera terms | "Shot on Hasselblad, 85mm lens, f/1.8" |
| Specify lighting | "golden hour rim lighting with deep shadows" |
| Include textures | "weathered sandstone", "anodized aluminum", "iridescent silk" |

DON'T: Common Mistakes

| Mistake | Problem | Fix |
| --- | --- | --- |
| Negative phrasing | "no glasses" often adds glasses | Describe what IS there |
| Vague subjects | AI interprets randomly | Be exhaustively specific |
| Keyword salad | "4k, trending, masterpiece" is noise | Use descriptive sentences |
| Short prompts | Prompts under 20 words underperform | Aim for 40-80 words |
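The mistakes above are mechanical enough to catch before spending a generation. The following is a rough heuristic sketch, assuming simple word matching is good enough; `lintPrompt` is a hypothetical helper, and the negative-word and noise-term lists are taken from or modeled on the examples in this section, not on any official list.

```typescript
interface LintWarning {
  rule: "short-prompt" | "negative-phrasing" | "keyword-salad";
  message: string;
}

// Noise terms quoted in the table above.
const NOISE_TERMS = ["4k", "trending", "masterpiece"];

function lintPrompt(prompt: string): LintWarning[] {
  const warnings: LintWarning[] = [];

  // Short prompts: the guidance above flags anything under 20 words.
  const words = prompt.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length < 20) {
    warnings.push({ rule: "short-prompt", message: `${words.length} words; aim for 40-80` });
  }

  // Negative phrasing: a small, assumed list of trigger words.
  const negative = prompt.match(/\b(no|without|remove)\b/i);
  if (negative) {
    warnings.push({
      rule: "negative-phrasing",
      message: `"${negative[0]}" found; describe what IS there instead`,
    });
  }

  // Keyword salad: any of the noise terms present.
  const lower = prompt.toLowerCase();
  const noise = NOISE_TERMS.filter((term) => lower.indexOf(term) !== -1);
  if (noise.length > 0) {
    warnings.push({ rule: "keyword-salad", message: `noise terms: ${noise.join(", ")}` });
  }
  return warnings;
}

console.log(lintPrompt("portrait of a man, no glasses, 4k, masterpiece"));
```

A check like this will produce false positives (for example "no" inside quoted text), so treat its output as a prompt for review, not a gate.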

Style Keywords That Work

| Category | Keywords |
| --- | --- |
| Lighting | golden hour, volumetric lighting, Rembrandt lighting, neon rim light, bioluminescent |
| Camera | 35mm anamorphic, macro photography, tilt-shift, fisheye, drone shot |
| Style | cinematic, photorealistic, concept art, ukiyo-e, baroque, impressionist |
| Quality | hyper-detailed, sharp focus, 8k resolution, raytraced |

Text & Logo Generation (--text flag)

Uses Recraft V3 (iterate/default) or Ideogram V3 (premium/max) - specialized for typography.

Text Prompting Rules

CRITICAL: Put text in "Double Quotes" at the START of your prompt.

Correct - text first, then describe

bun scripts/gen.ts '"QUANTUM" in bold futuristic font, metallic silver, dark space background' --text

Wrong - text buried in description

bun scripts/gen.ts 'A logo with the word QUANTUM on it' --text
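The text-first rule can be enforced when prompts are built programmatically. This is a minimal sketch with hypothetical helper names (`buildTextPrompt`, `startsWithQuotedText`); stripping embedded double quotes from the literal text is an assumption made to keep the quoting unambiguous.

```typescript
// Places the literal text, in double quotes, at the start of the prompt.
function buildTextPrompt(text: string, description: string): string {
  // Assumption: embedded double quotes are dropped rather than escaped.
  const literal = text.replace(/"/g, "").trim();
  if (literal.length === 0) {
    throw new Error("text must not be empty");
  }
  return `"${literal}" ${description.trim()}`;
}

// Checks whether a prompt already leads with double-quoted text.
function startsWithQuotedText(prompt: string): boolean {
  return /^\s*"[^"]+"/.test(prompt);
}

const quantum = buildTextPrompt(
  "QUANTUM",
  "in bold futuristic font, metallic silver, dark space background",
);
console.log(quantum);
console.log(startsWithQuotedText("A logo with the word QUANTUM on it")); // false
```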

Logo Design Patterns

| Style | Prompt Pattern |
| --- | --- |
| Minimalist | "BRAND" minimalist vector logo, clean lines, simple geometry, flat design |
| Vintage | "EST. 1920" vintage badge logo, circular emblem, ribbon banner, ornate border |
| Negative space | "PEAK" logo where the letter A forms a mountain, negative space design |
| 3D/Modern | "TECHCORP" bold 3D chrome letters, gradient fill, dark background |

Font Specification

Use typography terms: modern sans-serif, elegant script, bold blocky, blackletter, neon tubing, retro 70s serif

DO/DON'T for Text

| DO | DON'T |
| --- | --- |
| "Three cats playing" (exact count) | "cats playing" (random count) |
| "wooden baseball bat" (specific) | "bat" (ambiguous) |
| Describe only what you want | "no cake" (will add cake) |

Image Editing

bun scripts/edit.ts <image> <instruction> [-t TIER] [--mask <mask.png>] [--ref <img>...]

Writing Edit Instructions

Key: Describe the TARGET STATE, not the change.

| Bad Instruction | Good Instruction |
| --- | --- |
| "change car to blue" | "A sleek blue metallic sports car, reflections of neon lights on wet asphalt" |
| "add a hat" | "person wearing a vintage red fedora, matching the scene lighting" |
| "remove background" | Use rembg.ts instead (FREE and better) |

Mask Best Practices

| Task | Mask Strategy |
| --- | --- |
| Object removal | Mask LARGER than object (10-20px margin) for seamless fill |
| Object addition | Mask exact shape or slightly smaller |
| Outpainting | Overlap 10-20px INTO original image |

Feathering: Apply 12-16px blur to masks. Sharp masks = visible seams.
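For object removal, the 10-20px margin can be computed from the object's bounding box before the mask is drawn. The sketch below only does the geometry; `Box` and `expandBox` are hypothetical names, coordinates are assumed to be in pixels with the origin at the top-left, and feathering still has to be applied afterwards in an image editor or library.

```typescript
interface Box {
  x: number;      // left edge, in pixels
  y: number;      // top edge, in pixels
  width: number;
  height: number;
}

// Grows a bounding box by `margin` pixels on every side,
// clamped so the result stays inside the image.
function expandBox(box: Box, margin: number, imageWidth: number, imageHeight: number): Box {
  const left = Math.max(0, box.x - margin);
  const top = Math.max(0, box.y - margin);
  const right = Math.min(imageWidth, box.x + box.width + margin);
  const bottom = Math.min(imageHeight, box.y + box.height + margin);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// A 200x50 object at (100, 100) with a 15px removal margin.
const maskBox = expandBox({ x: 100, y: 100, width: 200, height: 50 }, 15, 1024, 1024);
console.log(maskBox);
```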

Multi-Reference Editing (--ref)

Using 2+ reference images auto-selects max tier (flux-2-flex).

Style transfer: apply reference style to base image

bun scripts/edit.ts base.jpg "in the style of the reference" --ref style.jpg

Multi-reference blending

bun scripts/edit.ts scene.jpg "forest sofa scene" --ref forest.jpg --ref sofa.jpg

Tip: When blending references, describe their relationship: "A velvet sofa placed in a misty pine forest"
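When edit.ts is driven from another script, the argument list can be assembled from the flags shown in the usage line above. This is a sketch under that assumption only: `buildEditArgs` and `EditOptions` are hypothetical names, and the flag order is a guess since the usage line does not say whether order matters.

```typescript
type Tier = "iterate" | "default" | "premium" | "max";

interface EditOptions {
  tier?: Tier;
  mask?: string;
  refs?: string[];
}

// Builds the argument list for `bun scripts/edit.ts` using only the
// flags documented above: -t, --mask, and repeated --ref.
function buildEditArgs(image: string, instruction: string, opts: EditOptions = {}): string[] {
  const args = ["scripts/edit.ts", image, instruction];
  if (opts.tier) args.push("-t", opts.tier);
  if (opts.mask) args.push("--mask", opts.mask);
  for (const ref of opts.refs ?? []) {
    args.push("--ref", ref);
  }
  return args;
}

// Two references: per the note above, the script selects the max tier itself,
// so no -t flag is passed here.
const blendArgs = buildEditArgs(
  "scene.jpg",
  "A velvet sofa placed in a misty pine forest",
  { refs: ["forest.jpg", "sofa.jpg"] },
);
console.log(["bun", ...blendArgs].join(" "));
```

Passing the arguments as an array (for example to a process-spawning API) avoids shell-quoting problems with prompts that contain quotes or spaces.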

Upscaling

bun scripts/upscale.ts <image> [-t TIER] [--scale 2|4]

When to Use 2x vs 4x

| Source Quality | Recommendation |
| --- | --- |
| High (RAW, clean PNG) | 4x safe - AI infers detail accurately |
| Medium (standard JPEG) | 2x preferred - denoise first if possible |
| Low (compressed, blurry) | 2x max - noise gets magnified |

Use Case Guidelines

| Output | Scale | Notes |
| --- | --- | --- |
| Web/UI | 2x | Keeps file size lower than 4x while improving perceived sharpness |
| Print (300 DPI) | 4x | Target 300 DPI for print quality |
| Icons/Logos | 2x | Use svg.ts instead for infinite scaling |
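The 300 DPI target translates into a simple pixel calculation: required pixels = print size in inches × DPI. The sketch below works out which of the two supported scales (2 or 4) covers a given print width; `requiredScale` and `chooseScale` are hypothetical helpers, and only width is considered for brevity.

```typescript
// Scale factor needed so the source covers `printWidthIn` inches at `dpi`.
function requiredScale(sourceWidthPx: number, printWidthIn: number, dpi: number = 300): number {
  return (printWidthIn * dpi) / sourceWidthPx;
}

// Maps a required factor onto the scales upscale.ts accepts (--scale 2|4).
// "none" means the source is already large enough;
// "insufficient" means even 4x would fall short of the target.
function chooseScale(required: number): "none" | 2 | 4 | "insufficient" {
  if (required <= 1) return "none";
  if (required <= 2) return 2;
  if (required <= 4) return 4;
  return "insufficient";
}

// A 1024px-wide image printed 8 inches wide needs 2400px, i.e. about 2.34x.
const needed = requiredScale(1024, 8);
console.log(needed, chooseScale(needed));
```

This only answers whether the pixel count is sufficient; the source-quality limits in the previous table still apply.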

Common Artifacts & Fixes

| Artifact | Cause | Prevention |
| --- | --- | --- |
| Haloing (white edges) | Aggressive sharpening | Use iterate/default tier |
| Plasticky skin | Over-smoothing | Reduce to 2x, use premium tier |
| Grid patterns | Tile processing | Use higher tier models |

Rule of Thumb: If image looks "crunchy" at 100% zoom, don't exceed 2x.

Tier Selection Guide

| Scenario | Tier | Why |
| --- | --- | --- |
| Exploring 10+ variations | iterate | FREE, fast iteration |
| Daily work, 3-5 variations | default | Best cost/quality balance |
| Client deliverables | premium | Higher fidelity |
| Critical assets, multi-ref | max | SOTA quality, advanced features |
| Text/logos (any) | default | Recraft V3 already excellent |
| Text/logos (critical) | premium | Ideogram V3 for perfect typography |
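The guide above can be read as a small decision function. The following sketch encodes one possible reading; `Job` and `pickTier` are hypothetical names, and the precedence between rules (for example critical work outranking a high variation count) is an assumption, since the guide does not say how overlapping scenarios should be resolved.

```typescript
type Tier = "iterate" | "default" | "premium" | "max";

interface Job {
  kind: "image" | "text";
  variations?: number;        // how many variations are planned
  clientDeliverable?: boolean;
  critical?: boolean;
  referenceImages?: number;   // number of --ref images
}

function pickTier(job: Job): Tier {
  // Text/logos: default is usually enough, premium for critical typography.
  if (job.kind === "text") return job.critical ? "premium" : "default";
  // Critical assets or multi-reference work go to max.
  if (job.critical || (job.referenceImages ?? 0) >= 2) return "max";
  if (job.clientDeliverable) return "premium";
  // Broad exploration stays on the free tier.
  if ((job.variations ?? 1) >= 10) return "iterate";
  return "default";
}

console.log(pickTier({ kind: "image", variations: 12 }));
console.log(pickTier({ kind: "text", critical: true }));
```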

Cost Optimization

EXPENSIVE WORKFLOW (avoid): Generate at max tier → iterate on max → deliver

COST-EFFECTIVE WORKFLOW (recommended): Generate at iterate (FREE) → find best concept → Regenerate winner at default/premium → deliver
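A quick calculation shows the size of the gap between the two workflows. The numbers below are illustrative assumptions: 20 drafts, 1 MP per image, and the upper end of the max-tier price range; `workflowCost` is a hypothetical helper.

```typescript
// USD per megapixel from the tier list; "max" uses the upper end ($0.07).
const USD_PER_MP = { iterate: 0, default: 0.008, premium: 0.03, max: 0.07 };

interface Step {
  tier: keyof typeof USD_PER_MP;
  images: number;
}

// Sums the cost of a workflow, assuming every image has the same size.
function workflowCost(steps: Step[], megapixels: number = 1): number {
  return steps.reduce((sum, s) => sum + USD_PER_MP[s.tier] * s.images * megapixels, 0);
}

// Expensive: all 20 drafts at max tier.
const expensive = workflowCost([{ tier: "max", images: 20 }]);
// Cost-effective: 20 free drafts, then one premium regeneration of the winner.
const frugal = workflowCost([
  { tier: "iterate", images: 20 },
  { tier: "premium", images: 1 },
]);

console.log(`expensive: $${expensive.toFixed(2)}, cost-effective: $${frugal.toFixed(2)}`);
```

Under these assumptions the first workflow costs about $1.40 and the second about $0.03, and 20 drafts fit comfortably within the ~96/day free quota.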

Environment

For FREE iterate generation (Cloudflare)

CLOUDFLARE_ACCOUNT_ID=xxx
CLOUDFLARE_API_TOKEN=xxx

For paid tiers (Fal.ai)

FAL_API_KEY=xxx

Quota: Cloudflare FREE tier allows ~96 images/day at 1024x1024.
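A script that wraps these tools can check for the right variables before it starts. This is a minimal sketch, assuming the iterate tier needs only the Cloudflare variables and every paid tier needs only FAL_API_KEY, as the grouping above suggests; `missingEnv` is a hypothetical helper that takes the environment as a plain object so it can be tested without touching the real one.

```typescript
type Tier = "iterate" | "default" | "premium" | "max";

// Returns the names of required variables that are unset or empty.
function missingEnv(tier: Tier, env: Record<string, string | undefined>): string[] {
  const required =
    tier === "iterate"
      ? ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"]
      : ["FAL_API_KEY"];
  return required.filter((name) => !env[name]);
}

// In a real script the second argument would be the process environment.
const missing = missingEnv("iterate", { CLOUDFLARE_ACCOUNT_ID: "abc" });
console.log(missing.length === 0 ? "ok" : `missing: ${missing.join(", ")}`);
```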

Exit Codes

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Image saved to .ada/data/images/ |
| 1 | General error | Check error message |
| 2 | Config/auth error | Verify API keys in .env |
| 3 | Resource limit | Quota exceeded - wait 24h or use paid tier |

CRITICAL: Exit code 3 does NOT fall back to paid tier. This prevents accidental charges.
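A caller that runs these scripts can translate the exit code into a next step. The sketch below mirrors the codes listed above; `describeExit` is a hypothetical helper, and the handling of codes outside 0-3 is an assumption.

```typescript
interface ExitInfo {
  meaning: string;
  action: string;
}

// Maps a script exit code to its documented meaning and suggested action.
function describeExit(code: number): ExitInfo {
  switch (code) {
    case 0:
      return { meaning: "Success", action: "Image saved to .ada/data/images/" };
    case 1:
      return { meaning: "General error", action: "Check error message" };
    case 2:
      return { meaning: "Config/auth error", action: "Verify API keys in .env" };
    case 3:
      // No automatic fallback: moving to a paid tier must be an explicit choice.
      return { meaning: "Resource limit", action: "Quota exceeded - wait 24h or use paid tier" };
    default:
      // Assumption: undocumented codes are treated like a general error.
      return { meaning: "Unknown exit code", action: "Check error message" };
  }
}

console.log(describeExit(3).action);
```

Keeping code 3 as a message rather than an automatic retry preserves the "no accidental charges" behavior described above.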

Integration

| Skill | When to Use Together |
| --- | --- |
| ui-animation | Animate generated images for web/mobile |
| docs-write | Document image assets and parameters used |
| | Find prompting resources and style references |
| code-quality | After modifying skill scripts |

References

references/usage-guide.md - Extended prompting guide, error codes, testing
README.md - Architecture diagrams, model reference, CLI details
Fal.ai Docs - Official API documentation

Output

Images saved to .ada/data/images/ with timestamped filenames:

20260118_gen_default_cyberpunk_city.jpg
20260118_svg_default_logo_vector.svg
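Judging from the two examples, filenames appear to follow a date_operation_tier_label.extension pattern. The sketch below reproduces that pattern as an illustration only: `outputName` is a hypothetical function, and the use of UTC dates and an underscore slug derived from a label are assumptions about how the scripts actually name files.

```typescript
// Builds a filename shaped like the examples above:
// YYYYMMDD_<operation>_<tier>_<label_slug>.<ext>
function outputName(date: Date, operation: string, tier: string, label: string, ext: string): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  // Assumption: the date stamp uses UTC.
  const stamp = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}`;
  // Lowercase, collapse non-alphanumerics to underscores, trim stray underscores.
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
  return `${stamp}_${operation}_${tier}_${slug}.${ext}`;
}

const exampleName = outputName(new Date(Date.UTC(2026, 0, 18)), "gen", "default", "Cyberpunk City", "jpg");
console.log(exampleName);
```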
