fal-generate

Generate AI images, videos, audio and 3D content using fal.ai's 500+ models. Supports Kling V3, FLUX.2, Grok Imagine, Veo 3.1, MiniMax, Hunyuan 3D, OpenRouter LLMs and more via queue-based generation.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "fal-generate" with this command: npx skills add lovisdotio/skills-fal-ai/lovisdotio-skills-fal-ai-fal-generate

fal-generate Skill

Overview

This skill enables AI content generation through fal.ai's latest models using a queue-based system. It supports:

  • Text-to-Image - Generate images from text prompts
  • Image-to-Image - Edit and transform existing images
  • Text-to-Video - Create videos from text descriptions
  • Image-to-Video - Animate images into videos
  • Text-to-Speech - Generate natural speech from text
  • Speech-to-Text - Transcribe audio to text
  • Text-to-3D - Create 3D models from text
  • Image-to-3D - Convert images to 3D models
  • LLM / VLM / ALM - Run any LLM, vision, audio or video model via OpenRouter

Scripts

Script               Purpose
scripts/generate.sh  Main generation tool with queue management
scripts/upload.sh    Upload files to fal CDN (returns URL)
scripts/poll.sh      Poll queue status until completion
scripts/models.sh    Search and discover models

Prerequisites

export FAL_KEY="your-api-key"
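
A wrapper script can fail early when the key is missing instead of submitting a request that cannot authenticate. The following is a minimal sketch; the helper name require_fal_key is hypothetical and is not one of the skill's scripts.

```shell
# Hypothetical helper (not part of the skill): stop early if FAL_KEY is unset or empty.
require_fal_key() {
  if [ -z "${FAL_KEY:-}" ]; then
    echo "FAL_KEY is not set; export it before running the scripts" >&2
    return 1
  fi
  return 0
}
```

Calling `require_fal_key || exit 1` at the top of a wrapper is one way to use it.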

Output Format

All generation scripts output JSON to stdout when using --wait. The JSON contains URLs to the generated content:

  • Images: {"images": [{"url": "https://fal.media/files/...", "width": 1024, "height": 1024}]}
  • Videos: {"video": {"url": "https://fal.media/files/...mp4"}}
  • Audio/TTS: {"audio": {"url": "https://fal.media/files/...mp3"}} or {"audio_url": "https://..."}
  • 3D Models: {"model_mesh": {"url": "https://fal.media/files/...glb"}}
  • Transcription: {"text": "transcribed content..."}
  • OpenRouter: {"output": "LLM response text..."}

Without --wait or --async, the script prints the request ID along with a manual curl command. With --async, it prints only the request ID for later polling.
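
Since the result arrives as JSON on stdout, a follow-up step usually needs to pull the URL out of it. The sketch below does this with grep and sed only; the helper name extract_first_url and the sample URL are made up for illustration. It is a plain-text match, not a JSON parser, so a dedicated JSON tool is the more robust choice when one is available.

```shell
# Hypothetical helper: print the first "url" value found in JSON read from stdin.
# Plain-text match only; assumes the URL contains no escaped quotes.
extract_first_url() {
  grep -o '"url"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed -e 's/^"url"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Made-up sample in the image output shape described above.
sample='{"images": [{"url": "https://example.com/out.png", "width": 1024, "height": 1024}]}'
printf '%s\n' "$sample" | extract_first_url
```

The same helper applies to the video, audio, and 3D shapes, because each of them nests a "url" key.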


Examples by Category

Text-to-Image

# Basic image generation
./scripts/generate.sh -m fal-ai/kling-image/v3/text-to-image \
  -p "A majestic mountain at sunrise, cinematic lighting" -w

# With aspect ratio and seed
./scripts/generate.sh -m fal-ai/flux-2/klein/9b \
  -p "Professional headshot, studio lighting" \
  --aspect-ratio "1:1" --seed 42 -w

# Ultra-fast generation
./scripts/generate.sh -m fal-ai/z-image/turbo \
  -p "Quick concept sketch of a robot" -w

# With custom parameters (inference steps, guidance scale)
./scripts/generate.sh -m fal-ai/flux-2/klein/9b \
  -p "Detailed portrait of a scientist" \
  --param num_inference_steps=28 --param guidance_scale=3.5 -w

Image-to-Image (Edit/Transform)

# Upload local image first
IMAGE_URL=$(./scripts/upload.sh ~/photos/portrait.jpg)

# Edit with instructions
./scripts/generate.sh -m fal-ai/qwen-image-max/edit \
  --image-url "$IMAGE_URL" \
  -p "Make the background a sunset beach" -w

# Style transfer
./scripts/generate.sh -m fal-ai/glm-image/image-to-image \
  --image-url "$IMAGE_URL" \
  -p "Convert to oil painting style" -w

Text-to-Video

# Cinematic video with audio (Kling V3 Pro)
./scripts/generate.sh -m fal-ai/kling-video/v3/pro/text-to-video \
  -p "A butterfly emerging from a cocoon in slow motion, macro lens" \
  --duration 5 -w

# Fast video generation
./scripts/generate.sh -m fal-ai/ltx-2-19b/distilled/text-to-video \
  -p "Drone shot flying over a city at golden hour" -w

# Google Veo 3.1 with sound
./scripts/generate.sh -m fal-ai/veo3.1 \
  -p "A cat playing piano, realistic" -w

Image-to-Video (Animate Images)

IMAGE_URL=$(./scripts/upload.sh ~/photos/landscape.jpg)

# Animate a still photo
./scripts/generate.sh -m fal-ai/kling-video/o3/pro/image-to-video \
  --image-url "$IMAGE_URL" \
  -p "Gentle wind moving through the trees, clouds drifting" -w

# Lip-sync avatar from image + audio
AUDIO_URL=$(./scripts/upload.sh ~/audio/speech.mp3)
./scripts/generate.sh -m fal-ai/longcat-multi-avatar/image-audio-to-video \
  --image-url "$IMAGE_URL" --audio-url "$AUDIO_URL" -w

Text-to-Speech

# High-quality TTS (MiniMax)
./scripts/generate.sh -m fal-ai/minimax/speech-2.8-hd \
  -t "Hello! Welcome to the future of AI-generated content." -w

# Fast TTS
./scripts/generate.sh -m fal-ai/minimax/speech-2.8-turbo \
  -t "This is a quick test of fast speech generation." -w

# Custom voice with Qwen-3 TTS
./scripts/generate.sh -m fal-ai/qwen-3-tts/text-to-speech/1.7b \
  -t "Custom voice synthesis with natural intonation." -w

Voice Cloning

# Upload a voice sample (10+ seconds recommended)
VOICE_URL=$(./scripts/upload.sh ~/audio/voice-sample.wav)

# Clone and generate speech
./scripts/generate.sh -m fal-ai/qwen-3-tts/clone-voice/1.7b \
  --audio-url "$VOICE_URL" \
  -t "This sentence will be spoken in the cloned voice." -w

Speech-to-Text (Transcription)

AUDIO_URL=$(./scripts/upload.sh ~/recordings/meeting.mp3)

# Fast transcription
./scripts/generate.sh -m fal-ai/nemotron/asr \
  --audio-url "$AUDIO_URL" -w

# Accurate transcription with timestamps
./scripts/generate.sh -m fal-ai/elevenlabs/speech-to-text/scribe-v2 \
  --audio-url "$AUDIO_URL" -w

Text-to-3D

# Detailed 3D model from text
./scripts/generate.sh -m fal-ai/hunyuan-3d/v3.1/pro/text-to-3d \
  -p "A detailed medieval sword with ornate handle" -w

# Fast 3D generation
./scripts/generate.sh -m fal-ai/hunyuan-3d/v3.1/rapid/text-to-3d \
  -p "Simple wooden chair" -w

Image-to-3D

IMAGE_URL=$(./scripts/upload.sh ~/photos/object.jpg)

# Convert image to 3D model
./scripts/generate.sh -m fal-ai/hunyuan-3d/v3.1/rapid/image-to-3d \
  --image-url "$IMAGE_URL" -w

# High-fidelity geometry
./scripts/generate.sh -m fal-ai/ultrashape \
  --image-url "$IMAGE_URL" -w

OpenRouter — Run Any LLM

# Text chat with any LLM (GPT-5, Claude, Gemini, Llama 4, etc.)
./scripts/generate.sh -m openrouter/router \
  -p "Explain quantum computing in simple terms" \
  --param model=google/gemini-2.5-flash -w

# Vision — analyze an image
IMAGE_URL=$(./scripts/upload.sh ~/photos/chart.png)
./scripts/generate.sh -m openrouter/router/vision \
  --image-url "$IMAGE_URL" \
  -p "Describe what you see in this image" \
  --param model=google/gemini-2.5-flash -w

# Audio — process audio with an ALM
AUDIO_URL=$(./scripts/upload.sh ~/audio/podcast.mp3)
./scripts/generate.sh -m openrouter/router/audio \
  --audio-url "$AUDIO_URL" \
  -p "Summarize the key points discussed" \
  --param model=google/gemini-2.5-flash -w

# Video — analyze a video
VIDEO_URL=$(./scripts/upload.sh ~/videos/demo.mp4)
./scripts/generate.sh -m openrouter/router/video \
  --video-url "$VIDEO_URL" \
  -p "Describe what happens in this video" \
  --param model=google/gemini-2.5-flash -w

Usage Patterns

Queue Mode (Default) — submit and poll

./scripts/generate.sh -m fal-ai/flux-2/klein/9b -p "Portrait" --wait

Async Mode — get ID, poll later

REQUEST_ID=$(./scripts/generate.sh -m fal-ai/kling-video/v3/pro/text-to-video \
  -p "Drone flying over a city" --async)
./scripts/poll.sh fal-ai/kling-video/v3/pro/text-to-video "$REQUEST_ID"
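
The two async steps can be combined into one function that refuses to poll when no ID came back. This is a sketch under the assumptions stated in this document: --async prints only the request ID, and poll.sh takes the model endpoint followed by the request ID. The function name submit_and_poll and the GENERATE and POLL variables are hypothetical.

```shell
# Paths to the skill's scripts; overridable so the flow can be tried with stand-ins.
GENERATE="${GENERATE:-./scripts/generate.sh}"
POLL="${POLL:-./scripts/poll.sh}"

# Hypothetical wrapper: submit with --async, check the ID, then poll.
submit_and_poll() {
  sp_model="$1"
  shift
  sp_request_id=$("$GENERATE" -m "$sp_model" "$@" --async) || return 1
  if [ -z "$sp_request_id" ]; then
    echo "no request ID returned for $sp_model" >&2
    return 1
  fi
  "$POLL" "$sp_model" "$sp_request_id"
}
```

For example: `submit_and_poll fal-ai/kling-video/v3/pro/text-to-video -p "Drone flying over a city"`.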

File Upload — local files to fal CDN

IMAGE_URL=$(./scripts/upload.sh ~/photos/portrait.jpg)
./scripts/generate.sh -m fal-ai/kling-video/o3/pro/image-to-video \
  --image-url "$IMAGE_URL" -p "Gentle wind blowing through hair" -w
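
Because the upload result is passed straight into the next command, it can be worth checking that it looks like a URL first. A minimal sketch follows; the helper name is_http_url is hypothetical, and what upload.sh prints on failure is not documented here, so the check only verifies the scheme prefix.

```shell
# Hypothetical helper: succeed only if the argument starts with http:// or https://.
is_http_url() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use (commented out because it needs FAL_KEY and network access):
# IMAGE_URL=$(./scripts/upload.sh ~/photos/portrait.jpg)
# is_http_url "$IMAGE_URL" || { echo "upload did not return a URL" >&2; exit 1; }
```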

Common Parameters

Parameter       Description                Example
-m, --model     Model endpoint (required)  fal-ai/kling-image/v3/text-to-image
-p, --prompt    Text description           "A sunset over mountains"
-t, --text      Text for TTS models        "Hello world"
--image-url     Input image URL            "https://..."
--video-url     Input video URL            "https://..."
--audio-url     Input audio URL            "https://..."
--aspect-ratio  Output ratio               "16:9", "9:16", "1:1"
--duration      Video length (sec)         5, 10
--seed          Reproducibility            12345
-w, --wait      Poll until done            (flag)
-a, --async     Return ID only             (flag)
--param         Extra param (repeatable)   num_inference_steps=28

Environment Variables

Variable     Required  Description
FAL_KEY      Yes       API authentication key
FAL_WEBHOOK  No        Webhook URL for callbacks

Tips

  1. Always use --wait or --async — Without either, you get the request ID + a manual curl command
  2. Use --param for advanced control — Pass any model-specific parameter: --param guidance_scale=7.5
  3. Check model schema: run ./scripts/models.sh --schema <endpoint> to see all available params
  4. Upload files first — Use ./scripts/upload.sh for local images/audio/video before generation
  5. Use seeds — Same seed = same output for reproducible results
  6. Pro vs Standard — Pro = better quality + longer generation; Standard = cost-effective
  7. Flash/Turbo/Distilled — Best for previews and fast iterations
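
To compare variations of one prompt, the seed tip can be turned into a small sweep. The sketch below is a dry run that only prints the commands it would run, so nothing is submitted; the function name seed_sweep is hypothetical, and the flags are the ones listed under Common Parameters.

```shell
# Hypothetical helper: print one generate.sh command line per seed (dry run).
seed_sweep() {
  sweep_model="$1"
  sweep_prompt="$2"
  shift 2
  for sweep_seed in "$@"; do
    printf './scripts/generate.sh -m %s -p "%s" --seed %s -w\n' \
      "$sweep_model" "$sweep_prompt" "$sweep_seed"
  done
}

seed_sweep fal-ai/flux-2/klein/9b "Portrait" 1 2
```

Reviewing the printed lines before running them keeps an accidental large batch from reaching the queue.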

Model Catalog

Image Generation — February 2026

Endpoint                                        Description
fal-ai/kling-image/v3/text-to-image             Kling V3: Latest Kling image model
fal-ai/kling-image/v3/image-to-image            Kling V3 image transformation
fal-ai/kling-image/o3/text-to-image             Kling Omni 3: Top-tier consistency
fal-ai/kling-image/o3/image-to-image            Kling Omni 3 image editing
xai/grok-imagine-image                          xAI Grok Imagine: Highly aesthetic
xai/grok-imagine-image/edit                     Grok Imagine editing
fal-ai/hunyuan-image/v3/instruct/text-to-image  Hunyuan 3.0 Instruct
fal-ai/hunyuan-image/v3/instruct/edit           Hunyuan 3.0 editing
fal-ai/qwen-image-max/text-to-image             Qwen Image Max: Enhanced realism
fal-ai/qwen-image-max/edit                      Qwen Image Max editing
fal-ai/z-image/base                             Z-Image Base: 6B fast model

Image Generation — January 2026

Endpoint                                             Description
fal-ai/flux-2/klein/9b                               FLUX.2 Klein 9B: Photorealism & text
fal-ai/flux-2/klein/9b/edit                          FLUX.2 Klein 9B editing
fal-ai/flux-2/klein/9b/base/lora                     FLUX.2 Klein 9B with LoRA
fal-ai/flux-2/klein/4b                               FLUX.2 Klein 4B: Lightweight
fal-ai/glm-image                                     GLM Image: Accurate text rendering
bria/fibo-edit/edit                                  Bria Fibo Edit: Multi-tool editing
bria/fibo-edit/blend                                 Bria Fibo composition
bria/fibo-edit/relight                               Bria Fibo relighting
bria/fibo-edit/restyle                               Bria Fibo artistic styles
bria/fibo-lite/generate                              Bria Fibo Lite: Fast generation
imagineart/imagineart-1.5-pro-preview/text-to-image  ImagineArt 1.5 Pro: 4K

Image Generation — December 2025

Endpoint                                      Description
fal-ai/flux-2-max                             FLUX.2 Max: State-of-the-art
fal-ai/flux-2/turbo                           FLUX.2 Turbo: Fast generation
fal-ai/flux-2/flash                           FLUX.2 Flash: Ultra-fast
fal-ai/gpt-image-1.5                          GPT Image 1.5: Strong prompt adherence
fal-ai/bytedance/seedream/v4.5/text-to-image  Seedream 4.5: ByteDance
fal-ai/z-image/turbo                          Z-Image Turbo: 6B super fast
fal-ai/qwen-image-2512                        Qwen Image 2512

Video Generation — February 2026

Endpoint                                       Description
fal-ai/kling-video/v3/pro/text-to-video        Kling 3.0 Pro: Cinematic + audio
fal-ai/kling-video/v3/standard/text-to-video   Kling 3.0 Standard
fal-ai/kling-video/v3/pro/image-to-video       Kling 3.0 Pro I2V
fal-ai/kling-video/v3/standard/image-to-video  Kling 3.0 Standard I2V
fal-ai/kling-video/o3/pro/text-to-video        Kling O3 Pro: Realistic
fal-ai/kling-video/o3/pro/image-to-video       Kling O3 Pro I2V
fal-ai/kling-video/o3/pro/reference-to-video   Kling O3 character consistency
xai/grok-imagine-video/text-to-video           Grok Video with audio
xai/grok-imagine-video/image-to-video          Grok Video I2V

Video Generation — January 2026

Endpoint                                          Description
fal-ai/vidu/q3/text-to-video                      Vidu Q3 T2V
fal-ai/vidu/q3/image-to-video                     Vidu Q3 I2V
fal-ai/pixverse/v5.6/text-to-video                Pixverse V5.6 T2V
fal-ai/pixverse/v5.6/image-to-video               Pixverse V5.6 I2V
fal-ai/ltx-2-19b/text-to-video                    LTX-2 19B: Video + audio
fal-ai/ltx-2-19b/image-to-video                   LTX-2 19B I2V
fal-ai/ltx-2-19b/distilled/text-to-video          LTX-2 Distilled: Fast
fal-ai/longcat-multi-avatar/image-audio-to-video  LongCat: Lip-sync avatar

Video Generation — December 2025

Endpoint                                          Description
fal-ai/veo3.1                                     Veo 3.1: Google's best + sound
fal-ai/veo3.1/fast                                Veo 3.1 Fast
fal-ai/veo3.1/image-to-video                      Veo 3.1 I2V
fal-ai/veo3.1/extend-video                        Veo 3.1 Extend: Up to 30s
fal-ai/hunyuan-video-v1.5/text-to-video           Hunyuan Video 1.5 T2V
fal-ai/bytedance/seedance/v1.5/pro/text-to-video  Seedance 1.5 Pro
fal-ai/kandinsky5-pro/text-to-video               Kandinsky 5 Pro
fal-ai/live-avatar                                Live Avatar: Real-time
clarityai/crystal-video-upscaler                  Crystal Video Upscaler
fal-ai/creatify/aurora                            Creatify Aurora: Studio avatars

Audio — February 2026

Endpoint                         Description
fal-ai/minimax/speech-2.8-hd     MiniMax 2.8 HD: Best TTS
fal-ai/minimax/speech-2.8-turbo  MiniMax 2.8 Turbo: Fast TTS

Audio — January 2026

Endpoint                                    Description
fal-ai/qwen-3-tts/text-to-speech/1.7b       Qwen-3 TTS 1.7B: Custom voices
fal-ai/qwen-3-tts/text-to-speech/0.6b       Qwen-3 TTS 0.6B: Lightweight
fal-ai/qwen-3-tts/clone-voice/1.7b          Qwen-3 Voice Clone: Zero-shot
fal-ai/qwen-3-tts/clone-voice/0.6b          Qwen-3 Voice Clone Light
fal-ai/qwen-3-tts/voice-design/1.7b         Qwen-3 Voice Design
fal-ai/nemotron/asr                         Nemotron ASR: Fast STT
fal-ai/nemotron/asr/stream                  Nemotron ASR Streaming
fal-ai/elevenlabs/voice-changer             ElevenLabs Voice Changer
fal-ai/elevenlabs/speech-to-text/scribe-v2  ElevenLabs Scribe V2
fal-ai/deepfilternet3                       DeepFilterNet3: Noise removal

Audio — December 2025

Endpoint                           Description
fal-ai/sam-audio/separate          SAM Audio: Text-guided separation
fal-ai/elevenlabs/music            ElevenLabs Music
fal-ai/maya/batch                  Maya: Expressive voice
fal-ai/demucs                      Demucs: SOTA stemming
fal-ai/index-tts-2/text-to-speech  Index TTS 2.0

3D Generation — February 2026

Endpoint                                  Description
fal-ai/hunyuan-3d/v3.1/pro/text-to-3d     Hunyuan 3D V3.1 Pro: Text to 3D
fal-ai/hunyuan-3d/v3.1/pro/image-to-3d    Hunyuan 3D V3.1 Pro: Image to 3D
fal-ai/hunyuan-3d/v3.1/rapid/text-to-3d   Hunyuan 3D Rapid: Fast
fal-ai/hunyuan-3d/v3.1/rapid/image-to-3d  Hunyuan 3D Rapid I2-3D
fal-ai/ultrashape                         UltraShape: High-fidelity geometry

3D Generation — December 2025

Endpoint                            Description
fal-ai/trellis-2                    Trellis 2: Versatile 3D
fal-ai/hunyuan3d-v3/text-to-3d      Hunyuan 3D V3
fal-ai/hunyuan-motion               Hunyuan Motion: 3D animation
fal-ai/meshy/v6-preview/text-to-3d  Meshy V6 Preview

OpenRouter Endpoints

Access 100+ LLMs via OpenRouter. Use --param model=<provider/model> to select the model.

Text (LLM)

Endpoint                             Description
openrouter/router                    Any LLM: GPT-5, Claude, Gemini, Llama 4, Mistral
openrouter/router/stream             LLM with streaming
openrouter/router/enterprise         Enterprise LLM (enhanced SLA)
openrouter/router/enterprise/stream  Enterprise LLM streaming

Vision (VLM)

Endpoint                                    Description
openrouter/router/vision                    Any VLM: Image analysis with GPT-5, Gemini, Claude
openrouter/router/vision/stream             Vision streaming
openrouter/router/vision/enterprise         Enterprise vision
openrouter/router/vision/enterprise/stream  Enterprise vision streaming

Audio (ALM)

Endpoint                                   Description
openrouter/router/audio                    Any ALM: Audio analysis with Gemini
openrouter/router/audio/stream             Audio streaming
openrouter/router/audio/enterprise         Enterprise audio
openrouter/router/audio/enterprise/stream  Enterprise audio streaming

Video (VLM)

Endpoint                                   Description
openrouter/router/video                    Any Video LM: Video analysis with Gemini
openrouter/router/video/stream             Video streaming
openrouter/router/video/enterprise         Enterprise video
openrouter/router/video/enterprise/stream  Enterprise video streaming

OpenAI-Compatible

Endpoint                                      Description
openrouter/router/openai/v1/chat/completions  OpenAI Chat Completions API
openrouter/router/openai/v1/responses         OpenAI Responses API
openrouter/router/openai/v1/embeddings        OpenAI Embeddings API

Model Selection Guide

Use Case        Recommended
Best image      fal-ai/kling-image/o3/text-to-image
Fastest image   fal-ai/z-image/turbo
Photorealistic  fal-ai/flux-2/klein/9b
Image editing   fal-ai/qwen-image-max/edit
Best video      fal-ai/kling-video/v3/pro/text-to-video
Fastest video   fal-ai/ltx-2-19b/distilled/text-to-video
Video + audio   xai/grok-imagine-video/text-to-video
Animate image   fal-ai/kling-video/o3/pro/image-to-video
Best TTS        fal-ai/minimax/speech-2.8-hd
Voice clone     fal-ai/qwen-3-tts/clone-voice/1.7b
Transcription   fal-ai/nemotron/asr
3D from text    fal-ai/hunyuan-3d/v3.1/pro/text-to-3d
3D from image   fal-ai/hunyuan-3d/v3.1/rapid/image-to-3d
Any LLM         openrouter/router
Vision/Audio    openrouter/router/vision or /audio
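
In a script, the guide above can be encoded as a lookup so callers name a use case rather than an endpoint. The following sketch covers part of the table; the function name recommend_model and the use-case keys are hypothetical, while the endpoints are copied from the guide.

```shell
# Hypothetical lookup that mirrors part of the selection guide.
recommend_model() {
  case "$1" in
    best-image)    echo "fal-ai/kling-image/o3/text-to-image" ;;
    fastest-image) echo "fal-ai/z-image/turbo" ;;
    image-editing) echo "fal-ai/qwen-image-max/edit" ;;
    best-video)    echo "fal-ai/kling-video/v3/pro/text-to-video" ;;
    fastest-video) echo "fal-ai/ltx-2-19b/distilled/text-to-video" ;;
    best-tts)      echo "fal-ai/minimax/speech-2.8-hd" ;;
    transcription) echo "fal-ai/nemotron/asr" ;;
    *)
      echo "unknown use case: $1" >&2
      return 1
      ;;
  esac
}
```

A call such as `./scripts/generate.sh -m "$(recommend_model fastest-image)" -p "Quick sketch" -w` then keeps the endpoint choice in one place.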

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
