Visla Video Generation
Version: 260501-1423
Create AI-generated videos from text scripts, web URLs, or documents (PPT/PDF) using Visla's OpenAPI.
Before You Start
Credentials (NEVER output API keys/secrets in responses):
IMPORTANT: Never output API keys/secrets in responses.
- Check if
~/.config/visla/.credentialsexists (do NOT read it yet). - If the file exists, use a choice-based confirmation to ask the user:
"Found saved credentials. Allow reading
~/.config/visla/.credentials?" Options: Allow / No - If the user selects Allow: proceed with the command.
- If the user selects No, or the file does not exist:
Ask the user to provide credentials via one of:
- Environment variables (
VISLA_API_KEY,VISLA_API_SECRET) - CLI arguments (
--key,--secret) - Direct input of API key and secret
- Environment variables (
- If provided credentials fail with
VISLA_CLI_ERROR_CODE=missing_credentialsorVISLA_CLI_ERROR_CODE=auth_failed, ask the user to re-enter valid credentials.
Only process local files (scripts/docs) explicitly provided by the user, and remind users to avoid uploading sensitive data.
- Tell the user: this is a one-time setup (once configured, they won't need to do this again)
- Tell the user: get API Key and Secret from https://www.visla.us/visla-api
- Do not repeat the secrets back in the response.
Credential validity check (practical):
- If credentials exist but running
accountfails withVISLA_CLI_ERROR_CODE=missing_credentialsorVISLA_CLI_ERROR_CODE=auth_failed, treat credentials as invalid and ask the user to provide real ones.
File format (bash/zsh):
export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"
For PowerShell (temporary session):
$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
Scripts: scripts/visla_cli.py (Python), scripts/visla_cli.sh (Bash)
Platform Execution
Default strategy:
- Prefer Bash on macOS when dependencies are available (the Bash CLI avoids Python SSL-stack issues on some macOS setups).
- Prefer Python when you're already using a well-configured Python (or when Bash dependencies are missing).
Bash (recommended on macOS; also works on Linux-like environments):
# With user consent, you may source ~/.config/visla/.credentials
export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"
./scripts/visla_cli.sh <command>
Python (cross-platform):
python3 scripts/visla_cli.py --key "your_key" --secret "your_secret" <command>
# Or, credentials are auto-detected from ~/.config/visla/.credentials (with user consent):
python3 scripts/visla_cli.py <command>
Windows native (PowerShell/CMD without Bash; Python):
# PowerShell
$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
python scripts/visla_cli.py <command>
Windows note:
- The agent should prefer running the Python CLI on Windows unless it has verified a Bash environment (WSL/Git Bash) is available.
- For simple scripts, pass directly:
python scripts/visla_cli.py script "Scene 1: ..." - For multi-line or complex scripts, use stdin with
-(recommended, no temp files):@" Scene 1: ... Scene 2: ... "@ | python scripts/visla_cli.py script - - If you have Python Launcher installed,
py -3 scripts/visla_cli.py <command>may work better thanpython. - Credentials:
- The Python CLI auto-detects
~/.config/visla/.credentialswhen present. - On Windows the default path is typically:
%USERPROFILE%\\.config\\visla\\.credentials.
- The Python CLI auto-detects
Note: do not print credentials. Prefer environment variables or auto-detected credentials with explicit user consent.
Commands
| Command | Description |
|---|---|
/visla script <script-or-@file> | Create video from a script (text or a local file) |
/visla url <URL> | Create video from web page URL |
/visla doc <file> | Create video from document (PPT/PDF) |
/visla idea <text-or-@file> | Create video from an idea |
/visla visual <file> [file ...] | Create video from visual resources (images/videos), supports multiple files |
/visla speech <file> [file ...] | Create video from speech (audio/video file), supports multiple files |
/visla account | Show account info and credit balance |
/visla avatar | List available AI avatars |
/visla voice | List available AI voices |
Important: For avatar and voice commands:
- Run the full CLI command (
./visla_cli.sh avataror./visla_cli.sh voice). - You may filter the output before presenting to the user:
- For
avatar: removeThumbnail:lines - For
voice: removeURL:lines
- For
- Categorize and format avatar results as follows:
- Group avatars by gender category (Female, Male, Neutral, Dynamic)
- List each avatar name with (n) where n = number of looks
- For each look, show: Look Name (lookUuid)
- Format:
- AvatarName (n): Look1 (uuid), Look2 (uuid), ... - Example:
**Female (16):** - Emma (5): Blue Dress (1000145), Patterned Dress (1000146), Black Blazer (1000147), Light Gray Blazer (1000148), Emerald Green Pantsuit (1000149)
- Categorize voice results by language/region (e.g., System, US English, Chinese, Japanese, French, etc.)
- You must NOT omit any items from the list. The user must see all available avatars/voices, even if the list is long.
- Agents must use the exact ID from the listing when configuring videos.
Optional Parameters
| Parameter | Description |
|---|---|
-c, --config <file> | Path to JSON config file with video options |
--avatar <id> | Avatar ID to use for the video (get list from avatar command) |
--voice <id> | Voice ID to use for the video (get list from voice command) |
visual command specific
| Parameter | Description |
|---|---|
--script, -s <text> | Script or description text (or @filename) |
--style <style> | Video style: montage, storytelling (default), explainer |
speech command specific
| Parameter | Description |
|---|---|
--function <func> | Speech to video function: SPEECH_TO_VIDEO_SUMMARY or SPEECH_TO_VIDEO_FULL_LENGTH |
All other options (aspect_ratio, pace, burn_subtitles, footage_options, bgm_options, etc.) can be set in the config file.
Cleanup: After video creation completes, delete the config file unless it's intended for reuse.
Config File Format (JSON)
All video options can be stored in a JSON config file (nested structure matches API request body):
{
"video_title": "My Video",
"video_description": "Video description",
"project_function": "SPEECH_TO_VIDEO_SUMMARY",
"script_text_mode": "ai_rewrite",
"doc_usage": "page_by_page_walkthrough",
"speaker_notes_verbatim": false,
"target_video": {
"aspect_ratio": "16:9",
"video_pace": "fast",
"burn_subtitles": false,
"video_duration_in_seconds": 60
},
"avatar_options": {
"use_avatar": false,
"look_id": 12345,
"avatar_layout": "smart_composition",
"enable_auto_wallpaper": true,
"enable_in_preview": true
},
"voice_options": {
"use_voice": false,
"voice_id": 1
},
"footage_options": {
"enable_footage": true,
"use_free_stocks": true,
"use_premium_stocks": true,
"use_premium_stocks_getty": true,
"use_private_stocks": true,
"private_stock_ids": 123456
},
"bgm_options": {
"enable_bgm": true,
"use_free_stocks": true,
"use_premium_stocks": true
}
}
Note: avatar_options.avatar_layout accepts only: host_only, host_pip, smart_composition.
CLI arguments (avatar, voice) override config file values.
Source of truth for the exact CLI surface: run scripts/visla_cli.sh --help or python3 scripts/visla_cli.py --help.
Script Format
**Scene 1** (0-10 sec):
**Visual:** A futuristic calendar flipping to 2025 with digital patterns.
**Narrator:** "AI is evolving rapidly! Here are 3 game-changing AI trends."
**Scene 2** (10-25 sec):
**Visual:** Text: "Trend #1: Generative AI Everywhere." Show tools like ChatGPT.
**Narrator:** "Generative AI is dominating industries—creating content and images."
Workflow
The script, url, doc, idea, visual, and speech commands execute the complete flow automatically:
- Create project
- Poll until generation completes (may take a few minutes)
- Auto-export and return download link
Execution Instructions:
- Inform user that video generation takes some time
- Report progress status periodically during polling
Timeout Guidance
- This workflow typically takes 3-10 minutes, but can take up to ~30 minutes in the worst case. Set the
task/command
timeoutto >= 30 minutes (Windows defaults are often ~10 minutes and need to be increased). If you cannot change the timeout, warn the user up front and, on timeout, ask whether to continue or switch to a step-by-step run. - If timeout occurs, the CLI returns
project_uuidin the output. Inform the user they can manually check project status and continue later using the Visla web interface or API.
Examples
/visla script @myscript.txt
/visla script "Scene 1: ..."
/visla url https://blog.example.com/article
/visla doc presentation.pptx
/visla idea "Create a video about machine learning"
/visla idea @my_idea.txt
/visla visual image.jpg
/visla visual photo1.jpg photo2.jpg photo3.jpg
/visla visual image.jpg --script "Description of the images..."
/visla visual image.jpg --style montage
/visla speech interview.m4a
/visla speech podcast.mp3 audio1.mp3 audio2.mp3
/visla speech podcast.mp3 --function SPEECH_TO_VIDEO_SUMMARY
/visla account
/visla avatar
/visla voice
# With config file
/visla script "Scene 1: Hello" -c config.json
# With avatar/voice (CLI overrides config)
/visla script "Scene 1: Hello" --avatar avatar_123 --voice voice_456
Supported Document Formats
- PowerPoint:
.pptx,.ppt - PDF:
.pdf
Supported Media Formats
Visual Resources (visual command)
- Images:
.jpg,.jpeg,.png,.gif,.webp - Videos:
.mp4,.mov,.avi,.mkv
Audio/Speech (speech command)
- Audio:
.mp3,.wav,.m4a,.aac,.flac - Videos:
.mp4,.mov,.avi,.mkv
Output Format
- Start: Display "Visla Skill v260501-1423" when skill begins
- End: Display "Visla Skill v260501-1423 completed" when skill finishes
Security
The CLI scripts enforce the following safety measures to prevent unauthorized file access:
- Path traversal: Paths containing
..are rejected. - System directories: Access to
/etc/,/proc/,/sys/,/dev/,/run/,/var/log/(and Windows equivalents) is denied. - Text file extension restriction: The
@filesyntax inscript,idea, andvisual --scriptcommands only accepts.txt,.md,.srt,.vtt,.csvfiles. - Document/media file validation: The
doc,visual, andspeechcommands validate file extensions against supported formats before upload. - Credentials: The Python CLI auto-detects
~/.config/visla/.credentialsonly. No arbitrary credential file paths are accepted. - User consent: The agent must ask for user consent before accessing local files, as specified in the "Before You Start" section.