AnyGen Image Generator
You MUST strictly follow every instruction in this document. Do not skip, reorder, or improvise any step. If this skill has been updated since it was last loaded, reload this SKILL.md before proceeding and always follow the latest version.
Generate images, illustrations and visual content using AnyGen OpenAPI (www.anygen.io ). Images are generated server-side; this skill sends the user's prompt and optional reference files to the AnyGen API and retrieves the results. An API key (ANYGEN_API_KEY ) is required to authenticate with the service.
When to Use
-
User needs to generate images, illustrations, or visual assets
-
User wants to create posters, banners, social media graphics, or marketing creatives
-
User has reference images to upload for style guidance
Security & Permissions
Images are generated server-side by AnyGen's OpenAPI (www.anygen.io ). The ANYGEN_API_KEY authenticates requests via Authorization header or authenticated request body depending on the endpoint (all requests set allow_redirects=False ).
What this skill does: sends prompts to www.anygen.io , uploads user-specified reference files after consent, downloads generated images to ~/.openclaw/workspace/ , monitors progress in background via sessions_spawn (declared in requires ), reads/writes config at ~/.config/anygen/config.json .
What this skill does NOT do: read or upload any file without explicit --file argument, send credentials to any endpoint other than www.anygen.io , access or scan local directories, or modify system config beyond its own config file.
Bundled scripts: scripts/anygen.py , scripts/auth.py , scripts/fileutil.py (Python — uses requests ). Scripts print machine-readable labels to stdout (e.g., File Token: , Task ID: ) as the standard agent-tool communication channel. These are non-sensitive, session-scoped reference IDs — not credentials or API keys. The agent should not relay raw script output to the user to keep the conversation natural (see Communication Style).
Prerequisites
-
Python3 and requests : pip3 install requests
-
AnyGen API Key (sk-xxx ) — Get one from AnyGen
-
Configure key: python3 scripts/anygen.py config set api_key "sk-xxx" (saved to ~/.config/anygen/config.json , chmod 600). Or set ANYGEN_API_KEY env var.
All scripts/ paths below are relative to this skill's installation directory.
Communication Style
Use natural language. Never expose task_id , file_token , task_xxx , tk_xxx , anygen.py , or command syntax to the user. Say "your images", "generating", "checking progress" instead. When presenting reply and prompt from prepare , preserve the original content as much as possible — translate into the user's language if needed, but do NOT rephrase, summarize, or add your own interpretation. Ask questions in your own voice (NOT "AnyGen wants to know…"). When prompting the user for an API key, MUST use Markdown link syntax: Get your AnyGen API Key so the full URL is clickable.
Image Generation Workflow (MUST Follow All 5 Phases)
Phase 1: Understand Requirements
If the user provides reference files, handle them before calling prepare :
-
Describe the reference image yourself if provided.
-
Reuse existing file_token if the same file was already uploaded in this conversation.
-
Get consent before uploading: "I'll upload your reference image to AnyGen. This may take a moment..."
-
Upload to get a file_token .
-
Include description in --message when calling prepare .
python3 scripts/anygen.py upload --file ./reference.png
Output: File Token: tk_abc123
python3 scripts/anygen.py prepare
--message "I need a poster design for a music festival. Style reference uploaded."
--file-token tk_abc123
--save ./conversation.json
Present questions from reply to the user — preserve the original content, translate into the user's language if needed. Continue with user's answers:
python3 scripts/anygen.py prepare
--input ./conversation.json
--message "Vibrant colors, modern style, include artist names and venue info"
--save ./conversation.json
Repeat until status="ready" with suggested_task_params .
Special cases:
-
status="ready" on first call → proceed to Phase 2.
-
User says "just create it" → skip to Phase 3 with create directly.
Phase 2: Confirm with User (MANDATORY)
When status="ready" , present the reply and the prompt from suggested_task_params to the user as the design plan. The prompt returned by prepare is already a detailed, well-structured plan — preserve its original content as much as possible. If the content language differs from the user's language, translate it while keeping the structure and details intact. Do NOT rephrase, summarize, or add your own interpretation.
Ask the user to confirm or request adjustments. NEVER auto-create without explicit approval.
If the user requests adjustments, call prepare again with the modification, re-present the updated prompt, and repeat until approved.
Phase 3: Create Task
python3 scripts/anygen.py create
--operation ai_designer
--prompt "<prompt from suggested_task_params>"
--file-token tk_abc123
Output: Task ID: task_xxx, Task URL: https://...
Immediately tell the user (natural language):
-
Image is being generated.
-
Online preview link: "You can follow the progress here: [URL]".
-
Takes about 5–10 minutes — free to do other things.
-
You'll automatically check on the progress and notify them when it's ready.
Phase 4: Monitor and Deliver Result
Requires sessions_spawn . If unavailable, skip to Fallback below.
Background Monitoring (preferred)
CRITICAL: When calling sessions_spawn , you MUST set a timeout of at least 25 minutes to ensure the background monitor completes. The poll operation can take up to 20 minutes.
Example spawn call syntax (adjust to match your OpenClaw/sessions API):
sessions_spawn( prompt=<subagent prompt below>, runTimeoutSeconds=1500 # REQUIRED: 25 minutes (1500s) to cover 20-min poll + buffer )
If your sessions API uses different parameter names (e.g., timeout , maxDuration ), use the equivalent parameter to set a 25-minute (1500 second) timeout.
Subagent prompt (it has NO conversation context):
You are a background monitor for an image generation task. You MUST strictly follow every instruction below. Do not skip, reorder, or improvise any step.
Task ID: {task_id} Task URL: {task_url} Script: {script_path} User Language: {user_language}
IMPORTANT: This monitoring task may take up to 20 minutes. Ensure your execution timeout is at least 25 minutes.
CRITICAL RULES:
- You MUST reply in {user_language}.
- After completing ALL steps (including sending messages to the user), your FINAL reply MUST be EXACTLY "ANNOUNCE_SKIP" — nothing else. This prevents the main session from sending duplicate messages.
- Do NOT say anything beyond what is specified below. No greetings, no extra commentary.
- Use natural language in all messages to the user. Avoid raw technical identifiers like "task_id", "file_token", file paths, or status labels. The user already knows their content is being generated — just provide the result naturally.
Your job:
-
Run: python3 {script_path} poll --task-id {task_id} This command blocks for up to 20 minutes waiting for task completion. No --output needed — results are viewed online.
-
On success: a. Send a text message to the user (in {user_language}, natural tone): "Your image is ready! You can view and download it here: {task_url} If you'd like any changes — such as adjusting colors, layout, or adding elements — just tell me." b. Reply EXACTLY: ANNOUNCE_SKIP
-
On failure: a. Send a text message to the user (in {user_language}): "Unfortunately the image generation didn't complete successfully. You can check the details here: {task_url}" b. Reply EXACTLY: ANNOUNCE_SKIP
-
On timeout (20 min): a. Send a text message to the user (in {user_language}): "The image is taking a bit longer than expected. You can check the progress here: {task_url}" b. Reply EXACTLY: ANNOUNCE_SKIP
Do NOT wait for the background monitor to finish — continue the conversation immediately.
Handling the completion event. The background monitor sends the notification to the user directly. It replies ANNOUNCE_SKIP as its final output, which means the main session should NOT relay or duplicate any message. If you receive a completion event with ANNOUNCE_SKIP , simply ignore it — the user has already been notified.
Fallback (no background monitoring)
Tell the user: "I've started generating your image. It usually takes about 5–10 minutes. You can check the progress here: [Task URL]. Let me know when you'd like me to check if it's ready!"
Phase 5: Multi-turn Conversation (Modify Completed Images)
After a task has completed (Phase 4 finished), the user may request modifications such as:
-
"Make the background darker"
-
"Change the text to bold font"
-
"Add a logo in the top-right corner"
-
"Adjust the color scheme to blue tones"
When the user requests changes to an already-completed task, use the multi-turn conversation API instead of creating a new task.
IMPORTANT: You MUST remember the task_id from Phase 3 throughout the conversation. When the user asks for modifications, use the same task_id .
Step 1: Send Modification Request
python3 scripts/anygen.py send-message --task-id {task_id} --message "Make the background color darker and add more contrast"
Output: Message ID: 123, Status: processing
Save the returned Message ID — you'll need it to detect the AI reply.
Immediately tell the user (natural language, NO internal terms):
- "I'm working on your changes now. I'll let you know when they're done."
Step 2: Monitor for AI Reply
Requires sessions_spawn . If unavailable, skip to Multi-turn Fallback below.
CRITICAL: When calling sessions_spawn , you MUST set a timeout of at least 10 minutes (600 seconds). Modifications are faster than initial generation.
Example spawn call syntax:
sessions_spawn( prompt=<subagent prompt below>, runTimeoutSeconds=600 # REQUIRED: 10 minutes (600s) )
Subagent prompt (it has NO conversation context):
You are a background monitor for an image modification task. You MUST strictly follow every instruction below. Do not skip, reorder, or improvise any step.
Task ID: {task_id} Task URL: {task_url} Script: {script_path} User Message ID: {user_message_id} User Language: {user_language}
IMPORTANT: This monitoring task may take up to 8 minutes. Ensure your execution timeout is at least 10 minutes.
CRITICAL RULES:
- You MUST reply in {user_language}.
- After completing ALL steps (including sending messages to the user), your FINAL reply MUST be EXACTLY "ANNOUNCE_SKIP" — nothing else. This prevents the main session from sending duplicate messages.
- Do NOT say anything beyond what is specified below. No greetings, no extra commentary.
- Use natural language in all messages to the user. Avoid raw technical identifiers like "task_id", "message_id", file paths, or status labels.
Your job:
-
Run: python3 {script_path} get-messages --task-id {task_id} --wait --since-id {user_message_id} This command blocks until the AI reply is completed.
-
On success (AI reply received): a. Send a text message to the user (in {user_language}, natural tone): "Your changes are done! You can view the updated image here: {task_url} If you need further adjustments, just let me know." b. Reply EXACTLY: ANNOUNCE_SKIP
-
On failure / timeout: a. Send a text message to the user (in {user_language}): "The modification didn't complete as expected. You can check the details here: {task_url}" b. Reply EXACTLY: ANNOUNCE_SKIP
Do NOT wait for the background monitor to finish — continue the conversation immediately.
Multi-turn Fallback (no background monitoring)
Tell the user: "I've sent your changes. You can check the progress here: [Task URL]. Let me know when you'd like me to check if it's done!"
When the user asks you to check, use:
python3 scripts/anygen.py get-messages --task-id {task_id} --limit 5
Look for a completed assistant message and relay the content to the user naturally.
Subsequent Modifications
The user can request multiple rounds of modifications. Each time, repeat Phase 5:
-
send-message with the new modification request
-
Background-monitor with get-messages --wait
-
Notify the user with the online link when done
All modifications use the same task_id — do NOT create a new task.
Notes
-
Max task execution time: 20 minutes
-
Image download available from task URL
-
Poll interval: 3 seconds