gpt-image2

Generate high-quality images with GPT Image 2 (OpenAI gpt-image-2) via the ClawdChat tool gateway. Use when the user asks to create, generate, draw, or paint an image; mentions GPT Image, gpt-image-2, or OpenAI image generation; or needs accurate text rendering (posters, infographics, menu typography), strict multi-element prompt following, image-to-image with subject/identity preservation, or a specific style such as Ghibli, Pixar, LEGO, cyberpunk, claymation, or Pop Mart figurine.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "gpt-image2" with this command: npx skills add lxyd-ai/gpt-image2

GPT Image 2 — High-quality AI image generation

Powered by ClawdChat — calls OpenAI gpt-image-2 through the Uno tool gateway.

What this skill does

Two command-line invocations against the public ClawdChat tool gateway:

| Tool slug | Purpose | Cost |
|---|---|---|
| `gpt-image-2.gpt_image2_submit` | Submit a generation job; returns `job_id` immediately (async) | 300 credits / call |
| `gpt-image-2.gpt_image2_result` | Poll job status / fetch image URL when ready | 0 credits |

The bundled bin/uno.py handles authentication, HTTP transport, and rate-limiting. No external skill dependency required — install this skill and you're ready.

Credentials & permissions (read before first use)

  • Credential type: ClawdChat API key (Bearer token).
  • Where it lives: ~/.uno/credentials.json. Managed entirely by the bundled bin/uno.py; this skill never opens, prints or copies it.
  • How it was obtained: run python bin/uno.py login (see Setup below). The bundled script drives a ClawdChat OAuth device-code flow and stores the resulting token.
  • What it authorises: calling the ClawdChat tool gateway as the logged-in user. Each gpt_image2_submit deducts 300 credits from that account.
  • Network egress: the user's prompt text and any reference_image_urls are sent to the ClawdChat gateway over HTTPS. Do not paste private, confidential, or personally-identifying content into the prompt unless you are comfortable with the gateway's data handling — see https://clawdchat.cn for the data policy.
  • What the credential authorises: OAuth scope mcp:gpt-image-2 — scoped specifically to gpt-image-2 tool calls. Only gpt-image-2.gpt_image2_submit and gpt-image-2.gpt_image2_result are invoked; no other gateway tools are called.
  • Login output: the login --poll step returns the new API key once in its JSON response (standard device-code OAuth confirmation). Treat that terminal output as a secret — do not log or share it. bin/uno.py immediately persists the key to ~/.uno/credentials.json and does not print it again.
  • Logging out / revoking: run python bin/uno.py logout to delete ~/.uno/credentials.json and end the local session.

Cost transparency & confirmation rule

Every gpt_image2_submit call costs the logged-in account real credits. The agent must:

  1. Show the user the planned prompt, size, style, and number of images before the first call.
  2. Ask for explicit confirmation when the user has not already approved a generation in the current turn.
  3. For multi-image batches (n > 1) or retries, treat each submission as a separate spending event and confirm again unless the user has pre-authorised the batch.
  4. On error responses, surface the error to the user instead of silently retrying.

Polling via gpt_image2_result is free; only submit spends credits.
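The spending rule reduces to simple arithmetic. A minimal sketch (the helper name is mine; the flat fee comes from the tool table above — whether `n > 1` changes the per-call price is not specified, so this counts submit calls only):

```python
SUBMIT_COST = 300  # credits per gpt_image2_submit call, per the tool table

def planned_cost(num_submits: int) -> int:
    """Credits a batch will spend: only submit calls cost anything,
    so every retry and every extra submission adds SUBMIT_COST."""
    return SUBMIT_COST * num_submits
```

Show this figure to the user before asking for confirmation.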

Setup

No external skill dependency. After clawhub install gpt-image2 the layout is:

gpt-image2/
├── SKILL.md
└── bin/
    └── uno.py    # bundled CLI — no extra install needed

All commands below run from inside the gpt-image2/ folder.

Check if already logged in

python bin/uno.py whoami --compact
  • Returns user info (name, email, credits) → credentials valid, skip login.
  • Returns {"error": "Not logged in"} → proceed to login below.
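The whoami check above can be wrapped in a small decision helper; `needs_login` and `check_login` are hypothetical names, not part of `bin/uno.py`:

```python
import json
import subprocess

def needs_login(whoami_json: dict) -> bool:
    # bin/uno.py whoami returns {"error": "Not logged in"} when no credential exists
    return whoami_json.get("error") == "Not logged in"

def check_login() -> bool:
    """Run `python bin/uno.py whoami --compact` and report whether login is required."""
    r = subprocess.run(["python", "bin/uno.py", "whoami", "--compact"],
                       capture_output=True, text=True)
    return needs_login(json.loads(r.stdout))
```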

Log in

python bin/uno.py login --start

This prints a device code and a URL like https://clawdtools.uno/device?code=XXXX.

Open that URL in a browser. If not yet signed in to clawdtools.uno, the page redirects to ClawdChat SSO automatically and returns to the Authorise screen. Click "Authorise".

Then poll for completion:

python bin/uno.py login --poll DEVICE_CODE

Or run python bin/uno.py login (blocking, polls automatically).

Credential file (~/.uno/credentials.json) is written by bin/uno.py and reused on subsequent runs.

Generating an image — full async flow

A single 1024×1024 image typically takes ~150 s, longer than the default MCP 60 s timeout. Always use the submit → poll-result pattern.

Step 1 — submit

python bin/uno.py call gpt-image-2.gpt_image2_submit --compact \
  --args '{"prompt":"A shiba inu under cherry blossoms, sunny afternoon","size":"1024x1024","style":"ghibli_anime"}'

Response (already flattened — no need to unwrap content[0].text):

{"success": true, "data": {"status": "pending", "job_id": "0b84b8f0f0c8", "estimated_seconds": 150}, "meta": {"latency_ms": 120, "credits_used": 300}}

Record data.job_id.
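Extracting the job id defensively (a sketch; `extract_job_id` is a name of mine, and the `error`/`hint` fields follow the documented error shape):

```python
def extract_job_id(resp: dict) -> str:
    """Pull data.job_id from a gpt_image2_submit response, raising on failure."""
    if not resp.get("success"):
        raise RuntimeError(f"submit failed: {resp.get('error')} (hint: {resp.get('hint')})")
    return resp["data"]["job_id"]
```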

Step 2 — poll for result

python bin/uno.py call gpt-image-2.gpt_image2_result --compact --timeout 70 \
  --args '{"job_id":"0b84b8f0f0c8","wait_seconds":50}'

wait_seconds=50 makes the server-side wait 50 s (within the 60 s MCP envelope); --timeout 70 adds a small client buffer.

Repeat the call until data.status is one of:

  • done — image ready, URLs in data.items[].url.
  • error — generation failed, message in data.error.
  • pending / running — call again immediately. Do not add a client-side sleep; the server already waited 50 s on your behalf.

Three to five iterations (~150–250 s total) is normal.
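That estimate follows directly from the 50 s server-side wait; a trivial budgeting helper (name mine):

```python
import math

def polls_needed(estimated_seconds: int, wait_seconds: int = 50) -> int:
    """How many gpt_image2_result calls to expect for a given generation time."""
    return math.ceil(estimated_seconds / wait_seconds)
```

For the typical 150–250 s range this gives three to five polls, matching the note above.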

Reference Python loop

Use subprocess.run with an argument list to safely pass arbitrary prompt text without shell-injection risk:

import json, subprocess, sys

UNO = "bin/uno.py"

def uno(args):
    r = subprocess.run(["python", UNO] + args, capture_output=True, text=True)
    return json.loads(r.stdout)

prompt = "Van Gogh starry night"
resp = uno(["call", "gpt-image-2.gpt_image2_submit", "--compact",
            "--args", json.dumps({"prompt": prompt, "style": "oil_painting_vangogh"})])
job_id = resp["data"]["job_id"]

for _ in range(6):
    r = uno(["call", "gpt-image-2.gpt_image2_result", "--compact", "--timeout", "70",
             "--args", json.dumps({"job_id": job_id, "wait_seconds": 50})])
    status = r["data"]["status"]
    if status == "done":
        print(json.dumps(r, ensure_ascii=False, indent=2))
        break
    if status == "error":
        print("Error:", r["data"].get("error"), file=sys.stderr)
        sys.exit(1)
else:
    # Loop exhausted while still pending/running: the job is not lost —
    # it can be re-polled later with the same job_id.
    print(f"Still running; re-poll later with job_id {job_id}", file=sys.stderr)

Parameters

| Field | Meaning | Values |
|---|---|---|
| `prompt` | Image description (required, any language) | free text |
| `size` | Image dimensions | `1024x1024` (default), `1024x1536` (portrait), `1536x1024` (landscape), `auto` |
| `n` | Number of images to generate | 1–4 (default 1) |
| `style` | Built-in style preset | one of the 20 keys below |
| `reference_image_urls` | Reference images (image-to-image) | URL string; comma-separated for multiple |
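Putting the parameters together, a sketch of assembling the `--args` payload (`build_submit_args` is a hypothetical helper; the field names and the comma-joined URL convention come from the table):

```python
import json

def build_submit_args(prompt, size="1024x1024", n=1, style=None, reference_image_urls=None):
    """Build the JSON string passed via --args to gpt_image2_submit.
    reference_image_urls: optional list of URLs, joined comma-separated."""
    args = {"prompt": prompt, "size": size, "n": n}
    if style:
        args["style"] = style
    if reference_image_urls:
        args["reference_image_urls"] = ",".join(reference_image_urls)
    return json.dumps(args, ensure_ascii=False)
```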

20 built-in style presets

| key | description |
|---|---|
| `ghibli_anime` | Studio Ghibli / hand-drawn anime |
| `pixar_3d` | Pixar / Disney 3D animation |
| `claymation` | Stop-motion claymation (Laika / Aardman) |
| `lego_brick` | LEGO bricks |
| `popmart_figurine` | Blind-box / Pop Mart figurine |
| `isometric_game` | Isometric 2.5D game scene |
| `cinematic_photo` | Cinematic photorealism (35mm) |
| `polaroid_film` | Polaroid film snapshot |
| `watercolor_ink` | Watercolour / East-Asian ink wash |
| `oil_painting_vangogh` | Van Gogh impasto oil painting |
| `cyberpunk_neon` | Cyberpunk neon nightscape |
| `vintage_infographic` | Retro infographic / data poster |
| `movie_poster` | Movie poster (large title + still) |
| `flat_vector` | Flat-vector illustration / banner |
| `pixel_8bit` | Pixel art (8/16-bit) |
| `papercraft_layered` | Layered papercraft |
| `exploded_diagram` | Exploded technical diagram |
| `dreamcore_liminal` | Dreamcore / liminal space |
| `knolling_flatlay` | Top-down knolling / flat-lay |
| `botanical_engraving` | Botanical engraving / antique illustration |

Where this model shines (vs Midjourney / Flux / SD)

  • Accurate text rendering — poster headlines, infographics, menu typography, meme captions: written into the image as specified.
  • Strong prompt following — multi-element scenes, ordering and spatial relationships obeyed.
  • Subject preservation in image-to-image — faces, brands, and characters stay consistent across reference images.
  • Wide style coverage — Ghibli, Pixar, claymation, LEGO, Pop Mart, botanical engraving etc. all handled.

Agent guidance

  • Tell the user up-front that one image takes ~150 s.
  • The gpt_image2_result tool already sleeps 50 s server-side — never add an extra client-side sleep between polls.
  • Use --timeout 70 for result calls (50 s server wait + buffer).
  • Pass the user's prompt verbatim, including non-English text.
  • Reference images: combine reference_image_urls with a style preset for "restyle while keeping the subject".
  • Posters / infographics / menus: lean on the text-rendering strength.
  • If submit returns success=false, surface the error/hint fields to the user.
  • If the polling loop exhausts while status is still pending / running, tell the user the job can be re-polled later with the same job_id.
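For that last point, resuming only needs the saved job_id. A sketch of rebuilding the CLI arguments for one more poll (`resume_args` is a name of mine):

```python
import json

def resume_args(job_id: str, wait_seconds: int = 50) -> list:
    """Argument list for `python bin/uno.py` to poll an existing job once more."""
    return ["call", "gpt-image-2.gpt_image2_result", "--compact",
            "--timeout", "70",
            "--args", json.dumps({"job_id": job_id, "wait_seconds": wait_seconds})]
```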

Response shape

{
  "success": true,
  "data": {"status": "...", "job_id": "...", "items": [{"url": "..."}]},
  "meta": {"latency_ms": 120, "credits_used": 300}
}

Read data.status, data.job_id, data.items[].url directly.

Errors:

{"success": false, "error": "...", "hint": "..."}
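Both the success and the error shape can be reduced with one small parser (a sketch; `parse_result` is not part of `bin/uno.py`):

```python
def parse_result(resp: dict):
    """Map a gpt_image2_result response to a (status, urls, error) tuple."""
    if not resp.get("success"):
        return ("error", [], resp.get("error"))
    data = resp["data"]
    urls = [item["url"] for item in data.get("items", [])]
    return (data["status"], urls, data.get("error"))
```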
