gemini-imagen

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "gemini-imagen" with this command: npx skills add agentiveau/myagentive/agentiveau-myagentive-gemini-imagen

Gemini Imagen

Overview

This skill enables image generation from text prompts using Google's Gemini Imagen API. It provides a reusable script that handles API authentication, request formatting, response processing, and automatic image saving with proper error handling.

When to Use This Skill

Use this skill when the user requests:

  • Creating or generating images from text descriptions

  • Visualizing concepts, scenes, or objects through AI-generated imagery

  • Producing multiple variations of an image concept

  • Creating images with specific aspect ratios or quality levels

Example requests:

  • "Generate an image of a sunset over mountains"

  • "Create a logo concept showing a geometric bird"

  • "Make me an image of a futuristic city at night in 16:9 ratio"

  • "Generate 3 variations of a robot painting artwork"

Configuration

API Key Setup

The Gemini API requires an API key for authentication. Obtain a key from Google AI Studio.

Recommended approach: Store the API key as an environment variable:

export GEMINI_API_KEY="your-api-key-here"

Alternatively, pass the key directly when invoking the script (less secure for shared environments).
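The lookup order described above can be sketched in Python. This is a minimal illustration, not the skill's actual code: `resolve_api_key` is a hypothetical helper name, and the assumption is that an explicitly passed key takes precedence over the `GEMINI_API_KEY` environment variable.

```python
import os
from typing import Optional


def resolve_api_key(cli_value: Optional[str] = None) -> str:
    """Return the API key, preferring an explicit value over the environment.

    Hypothetical helper: the real script may resolve the key differently.
    """
    # An explicitly passed key wins; otherwise fall back to GEMINI_API_KEY.
    key = (cli_value or os.environ.get("GEMINI_API_KEY", "")).strip()
    if not key:
        raise ValueError(
            "No API key found: set GEMINI_API_KEY or pass the key explicitly."
        )
    return key
```

Failing early with a clear message avoids sending a request that the API would reject anyway.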

Python Dependencies

The script requires these Python packages:

  • requests: HTTP client for API calls

  • Pillow: Image processing library

These are included in the project's shared virtual environment. Activate it before running:

source .venv/bin/activate # On Windows: .venv\Scripts\activate

Generating Images

Basic Usage

To generate a single image with default settings:

python scripts/generate_image.py "your prompt here" --api-key "$GEMINI_API_KEY"

The script will:

  • Send the prompt to the Gemini Imagen API

  • Receive and decode the generated image(s)

  • Save images with timestamped filenames (e.g., gemini_image_20231123_142530_1.png)

  • Display progress and file paths
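The request and decode steps listed above might look roughly like the sketch below. It assumes the REST shape used for Imagen models in the Gemini API (a `:predict` endpoint, a `sampleCount` parameter, and `bytesBase64Encoded` fields in the response). The actual `generate_image.py` is not shown in this document, so the function names and structure are illustrative only.

```python
import base64
from typing import Any, Dict, List

# Assumed endpoint pattern; not confirmed by this skill's script.
API_URL_TEMPLATE = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:predict"
)


def build_request(
    prompt: str, num_images: int = 1, aspect_ratio: str = "1:1"
) -> Dict[str, Any]:
    """Build the JSON body for one generation request (assumed field names)."""
    return {
        "instances": [{"prompt": prompt}],
        "parameters": {"sampleCount": num_images, "aspectRatio": aspect_ratio},
    }


def decode_images(response_json: Dict[str, Any]) -> List[bytes]:
    """Extract raw image bytes from a response, skipping entries without data."""
    images = []
    for prediction in response_json.get("predictions", []):
        encoded = prediction.get("bytesBase64Encoded")
        if encoded:
            images.append(base64.b64decode(encoded))
    return images
```

The body returned by `build_request` would be sent with `requests.post` to the formatted URL, with the key in an `x-goog-api-key` header. The decoded bytes can then be written to disk or opened with Pillow.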

Advanced Options

Model Selection

Choose from three quality/speed tiers:

Fast generation (default) - quickest, good quality

--model imagen-4.0-fast-generate-001

Standard generation - balanced speed and quality

--model imagen-4.0-generate-001

Ultra generation - highest quality, slower

--model imagen-4.0-ultra-generate-001

Aspect Ratios

Generate images in different dimensions:

Square (default)

--aspect-ratio 1:1

Portrait orientations

--aspect-ratio 3:4
--aspect-ratio 9:16

Landscape orientations

--aspect-ratio 4:3
--aspect-ratio 16:9

Multiple Images

Generate up to 4 variations in a single request:

--num 4

Output Directory

Specify where to save generated images:

--output ./generated_images
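For orientation, the options above could be declared with `argparse` along these lines. This is a sketch under stated assumptions, not the script's real parser: the flag names, model identifiers, and aspect ratios come from this document, while the default output directory (`.`) and the optional `--api-key` are guesses.

```python
import argparse

MODELS = [
    "imagen-4.0-fast-generate-001",   # fast (default)
    "imagen-4.0-generate-001",        # standard
    "imagen-4.0-ultra-generate-001",  # ultra
]
ASPECT_RATIOS = ["1:1", "3:4", "4:3", "9:16", "16:9"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI options documented above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Generate images from a text prompt.")
    parser.add_argument("prompt", help="Text description of the image")
    parser.add_argument("--api-key", default=None, help="Gemini API key")
    parser.add_argument("--model", choices=MODELS, default=MODELS[0])
    parser.add_argument("--aspect-ratio", choices=ASPECT_RATIOS, default="1:1")
    # At most 4 images per request, as stated above.
    parser.add_argument("--num", type=int, choices=range(1, 5), default=1)
    # Assumed default: the current directory.
    parser.add_argument("--output", default=".", help="Directory for saved images")
    return parser
```

Restricting `--model`, `--aspect-ratio`, and `--num` with `choices` rejects invalid values before any API call is made.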

Complete Examples

Generate a high-quality landscape image:

python scripts/generate_image.py \
  "Majestic mountain range at golden hour with dramatic clouds" \
  --api-key "$GEMINI_API_KEY" \
  --model imagen-4.0-ultra-generate-001 \
  --aspect-ratio 16:9 \
  --output ./landscapes

Create multiple logo variations:

python scripts/generate_image.py \
  "Minimalist geometric logo for tech startup, blue and white" \
  --api-key "$GEMINI_API_KEY" \
  --num 4 \
  --aspect-ratio 1:1 \
  --output ./logo_concepts

Quick social media graphic:

python scripts/generate_image.py \
  "Abstract colorful pattern for social media background" \
  --api-key "$GEMINI_API_KEY" \
  --aspect-ratio 9:16 \
  --output ./social_media

Workflow Integration

When a user requests image generation:

  • Extract the prompt from the user's request

  • Determine parameters based on context:

      • Aspect ratio (square for logos, 16:9 for presentations, etc.)

      • Number of variations (if user wants options)

      • Quality tier (ultra for final outputs, fast for iteration)

  • Invoke the script with appropriate parameters

  • Show the generated images to the user and provide file paths

  • Iterate if needed with refined prompts or different parameters
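The parameter-selection and invocation steps above can be sketched as a small Python helper. The function name, the `use_case` labels, and the mapping are hypothetical; only the flags, the model names, and the "square for logos, 16:9 for presentations" guidance come from this document.

```python
from typing import List

# Illustrative mapping only; extend it to match the contexts you handle.
ASPECT_BY_USE_CASE = {"logo": "1:1", "presentation": "16:9"}


def build_command(
    prompt: str,
    api_key: str,
    use_case: str = "",
    variations: int = 1,
    final_output: bool = False,
) -> List[str]:
    """Translate a user request into an argument list for the script."""
    aspect_ratio = ASPECT_BY_USE_CASE.get(use_case, "1:1")
    # Ultra for final outputs, fast for iteration.
    model = (
        "imagen-4.0-ultra-generate-001"
        if final_output
        else "imagen-4.0-fast-generate-001"
    )
    # Clamp to the documented range of 1 to 4 images per request.
    num = max(1, min(variations, 4))
    return [
        "python", "scripts/generate_image.py", prompt,
        "--api-key", api_key,
        "--model", model,
        "--aspect-ratio", aspect_ratio,
        "--num", str(num),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.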

Best Practices

Prompt Engineering

  • Be specific and descriptive: Include details about style, lighting, composition, colors

  • Specify art style if desired: "digital art", "oil painting", "photorealistic", "minimalist"

  • Mention important elements: Objects, subjects, background, atmosphere

  • Include quality keywords: "high detail", "professional", "award-winning"

Example good prompt:

"A serene Japanese garden with cherry blossoms in full bloom, koi pond in foreground, traditional stone lantern, soft morning light, photorealistic style, high detail"

Example basic prompt (works but less controlled):

"Japanese garden"
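One way to apply the guidance above consistently is to assemble prompts from named parts. The helper below is a hypothetical convenience, not part of the skill; it joins a subject, supporting details, an art style, and quality keywords into a single comma-separated prompt.

```python
from typing import Sequence


def compose_prompt(
    subject: str,
    details: Sequence[str] = (),
    style: str = "",
    quality: Sequence[str] = (),
) -> str:
    """Join prompt components, skipping any that are empty."""
    parts = [subject, *details]
    if style.strip():
        parts.append(f"{style.strip()} style")
    parts.extend(quality)
    return ", ".join(p.strip() for p in parts if p and p.strip())


# Rebuilds the detailed example prompt shown above.
prompt = compose_prompt(
    "A serene Japanese garden with cherry blossoms in full bloom",
    details=[
        "koi pond in foreground",
        "traditional stone lantern",
        "soft morning light",
    ],
    style="photorealistic",
    quality=["high detail"],
)
```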

Model Selection

  • Fast model: Prototyping, iteration, quick previews, high-volume generation

  • Standard model: General-purpose images, balanced quality and speed

  • Ultra model: Final outputs, client presentations, high-stakes visuals

Error Handling

The script handles common errors:

  • Invalid API keys → Check API key configuration

  • Network timeouts → Verify internet connection, retry request

  • Rate limiting → Wait and retry, consider reducing simultaneous requests

  • Invalid parameters → Review the --model, --aspect-ratio, and --num values
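The "wait and retry" advice for timeouts and rate limiting is often implemented as exponential backoff. The sketch below is generic and does not describe the script's internals: `RetryableError` and `with_retries` are hypothetical names, and the caller is assumed to raise `RetryableError` for transient failures such as a timeout or an HTTP 429 response.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for transient failures (timeouts, rate limits)."""


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except RetryableError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Errors that retrying cannot fix, such as an invalid API key or invalid parameters, should not be wrapped in `RetryableError`, so they fail immediately.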

Output Format

Generated images are saved as PNG files with:

  • Naming convention: gemini_image_YYYYMMDD_HHMMSS_N.png

  • Timestamp: Ensures unique filenames across runs

  • Sequential numbering: When generating multiple images

  • SynthID watermark: Automatically embedded by Imagen API
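The naming convention above can be reproduced with a few lines of Python. This is a sketch of the pattern only: `output_paths` is a hypothetical helper, and numbering is assumed to start at 1, as in the example filename gemini_image_20231123_142530_1.png.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def output_paths(
    output_dir: str, count: int, now: Optional[datetime] = None
) -> List[Path]:
    """Build gemini_image_YYYYMMDD_HHMMSS_N.png paths for one run."""
    # One timestamp per run; the index N distinguishes images within it.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return [
        Path(output_dir) / f"gemini_image_{stamp}_{i}.png"
        for i in range(1, count + 1)
    ]
```

Because the timestamp has one-second resolution, two runs that start within the same second and write to the same directory would produce identical names under this scheme.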

Resources

scripts/generate_image.py

The main image generation script that handles:

  • API authentication and request formatting

  • Base64 image decoding and PIL processing

  • Automatic file saving with timestamps

  • Comprehensive error handling and user feedback

  • Command-line interface with all customization options

Invoke directly from the command line or integrate into larger workflows.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
