image-generation

Name: Image Generation
Author: massgen

// Guide to image generation and editing in MassGen. Use when creating images, editing existing images, iterating on image designs, or choosing between image backends (OpenAI, Google Gemini/Imagen, Grok, OpenRouter).

stars:1,041

forks:161

updated:March 6, 2026 at 16:37

SKILL.md

name	image-generation
description	Guide to image generation and editing in MassGen. Use when creating images, editing existing images, iterating on image designs, or choosing between image backends (OpenAI, Google Gemini/Imagen, Grok, OpenRouter).

Image Generation

Generate images by calling generate_media with mode="image". When no backend is specified, the system selects the highest-priority backend for which an API key is available (see Backend Comparison).

Quick Start

# Simple text-to-image (auto-selects backend)
generate_media(prompt="A cat in space", mode="image")

# Specify backend and quality
generate_media(prompt="A logo for a coffee shop", mode="image",
               backend_type="openai", quality="high")

# Batch generation (parallel)
generate_media(prompts=["sunset over ocean", "mountain landscape", "city at night"],
               mode="image", max_concurrent=3)
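The batch call above caps how many generations run at once. How MassGen enforces that cap is not documented here, so the following is only a minimal sketch of bounded parallelism using the standard library. `run_batch` and `fake_generate` are hypothetical names, and the result shape is invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_batch(prompts: List[str], generate_one: Callable[[str], dict],
              max_concurrent: int = 3) -> List[dict]:
    """Run generate_one over prompts with at most max_concurrent in flight.

    Results come back in the same order as the prompts.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(generate_one, prompts))


def fake_generate(prompt: str) -> dict:
    # Stand-in for a real backend call; the returned fields are made up.
    return {"prompt": prompt, "file": prompt.replace(" ", "_") + ".png"}


results = run_batch(["sunset over ocean", "city at night"], fake_generate,
                    max_concurrent=2)
print([r["file"] for r in results])  # ['sunset_over_ocean.png', 'city_at_night.png']
```

Threads suit this kind of workload because each generation mostly waits on a network response.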

Backend Comparison

| Backend | Default Model | Strengths | API Key |
| --- | --- | --- | --- |
| Google (priority 1) | `gemini-3.1-flash-image-preview` (Nano Banana 2) | Fast, flexible sizes, image editing, multi-turn | `GOOGLE_API_KEY` or `GEMINI_API_KEY` |
| OpenAI (priority 2) | `gpt-5.4` | High quality, transparent backgrounds, continuation via response ID | `OPENAI_API_KEY` |
| Grok (priority 3) | `grok-imagine-image` | 1k resolution, continuation via stored data URI | `XAI_API_KEY` |
| OpenRouter (priority 4) | `google/gemini-3.1-flash-image-preview` | Access to multiple models via single API | `OPENROUTER_API_KEY` |
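The priority column suggests a simple rule: take the first backend, in priority order, whose API key is set. The sketch below illustrates that rule under the assumption that a non-empty environment variable counts as "available". `pick_backend` is a hypothetical helper, not MassGen's actual selection code.

```python
import os
from typing import Mapping, Optional

# Priority order and key names are copied from the table above; the
# function itself is a hypothetical illustration, not MassGen's code.
BACKEND_KEYS = [
    ("google", ("GOOGLE_API_KEY", "GEMINI_API_KEY")),
    ("openai", ("OPENAI_API_KEY",)),
    ("grok", ("XAI_API_KEY",)),
    ("openrouter", ("OPENROUTER_API_KEY",)),
]


def pick_backend(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the highest-priority backend with a non-empty API key, or None."""
    env = os.environ if env is None else env
    for backend, key_names in BACKEND_KEYS:
        if any(env.get(name) for name in key_names):
            return backend
    return None


print(pick_backend({"OPENAI_API_KEY": "sk-example", "XAI_API_KEY": "xai-example"}))  # openai
```

Passing backend_type explicitly, as in the Quick Start, bypasses this kind of selection entirely.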

Key Parameters

| Parameter | Description | Example |
| --- | --- | --- |
| `prompt` | Text description of the image | `"A watercolor painting of mountains"` |
| `backend_type` | Force a specific backend | `"google"`, `"openai"`, `"grok"`, `"openrouter"` |
| `model` | Override default model | `"gemini-3-pro-image-preview"` for studio quality |
| `quality` | Image quality (OpenAI) | `"low"`, `"medium"`, `"high"`, `"auto"` |
| `size` | Image dimensions | See backends reference |
| `aspect_ratio` | Aspect ratio | `"16:9"`, `"1:1"`, `"4:5"` |
| `input_images` | Source images for image-to-image editing | `["photo.jpg"]` |
| `continue_from` | Continuation ID for multi-turn editing | `result["continuation_id"]` |
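When several of these parameters are optional, it can help to assemble the keyword arguments in one place and catch typos before calling the tool. The helper below is a hypothetical sketch: `image_kwargs` is not part of MassGen, the quality values are the ones listed in the table, and the aspect-ratio check only verifies the "W:H" shape, not whether a given backend accepts that ratio.

```python
from typing import Any, Dict, Optional

ALLOWED_QUALITY = {"low", "medium", "high", "auto"}  # values listed in the table


def image_kwargs(prompt: str, *, backend_type: Optional[str] = None,
                 quality: Optional[str] = None,
                 aspect_ratio: Optional[str] = None) -> Dict[str, Any]:
    """Assemble generate_media keyword arguments, omitting unset options."""
    if quality is not None and quality not in ALLOWED_QUALITY:
        raise ValueError(f"quality must be one of {sorted(ALLOWED_QUALITY)}")
    if aspect_ratio is not None:
        width, _, height = aspect_ratio.partition(":")
        if not (width.isdigit() and height.isdigit()
                and int(width) > 0 and int(height) > 0):
            raise ValueError("aspect_ratio must look like '16:9'")
    kwargs = {"prompt": prompt, "mode": "image", "backend_type": backend_type,
              "quality": quality, "aspect_ratio": aspect_ratio}
    # Drop anything left unset so the tool's own defaults apply.
    return {name: value for name, value in kwargs.items() if value is not None}


print(image_kwargs("A watercolor painting of mountains",
                   quality="high", aspect_ratio="16:9"))
```

The resulting dictionary could then be unpacked into the call, for example generate_media(**image_kwargs(...)).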

Image-to-Image Editing

Transform existing images by providing input_images:

generate_media(
    prompt="Make it look like a watercolor painting",
    mode="image",
    input_images=["photo.jpg"]
)

Supported backends for image-to-image: Google (Gemini), OpenAI, Grok. If the current backend doesn't support image-to-image, the system automatically selects one that does.
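The fallback described above can be pictured as a small lookup. This is a sketch only: the supported set comes from the sentence above, the fallback order reuses the priorities from the Backend Comparison table (an assumption), and `backend_for_editing` is a hypothetical name.

```python
from typing import Optional, Sequence

# Backends listed above as supporting image-to-image, in assumed priority order.
IMAGE_TO_IMAGE_BACKENDS = ("google", "openai", "grok")


def backend_for_editing(current: Optional[str],
                        available: Sequence[str]) -> Optional[str]:
    """Keep the current backend if it can edit images, else pick a fallback."""
    if current in IMAGE_TO_IMAGE_BACKENDS:
        return current
    for backend in IMAGE_TO_IMAGE_BACKENDS:
        if backend in available:
            return backend
    return None  # no configured backend supports image-to-image


print(backend_for_editing("openrouter", ["openrouter", "grok"]))  # grok
```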

Multi-Turn Editing (Continuation)

Iteratively refine images using continue_from:

# First generation
result = generate_media(prompt="A logo for a coffee shop", mode="image")

# Refine using the continuation ID
result2 = generate_media(
    prompt="Make the text larger and add a cup icon",
    mode="image",
    continue_from=result["continuation_id"]
)

Each backend uses a different continuation mechanism:

OpenAI: Passes previous_response_id (stateless)
Google Gemini: In-memory chat store (LRU, 50 items)
Grok: In-memory data URI store (LRU, 50 items)

Continuation works only for single-image generation, not batch generation.
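The "LRU, 50 items" stores have a practical consequence: an old continuation ID can stop resolving once enough newer ones have been created, and nothing survives a process restart. The class below is a minimal sketch of such a store built on `collections.OrderedDict`. `ContinuationStore` is a hypothetical name and this is not MassGen's implementation.

```python
from collections import OrderedDict
from typing import Any, Optional


class ContinuationStore:
    """Tiny in-memory LRU map from continuation ID to backend state."""

    def __init__(self, max_items: int = 50) -> None:
        self._items: "OrderedDict[str, Any]" = OrderedDict()
        self._max_items = max_items

    def put(self, continuation_id: str, state: Any) -> None:
        self._items[continuation_id] = state
        self._items.move_to_end(continuation_id)  # mark as most recently used
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)       # evict least recently used

    def get(self, continuation_id: str) -> Optional[Any]:
        if continuation_id not in self._items:
            return None                           # evicted or never stored
        self._items.move_to_end(continuation_id)  # reading also counts as use
        return self._items[continuation_id]


store = ContinuationStore(max_items=2)
store.put("a", "chat-a")
store.put("b", "chat-b")
store.get("a")            # touch "a" so "b" becomes least recently used
store.put("c", "chat-c")  # evicts "b"
print(store.get("b"), store.get("a"))  # None chat-a
```

A caller that receives no state for an ID would need to start a fresh generation, possibly passing the last saved image via input_images.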

Google: Gemini vs Imagen

Google supports two API paths. Gemini (Nano Banana 2) is the default and is recommended for most use cases. Imagen is needed only for advanced reference-image editing features.

Gemini models (gemini-*): generate_content() — text-to-image, image editing via input_images, multi-turn continuation
Imagen models (imagen-*): generate_images() / edit_image() — text-to-image with negative_prompt/seed/guidance_scale, plus style transfer, control editing, and subject consistency via reference images
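The two lines above imply that the model name prefix decides which API path is used. A minimal sketch of that routing follows. `google_api_path` is a hypothetical helper, the returned strings are just labels for the paths named above, and `imagen-example` is a placeholder rather than a real model name.

```python
def google_api_path(model: str) -> str:
    """Map a Google model name to the API path described above."""
    if model.startswith("gemini-"):
        return "generate_content"
    if model.startswith("imagen-"):
        return "generate_images/edit_image"
    raise ValueError(f"Unrecognized Google image model: {model!r}")


print(google_api_path("gemini-3-pro-image-preview"))  # generate_content
print(google_api_path("imagen-example"))              # generate_images/edit_image
```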

For studio-quality precision and text rendering, use: model="gemini-3-pro-image-preview" (Pro-tier).

Need More Control?

Per-backend sizes, quality options, and quirks: See references/backends.md
Complete extra_params reference: See references/extra_params.md
Advanced editing (inpainting, style transfer, control, subject): See references/editing.md

package.json

"author": "massgen"

"repository": "massgen/MassGen"
