generate-image

Stars: 560 · Forks: 46 · Updated: June 22, 2026 at 09:37

Generate or edit images from a text prompt using Google's Gemini 3 Pro Image model. Use this skill whenever the user wants to create, generate, draw, render, illustrate, or mock up an image, picture, illustration, concept art, storyboard panel, icon, logo, poster, or product shot — and also when they want to edit, restyle, retouch, combine, or extend an existing image. Triggers include "generate an image", "make a picture of", "draw me", "create an illustration", "生成图片", "画一张", "做一张图", "P 一下这张图", or any request that should produce a PNG/JPEG from a description. Prefer this skill over describing an image in text.

Source

FradSer

FradSer/dotclaude

name	generate-image
description	Generate or edit images from a text prompt using Google's Gemini 3 Pro Image model. Use this skill whenever the user wants to create, generate, draw, render, illustrate, or mock up an image, picture, illustration, concept art, storyboard panel, icon, logo, poster, or product shot — and also when they want to edit, restyle, retouch, combine, or extend an existing image. Triggers include "generate an image", "make a picture of", "draw me", "create an illustration", "生成图片", "画一张", "做一张图", "P 一下这张图", or any request that should produce a PNG/JPEG from a description. Prefer this skill over describing an image in text.
user-invocable	true
argument-hint	"PROMPT" [-o out.png] [-i input.png ...] [--aspect-ratio 16:9] [--size 2K] [--count N]
allowed-tools	["Read","Write","AskUserQuestion","Bash(uv run:*)","Bash(*/generate_image.py:*)"]

Generate Image (Gemini 3 Pro Image)

Turn a text prompt — optionally with reference images — into one or more images using the gemini-3-pro-image model. The script does the API call, file saving, and configuration; your job is to craft a strong prompt and wire up the flags.

Prerequisites

- uv available (the script is a self-contained uv run script; dependencies install on first run).
- A Google AI Studio key. The script resolves it progressively, so any one of these works:
  - export GEMINI_API_KEY=... in the shell, or
  - a .env file (checked in order: $PWD/.env, then ${CLAUDE_PLUGIN_ROOT}/.env), or
  - --api-key ... on the command line.

CRITICAL: Never paste the API key into chat or commit a .env. If the key is missing, the script prints exactly how to set it; relay that to the user rather than guessing.

Workflow

1. Clarify intent (only if genuinely ambiguous)

A one-line request like "draw a fox in a spacesuit" needs no questions — just generate. Ask (via AskUserQuestion) only when a choice would materially change the result and you cannot reasonably default it, e.g. aspect ratio for a "banner vs. avatar", or whether an attached image should be edited vs. used as style reference.

2. Craft the prompt

Read references/prompting.md before writing a non-trivial prompt. In short: describe subject, composition, lighting, style, and mood in concrete terms; put any literal text to render in quotes; and for edits, state what to change AND what to keep. A vivid one-paragraph prompt beats a terse phrase.
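The guidance above can be sketched as a small prompt builder. This is only an illustration: `PromptParts` and `build_prompt` are hypothetical names that do not come from the skill, and references/prompting.md remains the authoritative guide.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the elements the guidance lists."""
    subject: str
    composition: str = ""
    lighting: str = ""
    style: str = ""
    mood: str = ""
    literal_text: str = ""  # text that must appear in the image verbatim
    keep: str = ""          # for edits: what must stay unchanged


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts into one concrete paragraph."""
    sentences = [parts.subject.strip()]
    labelled = (
        ("Composition", parts.composition),
        ("Lighting", parts.lighting),
        ("Style", parts.style),
        ("Mood", parts.mood),
    )
    for label, value in labelled:
        if value.strip():
            sentences.append(f"{label}: {value.strip()}")
    if parts.literal_text:
        # Literal text goes in quotes so it is rendered as written.
        sentences.append(f'Render the text "{parts.literal_text}" exactly as written')
    if parts.keep:
        # For edits, state what to keep as well as what to change.
        sentences.append(f"Keep unchanged: {parts.keep}")
    return ". ".join(s.rstrip(".") for s in sentences) + "."
```

Filling in only the fields that matter for a request keeps the prompt concrete without padding it.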

3. Run the script

Invoke it directly (the shebang runs it through uv):

${CLAUDE_PLUGIN_ROOT}/skills/generate-image/scripts/generate_image.py "PROMPT" -o OUT.png [flags]

Flags:

| Flag | Purpose | Default |
| --- | --- | --- |
| `-o, --output` | Output path (`.png`/`.jpeg`) | `generated.png` |
| `-i, --input` | Reference/input image to edit or compose; repeatable | none |
| `--aspect-ratio` | `1:1 2:3 3:2 3:4 4:3 4:5 5:4 9:16 16:9 21:9` | model decides |
| `--size` | `1K` / `2K` / `4K` | model decides |
| `--count` | Number of candidates (each is a separate call; one image per call) | 1 |
| `--model` | `pro`, `flash`, or a raw id (else `GEMINI_IMAGE_MODEL`) | `pro` |

Models (pass the alias to --model or set GEMINI_IMAGE_MODEL):

| Alias | Model id | Use for |
| --- | --- | --- |
| `pro` (default) | `gemini-3-pro-image` | highest quality, up to 4K |
| `flash` | `gemini-3.1-flash-image` | faster / cheaper drafts |
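The alias table can be read as a simple lookup with a pass-through for raw ids. The sketch below assumes that behavior; `MODEL_ALIASES` and `resolve_model` are hypothetical names, and the actual script may resolve the model differently (for example, by also consulting the .env chain).

```python
import os

# Aliases and ids copied from the table above.
MODEL_ALIASES = {
    "pro": "gemini-3-pro-image",
    "flash": "gemini-3.1-flash-image",
}


def resolve_model(cli_value=None, env=None):
    """Return a model id from --model, else GEMINI_IMAGE_MODEL, else 'pro'."""
    env = os.environ if env is None else env
    chosen = cli_value or env.get("GEMINI_IMAGE_MODEL") or "pro"
    # Anything that is not a known alias is treated as a raw model id.
    return MODEL_ALIASES.get(chosen, chosen)
```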

When the user wants choices to pick from, request --count 2 (or more) and show all outputs. With -i, the prompt becomes an edit/compose instruction over the supplied image(s).
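To make the flag wiring concrete, here is a minimal sketch that assembles the argument list for the script and rejects values outside the documented sets. `build_command` is a hypothetical helper, not part of the skill; the script path mirrors the invocation shown above, and the validation is an assumption about what is worth checking before the call.

```python
import os

# Allowed values copied from the flags table.
ASPECT_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
SIZES = {"1K", "2K", "4K"}


def build_command(prompt, output="generated.png", inputs=(), aspect_ratio=None,
                  size=None, count=1, plugin_root=None):
    """Return the argv list for generate_image.py (hypothetical helper)."""
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if size is not None and size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if count < 1:
        raise ValueError("count must be at least 1")

    root = plugin_root if plugin_root is not None else os.environ.get("CLAUDE_PLUGIN_ROOT", "")
    script = f"{root}/skills/generate-image/scripts/generate_image.py"

    argv = [script, prompt, "-o", output]
    for image in inputs:
        argv += ["-i", image]  # -i is repeatable, one flag per image
    if aspect_ratio is not None:
        argv += ["--aspect-ratio", aspect_ratio]
    if size is not None:
        argv += ["--size", size]
    if count != 1:
        argv += ["--count", str(count)]
    return argv
```

Passing the result to `subprocess.run(argv, check=True)` avoids shell quoting problems with long prompts, since the prompt stays a single list element.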

4. Report

Tell the user the saved path(s). If nothing was returned, it is usually a safety-filtered prompt — say so and offer a reworded prompt. Output files are images: reference them by path; do not try to inline their bytes.
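The reporting step can be sketched as a check on the files you expect to exist. `summarize_outputs` is a hypothetical helper; the list of expected paths is supplied by the caller, because how the script names multiple candidates is not specified here.

```python
from pathlib import Path


def summarize_outputs(paths):
    """Build the message to relay: saved paths, or a safety-filter hint."""
    saved = [str(p) for p in paths if Path(p).is_file()]
    if not saved:
        return ("No image was returned; the prompt was probably safety-filtered. "
                "Want me to try a reworded prompt?")
    # Reference images by path only; never inline their bytes.
    return "Saved: " + ", ".join(saved)
```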

Configuration is progressive (the key best practice)

The API key and the model id are each resolved by lib/progressive_env.py in this order, stopping at the first hit: CLI flag → process env → .env chain → built-in default. This is why the same command works in a project with a local .env, in a shell with exports, or with everything overridden inline, and why a newer model can be selected with export GEMINI_IMAGE_MODEL=... without touching code. See references/prompting.md for the parameter reference.
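The resolution order described above can be sketched as follows. This is not the code in lib/progressive_env.py: `parse_env_file` and `resolve` are hypothetical names, and the .env parsing is deliberately simplified (KEY=VALUE lines, optional quotes, `#` comments).

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Read KEY=VALUE pairs from a .env file; a missing file yields {}."""
    values = {}
    try:
        text = Path(path).read_text(encoding="utf-8")
    except OSError:
        return values
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve(name, cli_value=None, env=None, env_files=(), default=None):
    """CLI flag -> process env -> .env chain -> built-in default."""
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    if env.get(name):
        return env[name]
    for path in env_files:  # checked in order; the first hit wins
        value = parse_env_file(path).get(name)
        if value:
            return value
    return default
```

Under these assumptions, a call such as `resolve("GEMINI_API_KEY", cli_value=args_key, env_files=[".env", plugin_env])` would return the inline key if one was passed and otherwise fall through the chain, where `args_key` and `plugin_env` stand in for values the caller already has.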

Files

scripts/generate_image.py — the generator (Gemini 3 Pro Image via google-genai).
references/prompting.md — prompt-writing guide and full parameter reference.
${CLAUDE_PLUGIN_ROOT}/lib/progressive_env.py — shared progressive config resolver.
