| name | gemini-image-generator |
| description | Generate images from text prompts via Google Gemini. |
| agent | python-general-engineer |
| user-invocable | false |
| allowed-tools | ["Read","Write","Bash","Grep","Glob","Edit"] |
| command | /generate-image |
| routing | {"triggers":["generate image","create image with AI","gemini image","text to image","python image generation","create sprite","generate character art"],"pairs_with":["python-general-engineer","workflow"],"complexity":"simple","category":"image-generation"} |
Generate images from text prompts via CLI using Google Gemini APIs. Supports model selection between fast (gemini-2.5-flash-image) and quality (gemini-3-pro-image-preview) models, batch generation, watermark removal, and background transparency.
| Signal | Load These Files | Why |
|---|---|---|
| tasks related to this reference | prompts.md | Loads detailed guidance from prompts.md. |
Verify the API key exists before any generation attempt -- a missing key produces confusing errors that waste time debugging.
echo "GEMINI_API_KEY is ${GEMINI_API_KEY:+set}"
Expect: GEMINI_API_KEY is set. If not set, instruct user to configure it and stop.
Verify Python dependencies are available:
python3 -c "from google import genai; from PIL import Image; print('OK')"
If missing, install:
pip install google-genai Pillow
Determine the output path. Always use absolute paths for output files -- relative paths break when scripts run in different working directories. Verify the parent directory exists or will be created.
Proceed only when: API key is set, dependencies installed, output path is valid.
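The three gate conditions above can also be checked in one pass from Python. This is a minimal sketch, not part of the skill: the `preflight` helper and its `env` parameter are hypothetical names introduced here for illustration.

```python
import importlib.util
import os
from pathlib import Path


def preflight(output_path, env=None):
    """Return a list of problems; an empty list means it is safe to proceed."""
    env = os.environ if env is None else env
    problems = []

    # An empty value counts as missing, matching the ${VAR:+set} shell check.
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")

    # google.genai ships in the google-genai package, PIL ships in Pillow.
    for module, package in (("google.genai", "google-genai"), ("PIL", "Pillow")):
        try:
            found = importlib.util.find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised when the parent package (here "google") is absent.
            found = False
        if not found:
            problems.append(f"missing dependency: pip install {package}")

    # Relative paths break when the script runs from another directory.
    if not Path(output_path).is_absolute():
        problems.append(f"output path is not absolute: {output_path}")
    return problems
```

If the returned list is non-empty, report the problems to the user and stop before generating anything.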
Choose the model based on the use case:
| Scenario | Model | Why |
|---|---|---|
| Iterating on prompt, drafts | gemini-2.5-flash-image | Fast feedback (2-5s) |
| Final quality asset | gemini-3-pro-image-preview | Best quality, 2K resolution |
| Game sprites, batch work | gemini-2.5-flash-image | Cost effective, consistent |
| Text in image, typography | gemini-3-pro-image-preview | Better text rendering |
| Product photography | gemini-3-pro-image-preview | Detail matters |
Use ONLY these exact model strings -- the API returns cryptic errors for anything else, and date suffixes (valid for text models) do not work for image models:
| Correct (use exactly) | WRONG (never use) |
|---|---|
| gemini-2.5-flash-image | gemini-2.5-flash-preview-05-20 (date suffix) |
| gemini-3-pro-image-preview | gemini-2.5-pro-image (doesn't exist) |
| | gemini-3-flash-image (doesn't exist) |
| | gemini-pro-vision (that's image input) |
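A guard like the following can catch a wrong model string before any request is sent. It is a sketch only: `validate_model` and `pick_model` are hypothetical helpers, and the provided script performs its own model validation, which may differ.

```python
FAST_MODEL = "gemini-2.5-flash-image"
QUALITY_MODEL = "gemini-3-pro-image-preview"
ALLOWED_MODELS = {FAST_MODEL, QUALITY_MODEL}


def validate_model(name):
    """Return the model name unchanged, or raise if it is not an exact match."""
    if name not in ALLOWED_MODELS:
        allowed = ", ".join(sorted(ALLOWED_MODELS))
        raise ValueError(f"unsupported image model {name!r}; use one of: {allowed}")
    return name


def pick_model(final_asset=False, needs_text=False):
    """Hypothetical selector: quality model for finals and typography, fast otherwise."""
    return QUALITY_MODEL if (final_asset or needs_text) else FAST_MODEL
```

Exact matching is deliberate here: near-misses such as a date-suffixed name are rejected, not corrected.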
Compose the prompt using this structure: [Subject] [Style] [Background] [Constraints]
For transparent background post-processing, include:
Always include negative constraints: "no text", "no labels", "character only"
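The prompt structure above can be assembled mechanically. In this sketch, `compose_prompt` is a hypothetical helper, and joining the parts with commas is an assumption, since the skill does not prescribe a separator.

```python
DEFAULT_CONSTRAINTS = ("no text", "no labels", "character only")


def compose_prompt(subject, style, background, constraints=()):
    """Join the parts in [Subject] [Style] [Background] [Constraints] order."""
    # Start from the default negative constraints, then append extras once each.
    merged = list(DEFAULT_CONSTRAINTS)
    for item in constraints:
        if item not in merged:
            merged.append(item)
    parts = [subject.strip(), style.strip(), background.strip(), ", ".join(merged)]
    # Skip any part left empty so the result has no stray separators.
    return ", ".join(part for part in parts if part)


# Illustrative values only.
prompt = compose_prompt(
    "a knight with a sword", "pixel art", "solid gray background", ["no watermarks"]
)
# prompt == "a knight with a sword, pixel art, solid gray background, "
#           "no text, no labels, character only, no watermarks"
```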
Determine post-processing flags:
- --remove-watermark
- --transparent-bg
- --bg-color "#FFFFFF" --bg-tolerance 20

Proceed only when: Model selected, prompt composed, flags determined.
Always use the provided generate_image.py script -- it contains retry logic, rate limiting, post-processing, model validation, and error handling that inline Python would miss.
python3 $HOME/vexjoy-agent/skills/content/gemini-image-generator/scripts/generate_image.py \
--prompt "YOUR_PROMPT_HERE" \
--output /absolute/path/to/output.png \
--model gemini-3-pro-image-preview
For batch mode:
python3 $HOME/vexjoy-agent/skills/content/gemini-image-generator/scripts/generate_image.py \
--batch /path/to/prompts.txt \
--output-dir /absolute/path/to/output/ \
--model gemini-2.5-flash-image
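When the script is launched from Python instead of a shell, building the argument list explicitly avoids quoting problems in prompts. This sketch uses only flags documented in this skill; `single_command` and `batch_command` are hypothetical helper names.

```python
import os

# Script location as given in this document; $HOME is expanded at import time.
SCRIPT = os.path.expandvars(
    "$HOME/vexjoy-agent/skills/content/gemini-image-generator/scripts/generate_image.py"
)


def single_command(prompt, output, model, remove_watermark=False, transparent_bg=False):
    """Build the argument list for generating one image."""
    cmd = ["python3", SCRIPT, "--prompt", prompt, "--output", output, "--model", model]
    if remove_watermark:
        cmd.append("--remove-watermark")
    if transparent_bg:
        cmd.append("--transparent-bg")
    return cmd


def batch_command(prompts_file, output_dir, model, delay=None):
    """Build the argument list for batch mode (one prompt per line in the file)."""
    cmd = ["python3", SCRIPT, "--batch", prompts_file,
           "--output-dir", output_dir, "--model", model]
    if delay is not None:
        cmd += ["--delay", str(delay)]
    return cmd
```

Pass the resulting list to `subprocess.run` without `shell=True`, so the prompt text is never interpreted by a shell.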
Display the full script output -- never summarize it, since the user needs to see status, warnings, and any partial failures.
Check for SUCCESS or ERROR in output. If rate limited (429), the script handles retry automatically with exponential backoff (up to 3 attempts).
Proceed only when: Script exited with code 0 and printed SUCCESS.
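For readers unfamiliar with the retry behavior mentioned above, exponential backoff can be pictured as follows. This is an illustrative sketch and not the script's actual code: `retry_with_backoff`, the `base_delay` of 2 seconds, and the doubling factor are assumptions.

```python
import time


def retry_with_backoff(call, is_rate_limited, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Run call(); on a rate-limit error wait base_delay * 2**n, then try again."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            # Re-raise errors that are not rate limits, and the final failure.
            if last_attempt or not is_rate_limited(exc):
                raise
            sleep(base_delay * (2 ** attempt))
```

With three attempts and these assumed values, the waits are 2 seconds and then 4 seconds before the last try.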
Confirm the output file exists and has non-zero size -- a zero-byte file means the write succeeded but no image data was returned:
ls -la /absolute/path/to/output.png
Optionally check dimensions:
python3 -c "from PIL import Image; img = Image.open('/absolute/path/to/output.png'); print(f'Size: {img.size}, Mode: {img.mode}')"
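The existence, size, and dimension checks can also be combined using only the standard library, which helps if Pillow is unavailable in the checking environment. This is a sketch: `check_png` is a hypothetical helper and does not replace visual inspection.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_png(path):
    """Return (width, height) for a non-empty PNG file, or raise ValueError."""
    file = Path(path)
    if not file.is_file():
        raise ValueError(f"{path} does not exist")
    if file.stat().st_size == 0:
        raise ValueError(f"{path} is empty: no image data was written")
    with file.open("rb") as handle:
        header = handle.read(24)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk,
    # whose first two fields are big-endian 32-bit width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG")
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```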
Visual inspection is mandatory. Read the generated image file using the Read tool to visually inspect it. A file can pass all size and dimension checks but still contain watermarks, wrong composition, excessive padding, or content that doesn't match the prompt.
Check for:
If the image fails visual inspection, regenerate with an adjusted prompt before reporting to the user. Do not commit or deliver images without visual verification.
Provide the user with:
Only report what was directly requested -- do not suggest additional generations, style variations, or enhancements the user did not ask for.
Cause: Environment variable missing or empty
Solution:
export GEMINI_API_KEY="your-key"

Cause: Too many requests to Gemini API in a short period
Solution:
- Increase --delay to 5-10 seconds
- Use gemini-2.5-flash-image for higher throughput

Cause: API returned text-only response or generation was blocked
Solution:
- Check that response_modalities=["IMAGE", "TEXT"] is set

Cause: Prompt contains restricted content or triggers safety filters
Solution:
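To tell a text-only or blocked response apart from a successful one, the response parts can be inspected. The sketch below is duck-typed: the attribute names (`candidates`, `content.parts`, `inline_data.data`, `text`) are assumptions about the shape of a google-genai response, and `extract_image` is a hypothetical helper, not a function of the provided script.

```python
from types import SimpleNamespace


def extract_image(response):
    """Return (image_bytes, text); image_bytes is None for a text-only reply."""
    image_bytes, texts = None, []
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and getattr(inline, "data", None):
                # Keep the first image found.
                image_bytes = image_bytes or inline.data
            elif getattr(part, "text", None):
                texts.append(part.text)
    return image_bytes, " ".join(texts)


# Stand-in for a text-only reply; a real response object comes from the SDK.
text_only = SimpleNamespace(candidates=[SimpleNamespace(content=SimpleNamespace(
    parts=[SimpleNamespace(text="I cannot draw that", inline_data=None)]))])
```

A result of `(None, "")` means the response had no candidates at all, which is consistent with a blocked generation. A result with text but no image bytes points to a text-only reply.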
Location: $HOME/vexjoy-agent/skills/content/gemini-image-generator/scripts/generate_image.py
| Argument | Required | Description |
|---|---|---|
| --prompt | Yes* | Text prompt for image generation |
| --output | Yes* | Output file path (.png) |
| --model | No | Model name (default: gemini-3-pro-image-preview) |
| --remove-watermark | No | Remove watermarks from corners |
| --transparent-bg | No | Make background transparent |
| --bg-color | No | Background color hex (default: #3a3a3a) |
| --bg-tolerance | No | Color matching tolerance (default: 30) |
| --batch | No | File with prompts (one per line) |
| --output-dir | No | Directory for batch output |
| --retries | No | Max retry attempts (default: 3) |
| --delay | No | Delay between batch requests in seconds (default: 3) |
*Required unless using --batch + --output-dir
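The --bg-color and --bg-tolerance arguments suggest color matching against a reference background. One plausible reading is a per-channel tolerance test, sketched below on plain RGBA tuples. This is an assumption about the approach, not the script's actual algorithm, and `hex_to_rgb` and `make_transparent` are hypothetical names.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple."""
    color = color.lstrip("#")
    if len(color) != 6:
        raise ValueError("expected a #RRGGBB color")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def make_transparent(pixels, bg_color="#3a3a3a", tolerance=30):
    """Set alpha to 0 for RGBA pixels within tolerance of bg_color on every channel."""
    bg = hex_to_rgb(bg_color)
    result = []
    for r, g, b, a in pixels:
        near_bg = all(abs(c - t) <= tolerance for c, t in zip((r, g, b), bg))
        result.append((r, g, b, 0 if near_bg else a))
    return result
```

Under this reading, a larger tolerance removes more of a noisy background but risks eating into subject pixels of similar color. This is one reason to prompt for a background color that contrasts with the subject.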
Exit Codes: 0 = success, 1 = missing API key, 2 = generation failed, 3 = invalid arguments
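The exit codes can be turned into readable messages when the script is run from Python. This is a minimal sketch: `interpret_result` is a hypothetical helper, and it also applies the rule from this skill that success requires both exit code 0 and SUCCESS in the output.

```python
EXIT_CODES = {
    0: "success",
    1: "missing API key",
    2: "generation failed",
    3: "invalid arguments",
}


def interpret_result(returncode, output):
    """Return (ok, message); ok requires exit code 0 and SUCCESS in the output."""
    meaning = EXIT_CODES.get(returncode, f"unknown exit code {returncode}")
    ok = returncode == 0 and "SUCCESS" in output
    if returncode == 0 and not ok:
        meaning = "exit code 0 but no SUCCESS line in output"
    return ok, meaning
```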
Effective prompt structure: [Subject] [Style] [Background] [Constraints]
For transparent background post-processing:
- --transparent-bg flag

For clean edges: "clean edges", "sharp outlines", "heavy ink outlines"
Negative constraints: Always include "no text", "no labels", "no watermarks", "character only"
${CLAUDE_SKILL_DIR}/references/prompts.md: Categorized example prompts by use case (game art, characters, product photography, pixel art, icons)