| name | genimg-gemini-web |
| description | Image generation skill using Gemini Web. Generates images from text prompts via Google Gemini. Also supports text generation. Use as the image generation backend for other skills like cover-image, xhs-images, article-illustrator. |
Gemini Web Client
Supports:
- Text generation
- Image generation (download + save)
- Reference image upload (attach images for vision tasks)
- Multi-turn conversations within the same executor instance (keepSession)
- Experimental video generation (generateVideo): Gemini may return an async placeholder, in which case the download might require the Gemini web UI
Quick start
npx -y bun scripts/main.ts "Hello, Gemini"
npx -y bun scripts/main.ts --prompt "Explain quantum computing"
npx -y bun scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png
npx -y bun scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun scripts/main.ts "What number?" --sessionId my-unique-id-123
Executor options (programmatic)
This skill is typically consumed via createGeminiWebExecutor(geminiOptions) (see scripts/executor.ts).
Key options in GeminiWebOptions:
- referenceImages?: string | string[] - Upload local images as references (vision input).
- keepSession?: boolean - Reuse Gemini chatMetadata to continue the same conversation across calls (required if you want reference images to persist across multiple messages).
- generateVideo?: string - Generate a video and (best-effort) download it to the given path. Gemini may return video_gen_chip (async); in that case you must open the Gemini web UI to download the result.
Notes:
- generateVideo cannot be combined with generateImage / editImage.
- When keepSession=true and referenceImages is set, reference images are uploaded once per executor instance.
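The first note above can be checked before an executor is created. The following is a minimal sketch, not the skill's own validation: `OptionsSketch`, `toImageList`, and `findConflicts` are hypothetical names, and the value types of generateImage and editImage are not documented here, so they are typed as `unknown`.

```typescript
// Hypothetical local mirror of the GeminiWebOptions fields named above.
// The real type is defined by the skill and may differ.
interface OptionsSketch {
  referenceImages?: string | string[];
  keepSession?: boolean;
  generateVideo?: string;
  generateImage?: unknown; // value type not documented here
  editImage?: unknown; // value type not documented here
}

// Accept either a single path or a list of paths, and always return a list.
function toImageList(value?: string | string[]): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Return human-readable problems instead of throwing,
// so a caller can report all of them at once.
function findConflicts(opts: OptionsSketch): string[] {
  const problems: string[] = [];
  const wantsImage =
    opts.generateImage !== undefined || opts.editImage !== undefined;
  if (opts.generateVideo !== undefined && wantsImage) {
    problems.push(
      "generateVideo cannot be combined with generateImage / editImage",
    );
  }
  return problems;
}
```

A caller could run `findConflicts(options)` and refuse to proceed when the returned list is non-empty.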
Commands
Text generation
npx -y bun scripts/main.ts "Your prompt here"
npx -y bun scripts/main.ts --prompt "Your prompt here"
npx -y bun scripts/main.ts -p "Your prompt here"
npx -y bun scripts/main.ts -p "Hello" -m gemini-2.5-pro
echo "Summarize this" | npx -y bun scripts/main.ts
Image generation
npx -y bun scripts/main.ts --prompt "A sunset over mountains" --image
npx -y bun scripts/main.ts --prompt "A cute robot" --image robot.png
npx -y bun scripts/main.ts "A dragon" --image=dragon.png
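The three commands above use three forms of the same flag: bare `--image`, `--image <path>`, and `--image=<path>`. The sketch below illustrates how such an optional-value flag could resolve to an output path. It is not taken from scripts/main.ts: `resolveImagePath` is a hypothetical helper, and treating a following token as the path unless it starts with `-` is an assumption.

```typescript
// Default taken from the options table: --image without a path saves to generated.png.
const DEFAULT_IMAGE_PATH = "generated.png";

// Returns the output path if image generation was requested, otherwise undefined.
function resolveImagePath(argv: string[]): string | undefined {
  // Form 1: --image=dragon.png
  const withEquals = argv.find((arg) => arg.startsWith("--image="));
  if (withEquals !== undefined) {
    return withEquals.slice("--image=".length) || DEFAULT_IMAGE_PATH;
  }

  // Forms 2 and 3: --image robot.png, or a bare --image
  const index = argv.indexOf("--image");
  if (index === -1) return undefined;

  const next: string | undefined = argv[index + 1];
  // Assumption: the next token is the path unless it looks like another flag.
  return next !== undefined && !next.startsWith("-") ? next : DEFAULT_IMAGE_PATH;
}
```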
Output formats
npx -y bun scripts/main.ts "Hello"
npx -y bun scripts/main.ts "Hello" --json
Options
| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated in order) |
| --model <id>, -m | Model: gemini-3-pro (default), gemini-2.5-pro, gemini-2.5-flash |
| --image [path] | Generate image, save to path (default: generated.png) |
| --sessionId <id> | Session ID for multi-turn conversation (agent generates unique ID) |
| --list-sessions | List saved sessions (max 100, sorted by update time) |
| --json | Output as JSON |
| --login | Refresh cookies only, then exit |
| --cookie-path <path> | Custom cookie file path |
| --profile-dir <path> | Chrome profile directory |
| --help, -h | Show help |
CLI note: scripts/main.ts supports text generation, image generation, and multi-turn conversations via --sessionId. Reference images and video generation are exposed via the executor API.
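Because the caller supplies the value for --sessionId, an agent needs a way to create an ID that will not collide with an earlier conversation. A minimal sketch, assuming a random UUID is unique enough: `newSessionId` and `turnArgs` are hypothetical helpers, and the `task` prefix is only a convention borrowed from the examples in this document.

```typescript
import { randomUUID } from "node:crypto";

// Build a session ID of the form "<prefix>-<uuid>".
function newSessionId(prefix = "task"): string {
  return `${prefix}-${randomUUID()}`;
}

// Build the npx argument list for one turn of a conversation.
// Every turn of the same conversation must reuse the same session ID.
function turnArgs(prompt: string, sessionId: string): string[] {
  return ["-y", "bun", "scripts/main.ts", prompt, "--sessionId", sessionId];
}
```

The resulting array can be handed to a process runner with `npx` as the command; generating the ID once and reusing it for each turn keeps the turns in one conversation.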
Models
- gemini-3-pro - Default, latest model
- gemini-2.5-pro - Previous generation pro
- gemini-2.5-flash - Fast, lightweight
Authentication
First run opens Chrome to authenticate with Google. Cookies are cached for subsequent runs.
npx -y bun scripts/main.ts --login
Environment variables
| Variable | Description |
|---|---|
| GEMINI_WEB_DATA_DIR | Data directory |
| GEMINI_WEB_COOKIE_PATH | Cookie file path |
| GEMINI_WEB_CHROME_PROFILE_DIR | Chrome profile directory |
| GEMINI_WEB_CHROME_PATH | Chrome executable path |
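Two of these variables have CLI counterparts (--cookie-path and --profile-dir). The sketch below shows one way a wrapper could collect the settings. This document does not state how the skill orders flags against environment variables, so the "flag wins" rule here is an assumption, and `PathSettings`, `fromEnv`, and `resolvePaths` are hypothetical names.

```typescript
interface PathSettings {
  dataDir: string | undefined;
  cookiePath: string | undefined;
  chromeProfileDir: string | undefined;
  chromePath: string | undefined;
}

type Env = Record<string, string | undefined>;

// Treat an empty string the same as an unset variable.
function fromEnv(env: Env, name: string): string | undefined {
  const value = env[name];
  return value !== undefined && value !== "" ? value : undefined;
}

function resolvePaths(
  env: Env,
  flags: { cookiePath?: string; profileDir?: string } = {},
): PathSettings {
  return {
    dataDir: fromEnv(env, "GEMINI_WEB_DATA_DIR"),
    // Assumption: an explicit flag overrides the environment variable.
    cookiePath: flags.cookiePath ?? fromEnv(env, "GEMINI_WEB_COOKIE_PATH"),
    chromeProfileDir:
      flags.profileDir ?? fromEnv(env, "GEMINI_WEB_CHROME_PROFILE_DIR"),
    chromePath: fromEnv(env, "GEMINI_WEB_CHROME_PATH"),
  };
}
```

In a Bun or Node script this would typically be called as `resolvePaths(process.env)`. Fields that come back `undefined` are left for the skill's own defaults, which are not listed here.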
Examples
Generate text response
npx -y bun scripts/main.ts "What is the capital of France?"
Generate image
npx -y bun scripts/main.ts "A photorealistic image of a golden retriever puppy" --image puppy.png
Get JSON output for parsing
npx -y bun scripts/main.ts "Hello" --json | jq '.text'
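The same extraction can be done in TypeScript when the CLI is driven from a script rather than a shell pipeline. A minimal sketch, assuming only what the jq filter above implies: the JSON output is an object with a string `text` field. Other fields may exist but are not described in this document, so the hypothetical `extractText` helper ignores them.

```typescript
// Extract the `text` field from the output of a --json run.
function extractText(stdout: string): string {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const text = (parsed as Record<string, unknown>)["text"];
  if (typeof text !== "string") {
    throw new Error("JSON output has no string `text` field");
  }
  return text;
}
```

Failing loudly on a missing field makes an expired login or an unexpected output shape visible at the call site instead of passing `undefined` downstream.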
Generate image from prompt files
npx -y bun scripts/main.ts --promptfiles system.md content.md --image output.png
Multi-turn conversation
npx -y bun scripts/main.ts "You are a helpful math tutor." --sessionId task-abc123
npx -y bun scripts/main.ts "What is 2+2?" --sessionId task-abc123
npx -y bun scripts/main.ts "Now multiply that by 10" --sessionId task-abc123
npx -y bun scripts/main.ts --list-sessions
Session files are stored in ~/Library/Application Support/genimg-skills/gemini-web/sessions/<id>.json and contain:
- id: Session ID
- metadata: Gemini chat metadata for continuation
- messages: Array of {role, content, timestamp, error?}
- createdAt, updatedAt: Timestamps
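For tooling that inspects saved sessions, the field list above can be written down as types. This is a sketch under stated assumptions: the field names come from the list, but the value types (string or numeric timestamps, a free-form `role`, the shape of `error`) are guesses, and `SessionFile`, `SessionMessage`, and `summarizeSession` are hypothetical names.

```typescript
// Assumed shape of one entry in `messages`; only the field names are documented.
interface SessionMessage {
  role: string;
  content: string;
  timestamp: string | number;
  error?: unknown;
}

// Assumed shape of a sessions/<id>.json file.
interface SessionFile {
  id: string;
  metadata: unknown; // opaque Gemini chat metadata
  messages: SessionMessage[];
  createdAt: string | number;
  updatedAt: string | number;
}

// Summarize a session from the raw JSON text of its file.
function summarizeSession(json: string): {
  id: string;
  turns: number;
  failed: number;
} {
  const session = JSON.parse(json) as SessionFile;
  if (typeof session.id !== "string" || !Array.isArray(session.messages)) {
    throw new Error("not a session file");
  }
  // Count messages that carry an error value.
  const failed = session.messages.filter((m) => m.error != null).length;
  return { id: session.id, turns: session.messages.length, failed };
}
```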