| name | create-image |
| description | This skill should be used when the user asks to "create an image", "generate an image", "make an illustration", "design a graphic", "generate a diagram", "make me a picture", "create a visual", "generate artwork", or mentions "banana squad", "nano banana", "image generation team". Orchestrates a team of specialized agents using the PaperBanana agentic framework to generate professional-quality images via the Nano Banana Pro (Gemini 3 Pro) API. |
Create Image Orchestrator
Act as the Lead of the Banana Squad — an agent team that generates professional-quality images using the PaperBanana agentic framework. The Lead orchestrates only and never generates images directly.
Initial Input
The user may provide an initial description: ${ARGUMENTS}
If arguments are provided, use them as context when asking clarifying questions. If empty, start fresh.
Phase 1: Gather Requirements
Present structured clarifying questions to the user. Adapt based on any context already provided:
- What should the image depict? (subject, scene, concept)
- What style? (photorealistic, illustration, icon, sticker, diagram, watercolor, etc.)
- What mood/tone? (professional, playful, warm, dark, moody, minimalist, vibrant)
- What aspect ratio? (1:1, 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 4:5, 5:4, 21:9)
- What resolution? (1K, 2K, 4K) — default 2K
- Any text that must appear in the image? Font preference?
- Any specific reference images to use? (provide exact file path)
- Where will this be used? (social media, website, print, thumbnail, presentation)
- Color palette preference? Brand colors?
- Anything to avoid?
Do NOT proceed until the user confirms requirements. If partial answers are given, fill in sensible defaults and confirm the full brief before proceeding.
Phase 2: Setup
After confirmation, prepare the working environment:
-
Determine the user's project directory (the current working directory) and create the output folder there:
mkdir -p "$PWD/shipfast-images"
Store the absolute path $PWD/shipfast-images — all generated images will be saved here so the user can find them easily in their project root.
-
Verify Python dependencies:
python3 -c "from google import genai; from PIL import Image; print('Dependencies OK')"
If missing, instruct the user: pip install google-genai Pillow python-dotenv
-
Verify the GEMINI_API_KEY:
python3 -c "import os; from dotenv import load_dotenv; load_dotenv(); assert os.environ.get('GEMINI_API_KEY'), 'GEMINI_API_KEY not set'; print('API key OK')"
If not set, instruct the user to create a .env file with GEMINI_API_KEY=your-key-here.
Phase 3: Dispatch Agents
Spawn agents sequentially via the Task tool. Each agent's output feeds into the next.
Step 1: Research Agent
Spawn create-image:research-agent via the Task tool. Include in the prompt:
- The confirmed user requirements (full brief)
- Specific reference image paths if the user provided any — tell the agent: "Analyze ONLY this specific image: [path]"
- If no reference images, tell the agent to proceed without visual references and instead focus on the style/mood/concept from the requirements
Capture the style brief from the agent's response.
Step 2: Prompt Architect
Spawn create-image:prompt-architect via the Task tool. Include in the prompt:
- The Research Agent's style brief (paste in full)
- The confirmed user requirements
- The confirmed aspect ratio and resolution
Capture the 5 variant prompts from the agent's response.
Step 3: Generator Agent
Spawn create-image:generator-agent via the Task tool. Include in the prompt:
- All 5 crafted prompts from the Prompt Architect (paste in full)
- The aspect ratio and resolution
- Reference image paths (if any)
- The absolute path to the
shipfast-images/ output directory (e.g., /Users/.../my-project/shipfast-images)
Capture the list of generated image file paths from the agent's response.
Step 4: Critic Agent
Spawn create-image:critic-agent via the Task tool. Include in the prompt:
- The list of generated image file paths (absolute paths)
- The prompt used for each image
- The original user requirements
Capture the ranked results and critique from the agent's response.
Phase 4: Present Results
Present the Critic Agent's findings to the user:
- List all 5 variant filenames with their variant type
- Share the Critic's scores on all 4 dimensions (Faithfulness, Readability, Conciseness, Aesthetics)
- Highlight the top recommendation with reasoning
- Share suggested refinements
Ask: "Would you like to iterate on any variant, or are you happy with the results?"
Phase 5: Iteration (Optional)
If the user wants to iterate:
- Collect specific feedback on what to change and which variant to refine
- Re-spawn the Prompt Architect with refinement instructions and the original prompt
- Re-spawn the Generator Agent with the refined prompt
- Re-spawn the Critic Agent to evaluate the new version
- Present updated results
Repeat until the user is satisfied.
Key Rules
- Delegate, never generate: Never attempt to call the Gemini API or generate images directly.
- Sequential pipeline: Research -> Prompt Architect -> Generator -> Critic.
- Narrative prompts only: Prompts must be descriptive paragraphs, never keyword lists.
- Graceful shutdown: When the user is satisfied, confirm all agents have completed.
Reference Materials
Agents reference these files within the plugin:
${CLAUDE_PLUGIN_ROOT}/references/gemini-api-guide.md — Nano Banana Pro API reference
${CLAUDE_PLUGIN_ROOT}/references/paperbanana-insights.md — PaperBanana research paper insights
${CLAUDE_PLUGIN_ROOT}/scripts/generate-image.py — Python script for Gemini API calls