| name | gpt-image-cookbook |
| description | Use this skill whenever a user asks to generate, create, draw, render, or edit images with AI image models — gpt-image-2, DALL-E, Google Imagen, Flux, or others. Covers text-to-image, reference-image editing, inpainting, posters, typography, UI mockups, diagrams, and curated gallery prompts. Search the bundled cookbook references for matching patterns, confer on direction when ambiguous, then call the packaged `gic` CLI. Do not write new image-generation code unless the user explicitly asks to modify this repo. |
| compatibility | Requires Python 3.11+ and one of `gic`, `uv`, or `uvx`. CLI/API calls read provider API keys from env (`OPENAI_API_KEY`, `GOOGLE_API_KEY`, `FAL_KEY`, etc.) and may incur charges on the user's account. |
| metadata | {"openclaw":{"requires":{"anyBins":["gic","uv","uvx"]},"primaryEnv":"OPENAI_API_KEY","homepage":"https://github.com/eugeniughelbur/gpt-image-cookbook"}} |
gpt-image-cookbook
Agent runbook for multi-provider AI image generation and editing. Use the prompt cookbook together with the packaged `gic` CLI. Do not reimplement image API code.
Operating loop
- Classify the request: `generate`, `edit`, `inpaint`, or `multi-reference`. Identify the asset type, exact text to render, aspect ratio, references, safety constraints, and budget/quality tier.
- Pick a provider: default to `openai` (`gpt-image-2`). Switch to `imagen` for Google-native quality on photoreal scenes, or `flux` for fast/cheap drafts and stylized art. The user's explicit request always wins.
- Search references first: open `references/gallery.md` (the routing index). Load the closest `references/gallery-<category>.md` file(s). Read the actual **Prompt** text before choosing a pattern; never guess from the category name alone.
- Refine with craft: load `references/craft.md` for dense text, diagrams, UI mockups, data visualization, multi-panel layouts, or when the gallery has no close match.
- Confer when useful: before costly, ambiguous, or high-polish calls, present 1–3 matched directions plus planned size/quality/provider; ask at most one concise question. Skip the discussion for precise "generate now" requests.
- Preflight, no side effects: check `command -v gic` first. Do not reinstall, overwrite skill folders, create or modify `.env`, or write API keys. Global/shared installs are opt-in only.
- Execute via CLI only: call `gic`. Do not create a new `generate.py`, SDK wrapper, or ad-hoc script for normal image requests.
- Report: output file path(s), the provider/model used, key flags, and one concise refinement suggestion if useful.
Fast path: precise prompt + explicit "generate now" → quick reference/craft check, then CLI.
CLI resolution
Preferred call order:
gic -p "PROMPT" [-f OUT] [-i REF...] [-m MASK] [--provider openai|imagen|flux] [options]
uv run "$SKILL_DIR/scripts/generate.py" -p "PROMPT" [options]
uvx --from git+https://github.com/eugeniughelbur/gpt-image-cookbook gic -p "PROMPT" [options]
`scripts/generate.py` is a launcher that tries, in order: the repo-local `src/gic`, an installed `gic` on PATH, then a transient `uvx` fallback.
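The preferred call order above can be sketched as a read-only lookup. This is only an illustration of the decision, not the launcher's actual code: `resolve_gic_command` and its `which` parameter are hypothetical names, and the sketch only inspects PATH, so it installs and writes nothing.

```python
import shutil
from typing import Callable, List, Optional

REPO_URL = "git+https://github.com/eugeniughelbur/gpt-image-cookbook"


def resolve_gic_command(
    skill_dir: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the argv prefix for the first available launcher.

    Hypothetical helper: mirrors the documented order gic -> uv run -> uvx.
    """
    if which("gic"):
        return ["gic"]
    if which("uv"):
        return ["uv", "run", f"{skill_dir}/scripts/generate.py"]
    if which("uvx"):
        return ["uvx", "--from", REPO_URL, "gic"]
    # Nothing usable: report it instead of installing anything.
    raise FileNotFoundError("none of gic, uv, or uvx is on PATH")
```

Appending `["-p", "PROMPT"]` and any other flags to the returned prefix gives the full command line for whichever launcher was found.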
Provider selection
| Provider | Model default | When to use |
|---|---|---|
| `openai` | `gpt-image-2` | Default. Strong on text rendering, posters, UI mockups, Chinese typography, research figures. |
| `imagen` | `imagen-4` | Photoreal scenes, product shots, faces, lighting realism. Google-account billing. |
| `flux` | `flux-pro-1.1` | Fast/cheap drafts, stylized art, broad style exploration. fal.ai or Replicate billing. |
The CLI resolves the provider from the `--provider` flag first, then from the `GIC_DEFAULT_PROVIDER` env var, and otherwise falls back to `openai`.
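As a rough sketch of that precedence, assuming nothing about how the CLI implements it internally: `resolve_provider` is a hypothetical function, and rejecting unknown names is an assumption added here for illustration.

```python
from typing import Mapping, Optional

KNOWN_PROVIDERS = ("openai", "imagen", "flux")


def resolve_provider(flag: Optional[str], env: Mapping[str, str]) -> str:
    """Pick a provider: explicit flag, then GIC_DEFAULT_PROVIDER, then openai."""
    candidate = flag or env.get("GIC_DEFAULT_PROVIDER") or "openai"
    if candidate not in KNOWN_PROVIDERS:
        # Assumption: unknown names are rejected rather than silently replaced.
        raise ValueError(f"unknown provider: {candidate!r}")
    return candidate
```

For example, `resolve_provider(None, {"GIC_DEFAULT_PROVIDER": "flux"})` selects `flux`, while passing `"imagen"` as the flag overrides the env var.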
Key and cost rules
- The CLI reads provider keys from process env, then `.env`, then `~/.env`, without overriding existing env. Successful API calls bill the user's provider account.
- If a host runtime has native platform-managed image generation and the user wants that path, use the host tool instead of this CLI.
- If the required key is unset, report the missing key and the env var name; do not write secrets.
- If the user wants to avoid local-key use, respect `unset OPENAI_API_KEY` (and the equivalent for other providers); if a key exists in `.env` or `~/.env`, tell them to remove or rename it for the session rather than working around it.
- Never print secret values.
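The lookup order and the no-secrets rule can be illustrated with a small sketch. It is not the CLI's parser: `parse_dotenv`, `find_key_source`, and `describe_key` are hypothetical helpers, the `.env` handling covers only plain `KEY=VALUE` lines, and a variable present in process env is assumed to count as set even if empty. The helpers report where a key was found and never return or print its value.

```python
from typing import Dict, Mapping, Optional, Sequence, Tuple


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def find_key_source(
    name: str,
    process_env: Mapping[str, str],
    dotenv_layers: Sequence[Tuple[str, str]],
) -> Optional[str]:
    """Return the label of the first source defining `name`, or None."""
    if name in process_env:  # existing env is never overridden
        return "process env"
    for label, text in dotenv_layers:  # e.g. (".env", ...), ("~/.env", ...)
        if name in parse_dotenv(text):
            return label
    return None


def describe_key(name: str, source: Optional[str]) -> str:
    """Build a status message that names the variable but never its value."""
    if source is None:
        return f"{name} is not set; export it before running gic"
    return f"{name} found in {source}"
```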
Flags
| Flag | Values | Use |
|---|---|---|
| `-p`, `--prompt` | string | Required prompt or edit instruction |
| `-f`, `--file` | path | Output path; auto-named if omitted |
| `-i`, `--image` | repeatable path | Use the edits endpoint; supports multiple references |
| `-m`, `--mask` | PNG path | Inpaint with alpha mask; requires `-i` |
| `--provider` | `openai`, `imagen`, `flux` | Provider router |
| `--model` | string | Override the provider's default model |
| `--size` | `1k`, `2k`, `4k`, `portrait`, `landscape`, `square`, `wide`, `tall`, or literal `WxH` | Canvas size |
| `--quality` | `low`, `medium`, `high`, `auto` | Cost/quality dial (provider-mapped) |
| `-n`, `--n` | integer | Number of images |
| `--background` | `auto`, `opaque`, `transparent` | Background mode (provider-dependent) |
| `--format` | `png`, `jpeg`, `webp` | Output encoding |
| `--user` | string | Optional end-user identifier passed to provider |
Quality policy:
- `low`: cheap drafts, broad exploration, many variants.
- `medium`: normal exploration, style probing, balanced cost.
- `high`: final assets, posters, diagrams, UI mockups, paper figures, and anything with dense text or labels.
Size policy:
- default/social square: `1k` / `1024x1024`
- poster/mobile/beauty: `portrait`
- landscape/gameplay/photo: `landscape`
- print/paper figure: `2k`
- widescreen hero: `4k`
- vertical story/banner: `tall`
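The quality and size policies above amount to a lookup from asset type to flags. The sketch below is one way to write that down; the asset labels (`"poster"`, `"hero"`, and so on), the `stage` parameter, and `suggest_flags` itself are hypothetical and not part of `gic`. Letting dense-text assets always use `high` is an assumption.

```python
from typing import List

# Hypothetical asset labels mapped to the documented --size values.
SIZE_BY_ASSET = {
    "social": "1k",
    "poster": "portrait",
    "mobile": "portrait",
    "beauty": "portrait",
    "gameplay": "landscape",
    "photo": "landscape",
    "paper-figure": "2k",
    "hero": "4k",
    "story": "tall",
}

# Asset types the quality policy lists under "high".
DENSE_ASSETS = {"poster", "diagram", "ui-mockup", "paper-figure"}

QUALITY_BY_STAGE = {"draft": "low", "explore": "medium", "final": "high"}


def suggest_flags(asset: str, stage: str = "explore") -> List[str]:
    """Suggest --size/--quality flags; unknown assets get the 1k default."""
    size = SIZE_BY_ASSET.get(asset, "1k")
    # Assumption: dense-text assets use "high" regardless of stage.
    quality = "high" if asset in DENSE_ASSETS else QUALITY_BY_STAGE[stage]
    return ["--size", size, "--quality", quality]
```

The user's explicit size or quality request should still take precedence over any such suggestion.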
Endpoint routing
| Mode | Trigger | Endpoint family |
|---|---|---|
| Text-to-image | no -i | provider's generations endpoint |
| Reference edit | one or more -i | provider's edits endpoint |
| Inpaint | -i + -m | provider's edits endpoint with mask |
Surface API errors with enough verbatim detail to debug them. Exit codes: 0 success, 1 API error or refusal, 2 bad arguments or missing key.
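The routing table and exit codes can be sketched as two small functions. Both `route_mode` and `explain_exit` are hypothetical names used only to illustrate the rules; the mode strings are labels chosen here, not values the CLI emits.

```python
from typing import Optional, Sequence

EXIT_MEANINGS = {
    0: "success",
    1: "API error or refusal",
    2: "bad arguments or missing key",
}


def route_mode(images: Sequence[str], mask: Optional[str]) -> str:
    """Classify a call by its -i/-m arguments, following the routing table."""
    if mask and not images:
        # The flags table states that -m requires -i.
        raise ValueError("-m/--mask requires at least one -i/--image")
    if not images:
        return "text-to-image"
    return "inpaint" if mask else "reference-edit"


def explain_exit(code: int) -> str:
    """Translate a documented exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")
```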
Reference loading
- `references/gallery.md`: routing index for the cookbook gallery. Load first.
- `references/gallery-*.md`: concrete prompts, previews, paths, metadata. Load 1 category for normal requests; 2–3 for hybrids.
- `references/craft.md`: prompt-craft checklist. Load for prompt repair, exact text rendering, UI/data/diagram grammar, edit invariants, and multi-panel consistency.
- `references/providers.md`: provider/model semantics. Load for API behavior or capability questions.
Reference loading policy: load the smallest useful slice; never load all category files by default.
Verification
After a generation call:
- Confirm the output file exists at the reported path.
- If the prompt requested specific text, verify that the text rendered correctly; if it did not, re-run with `--quality high`.
- For edits/inpainting, confirm the unmasked regions are preserved.
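The first check, that each reported output file exists, is easy to automate. A minimal sketch follows; `missing_outputs` is a hypothetical helper, and treating a zero-byte file as missing is an assumption. Text rendering and preserved regions still need a visual check.

```python
from pathlib import Path
from typing import Iterable, List


def missing_outputs(paths: Iterable[str]) -> List[str]:
    """Return the reported paths that are absent or empty on disk."""
    missing = []
    for p in paths:
        path = Path(p)
        # Assumption: an empty file counts as a failed write.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(p))
    return missing
```

An empty result means every reported path points at a non-empty file.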