| name | codex-api-imagegen |
| description | Generate raster images (PNG/JPEG/WebP) via the local codex-api gateway, powered by the user's ChatGPT subscription — no OPENAI_API_KEY needed. Use when an agent needs to create a brand-new bitmap asset for the current project (photos, illustrations, icons, hero banners, mockups, sprites, concept art) and the output should be a bitmap file saved into the workspace. Do not use when the task is better solved by editing existing SVG/vector assets, writing code-native graphics (HTML/CSS/canvas), or extending an established repo icon system. |
codex-api Image Generation Skill
Generates images by calling the local codex-api gateway's
/v1/chat/completions endpoint with image_generation tool enabled. The
gateway forwards the request to ChatGPT's Responses API and returns the PNG
embedded as a data URI in the assistant message. This helper extracts that
PNG and writes it to a path inside the current project.
Prerequisites
- The codex-api gateway is running locally (default:
http://127.0.0.1:8000).
- Start it with:
cd ~/github.com/codex-api && uv run agent-cli-to-api codex
- Or rely on the user's launchd auto-start (
com.codex-api.gateway).
- The user has a logged-in ChatGPT subscription (
~/.codex/auth.json exists,
populated by codex login in the Codex CLI). No OPENAI_API_KEY required.
- Gateway settings:
CODEX_USE_CODEX_RESPONSES_API=1 — default
CODEX_ENABLE_IMAGE_GEN=1 — must be set explicitly (default is OFF). If
it's not set, the gateway won't inject the image_generation tool and the
model will return text only. Symptom: the script exits with
Model did not return an image.
If the gateway is unreachable, point the agent at the codex-api repo and ask
the user to start it — do not silently fall back. If CODEX_ENABLE_IMAGE_GEN
is unset, tell the user to add it to their .env / launchd plist and restart
the gateway, or to launch a one-off gateway with that env set.
When to use
- The user asks for a new photo, illustration, icon, hero banner, sprite,
cover image, infographic-style asset, product mockup, concept art, or any
other bitmap deliverable for the current project.
- The asset is intended to be checked into the repo (or used as a build input)
rather than ephemeral preview.
When not to use
- The user wants an SVG icon that matches an existing in-repo vector set.
- The task is better solved with code (HTML/CSS, canvas, Mermaid, PlantUML).
- The user is editing an image they already have on disk — for that, use the
Codex CLI's bundled
imagegen skill or call /v1/images/edits directly
with their OPENAI_API_KEY (this skill is generate-only at present).
Credentials — DO NOT pass tokens on the command line
Agents running this skill must not put CODEX_API_TOKEN=... or --token ...
on the command line, because tool calls are echoed into transcripts and chat
logs. The token would leak into anyone who can see the conversation.
The script reads the gateway URL and token from three sources, in priority
order:
- CLI flag (
--base-url, --token) — only when the user explicitly types
them; never construct these flags from a token your agent has memorised.
- Environment variable (
CODEX_API_BASE_URL, CODEX_API_TOKEN) — works if
the user exports them in their shell config.
- Plain file (
~/.config/codex-api/base_url and ~/.config/codex-api/token,
one line each, recommended mode 600) — preferred for agent workflows.
One-time setup (user runs this, once):
mkdir -p ~/.config/codex-api
chmod 700 ~/.config/codex-api
printf 'https://your-gateway.example.com' > ~/.config/codex-api/base_url
printf 'your-gateway-token' > ~/.config/codex-api/token
chmod 600 ~/.config/codex-api/{base_url,token}
After that, the agent just runs python3 scripts/generate.py "<prompt>" with
no env, no token flag — credentials come from the file silently.
If the file is missing and no env vars are set, the script falls back to
http://127.0.0.1:8000 + devtoken (the default for a locally-running
gateway with no auth configured).
How to invoke
python3 scripts/generate.py "<prompt>" [options]
Common options:
| Flag | Default | Notes |
|---|
-o, --out PATH | assets/generated/<slug>.png | Where to write the file. Parent dirs are created. |
--size | auto | 1024x1024, 1536x1024, 1024x1536, 2048x2048, 3840x2160, etc. The subscription path honours the size. |
--format | png | png | jpeg | webp |
--model | gpt-5.5 | The chat model that hosts image_generation |
--base-url | from $CODEX_API_BASE_URL → ~/.config/codex-api/base_url → http://127.0.0.1:8000 | Override gateway URL. Don't pass this from an agent — set the file once. |
--token | from $CODEX_API_TOKEN → ~/.config/codex-api/token → devtoken | Bearer token for the gateway. Don't pass this from an agent — set the file once. |
--quiet | off | Print only the output path on stdout |
The script prints just the saved file path on stdout (and progress info on
stderr), so the agent can capture it directly:
OUT=$(python3 scripts/generate.py "..." --quiet)
echo "saved to: $OUT"
Save-path policy
- Always save into the workspace, never to
/tmp or $HOME.
- If the user names a destination, pass it via
-o.
- If the asset is for the project, save into a sensible repo subdirectory
(e.g.
assets/, public/, static/, docs/img/, web/img/).
- Never overwrite an existing file unless the user explicitly asked for it
(
-o will silently overwrite — the script does not autoshield). The
default path with --out omitted auto-numbers (name.png, name-2.png).
- After saving, always echo back the path to the user.
Workflow for the agent
- Clarify the request enough to write a 1-3 sentence visual prompt:
subject, style, composition, mood, constraints.
- Pick the size/format based on intended use:
- icon →
1024x1024 png
- landing hero →
1536x1024 png (landscape)
- mobile splash →
1024x1536 png (portrait)
- photo →
2048x1152 png or auto
- transparent cutout → not currently supported by this skill on the
subscription path; tell the user it requires an
OPENAI_API_KEY and
the Codex CLI's bundled imagegen skill.
- Pick the output path under the workspace.
- Call the script, capture stdout (= path), report it back.
- Inspect the result if necessary. If the image is clearly wrong, iterate
with a single targeted change to the prompt — do not chain many calls
blindly (each costs subscription quota).
Quality and limits
- Quality is auto-selected by the model. Asking for
quality: high via the
tool params is silently downgraded to medium on the subscription path —
this is a tier cap, not a bug. Photorealistic 1536x1024 medium-quality
output is excellent.
- A single image typically takes 15-40 seconds to generate.
- Each call consumes ChatGPT subscription quota — the rate limit is shared
with the user's interactive ChatGPT/Codex usage. Don't loop over many
variants without permission.
Example
python3 scripts/generate.py \
"Minimal flat-design illustration of a green leaf, white background, centered, no text" \
-o assets/brand/leaf-icon.png \
--size 1024x1024 \
--quiet
Output (stdout): assets/brand/leaf-icon.png
Error handling
| Symptom | Likely cause | Fix |
|---|
failed to reach gateway | gateway not running | start it (see Prerequisites) |
HTTP 401 Missing Authorization | wrong/missing token | set CODEX_API_TOKEN or --token |
HTTP 403 error code: 1010 | Cloudflare in front of the gateway is blocking the request | the script already sets a User-Agent; if you removed it, restore it. Also confirm the gateway hostname is in your CF allowlist. |
HTTP 400 requires a newer version of Codex | gateway is sending version: 0.111.0 because it can't detect the local codex CLI version | server-side fix: pull the latest codex-api (commit fbf9316 or newer reads ~/.codex/version.json), or ensure codex --version succeeds in the gateway's environment (launchd PATH must reach node for the codex shim). Bumping the local codex CLI alone does not help if detection still fails. |
HTTP 500 env: node: No such file | gateway falling back to codex exec subprocess | confirm CODEX_USE_CODEX_RESPONSES_API=1 (default in current codex-api) |
Model did not return an image | model returned text only | rephrase prompt to explicitly say "use the image_generation tool" |
| HTTP 429 / quota errors | subscription rate-limited | wait, or switch to API-key path (gpt-image-2 direct) |
Internals (for future maintainers)
- Gateway request body: standard chat completions, the model is instructed to
call the
image_generation Responses-API tool.
- Gateway injects
tools: [{"type": "image_generation"}] into the upstream
/responses call automatically when CODEX_ENABLE_IMAGE_GEN=1 (default).
- Gateway collects
response.output_item.done events whose item.type == "image_generation_call" and embeds the base64 PNG as a markdown
 part of the assistant message content.
- The script extracts the first such data URI and writes it to disk.