polli
// Generate images, text, audio, video, and transcribe speech via the Pollinations API using the polli CLI. Use when asked to generate media, call pollinations.ai, check pollen balance, list models, manage API keys, or run polli commands.
| name | polli |
| description | Generate images, text, audio, video, and transcribe speech via the Pollinations API using the polli CLI. Use when asked to generate media, call pollinations.ai, check pollen balance, list models, manage API keys, or run polli commands. |
| allowed-tools | Bash(polli *) |
Thin wrapper around gen.pollinations.ai. Generates images, text, audio, video; transcribes speech; manages API keys and usage.
| Intent | Command |
|---|---|
| Log in once | `polli auth login` |
| Store an existing key | `printf '%s' "$POLLINATIONS_API_KEY" \| polli auth login --with-token` |
| Generate image | `polli gen image "<prompt>" --output out.png` |
| Generate text | `polli gen text "<prompt>"` |
| Text with stdin as context | `echo "<ctx>" \| polli gen text "<question>"` |
| Describe an image (vision) | `polli gen text "what is this?" --image <url>` |
| One-shot TTS | `polli gen audio "<text>" --output speech.mp3` |
| Speak out loud | `polli gen audio "<text>" --play` (uses `afplay` on macOS; `ffplay`/`mpv`/`mpg123` on Linux) |
| Generate video | `polli gen video "<prompt>" --output out.mp4` |
| Transcribe audio | `polli gen transcribe path/to.mp3` |
| Upload a local file | `polli upload path/to.png` (prints public URL) |
| List all models | `polli models` |
| Filter models by type | `polli models --type image` |
| Model health + latency | `polli models --stats` (default 60m, `--window <min>`) |
| Check balance | `polli usage` |
| Machine-readable output | append `--json` to any command |
One-time: `polli auth login` (device flow). To store an existing key, run `printf '%s' "$POLLINATIONS_API_KEY" | polli auth login --with-token`. Verify with `polli auth status`.
Override the stored key for a single command with `--key <key>`.
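A script can check the login state before it spends time on a generation call. The sketch below is one way to do that; `require_polli_auth` is a hypothetical helper name, and it assumes `polli auth status` exits non-zero when no key is stored, which is worth confirming with `polli auth status --help`.

```shell
# Sketch: stop early when no Pollinations key is stored.
# Assumption: `polli auth status` exits non-zero when logged out.
require_polli_auth() {
  if ! polli auth status >/dev/null 2>&1; then
    echo "Not logged in: run polli auth login" >&2
    return 1
  fi
}
```

Typical use is to chain it in front of the real command, for example `require_polli_auth && polli gen text "hello"`.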
polli gen image "a fox reading a book, studio ghibli style" --output fox.png
Defaults: `zimage`, 1024x1024. Pick a different model with `--model flux` (see `polli models --type image`). For edits / img2img, pass one or more `--image <url>` flags. These must be public http(s) URLs; local paths are rejected client-side. Only models that list `"image"` in `input_modalities` actually consume the flag: `flux` and `zimage` are text-only and will silently ignore `--image`. Find i2i-capable models with `polli models --type image --json | jq -r '.[] | select(.input_modalities | contains(["image"])) | .name'` (common choices: `nanobanana`, `kontext`, `p-image-edit`). To use a local file, upload it first with `polli upload` (see next recipe).
URL=$(polli upload cat.png)
polli gen image "make the cat purple" --image "$URL" --output purple.png
`polli upload <file>` posts to media.pollinations.ai (10MB max, 14-day TTL, content-addressed so duplicates dedupe). Human mode: URL on stdout, `id`/`size`/`contentType`/`duplicate` on stderr. `--json`: full upload response on stdout. The returned URL is public (no auth to fetch) and works anywhere `--image` is accepted: `gen image`, `gen video`, etc.
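Because `--image` only takes public URLs, scripts that accept either a URL or a local path need a small normalisation step. A minimal sketch, relying on the human-mode behaviour described above (URL on stdout); `resolve_image` is a hypothetical helper name, not part of polli:

```shell
# Sketch: turn a local path or a URL into something --image accepts.
# resolve_image is a hypothetical helper name, not a polli command.
resolve_image() {
  case "$1" in
    http://*|https://*)
      printf '%s\n' "$1"    # already a public URL, pass it through
      ;;
    *)
      polli upload "$1"     # human mode prints the public URL on stdout
      ;;
  esac
}
```

It slots into the recipes as `polli gen image "make the cat purple" --image "$(resolve_image cat.png)" --output purple.png`.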
polli gen text "summarize the three laws of robotics"
Save to file: `--output summary.txt`. Use `--system "<msg>"` to set the system prompt. For reasoning models, pass `--reasoning low|medium|high` to control reasoning effort. Only send `--reasoning` to models where `reasoning: true` in `polli models --type text --json`. The flag is not validated client-side, and non-reasoning models may silently accept it (`openai`), ignore it, or return a 400 (`mistral`).
URL=$(polli upload selfie.jpg)
polli gen text "turn this person into a cartoon pet in one playful sentence" --image "$URL"
`gen text --image <url...>` attaches one or more public https URLs as an OpenAI-style multimodal message; repeat the flag for multi-image prompts. Local paths aren't supported; run them through `polli upload` first (see the upload recipe above). Only text models with `"image"` in `input_modalities` actually read the image. Filter with `polli models --type text --json | jq -r '.[] | select(.input_modalities | index("image")) | .name'`. Non-vision models silently ignore the attachment. Good defaults: `openai`, `gemini`, `claude`.
cat README.md | polli gen text "what does this project do?"
stdin becomes context; the positional argument is the question.
polli gen chat --model openai --system "you are a terse assistant"
Slash commands inside the session: /exit, /clear, /save <path>.
polli gen audio "hello world" --voice nova --output hello.mp3
echo "long script" | polli gen audio --voice nova --output out.mp3
Default voice is `sage`. To discover the full live voice list, use the model registry: `polli models --type audio --json | jq -r '.[].voices[]?'` (each audio model entry includes its `voices[]` array). Format defaults to mp3; pass `--format opus|aac|flac|wav` to change it. Accepts stdin (same as `gen text`). Add `--play` to save and then play the audio back (handy for narration/demos). Playback starts after the file is fully written, and the command blocks until playback finishes. For fire-and-forget, wrap it in a subshell: `( polli gen audio "..." --play & )`. Player on macOS: `afplay`; on Linux it tries `ffplay`, then `mpv`, then `mpg123`, in that order.
polli gen audio "lofi hip-hop beat" --model elevenmusic --duration 30 --instrumental --output track.mp3
polli gen video "a spacecraft landing on mars" --model wan-fast --duration 5 --output mars.mp4
Cheapest path: `--model wan-fast` at ~$0.01/sec, fixed 5-second output (any `--duration` value is ignored; you always pay for and receive 5 sec). For image-to-video, pass `--image <url>` with a public HTTPS URL (local file paths and 404/rate-limited hosts will fail with a server error).
Flag support varies per model and is not enforced client-side. `--duration`, `--aspect-ratio`, `--audio`, `--negative`, and `--enhance` are forwarded to the server but may be silently ignored. This was verified on `wan-fast`, where duration is locked to 5s, `--aspect-ratio 9:16` still returns 16:9, and `--audio` produces no audio track. Always inspect the output (`file`, `ffprobe`) before trusting that a flag worked. Check `polli models --type video --json` for per-model capabilities.
Video is not tracked by `--stats`. `polli models --type video --stats` returns empty: the stats pipe only records text/image/audio events. To compare video models, fall back to `polli models --type video --json` and look at the price/description fields.
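Since video flags can be silently ignored, it helps to read the result back with `ffprobe` instead of trusting the command line. The helpers below are a sketch under the assumption that `ffprobe` (from FFmpeg) is on the PATH; `video_duration` and `has_audio_track` are hypothetical names.

```shell
# Sketch: read back what the server actually produced.
# video_duration and has_audio_track are hypothetical helper names.
video_duration() {
  # Prints the container duration in seconds.
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

has_audio_track() {
  # Succeeds only if ffprobe lists at least one audio stream.
  [ -n "$(ffprobe -v error -select_streams a \
      -show_entries stream=codec_type -of csv=p=0 "$1")" ]
}
```

For example, after `polli gen video ... --audio --output mars.mp4`, running `has_audio_track mars.mp4 || echo "no audio track" >&2` shows whether `--audio` had any effect.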
polli gen transcribe recording.mp3 --language en
Models: `whisper` (default), `scribe`, `universal-2`, `universal-3-pro`. Accepts common audio formats (mp3, wav, m4a, flac, ogg); non-audio input (e.g. a .txt file) returns a clear `400 invalid_request_error: extension "txt" not supported`, so there is no need to pre-validate with `file`. Default output is the plain transcript on stdout as a single line (pipe-friendly). Use `--json` for structured output: `whisper` and AssemblyAI return timing data when requested through the API, while `scribe` returns only `{text: "..."}`, so use `whisper` or AssemblyAI if you need timing data. `--language <ISO-639-1>` (e.g. `en`, `fr`) is an optional hint that can improve accuracy for non-English or accented speech.
polli models --type text # text models only
polli models --type image --verbose # with context length / pricing
polli models --stats # health + avg latency + err% (60m default)
polli models --stats --window 5 # last 5 minutes only
Use `--stats` before choosing a model. Caveat: the err% column counts 5xx only, so a model can show 0.0% while having massive 4xx rates (auth, validation, etc.). For the full picture use `--stats --json` and read `errors_4xx`, `errors_5xx`, `latency_p95_ms`.
Pricing fields are per-token, not per-request. `completionImageTokens: 0.000008` means each output image-token costs that much: a single 1024x1024 image from `gptimage` lands at ~$0.008, not $0.000008. Flat-priced image models (`flux`, `zimage`) expose `completionImageTokens` as the whole-image price because they emit exactly one "token" per image. When in doubt, make one call and read the true cost from `polli usage --history --limit 5 --json`.
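The per-token arithmetic can be checked with a one-liner. In this sketch the token count of 1000 is not a documented figure; it is simply what the two quoted numbers imply (0.008 / 0.000008), and `image_cost` is a hypothetical helper name.

```shell
# Sketch: per-token price times token count gives the per-image cost.
# The 1000-token count is inferred from the figures quoted above, not documented.
image_cost() {
  # $1 = price per output image-token, $2 = image-tokens emitted
  awk -v price="$1" -v tokens="$2" 'BEGIN { printf "%.6f\n", price * tokens }'
}

image_cost 0.000008 1000   # prints 0.008000
```

For a flat-priced model the token count is 1, so the listed price is already the price of the image.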
polli usage # current pollen balance
polli usage --history # recent individual requests
polli usage --daily # daily cost summary
History is eventually consistent: a request you just made may not appear for 30–60s. When matching costs to freshly generated media, use `--limit 50` and filter by timestamp, and retry if the expected entry is missing. `polli usage --json` returns `{"pollen": <number>}`, the current balance only; use `--history --json` or `--daily --json` for cost breakdowns.
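The retry advice can be wrapped in a small generic loop. This is a sketch only: `retry_until` is a hypothetical helper, the attempt count and delay are arbitrary, and the check you pass in is your own, because the field names of the history JSON are not assumed here.

```shell
# Sketch: re-run a check until it succeeds or the attempts run out.
# retry_until is a hypothetical helper; it uses plain (global) shell variables.
retry_until() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0                          # check passed, stop retrying
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1                                    # never succeeded
}
```

A call such as `retry_until 6 15 entry_present` would poll for up to about 75 seconds, where `entry_present` is a function you write that runs `polli usage --history --limit 50 --json` and looks for the timestamp you expect.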
polli keys list # list all keys on the account
polli keys info # details about the CURRENTLY AUTHENTICATED key only (takes no id)
polli keys create --name "my-bot" --type secret --budget 1000 --permissions profile usage # scoped key
polli keys create --name "my-app" --type publishable --redirect-uri https://app.example/callback --earnings
polli keys revoke <id> # id comes from `keys list --json`
`--permissions <perms...>` scopes what the new key can do on the account (e.g. `profile usage` lets it call `polli --key <new> usage`). Without `--permissions`, new scoped keys can generate media but cannot read account state: `polli --key <new> usage` will 403. `keys` is auto-stripped from the list so a scoped key can never mint further keys. Publishable app keys default to developer earnings off; pass `--earnings` to enable them. To inspect a specific key other than the current one, use `polli keys list --json | jq '.[] | select(.id == "<id>")'`. `keys info` is intentionally scoped to the caller's own key.
polli docs # full llm.txt reference
polli docs /v1/chat/completions # filter to one endpoint
polli docs --open # open in browser
- Human mode (default): `key: value` pairs or tab-separated tables with a header row. Exceptions: `gen text` prints the full response to stdout; `gen transcribe` prints the transcript as plain text; `gen chat` runs an interactive REPL. Status/progress messages go to stderr, so pipes stay clean.
- `--json`: every command emits machine-parseable JSON to stdout; all human messages go to stderr. Always prefer `--json` when piping into `jq` or parsing, since it's the only shape with a stable contract.

- Run `polli auth status` first if you don't know whether the user is logged in. Fail fast with a clear "run `polli auth login`" message if not.
- Use `--json` whenever you'll parse the output. Never grep human-formatted tables.
- Check what is available with `polli models --type <type>`. Model availability changes.
- Check health with `polli models --stats`. Rule of thumb for "healthy": err% ≤ 5, avg latency in a reasonable range for the modality (standard text <5s, image <10s, video <60s), and requests high enough to be statistically meaningful (ignore rows with <10 requests, which are noise). Filter by capability first, then optimize by health: for a reasoning task, narrow to models where `reasoning: true` (via `polli models --type text --json`), then cross-reference against `--stats` output. The healthiest model overall may not support the capability you need. Reasoning models are inherently slower (expect 5–50s, not <5s); when picking among them, prioritize low err% and request count over raw latency, and compare latency only within the reasoning-capable subset.
- Pass `--output <path>` for `gen image`, `gen audio`, `gen video`; otherwise the file lands in the current directory with a default name.
- For `gen text`, pipe the context and pass the question as the positional argument: `cat file | polli gen text "question about the file"`.
- For anything not covered here, run `polli <cmd> --help` or `polli gen <cmd> --help`. This skill's recipes cover the common path; the CLI's own help is always the source of truth.
- Prefer `polli docs [endpoint]` over guessing API shapes. It prints the canonical llm.txt reference from the live API.

- Omitting `--output` on binary generators (image/audio/video): the file goes to a default path, which may not be what the user wants.
- Parsing `polli gen text --json` as if it had the OpenAI chat-completions shape: the CLI's `--json` wraps its own structure. Use `polli docs /v1/chat/completions` to see the raw API shape if you need it.
- `polli auth status` tells you who you're logged in as and your balance in one call.
- `gen text` streams to a TTY and buffers when piped. The default now auto-detects: a human at the terminal sees tokens tick in, while a pipe/redirect gets the full response once. Force either mode with `--stream` or `--no-stream`. For scripts and chains like `polli gen text … | polli gen audio …`, you don't need to do anything; buffering happens automatically.
- Moving a polli workflow into a browser app: gen.pollinations.ai requires a bearer token for generation requests, so a plain client-side fetch with no auth returns 401 unless it is served from cache. Mint a scoped key with `polli keys create` and proxy via your own backend.
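The "healthy" rule of thumb above (err% ≤ 5, at least 10 requests, average latency under the ceiling for standard text, image, or video) can be written as a predicate. This is a sketch: `is_healthy` is a hypothetical helper, and the caller supplies the numbers read from `polli models --stats` output, since no field names of that output are assumed here.

```shell
# Sketch of the rule of thumb: err% <= 5, >= 10 requests, and average
# latency under the modality ceiling (text 5s, image 10s, video 60s).
# is_healthy is a hypothetical helper; inputs come from --stats output.
is_healthy() {
  # $1 = modality, $2 = err%, $3 = avg latency in seconds, $4 = request count
  case "$1" in
    text)  max=5 ;;
    image) max=10 ;;
    video) max=60 ;;
    *)     return 2 ;;   # unknown modality
  esac
  awk -v e="$2" -v l="$3" -v r="$4" -v m="$max" \
    'BEGIN { exit !(e <= 5 && l < m && r >= 10) }'
}
```

For example, `is_healthy text 1.2 3.4 120` succeeds, while a row with only 5 requests fails regardless of its error rate. The 5s text ceiling applies to standard models; reasoning models should be compared only against each other, as noted above.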