videoagent-image-studio
// Tired of juggling 8 API keys? This skill gives you one-command access to Midjourney, Flux, Ideogram, and more, with zero setup. Use when you want to generate any image without worrying about API keys.
| name | videoagent-image-studio |
| version | 2.0.0 |
| author | wells |
| emoji | 🎨 |
| tags | ["video","image-generation","midjourney","flux","gemini","fal","ideogram","recraft"] |
| description | Tired of juggling 8 API keys? This skill gives you one-command access to Midjourney, Flux, Ideogram, and more, with zero setup. Use when you want to generate any image without worrying about API keys. |
| homepage | https://github.com/pexoai/image-studio-skill |
| metadata | {"openclaw":{"emoji":"🎨","install":[{"id":"node","kind":"node","label":"No dependencies needed — all calls go through the hosted proxy"}]}} |
Use when: User asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.
Generate images with 8 state-of-the-art AI models. This skill automatically picks the best model for the job and handles all the complexity — including Midjourney's async polling — so you can focus on the conversation.
| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | midjourney | ~15s |
| Photorealistic, portrait, product | flux-pro | ~8s |
| General purpose, balanced | flux-dev | ~10s |
| Quick draft, fast iteration | flux-schnell | ~2s |
| Image with text, logo, poster | ideogram | ~10s |
| Vector art, icon, flat design | recraft | ~8s |
| Anime, stylized illustration | sdxl | ~5s |
| Gemini-powered, consistent style | nano-banana | ~12s |
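The selection table above can be read as a keyword lookup. The following is a minimal sketch of that idea, not the skill's actual routing logic: `MODEL_HINTS` and `pickModel` are hypothetical names, and the keyword lists are simply lifted from the "User Intent" column.

```javascript
// Hypothetical keyword table derived from the "User Intent" column above.
// Order matters: earlier entries win when a request matches several rows.
const MODEL_HINTS = [
  { model: "ideogram", keywords: ["text", "logo", "poster"] },
  { model: "recraft", keywords: ["vector", "icon", "flat"] },
  { model: "sdxl", keywords: ["anime", "stylized"] },
  { model: "flux-schnell", keywords: ["draft", "quick", "fast"] },
  { model: "flux-pro", keywords: ["photorealistic", "portrait", "product"] },
  { model: "midjourney", keywords: ["artistic", "cinematic", "painterly"] },
  { model: "nano-banana", keywords: ["gemini", "consistent"] },
];

// Return a model ID for a free-text request, falling back to the
// general-purpose model when no keyword matches.
function pickModel(request) {
  const words = request.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  for (const { model, keywords } of MODEL_HINTS) {
    if (keywords.some((k) => words.includes(k))) return model;
  }
  return "flux-dev";
}
```

An explicit model named by the user (for example "Use Flux") should presumably take precedence over any keyword match; that override is left out of this sketch.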
Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.
cinematic lighting, ultra detailed, --v 7, --style raw
masterpiece, highly detailed, sharp focus, professional photography
vector illustration, flat design, icon style

node {baseDir}/tools/generate.js \
--model <model_id> \
--prompt "<enhanced prompt>" \
--aspect-ratio <ratio>
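Prompt expansion can be sketched as appending the descriptors listed above to the user's prompt. This is only an illustration: `DESCRIPTORS` and `enhancePrompt` are hypothetical names, and the pairing of each descriptor set with a model ID is an assumption (the `--v 7` and `--style raw` flags suggest Midjourney, the other two pairings are guesses based on the model table).

```javascript
// Descriptor sets copied from the text above.
// ASSUMPTION: which model each set belongs to is inferred, not stated by the source.
const DESCRIPTORS = {
  midjourney: ["cinematic lighting", "ultra detailed", "--v 7", "--style raw"],
  "flux-pro": ["masterpiece", "highly detailed", "sharp focus", "professional photography"],
  recraft: ["vector illustration", "flat design", "icon style"],
};

// Append descriptors the prompt does not already contain.
// Plain descriptors are comma-separated; "--" flags go last, space-separated.
function enhancePrompt(model, prompt) {
  const lower = prompt.toLowerCase();
  const extras = (DESCRIPTORS[model] || []).filter(
    (d) => !lower.includes(d.toLowerCase())
  );
  if (extras.length === 0) return prompt;
  const words = extras.filter((d) => !d.startsWith("--"));
  const flags = extras.filter((d) => d.startsWith("--"));
  return [[prompt, ...words].join(", "), ...flags].join(" ");
}
```

Models without an entry in the table get their prompt back unchanged, so the function is safe to call for any model ID.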
All parameters:
| Parameter | Default | Description |
|---|---|---|
| --model | flux-dev | Model ID from the table above |
| --prompt | (required) | The image generation prompt |
| --aspect-ratio | 1:1 | 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 21:9 |
| --num-images | 1 | Number of images (1–4; Midjourney always returns 4) |
| --negative-prompt | — | Things to avoid (not supported by Midjourney) |
| --seed | — | Seed for reproducibility |
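The parameter table above can be turned into a small argument builder that applies the documented defaults and rejects values the table rules out. This is a sketch under stated assumptions: `buildArgs` and its option names (`aspectRatio`, `numImages`, `negativePrompt`) are hypothetical, and only the flags themselves come from the table.

```javascript
// Aspect ratios listed in the parameter table.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "21:9"];

// Build the argument list for generate.js from an options object.
// Hypothetical helper: the flag names come from the table, the rest is illustrative.
function buildArgs(options) {
  const {
    model = "flux-dev",
    prompt,
    aspectRatio = "1:1",
    numImages = 1,
    negativePrompt,
    seed,
  } = options;

  if (!prompt) throw new Error("--prompt is required");
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error(`unsupported aspect ratio: ${aspectRatio}`);
  }
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new Error("--num-images must be an integer from 1 to 4");
  }
  if (negativePrompt && model === "midjourney") {
    throw new Error("--negative-prompt is not supported by Midjourney");
  }

  const args = [
    "--model", model,
    "--prompt", prompt,
    "--aspect-ratio", aspectRatio,
    "--num-images", String(numImages),
  ];
  if (negativePrompt) args.push("--negative-prompt", negativePrompt);
  if (seed !== undefined) args.push("--seed", String(seed));
  return args;
}
```

The resulting array could be passed to Node's `child_process.execFile("node", [scriptPath, ...args], callback)`, which avoids shell quoting problems with prompts that contain spaces or quotes.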
The script always waits and returns the final image URL(s). No polling required.
{
"success": true,
"model": "flux-pro",
"imageUrl": "https://...",
"images": ["https://..."]
}
Send the imageUrl to the user.
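Reading the response might look like the sketch below. `extractImageUrls` is a hypothetical helper; it assumes the script prints the JSON object shown above to stdout and that `success` is `false` on failure (the failure shape is not documented here, so that part is a guess).

```javascript
// Parse the JSON printed by generate.js and return every image URL.
// ASSUMPTION: a failed run reports success: false; the source only shows the success case.
function extractImageUrls(stdout) {
  const result = JSON.parse(stdout);
  if (!result.success) throw new Error("image generation failed");
  // Prefer the full list; fall back to the single imageUrl field.
  if (Array.isArray(result.images) && result.images.length > 0) {
    return result.images;
  }
  return result.imageUrl ? [result.imageUrl] : [];
}
```

For single-image models the list has one entry; for a Midjourney grid it would presumably hold all four URLs.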
After generating a 4-image grid with Midjourney, offer the user these options:
# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
--model midjourney \
--action upscale \
--index 2 \
--job-id <job_id>
# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
--model midjourney \
--action variation \
--index 3 \
--job-id <job_id> \
--variation-type 1
# Regenerate with same prompt
node {baseDir}/tools/generate.js \
--model midjourney \
--action reroll \
--job-id <job_id>
Upscale types: 0 = Subtle (default, best for photos), 1 = Creative (best for illustrations)
Variation types: 0 = Subtle (default), 1 = Strong (dramatic changes)
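The three follow-up commands above share most of their flags, so they can be assembled by one helper. This is a minimal sketch: `midjourneyActionArgs` is a hypothetical name, and the 1 to 4 index range is an assumption based on the 4-image grid. No flag is shown above for choosing the upscale type, so the sketch leaves upscales on their default.

```javascript
// Build generate.js arguments for a Midjourney follow-up action.
// Hypothetical helper; only flags that appear in the commands above are emitted.
function midjourneyActionArgs({ action, jobId, index, variationType }) {
  if (!["upscale", "variation", "reroll"].includes(action)) {
    throw new Error(`unknown action: ${action}`);
  }
  if (!jobId) throw new Error("--job-id is required");

  const args = ["--model", "midjourney", "--action", action];
  if (action !== "reroll") {
    // ASSUMPTION: indexes run 1 to 4, matching the 4-image grid.
    if (![1, 2, 3, 4].includes(index)) throw new Error("index must be 1 to 4");
    args.push("--index", String(index));
  }
  args.push("--job-id", jobId);
  if (action === "variation" && variationType !== undefined) {
    if (variationType !== 0 && variationType !== 1) {
      throw new Error("variation type must be 0 (subtle) or 1 (strong)");
    }
    args.push("--variation-type", String(variationType));
  }
  return args;
}
```

Rerolls ignore `index`, since the command above regenerates the whole grid from the same prompt.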
User: "Draw a snow leopard on a snowy mountain with cinematic lighting"
# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
--model midjourney \
--prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
--aspect-ratio 16:9
🎨 Done! Which one should I upscale (U1-U4)? Or would you like a variation (V1-V4)?
User: "Use Flux to generate a perfume product poster, white background"
# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
--model flux-pro \
--prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
--aspect-ratio 3:4
User: "Show me a quick draft"
# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
--model flux-schnell \
--prompt "..." \
--aspect-ratio 1:1
User: "Make me an App icon, flat style, blue theme"
# recraft for vector/icon style
node {baseDir}/tools/generate.js \
--model recraft \
--prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"
Zero API keys needed! All requests go through a hosted proxy that handles authentication server-side.
The skill works out of the box — just install and use.
If you want to use your own proxy or a persistent token, set these environment variables:
{
"skills": {
"entries": {
"videoagent-image-studio": {
"enabled": true,
"env": {
"IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
"IMAGE_STUDIO_TOKEN": "your_token_here"
}
}
}
}
}
| Variable | Required | Description |
|---|---|---|
| IMAGE_STUDIO_PROXY_URL | No | Custom proxy base URL (default: https://image-gen-proxy.vercel.app) |
| IMAGE_STUDIO_TOKEN | No | Persistent token (auto-obtained if not set, 100 free uses per token) |
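How the two optional variables might be resolved at runtime is sketched below. `resolveConfig` is a hypothetical function, not the script's actual code; the default URL is the one given in the table, and a `null` token stands for "obtain one automatically".

```javascript
// Default proxy base URL, as stated in the table above.
const DEFAULT_PROXY_URL = "https://image-gen-proxy.vercel.app";

// Resolve proxy settings from environment variables (hypothetical helper).
function resolveConfig(env = process.env) {
  return {
    // Strip trailing slashes so paths can be appended consistently.
    proxyUrl: (env.IMAGE_STUDIO_PROXY_URL || DEFAULT_PROXY_URL).replace(/\/+$/, ""),
    // null means no persistent token was supplied.
    token: env.IMAGE_STUDIO_TOKEN || null,
  };
}
```

Passing the environment in as a parameter keeps the function easy to test without touching `process.env`.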
To deploy your own proxy, see the videoagent-audio-studio proxy as a reference implementation. You'll need FAL_KEY and LEGNEXT_KEY as Vercel environment variables.
--async / --poll flags needed in SKILL.md instructions.
{ success, imageUrl, images } shape.
--reference-images "url1,url2" for character/style consistency across generations.
--async + --poll).