video-generation
AI video generation via Replicate — 17 models, editing, and production workflows
| Field | Value |
|---|---|
| name | video-generation |
| description | AI video generation via Replicate — 17 models, editing, and production workflows |
| tier | standard |
| applyTo | `**/*video*,**/*animate*,**/*clip*` |
| $schema | ../SKILL-SCHEMA.json |
Domain: AI Video Generation
Version: 2.0.0
Last Updated: 2026-04-15
Author: Alex (Master Alex)
Source: Patterns from AlexVideos CLI toolkit
Related: image-handling (images), text-to-speech (audio)
Generate AI videos using 17 cloud models on Replicate. Supports text-to-video, image-to-video, video editing, and talking-head workflows.
| Key | Model | Replicate ID | Duration | Audio | Cost |
|---|---|---|---|---|---|
| veo3fast | Veo 3.1 Fast | google/veo-3.1-fast | 4/6/8s | ✅ Auto | $0.10–0.15/sec |
| veo3 | Veo 3.1 | google/veo-3.1 | 4/6/8s | ✅ Auto | $0.20–0.40/sec |
| grok | Grok Video | xai/grok-imagine-video | 1–15s | ✅ Lip-sync | $0.05/sec |
| gen45 | Gen-4.5 Runway | runwayml/gen-4.5 | 5–10s | ❌ | $0.12/sec |
| kling | Kling v3 | kwaivgi/kling-v3-video | 3–15s | Optional | $0.17–0.22/sec |
| kling26 | Kling v2.6 | kwaivgi/kling-v2.6 | 5–10s | ✅ Auto | variable |
| kling3omni | Kling v3 Omni | kwaivgi/kling-v3-omni-video | 3–15s | ❌ | variable |
| sora | Sora-2 | openai/sora-2 | 4–12s | ✅ Auto | $0.10/sec |
| sora2pro | Sora-2 Pro | openai/sora-2-pro | 4/8/12s | ✅ Auto | $0.30–0.50/sec |
| seedance | Seedance Lite | bytedance/seedance-1-lite | 2–12s | ❌ | $0.036/sec |
| seedpro | Seedance Pro | bytedance/seedance-1-pro | 2–12s | ❌ | $0.15/sec |
| pixverse | PixVerse v5.6 | pixverse/pixverse-v5.6 | 5/10s | ✅ Auto | variable |
| hailuo | Hailuo-02 | minimax/hailuo-02 | 6/10s | ❌ | variable |
| hailuo23 | Hailuo-2.3 | minimax/hailuo-2.3 | 6/10s | ❌ | variable |
| ray2 | Ray 2 Luma | luma/ray-2-720p | 5/9s | ❌ | $0.18/sec |
| rayflash | Ray Flash 2 | luma/ray-flash-2-720p | 5/9s | ❌ | $0.06/sec |
| wan | WAN 2.5 Fast | wan-video/wan-2.5-t2v-fast | 5–10s | ✅ Auto | $0.068/sec |
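Several models in the catalog accept only discrete durations, so it can help to check a requested value before sending a request. The following is a minimal sketch, not part of the AlexVideos toolkit: the `MODELS` object copies three rows from the table above, and the helper name `isValidDuration` is hypothetical.

```javascript
// Subset of the catalog above. `durations` lists discrete values in seconds,
// `range` gives an inclusive [min, max] span. Only three rows are copied here.
const MODELS = {
  veo3fast: { id: "google/veo-3.1-fast", durations: [4, 6, 8] },
  grok: { id: "xai/grok-imagine-video", range: [1, 15] },
  ray2: { id: "luma/ray-2-720p", durations: [5, 9] },
};

// Hypothetical helper: true when `seconds` is allowed for the model key.
function isValidDuration(key, seconds) {
  const model = MODELS[key];
  if (!model) throw new Error(`Unknown model key: ${key}`);
  if (model.durations) return model.durations.includes(seconds);
  const [min, max] = model.range;
  return seconds >= min && seconds <= max;
}

console.log(isValidDuration("veo3fast", 5)); // false: only 4, 6, or 8 are listed
console.log(isValidDuration("grok", 12)); // true: inside the 1–15 range
```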
| Model | prompt | duration | image | aspect | resolution | negative | audio |
|---|---|---|---|---|---|---|---|
| veo3fast | ✅ | 4/6/8 | ✅ | ✅ | ✅ | ✅ | ✅ auto |
| veo3 | ✅ | 4/6/8 | ✅ | ✅ | ✅ | ✅ | ✅ auto |
| grok | ✅ | 1–15 | ✅ | ✅ | ✅ | — | ✅ auto |
| gen45 | ✅ | 5–10 | ✅ | ✅ | — | — | — |
| kling | ✅ | 3–15 | ✅ | ✅ | ✅ (mode) | ✅ | optional |
| kling26 | ✅ | 5–10 | ✅ | ✅ | — | ✅ | ✅ auto |
| kling3omni | ✅ | 3–15 | ✅ | ✅ | ✅ (mode) | — | — |
| sora | ✅ | 4–12 | ✅ | ✅ | — | — | ✅ auto |
| sora2pro | ✅ | 4/8/12 | ✅ | ✅ | ✅ | — | ✅ auto |
| seedance | ✅ | 2–12 | ✅ | ✅ | ✅ | — | — |
| seedpro | ✅ | 2–12 | ✅ | ✅ | ✅ | — | — |
| pixverse | ✅ | 5/10 | ✅ | ✅ | ✅ (quality) | ✅ | ✅ auto |
| hailuo | ✅ | 6/10 | ✅ | — | ✅ | — | — |
| hailuo23 | ✅ | 6/10 | ✅ | — | ✅ | — | — |
| ray2 | ✅ | 5/9 | ✅ | ✅ | — | — | — |
| rayflash | ✅ | 5/9 | ✅ | ✅ | — | — | — |
| wan | ✅ | 5–10 | — | ✅ | — | ✅ | ✅ auto |
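One way to use this matrix is to drop options a model does not support before building a request. The sketch below assumes the column labels (`image`, `aspect`, `resolution`, `negative`) as option names; these are labels from the table, not confirmed Replicate input field names, and `buildInput` is a hypothetical helper.

```javascript
// Optional parameters per model key, copied from three rows of the matrix
// above. Every listed model supports prompt and duration.
const SUPPORTED = {
  veo3fast: ["image", "aspect", "resolution", "negative"],
  gen45: ["image", "aspect"],
  wan: ["aspect", "negative"],
};

// Hypothetical helper: keep prompt and duration, drop options the model lacks.
function buildInput(key, options) {
  const allowed = SUPPORTED[key];
  if (!allowed) throw new Error(`Unknown model key: ${key}`);
  const input = { prompt: options.prompt, duration: options.duration };
  for (const name of allowed) {
    if (options[name] !== undefined) input[name] = options[name];
  }
  return input;
}

const input = buildInput("wan", {
  prompt: "Waves at dusk",
  duration: 5,
  image: "ref.png", // dropped: the wan row has no image support
  aspect: "16:9",
});
console.log(Object.keys(input)); // [ 'prompt', 'duration', 'aspect' ]
```

The actual field names each model expects should be checked against its schema on Replicate before mapping these labels to request inputs.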
Notes:
| Need | Model | Why |
|---|---|---|
| Default / Quick preview | veo3fast | Best balance: speed, quality, cost |
| Talking head / lip-sync | grok, kling26 | Grok: best lip-sync; Kling26: best motion |
| High quality cinematic | sora2pro, veo3 | Premium quality, synced audio |
| Image animation | kling, gen45 | Strong image-to-video |
| Budget production | rayflash, seedance | Cheapest per-second |
| Longest duration (15s) | grok, kling | Only models supporting 15s |
| Real-world physics | hailuo, hailuo23 | Physics simulation, VFX |
Proven pipeline for generating a person speaking to camera with synchronized speech:
```bash
# Step 1: Age-progress reference photo (if needed)
node scripts/generate-image.js "A 26-year-old professional in studio" --model nanapro --image ref.png

# Step 2: Generate video with Kling v2.6 (best talking-head i2v)
node scripts/generate-video.js "Person looking directly into camera, speaking confidently" --model kling26 --image aged.jpg --duration 10

# Step 3: Generate speech
node scripts/generate-voice.js "[script]" --model mm28turbo

# Step 4: Merge video + speech
node scripts/generate-edit-video.js --model avmerge --video clip.mp4 --audio speech.mp3
```
Model notes:
- kling26: Best overall motion quality for talking-head i2v
- grok: Best lip sync, but no native audio (requires avmerge)
- veo3fast / sora: May flag real-person reference photos

Prompt pattern:
"A sharp [age]-year-old [description] looking directly into camera, speaking to the audience about [topic]. Calm, confident, broadcast-quality delivery. Professional studio lighting."
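The bracketed slots in the pattern above can be filled programmatically when generating several clips. This is a small sketch; the function name `talkingHeadPrompt` and its parameter object are illustrative only.

```javascript
// Fills the [age], [description], and [topic] slots of the prompt pattern above.
function talkingHeadPrompt({ age, description, topic }) {
  return (
    `A sharp ${age}-year-old ${description} looking directly into camera, ` +
    `speaking to the audience about ${topic}. ` +
    "Calm, confident, broadcast-quality delivery. Professional studio lighting."
  );
}

console.log(
  talkingHeadPrompt({ age: 26, description: "professional", topic: "AI video tools" })
);
```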
| Key | Model | Purpose | Cost |
|---|---|---|---|
| modify | Luma Modify | AI video style transfer | variable |
| reframe | Luma Reframe | AI crop to new aspect ratio | $0.06/sec |
| trim | Trim Video | Extract segment by time | <$0.001 |
| merge | Video Merge | Concatenate videos | variable |
| avmerge | ffmpeg-static | Combine audio + video | free (local) |
| extract | Extract Audio | Strip audio from video | variable |
| frames | Frame Extractor | Export frames as images | <$0.001 |
| upscale | Real-ESRGAN | AI upscale to 4K | ~$0.46 |
| caption | AutoCaption | Auto subtitles | ~$0.07 |
| utils | Video Utils | Convert, misc ops | <$0.002 |
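```javascript
import Replicate from "replicate";

// The client reads the API token from the REPLICATE_API_TOKEN environment variable.
const replicate = new Replicate();

const output = await replicate.run("google/veo-3.1-fast", {
  input: {
    prompt: "A person walking through a futuristic city at sunset, cinematic lighting",
    duration: 6, // seconds; the catalog lists 4, 6, or 8 for this model
  },
});
console.log("Video URL:", output);
```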
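```javascript
import { readFileSync } from "fs";

// Reuses the `replicate` client created in the first example.
const output = await replicate.run("kwaivgi/kling-v3-video", {
  input: {
    prompt: "Gentle wind blowing through hair, subtle movement",
    image: readFileSync("source-image.png"), // local file as a Buffer, or pass a URL string
    duration: 5,
  },
});
```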
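```javascript
import { readFileSync } from "fs";

// Reuses the `replicate` client created in the first example.
const output = await replicate.run("xai/grok-imagine-video", {
  input: {
    prompt: "Person speaking to camera with natural expressions",
    audio: readFileSync("narration.mp3"), // voice audio for lip-sync
    duration: 10,
  },
});
```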
| Element | Good Example | Avoid |
|---|---|---|
| Camera motion | "slow dolly forward", "tracking shot" | "camera moves" |
| Lighting | "golden hour", "dramatic side lighting" | "good lighting" |
| Style | "cinematic 4K", "documentary style" | "nice video" |
| Subject | "a person with short brown hair" | "someone" |
| Action | "walking slowly through", "gesturing while talking" | "doing something" |
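The "Avoid" column can double as a quick lint pass over prompts before they are submitted. A minimal sketch follows; the phrase list is copied from the table above and `findVaguePhrases` is a hypothetical helper that does plain substring matching.

```javascript
// Phrases from the "Avoid" column of the table above.
const VAGUE_PHRASES = [
  "camera moves",
  "good lighting",
  "nice video",
  "someone",
  "doing something",
];

// Hypothetical helper: returns the vague phrases found in a prompt,
// in the order they appear in VAGUE_PHRASES.
function findVaguePhrases(prompt) {
  const lower = prompt.toLowerCase();
  return VAGUE_PHRASES.filter((phrase) => lower.includes(phrase));
}

console.log(findVaguePhrases("Someone doing something, good lighting"));
// [ 'good lighting', 'someone', 'doing something' ]
console.log(findVaguePhrases("A person with short brown hair, slow dolly forward, golden hour"));
// []
```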
| Duration | Use Case | Model Recommendation |
|---|---|---|
| 2-4s | Loop, GIF replacement | veo-3.1-fast |
| 5-8s | Short clip, social media | veo-3.1-fast, kling-v3 |
| 10-15s | Story segment, talking head | grok-video, sora-2 |
| Model | Audio Type | Notes |
|---|---|---|
| veo-3.1-fast | Auto-generated | Ambient sounds from scene |
| grok-video | Lip-sync | Syncs mouth to provided audio |
| sora-2 | Synced | High-quality scene audio |
For models without audio, combine with TTS:
```javascript
// 1. Generate video (no audio)
const video = await replicate.run("kwaivgi/kling-v3-video", {
  input: { prompt: "...", duration: 5 },
});

// 2. Generate narration via TTS
const audio = await replicate.run("minimax/speech-2.8-turbo", {
  input: { text: "Narration text", voice: "Wise_Woman" },
});

// 3. Combine with ffmpeg
// ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac output.mp4
```
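If the merge step is run from Node rather than a shell, the ffmpeg command in step 3 can be expressed as an argument array. This is a sketch under the assumption that both files have already been downloaded locally; `buildMergeArgs` is a hypothetical helper and the file names are placeholders.

```javascript
// Builds the argument list for the ffmpeg command shown in step 3.
function buildMergeArgs(videoPath, audioPath, outputPath) {
  return [
    "-i", videoPath, // input 0: generated video
    "-i", audioPath, // input 1: narration
    "-c:v", "copy", // keep the video stream without re-encoding
    "-c:a", "aac", // encode the narration as AAC
    outputPath,
  ];
}

const args = buildMergeArgs("video.mp4", "audio.mp3", "output.mp4");
console.log(["ffmpeg", ...args].join(" "));
// ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac output.mp4
```

The array can be passed to `child_process.spawn("ffmpeg", args)`, which avoids shell quoting issues with unusual file names.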
| Model | 5s clip | 10s clip | 15s clip |
|---|---|---|---|
| veo-3.1-fast | ~$0.15 | N/A (8s max) | N/A (8s max) |
| grok-video | ~$0.25 | ~$0.50 | ~$0.75 |
| minimax-video | ~$0.25 | ~$0.50 | N/A (10s max) |
| kling-v3 | varies | varies | varies |
| sora-2 | premium | premium | premium |
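For models with a fixed per-second rate, clip cost is simply rate times duration. The sketch below copies four fixed rates from the model catalog; `estimateCost` is a hypothetical helper, and models listed as "variable" or with a price range are deliberately left out.

```javascript
// Per-second rates in USD, copied from the model catalog (fixed-rate models only).
const RATE_PER_SECOND = { grok: 0.05, gen45: 0.12, seedance: 0.036, rayflash: 0.06 };

// Hypothetical helper: clip cost in USD, rounded to the nearest cent.
function estimateCost(key, seconds) {
  const rate = RATE_PER_SECOND[key];
  if (rate === undefined) throw new Error(`No fixed rate listed for: ${key}`);
  return Math.round(rate * seconds * 100) / 100;
}

console.log(estimateCost("grok", 10)); // 0.5
console.log(estimateCost("grok", 15)); // 0.75
console.log(estimateCost("seedance", 10)); // 0.36
```

Rounding to cents avoids floating-point noise such as 0.7500000000000001 in the printed totals.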
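```javascript
import { writeFileSync } from "fs";

// `output` is the result of a replicate.run() call from the examples above.
const response = await fetch(output);
if (!response.ok) throw new Error(`Download failed: HTTP ${response.status}`);
const buffer = await response.arrayBuffer();
writeFileSync("output.mp4", Buffer.from(buffer));
```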
```bash
# Check video properties with ffprobe
ffprobe -v quiet -print_format json -show_format -show_streams output.mp4
```
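The JSON that ffprobe prints can be reduced to the few fields usually worth checking after generation. This is a minimal sketch: `summarizeProbe` is a hypothetical helper, and the sample object uses made-up values shaped like ffprobe's `streams` and `format` output.

```javascript
// Summarizes parsed ffprobe JSON, e.g. JSON.parse(stdout) from the command above.
function summarizeProbe(probe) {
  const streams = probe.streams ?? [];
  const video = streams.find((s) => s.codec_type === "video");
  return {
    durationSeconds: Number(probe.format?.duration), // ffprobe reports a string
    width: video?.width ?? null,
    height: video?.height ?? null,
    hasAudio: streams.some((s) => s.codec_type === "audio"),
  };
}

// Trimmed sample with made-up values.
const sample = {
  streams: [{ codec_type: "video", width: 1280, height: 720 }],
  format: { duration: "6.000000" },
};
console.log(summarizeProbe(sample));
```

A `hasAudio` value of `false` on a clip from a model expected to produce audio is a quick signal that the merge step is still needed.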
| Error | Cause | Solution |
|---|---|---|
| Generation timeout | Complex prompt | Simplify, reduce duration |
| NSFW rejection | Content policy | Adjust prompt |
| Low quality output | Vague prompt | Add specific details |
| Audio desync | Wrong model | Use lip-sync model |
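The "reduce duration" remedy for timeouts can be automated for models that accept discrete durations. A small sketch, assuming the allowed durations are taken from the model catalog; `nextShorterDuration` is a hypothetical helper.

```javascript
// Returns the next shorter allowed duration, or null if `current` is already
// the shortest value the model accepts.
function nextShorterDuration(allowed, current) {
  const shorter = allowed.filter((d) => d < current);
  return shorter.length > 0 ? Math.max(...shorter) : null;
}

console.log(nextShorterDuration([4, 6, 8], 8)); // 6
console.log(nextShorterDuration([4, 6, 8], 4)); // null
```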