video-generation
AI video generation via Replicate — 17 models, editing, and production workflows
| name | video-generation |
| description | AI video generation via Replicate — 17 models, editing, and production workflows |
| tier | standard |
| applyTo | **/*video*,**/*animate*,**/*clip* |
| $schema | ../SKILL-SCHEMA.json |
Domain: AI Video Generation
Version: 2.0.0
Last Updated: 2026-04-15
Author: Alex (Master Alex)
Source: Patterns from AlexVideos CLI toolkit
Related: image-handling (images), text-to-speech (audio)
Generate AI videos using 17 cloud models on Replicate. Supports text-to-video, image-to-video, video editing, and talking-head workflows.
| Key | Model | Replicate ID | Duration | Audio | Cost |
|---|---|---|---|---|---|
| veo3fast | Veo 3.1 Fast | google/veo-3.1-fast | 4/6/8s | ✅ Auto | $0.10–0.15/sec |
| veo3 | Veo 3.1 | google/veo-3.1 | 4/6/8s | ✅ Auto | $0.20–0.40/sec |
| grok | Grok Video | xai/grok-imagine-video | 1–15s | ✅ Lip-sync | $0.05/sec |
| gen45 | Gen-4.5 Runway | runwayml/gen-4.5 | 5–10s | ❌ | $0.12/sec |
| kling | Kling v3 | kwaivgi/kling-v3-video | 3–15s | Optional | $0.17–0.22/sec |
| kling26 | Kling v2.6 | kwaivgi/kling-v2.6 | 5–10s | ✅ Auto | variable |
| kling3omni | Kling v3 Omni | kwaivgi/kling-v3-omni-video | 3–15s | ❌ | variable |
| sora | Sora-2 | openai/sora-2 | 4–12s | ✅ Auto | $0.10/sec |
| sora2pro | Sora-2 Pro | openai/sora-2-pro | 4/8/12s | ✅ Auto | $0.30–0.50/sec |
| seedance | Seedance Lite | bytedance/seedance-1-lite | 2–12s | ❌ | $0.036/sec |
| seedpro | Seedance Pro | bytedance/seedance-1-pro | 2–12s | ❌ | $0.15/sec |
| pixverse | PixVerse v5.6 | pixverse/pixverse-v5.6 | 5/10s | ✅ Auto | variable |
| hailuo | Hailuo-02 | minimax/hailuo-02 | 6/10s | ❌ | variable |
| hailuo23 | Hailuo-2.3 | minimax/hailuo-2.3 | 6/10s | ❌ | variable |
| ray2 | Ray 2 Luma | luma/ray-2-720p | 5/9s | ❌ | $0.18/sec |
| rayflash | Ray Flash 2 | luma/ray-flash-2-720p | 5/9s | ❌ | $0.06/sec |
| wan | WAN 2.5 Fast | wan-video/wan-2.5-t2v-fast | 5–10s | ✅ Auto | $0.068/sec |
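The Duration column mixes fixed steps (such as 4/6/8s) with continuous ranges (such as 3–15s), so a requested length may need adjusting before a call. The sketch below shows one way to snap a request to the nearest supported value. The `DURATIONS` map and `nearestDuration` helper are hypothetical names, not part of the toolkit, and only a few models from the table are included.

```javascript
// Supported durations copied from the model table (subset of models).
// Fixed-step models list every allowed value; range models give min and max.
const DURATIONS = {
  veo3fast: { steps: [4, 6, 8] },
  sora2pro: { steps: [4, 8, 12] },
  ray2: { steps: [5, 9] },
  grok: { min: 1, max: 15 },
  kling: { min: 3, max: 15 },
  seedance: { min: 2, max: 12 },
};

// Return the supported duration closest to the requested one.
function nearestDuration(modelKey, seconds) {
  const spec = DURATIONS[modelKey];
  if (!spec) throw new Error(`Unknown model key: ${modelKey}`);
  if (spec.steps) {
    // Smallest absolute distance wins; ties go to the shorter (cheaper) clip.
    return spec.steps.reduce((best, s) =>
      Math.abs(s - seconds) < Math.abs(best - seconds) ? s : best
    );
  }
  // Range models: round to whole seconds and clamp into [min, max].
  return Math.min(spec.max, Math.max(spec.min, Math.round(seconds)));
}

console.log(nearestDuration("veo3fast", 5)); // 4
console.log(nearestDuration("kling", 20)); // 15
```

Breaking ties toward the shorter clip is a cost-driven choice; prefer the longer step if the scene needs the extra time.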
| Model | prompt | duration | image | aspect | resolution | negative | audio |
|---|---|---|---|---|---|---|---|
| veo3fast | ✅ | 4/6/8 | ✅ | ✅ | ✅ | ✅ | ✅ auto |
| veo3 | ✅ | 4/6/8 | ✅ | ✅ | ✅ | ✅ | ✅ auto |
| grok | ✅ | 1–15 | ✅ | ✅ | ✅ | ❌ | ✅ auto |
| gen45 | ✅ | 5–10 | ✅ | ✅ | ❌ | ❌ | ❌ |
| kling | ✅ | 3–15 | ✅ | ✅ | ✅ (mode) | ✅ | optional |
| kling26 | ✅ | 5–10 | ✅ | ✅ | ❌ | ✅ | ✅ auto |
| kling3omni | ✅ | 3–15 | ✅ | ✅ | ✅ (mode) | ❌ | ❌ |
| sora | ✅ | 4–12 | ✅ | ✅ | ❌ | ❌ | ✅ auto |
| sora2pro | ✅ | 4/8/12 | ✅ | ✅ | ✅ | ❌ | ✅ auto |
| seedance | ✅ | 2–12 | ✅ | ✅ | ✅ | ❌ | ❌ |
| seedpro | ✅ | 2–12 | ✅ | ✅ | ✅ | ❌ | ❌ |
| pixverse | ✅ | 5/10 | ✅ | ✅ | ✅ (quality) | ✅ | ✅ auto |
| hailuo | ✅ | 6/10 | ✅ | ❌ | ✅ | ❌ | ❌ |
| hailuo23 | ✅ | 6/10 | ✅ | ❌ | ✅ | ❌ | ❌ |
| ray2 | ✅ | 5/9 | ✅ | ✅ | ❌ | ❌ | ❌ |
| rayflash | ✅ | 5/9 | ✅ | ✅ | ❌ | ❌ | ❌ |
| wan | ✅ | 5–10 | ❌ | ✅ | ❌ | ✅ | ✅ auto |
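Because parameter support differs per model, it can help to drop unsupported options before building a request rather than letting the call fail. The following is a minimal sketch under stated assumptions: `SUPPORTED` and `filterInput` are hypothetical names, the option names mirror the column headers above rather than each model's real input fields (which may be spelled differently, for example for aspect ratio or negative prompt), and only four models are listed.

```javascript
// Parameter support copied from the table above (subset of models).
const SUPPORTED = {
  veo3fast: ["prompt", "duration", "image", "aspect", "resolution", "negative"],
  gen45: ["prompt", "duration", "image", "aspect"],
  hailuo: ["prompt", "duration", "image", "resolution"],
  wan: ["prompt", "duration", "aspect", "negative"],
};

// Keep only the options the model accepts and report the ones that were dropped.
function filterInput(modelKey, options) {
  const allowed = SUPPORTED[modelKey];
  if (!allowed) throw new Error(`Unknown model key: ${modelKey}`);
  const input = {};
  const dropped = [];
  for (const [name, value] of Object.entries(options)) {
    if (allowed.includes(name)) input[name] = value;
    else dropped.push(name);
  }
  return { input, dropped };
}

const result = filterInput("wan", {
  prompt: "Waves at dusk",
  image: "ref.png",
  negative: "blurry",
});
console.log(result.input); // { prompt: 'Waves at dusk', negative: 'blurry' }
console.log(result.dropped); // [ 'image' ]
```

Returning the dropped names, instead of silently discarding them, makes it easy to warn the user that a reference image or negative prompt was ignored.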
Notes:
| Need | Model | Why |
|---|---|---|
| Default / Quick preview | veo3fast | Best balance: speed, quality, cost |
| Talking head / lip-sync | grok, kling26 | Grok: best lip-sync; Kling26: best motion |
| High quality cinematic | sora2pro, veo3 | Premium quality, synced audio |
| Image animation | kling, gen45 | Strong image-to-video |
| Budget production | rayflash, seedance | Cheapest per-second |
| Longest duration (15s) | grok, kling | Only models supporting 15s |
| Real-world physics | hailuo, hailuo23 | Physics simulation, VFX |
Proven pipeline for generating a person speaking to camera with synchronized speech:
# Step 1 — Age-progress reference photo (if needed)
node scripts/generate-image.js "A 26-year-old professional in studio" --model nanapro --image ref.png
# Step 2 — Generate video with Kling v2.6 (best talking-head i2v)
node scripts/generate-video.js "Person looking directly into camera, speaking confidently" --model kling26 --image aged.jpg --duration 10
# Step 3 — Generate speech
node scripts/generate-voice.js "[script]" --model mm28turbo
# Step 4 — Merge video + speech
node scripts/generate-edit-video.js --model avmerge --video clip.mp4 --audio speech.mp3
Model notes:
- kling26: best overall motion quality for talking-head i2v
- grok: best lip sync, but no native audio (requires avmerge)
- veo3fast / sora: may flag real-person reference photos

Prompt pattern:
"A sharp [age]-year-old [description] looking directly into camera, speaking to the audience about [topic]. Calm, confident, broadcast-quality delivery. Professional studio lighting."
| Key | Model | Purpose | Cost |
|---|---|---|---|
| modify | Luma Modify | AI video style transfer | variable |
| reframe | Luma Reframe | AI crop to new aspect ratio | $0.06/sec |
| trim | Trim Video | Extract segment by time | <$0.001 |
| merge | Video Merge | Concatenate videos | variable |
| avmerge | ffmpeg-static | Combine audio + video | free (local) |
| extract | Extract Audio | Strip audio from video | variable |
| frames | Frame Extractor | Export frames as images | <$0.001 |
| upscale | Real-ESRGAN | AI upscale to 4K | ~$0.46 |
| caption | AutoCaption | Auto subtitles | ~$0.07 |
| utils | Video Utils | Convert, misc ops | <$0.002 |
import Replicate from "replicate";
const replicate = new Replicate();
const output = await replicate.run("google/veo-3.1-fast", {
input: {
prompt: "A person walking through a futuristic city at sunset, cinematic lighting",
duration: 6, // seconds; this model accepts 4, 6, or 8
},
});
console.log("Video URL:", output);
import { readFileSync } from "fs";
const output = await replicate.run("kwaivgi/kling-v3-video", {
input: {
prompt: "Gentle wind blowing through hair, subtle movement",
image: readFileSync("source-image.png"), // or URL
duration: 5,
},
});
const output = await replicate.run("xai/grok-imagine-video", {
input: {
prompt: "Person speaking to camera with natural expressions",
audio: readFileSync("narration.mp3"), // Voice audio
duration: 10,
},
});
| Element | Good Example | Avoid |
|---|---|---|
| Camera motion | "slow dolly forward", "tracking shot" | "camera moves" |
| Lighting | "golden hour", "dramatic side lighting" | "good lighting" |
| Style | "cinematic 4K", "documentary style" | "nice video" |
| Subject | "a person with short brown hair" | "someone" |
| Action | "walking slowly through", "gesturing while talking" | "doing something" |
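The elements in this table can be assembled programmatically so that no prompt goes out without a concrete subject and action. This is only a sketch: `buildPrompt` is a hypothetical helper, and the comma-separated ordering is one reasonable convention rather than anything the models require.

```javascript
// Compose a prompt from the elements in the table above.
// Subject and action are required; camera, lighting, and style are optional.
function buildPrompt({ subject, action, camera, lighting, style }) {
  if (!subject || !action) {
    throw new Error("subject and action are required");
  }
  const core = `${subject} ${action}`;
  // Drop any optional element that was not supplied.
  const extras = [camera, lighting, style].filter(Boolean);
  return [core, ...extras].join(", ");
}

console.log(
  buildPrompt({
    subject: "A person with short brown hair",
    action: "walking slowly through a market",
    camera: "slow dolly forward",
    lighting: "golden hour",
    style: "cinematic 4K",
  })
);
// A person with short brown hair walking slowly through a market, slow dolly forward, golden hour, cinematic 4K
```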
| Duration | Use Case | Model Recommendation |
|---|---|---|
| 2-4s | Loop, GIF replacement | seedance (from 2s), veo-3.1-fast (4s only) |
| 5-8s | Short clip, social media | veo-3.1-fast, kling-v3 |
| 10-15s | Story segment, talking head | grok-video (up to 15s), sora-2 (up to 12s) |
| Model | Audio Type | Notes |
|---|---|---|
| veo-3.1-fast | Auto-generated | Ambient sounds from scene |
| grok-video | Lip-sync | Syncs mouth to provided audio |
| sora-2 | Synced | High-quality scene audio |
For models without audio, combine with TTS:
// 1. Generate video (no audio)
const video = await replicate.run("kwaivgi/kling-v3-video", {
input: { prompt: "...", duration: 5 },
});
// 2. Generate narration via TTS
const audio = await replicate.run("minimax/speech-2.8-turbo", {
input: { text: "Narration text", voice: "Wise_Woman" },
});
// 3. Combine with ffmpeg
// ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac output.mp4
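The ffmpeg step can also be driven from Node. The sketch below assumes the video and audio have already been downloaded to local files and that an `ffmpeg` binary is on the `PATH`. `buildMergeArgs` and `mergeAudioVideo` are hypothetical helper names, and the `-y` flag (overwrite the output file without prompting) is an addition to the command shown above.

```javascript
// Build the argument list for the ffmpeg command shown above:
// copy the video stream as-is and encode the audio track as AAC.
function buildMergeArgs(videoPath, audioPath, outPath) {
  return ["-i", videoPath, "-i", audioPath, "-c:v", "copy", "-c:a", "aac", outPath];
}

// Run ffmpeg and resolve with the output path once it exits successfully.
async function mergeAudioVideo(videoPath, audioPath, outPath) {
  // Dynamic import works from both ES modules and CommonJS files.
  const { spawn } = await import("node:child_process");
  await new Promise((resolve, reject) => {
    const args = ["-y", ...buildMergeArgs(videoPath, audioPath, outPath)];
    const proc = spawn("ffmpeg", args, { stdio: "inherit" });
    proc.on("error", reject); // e.g. ffmpeg is not installed
    proc.on("close", (code) =>
      code === 0 ? resolve() : reject(new Error(`ffmpeg exited with code ${code}`))
    );
  });
  return outPath;
}

// Usage (requires ffmpeg and the two input files):
// await mergeAudioVideo("video.mp4", "audio.mp3", "output.mp4");
```

Passing the arguments as an array avoids shell quoting problems with file names that contain spaces.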
| Model | 5s clip | 10s clip | 15s clip |
|---|---|---|---|
| veo-3.1-fast | ~$0.50–0.75 | N/A (8s max) | N/A (8s max) |
| grok-video | ~$0.25 | ~$0.50 | ~$0.75 |
| minimax-video | ~$0.25 | ~$0.50 | N/A (6s max) |
| kling-v3 | ~$0.85–1.10 | ~$1.70–2.20 | ~$2.55–3.30 |
| sora-2 | ~$0.50 | ~$1.00 | N/A (12s max) |
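Estimates for other clip lengths can be derived from the per-second rates listed in the model table. The sketch below is illustrative only: `RATE_PER_SEC` and `estimateCost` are hypothetical names, the rates are copied from that table for a subset of models, and actual billing may differ.

```javascript
// Per-second rates in USD copied from the model table, as [low, high].
// Models priced as "variable" are left out on purpose.
const RATE_PER_SEC = {
  veo3fast: [0.10, 0.15],
  grok: [0.05, 0.05],
  sora: [0.10, 0.10],
  seedance: [0.036, 0.036],
  kling: [0.17, 0.22],
};

// Estimate the cost range of one clip, rounded to whole cents.
function estimateCost(modelKey, seconds) {
  const rate = RATE_PER_SEC[modelKey];
  if (!rate) throw new Error(`No fixed rate listed for: ${modelKey}`);
  const toCents = (usd) => Math.round(usd * 100) / 100;
  return { low: toCents(rate[0] * seconds), high: toCents(rate[1] * seconds) };
}

console.log(estimateCost("grok", 10)); // { low: 0.5, high: 0.5 }
console.log(estimateCost("veo3fast", 8)); // { low: 0.8, high: 1.2 }
```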
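import { writeFileSync } from "fs";
// `output` is the value returned by replicate.run() in the examples above
const response = await fetch(output);
if (!response.ok) throw new Error(`Download failed: HTTP ${response.status}`);
const buffer = await response.arrayBuffer();
writeFileSync("output.mp4", Buffer.from(buffer));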
# Check video properties with ffprobe
ffprobe -v quiet -print_format json -show_format -show_streams output.mp4
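The JSON that this ffprobe command prints can be reduced to the few fields that matter when checking a generated clip. A minimal sketch follows; `summarizeProbe` is a hypothetical helper and the sample object is made-up data shaped like ffprobe output, not a real measurement.

```javascript
// Summarize parsed ffprobe JSON (from -print_format json -show_format -show_streams).
function summarizeProbe(probe) {
  const streams = probe.streams ?? [];
  const video = streams.find((s) => s.codec_type === "video");
  return {
    // ffprobe reports the container duration as a string, so convert it.
    durationSec: Number(probe.format?.duration),
    width: video?.width ?? null,
    height: video?.height ?? null,
    hasAudio: streams.some((s) => s.codec_type === "audio"),
  };
}

// Illustrative sample only; real output contains many more fields.
const sample = {
  streams: [
    { codec_type: "video", width: 1280, height: 720 },
    { codec_type: "audio" },
  ],
  format: { duration: "6.000000" },
};

console.log(summarizeProbe(sample));
// { durationSec: 6, width: 1280, height: 720, hasAudio: true }
```

Checking `hasAudio` is a quick way to confirm that a model with automatic audio actually produced a sound track, or that a merge step succeeded.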
| Error | Cause | Solution |
|---|---|---|
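| Generation timeout | Prompt too complex or clip too long | Simplify the prompt or reduce the duration |
| NSFW rejection | Prompt or reference image violates the content policy | Reword the prompt or change the reference image |
| Low quality output | Vague prompt | Add specific camera, lighting, and subject details |
| Audio desync | Model does not sync to supplied audio | Use a lip-sync model such as grok |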
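The "reduce duration" remedy for timeouts can be automated by retrying with progressively shorter clips. The sketch below is an assumption-heavy illustration: `generateWithFallback` is a hypothetical helper, `generate` stands in for whatever function performs the actual model call, and `isTimeout` is a caller-supplied check because the exact shape of a timeout error is not specified here.

```javascript
// Try each duration in order and return the first successful result.
// Only errors that `isTimeout` recognizes trigger a retry; anything else
// (for example a content-policy rejection) is rethrown immediately.
async function generateWithFallback(generate, prompt, durations, isTimeout) {
  let lastError;
  for (const duration of durations) {
    try {
      const output = await generate(prompt, duration);
      return { duration, output };
    } catch (err) {
      if (!isTimeout(err)) throw err;
      lastError = err;
    }
  }
  throw lastError ?? new Error("No durations were provided");
}
```

A call such as `generateWithFallback(run, prompt, [10, 6, 4], (e) => /timeout/i.test(e.message))` would try a 10-second clip first and fall back to 6 and then 4 seconds, where `run` and the message test are placeholders to adapt.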