video-podcast-maker
Use when the user gives a topic and wants an automated video podcast created, or asks to learn visual design patterns from a reference video/image. Produces 4K video via research → script → TTS → Remotion → MP4 + BGM.
| Field | Value |
|---|---|
| name | video-podcast-maker |
| description | Use when the user gives a topic and wants an automated video podcast created, or asks to learn visual design patterns from a reference video/image. Produces 4K video via research → script → TTS → Remotion → MP4 + BGM. |
| argument-hint | [topic] |
| effort | high |
| author | Agents365-ai |
| category | Content Creation |
| version | 2.0.0 |
| created | "2025-01-27T00:00:00.000Z" |
| updated | "2026-04-03T00:00:00.000Z" |
| bilibili | https://space.bilibili.com/441831884 |
| github | https://github.com/Agents365-ai/video-podcast-maker |
| dependencies | ["remotion-best-practices"] |
| metadata | {"openclaw":{"requires":{"bins":["python3","ffmpeg","node","npx"]},"primaryEnv":"AZURE_SPEECH_KEY","emoji":"🎬","homepage":"https://github.com/Agents365-ai/video-podcast-maker","os":["macos","linux"],"install":[{"kind":"brew","formula":"ffmpeg","bins":["ffmpeg"]},{"kind":"uv","package":"edge-tts","bins":["edge-tts"]}]}} |
REQUIRED: Load Remotion Best Practices First
This skill depends on remotion-best-practices. You MUST invoke it before proceeding:
Invoke the skill/tool named: remotion-best-practices
Open your coding agent and say: "Make a video podcast about $ARGUMENTS"
Or invoke directly: /video-podcast-maker AI Agent tutorial
Extract visual design patterns from reference videos or images and apply them to new video compositions. Skip this section unless the user provides a reference video/image or asks to save/list/delete style profiles.
→ See references/design-learning.md for commands, reference-library management, style-profile management, and integration with Pre-workflow / Step 9.
Agent behavior: Check for updates at most once per day (throttled by timestamp file).
Before any shell command that reads files from this skill, resolve SKILL_DIR to the directory containing SKILL.md.
If your agent exposes a built-in skill directory variable such as ${CLAUDE_SKILL_DIR}, you may map it to SKILL_DIR.
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
STAMP="${SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ ! -d "${SKILL_DIR}/.git" ]; then
  echo "MANUAL_INSTALL"
elif [ $((NOW - LAST)) -gt 86400 ]; then
  timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true
  LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null)
  REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null)
  echo "$NOW" > "$STAMP"
  if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
    echo "UPDATE_AVAILABLE"
  else
    echo "UP_TO_DATE"
  fi
else
  echo "SKIPPED_RECENT_CHECK"
fi
- UPDATE_AVAILABLE: ask the user whether to update. Yes → git -C "${SKILL_DIR}" pull. No → continue.
- MANUAL_INSTALL (no .git directory; skill was installed via tarball/zip/cp): Continue silently. Auto-update is disabled; the user must reinstall manually to update.

!python3 "${SKILL_DIR}/scripts/check_prereqs.py"
If MISSING is reported above, see README.md for full setup instructions (install commands, API key setup, Remotion project init). The check is backend-aware: the backend is resolved as TTS_BACKEND env var → user_prefs.json (global.tts.backend) → edge default, and only the env vars required by that backend are validated.
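The backend resolution order described above can be sketched as follows. This is a minimal illustration, not the code in scripts/check_prereqs.py: the function name `resolve_tts_backend` is hypothetical, and reading `global.tts.backend` as nested JSON objects is an assumption about the layout of user_prefs.json.

```python
import json
import os
from pathlib import Path


def resolve_tts_backend(prefs_path, env=None):
    """Return the TTS backend: TTS_BACKEND env var -> user_prefs.json -> "edge".

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env

    # 1. An explicit TTS_BACKEND environment variable wins.
    backend = env.get("TTS_BACKEND", "").strip()
    if backend:
        return backend

    # 2. Otherwise look up global.tts.backend in user_prefs.json
    #    (assumed layout: {"global": {"tts": {"backend": "..."}}}).
    try:
        prefs = json.loads(Path(prefs_path).read_text(encoding="utf-8"))
        backend = prefs["global"]["tts"]["backend"]
        if isinstance(backend, str) and backend.strip():
            return backend.strip()
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing file, invalid JSON, or missing key: fall through

    # 3. Fall back to the default backend.
    return "edge"
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.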
Automated pipeline for professional Bilibili horizontal knowledge videos from a topic.
Target: Bilibili horizontal video (16:9)
- Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
- Style: Clean white (default)
Tech stack: Coding agent + TTS backend + Remotion + FFmpeg
| Parameter | Horizontal (16:9) | Vertical (9:16) |
|---|---|---|
| Resolution | 3840×2160 (4K) | 2160×3840 (4K) |
| Frame rate | 30 fps | 30 fps |
| Encoding | H.264, 16Mbps | H.264, 16Mbps |
| Audio | AAC, 192kbps | AAC, 192kbps |
| Duration | 1-15 min | 60-90s (highlight) |
Agent behavior: Detect user intent at workflow start:
Full pipeline with sensible defaults. Mandatory stop at Step 9:
| Step | Decision | Auto Default |
|---|---|---|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 9 | Preview method | Remotion Studio (mandatory) |
| 12 | Subtitles | Skip |
| 14 | Cleanup | Auto-clean temp files |
Users can override any default in their initial request:
Prompts at each decision point. Activated by:
Hard constraints for video production. Visual design remains the agent's creative freedom within these rules:
| Rule | Requirement |
|---|---|
| Single Project | All videos under videos/{name}/ in user's Remotion project. NEVER create a new project per video. |
| 4K Output | 3840×2160, use scale(2) wrapper over 1920×1080 design space |
| Content Width | ≥85% of screen width |
| Bottom Safe Zone | Bottom 100px reserved for subtitles |
| Audio Sync | All animations driven by timing.json timestamps |
| Thumbnail | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| Font | PingFang SC / Noto Sans SC for Chinese text |
| Studio Before Render | MUST launch remotion studio for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |
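As a worked illustration of the geometry these rules imply, the sketch below derives the output size, the minimum content width, and the usable height. It assumes the 100px bottom safe zone is measured in the 1920×1080 design space (the table does not say which space it refers to); the function and constant names are hypothetical.

```python
DESIGN_WIDTH, DESIGN_HEIGHT = 1920, 1080  # design space from the rules table
SCALE = 2                                 # scale(2) wrapper -> 3840x2160
MIN_CONTENT_PERCENT = 85                  # content width >= 85% of screen width
BOTTOM_SAFE_ZONE = 100                    # assumed to be design-space pixels


def layout_limits():
    """Return layout limits in design-space and 4K output pixels."""
    # Integer ceiling division avoids floating-point rounding surprises.
    min_content_width = (DESIGN_WIDTH * MIN_CONTENT_PERCENT + 99) // 100
    usable_height = DESIGN_HEIGHT - BOTTOM_SAFE_ZONE
    return {
        "output_size": (DESIGN_WIDTH * SCALE, DESIGN_HEIGHT * SCALE),
        "min_content_width": min_content_width,
        "min_content_width_4k": min_content_width * SCALE,
        "usable_height": usable_height,
    }
```

Under these assumptions, content should span at least 1632 design pixels (3264 output pixels) and stay within the top 980 design pixels.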
Load these files on demand — do NOT load all at once:
timing.json format.
project-root/ # Remotion project root
├── src/remotion/ # Remotion source
│ ├── compositions/ # Video composition definitions
│ ├── Root.tsx # Remotion entry
│ └── index.ts # Exports
│
├── public/ # Remotion default (unused — use --public-dir videos/{name}/)
│
├── videos/{video-name}/ # Video project assets
│ ├── topic_definition.md # Step 1
│ ├── topic_research.md # Step 2
│ ├── podcast.txt # Step 4: narration script
│ ├── podcast_audio.wav # Step 8: TTS audio
│ ├── podcast_audio.srt # Step 8: subtitles
│ ├── timing.json # Step 8: timeline
│ ├── thumbnail_*.png # Step 7
│ ├── output.mp4 # Step 10
│ ├── video_with_bgm.mp4 # Step 11
│ ├── final_video.mp4 # Step 12: final output
│ └── bgm.mp3 # Background music
│
└── remotion.config.ts
Important: Always use --public-dir and full output path for Remotion render:
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/
Video name {video-name}: lowercase English, hyphen-separated (e.g., reference-manager-comparison)
Section name {section}: lowercase English, underscore-separated, matches [SECTION:xxx]
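The two naming conventions above could be checked with a couple of regular expressions. This is a sketch under assumptions: the helper names are hypothetical, and allowing digits alongside lowercase letters is a guess, since the conventions only say "lowercase English".

```python
import re

# Lowercase words separated by hyphens, e.g. reference-manager-comparison.
VIDEO_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# Lowercase words separated by underscores, as used in [SECTION:xxx].
SECTION_NAME_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")
# Captures the name inside each [SECTION:xxx] marker.
SECTION_MARKER_RE = re.compile(r"\[SECTION:([^\]]+)\]")


def is_valid_video_name(name):
    """True if `name` follows the {video-name} convention."""
    return VIDEO_NAME_RE.fullmatch(name) is not None


def invalid_section_names(script_text):
    """Return section names from [SECTION:xxx] markers that break the convention."""
    names = SECTION_MARKER_RE.findall(script_text)
    return [n for n in names if SECTION_NAME_RE.fullmatch(n) is None]
```

Running `invalid_section_names` over podcast.txt before TTS would surface mismatched markers early, before they affect anything downstream.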
Thumbnail naming (16:9 AND 4:3 both required):
| Type | 16:9 | 4:3 |
|---|---|---|
| Remotion | thumbnail_remotion_16x9.png | thumbnail_remotion_4x3.png |
| AI | thumbnail_ai_16x9.png | thumbnail_ai_4x3.png |
Use --public-dir videos/{name}/ for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory — no copying to public/ needed. This enables parallel renders of different videos.
# All render/studio/still commands use --public-dir
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail_remotion_16x9.png --public-dir videos/{name}/
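When a script launches renders for several videos, the render command above can be assembled programmatically. A minimal sketch, using only the flags shown in this document; `render_command` and its parameters are hypothetical names, and the composition ID is whatever the project registers in Root.tsx.

```python
import shlex

ENTRY_POINT = "src/remotion/index.ts"


def render_command(video_name, composition_id, bitrate="16M"):
    """Build the argv list for rendering one video into its own directory."""
    video_dir = f"videos/{video_name}"
    return [
        "npx", "remotion", "render", ENTRY_POINT, composition_id,
        f"{video_dir}/output.mp4",        # full output path
        "--public-dir", f"{video_dir}/",  # per-video assets, no copying to public/
        "--video-bitrate", bitrate,
    ]


# Print the command for inspection instead of running it.
print(shlex.join(render_command("reference-manager-comparison", "CompositionId")))
```

Building an argv list rather than a shell string avoids quoting problems, and the list can be handed to `subprocess.run` once the user has confirmed the render.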
At Step 1 start, use your agent's task tracker (Claude Code TaskCreate / Codex todo list / equivalent) to create one task per step. Mark in_progress on start, completed on finish. Files in videos/{name}/ (e.g. podcast.txt, timing.json, output.mp4) act as the durable record of what completed — if a session is interrupted, inspect the directory to determine where to resume.
1. Define topic direction → topic_definition.md
2. Research topic → topic_research.md
3. Design video sections (5-7 chapters)
4. Write narration script → podcast.txt
4.5. Pronunciation pre-flight (zh-CN only) → videos/{name}/phonemes.json
5. Collect media assets → media_manifest.json
6. Generate publish info (Part 1) → publish_info.md
7. Generate thumbnails (16:9 + 4:3) → thumbnail_*.png
8. Generate TTS audio → podcast_audio.wav, timing.json
9. Create Remotion composition + Studio preview (mandatory stop)
10. Render 4K video (only on user request) → output.mp4
11. Mix background music → video_with_bgm.mp4
12. Add subtitles (optional) → final_video.mp4
13. Complete publish info (Part 2) → chapter timestamps
14. Verify output & cleanup
15. Generate vertical shorts (optional) → shorts/
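Because the files in videos/{name}/ act as the durable record, a resume point can be estimated by checking which step artifacts exist. The sketch below is one possible approach, not part of the skill: the mapping is read off the step list above, steps without a file artifact are omitted, and `last_completed_step` is a hypothetical helper.

```python
from pathlib import Path

# Step number -> files that step leaves in videos/{name}/ (from the step list).
STEP_ARTIFACTS = {
    1: ["topic_definition.md"],
    2: ["topic_research.md"],
    4: ["podcast.txt"],
    5: ["media_manifest.json"],
    6: ["publish_info.md"],
    8: ["podcast_audio.wav", "timing.json"],
    10: ["output.mp4"],
    11: ["video_with_bgm.mp4"],
    12: ["final_video.mp4"],
}


def last_completed_step(video_dir):
    """Return the highest step whose artifacts all exist, or 0 if none do."""
    video_dir = Path(video_dir)
    done = [
        step
        for step, files in STEP_ARTIFACTS.items()
        if all((video_dir / name).is_file() for name in files)
    ]
    return max(done, default=0)
```

A file's presence does not prove the step finished cleanly, and optional steps may be skipped, so the result is a starting point for inspection rather than a guarantee.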
After Step 8 (TTS):
- podcast_audio.wav exists and plays correctly
- timing.json has all sections with correct timestamps
- podcast_audio.srt encoding is UTF-8

After Step 10 (Render):
- output.mp4 resolution is 3840x2160

See CLAUDE.md for the full command reference (TTS, Remotion, FFmpeg, shorts generation).
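Two of the checks above, the SRT encoding and the render resolution, are easy to automate. The following is a sketch under assumptions: the function names are hypothetical, and the resolution check relies on `ffprobe` (shipped with FFmpeg) being on PATH.

```python
import subprocess
from pathlib import Path


def is_utf8(path):
    """True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def parse_resolution(ffprobe_output):
    """Parse ffprobe's 'WIDTHxHEIGHT' line into an (int, int) tuple."""
    first_line = ffprobe_output.strip().splitlines()[0]
    width, height = first_line.split("x")[:2]
    return int(width), int(height)


def video_resolution(path):
    """Ask ffprobe for the width and height of the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height",
         "-of", "csv=s=x:p=0", str(path)],
        capture_output=True, text=True, check=True,
    )
    return parse_resolution(result.stdout)
```

For example, `video_resolution("videos/{name}/output.mp4") == (3840, 2160)` would confirm the Step 10 check, and `is_utf8` covers the subtitle file from Step 8.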
Skill learns and applies preferences automatically. See references/troubleshooting.md for commands and learning details.
| File | Purpose |
|---|---|
| user_prefs.json | Learned preferences (auto-created from template) |
| user_prefs.template.json | Default values |
| prefs_schema.json | JSON schema definition |
Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions)
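The precedence rule above can be read as a layered merge in which later layers override earlier ones. A minimal sketch follows; whether the skill merges nested objects recursively or replaces them wholesale is not stated, so the recursive merge here is an assumption, and both function names are hypothetical. The `global` and `topic_patterns` keys are the ones named in this document.

```python
def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested dicts key by key (assumed behaviour).
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_preferences(root_defaults, prefs, topic_type, current_instructions):
    """Apply the layers from lowest to highest priority."""
    layers = [
        root_defaults,                                        # Root.tsx defaults
        prefs.get("global", {}),                              # global
        prefs.get("topic_patterns", {}).get(topic_type, {}),  # topic_patterns[type]
        current_instructions,                                 # current instructions
    ]
    final = {}
    for layer in layers:
        final = deep_merge(final, layer)
    return final
```

With this ordering, a setting given in the user's current request always wins, and a topic-specific pattern overrides the global preference only for videos of that type.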
| Command | Effect |
|---|---|
| "show preferences" | Show current preferences |
| "reset preferences" | Reset to defaults |
| "save as X default" | Save to topic_patterns |
Full reference: Read references/troubleshooting.md on errors, preference questions, or BGM options.