video-understanding
Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video.
| name | video-understanding |
| description | Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video. |
| allowed-tools | ["Bash","Read","Write"] |
Download videos and transcribe their content for analysis.
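The commands in this skill call three command-line tools: yt-dlp, ffmpeg, and whisper. A minimal pre-flight sketch, assuming a hypothetical helper named `missing_tools` (it is not part of any of these tools), reports which of them are not on the PATH before any work starts:

```shell
# Print the name of every listed tool that cannot be found on PATH.
# missing_tools is a hypothetical helper introduced for this sketch.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Warn (without aborting) if any tool this skill relies on is absent.
missing=$(missing_tools yt-dlp ffmpeg whisper)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
fi
```

The check only warns, so a caller can decide whether to install the missing tool or stop.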
- yt-dlp (pip install yt-dlp or brew install yt-dlp)
- ffmpeg (brew install ffmpeg or apt install ffmpeg)
- Whisper (pip install openai-whisper)

# Download video with yt-dlp
yt-dlp -o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"

# For best quality
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
  -o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"

# For audio only (faster)
yt-dlp -x --audio-format mp3 \
  -o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
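Because the `-o` template names each file after the video title, the download is rarely called `video.mp4`. A minimal sketch, assuming a hypothetical helper `audio_path_for` and an example filename that is not a real file, of deriving the matching `.mp3` path for the audio-extraction step:

```shell
# Hypothetical helper: map a downloaded video path to an .mp3 path beside it
# by stripping the last extension and appending .mp3.
audio_path_for() {
  printf '%s\n' "${1%.*}.mp3"
}

video="assets/downloads/My Talk.mp4"   # example filename, assumed for illustration
audio=$(audio_path_for "$video")
echo "$audio"   # assets/downloads/My Talk.mp3

# With ffmpeg installed and the video present, extraction would then be:
# ffmpeg -i "$video" -vn -acodec mp3 -ab 128k "$audio"
```

Quoting `"$video"` and `"$audio"` matters here, since video titles often contain spaces.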
# Extract audio from a downloaded video (not needed after an audio-only download)
ffmpeg -i "assets/downloads/video.mp4" \
  -vn -acodec mp3 -ab 128k \
  "assets/downloads/audio.mp3"
# Basic transcription
whisper "assets/downloads/audio.mp3" \
  --model base \
  --output_format txt \
  --output_dir output/

# Higher quality (slower)
whisper "assets/downloads/audio.mp3" \
  --model medium \
  --output_format all \
  --output_dir output/

# With timestamps
whisper "assets/downloads/audio.mp3" \
  --model base \
  --output_format srt \
  --output_dir output/
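To read the result afterwards, it helps to know where the transcript landed. The sketch below assumes Whisper names each output file after the audio file's base name inside `--output_dir`; `transcript_path` is a hypothetical helper, so verify the naming against your installed version:

```shell
# Hypothetical helper: predict the transcript path for an audio file,
# assuming the output is <output_dir>/<audio name without extension>.<format>.
transcript_path() {
  audio=$1
  out_dir=$2
  format=$3
  name=$(basename "$audio")
  printf '%s/%s.%s\n' "${out_dir%/}" "${name%.*}" "$format"
}

transcript_path "assets/downloads/audio.mp3" "output/" srt   # output/audio.srt
```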
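1. Read the generated transcript file from output/.
2. Summarize the key points.
3. Extract quotes and their timestamps.
4. Identify the speakers if there is more than one (Whisper does not label speakers, so infer them from context).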
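For the quote-and-timestamp step, an SRT transcript can be searched directly. A minimal sketch, assuming a hypothetical helper `srt_find` and the standard SRT layout (index line, time range line, then one or more text lines, with a blank line between cues):

```shell
# Hypothetical helper: print "<time range>  <text>" for each SRT cue whose
# text contains the keyword (case-insensitive).
# Usage: srt_find KEYWORD FILE
srt_find() {
  awk -v kw="$1" '
    BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one cue per record, one line per field
    {
      text = $3
      for (i = 4; i <= NF; i++) text = text " " $i   # join multi-line cue text
      if (index(tolower(text), tolower(kw)) > 0) print $2 "  " text
    }
  ' "$2"
}

# Example call (the path is illustrative):
# srt_find "pricing" output/audio.srt
```

The keyword is matched as a plain substring, not a regular expression, which keeps punctuation in quotes from being misread.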
| Model | Parameters | Speed | Quality |
|---|---|---|---|
| tiny | 39M | Fastest | Lower |
| base | 74M | Fast | Good |
| small | 244M | Medium | Better |
| medium | 769M | Slow | High |
| large | 1550M | Slowest | Highest |
- txt - Plain text transcript
- srt - SubRip subtitles with timestamps
- vtt - WebVTT subtitles
- json - Detailed JSON with segment timing (word-level timing requires --word_timestamps True)
- all - All formats

- Use the base model for speed, medium for accuracy
- Use --language en to force English and skip language auto-detection
- Use --task translate to translate to English

- assets/downloads/ for downloaded files
- output/transcripts/ for transcripts
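The model tip can be turned into a small selector. This is a sketch only: `pick_model` and its priority names (`speed`, `accuracy`) are assumptions made for illustration, and the final line prints the command rather than running it:

```shell
# Hypothetical helper: choose a Whisper model from a stated priority,
# following the tip "base for speed, medium for accuracy".
pick_model() {
  case "$1" in
    speed)    echo base ;;
    accuracy) echo medium ;;
    *)        echo "unknown priority: $1" >&2; return 1 ;;
  esac
}

model=$(pick_model accuracy)
# Print the command that would be run; drop the echo to execute it.
echo "whisper assets/downloads/audio.mp3 --model $model --language en --output_dir output/"
```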