video-understanding
Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video.
| Field | Value |
|---|---|
| name | video-understanding |
| description | Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video. |
| allowed-tools | ["Bash","Read","Write"] |
Download videos and transcribe their content for analysis.
Prerequisites:

- yt-dlp (pip install yt-dlp or brew install yt-dlp)
- ffmpeg (brew install ffmpeg or apt install ffmpeg)
- whisper (pip install openai-whisper)

# Download video with yt-dlp
yt-dlp -o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
# For best quality
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
-o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
# For audio only (faster)
yt-dlp -x --audio-format mp3 \
-o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
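# Extract the audio track from an already downloaded video
# (replace video.mp4 with the actual downloaded file name)
ffmpeg -i "assets/downloads/video.mp4" \
-vn -acodec mp3 -ab 128k \
"assets/downloads/audio.mp3"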
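The download commands above name files after the video title, so the file is rarely called video.mp4. As a minimal sketch, assuming all downloads are .mp4 files in one directory, the same ffmpeg command can be looped over whatever was actually downloaded. `audio_path_for` and `extract_all_audio` are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash

# audio_path_for VIDEO: print the .mp3 path that sits next to VIDEO.
# (Hypothetical helper; it only swaps the file extension.)
audio_path_for() {
  printf '%s.mp3\n' "${1%.*}"
}

# extract_all_audio [DIR]: extract audio from every .mp4 in DIR
# that does not have a matching .mp3 yet.
extract_all_audio() {
  local dir="${1:-assets/downloads}" video audio
  for video in "$dir"/*.mp4; do
    [ -e "$video" ] || continue    # the glob matched nothing
    audio="$(audio_path_for "$video")"
    [ -e "$audio" ] && continue    # audio was already extracted
    # -nostdin keeps ffmpeg from reading the terminal inside the loop
    ffmpeg -nostdin -i "$video" -vn -acodec mp3 -ab 128k "$audio"
  done
}
```

Calling `extract_all_audio` with no argument processes assets/downloads/. Files that already have an .mp3 next to them are skipped, so the function is safe to re-run.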
# Basic transcription
whisper "assets/downloads/audio.mp3" \
--model base \
--output_format txt \
--output_dir output/
# Higher quality (slower)
whisper "assets/downloads/audio.mp3" \
--model medium \
--output_format all \
--output_dir output/
# With timestamps
whisper "assets/downloads/audio.mp3" \
--model base \
--output_format srt \
--output_dir output/
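The download and transcription steps above can be chained into one function. The following is a sketch under stated assumptions, not a definitive implementation. `missing_tools` and `transcribe_url` are hypothetical helper names. The `%(id)s` output template is a swapped-in choice, replacing the `%(title)s` template used above, so that the audio path is known before Whisper runs.

```shell
#!/usr/bin/env bash

# missing_tools TOOL...: print each TOOL that is not on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# transcribe_url URL [MODEL]: download the audio track and transcribe it.
# Both function names are hypothetical helpers, not part of any tool.
transcribe_url() {
  local url="$1" model="${2:-base}" missing id audio
  missing="$(missing_tools yt-dlp ffmpeg whisper)"
  if [ -n "$missing" ]; then
    printf 'missing tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
    return 1
  fi
  mkdir -p assets/downloads output || return 1

  # Ask yt-dlp for the video id first so the file name is known in advance.
  id="$(yt-dlp --no-playlist --print id "$url")" || return 1
  audio="assets/downloads/$id.mp3"

  # Audio-only download, named after the id instead of the title.
  yt-dlp --no-playlist -x --audio-format mp3 \
    -o "assets/downloads/%(id)s.%(ext)s" "$url" || return 1

  # Whisper names the transcript after the audio file's base name.
  whisper "$audio" --model "$model" --output_format txt --output_dir output/ || return 1
  printf 'transcript: output/%s.txt\n' "$id"
}
```

A call such as `transcribe_url "<VIDEO_URL>" medium` would then leave the transcript in output/. The tool check runs first, so a missing prerequisite fails fast instead of halfway through a download.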

Analyze the transcript:

1. Read the generated transcript file from output/
2. Summarize key points
3. Extract quotes and timestamps
4. Identify speakers if multiple

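Pulling quotes with their timestamps is easiest from the srt output. As a minimal sketch, assuming the usual SRT layout (index line, timing line, one or more text lines, then a blank line), awk can treat each cue as one record. `srt_quotes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# srt_quotes KEYWORD FILE
# Print every subtitle cue whose text contains KEYWORD (case-insensitive)
# as "[start --> end] text". Hypothetical helper for illustration.
srt_quotes() {
  local keyword="$1" file="$2"
  awk -v kw="$keyword" '
    BEGIN { RS = ""; FS = "\n"; kw = tolower(kw) }  # one cue per record
    {
      # $1 = cue index, $2 = "start --> end", $3 onward = text lines
      text = $3
      for (i = 4; i <= NF; i++) text = text " " $i
      if (index(tolower(text), kw) > 0) {
        printf "[%s] %s\n", $2, text
      }
    }
  ' "$file"
}
```

For example, `srt_quotes "whisper" output/audio.srt` lists each cue that mentions the keyword together with its time range, which can be quoted directly in a summary.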
| Model | Parameters | Speed | Quality |
|---|---|---|---|
| tiny | 39M | Fastest | Lower |
| base | 74M | Fast | Good |
| small | 244M | Medium | Better |
| medium | 769M | Slow | High |
| large | 1550M | Slowest | Highest |
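
Output formats:

- txt - Plain text transcript
- srt - SubRip subtitles with timestamps
- vtt - WebVTT subtitles
- json - Detailed JSON with segment-level timing (word-level timing requires --word_timestamps True)
- all - All formats

Tips:

- Use the base model for speed, medium for accuracy
- Use --language en to skip language detection and treat the audio as English
- Use --task translate to translate to English
- Use assets/downloads/ for downloaded files
- Save transcripts to output/transcripts/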
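
The model and language tips can be folded into a small wrapper. This is only a sketch: `model_for` and `transcribe_file` are hypothetical helper names, and the speed/accuracy mapping simply restates the base-versus-medium guidance from this document.

```shell
#!/usr/bin/env bash

# model_for PRIORITY: map "speed" or "accuracy" to a Whisper model name.
model_for() {
  case "${1:-speed}" in
    speed)    echo base ;;
    accuracy) echo medium ;;
    *)        echo "unknown priority: $1" >&2; return 1 ;;
  esac
}

# transcribe_file AUDIO [PRIORITY] [LANGUAGE]
# Writes an .srt transcript to output/transcripts/.
# LANGUAGE is optional; when given, Whisper skips language detection.
transcribe_file() {
  local audio="$1" priority="${2:-speed}" language="${3:-}" model
  model="$(model_for "$priority")" || return 1
  mkdir -p output/transcripts || return 1

  # Build the argument list in the positional parameters.
  set -- --model "$model" --output_format srt --output_dir output/transcripts/
  if [ -n "$language" ]; then
    set -- "$@" --language "$language"
  fi
  whisper "$audio" "$@"
}
```

For instance, `transcribe_file assets/downloads/audio.mp3 accuracy en` runs the medium model with the language fixed to English. Translation to English is left out of the sketch and would still be requested with --task translate.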