video-understanding
Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video.
| name | video-understanding |
| description | Download videos and transcribe their content. Use when asked to understand, summarize, or analyze a video. |
| allowed-tools | ["Bash","Read","Write"] |
Download videos and transcribe their content for analysis.
- yt-dlp (pip install yt-dlp or brew install yt-dlp)
- ffmpeg (brew install ffmpeg or apt install ffmpeg)
- whisper (pip install openai-whisper)

# Download video with yt-dlp
yt-dlp -o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
# For best quality
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" \
-o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
# For audio only (faster)
yt-dlp -x --audio-format mp3 \
-o "assets/downloads/%(title)s.%(ext)s" "<VIDEO_URL>"
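Because the output template uses %(title)s, the final file name is not known in advance, while the later commands refer to fixed paths such as assets/downloads/audio.mp3. A minimal sketch of one way to capture the real path, assuming a yt-dlp version that supports `--print after_move:filepath`. The function name `download_audio` and the `YTDLP` and `DOWNLOAD_DIR` variables are hypothetical helpers introduced here, not part of yt-dlp or of this skill:

```shell
# Hypothetical overrides: which binary to call and where to store downloads.
YTDLP="${YTDLP:-yt-dlp}"
DOWNLOAD_DIR="${DOWNLOAD_DIR:-assets/downloads}"

download_audio() {
    # $1: video URL. Prints the path of the audio file that was written.
    mkdir -p "$DOWNLOAD_DIR" || return 1
    # --restrict-filenames keeps the title ASCII-only and free of spaces.
    # --print after_move:filepath reports the file path after post-processing.
    "$YTDLP" -x --audio-format mp3 \
        --restrict-filenames \
        -o "$DOWNLOAD_DIR/%(title)s.%(ext)s" \
        --print after_move:filepath \
        "$1"
}

# Usage (needs network access and yt-dlp installed):
#   audio_file="$(download_audio "<VIDEO_URL>")"
```

The captured path can then be passed to ffmpeg or whisper instead of a hard-coded file name.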
# Extract audio from an already downloaded video
ffmpeg -i "assets/downloads/video.mp4" \
-vn -acodec mp3 -ab 128k \
"assets/downloads/audio.mp3"
# Basic transcription
whisper "assets/downloads/audio.mp3" \
--model base \
--output_format txt \
--output_dir output/
# Higher quality (slower)
whisper "assets/downloads/audio.mp3" \
--model medium \
--output_format all \
--output_dir output/
# With timestamps
whisper "assets/downloads/audio.mp3" \
--model base \
--output_format srt \
--output_dir output/
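Whisper names each output file after the input file's base name, so assets/downloads/audio.mp3 with `--output_format srt` and `--output_dir output/` yields output/audio.srt. The sketch below uses that to skip work that has already been done. `transcript_path_for`, `transcribe`, and the `WHISPER` and `OUTPUT_DIR` variables are hypothetical names chosen for illustration, and the caching behavior is an assumption about what is useful, not something the skill specifies:

```shell
# Hypothetical override for the whisper binary.
WHISPER="${WHISPER:-whisper}"

transcript_path_for() {
    # $1: audio file, $2: output directory, $3: format (txt, srt, vtt, json)
    t_name="$(basename "$1")"
    printf '%s/%s.%s\n' "${2%/}" "${t_name%.*}" "$3"
}

transcribe() {
    # $1: audio file, $2: model (default: base), $3: format (default: srt)
    t_model="${2:-base}"
    t_fmt="${3:-srt}"
    t_dir="${OUTPUT_DIR:-output}"
    t_target="$(transcript_path_for "$1" "$t_dir" "$t_fmt")"
    # Only run whisper when no non-empty transcript exists yet.
    if [ ! -s "$t_target" ]; then
        # Whisper's progress text goes to stderr so stdout carries only the path.
        "$WHISPER" "$1" --model "$t_model" \
            --output_format "$t_fmt" --output_dir "$t_dir" >&2 || return 1
    fi
    printf '%s\n' "$t_target"
}

# Usage (needs whisper installed):
#   srt_file="$(transcribe "assets/downloads/audio.mp3" base srt)"
```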
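1. Read the generated transcript file from output/.
2. Summarize the key points.
3. Extract quotes with their timestamps (use the srt, vtt, or json output; txt contains no timestamps).
4. Identify speakers if there are several. Whisper does not label speakers, so infer them from context.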
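For the quote and timestamp step, an .srt transcript can be flattened into one line per subtitle so that a phrase search returns its start time alongside the text. This is a rough sketch that assumes the standard SubRip layout (index line, time range line, then one or more text lines, with blank lines between entries); `srt_to_lines` and `find_quotes` are hypothetical helper names:

```shell
srt_to_lines() {
    # $1: path to an .srt file. Prints "[HH:MM:SS] text" for each subtitle.
    awk 'BEGIN { RS = ""; FS = "\n" }
    {
        start = substr($2, 1, 8)     # HH:MM:SS from "HH:MM:SS,mmm --> ..."
        text = $3
        for (i = 4; i <= NF; i++) text = text " " $i   # join wrapped lines
        printf "[%s] %s\n", start, text
    }' "$1"
}

find_quotes() {
    # $1: .srt file, $2: phrase to look for (case-insensitive, literal match)
    srt_to_lines "$1" | grep -i -F -- "$2"
}

# Usage:
#   find_quotes output/audio.srt "machine learning"
```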
| Model | Parameters | Speed | Quality |
|---|---|---|---|
| tiny | 39M | Fastest | Lower |
| base | 74M | Fast | Good |
| small | 244M | Medium | Better |
| medium | 769M | Slow | High |
| large | 1550M | Slowest | Highest |
Output formats:

- txt - Plain text transcript
- srt - SubRip subtitles with timestamps
- vtt - WebVTT subtitles
- json - Detailed JSON with segment-level timing (word-level when --word_timestamps True is set)
- all - All formats

Tips:

- Use the base model for speed, medium for accuracy
- Use --language en to force English instead of auto-detecting the language
- Use --task translate to translate to English

Directories:

- assets/downloads/ for downloaded files
- output/transcripts/
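Before running the workflow, it can help to confirm that the three command-line tools are on PATH and that the working directories exist. The following is a small sketch under those assumptions; `missing_tools` and `preflight` are hypothetical helper names, and the directory names are the ones listed above:

```shell
missing_tools() {
    # Print each named command that is not found on PATH, one per line.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

preflight() {
    # Create the working directories, then fail if a required tool is absent.
    mkdir -p assets/downloads output/transcripts || return 1
    missing="$(missing_tools yt-dlp ffmpeg whisper)"
    if [ -n "$missing" ]; then
        printf 'Missing tools:\n%s\n' "$missing" >&2
        return 1
    fi
}

# Usage:
#   preflight && echo "ready to download and transcribe"
```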