youtube-transcribe-skill
// Extract subtitles/transcripts from YouTube videos. Triggers: "youtube transcript", "extract subtitles", "video captions", "视频字幕", "字幕提取", "YouTube转文字", "提取字幕".
| name | youtube-transcribe-skill |
| description | Extract subtitles/transcripts from YouTube videos. Triggers: "youtube transcript", "extract subtitles", "video captions", "视频字幕", "字幕提取", "YouTube转文字", "提取字幕". |
Extract subtitles/transcripts from a YouTube video URL and save them as a local file.
Input YouTube URL: $ARGUMENTS
Confirm the input is a valid YouTube URL (supports youtube.com/watch?v=, youtu.be/, and youtube.com/shorts/ formats). If no URL is provided via arguments, check the conversation context for a YouTube link.
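The URL check above can be sketched as a small function. This is a minimal illustration, not part of the skill itself: the name extractVideoId is hypothetical, and the 11-character ID pattern and the handling of the www. and m. host prefixes are assumptions based on commonly seen YouTube links.

```javascript
// Hypothetical helper: returns the video ID for the three URL shapes named above, or null.
function extractVideoId(input) {
  let url;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable URL at all
  }
  // Assumption: treat www. and m. hosts the same as the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let id = null;
  if (host === "youtu.be") {
    id = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      id = url.searchParams.get("v");
    } else if (url.pathname.startsWith("/shorts/")) {
      id = url.pathname.split("/")[2] ?? null;
    }
  }
  // Assumption: IDs are 11 characters from [A-Za-z0-9_-]; an observed convention, not a guarantee.
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

A null result means the argument is not a supported YouTube URL, which is the cue to look for a link in the conversation context instead.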
Use command-line tools to quickly extract subtitles.
Run which yt-dlp to check whether yt-dlp is installed.
- If yt-dlp is found, proceed to 2.2.
- If yt-dlp is not found, skip to Step 3.

Fetch the video title:

yt-dlp --cookies-from-browser=chrome --get-title "[VIDEO_URL]"
- Pass --cookies-from-browser to avoid sign-in restrictions. Default to chrome.
- If that fails, try another browser (firefox, safari, edge) and retry.

Download the subtitles without downloading the video:

yt-dlp --cookies-from-browser=chrome --write-auto-sub --write-sub --sub-lang zh-Hans,zh-Hant,en --skip-download --output "<Video Title>.%(ext)s" "[VIDEO_URL]"
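The browser fallback described above could be driven by a loop like the following. It is a sketch under stated assumptions: buildSubtitleArgs and downloadWithFallback are hypothetical names, and the run callback is injected so the logic can be exercised without yt-dlp installed. The flags themselves are copied from the command above.

```javascript
// Browsers to try in order; the source defaults to chrome and falls back to the others.
const BROWSERS = ["chrome", "firefox", "safari", "edge"];

// Hypothetical helper: builds the argument list for the subtitle download command.
function buildSubtitleArgs(videoUrl, title, browser) {
  return [
    `--cookies-from-browser=${browser}`,
    "--write-auto-sub",
    "--write-sub",
    "--sub-lang", "zh-Hans,zh-Hant,en",
    "--skip-download",
    "--output", `${title}.%(ext)s`,
    videoUrl,
  ];
}

// Tries each browser in turn. `run` is an injected function (an assumption) that
// executes the command and returns true on success.
function downloadWithFallback(videoUrl, title, run) {
  for (const browser of BROWSERS) {
    if (run("yt-dlp", buildSubtitleArgs(videoUrl, title, browser))) return browser;
  }
  return null; // every browser failed: fall through to Step 3
}
```

In Node, run might wrap child_process.spawnSync and compare the exit status to 0. Passing the arguments as an array rather than one shell string sidesteps quoting problems with titles that contain spaces or quotes.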
yt-dlp saves subtitles as .vtt or .srt files. Convert the downloaded file to plain Timestamp Text format:
- Read the downloaded subtitle file (.vtt or .srt).
- Save the result as <Video Title>.txt with one Timestamp Text entry per line.

When the CLI method fails or yt-dlp is missing, use Chrome DevTools MCP to extract subtitles via browser UI automation.
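For the CLI path's conversion step, one possible way to turn a .vtt or .srt file into Timestamp Text lines is sketched here. The function name is hypothetical, and several choices are assumptions rather than anything the skill specifies: keeping only the cue start time, dropping milliseconds, stripping inline tags, and skipping consecutive repeated lines (auto-generated captions often repeat them).

```javascript
// Hypothetical converter: turns WebVTT or SRT text into "start-time text" lines.
function subtitlesToTimestampText(raw) {
  // Cue timing line, e.g. "00:00:01.000 --> 00:00:04.000" (VTT) or "00:00:01,000 --> ..." (SRT).
  const timing = /^((?:\d{2,}:)?\d{2}:\d{2})[.,]\d{3}\s+-->/;
  const out = [];
  let start = null;
  let lastText = null;
  for (const line of raw.split(/\r?\n/)) {
    const match = line.match(timing);
    if (match) {
      start = match[1]; // keep the start time, drop milliseconds
      continue;
    }
    // Strip inline tags such as <c> or <00:00:01.500>.
    const text = line.replace(/<[^>]*>/g, "").trim();
    if (!text || start === null) continue; // blank line, or header before the first cue
    if (/^\d+$/.test(text)) continue; // assumption: digit-only lines are SRT cue numbers
    if (text === lastText) continue; // skip a line that repeats the previous one
    out.push(`${start} ${text}`);
    lastText = text;
  }
  return out.join("\n");
}
```

The returned string can be written directly to <Video Title>.txt. A subtitle line that consists only of digits would be dropped by the cue-number check, which is a trade-off of this simple approach.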
Check if Chrome DevTools MCP tools are available (look for tools matching chrome__new_page or similar).
If Chrome DevTools MCP is not available and yt-dlp was not found in Step 2, stop and notify the user: "Unable to proceed. Please either install yt-dlp (for fast CLI extraction) or configure Chrome DevTools MCP (for browser automation)."
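The fallback order between the two methods can be summarized as a small decision function. This is only an illustrative sketch: chooseMethod and its option names are hypothetical, and the case where yt-dlp is installed but fails while Chrome DevTools MCP is unavailable is treated as a stop, which the text above does not spell out.

```javascript
// Hypothetical decision helper mirroring the fallback order described above.
function chooseMethod({ hasYtDlp, cliFailed = false, hasDevToolsMcp }) {
  if (hasYtDlp && !cliFailed) return "cli"; // fast path: yt-dlp extraction
  if (hasDevToolsMcp) return "browser"; // fallback: browser UI automation
  return "stop"; // neither method is usable: notify the user
}
```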
Use Chrome DevTools MCP new_page to open the video URL.
Use Chrome DevTools MCP take_snapshot to read the page accessibility tree.
The "Show transcript" button is usually hidden within the collapsed description area.
- Find the button that expands the description and use click to click that button.
- Use take_snapshot to get the updated UI.
- Find the "Show transcript" button and use click to click that button.

Directly reading the accessibility tree for long transcript lists is slow and token-heavy. Use Chrome DevTools MCP evaluate_script to run this JavaScript instead:
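() => {
  // Each transcript line is rendered as one ytd-transcript-segment-renderer element.
  const segments = document.querySelectorAll("ytd-transcript-segment-renderer");
  // No segments yet means the transcript panel is still loading.
  if (!segments.length) return "BUFFERING";
  return Array.from(segments)
    .map((seg) => {
      // Fall back to an empty string so a missing node never prints "undefined".
      const time = seg.querySelector(".segment-timestamp")?.innerText.trim() ?? "";
      const text = seg.querySelector(".segment-text")?.innerText.trim() ?? "";
      return `${time} ${text}`.trim();
    })
    .join("\n");
};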
If it returns "BUFFERING", wait a few seconds and retry (up to 3 attempts).
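The wait-and-retry rule could look like the sketch below. The helper name and the evaluate callback are hypothetical: evaluate stands in for whatever call runs the transcript script in the page and resolves to its return value. Reading "up to 3 attempts" as three tries in total, and the 3-second default delay, are both assumptions.

```javascript
// Hypothetical polling helper for the "BUFFERING" sentinel.
async function readTranscriptWithRetry(evaluate, { attempts = 3, delayMs = 3000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const result = await evaluate();
    if (result !== "BUFFERING") return result; // transcript text is ready
    if (attempt < attempts) {
      // Give the transcript panel time to render before trying again.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null; // still buffering after the final attempt
}
```

A null result signals that the transcript never loaded, so the caller can report the failure instead of saving an empty file.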
- Save the returned text to <Video Title>.txt.
- Use Chrome DevTools MCP close_page to release resources.

Output file: <Video Title>.txt
Line format: Timestamp Subtitle Text