| name | p2v-phase-1-repo-script |
| description | Generate and validate video_script.jsonl from a code repository (GitHub URL or local path). Use this when running phase 1 of the repo-to-video pipeline. |
| metadata | {"short-description":"P2V phase 1 repo script generation"} |
P2V Phase 1: Repo Script
When to use
Use this skill when the user wants phase 1 of the repo-to-video pipeline: turning a repository input into a validated video_script.jsonl.
Inputs
- repository URL (e.g. https://github.com/org/repo) or local path (e.g. /path/to/repo)
- output run directory (default: outputs/<video_id>-<timestamp>)
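The default run directory can be derived mechanically from the video ID and the current time. The sketch below is illustrative only: the timestamp format (UTC, YYYYMMDD-HHMMSS) is an assumption, since the format is not specified here, and default_run_dir is a hypothetical helper name.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def default_run_dir(video_id: str, now: Optional[datetime] = None) -> Path:
    """Build outputs/<video_id>-<timestamp>.

    The timestamp layout is an assumption; adjust it to whatever the
    pipeline actually expects.
    """
    moment = now or datetime.now(timezone.utc)
    stamp = moment.strftime("%Y%m%d-%H%M%S")
    return Path("outputs") / f"{video_id}-{stamp}"
```

Passing an explicit now value keeps the helper deterministic in tests; omitting it uses the current UTC time.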
Input Handling
- GitHub URL: Clone or fetch the repo contents. Read README, key source files, config, and docs.
- Local path: Read directly from disk. Resolve the path and explore the directory tree.
In both cases, build a mental model of the repo before writing any script.
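One way to tell the two input kinds apart is sketched below. The function names and the returned labels are hypothetical, and treating only github.com hosts as URLs is an assumption made for illustration; cloning or fetching is left out.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_repo_input(repo_input: str) -> str:
    """Return 'github_url' for http(s) GitHub URLs, otherwise 'local_path'."""
    parsed = urlparse(repo_input)
    is_http = parsed.scheme in ("http", "https")
    is_github = parsed.netloc.lower() in ("github.com", "www.github.com")
    return "github_url" if is_http and is_github else "local_path"


def resolve_local_path(repo_input: str) -> Path:
    """Expand '~' and make the path absolute (does not require it to exist)."""
    return Path(repo_input).expanduser().resolve()
```

Anything that is not recognised as a GitHub URL falls through to the local-path branch, so a caller would still need to check that the resolved path exists before exploring it.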
Workflow
- Run a mandatory preparation pass before drafting any script lines.
- Follow the pedagogical framework and the system contract at:
  - docs/educational-video-pedagogy-framework.md
  - docs/00-system-contract.md
- Draft one coherent educational script from the preparation results (not directly from raw code).
- Enforce the contract fields in video_script.jsonl.
- Save as video_script.jsonl in the run folder.
- Validate:
uv run python -c "from pathlib import Path; from paper2video.contracts.io import validate_artifact; validate_artifact(Path('<video_script.jsonl>'), artifact_type='video_script'); print('video_script contract ok')"
Required output
<run_dir>/video_script.jsonl
Mandatory Preparation Pass (Internal, Phase-1 only)
Before writing the first record, the agent must do this internally:
- Codebase extraction
  - primary purpose: what problem does this repo solve?
  - architecture: high-level components, layers, data flow
  - key abstractions: the 3-5 core types/interfaces/patterns that define the system
  - design decisions: why was it built this way? what tradeoffs were made?
  - dependencies and integration points
  - known limitations, tech debt, or open issues (from README, issues, TODOs)
- Pedagogical recomposition
  - learner-first sequence (not directory/module order)
  - narrative arc: hook -> architecture overview -> core mechanism -> how pieces compose -> tradeoffs -> synthesis
  - prerequisite and misconception map (what must viewers already know?)
- Script planning
  - chapter plan with explicit didactic objective per chapter
  - segment purpose statements that justify each segment
  - duration estimate based on repo complexity
Do not ask the user for these artifacts. Build them internally, then emit only video_script.jsonl.
Codebase Exploration Strategy
To build the codebase extraction, follow this order:
- Orientation: README, package manifest (pyproject.toml, package.json, Cargo.toml, etc.), top-level directory structure
- Entry points: CLI commands, main functions, API routers, app factory
- Core domain: The 3-5 most important modules/classes that implement the primary purpose
- Data flow: How data moves through the system (request lifecycle, pipeline stages, event flow)
- Configuration: Settings, environment variables, feature flags
- Tests: Scan test files for usage patterns and edge cases that reveal design intent
Do NOT try to read every file. Focus on the files that reveal architecture and intent.
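The orientation step can be sketched as a small scan of the repository root. This is a minimal illustration, not part of the pipeline: the helper names are hypothetical, and the README filename variants are assumptions added alongside the manifests named above.

```python
from pathlib import Path
from typing import List

# README variants are assumptions; the manifests come from the list above.
ORIENTATION_FILES = (
    "README.md",
    "README.rst",
    "README",
    "pyproject.toml",
    "package.json",
    "Cargo.toml",
)


def find_orientation_files(repo_root: Path) -> List[Path]:
    """Return the orientation files that exist directly under repo_root."""
    return [repo_root / name for name in ORIENTATION_FILES if (repo_root / name).is_file()]


def top_level_dirs(repo_root: Path) -> List[str]:
    """List visible top-level directory names, skipping dot-directories such as .git."""
    return sorted(
        entry.name
        for entry in repo_root.iterdir()
        if entry.is_dir() and not entry.name.startswith(".")
    )
```

Entry points, core modules, and tests still need to be chosen by reading the code; a scan like this only narrows down where to start.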
Complexity-To-Depth Policy (Required)
Before drafting, assign a complexity tier using repo content:
- tier_1 (simple utility/library): single responsibility, few modules, straightforward API
- tier_2 (moderate application): multiple subsystems, some non-trivial patterns
- tier_3 (complex system): many interacting components, non-trivial architecture, significant design decisions
- tier_4 (very complex): tier_3 plus distributed components, multiple protocols, or heavy infrastructure
Use this mapping for script depth:
- tier_1: 700-1100 words (~5-8 min)
- tier_2: 1100-1700 words (~8-13 min)
- tier_3: 1700-2600 words (~13-20 min)
- tier_4: 2400-3600 words (~18-28 min)
If the draft word count is below the tier minimum, expand with:
- deeper walkthrough of core data flow
- concrete examples of how key abstractions compose
- design decision rationale and alternatives considered
- edge cases and failure modes
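The tier-to-depth mapping can be checked with a few lines of code. The sketch below copies the word ranges from the mapping above; the helper names, the whitespace-based word count, and the "above_range" label are assumptions, since the policy only says what to do when a draft falls below the minimum.

```python
from typing import Iterable

# Word ranges per tier, taken from the mapping above.
TIER_WORD_RANGES = {
    "tier_1": (700, 1100),
    "tier_2": (1100, 1700),
    "tier_3": (1700, 2600),
    "tier_4": (2400, 3600),
}


def count_words(narrations: Iterable[str]) -> int:
    """Count whitespace-separated words across all narration strings."""
    return sum(len(text.split()) for text in narrations)


def depth_status(tier: str, word_count: int) -> str:
    """Return 'expand' below the tier minimum, 'above_range' past the maximum, else 'ok'."""
    minimum, maximum = TIER_WORD_RANGES[tier]
    if word_count < minimum:
        return "expand"
    if word_count > maximum:
        return "above_range"
    return "ok"
```

For example, a tier_2 draft of 900 words yields "expand", which signals that one of the expansion strategies listed above should be applied before finalizing.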
Canonical Narrative Arc for Repos
Adapt the pedagogical framework's arc to code:
- Hook through tension: Start with the problem the repo solves. Make viewers feel the pain point.
  - "Imagine you need to X, but Y makes it hard..."
  - "Every time you do X, you hit this wall..."
- Promise and scope: What will the viewer understand by the end?
- Architecture overview: The 10,000-foot view. Key components and how they connect.
- Toy world / minimal example: Show the simplest use case that exercises the core path.
- Core mechanism walkthrough: Walk through the main code path with concrete examples.
- How pieces compose: Show how modules interact, data transforms, state flows.
- Design decisions and tradeoffs: Why this architecture? What alternatives were considered?
- Limits and future: Known limitations, tech debt, roadmap.
- Synthesis: Tie back to the opening, so the viewer leaves understanding how the system works.
Story Archetype Selection
Choose the best archetype for the repo:
- A) Architecture Explainer: For libraries/frameworks. "Here's how it works under the hood."
- B) Problem-Solution Journey: For applications. "Here's the problem → here's how this repo solves it."
- C) Design Decision Deep-Dive: For repos with interesting engineering tradeoffs. "Why was it built this way?"
VideoMetaRecord Format
The first record in video_script.jsonl must be:
{
  "record_type": "video_meta",
  "video_id": "<repo-name>",
  "paper": {
    "source_type": "repository",
    "repo_url_or_path": "<input>",
    "repo_name": "<name>",
    "primary_language": "<lang>",
    "description": "<one-line description>"
  },
  "primary_thesis": "<what this repo does and why it matters>"
}
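Because video_script.jsonl is a JSON Lines file, the record shown above has to be written as a single line rather than pretty-printed. The following sketch builds the record with the fields listed above and serializes it that way; build_video_meta and to_jsonl_line are hypothetical helper names, and the sketch does not replace the contract validation step.

```python
import json
from typing import Any, Dict


def build_video_meta(
    repo_input: str,
    repo_name: str,
    primary_language: str,
    description: str,
    primary_thesis: str,
) -> Dict[str, Any]:
    """Assemble the video_meta record using the fields shown above."""
    return {
        "record_type": "video_meta",
        "video_id": repo_name,
        "paper": {
            "source_type": "repository",
            "repo_url_or_path": repo_input,
            "repo_name": repo_name,
            "primary_language": primary_language,
            "description": description,
        },
        "primary_thesis": primary_thesis,
    }


def to_jsonl_line(record: Dict[str, Any]) -> str:
    """Serialize one record as a single JSON line (no embedded newlines)."""
    return json.dumps(record, ensure_ascii=False)
```

Writing to_jsonl_line(record) followed by a newline as the first line of the file keeps the meta record in the required first position.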
Depth And Specificity Rules
The script must reflect genuine understanding of the codebase:
- Include concrete code details where possible:
  - actual class/function names from the repo
  - real data flow paths
  - specific design patterns used
  - concrete configuration or API surface
- Avoid generic descriptions that could apply to any repo.
- Tie architectural claims to actual code structure.
- Use explicit transitions that preserve technical continuity.
- Duration is repo-dependent:
  - do not force a fixed runtime target
  - simple utilities can be shorter
  - complex systems should expand enough to cover architecture thoroughly
- Do not collapse complex repos into a marketing summary.
If the current draft feels generic, refine before finalizing.
Narration Voice Rules (Required)
narration_text must sound like an educational video, not a lecture outline:
- Never use meta-outline phrasing inside narration text:
  - avoid: Chapter 1, Chapter 2, Section, Lecture, In this chapter
- Keep chapter metadata in fields (record_type=chapter, chapter_id) but keep spoken text natural.
- Prefer direct viewer-facing transitions:
  - examples: Now let's look at how..., Next we trace the data through..., Here's where it gets interesting...
- Avoid production/meta instructions in narration:
  - no references to script-writing process, tiers, or internal planning artifacts.
- Use code-native vocabulary naturally:
  - "the handler grabs the request and...", "this decorator wraps...", "the pipeline stages chain together..."
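A rough lint for the meta-outline rule can be sketched as a regular-expression scan over narration_text. The pattern below is an assumption built from the avoid list above and the helper name is hypothetical; it will also flag legitimate uses of words such as "section", so hits should be reviewed rather than rejected automatically.

```python
import re
from typing import List

# Built from the "avoid" list above; whole-word, case-insensitive matching.
META_OUTLINE_PATTERN = re.compile(
    r"\b(chapter\s+\d+|in this chapter|section|lecture)\b",
    re.IGNORECASE,
)


def find_meta_phrases(narration_text: str) -> List[str]:
    """Return every meta-outline phrase found in the narration, in order."""
    return [match.group(0) for match in META_OUTLINE_PATTERN.finditer(narration_text)]
```

An empty result means none of the listed phrases appear; it does not by itself show that the narration sounds natural.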
Didactic Density Rules (Required)
Keep the script teachable for video viewers:
- One core idea per narration unit.
  - each segment should deliver one primary teaching point plus at most one supporting point.
- Control spoken technical load.
  - don't enumerate every method signature or config option in speech.
  - focus on the "why" and "how", leave exhaustive API surface to docs.
- Split dense units.
  - if a segment covers more than two major components, split it into sequential segments.
- Keep recaps short and retrieval-oriented.
- Code snippets are for visuals, not speech.
  - narration should describe what the code does conceptually, the animation shows the code.