analyze-video

Name: Analyze Video
Author: barefootford

Full footage analysis pipeline: audio transcripts, contact sheets, and Sonnet-written summaries. Produces every artifact the cut skill reads. Orchestrated from the main thread.

stars: 509

forks: 80

updated: 2026-05-25 05:32

SKILL.md

name	analyze-video
description	Full footage analysis pipeline — audio transcripts, contact sheets, and Sonnet-written summaries. Produces every artifact the cut skill reads. Orchestrated from the main thread.

Skill: Analyze Video (parent brief)

This is the main thread's playbook for the Analyze Video workflow step. Run it after library setup and before any cut work. It covers the three artifacts produced per clip: audio transcript, contact sheet, and markdown summary. The roughcut agent reads dialogue on demand by running script_extractor.rb over the transcript JSON, so there is no separate script artifact.

SKILL.md is the parent's dispatch brief. The sub-agent working prompt lives in agent_prompt.md — inline its contents when launching a Task agent. Don't pass SKILL.md.

Terminology

User-facing: call it "footage analysis" or "analyzing footage."
Internal/file names: "transcription" (library.yaml field transcript, etc.).

Prerequisites

Library setup is complete (library.yaml exists, schema is current — run migrations from AGENTS.md if not).
Read libraries/settings.yaml directly for whisper_model. For library fields, read the snapshot via ruby lib/buttercut/library.rb <name> summary and pull the values you need from the JSON — don't parse library.yaml inline.
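The two reads above can be sketched in Ruby as follows. This is a minimal illustration, not the project's own code: the helper names are hypothetical, whisper_model is assumed to be a top-level key in settings.yaml, and the JSON keys in the sample snapshot are placeholders for whatever the summary command actually prints.

```ruby
require "yaml"
require "json"

# Hypothetical helper: pull whisper_model out of the text of libraries/settings.yaml.
# Assumes whisper_model is a top-level key.
def whisper_model_from(settings_yaml)
  settings = YAML.safe_load(settings_yaml) || {}
  settings.fetch("whisper_model") do
    raise KeyError, "whisper_model missing from settings.yaml"
  end
end

# Hypothetical helper: pick named fields out of the JSON printed by the
# `summary` command. No particular key names are assumed here.
def pick_fields(summary_json, *keys)
  snapshot = JSON.parse(summary_json)
  keys.to_h { |key| [key, snapshot[key]] }
end

# In a real run the inputs would come from:
#   File.read("libraries/settings.yaml")
#   `ruby lib/buttercut/library.rb <name> summary`
# The strings below are made-up sample data.
settings_text = "whisper_model: small\n"
snapshot_text = '{"language_code": "en", "transcript_refinement": true}'

puts whisper_model_from(settings_text)
p pick_fields(snapshot_text, "language_code", "transcript_refinement")
```

Keeping the parsing in small functions that take text makes it easy to check them without touching a real library.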

Step 1 — Audio transcripts (parallel sub-agents)

Inform the user: "Library setup complete. Found [N] videos ([total size]). Starting footage analysis..."

Launch transcribe-audio Task agents. Pass these values inline in each agent's prompt:

video_path, transcript_output_dir, language_code, whisper_model
transcript_refinement (boolean). If true, also pass the current user_context and footage_summary strings (empty strings are fine — refinement still catches nonsense-token and self-witness fixes).

As each agent completes, update library.yaml with transcript (filename only, not full path):

ruby lib/buttercut/library.rb <name> complete transcript <filename> [<filename>...]

Refinement note: When transcript_refinement: true, each transcribe-audio agent reviews and corrects its transcript in place before returning, using the user_context and footage_summary the parent passed in. Empty context strings are fine. The parent still only writes transcript: <filename>.json to library.yaml after the agent completes.
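A rough sketch of how the parent might assemble the inline values for one transcribe-audio agent and then build the complete transcript call. The helper names and the sample paths are hypothetical; only the field names and the command shape come from the text above.

```ruby
# Hypothetical helper: collect the inline values for one transcribe-audio agent.
def transcribe_inputs(video_path:, transcript_output_dir:, language_code:,
                      whisper_model:, transcript_refinement:,
                      user_context: "", footage_summary: "")
  inputs = {
    video_path: video_path,
    transcript_output_dir: transcript_output_dir,
    language_code: language_code,
    whisper_model: whisper_model,
    transcript_refinement: transcript_refinement
  }
  # Context strings are only passed when refinement is on; empty strings are fine.
  if transcript_refinement
    inputs[:user_context] = user_context.to_s
    inputs[:footage_summary] = footage_summary.to_s
  end
  inputs
end

# Build the argv for the `complete transcript` call, using filenames only.
def complete_transcript_argv(library_name, transcript_paths)
  filenames = transcript_paths.map { |path| File.basename(path) }
  ["ruby", "lib/buttercut/library.rb", library_name, "complete", "transcript", *filenames]
end

# Sample values below are invented for illustration.
inputs = transcribe_inputs(
  video_path: "/footage/P1055016.MP4",
  transcript_output_dir: "/libraries/demo/transcripts",
  language_code: "en",
  whisper_model: "small",
  transcript_refinement: true
)
p inputs
p complete_transcript_argv("demo", ["/libraries/demo/transcripts/P1055016.json"])
```

Passing the command as an argv array (for example to `system(*argv)`) avoids shell-quoting problems with unusual filenames.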

Step 2 — Contact sheets (deterministic, no agent)

Run from the project root:

ruby lib/buttercut/contact_sheet_job.rb <library-name> <clip> [<clip> ...]

The job takes an explicit list of clip filenames, extension included (e.g. P1055016.MP4). It runs single-threaded, so launch multiple invocations in parallel from the main thread when machine headroom allows (splitting the clips across 2 or 3 invocations is usually safe on an M-series Mac). It always rebuilds every sheet for the clips it is given; for clips longer than 10 minutes that includes per-segment sheets covering successive 10-minute slices. It updates library.yaml's contact_sheet field for every clip it processes. No LLM is involved: it is pure ffmpeg.
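One way to split the clip list across a few parallel invocations is sketched below. The helper names are hypothetical and the round-robin split is just one reasonable choice; the command shape is the one shown above.

```ruby
# Split clip filenames into up to `ways` groups, round-robin.
def split_clips(clips, ways)
  groups = Array.new(ways) { [] }
  clips.each_with_index { |clip, index| groups[index % ways] << clip }
  groups.reject(&:empty?)
end

# argv for one contact_sheet_job.rb invocation.
def contact_sheet_argv(library_name, clips)
  ["ruby", "lib/buttercut/contact_sheet_job.rb", library_name, *clips]
end

# Run one invocation per group concurrently from the project root.
# Returns true only if every invocation exited successfully.
def run_contact_sheet_jobs(library_name, clips, ways: 2)
  threads = split_clips(clips, ways).map do |group|
    Thread.new { system(*contact_sheet_argv(library_name, group)) }
  end
  threads.map(&:value).all?
end

# Show the planned invocations without running them (sample filenames).
clips = %w[P1055016.MP4 P1055017.MP4 P1055018.MP4 P1055019.MP4 P1055020.MP4]
split_clips(clips, 2).each { |group| puts contact_sheet_argv("demo", group).join(" ") }
```

Round-robin keeps group sizes within one clip of each other; if clip lengths vary a lot, grouping by duration may balance the work better.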

Step 3 — Summaries (Sonnet sub-agents, batched, rolling)

Dispatch analyze-video sub-agents on the Sonnet model. Sonnet reads the contact sheet with noticeably more visual specificity than Haiku (catches clothing, architecture, camera framing) — worth it since the summaries feed every later cut decision.

Batch 10 clips per sub-agent, up to 10 sub-agents in parallel, with rolling dispatch. Each sub-agent processes its 10 clips sequentially; batching amortizes the ~5–10s per-agent dispatch overhead. For a 93-clip library that's ~10 sub-agents total instead of 93. Start the next sub-agent as soon as one returns — don't wait for the whole wave of 10 to finish, or you give up ~30% of wall-clock to whichever agent in the wave is slowest.
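The batching arithmetic and the rolling dispatch can be illustrated with a small worker pool. This is only a scheduling sketch: in practice the sub-agents are launched as Task agents rather than Ruby threads, and the block passed to rolling_dispatch is a stand-in for "dispatch one sub-agent and wait for its reply".

```ruby
BATCH_SIZE = 10    # clips per sub-agent
MAX_PARALLEL = 10  # sub-agents in flight at once

# 93 clips -> 9 full batches plus one batch of 3, i.e. 10 sub-agents.
def batches_for(clips, size = BATCH_SIZE)
  clips.each_slice(size).to_a
end

# Rolling dispatch: each worker takes the next batch as soon as it is free,
# so one slow batch never holds back the rest of a "wave".
def rolling_dispatch(batches, max_parallel: MAX_PARALLEL, &run_batch)
  pending = Queue.new
  batches.each { |batch| pending << batch }
  pending.close # pop returns nil once the queue is drained

  results = Queue.new
  workers = Array.new([max_parallel, batches.size].min) do
    Thread.new do
      while (batch = pending.pop)
        results << run_batch.call(batch)
      end
    end
  end
  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

clips = (1..93).map { |n| format("clip_%03d.MP4", n) } # made-up filenames
batches = batches_for(clips)
puts "#{batches.size} batches, last has #{batches.last.size} clips"

sizes = rolling_dispatch(batches) { |batch| batch.size }
puts "processed #{sizes.sum} clips"
```

Results arrive in completion order, not submission order, which matches updating library.yaml as each batch returns.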

For each sub-agent, pass a list of 10 clip records inline. Each clip record needs:

video_filename — basename of the video (used in the summary header and reply line)
duration — duration string from library.yaml (e.g. 00:01:19); the agent renders it in the summary header
contact_sheet_path — absolute path to the _full.jpg (from step 2)
transcript_path — absolute path to the audio transcript JSON (from step 1); the sub-agent extracts dialogue on demand via script_extractor.rb
summary_output_path — absolute path where the agent should write the summary markdown
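A clip record with the five fields listed above could be modelled and sanity-checked before dispatch roughly like this. The struct, the checker, and the sample paths are hypothetical; the field names and the _full.jpg suffix come from the list.

```ruby
require "pathname"

ClipRecord = Struct.new(:video_filename, :duration, :contact_sheet_path,
                        :transcript_path, :summary_output_path,
                        keyword_init: true)

PATH_FIELDS = %i[contact_sheet_path transcript_path summary_output_path].freeze

# Returns a list of human-readable problems; an empty list means the record looks usable.
def clip_record_problems(record)
  problems = []
  record.each_pair do |field, value|
    problems << "#{field} is missing" if value.to_s.empty?
  end
  PATH_FIELDS.each do |field|
    value = record[field].to_s
    next if value.empty?
    problems << "#{field} must be an absolute path" unless Pathname.new(value).absolute?
  end
  if record.video_filename.to_s.include?("/")
    problems << "video_filename must be a basename"
  end
  sheet = record.contact_sheet_path.to_s
  unless sheet.empty? || sheet.end_with?("_full.jpg")
    problems << "contact_sheet_path should point at the _full.jpg"
  end
  problems
end

# Sample record; every path here is invented for illustration.
record = ClipRecord.new(
  video_filename: "P1055016.MP4",
  duration: "00:01:19",
  contact_sheet_path: "/libraries/demo/contact_sheets/P1055016_full.jpg",
  transcript_path: "/libraries/demo/transcripts/P1055016.json",
  summary_output_path: "/libraries/demo/summaries/summary_P1055016.md"
)
p clip_record_problems(record)
```

Checking records in the parent is cheaper than discovering a relative path after a sub-agent has already been dispatched.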

As each sub-agent returns its batch, update library.yaml with summary for every clip in that batch:

ruby lib/buttercut/library.rb <name> complete summary <filename> [<filename>...]

The contact_sheet field was already populated in step 2, so the sub-agent return only contributes summaries.

If a sub-agent returns summaries inline instead of writing them to disk (sometimes Sonnet hallucinates "the Write tool is blocked" and dumps the markdown into its reply), don't retry blindly — just extract each summary from the agent's response and Write it to the matching summary_output_path from the parent thread. Then run the complete summary command as usual. Faster than redispatching, and the content is already there.
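The recovery step can be sketched as below, assuming the parent has already pulled each summary's markdown out of the agent's reply (how the reply is split is not specified here and is left out). The helper name and sample data are hypothetical.

```ruby
require "fileutils"
require "tmpdir"

# summaries:    { video_filename => markdown } recovered from the agent's reply
# output_paths: { video_filename => summary_output_path } from the records the parent sent
# Writes each summary to its matching path and returns the paths written.
def write_inline_summaries(summaries, output_paths)
  summaries.map do |video_filename, markdown|
    path = output_paths.fetch(video_filename)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, markdown)
    path
  end
end

# Demonstration in a throwaway directory with made-up content.
Dir.mktmpdir do |dir|
  output_paths = { "P1055016.MP4" => File.join(dir, "summaries", "summary_P1055016.md") }
  summaries = { "P1055016.MP4" => "# P1055016.MP4\n\nA man in a tan jacket walks toward camera.\n" }
  puts write_inline_summaries(summaries, output_paths)
end
```

Using fetch means a summary for a clip the parent never sent raises a KeyError instead of being written somewhere unexpected.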

(Per-segment contact sheets generated for long clips live alongside the _full sheet on disk and are discoverable by convention — they aren't listed in library.yaml.)

Step 4 — Confirm footage understanding with the user

Once every summary is written, talk through what the footage actually shows — confirm character names, locations, the narrative through-line, any stray or off-thesis clips, and the user's creative intent for this library. Use plain conversation; only reach for AskUserQuestion when offering a discrete choice. As you learn things, update:

footage_summary and user_context via ruby lib/buttercut/library.rb <name> update_metadata footage_summary "..." (and the same with user_context)
individual summary_*.md files when a summary mislabels someone or misses a key detail (e.g., "a man in a tan jacket" → the user's name)
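Because footage_summary and user_context are free text, it can help to build the update_metadata call as an argv array rather than a quoted shell string. A small sketch with a hypothetical helper name and sample text:

```ruby
require "shellwords"

METADATA_FIELDS = %w[footage_summary user_context].freeze

# Build the argv for one update_metadata call.
def update_metadata_argv(library_name, field, text)
  raise ArgumentError, "unexpected field: #{field}" unless METADATA_FIELDS.include?(field)
  ["ruby", "lib/buttercut/library.rb", library_name, "update_metadata", field, text]
end

argv = update_metadata_argv("demo", "footage_summary",
                            "Weekend trip: Sam's interview, harbor b-roll")
# system(*argv) would run it without involving a shell.
# Shellwords.join shows a copy-pasteable equivalent.
puts Shellwords.join(argv)
```

With the array form, apostrophes and quotes in the summary text reach the script unchanged.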

This is the one place to do this thorough pass. Every later roughcut planning run inherits the resulting context rather than re-interrogating the library.

Step 5 — Backup

After all analysis completes, automatically create a backup using the backup-library skill.

Parallel sub-agent pattern (reference)

Used in steps 1 and 3.

Parent agent responsibilities:

Read library.yaml and settings.yaml once to gather all values needed by sub-agents.
Launch Task agents passing all needed values inline in the prompt.
Update library.yaml sequentially as agents complete (via the Library API — see AGENTS.md).
Handle errors and retries.

Child agent responsibilities:

Process its assigned clip(s) using only the inputs passed inline by the parent.
For transcribe-audio, run WhisperX. For analyze-video, read the pre-generated contact sheet, extract dialogue from the transcript via script_extractor.rb, and write the summary markdown in one Write call.
Return a short structured response with file paths.

Each skill's agent_prompt.md documents its own IO contract — including whether the sub-agent reads or writes library.yaml. (Spoiler: it never writes library.yaml. Only the parent writes, via the Library API.)

If the user requests a rough cut before analysis completes

Warn: "I can create a rough cut now, but I'll do a better job after analyzing all the footage. Continue anyway?" If the user confirms, proceed. Otherwise, wait for analysis to complete.
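The decision above can be sketched as a simple gate over the three per-clip artifact fields. The clip hash shape is an assumption made for illustration; only the field names transcript, contact_sheet, and summary come from this brief.

```ruby
ARTIFACT_FIELDS = %w[transcript contact_sheet summary].freeze

ROUGH_CUT_WARNING = "I can create a rough cut now, but I'll do a better job " \
                    "after analyzing all the footage. Continue anyway?"

# Clips that are still missing at least one artifact.
def pending_clips(clips)
  clips.reject do |clip|
    ARTIFACT_FIELDS.all? { |field| !clip[field].to_s.empty? }
  end
end

# :proceed when analysis is complete, :warn when the user should be asked first.
def rough_cut_gate(clips)
  pending_clips(clips).empty? ? :proceed : :warn
end

# Sample clip entries (hypothetical shape and filenames).
clips = [
  { "transcript" => "a.json", "contact_sheet" => "a_full.jpg", "summary" => "summary_a.md" },
  { "transcript" => "b.json", "contact_sheet" => "b_full.jpg", "summary" => nil }
]
puts ROUGH_CUT_WARNING if rough_cut_gate(clips) == :warn
```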

related-skills.json

Same repository

process-library.md

from "barefootford/buttercut"

Skill for processing footage (video clips, sounds, photos, etc). Use this when creating a new library, adding new footage (videos) to an existing library, or resuming processing on an existing library.

cut.md

from "barefootford/buttercut"

Build a cut from a library — scene, selects, roughcut, or custom task. Starts by asking what kind of cut the user wants, then works with them to determine what they want to create. Always exports a file for Final Cut, Premiere, or Resolve at the end. Use when the user asks for a "roughcut", "sequence", "scene", "selects", or any other cut-shaped output.

backup-library.md

from "barefootford/buttercut"

Backs up user libraries and all their contents (external video excluded). This skill can also be useful when you need to restore a library.

contact-sheet.md

from "barefootford/buttercut"

Builds a contact sheet from a video clip — evenly spaced frames laid out in a single grid image, each with its hh:mm:ss timestamp burned in. Use when the user asks for a "contact sheet", "grid", "film strip", or wants a one-image overview of part of a clip.

full-transcript.md

from "barefootford/buttercut"

Exports all dialogue from every clip in a library into a single text file. One clip per block — filename, then its spoken words. Use when the user asks for a "full transcript", "full script", or wants all the dialogue from a library in one place.

reprocess-with-contact-sheets.md

from "barefootford/buttercut"

Reset a library's visual analysis (contact sheets, summaries, legacy visual_transcripts) and re-run the current analyze-video pipeline on it. Keeps audio transcripts, cuts, plans, and library metadata. Use when a library was processed under the older pipeline and the user wants to bring it onto the contact-sheet-based one.

package.json

"author": "barefootford"

"repository": "barefootford/buttercut"
