| name | tsh-transcript-processing |
| description | Clean raw workshop or meeting transcripts from small talk, filler words, and off-topic tangents. Extract and structure business-relevant content into a standardized format with discussion topics, key decisions, action items, and open questions. |
| user-invocable | false |
Transcript Processing
This skill helps you clean raw workshop or meeting transcripts and produce a structured, business-relevant document. It removes noise (small talk, greetings, filler words, off-topic tangents) while preserving all actionable and business-critical discussion points.
Transcript Processing Process
Use the checklist below and track your progress:
Processing progress:
- [ ] Step 1: Identify transcript format and meeting metadata
- [ ] Step 2: Identify and tag participants
- [ ] Step 3: Remove non-business content
- [ ] Step 4: Group remaining content by discussion topics
- [ ] Step 5: Extract key decisions
- [ ] Step 6: Extract action items and open questions
- [ ] Step 7: Preserve critical raw context
- [ ] Step 8: Save the cleaned transcript
Step 1: Identify transcript format and meeting metadata
Determine the format of the raw transcript:
- Speaker-labelled transcript (e.g.,
[Speaker Name]: text)
- Plain text notes without speaker labels
- Timestamped transcript (e.g.,
[00:12:34] text)
- Mixed format
- PDF document (transcript or meeting notes exported/provided as PDF — use
pdf-reader tool to extract text content first)
Extract meeting metadata where available:
- Date and time of the workshop
- Duration (if timestamps are present, calculate from first to last entry)
- Workshop topic or title
- Context (e.g., "Discovery workshop for Project X, Sprint 3 planning")
If metadata is not explicitly stated in the transcript, ask the user to provide it.
Step 2: Identify and tag participants
Scan the transcript for participant names or speaker labels. For each participant:
- Note their name or identifier
- Infer their role if mentioned or obvious from context (e.g., "Product Owner", "Developer", "Client stakeholder")
- If roles are not clear, list participants without roles — do not guess
Step 3: Remove non-business content
Systematically identify and remove:
- Greetings and sign-offs: "Hi everyone", "Let's wrap up", "Have a good weekend"
- Small talk: Weather, personal anecdotes, unrelated banter
- Filler words and verbal tics: "um", "uh", "you know", "like" (when used as filler)
- Technical difficulties discussion: "Can you hear me?", "Let me share my screen", "You're on mute"
- Off-topic tangents: Discussions clearly unrelated to the workshop's purpose
- Repetitive restatements: When the same point is made multiple times, keep the clearest version
Important: When in doubt about whether content is business-relevant, keep it. It is better to preserve potentially useful context than to accidentally remove important information.
Step 4: Group remaining content by discussion topics
Analyze the cleaned content and organize it into logical discussion topics:
- Identify natural topic boundaries (when the conversation shifts to a new subject)
- Create descriptive topic headings that summarize the theme
- Under each topic, list the key points discussed as bullet points
- Attribute points to speakers when speaker labels are available
- Maintain chronological order within each topic
- If a topic spans multiple disconnected parts of the transcript, consolidate them under one heading
Step 5: Extract key decisions
Review the structured content and identify explicit and implicit decisions:
- Explicit decisions: Statements like "We agreed to...", "The decision is...", "Let's go with..."
- Implicit decisions: When discussion converges on a direction without formal declaration
- For each decision, note:
- What was decided
- Who made or endorsed the decision (if clear)
- Any conditions or caveats attached
Step 6: Extract action items and open questions
Scan for action items:
- Commitments to do something (e.g., "I'll prepare the wireframes by Friday")
- Assigned tasks (e.g., "Can you check the API documentation?")
- For each action item, note: what, who (if assigned), and any deadline mentioned
Scan for open questions:
- Unanswered questions raised during the workshop
- Items deferred for later discussion ("We'll revisit this next week")
- Ambiguities that were acknowledged but not resolved
Step 7: Preserve critical raw context
Identify and preserve exact quotes or passages where the original wording is important:
- Requirements stated in specific business language
- Constraints or limitations mentioned by stakeholders
- Conflicting viewpoints that need to be captured verbatim
- Domain-specific terminology defined or explained during the workshop
Place these in a "Preserved Context" section with attribution to the speaker.
Step 8: Save the cleaned transcript
Generate the final output following the ./cleaned-transcript.example.md template.
Save the file to specifications/<workshop-name>/cleaned-transcript.md.
Review the output to ensure:
- No business-relevant content was accidentally removed
- Topics are logically grouped and clearly labelled
- Decisions, action items, and open questions are complete
- The document is readable and useful as a standalone reference
Connected Skills
tsh-task-extracting - uses the cleaned transcript as a primary input for identifying epics and stories