| name | audio-script-writer |
| description | Convert written medical content into podcast or video scripts optimized for audio delivery. Transforms academic papers, reports, and educational materials into engaging spoken-word formats with pronunciation guides, timing markers, and audio-friendly structure. |
| allowed-tools | ["Read","Write","Bash","Edit"] |
| license | MIT |
| metadata | {"skill-author":"AIPOCH"} |
Audio Script Writer
Overview
Content transformation tool that converts written medical and scientific materials into professionally structured audio scripts suitable for podcasts, educational videos, audiobooks, and voiceover narration.
Key Capabilities:
- Format Conversion: Research papers → podcast scripts
- Spoken Word Optimization: Sentence restructuring for listening
- Pronunciation Guides: Medical terminology phonetic spelling
- Timing Estimation: Duration calculations for production planning
- Multi-Format Output: Podcast, video, lecture, audiobook templates
- Voice Direction: Tone, pace, and emphasis cues for narrators
When to Use
✅ Use this skill when:
- Creating medical education podcasts from journal articles
- Converting conference presentations to video scripts
- Developing audiobook versions of medical textbooks
- Scripting patient education audio materials
- Producing research summary videos for social media
- Adapting written case reports for audio case studies
- Creating voiceover scripts for e-learning modules
❌ Do NOT use when:
- Live presentation without script → Use improvisation
- Highly visual content (surgery videos) → Use visual-focused tools
- Interactive audio (Q&A format) → Use dialogue scripting tools
- Music or sound design planning → Use audio production software
- Voice recording itself → This creates scripts, not audio
Integration:
- Upstream:
abstract-summarizer (content condensation), lay-summary-gen (patient-friendly language)
- Downstream:
medical-translation (multi-language scripts), voice-cloning-tool (AI narration)
Core Capabilities
1. Spoken Word Transformation
Convert written text to conversational audio style:
from scripts.audio_writer import AudioScriptWriter
writer = AudioScriptWriter()
script = writer.convert_to_audio(
source_text=research_paper,
format="podcast",
target_audience="medical_students",
duration_minutes=15
)
print(script.spoken_text)
Transformation Rules:
| Written Style | Audio Style | Example |
|---|
| "Furthermore" | "Plus" | Less formal transitions |
| " et al." | "and their colleagues" | Expand abbreviations |
| Numbers in text | Spoken numbers | "15%" → "15 percent" |
| Long sentences | 15-20 word max | Break into digestible chunks |
| Passive voice | Active voice | "was observed" → "we saw" |
| Citations | Omit or footnote | "(Smith et al., 2024)" → [reference tone] |
2. Pronunciation Guide Generation
Create phonetic spelling for medical terms:
pronunciation = writer.create_pronunciation_guide(
text=script,
include_phonetic=True,
include_syllables=True
)
Guide Elements:
- Phonetic Spelling: IPA or simplified phonetics
- Syllable Breaks: hy-per-ten-sion
- Emphasis Marking: Primary stress (CAPS), secondary stress
- Alternative Pronunciations: Regional variations (UK vs US)
- Sound-Alikes: "rhymes with..." for difficult terms
3. Timing and Pacing
Calculate speaking duration and mark pacing cues:
timing = writer.calculate_timing(
script=script,
speaking_rate="conversational",
include_pauses=True
)
print(f"Estimated duration: {timing.duration_minutes} minutes")
print(f"Word count: {timing.word_count}")
print(f"Pace: {timing.words_per_minute} WPM")
Speaking Rates:
| Style | WPM | Use Case |
|---|
| Slow/Educational | 120-130 | Patient education, complex topics |
| Conversational | 140-160 | Podcasts, general audience |
| Fast/News | 170-190 | Time-constrained content |
| Variable | Varies | Dynamic pacing with pauses |
Pacing Cues:
[BREATHE] - Brief pause for narrator
[PAUSE 2s] - Two-second pause for emphasis
[SLOW DOWN] - Reduce pace for key point
[SPEED UP] - Increase energy/excitement
[BEAT] - Dramatic pause
4. Multi-Format Templates
Generate scripts for different audio formats:
podcast = writer.create_podcast_script(
content=article,
episode_format="interview",
include_intro_music=True,
ad_breaks=[5, 12]
)
video = writer.create_video_script(
content=lecture_slides,
visual_cues=True,
b_roll_notes=True
)
Format Types:
| Format | Characteristics | Best For |
|---|
| Podcast | Conversational, segments, ads | Long-form content, interviews |
| Video | Visual cues, B-roll notes | YouTube, educational platforms |
| Lecture | Structured, Q&A breaks | Online courses, training |
| Audiobook | Chapter markers, consistent tone | Textbooks, memoirs |
| News | Tight, factual, quick | Research briefs, updates |
Common Patterns
Pattern 1: Research Paper to Podcast
Scenario: Convert published study to 15-minute podcast episode.
python scripts/main.py \
--input paper.pdf \
--format podcast \
--duration 15 \
--style conversational \
--include-intro-outro \
--output podcast_script.txt
python scripts/main.py \
--input podcast_script.txt \
--generate-pronunciation \
--output pronunciation_guide.txt
Structure:
[INTRO MUSIC 5s]
HOST: Welcome to Medical Research Today. I'm your host...
[BREATHE]
HOST: Today we're diving into a fascinating study about...
[PAUSE]
HOST: So what did the researchers find? Well...
[BREATHE]
HOST: Dr. Smith, one of the study authors, explains...
[SOUND BITE: Interview clip]
...
[OUTRO MUSIC]
Pattern 2: Medical Lecture Recording
Scenario: Convert lecture notes to video script for online course.
lecture = writer.create_lecture_script(
notes=lecture_content,
duration=45,
break_intervals=[15, 30],
interaction_points=True
)
script = writer.add_visual_cues(
script=lecture,
slide_transitions=True,
animation_notes=True
)
Lecture Elements:
- Learning objectives at start
- Periodic comprehension checks
- Break reminders
- Transition phrases between topics
- Summary and key takeaways
Pattern 3: Patient Education Audio
Scenario: Create audio guide for diabetes management.
patient_script = writer.create_patient_script(
medical_content=diabetes_guide,
reading_level=6,
empathetic_tone=True,
key_points_highlighted=True
)
patient_script.adjust_pacing(
wpm=130,
pause_after_sentences=1.5
)
Patient Script Features:
- Simple language (avoid medical jargon)
- Empathetic tone
- Clear action steps
- Reassuring statements
- Repetition of key points
Pattern 4: Conference Presentation to Video
Scenario: Adapt live presentation to YouTube video format.
python scripts/main.py \
--input presentation_transcript.txt \
--format video \
--platform youtube \
--include-hooks true \
--engagement-cues true \
--output youtube_script.txt
YouTube Optimization:
- Hook in first 30 seconds
- Engagement questions for comments
- Call to action (subscribe, like)
- Timestamp markers for chapters
- B-roll suggestions for visual interest
Complete Workflow Example
From research paper to published podcast:
python scripts/main.py \
--input paper.pdf \
--extract-key-points \
--output key_points.txt
python scripts/main.py \
--input key_points.txt \
--format podcast \
--duration 20 \
--output raw_script.txt
python scripts/main.py \
--input raw_script.txt \
--add-music-cues \
--add-sound-effects \
--add-pacing-marks \
--output production_script.txt
python scripts/main.py \
--input production_script.txt \
--generate-pronunciation \
--output pronunciations.txt
python scripts/main.py \
--input production_script.txt \
--calculate-timing \
--output timing_breakdown.txt
Python API:
from scripts.audio_writer import AudioScriptWriter
from scripts.pronunciation import PronunciationGuide
from scripts.timing import TimingCalculator
writer = AudioScriptWriter()
pronouncer = PronunciationGuide()
timing = TimingCalculator()
with open("research_article.txt", "r") as f:
content = f.read()
script = writer.convert_to_audio(
text=content,
format="podcast",
target_duration=15,
audience="general_medical"
)
script_with_cues = writer.add_production_cues(
script=script,
music_stings=True,
transition_effects=True
)
medical_terms = pronouncer.extract_terms(script_with_cues)
pronunciation_guide = pronouncer.create_guide(medical_terms)
timing_analysis = timing.calculate(
script=script_with_cues,
speaking_rate=150
)
writer.export_production_package(
script=script_with_cues,
pronunciation=pronunciation_guide,
timing=timing_analysis,
output_dir="podcast_production/"
)
Quality Checklist
Content Quality:
Audio Optimization:
Production Quality:
Before Recording:
Common Pitfalls
Content Issues:
-
❌ Too dense → Information overload for listeners
- ✅ Break complex topics into multiple episodes
-
❌ Visual dependencies → "As shown in Figure 3..."
- ✅ Describe visuals or omit visual-dependent content
-
❌ Citation overload → Every sentence has reference
- ✅ Save citations for show notes, not narration
Audio Issues:
-
❌ Written-style language → "Furthermore, the aforementioned..."
- ✅ Conversational: "Plus, this thing we talked about..."
-
❌ No pauses → Relentless information delivery
- ✅ Build in breathing room; let points sink in
-
❌ Ignoring pronunciation → Mispronounced medical terms
- ✅ Research and practice all technical terms
Production Issues:
References
Available in references/ directory:
audio_writing_best_practices.md - Broadcast writing guidelines
medical_pronunciation_guide.md - Common terms phonetics
podcast_production_standards.md - Industry format standards
accessibility_guidelines.md - Inclusive audio content
platform_requirements.md - YouTube, Spotify, Apple specs
voice_care_tips.md - Narrator health and performance
Scripts
Located in scripts/ directory:
main.py - CLI interface for script conversion
audio_writer.py - Core text-to-audio transformation
pronunciation.py - Medical terminology phonetics
timing.py - Duration calculation and pacing
format_templates.py - Podcast, video, lecture templates
voice_direction.py - Narrator cues and direction
accessibility.py - Alternative format generation
Limitations
- Voice Performance: Script is text only; actual delivery varies by narrator
- Accent Variations: Pronunciation guides may not match all dialects
- Cultural Context: Humor and references may not translate across cultures
- Copyright: Cannot use copyrighted material without permission
- Technical Accuracy: Does not verify medical content (input-dependent)
- Live Elements: Cannot script unscripted interviews or Q&A
Parameters
| Parameter | Type | Default | Required | Description |
|---|
--input, -i | string | - | No | Input text file path |
--output, -o | string | - | No | Output JSON file path (default: stdout) |
--text | string | - | No | Direct text input (alternative to --input) |
--duration, -d | int | 5 | No | Target duration in minutes |
--pace, -p | string | normal | No | Speaking pace (slow, normal, fast) |
--style, -s | string | conversational | No | Script style (conversational, formal, educational) |
Usage
Basic Usage
python scripts/main.py --input article.txt --duration 5 --output script.json
python scripts/main.py --text "Medical research findings..." --duration 3
cat article.txt | python scripts/main.py --duration 5 --style conversational
python scripts/main.py --input paper.txt --style educational --pace slow
Risk Assessment
| Risk Indicator | Assessment | Level |
|---|
| Code Execution | Python script executed locally | Low |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Low |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output saved only to specified location | Low |
Security Checklist
Prerequisites
Evaluation Criteria
Success Metrics
Test Cases
- Basic Conversion: Convert text file → Returns audio script with metadata
- Abbreviation Handling: Text with "e.g., i.e., etc." → All expanded in output
- Number Conversion: Input with "1 in 4" → Output with "one in four"
Lifecycle Status
- Current Stage: Draft
- Next Review Date: 2026-03-06
- Known Issues: None
- Planned Improvements:
- Add support for custom abbreviation dictionaries
- Integrate with text-to-speech engines
- Add multilingual support
🎙️ Pro Tip: The best audio scripts sound natural when spoken. Always read your script aloud before finalizing—if you stumble over a sentence, your narrator will too. Revise for the ear, not the eye.