Name: Tts Voiceover
Author: microsoft

// Text-to-speech voice-over generation from YAML speaker notes using Azure Speech SDK with SSML pronunciation control - Brought to you by microsoft/hve-core

stars:1,078

forks:184

updated:May 20, 2026 at 23:30

related-skills.json

same repository

powerpoint.md

from "microsoft/hve-core"

PowerPoint slide deck generation and management using python-pptx with YAML-driven content and styling - Brought to you by microsoft/hve-core

2026-05-12 · 1.1k

gh-code-scanning.md

from "microsoft/hve-core"

Retrieves and groups GitHub code scanning alerts by rule and severity using the gh CLI - Brought to you by microsoft/hve-core

2026-04-30 · 1.1k

owasp-docker.md

from "microsoft/hve-core"

OWASP Docker Top 6 vulnerability knowledge base for identifying, assessing, and remediating security risks in containerized Docker environments - Brought to you by microsoft/hve-core.

2026-04-24 · 1.1k

customer-card-render.md

from "microsoft/hve-core"

Generate customer-card PowerPoint content YAML from Design Thinking canonical artifacts and build using the shared PowerPoint skill pipeline - Brought to you by microsoft/hve-core

2026-04-23 · 1.1k

hve-core-installer.md

from "microsoft/hve-core"

Decision-driven installer for HVE-Core with 6 clone-based installation methods, extension quick-install, environment detection, and agent customization workflows - Brought to you by microsoft/hve-core

2026-04-22 · 1.1k

pr-reference.md

from "microsoft/hve-core"

Generates PR reference XML containing commit history and unified diffs between branches with extension and path filtering. Includes utilities to list changed files by type and read diff chunks. Use when creating pull request descriptions, preparing code reviews, analyzing branch changes, discovering work items from diffs, or generating structured diff summaries. - Brought to you by microsoft/hve-core

2026-04-21 · 1.1k

package.json

"author": "microsoft"

"repository": "microsoft/hve-core"

name	tts-voiceover
description	Text-to-speech voice-over generation from YAML speaker notes using Azure Speech SDK with SSML pronunciation control - Brought to you by microsoft/hve-core
metadata	{"authors":"microsoft/hve-core","spec_version":"1.0"}

TTS Voice Over Skill

Generates per-slide WAV voice-over files from YAML speaker_notes using Azure Speech SDK with SSML pronunciation control.

Overview

This skill reads content.yaml files from a PowerPoint skill content directory, extracts the speaker_notes fields, applies SSML acronym aliases so that technical terms are pronounced correctly, and produces one WAV file per slide. A dry-run mode prints the SSML templates for verification without requiring Azure credentials.

Prerequisites

Azure Speech resource — Free tier provides 500K characters per month.
Authentication — Key-based (SPEECH_KEY) or Microsoft Entra ID (SPEECH_RESOURCE_ID).
Python 3.11+ with uv for virtual environment management.

Key-Based Auth

export SPEECH_KEY="your-speech-key"
export SPEECH_REGION="eastus"

Microsoft Entra ID Auth

Requires a custom domain on the Speech resource and the Cognitive Services Speech User role assigned to the identity that runs the script.

export SPEECH_RESOURCE_ID="/subscriptions/.../Microsoft.CognitiveServices/accounts/your-resource"
export SPEECH_REGION="eastus"
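
The two authentication modes above can be told apart from the environment alone. The following is a minimal sketch of that selection, not the logic of generate_voiceover.py itself. The names `SpeechAuth` and `select_auth`, the error messages, and the rule that SPEECH_KEY takes precedence when both variables are set are assumptions made for illustration.

```python
import os
from typing import Mapping, NamedTuple, Optional


class SpeechAuth(NamedTuple):
    """Hypothetical container for the chosen auth mode."""
    mode: str        # "key" or "entra"
    credential: str  # the speech key or the resource ID
    region: str


def select_auth(env: Optional[Mapping[str, str]] = None) -> SpeechAuth:
    """Pick an auth mode from SPEECH_KEY / SPEECH_RESOURCE_ID / SPEECH_REGION.

    Assumption: SPEECH_KEY wins when both credentials are present.
    """
    env = os.environ if env is None else env
    region = env.get("SPEECH_REGION", "").strip()
    key = env.get("SPEECH_KEY", "").strip()
    resource_id = env.get("SPEECH_RESOURCE_ID", "").strip()

    if not region:
        raise RuntimeError("SPEECH_REGION is not set")
    if key:
        return SpeechAuth("key", key, region)
    if resource_id:
        return SpeechAuth("entra", resource_id, region)
    raise RuntimeError("neither SPEECH_KEY nor SPEECH_RESOURCE_ID is set")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.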

Install dependencies:

# run from this skill folder
uv sync

Quick Start

Verify SSML templates without generating audio:

uv run scripts/generate_voiceover.py --dry-run --content-dir path/to/content

Generate voice-over WAV files:

uv run scripts/generate_voiceover.py --content-dir path/to/content --output-dir voice-over

Embed audio into a PPTX deck:

uv run scripts/embed_audio.py --input deck.pptx --audio-dir voice-over --output deck-narrated.pptx

Parameters Reference

generate_voiceover.py

Parameter	Type	Default	Description
`--dry-run`	flag	`false`	Print SSML templates without generating audio
`--voice`	string	`en-US-Andrew:DragonHDLatestNeural`	Azure TTS voice name
`--rate`	string	`+10%`	Speech prosody rate
`--content-dir`	path	`content`	Path to slide content directory
`--output-dir`	path	`voice-over`	Path to WAV output directory
`--lexicon`	path	(auto-detect)	Custom acronyms.yaml path
`--verbose` / `-v`	flag	`false`	Enable verbose (DEBUG) logging output
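
For readers who want to see how the table above maps onto a command-line interface, here is a sketch using argparse. It assumes the script uses argparse, which the source does not state; `build_parser` is a hypothetical name, and only the flags and defaults are taken from the table.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags and defaults mirror the parameter table."""
    parser = argparse.ArgumentParser(prog="generate_voiceover.py")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print SSML templates without generating audio")
    parser.add_argument("--voice", default="en-US-Andrew:DragonHDLatestNeural",
                        help="Azure TTS voice name")
    parser.add_argument("--rate", default="+10%",
                        help="Speech prosody rate")
    parser.add_argument("--content-dir", type=Path, default=Path("content"),
                        help="Path to slide content directory")
    parser.add_argument("--output-dir", type=Path, default=Path("voice-over"),
                        help="Path to WAV output directory")
    # None stands for the "(auto-detect)" default in the table.
    parser.add_argument("--lexicon", type=Path, default=None,
                        help="Custom acronyms.yaml path")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable verbose (DEBUG) logging output")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```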

embed_audio.py

Embeds WAV files into corresponding PPTX slides and adds narration timing XML so PowerPoint recognizes the audio for video export via File > Export > Create a Video > Use Recorded Timings and Narrations.

Parameter	Type	Default	Description
`--input`	path	(required)	Source PPTX file path
`--audio-dir`	path	`voice-over`	Directory with slide-NNN.wav
`--output`	path	`*-narrated.pptx`	Output PPTX file path
`--verbose` / `-v`	flag	`false`	Enable verbose (DEBUG) logging output
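
Two conventions in the table above lend themselves to a short illustration: the `*-narrated.pptx` default and the slide-NNN.wav naming. The sketch below is not taken from embed_audio.py. It assumes the default output sits next to the input file and that NNN is the 1-based slide number padded to three digits; `default_output_path` and `index_audio_files` are hypothetical helper names.

```python
import re
from pathlib import Path
from typing import Dict

# Matches slide-001.wav, slide-042.wav, ...
_WAV_NAME = re.compile(r"^slide-(\d{3})\.wav$")


def default_output_path(input_pptx: Path) -> Path:
    """deck.pptx -> deck-narrated.pptx in the same directory (assumed)."""
    return input_pptx.with_name(f"{input_pptx.stem}-narrated.pptx")


def index_audio_files(audio_dir: Path) -> Dict[int, Path]:
    """Map slide numbers to their WAV files, ignoring unrelated files."""
    found: Dict[int, Path] = {}
    for path in sorted(audio_dir.iterdir()):
        match = _WAV_NAME.match(path.name)
        if match and path.is_file():
            found[int(match.group(1))] = path
    return found
```

With such an index, slides without a matching WAV file can simply be left untouched.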

Script Reference

Generate with custom voice and rate:

uv run scripts/generate_voiceover.py \
  --content-dir content \
  --output-dir voice-over \
  --voice "en-US-Jenny:DragonHDLatestNeural" \
  --rate "+5%"

Use a custom lexicon:

uv run scripts/generate_voiceover.py \
  --content-dir content \
  --lexicon custom-acronyms.yaml

Embed generated audio:

uv run scripts/embed_audio.py \
  --input slide-deck/presentation.pptx \
  --audio-dir voice-over \
  --output slide-deck/presentation-narrated.pptx

Acronym Lexicon

The lexicon controls SSML <sub alias> replacements for acronyms and technical terms. Create an acronyms.yaml file:

acronyms:
  HVE-Core: "H V E Core"
  OWASP: "Oh wasp"
  SBOM: "S Bomb"
  SLSA: "Salsa"
  CI/CD: "C I C D"

Lexicon resolution order:

1. Path specified via --lexicon argument.
2. acronyms.yaml in the content directory.
3. Built-in defaults covering common technical acronyms.
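
The resolution order above can be sketched as a small function. This is an illustration, not the script's code: `resolve_lexicon` is a hypothetical name, returning `None` to signal "use the built-in defaults" is an assumption, and so is raising an error when an explicit `--lexicon` path does not exist.

```python
from pathlib import Path
from typing import Optional


def resolve_lexicon(cli_path: Optional[Path], content_dir: Path) -> Optional[Path]:
    """Return the lexicon file to load, or None for the built-in defaults."""
    # 1. An explicit --lexicon path always wins.
    if cli_path is not None:
        if not cli_path.is_file():
            raise FileNotFoundError(f"Lexicon not found: {cli_path}")
        return cli_path
    # 2. Otherwise look for acronyms.yaml in the content directory.
    candidate = content_dir / "acronyms.yaml"
    if candidate.is_file():
        return candidate
    # 3. Nothing found: the caller falls back to built-in defaults.
    return None
```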

SSML Template

Each slide produces an SSML document:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
 xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
  <voice name="en-US-Andrew:DragonHDLatestNeural">
    <prosody rate="+10%">
      Text with <sub alias="Oh wasp">OWASP</sub> aliases applied.
    </prosody>
  </voice>
</speak>
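
To make the template concrete, the sketch below shows one way to produce such a document from narration text and an acronym mapping. It is an assumption-laden illustration rather than the script's implementation: `apply_aliases` and `build_ssml` are hypothetical names, and the matching rule (whole terms only, longest term first, case-sensitive) is a design choice made here, not something the source specifies.

```python
import re
from typing import Mapping
from xml.sax.saxutils import escape, quoteattr


def apply_aliases(text: str, acronyms: Mapping[str, str]) -> str:
    """XML-escape text and wrap known terms in <sub alias="...">."""
    if not acronyms:
        return escape(text)
    # Longest terms first so a short key cannot shadow a longer one.
    terms = sorted(acronyms, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9])("
        + "|".join(re.escape(t) for t in terms)
        + r")(?![A-Za-z0-9])"
    )
    parts = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches, then emit the alias element.
        parts.append(escape(text[pos:match.start()]))
        term = match.group(1)
        parts.append(f"<sub alias={quoteattr(acronyms[term])}>{escape(term)}</sub>")
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


def build_ssml(text: str, acronyms: Mapping[str, str],
               voice: str = "en-US-Andrew:DragonHDLatestNeural",
               rate: str = "+10%") -> str:
    """Fill the SSML template shown above with aliased narration text."""
    body = apply_aliases(text, acronyms)
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"\n'
        ' xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">\n'
        f'  <voice name={quoteattr(voice)}>\n'
        f'    <prosody rate={quoteattr(rate)}>\n'
        f'      {body}\n'
        '    </prosody>\n'
        '  </voice>\n'
        '</speak>'
    )
```

Escaping the narration before it is embedded matters because speaker notes often contain characters such as `&` or `<`, which would otherwise make the SSML invalid.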

Integration with PowerPoint Skill

This skill reads from the PowerPoint skill's content directory structure:

content/
├── slide-001/
│   └── content.yaml    # Must include speaker_notes: field
├── slide-002/
│   └── content.yaml
└── ...

Each content.yaml should contain a speaker_notes: field with the narration text. Each generated WAV file is named slide-NNN.wav, matching the name of the slide directory it was produced from.
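
The directory-to-file naming described above can be sketched as follows. `plan_outputs` is a hypothetical helper, and skipping directories that lack a content.yaml is an assumption; parsing the YAML itself is left out so the example depends only on the standard library.

```python
from pathlib import Path
from typing import List, Tuple


def plan_outputs(content_dir: Path, output_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each slide-NNN/content.yaml with its target slide-NNN.wav."""
    pairs: List[Tuple[Path, Path]] = []
    # Zero-padded names sort correctly as plain strings.
    for slide_dir in sorted(content_dir.glob("slide-*")):
        content_file = slide_dir / "content.yaml"
        if slide_dir.is_dir() and content_file.is_file():
            pairs.append((content_file, output_dir / f"{slide_dir.name}.wav"))
    return pairs
```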

Troubleshooting

Issue	Solution
`Set SPEECH_KEY ... or SPEECH_RESOURCE_ID`	Export `SPEECH_KEY` (key auth) or `SPEECH_RESOURCE_ID` (Entra ID) with `SPEECH_REGION`.
401 with Entra ID auth	Verify custom domain on the Speech resource and `Cognitive Services Speech User` role. RBAC propagation takes up to 5 minutes.
Empty WAV files or skipped slides	Verify `speaker_notes:` is present and non-empty in `content.yaml`.
Mispronounced acronyms	Add entries to `acronyms.yaml` with phonetic aliases.
`azure-cognitiveservices-speech package is required`	Run `uv sync` in the skill directory.
Audio icon visible in PPTX	Reposition or resize the audio object in PowerPoint after embedding.
Authored slide animations missing after embedding	`embed_audio.py` replaces existing `p:timing` with narration timing; re-apply animations in PowerPoint after embedding audio.
Slides no longer advance on click after embedding	`embed_audio.py` sets `advClick="0"` for auto-advance. To re-enable, select all slides in PowerPoint and check Advance Slide > On Mouse Click in the Transitions tab.
Video export shows "No timings recorded"	Re-embed audio with the updated `embed_audio.py` which adds narration timing XML automatically.
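
When diagnosing the "Empty WAV files or skipped slides" row, it can help to list the affected files before checking the YAML. The sketch below uses only Python's standard wave module. `find_suspect_wavs` is a hypothetical helper, not part of the skill, and it assumes the output files are PCM WAV, which is the format the wave module reads.

```python
import wave
from pathlib import Path
from typing import List


def find_suspect_wavs(audio_dir: Path) -> List[Path]:
    """Return slide-*.wav files with zero audio frames or an unreadable header."""
    suspects: List[Path] = []
    for path in sorted(audio_dir.glob("slide-*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                if wav.getnframes() == 0:
                    suspects.append(path)
        except (wave.Error, EOFError):
            # Zero-byte, truncated, or non-PCM files end up here.
            suspects.append(path)
    return suspects
```

Each reported file corresponds to a slide directory whose speaker_notes: field is worth checking first.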

Brought to you by microsoft/hve-core

🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.
