vocal

Stars: 1

Forks: 0

Updated: May 25, 2026, 23:04

Speak text aloud (TTS) and transcribe speech (STT). Supports local (macOS say, mlx-whisper) and cloud (ElevenLabs) providers. Use when user asks to speak, read aloud, listen, transcribe, or use vocal.


Source: fairchild/dotclaude (GitHub)

SKILL.md

name	vocal
description	Speak text aloud (TTS) and transcribe speech (STT). Supports local (macOS say, mlx-whisper) and cloud (ElevenLabs) providers. Use when user asks to speak, read aloud, listen, transcribe, or use vocal.
license	Apache-2.0
disable-model-invocation	true
metadata	{"status":"experimental","experimental_reason":"Voice workflows depend on local audio devices and optional ElevenLabs credentials, so reliability is environment-sensitive."}

Vocal

Speak text aloud and transcribe speech with local and cloud providers. User-invocable only (/vocal): audio is a side effect, so the skill should never be triggered automatically.

Usage

`/vocal` — turn-based vocal loop

Runs an ask-aloud / listen / respond / keep-listening cycle using the vocal-listener background agent.

/vocal What should we work on next?

Optional inline config:

/vocal stt=local tts=local duration=8 What should we work on next?
/vocal stt=elevenlabs tts=elevenlabs duration=10 Ready when you are.

Loop behavior

Parse inline config from the command text:
- stt=local|elevenlabs (default: local)
- tts=local|elevenlabs (default: match stt)
- duration=<seconds> (default: 8)
- Remaining text becomes the first spoken prompt.
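
The parsing rules above can be sketched in Python. This is only an illustration of the stated defaults, not the skill's actual parser: `VocalConfig` and `parse_vocal_args` are hypothetical names, and the choice to treat only leading `key=value` tokens as options is an assumption.

```python
from dataclasses import dataclass

VALID_PROVIDERS = {"local", "elevenlabs"}
KNOWN_KEYS = {"stt", "tts", "duration"}


@dataclass
class VocalConfig:
    """Hypothetical container for the parsed /vocal options."""
    stt: str = "local"
    tts: str = "local"
    duration: int = 8
    prompt: str = ""


def parse_vocal_args(text: str) -> VocalConfig:
    """Split leading key=value tokens from the first spoken prompt."""
    tokens = text.split()
    if tokens and tokens[0] == "/vocal":
        tokens = tokens[1:]  # tolerate the command name itself

    options = {}
    index = 0
    # Assumption: options only count while they lead the command text.
    while index < len(tokens):
        key, sep, value = tokens[index].partition("=")
        if not sep or key not in KNOWN_KEYS:
            break
        options[key] = value
        index += 1

    stt = options.get("stt", "local")
    tts = options.get("tts", stt)  # tts defaults to whatever stt uses
    for name, provider in (("stt", stt), ("tts", tts)):
        if provider not in VALID_PROVIDERS:
            raise ValueError(f"{name} must be local or elevenlabs, got {provider!r}")

    duration = int(options.get("duration", 8))
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")

    # Whatever is left becomes the first spoken prompt.
    return VocalConfig(stt=stt, tts=tts, duration=duration,
                       prompt=" ".join(tokens[index:]))
```

For example, `parse_vocal_args("stt=elevenlabs duration=10 Ready when you are.")` yields ElevenLabs for both STT and TTS, a 10-second listen window, and the prompt "Ready when you are."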

Validate selected providers before starting (run only the checks needed):

uv run ~/.claude/skills/vocal/scripts/stt_local.py --check
uv run ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --check
uv run ~/.claude/skills/vocal/scripts/tts_local.py --check
uv run ~/.claude/skills/vocal/scripts/tts_elevenlabs.py --check
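
One way to run only the checks that are needed is to derive the commands from the selected providers. The sketch below only builds the command lists from the script paths shown above; `check_commands` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path

# Script location as used in the commands above.
SCRIPTS_DIR = Path("~/.claude/skills/vocal/scripts")
VALID_PROVIDERS = ("local", "elevenlabs")


def check_commands(stt: str, tts: str) -> list[list[str]]:
    """Return one `--check` command per selected provider script."""
    for name, provider in (("stt", stt), ("tts", tts)):
        if provider not in VALID_PROVIDERS:
            raise ValueError(f"{name} must be local or elevenlabs, got {provider!r}")

    scripts = [f"stt_{stt}.py", f"tts_{tts}.py"]
    base = SCRIPTS_DIR.expanduser()
    return [["uv", "run", str(base / script), "--check"] for script in scripts]
```

Each returned list could be passed to `subprocess.run`. How the scripts report a failed check (exit code or printed output) is not stated here, so confirm that before relying on it.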

Launch the listener. Create or reuse a team named vocal and launch vocal-listener as a background task with config:

stt_provider=<local|elevenlabs>
duration_seconds=<duration>
continue_token=keep-listening
stop_token=stop-listening

Speak the first prompt aloud (if provided). If none is provided, speak: "Vocal mode active. I'm listening."

For every listener message starting with [voice-input]:
- Treat the transcript as the user turn.
- Produce a concise assistant response.
- Speak the response with the selected TTS provider.
- Send keep-listening to the listener agent.

Stop conditions:
- Transcript asks to stop (e.g. "stop vocal mode", "goodbye", "exit vocal"): speak a confirmation and send stop-listening.
- Listener reports [voice-error]: surface the error and pause vocal mode.

Turn-based, not full-duplex realtime. Each listen cycle is a separate background agent turn. Keep spoken responses short unless the user asks for detail.
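
The per-message handling described above can be pictured as a small dispatcher that turns one listener message into an ordered list of actions. This is a sketch under assumptions: `handle_listener_message`, the action names, the confirmation wording, and the substring match on stop phrases are all hypothetical. In the skill itself the assistant judges whether a transcript asks to stop.

```python
STOP_PHRASES = ("stop vocal mode", "goodbye", "exit vocal")
CONTINUE_TOKEN = "keep-listening"
STOP_TOKEN = "stop-listening"
INPUT_TAG = "[voice-input]"
ERROR_TAG = "[voice-error]"


def handle_listener_message(message, respond):
    """Map one listener message to a list of (action, payload) pairs.

    `respond` is any callable that turns a transcript into a short reply.
    """
    if message.startswith(ERROR_TAG):
        detail = message[len(ERROR_TAG):].strip()
        return [("surface_error", detail), ("pause", "")]

    if not message.startswith(INPUT_TAG):
        return []  # not a voice event, nothing to do

    transcript = message[len(INPUT_TAG):].strip()

    # Assumption: a plain substring match stands in for "asks to stop".
    if any(phrase in transcript.lower() for phrase in STOP_PHRASES):
        return [("speak", "Stopping vocal mode."), ("send", STOP_TOKEN)]

    reply = respond(transcript)
    return [("speak", reply), ("send", CONTINUE_TOKEN)]
```

Returning actions instead of performing them keeps the sketch testable without a microphone or TTS provider. A substring match can misfire on ordinary sentences that happen to contain "goodbye", which is one reason to leave the real decision to the assistant.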

Web tuning console

uv run --script ~/.claude/skills/vocal/scripts/web_console.py

Open http://127.0.0.1:8765 to tune the skill from a local browser.

The console supports:

- TTS sample playback for local say and ElevenLabs
- Local and ElevenLabs voice listing
- Browser microphone recording and audio-file transcription
- Provider checks from the same scripts used by the skill
- Saved local defaults in skills/vocal/data/preferences.json

Options:

# Choose a port
uv run --script ~/.claude/skills/vocal/scripts/web_console.py --port 8799

# Use a private preference directory outside the skill checkout
VOCAL_DATA_DIR=~/Library/Application\ Support/vocal-skill \
  uv run --script ~/.claude/skills/vocal/scripts/web_console.py
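
To illustrate how a `VOCAL_DATA_DIR` override could interact with the default preferences location, here is a minimal sketch. It is not the console's actual code: `preferences_path` is a hypothetical helper, and both the default directory under `~/.claude` and the unchanged `preferences.json` file name inside the override directory are assumptions.

```python
import os
from pathlib import Path

# Assumption: the default data directory lives inside the skill checkout.
DEFAULT_DATA_DIR = Path("~/.claude/skills/vocal/data")


def preferences_path(environ=None) -> Path:
    """Resolve where saved defaults would live, honoring VOCAL_DATA_DIR."""
    environ = os.environ if environ is None else environ
    override = environ.get("VOCAL_DATA_DIR", "").strip()
    base = Path(override) if override else DEFAULT_DATA_DIR
    return base.expanduser() / "preferences.json"
```

Passing a plain dict such as `{"VOCAL_DATA_DIR": "/tmp/vocal"}` makes the resolution easy to test without touching the real environment.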

Local TTS (macOS `say`)

uv run --script ~/.claude/skills/vocal/scripts/tts_local.py --text "Hello Michael"

Examples:

# Save audio to file
uv run --script ~/.claude/skills/vocal/scripts/tts_local.py \
  --text "Build succeeded" \
  --voice Alex \
  --rate 200 \
  --output /tmp/build.aiff

# List macOS voices
uv run --script ~/.claude/skills/vocal/scripts/tts_local.py --list-voices

Local STT (mlx-whisper, Apple Silicon)

# Record microphone for 5 seconds and transcribe
uv run --script ~/.claude/skills/vocal/scripts/stt_local.py --duration 5

# Transcribe an existing file
uv run --script ~/.claude/skills/vocal/scripts/stt_local.py --file ./meeting.wav

# List input devices
uv run --script ~/.claude/skills/vocal/scripts/stt_local.py --list-devices

# Use a specific device
uv run --script ~/.claude/skills/vocal/scripts/stt_local.py --duration 5 --device 1

ElevenLabs TTS (cloud)

uv run --script ~/.claude/skills/vocal/scripts/tts_elevenlabs.py \
  --text "Hello Michael" \
  --voice George

Examples:

# Save and play the generated mp3
uv run --script ~/.claude/skills/vocal/scripts/tts_elevenlabs.py \
  --text "Deployment complete" \
  --model eleven_turbo_v2_5 \
  --output /tmp/deploy.mp3 \
  --play

ElevenLabs STT (Scribe v2)

# Record microphone for 5 seconds and transcribe
uv run --script ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --duration 5

# Transcribe an existing audio file
uv run --script ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --file ./call.wav

# List input devices
uv run --script ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --list-devices

# Use a specific device
uv run --script ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --duration 5 --device 1

Provider checks

uv run --script ~/.claude/skills/vocal/scripts/tts_local.py --check
uv run --script ~/.claude/skills/vocal/scripts/stt_local.py --check
uv run --script ~/.claude/skills/vocal/scripts/tts_elevenlabs.py --check
uv run --script ~/.claude/skills/vocal/scripts/stt_elevenlabs.py --check

Provider Comparison

| Provider | Mode | Latency | Quality | Cost |
| --- | --- | --- | --- | --- |
| `tts_local.py` | Local | Low | Good | Free |
| `stt_local.py` | Local | Medium (first run downloads model) | Good | Free |
| `tts_elevenlabs.py` | Cloud | Very low with flash model | Very high | Paid API |
| `stt_elevenlabs.py` | Cloud | Low | Very high | Paid API |

Environment Variables

| Variable | Required | Used by |
| --- | --- | --- |
| `ELEVENLABS_API_KEY` | Yes (cloud only) | `tts_elevenlabs.py`, `stt_elevenlabs.py` |
| `ELEVEN_LABS_API_KEY` | Accepted alias | `tts_elevenlabs.py`, `stt_elevenlabs.py` |

Set via ~/.env or shell export.

Recommended local setup:

# Preferred name
ELEVENLABS_API_KEY=your-key-here

# Accepted legacy alias
ELEVEN_LABS_API_KEY=your-key-here

Put one of those lines in ~/.env, then restart web_console.py. The vocal scripts load ~/.env automatically before checking the process environment.
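
The key lookup can be sketched as follows. This is an assumption-laden illustration, not the scripts' real loader: `parse_dotenv` and `resolve_api_key` are hypothetical names, and the precedence (values already in the process environment win over `~/.env`, and the preferred name wins over the alias) is a guess that the text above does not confirm.

```python
from typing import Optional

# Preferred name first, then the accepted alias.
KEY_NAMES = ("ELEVENLABS_API_KEY", "ELEVEN_LABS_API_KEY")


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(environ: dict, dotenv_text: str = "") -> Optional[str]:
    """Return the first non-empty key, or None when no key is configured."""
    # Assumption: the process environment overrides values from ~/.env.
    merged = {**parse_dotenv(dotenv_text), **environ}
    for name in KEY_NAMES:
        value = merged.get(name, "").strip()  # stray whitespace breaks keys
        if value:
            return value
    return None
```

Stripping whitespace here mirrors the troubleshooting advice that an otherwise valid key can be rejected when it carries stray spaces.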

Troubleshooting

Getting an ElevenLabs API key

1. Open https://elevenlabs.io/app/settings/api-keys
2. Create a key.
3. Export it:

export ELEVENLABS_API_KEY=your-key-here

macOS microphone permissions

If transcription fails with permission errors:

1. Open System Settings -> Privacy & Security -> Microphone.
2. Allow Terminal (or your Claude host app).
3. Re-run the command.

Common issues

- `say: command not found`: install or restore the macOS command line tools.
- mlx-whisper import error: run the command via `uv run` so dependencies are installed.
- API key invalid: regenerate the key and make sure it contains no whitespace.

Self-Validation

Run fast provider checks:

uv run --script ~/.claude/skills/vocal/tests/test_voice.py

Run file-based ask/listen/respond loop (no microphone required):

uv run --script ~/.claude/skills/vocal/tests/test_voice_loop.py

Include cloud loop validation (requires ElevenLabs key):

uv run --script ~/.claude/skills/vocal/tests/test_voice_loop.py --cloud

Run web console helper tests:

uv run --script ~/.claude/skills/vocal/tests/test_web_console.py

Run browser validation for the web console:

# Starts an isolated console on a free port and validates desktop/mobile flows
uv run --script ~/.claude/skills/vocal/tests/test_web_console_playwright.py

# Validate a console you already have open
uv run --script ~/.claude/skills/vocal/tests/test_web_console_playwright.py \
  --url http://127.0.0.1:8765

# Include the ElevenLabs TTS UI path (uses API credits)
uv run --script ~/.claude/skills/vocal/tests/test_web_console_playwright.py \
  --url http://127.0.0.1:8765 \
  --cloud

# Watch the test in a real browser window
uv run --script ~/.claude/skills/vocal/tests/test_web_console_playwright.py \
  --url http://127.0.0.1:8765 \
  --headed \
  --slow-mo 100

Fixture files for loop validation:

tests/fixtures/loop_prompt.txt
tests/fixtures/expected_keyword.txt

References

- Architecture & research: see references/architecture.md (three-tier design, ElevenLabs API details, Claude Code background communication research, CLI programmatic modes).
- Voice bridge backlog: see backlog/voice-bridge-plan.md (standalone process for continuous voice conversation with a self-eval loop).
