voice-note
Convert voice messages to text (STT) and text to voice (TTS). Supports Whisper local model and Edge-TTS.
| Field | Value |
|---|---|
| name | voice-note |
| description | Convert voice messages to text (STT) and text to voice (TTS). Supports Whisper local model and Edge-TTS. |
| version | 1.0.0 |
| metadata | {"echo":{"tags":["Voice","STT","TTS","Whisper","Audio","Media"]}} |
Speech-to-Text (STT) and Text-to-Speech (TTS) capabilities.
from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from the environment
with open("audio.ogg", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
print(transcript.text)
pip install faster-whisper
from faster_whisper import WhisperModel
model = WhisperModel("base", compute_type="int8")  # sizes: tiny/base/small/medium/large-v3
segments, info = model.transcribe("audio.ogg", language="zh")  # segments is a lazy generator
text = " ".join(s.text.strip() for s in segments)  # transcription runs during iteration
print(f"[{info.language}] {text}")
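When a transcript needs timestamps rather than one joined string, each segment's `start`, `end`, and `text` fields can be formatted line by line. The following is a minimal sketch, not part of the skill itself: `Segment` is a hypothetical stand-in for the objects yielded by `model.transcribe`, and `format_timestamp` and `format_transcript` are illustrative helper names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a faster-whisper segment (start, end, text)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments) -> str:
    """Build one '[start - end] text' line per segment."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg.start)
        end = format_timestamp(seg.end)
        lines.append(f"[{start} - {end}] {seg.text.strip()}")
    return "\n".join(lines)


# Demo data only; in practice pass the generator returned by model.transcribe().
demo = [Segment(0.0, 2.4, " 你好"), Segment(2.4, 65.0, " 世界")]
print(format_transcript(demo))
```

Because the helper only reads three attributes, it should work on real segments as well as on the stand-in, provided the generator has not already been consumed.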
pip install edge-tts
# CLI
edge-tts --voice zh-CN-XiaoxiaoNeural --text "你好世界" --write-media output.mp3
# List voices
edge-tts --list-voices | grep zh-CN
import asyncio
import edge_tts
async def speak(text, voice="zh-CN-XiaoxiaoNeural", output="output.mp3"):
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(output)
asyncio.run(speak("今天天气不错,适合出门"))  # "Nice weather today, good for going out"
| Voice | Style |
|---|---|
| zh-CN-XiaoxiaoNeural | Female, lively and natural |
| zh-CN-YunxiNeural | Male, gentle |
| zh-CN-YunyangNeural | Male, newscaster style |
| zh-CN-XiaoyiNeural | Female, soft |
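The voice table can be turned into a small lookup so that callers ask for a style instead of hard-coding a voice name. This is a sketch under assumptions: the tag words (`female`, `news`, and so on) are an English paraphrase of the table, and `VOICE_TAGS` and `pick_voice` are hypothetical names, not part of edge-tts.

```python
# Tags paraphrase the Style column of the table; they are not edge-tts metadata.
VOICE_TAGS = {
    "zh-CN-XiaoxiaoNeural": {"female", "lively", "natural"},
    "zh-CN-YunxiNeural": {"male", "gentle"},
    "zh-CN-YunyangNeural": {"male", "news"},
    "zh-CN-XiaoyiNeural": {"female", "soft"},
}


def pick_voice(*tags: str, default: str = "zh-CN-XiaoxiaoNeural") -> str:
    """Return the first voice whose tags include every requested tag."""
    wanted = {t.lower() for t in tags}
    for voice, voice_tags in VOICE_TAGS.items():
        if wanted <= voice_tags:  # all requested tags are present
            return voice
    return default  # nothing matched: fall back to the default voice


print(pick_voice("male", "news"))
```

The returned name can be passed as the `voice` argument of `edge_tts.Communicate` or to the CLI's `--voice` option.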
python3 scripts/voice_process.py transcribe audio.ogg --model base --language zh --output transcript.txt
python3 scripts/voice_process.py summarize meeting.mp3 --model small
Note: TTS (speak/voices) is in the separate tts-voice skill.
ffmpeg -i input.ogg output.mp3
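The ffmpeg conversion above can also be driven from Python when audio needs converting before transcription. A minimal sketch, assuming `ffmpeg` is installed and on `PATH`; `build_ffmpeg_command` and `convert_audio` are hypothetical helper names, and the `-y` flag (overwrite the output without prompting) is an addition not present in the original one-liner.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, overwrite=True):
    """Return the argument list for `ffmpeg -i src dst`."""
    cmd = ["ffmpeg"]
    if overwrite:
        cmd.append("-y")  # overwrite dst without an interactive prompt
    cmd += ["-i", str(src), str(dst)]
    return cmd


def convert_audio(src, dst):
    """Convert src to dst; ffmpeg infers the output format from dst's extension."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    if not Path(src).is_file():
        raise FileNotFoundError(src)
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return Path(dst)


# Printing the command does not require ffmpeg to be installed.
print(build_ffmpeg_command("input.ogg", "output.mp3"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.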