doubao-tts
// Generate high-quality speech audio using Doubao (豆包/Volcengine) TTS API. Use this skill when the user asks to generate audio, podcasts, voiceovers, or text-to-speech output.
| Field | Value |
|---|---|
| name | doubao-tts |
| description | Generate high-quality speech audio using Doubao (豆包/Volcengine) TTS API. Use this skill when the user asks to generate audio, podcasts, voiceovers, or text-to-speech output. |
Generate high-quality speech audio from text using Volcengine's Doubao TTS API. Supports short-form (real-time) and long-form (async, up to 100K characters) synthesis.
Use the doubao-tts CLI tool (installed at bin/doubao-tts):
# Short text (real-time, < 300 chars)
bin/doubao-tts "你好世界" -o output.mp3
# Long text from file (async mode, up to 100K chars)
bin/doubao-tts -f article.txt -o podcast.mp3
# Pipe content
echo "Hello world" | bin/doubao-tts -o hello.mp3
# Choose voice
bin/doubao-tts "你好" -v zh_male_aojiaobazong_moon_bigtts -o output.mp3
# Adjust speed/volume/pitch
bin/doubao-tts "你好" --speed 1.2 --volume 1.5 -o output.mp3
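When the CLI is driven from a script rather than typed by hand, it can help to build the argument list in one place. The following is a minimal sketch that uses only the flags shown above (`-f`, `-v`, `--speed`, `--volume`, `-o`); the function names `build_tts_command` and `synthesize` are hypothetical helpers, not part of the tool.

```python
import subprocess
from typing import List, Optional

CLI = "bin/doubao-tts"  # install path stated in this document


def build_tts_command(output: str,
                      text: Optional[str] = None,
                      text_file: Optional[str] = None,
                      voice: Optional[str] = None,
                      speed: Optional[float] = None,
                      volume: Optional[float] = None) -> List[str]:
    """Build an argv list using only the flags documented above."""
    # Exactly one input source: inline text or a file passed with -f.
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")
    cmd = [CLI]
    cmd += [text] if text is not None else ["-f", text_file]
    if voice is not None:
        cmd += ["-v", voice]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    if volume is not None:
        cmd += ["--volume", str(volume)]
    cmd += ["-o", output]
    return cmd


def synthesize(**kwargs) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_tts_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or quotes. Inline text that begins with `-` may still be read as a flag by the CLI, so `text_file` is the safer choice for arbitrary input.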
Female voices:

| Voice ID | Description |
|---|---|
| zh_female_sajiaonvyou_moon_bigtts | 撒娇女友 (Coquettish Girlfriend), default |
| zh_female_gaolengyujie_moon_bigtts | 高冷御姐 (Aloof Elder Sister) |
| zh_female_tianmeixiaoyuan_moon_bigtts | 甜美校园 (Sweet Campus) |
| zh_female_yuanqinvyou_moon_bigtts | 元气女友 (Energetic Girlfriend) |
| zh_female_wanwanxiaohe_moon_bigtts | 弯弯小何 (Wanwan Xiaohe) |
| zh_female_linjianvhai_moon_bigtts | 邻家女孩 (Girl Next Door) |

Male voices:

| Voice ID | Description |
|---|---|
| zh_male_aojiaobazong_moon_bigtts | 傲娇霸总 (Proud Domineering CEO) |
| zh_male_jingqiangkanye_moon_bigtts | 京腔侃爷 (Beijing-Accent Chatterbox) |
| zh_male_wennuanahu_moon_bigtts | 温暖阿虎 (Warm Ahu) |
| zh_male_yangguangqingnian_moon_bigtts | 阳光青年 (Sunny Young Man) |
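Note: Other voices (the BV series and those with the mars suffix) require a different resource ID. To use more voices, enable the corresponding resource in the Volcengine console.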
VOLCENGINE_TTS_APPID=<app_id>
VOLCENGINE_TTS_ACCESS_KEY=<access_key>
VOLCENGINE_TTS_RESOURCE_ID=volc.service_type.10029 (optional)
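A script that calls the API directly needs these three values before it can send a request. The sketch below reads them from the environment and maps them onto the `X-Api-*` headers used in this document's curl example. It assumes that `volc.service_type.10029` is the fallback when `VOLCENGINE_TTS_RESOURCE_ID` is unset, which is how the "(optional)" marker reads; `TTSConfig`, `load_config`, and `request_headers` are hypothetical names.

```python
import os
import uuid
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Assumption: this is the default resource ID when the variable is not set.
DEFAULT_RESOURCE_ID = "volc.service_type.10029"


@dataclass(frozen=True)
class TTSConfig:
    app_id: str
    access_key: str
    resource_id: str = DEFAULT_RESOURCE_ID


def load_config(env: Optional[Mapping[str, str]] = None) -> TTSConfig:
    """Read credentials from the environment and fail early if any are missing."""
    env = os.environ if env is None else env
    required = ("VOLCENGINE_TTS_APPID", "VOLCENGINE_TTS_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return TTSConfig(
        app_id=env["VOLCENGINE_TTS_APPID"],
        access_key=env["VOLCENGINE_TTS_ACCESS_KEY"],
        resource_id=env.get("VOLCENGINE_TTS_RESOURCE_ID") or DEFAULT_RESOURCE_ID,
    )


def request_headers(cfg: TTSConfig) -> Dict[str, str]:
    """Headers as they appear in this document's curl example."""
    return {
        "Content-Type": "application/json",
        "X-Api-App-Id": cfg.app_id,
        "X-Api-Access-Key": cfg.access_key,
        "X-Api-Resource-Id": cfg.resource_id,
        "X-Api-Request-Id": str(uuid.uuid4()),  # fresh ID per request, like $(uuidgen)
    }
```

Failing at load time gives a clearer error than a rejected request would.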
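API endpoints:

- Short-form: `https://openspeech.bytedance.com/api/v3/tts/unidirectional`; the audio is in the `data` field of the response.
- Long-form submit: `POST https://openspeech.bytedance.com/api/v1/tts_async/submit`
- Long-form query: `GET https://openspeech.bytedance.com/api/v1/tts_async/query?appid=X&task_id=Y`; returns `audio_url` (valid for 1 hour).

Podcast workflow:

- Generate the audio: `bin/doubao-tts -f script.txt -v zh_male_aojiaobazong_moon_bigtts -o podcast.mp3`
- `cp podcast.mp3 /tmp/metabot-outputs/<chatId>/` to send to user
- For multiple speakers, generate one segment per speaker and concatenate them with `ffmpeg`:

# Generate segments for different speakers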
bin/doubao-tts -f host_lines.txt -v zh_male_aojiaobazong_moon_bigtts -o host.mp3
bin/doubao-tts -f guest_lines.txt -v zh_female_gaolengyujie_moon_bigtts -o guest.mp3
# Concatenate (requires ffmpeg)
echo "file 'host.mp3'" > list.txt
echo "file 'guest.mp3'" >> list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy podcast.mp3
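With more than two segments, writing `list.txt` by hand gets tedious. The sketch below generates the same concat list and the same ffmpeg invocation as the commands above; `concat_list_text` and `concat_command` are hypothetical helper names. File names containing a single quote are escaped as `'\''`, following ffmpeg's quoting rules for the concat demuxer.

```python
from typing import Iterable, List


def concat_list_text(segments: Iterable[str]) -> str:
    """Return the contents of an ffmpeg concat list, one `file '...'` line per segment."""
    lines = []
    for path in segments:
        # Inside single quotes, ffmpeg expects a literal quote written as '\''.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'\n")
    return "".join(lines)


def concat_command(list_file: str, output: str) -> List[str]:
    """Same invocation as above: stream copy, no re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", output]


if __name__ == "__main__":
    import subprocess

    # Segments are joined in the order given here.
    with open("list.txt", "w", encoding="utf-8") as fh:
        fh.write(concat_list_text(["host.mp3", "guest.mp3"]))
    subprocess.run(concat_command("list.txt", "podcast.mp3"), check=True)
```

Because `-c copy` does not re-encode, the segments should share the same format and sample rate, which holds when they all come from the same CLI with the same settings.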
# Short-form
curl -X POST "https://openspeech.bytedance.com/api/v3/tts/unidirectional" \
-H "Content-Type: application/json" \
-H "X-Api-App-Id: $VOLCENGINE_TTS_APPID" \
-H "X-Api-Access-Key: $VOLCENGINE_TTS_ACCESS_KEY" \
-H "X-Api-Resource-Id: volc.service_type.10029" \
-H "X-Api-Request-Id: $(uuidgen)" \
  -d '{
    "req_params": {
      "text": "你好世界",
      "speaker": "zh_female_sajiaonvyou_moon_bigtts",
      "audio_params": {"format": "mp3", "sample_rate": 24000}
    }
  }' | python3 -c "
import sys, json, base64

# The response is read as one JSON object per line; audio arrives base64-encoded in 'data'.
chunks = []
for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        d = json.loads(line)
        if d.get('data'):
            chunks.append(base64.b64decode(d['data']))
    except (ValueError, AttributeError):
        pass  # skip lines that are not valid JSON objects or valid base64
sys.stdout.buffer.write(b''.join(chunks))
" > output.mp3
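The curl example above covers the short-form endpoint. For long-form synthesis, this document names a query endpoint that takes `appid` and `task_id` and eventually yields an `audio_url`. The sketch below shows one way to poll it. It assumes the query response is a JSON object that carries a top-level `audio_url` once the task has finished, which this document does not confirm. The HTTP call and its authentication are left to a caller-supplied `fetch` function, and `build_query_url` and `wait_for_audio_url` are hypothetical names.

```python
import time
from typing import Any, Callable, Mapping
from urllib.parse import urlencode

QUERY_ENDPOINT = "https://openspeech.bytedance.com/api/v1/tts_async/query"


def build_query_url(appid: str, task_id: str) -> str:
    """Query URL in the form given in this document."""
    return QUERY_ENDPOINT + "?" + urlencode({"appid": appid, "task_id": task_id})


def wait_for_audio_url(fetch: Callable[[str], Mapping[str, Any]],
                       appid: str,
                       task_id: str,
                       interval: float = 2.0,
                       timeout: float = 600.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the response contains audio_url, or raise TimeoutError.

    Assumption: a finished task exposes a top-level 'audio_url' key.
    `fetch` performs the authenticated GET and returns the decoded JSON.
    """
    url = build_query_url(appid, task_id)
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(url)
        audio_url = result.get("audio_url")
        if audio_url:
            return audio_url
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Since the `audio_url` is valid for 1 hour, download the file soon after polling succeeds rather than storing the link.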