doubao-tts

Name: Doubao Tts
Author: xvirobotics

// Generate high-quality speech audio using Doubao (豆包/Volcengine) TTS API. Use this skill when the user asks to generate audio, podcasts, voiceovers, or text-to-speech output.

stars:830

forks:126

updated:March 29, 2026 at 12:48

SKILL.md

name	doubao-tts
description	Generate high-quality speech audio using Doubao (豆包/Volcengine) TTS API. Use this skill when the user asks to generate audio, podcasts, voiceovers, or text-to-speech output.

Doubao TTS: Doubao Speech Synthesis (豆包语音合成)

Generate high-quality speech audio from text using Volcengine's Doubao TTS API. Supports short-form (real-time) and long-form (async, up to 100K characters) synthesis.

When to Use

User asks to generate audio, podcasts, voiceovers, or narration
User wants text-to-speech for any content
User asks to "read this aloud" or "make an audio version"

Quick Usage

Use the doubao-tts CLI tool (installed at bin/doubao-tts):

# Short text (real-time, < 300 chars)
bin/doubao-tts "你好世界" -o output.mp3

# Long text from file (async mode, up to 100K chars)
bin/doubao-tts -f article.txt -o podcast.mp3

# Pipe content
echo "Hello world" | bin/doubao-tts -o hello.mp3

# Choose voice
bin/doubao-tts "你好" -v zh_male_aojiaobazong_moon_bigtts -o output.mp3

# Adjust speed and volume (pitch is also adjustable; its flag is not shown here)
bin/doubao-tts "你好" --speed 1.2 --volume 1.5 -o output.mp3
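
When the CLI is driven from a script rather than typed by hand, the invocations above can be assembled programmatically. The following is a minimal sketch, assuming only the flags shown in the examples (`-f`, `-v`, `--speed`, `--volume`, `-o`); the function names `build_tts_command` and `synthesize` are hypothetical helpers, not part of the skill.

```python
import subprocess
from typing import List, Optional


def build_tts_command(
    output: str,
    text: Optional[str] = None,
    text_file: Optional[str] = None,
    voice: Optional[str] = None,
    speed: Optional[float] = None,
    volume: Optional[float] = None,
    cli: str = "bin/doubao-tts",
) -> List[str]:
    """Build an argv list using only the flags shown in the Quick Usage examples."""
    # Exactly one input source: inline text or a file path.
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")
    cmd = [cli]
    if text is not None:
        cmd.append(text)
    else:
        cmd += ["-f", text_file]
    if voice:
        cmd += ["-v", voice]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    if volume is not None:
        cmd += ["--volume", str(volume)]
    cmd += ["-o", output]
    return cmd


def synthesize(**kwargs) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_tts_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with text that contains spaces or quotes.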

Available Voices (verified working)

Chinese Female

Voice ID	Description
`zh_female_sajiaonvyou_moon_bigtts`	撒娇女友 (Coquettish Girlfriend; default)
`zh_female_gaolengyujie_moon_bigtts`	高冷御姐 (Aloof Elegant Lady)
`zh_female_tianmeixiaoyuan_moon_bigtts`	甜美校园 (Sweet Campus)
`zh_female_yuanqinvyou_moon_bigtts`	元气女友 (Energetic Girlfriend)
`zh_female_wanwanxiaohe_moon_bigtts`	弯弯小何 (Wanwan Xiaohe)
`zh_female_linjianvhai_moon_bigtts`	邻家女孩 (Girl Next Door)

Chinese Male

Voice ID	Description
`zh_male_aojiaobazong_moon_bigtts`	傲娇霸总 (Proud Domineering CEO)
`zh_male_jingqiangkanye_moon_bigtts`	京腔侃爷 (Beijing-accent Chatterbox)
`zh_male_wennuanahu_moon_bigtts`	温暖阿虎 (Warm Ahu)
`zh_male_yangguangqingnian_moon_bigtts`	阳光青年 (Sunny Young Man)

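Note: Other voices (the BV series and those with a mars suffix) require a different resource ID. To use more voices, enable the corresponding resources in the Volcengine console.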

API Details

Environment Variables (already configured in MetaBot .env)

VOLCENGINE_TTS_APPID=<app_id>
VOLCENGINE_TTS_ACCESS_KEY=<access_key>
VOLCENGINE_TTS_RESOURCE_ID=volc.service_type.10029  # optional
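
For scripts that call the API directly, the three variables can be read once and validated up front. This is a sketch, not the CLI's actual behavior: `load_tts_config` is a hypothetical helper, and falling back to `volc.service_type.10029` when the optional variable is unset is an assumption based on the value shown above.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed fallback: the resource ID listed above as the optional value.
DEFAULT_RESOURCE_ID = "volc.service_type.10029"
REQUIRED_VARS = ("VOLCENGINE_TTS_APPID", "VOLCENGINE_TTS_ACCESS_KEY")


def load_tts_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the TTS settings from the environment, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "app_id": env["VOLCENGINE_TTS_APPID"],
        "access_key": env["VOLCENGINE_TTS_ACCESS_KEY"],
        "resource_id": env.get("VOLCENGINE_TTS_RESOURCE_ID") or DEFAULT_RESOURCE_ID,
    }
```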

Short-form API (real-time, < 300 chars)

Endpoint: https://openspeech.bytedance.com/api/v3/tts/unidirectional
Response: chunked JSON with base64 audio in data field
Latency: < 1 second
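
The chunked response can be reassembled into a single audio file by decoding each chunk's `data` field in order. The sketch below assumes one JSON object per line, which matches how the raw curl example in this document reads the stream; `decode_audio_stream` is a hypothetical name, and any fields other than `data` are ignored because the source does not describe them.

```python
import base64
import json
from typing import Iterable


def decode_audio_stream(lines: Iterable[str]) -> bytes:
    """Concatenate the base64 audio carried in the data field of each JSON line."""
    audio = bytearray()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        data = payload.get("data") if isinstance(payload, dict) else None
        if data:
            audio += base64.b64decode(data)
    return bytes(audio)
```

A caller would typically pass the response body's line iterator and write the returned bytes to a file such as `output.mp3`.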

Long-form API (async, up to 100K chars)

Submit: POST https://openspeech.bytedance.com/api/v1/tts_async/submit
Query: GET https://openspeech.bytedance.com/api/v1/tts_async/query?appid=X&task_id=Y
Response: audio_url (valid for 1 hour)
Latency: seconds to minutes depending on text length
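
Because the long-form API is asynchronous, a client submits the text and then polls the query endpoint until `audio_url` appears. The sketch below covers only the polling half and only what the source states: the query URL shape and the `audio_url` field. The submit payload, authentication, and any status or error fields are not documented here, so the HTTP call is left to a caller-supplied `fetch_status` function; both helper names are hypothetical.

```python
import time
from typing import Callable, Mapping
from urllib.parse import urlencode

QUERY_ENDPOINT = "https://openspeech.bytedance.com/api/v1/tts_async/query"


def build_query_url(appid: str, task_id: str) -> str:
    """Build the query URL with the appid and task_id parameters listed above."""
    return QUERY_ENDPOINT + "?" + urlencode({"appid": appid, "task_id": task_id})


def wait_for_audio_url(
    fetch_status: Callable[[], Mapping[str, object]],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the response contains audio_url or the timeout elapses.

    fetch_status is supplied by the caller and returns the parsed JSON response
    of one query request. Failure statuses are not handled because the source
    does not describe them.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        url = status.get("audio_url")
        if url:
            return str(url)
        if time.monotonic() >= deadline:
            raise TimeoutError("no audio_url before timeout")
        sleep(interval)
```

Since the returned URL is valid for one hour, the audio should be downloaded soon after polling succeeds.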

Workflow for Podcasts

Write the script: create the podcast script as Markdown or plain text.
Generate audio: run bin/doubao-tts -f script.txt -v zh_male_aojiaobazong_moon_bigtts -o podcast.mp3
Copy to outputs: run cp podcast.mp3 /tmp/metabot-outputs/<chatId>/ to send the file to the user.
For multi-voice podcasts, generate each speaker's segments separately, then concatenate them with ffmpeg.

Multi-Voice Podcast Example

# Generate segments for different speakers
bin/doubao-tts -f host_lines.txt -v zh_male_aojiaobazong_moon_bigtts -o host.mp3
bin/doubao-tts -f guest_lines.txt -v zh_female_gaolengyujie_moon_bigtts -o guest.mp3

# Concatenate (requires ffmpeg)
echo "file 'host.mp3'" > list.txt
echo "file 'guest.mp3'" >> list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy podcast.mp3
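
With more than two segments (for example, alternating host and guest turns), writing the concat list by hand becomes error-prone. The following sketch generates the same `file '...'` lines for an ordered list of paths; `concat_list_text` is a hypothetical helper, and the segment file names in the usage note are illustrative.

```python
from typing import Iterable


def concat_list_text(paths: Iterable[str]) -> str:
    """Return the contents of an ffmpeg concat list, one file line per path."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\'' for ffmpeg.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"
```

For example, writing `concat_list_text(["host_01.mp3", "guest_01.mp3", "host_02.mp3"])` to `list.txt` yields a list usable with the ffmpeg command above. Note that `-c copy` joins the streams without re-encoding, so all segments need the same codec and audio parameters.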

Raw curl (if CLI not available)

# Short-form
curl -X POST "https://openspeech.bytedance.com/api/v3/tts/unidirectional" \
  -H "Content-Type: application/json" \
  -H "X-Api-App-Id: $VOLCENGINE_TTS_APPID" \
  -H "X-Api-Access-Key: $VOLCENGINE_TTS_ACCESS_KEY" \
  -H "X-Api-Resource-Id: volc.service_type.10029" \
  -H "X-Api-Request-Id: $(uuidgen)" \
  -d '{
    "req_params": {
      "text": "你好世界",
      "speaker": "zh_female_sajiaonvyou_moon_bigtts",
      "audio_params": {"format": "mp3", "sample_rate": 24000}
    }
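  }' | python3 -c "
import sys, json, base64
# Each response line is a JSON object; audio arrives base64-encoded in its data field.
chunks = []
for line in sys.stdin:
    line = line.strip()
    if not line: continue
    try:
        d = json.loads(line)
        if d.get('data'): chunks.append(base64.b64decode(d['data']))
    except (ValueError, AttributeError): pass  # skip lines that are not usable JSON or base64
sys.stdout.buffer.write(b''.join(chunks))
" > output.mp3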

related-skills.json

same repository

metabot.md

from "xvirobotics/metabot"

Talk to other MetaBot bots (`mb talk` — send a message to another bot, including cross-instance peers). Use when you want to delegate to or message another bot, e.g. 'talk to bot X', '跟其他 bot 说话', 'send message to peer bot', 'ask the deploy-bot', 'delegate to bot'. Also covers bot/peer management, skill hub, voice calls.

Updated 2026-05-18 · 830 stars

metaschedule.md

from "xvirobotics/metabot"

MetaBot's persistent server-side scheduler (cron + one-shot). Optional skill — not installed by default. Use when the user wants tasks that survive Claude session restarts, are visible to other bots, or need to run in MetaBot's PM2 process rather than this Claude session.

Updated 2026-05-13 · 830 stars

metaskill.md

from "xvirobotics/metabot"

The meta-skill: create AI agent teams, individual agents, or custom skills for any project. Use when the user wants to generate a complete agent team, create a single agent, or create a single skill for Claude Code, Kimi, or Codex.

Updated 2026-05-08 · 830 stars

skill-hub.md

from "xvirobotics/metabot"

Discover, search, and install shared skills from the Skill Hub registry. Use when the user wants to find available skills, share a skill with other bots, or install a skill from the hub.

Updated 2026-04-10 · 830 stars

voice.md

from "xvirobotics/metabot"

Convert text to speech audio using mb voice CLI. Use when the user asks you to speak, say something aloud, generate audio, or produce a voice recording.

Updated 2026-03-19 · 830 stars

metamemory.md

from "xvirobotics/metabot"

Read and write shared memory documents. Use this when you need to save knowledge, notes, research findings, or project context for future reference across sessions. Also use it to look up previously stored information.

Updated 2026-03-09 · 830 stars

package.json

"author": "xvirobotics"

"repository": "xvirobotics/metabot"
