aliyun-qwen-tts-realtime
Use when real-time speech synthesis is needed with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive speech is required, including instruction-controlled realtime synthesis.
| name | aliyun-qwen-tts-realtime |
| description | Use when real-time speech synthesis is needed with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive speech is required, including instruction-controlled realtime synthesis. |
| version | 1.0.0 |
Category: provider
Use realtime TTS models for low-latency streaming speech output.
Use one of these exact model strings:
- `qwen3-tts-flash-realtime`
- `qwen3-tts-instruct-flash-realtime`
- `qwen3-tts-instruct-flash-realtime-2026-01-22`
- `qwen3-tts-vd-realtime-2026-01-15`
- `qwen3-tts-vc-realtime-2026-01-15`

python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
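Because the model strings listed above must match exactly, a small guard can catch typos before any request is sent. The following is a minimal sketch; `REALTIME_TTS_MODELS` and `require_realtime_model` are hypothetical names used for illustration and are not part of the dashscope SDK.

```python
# Exact model strings copied from the list above.
REALTIME_TTS_MODELS = (
    "qwen3-tts-flash-realtime",
    "qwen3-tts-instruct-flash-realtime",
    "qwen3-tts-instruct-flash-realtime-2026-01-22",
    "qwen3-tts-vd-realtime-2026-01-15",
    "qwen3-tts-vc-realtime-2026-01-15",
)


def require_realtime_model(name: str) -> str:
    """Return `name` unchanged if it is a listed model string, else raise.

    Hypothetical helper for illustration; not part of the dashscope SDK.
    """
    if name not in REALTIME_TTS_MODELS:
        raise ValueError(
            f"unknown realtime TTS model {name!r}; expected one of: "
            + ", ".join(REALTIME_TTS_MODELS)
        )
    return name
```

Calling `require_realtime_model("qwen3-tts-flash-realtime")` returns the string unchanged, while a misspelled name raises `ValueError` with the accepted values in the message.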
Set `DASHSCOPE_API_KEY` in your environment, or add `dashscope_api_key` to `~/.alibabacloud/credentials`.

Request fields:
- `text` (string, required)
- `voice` (string, required)
- `instruction` (string, optional)
- `sample_rate` (int, optional)

Response fields:
- `audio_base64_pcm_chunks` (array)
- `sample_rate` (int)
- `finish_reason` (string)

`MultiModalConversation`; use the probe script below to verify compatibility.

Use the probe script to verify realtime compatibility in your current SDK/runtime, and optionally fall back to a non-realtime model for immediate output:
.venv/bin/python skills/ai/audio/aliyun-qwen-tts-realtime/scripts/realtime_tts_demo.py \
--text "This is a realtime speech demo." \
--fallback \
--output output/ai-audio-tts-realtime/audio/fallback-demo.wav
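The API key lookup mentioned earlier (the `DASHSCOPE_API_KEY` environment variable, or `dashscope_api_key` in `~/.alibabacloud/credentials`) could be sketched roughly as below. Two things here are assumptions rather than documented behavior: that the environment variable takes precedence, and that the credentials file stores the key as an INI-style `dashscope_api_key = <value>` line. `resolve_dashscope_api_key` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_dashscope_api_key(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file.

    ASSUMPTION: the environment variable wins, and the file holds a line of
    the form `dashscope_api_key = <value>`. Adjust if your layout differs.
    """
    env = os.environ if env is None else env
    if credentials_path is None:
        credentials_path = Path.home() / ".alibabacloud" / "credentials"

    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key
    if not credentials_path.is_file():
        return None

    for raw in credentials_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip comments, section headers, and lines without an assignment.
        if line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "dashscope_api_key":
            return value.strip().strip("'\"") or None
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal, which may be convenient when the same code runs both interactively and in CI.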
Strict mode (for CI / gating):
.venv/bin/python skills/ai/audio/aliyun-qwen-tts-realtime/scripts/realtime_tts_demo.py \
--text "realtime health check" \
--strict
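The response fields listed earlier (`audio_base64_pcm_chunks` and `sample_rate`) suggest that audio arrives as base64-encoded PCM pieces. One way to turn them into a playable file is sketched below using only the standard library. The mono, 16-bit sample layout is an assumption that the field list does not confirm, and `write_pcm_chunks_to_wav` is a hypothetical helper, not something the demo script is known to contain.

```python
import base64
import wave
from pathlib import Path
from typing import Iterable


def write_pcm_chunks_to_wav(
    chunks: Iterable[str],
    sample_rate: int,
    out_path: Path,
    channels: int = 1,      # ASSUMPTION: mono audio
    sample_width: int = 2,  # ASSUMPTION: 16-bit samples (2 bytes each)
) -> int:
    """Decode base64 PCM chunks, write them as one WAV file, return PCM byte count."""
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(out_path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return len(pcm)
```

If the actual stream uses a different sample width or channel count, pass the matching values; otherwise the file will play at the wrong speed or as noise.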
Output directory: `output/ai-audio-tts-realtime/audio/`
`OUTPUT_DIR`.

mkdir -p output/aliyun-qwen-tts-realtime
for f in skills/ai/audio/aliyun-qwen-tts-realtime/scripts/*.py; do
python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-qwen-tts-realtime/validate.txt
Pass criteria: command exits 0 and output/aliyun-qwen-tts-realtime/validate.txt is generated.
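The same validation can be expressed in Python with the standard `py_compile` module, which may be handy where a shell is not available. This is a sketch under the assumption that the marker file content and paths mirror the shell commands above; `compile_scripts` is a hypothetical helper name.

```python
import py_compile
from pathlib import Path
from typing import List


def compile_scripts(scripts_dir: Path, report_path: Path) -> List[Path]:
    """Byte-compile every *.py file in scripts_dir, then write the marker file.

    Raises py_compile.PyCompileError on the first script that fails to
    compile; in that case the marker file is not written.
    """
    compiled = []
    for script in sorted(scripts_dir.glob("*.py")):
        py_compile.compile(str(script), doraise=True)
        compiled.append(script)
    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text("py_compile_ok\n", encoding="utf-8")
    return compiled
```

Note one behavioral difference: a plain shell `for` loop without `set -e` continues after a failed compile and still writes the marker, whereas this sketch stops at the first error, so the marker file only appears when every script compiled.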
`output/aliyun-qwen-tts-realtime/`.

`references/sources.md`