aliyun-qwen-asr-realtime

Name: Aliyun Qwen Asr Realtime
Author: cinience

Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, live captions, or duplex voice agents.

stars: 391

forks: 34

updated: 2026-04-27 22:35

SKILL.md

name: aliyun-qwen-asr-realtime
description: Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, live captions, or duplex voice agents.
version: 1.0.0

Category: provider

Model Studio Qwen ASR Realtime

Validation

mkdir -p output/aliyun-qwen-asr-realtime
python -m py_compile skills/ai/audio/aliyun-qwen-asr-realtime/scripts/prepare_realtime_asr_request.py && echo "py_compile_ok" > output/aliyun-qwen-asr-realtime/validate.txt

Pass criteria: command exits 0 and output/aliyun-qwen-asr-realtime/validate.txt is generated.
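
The same check can be run from Python, which may help on systems without a POSIX shell. This is a minimal sketch, not part of the skill: `validate_script` is a hypothetical helper name, and it relies only on the standard-library `py_compile` module. It writes validate.txt only when compilation succeeds, mirroring the `&&` in the shell command.

```python
import py_compile
from pathlib import Path


def validate_script(script: Path, out_dir: Path) -> bool:
    """Byte-compile `script`; write validate.txt only if compilation succeeds."""
    out_dir.mkdir(parents=True, exist_ok=True)
    try:
        # doraise=True turns syntax errors into PyCompileError instead of
        # printing them to stderr and returning normally.
        py_compile.compile(str(script), doraise=True)
    except (py_compile.PyCompileError, OSError):
        # OSError covers a missing or unreadable script file.
        return False
    (out_dir / "validate.txt").write_text("py_compile_ok\n", encoding="utf-8")
    return True
```

Calling `validate_script(Path("skills/ai/audio/aliyun-qwen-asr-realtime/scripts/prepare_realtime_asr_request.py"), Path("output/aliyun-qwen-asr-realtime"))` should return True under the same conditions in which the shell command exits 0.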

Output and evidence

Save session payloads and response samples under output/aliyun-qwen-asr-realtime/.
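
One way to follow this rule is a small helper that writes each payload or response sample as a JSON file under the output directory. The sketch below is illustrative only: `save_evidence` and the `<kind>-<timestamp>.json` naming scheme are assumptions, not conventions defined by the skill.

```python
import json
import time
from pathlib import Path

OUTPUT_DIR = Path("output/aliyun-qwen-asr-realtime")


def save_evidence(kind: str, payload: dict, out_dir: Path = OUTPUT_DIR) -> Path:
    """Write `payload` as pretty-printed JSON and return the file path.

    `kind` is a free-form label such as "request" or "response"
    (hypothetical naming, chosen for this example).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # A nanosecond timestamp keeps successive samples from overwriting each other.
    path = out_dir / f"{kind}-{time.time_ns()}.json"
    path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path
```

`ensure_ascii=False` keeps non-ASCII transcripts, for example Chinese text, readable in the saved samples.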

Critical model names

Use one of these exact model strings:

qwen3-asr-flash-realtime
qwen3-asr-flash-realtime-2026-02-10
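
Because the model strings must match exactly, a client can reject typos before opening a session. A minimal sketch, assuming a hypothetical helper named `resolve_model`; the two allowed strings and the default are the ones given in this document.

```python
from typing import Optional

# Exact model strings listed in this document.
ALLOWED_MODELS = (
    "qwen3-asr-flash-realtime",
    "qwen3-asr-flash-realtime-2026-02-10",
)
DEFAULT_MODEL = "qwen3-asr-flash-realtime"


def resolve_model(name: Optional[str] = None) -> str:
    """Return a valid model string, falling back to the default when none is given."""
    if name is None:
        return DEFAULT_MODEL
    if name not in ALLOWED_MODELS:
        raise ValueError(
            f"unknown model {name!r}; expected one of: {', '.join(ALLOWED_MODELS)}"
        )
    return name
```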

Use cases

Realtime subtitles and captions
Voice-agent duplex input
Streaming speech-to-text in browser or terminal clients

Prerequisites

Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
Realtime sessions generally require WebSocket or streaming session handling in the client.
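
The key lookup described above can be sketched as follows. This is not the skill's own loader: `load_api_key` is a hypothetical name, and the code assumes the credentials file is INI-formatted with a `[default]` section, which this document does not confirm. Adjust the parsing if your file is laid out differently.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def load_api_key(
    credentials_path: Optional[Path] = None, profile: str = "default"
) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    # Assumption: INI layout with a [default] profile section.
    # interpolation=None keeps '%' characters in keys from being misread.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")  # a missing file is silently skipped
    except configparser.Error:
        return None
    if not parser.has_section(profile):
        return None
    return parser.get(profile, "dashscope_api_key", fallback=None)
```

The environment variable is checked first so that a per-shell override takes precedence over the shared file.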

Normalized interface (asr.realtime)

Request

model (string, optional): default qwen3-asr-flash-realtime
language_hints (array, optional)
format (string, optional): e.g. pcm, wav
sample_rate (int, optional): e.g. 16000
chunk_ms (int, optional): frame size in milliseconds

Response

text (string): recognized transcript fragment
is_final (bool): finalization marker
usage (object, optional)
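
The normalized fields above map naturally onto two small data classes. The sketch below models only this normalized interface, not the provider's wire protocol. The class and function names are hypothetical. The defaults for `format` and `sample_rate` reuse the example values listed above, and the `chunk_ms` default of 100 is an assumption made for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class RealtimeAsrRequest:
    model: str = "qwen3-asr-flash-realtime"  # documented default
    language_hints: List[str] = field(default_factory=list)
    format: str = "pcm"        # example value from the field list
    sample_rate: int = 16000   # example value from the field list
    chunk_ms: int = 100        # assumed default, not specified by the source


@dataclass
class RealtimeAsrResult:
    text: str
    is_final: bool
    usage: Optional[Dict[str, Any]] = None


def parse_result(event: Dict[str, Any]) -> RealtimeAsrResult:
    """Build a result from a normalized event dict, tolerating missing keys."""
    return RealtimeAsrResult(
        text=str(event.get("text", "")),
        is_final=bool(event.get("is_final", False)),
        usage=event.get("usage"),
    )


def final_fragments(events: Iterable[Dict[str, Any]]) -> List[str]:
    """Keep only finalized transcript fragments, dropping partial results."""
    return [r.text for r in map(parse_result, events) if r.is_final]
```

`asdict(RealtimeAsrRequest(language_hints=["zh"]))` yields a plain dict that can be serialized with `json.dumps`.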

Quick start

Generate a request template:

python skills/ai/audio/aliyun-qwen-asr-realtime/scripts/prepare_realtime_asr_request.py \
  --output output/aliyun-qwen-asr-realtime/request.json

Operational guidance

Prefer 16 kHz mono PCM unless your client stack requires another format.
Keep chunks small enough for responsive partial results.
If you only have recorded files, use skills/ai/audio/aliyun-qwen-asr/ instead.
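
To make the chunking guidance concrete, the frame size in bytes follows from the sample rate, channel count, sample width, and `chunk_ms`. The sketch below assumes 16-bit (2-byte) samples, which this document does not specify, and uses hypothetical helper names.

```python
from typing import Iterator


def chunk_size_bytes(
    sample_rate: int = 16000,
    chunk_ms: int = 100,
    channels: int = 1,
    sample_width: int = 2,  # assumption: 16-bit PCM samples
) -> int:
    """Number of bytes of raw PCM covering `chunk_ms` milliseconds."""
    return sample_rate * channels * sample_width * chunk_ms // 1000


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`; the last one may be shorter."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

With these assumptions, 100 ms of 16 kHz mono audio is 16000 × 1 × 2 × 100 / 1000 = 3200 bytes per chunk. Halving `chunk_ms` halves the chunk size, so audio is sent twice as often.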

References

references/sources.md

related-skills.json

Same repository

aliyun-arms-query.md

from "cinience/alicloud-skills"

Use when querying distributed traces or application metrics in Alibaba Cloud ARMS (Application Real-Time Monitoring Service). Use for trace search by service/duration/tags, trace detail and method stack retrieval, application listing, and performance metrics queries.

updated: 2026-05-30, stars: 391

aliyun-arms-query-test.md

from "cinience/alicloud-skills"

Smoke test for aliyun-arms-query skill. Validates script compilation and basic SDK client initialization.

updated: 2026-05-30, stars: 391

aliyun-cosyvoice-voice-clone.md

from "cinience/alicloud-skills"

Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from reference audio and then reusing the returned voice_id in later TTS calls.

updated: 2026-04-27, stars: 391

aliyun-cosyvoice-voice-design.md

from "cinience/alicloud-skills"

Use when designing custom voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from a voice prompt plus preview text before using the returned voice_id in TTS.

updated: 2026-04-27, stars: 391

aliyun-qwen-asr.md

from "cinience/alicloud-skills"

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

updated: 2026-04-27, stars: 391

aliyun-qwen-livetranslate.md

from "cinience/alicloud-skills"

Use when live speech translation is needed with Alibaba Cloud Model Studio Qwen LiveTranslate models, including bilingual meetings, realtime interpretation, and speech-to-speech or speech-to-text translation flows.

updated: 2026-04-27, stars: 391

package.json

"author": "cinience"

"repository": "cinience/alicloud-skills"

$ useful --forSOC

Software Developers, Computer and Mathematical Occupations, 15-1252, L4
