voice-clone-lab
Create and register cloned voices for later TTS only when the speaker has explicit consent. Use when the user asks for voice clone, clone voice, 克隆音色, 复刻声音, or wants a reusable voice_id.
| field | value |
| --- | --- |
| name | voice-clone-lab |
| description | Create and register cloned voices for later TTS only when the speaker has explicit consent. Use when the user asks for voice clone, clone voice, 克隆音色, 复刻声音, or wants a reusable voice_id. |
| triggers | ["voice clone","clone voice","克隆音色","复刻声音","声音克隆"] |
| provenance | {"origin":"opensquilla-original","license":"Apache-2.0","maintained_by":"OpenSquilla"} |
| metadata | {"opensquilla":{"risk":"high","capabilities":["network-read","filesystem-write"],"requires_tools":["voice_clone","audio_provider_capabilities"]}} |
Creates a reusable provider voice from a local sample. OpenRouter may help
summarize the request or produce labels, but cloning must use the direct
audio provider through voice_clone.
Before calling tools, extract these fields from the user request:
OpenRouter can summarize consent text or label a voice, but it is not an audio provider and cannot replace explicit consent.
1. Require consent_metadata before calling voice_clone, with these fields:
   - speaker
   - consent: true
   - sample_source
   - permitted_use
   - requested_by
2. Call audio_provider_capabilities if cloning availability is uncertain.
3. Call voice_clone with the sample, name, description, and consent metadata.
4. If voice_clone returns status=ok, return the voice ID first, then the consent summary, intended locale/accent, and any sample-quality warning.
5. If it returns consent_required, do not proceed with a workaround. Ask for the missing consent metadata in one concise question.
6. If it returns not_available, quote the note and distinguish disabled provider, key/quota limits, feature gating, and sample format issues.

Ask which target language and locale the cloned voice will be used for. A clone works best when the accent in the sample matches the accent expected for the target locale.
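The consent gate and status handling described above can be sketched in Python. This is only an illustration under stated assumptions: the page does not document the tool's calling convention, so the keyword arguments passed to voice_clone, the result keys (status, voice_id, note), and the helper names missing_consent_fields and clone_with_consent are all hypothetical. Only the consent field names and the three status values come from the text.

```python
from typing import Any, Callable, Dict, List

# Consent fields named in the skill text.
REQUIRED_CONSENT_FIELDS = (
    "speaker", "consent", "sample_source", "permitted_use", "requested_by",
)


def missing_consent_fields(consent_metadata: Dict[str, Any]) -> List[str]:
    """Return required consent fields that are absent, empty, or not affirmative."""
    missing = []
    for field in REQUIRED_CONSENT_FIELDS:
        value = consent_metadata.get(field)
        if field == "consent":
            # Only a literal True counts; strings such as "yes" are rejected.
            if value is not True:
                missing.append(field)
        elif not value:
            missing.append(field)
    return missing


def clone_with_consent(
    voice_clone: Callable[..., Dict[str, Any]],  # hypothetical tool wrapper
    sample_path: str,
    name: str,
    description: str,
    consent_metadata: Dict[str, Any],
) -> Dict[str, Any]:
    """Gate the clone call on consent, then map the tool status to a next action."""
    missing = missing_consent_fields(consent_metadata)
    if missing:
        # Never call the tool without complete consent metadata.
        return {
            "action": "ask_user",
            "message": "Please provide: " + ", ".join(missing),
        }

    # Assumed keyword names; the real tool signature may differ.
    result = voice_clone(
        sample=sample_path,
        name=name,
        description=description,
        consent_metadata=consent_metadata,
    )
    status = result.get("status")

    if status == "ok":
        # Voice ID goes first in the reply, then the consent summary.
        return {
            "action": "report",
            "voice_id": result.get("voice_id"),
            "consent_summary": {k: consent_metadata[k] for k in REQUIRED_CONSENT_FIELDS},
        }
    if status == "consent_required":
        # No workaround: ask one concise question for the missing metadata.
        return {
            "action": "ask_user",
            "message": "The provider needs complete consent metadata. What is missing?",
        }
    if status == "not_available":
        # Quote the provider note so the user can tell the causes apart.
        return {"action": "explain", "note": result.get("note", "")}
    return {"action": "explain", "note": f"Unexpected status: {status!r}"}
```

Checking consent before the tool call means an incomplete request never reaches the audio provider, while the consent_required branch still covers the case where the provider itself rejects the metadata.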
Return: