ai-music-skill

Deploy, compile, and operate acestep.cpp for local GPU-accelerated AI music generation, plus full-stack Codi webapp deployment. Use when: (1) Building/compiling ace-server with CUDA support on Windows, (2) Selecting appropriate models for GPUs with 8GB VRAM, (3) Starting ace-server and generating songs via HTTP API, (4) Troubleshooting build issues (MSVC/CUDA version conflicts), (5) Writing song generation prompts for the 5Hz LM model, (6) Running the complete Codi webapp (FastAPI backend + Vite frontend + SQLite), (7) Open-sourcing or modifying the Codi app (removing payment/sheet music, adding MP3/MP4 downloads). Covers end-to-end workflow from engine compilation through full-stack web deployment.

Install command

npx skills add https://github.com/aresbit/codi --skill ai-music-skill

Copy this command and paste it into Claude Code to install the skill.

Source: aresbit/codi
Stars: 3
Forks: 0
Updated: 5 May 2026, 03:02

AI Music Generation — acestep.cpp GPU

Compile, deploy, and run ace-server with CUDA acceleration for local AI music synthesis.

Quick Start

1. Compile GPU Version

call "D:\Program Files (x86)\Microsoft Visual Studio\18\BuildTools\VC\Auxiliary\Build\vcvars64.bat"
mkdir build-gpu && cd build-gpu
cmake .. -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_ARCHITECTURES="86-real" -DCMAKE_CUDA_FLAGS="-allow-unsupported-compiler" -DGGML_CCACHE=OFF -G Ninja
cmake --build . --config Release -j %NUMBER_OF_PROCESSORS%

Or run the bundled script:

scripts\build-gpu.cmd [path/to/acestep.cpp/source]

See references/build-guide.md for detailed build environment information.
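The `86-real` value in the cmake command corresponds to CUDA compute capability 8.6, which is the Ampere generation that includes the RTX 3060 Ti. On a different GPU the value needs to change. The following is a minimal sketch of that mapping, assuming you already know your card's compute capability from NVIDIA's published tables; the helper name `cuda_arch_value` is hypothetical and not part of the skill.

```python
def cuda_arch_value(compute_capability: str) -> str:
    """Turn a compute capability such as "8.6" into a CMake value such as "86-real"."""
    major, minor = compute_capability.strip().split(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"not a compute capability: {compute_capability!r}")
    # The "-real" suffix asks CMake to build native code for this GPU only,
    # without an additional PTX fallback for other architectures.
    return f"{int(major)}{int(minor)}-real"


print(cuda_arch_value("8.6"))  # 86-real
```

Pass the result as `-DCMAKE_CUDA_ARCHITECTURES="..."` in place of `"86-real"`.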

2. Start Server

ace-server.exe --models ../models --host 127.0.0.1 --port 8085 --vae-chunk 128 --vae-overlap 32

The server listens on port 8085. Verify that it is up with curl http://127.0.0.1:8085/health.
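When the server is started from a script, it can help to wait for the health endpoint before sending any work. The sketch below polls `/health` using only the standard library. It assumes the endpoint answers with HTTP 200 once the server is ready, which the source does not state explicitly; the function name `wait_for_server` and its parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://127.0.0.1:8085/health", timeout_s=60.0,
                    interval_s=1.0, fetch=None):
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""

    def default_fetch(u):
        # An empty ProxyHandler bypasses any system proxy for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(u, timeout=5) as resp:
            return resp.status

    fetch = fetch or default_fetch
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if fetch(url) == 200:  # ASSUMPTION: 200 means "ready"
                return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_server()` right after launching ace-server.exe returns `True` as soon as the endpoint responds, or `False` after the timeout.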

3. Generate a Song

python scripts/test_generation.py --caption folk --output ./my_song

Or use custom prompts:

python scripts/test_generation.py --caption custom --custom-text "Style: ..." --output ./custom_song

See references/api-guide.md for the full API reference.

Model Selection (8GB VRAM)

For an RTX 3060 Ti (8 GB), use these quantized models (about 5.5 GB in total):

Component	Model File	Size
LM	`acestep-5Hz-lm-4B-Q5_K_M.gguf`	3.0 GB
DiT/Synth	`acestep-v15-turbo-Q4_K_M.gguf`	1.4 GB
VAE	`vae-BF16.gguf`	337 MB
Text Enc	`Qwen3-Embedding-0.6B-Q8_0.gguf`	784 MB

See references/model-guide.md for detailed model guidance and parameter tuning.
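The total quoted above can be checked with a few lines of arithmetic. This sketch sums the file sizes from the table and compares them with an 8 GB card. It assumes 1 GB = 1000 MB and treats file size as a rough stand-in for the VRAM taken by the weights; actual usage at run time is higher because activations and intermediate buffers also need memory.

```python
# File sizes copied from the table above, converted to GB (1 GB = 1000 MB).
MODEL_SIZES_GB = {
    "acestep-5Hz-lm-4B-Q5_K_M.gguf": 3.0,
    "acestep-v15-turbo-Q4_K_M.gguf": 1.4,
    "vae-BF16.gguf": 0.337,
    "Qwen3-Embedding-0.6B-Q8_0.gguf": 0.784,
}
VRAM_GB = 8.0

weights_gb = sum(MODEL_SIZES_GB.values())
headroom_gb = VRAM_GB - weights_gb  # left over for activations and buffers

print(f"weights: {weights_gb:.2f} GB, headroom: {headroom_gb:.2f} GB")
# weights: 5.52 GB, headroom: 2.48 GB
```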

Inference Parameters (8GB Optimized)

inference_steps: 8 (turbo model, fewer steps = faster)
guidance_scale: 1.0 (no guidance, default for turbo)
shift: 3.0 (turbo flow-matching shift)
duration: 180 (seconds, ~3 min)
vae-chunk: 128 (server flag --vae-chunk; decodes the VAE in chunks to save VRAM)
vae-overlap: 32 (server flag --vae-overlap; overlap between adjacent chunks)
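The values above can be kept in one place and overridden per request. The sketch below is one way to do that. The parameter names come from the list above, but whether the server accepts them under exactly these JSON keys, and the `caption` key itself, are assumptions; references/api-guide.md is the authority. The helper `build_request` is hypothetical.

```python
import json

# Request-level defaults for the turbo model, taken from the list above.
TURBO_DEFAULTS = {
    "inference_steps": 8,
    "guidance_scale": 1.0,
    "shift": 3.0,
    "duration": 180,
}

# vae-chunk and vae-overlap are ace-server command-line flags, not request fields.
SERVER_FLAGS = ["--vae-chunk", "128", "--vae-overlap", "32"]


def build_request(caption, **overrides):
    """Merge per-request overrides into the defaults, rejecting unknown names."""
    unknown = set(overrides) - set(TURBO_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    # ASSUMPTION: the prompt text travels under a "caption" key.
    return {"caption": caption, **TURBO_DEFAULTS, **overrides}


print(json.dumps(build_request("Style: folk", inference_steps=24), indent=2))
```

Rejecting unknown names catches typos such as `inference_step` before a request is sent.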

Generation Workflow

Two-Phase API Flow

Phase 1 (LM):    POST /lm → get job_id → poll GET /job?id=X → done → GET /job?id=X&result=1 → enriched JSON
Phase 2 (Synth): POST /synth (body=enriched JSON) → get job_id → poll GET /job?id=X → done → GET /job?id=X&result=1 → MP3 binary
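The two-phase flow above can be sketched as a small client. The endpoint paths are the ones listed above; the JSON field names `id` and `status` and the status value `done` are assumptions made for illustration, as is the idea that `/synth` hands back a job id the same way `/lm` does. Check references/api-guide.md for the real response shapes. The polling timeout is generous because the first request may spend minutes loading models.

```python
import json
import time
import urllib.request

BASE = "http://127.0.0.1:8085"


def http(method, path, body=None):
    """Send one request to ace-server and return the raw response bytes."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(BASE + path, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def run_job(submit_path, body, http=http, poll_s=2.0, timeout_s=600.0):
    """Submit a job, poll until it reports done, and return the raw result bytes."""
    # ASSUMPTION: the submit response is JSON with an "id" field.
    job_id = json.loads(http("POST", submit_path, body))["id"]
    deadline = time.monotonic() + timeout_s
    while True:
        # ASSUMPTION: the poll response is JSON with a "status" field.
        status = json.loads(http("GET", f"/job?id={job_id}"))["status"]
        if status == "done":
            return http("GET", f"/job?id={job_id}&result=1")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_s}s")
        time.sleep(poll_s)


def generate(request, http=http, **kw):
    """Phase 1 enriches the request via /lm; phase 2 renders it via /synth."""
    enriched = json.loads(run_job("/lm", request, http=http, **kw))
    return run_job("/synth", enriched, http=http, **kw)  # MP3 bytes
```

The `http` parameter is injectable so the flow can be exercised against a stub without a running server.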

Prompt Structure

Write each caption as four fields: style description, theme, language, and song structure:

Style: [genre, instruments, vocal type, mood]
Theme: [1-2 sentence theme]
Language: Chinese
Structure: intro → verse → chorus → verse → chorus → bridge → chorus → outro
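A caption in this layout is easy to assemble programmatically. The helper below is a hypothetical sketch, not part of the bundled scripts, and the field values in the example are illustrative only.

```python
def build_caption(style, theme, language, structure):
    """Assemble the four caption fields into the layout shown above."""
    return "\n".join([
        f"Style: {style}",
        f"Theme: {theme}",
        f"Language: {language}",
        "Structure: " + " → ".join(structure),
    ])


caption = build_caption(
    style="folk, acoustic guitar, female vocal, warm",
    theme="A quiet evening walk home after rain.",
    language="Chinese",
    structure=["intro", "verse", "chorus", "verse", "chorus",
               "bridge", "chorus", "outro"],
)
print(caption)
```

The resulting string can be passed to test_generation.py through `--custom-text`.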

LM Model Behavior

Read references/lm-notes.md for details. Key behavior:

Chinese output is pinyin with tone numbers (phoneme tokenization), not Chinese characters
English output is plain text
Model mixes [zh] and [en] tags per line
Cannot exactly reproduce user-provided lyrics — generates original content
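Because the output mixes languages line by line, downstream code may want to separate them. The sketch below assumes each lyric line starts with a literal `[zh]` or `[en]` tag; the source only says the tags appear per line, so the exact position is an assumption. The sample text is hand-written for illustration and is not real model output.

```python
import re

# ASSUMPTION: the language tag is the first thing on each lyric line.
TAG_RE = re.compile(r"^\[(zh|en)\]\s*(.*)$")


def split_by_language(lyrics):
    """Return (language, text) pairs; language is None for untagged lines."""
    out = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        m = TAG_RE.match(line)
        if m:
            out.append((m.group(1), m.group(2)))
        else:
            out.append((None, line))  # e.g. section markers
    return out


sample = "[zh] ni3 hao3\n[en] hello again\n\n[chorus]"
print(split_by_language(sample))
```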

Parameter Tuning

Quality vs Speed: Increase inference_steps to 24-50 for better quality (slower, more VRAM)
Prompt Adherence: Increase guidance_scale to 3.0-7.0 (risk of audio distortion)
Song Length: Adjust duration (longer = more VRAM)
VRAM Saving: Lower vae-chunk, enable vae-overlap

Troubleshooting

MSVC version too new for CUDA: Add -DCMAKE_CUDA_FLAGS="-allow-unsupported-compiler". CUDA 12.8 supports up to VS 2022; VS 2026 triggers a version check error.

Proxy interference: Set NO_PROXY=127.0.0.1,localhost when running Python test scripts.

Job timeout: The first request loads the models into GPU memory, which can take 1-3 minutes before generation starts. Allow for this in client-side timeouts.

Out of VRAM: Reduce duration, lower vae-chunk, or switch to CPU backend.
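The proxy workaround above can also be applied from inside a Python script, before any HTTP call is made. This is a minimal sketch with a hypothetical helper name; it keeps any hosts already listed in the variable instead of overwriting them.

```python
import os


def bypass_proxy_for_localhost():
    """Add 127.0.0.1 and localhost to NO_PROXY without dropping existing entries."""
    wanted = ["127.0.0.1", "localhost"]
    existing = os.environ.get("NO_PROXY") or os.environ.get("no_proxy", "")
    current = [h.strip() for h in existing.split(",") if h.strip()]
    merged = current + [h for h in wanted if h not in current]
    value = ",".join(merged)
    # Set both spellings, since tools differ on which one they read.
    os.environ["NO_PROXY"] = value
    os.environ["no_proxy"] = value
    return value
```

Call `bypass_proxy_for_localhost()` once at the top of the script, before the first request to ace-server.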

Resources

scripts/

build-gpu.cmd — One-click CUDA build script for ace-server
test_generation.py — Python script to generate songs via ace-server API (3 presets + custom mode)

references/

build-guide.md — Build environment setup, MSVC+CUDA details, known issues
model-guide.md — Model selection, VRAM budget, inference parameter tuning
api-guide.md — Full ace-server HTTP API reference
lm-notes.md — LM model behavior, pinyin output, prompt engineering tips
fullstack-deployment.md — Codi webapp full-stack setup, open-source modifications, database management, and troubleshooting
