heartmula
Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support.
| name | HeartMula |
| skill_id | heartmula |
| description | Set up and run HeartMuLa, the open-source music generation model family (Suno-like). Generates full songs from lyrics + tags with multilingual support. |
| version | 1.0.0 |
| source_kind | elephant-builtin |
| category | media |
| default_enabled | true |
HeartMuLa is a family of open-source (Apache-2.0) music foundation models that generate music conditioned on lyrics and tags. It is an open-source counterpart to Suno.
--lazy_load true (loads/unloads models sequentially)
--mula_device cuda:0 --codec_device cuda:1 to split across GPUs
cd ~/ # or desired directory
git clone https://github.com/HeartMuLa/heartlib.git
cd heartlib
uv venv --python 3.10 .venv
. .venv/bin/activate
uv pip install -e .
IMPORTANT: As of Feb 2026, the pinned dependencies have conflicts with newer packages. Apply these fixes:
# Upgrade datasets (old version incompatible with current pyarrow)
uv pip install --upgrade datasets
# Upgrade transformers (needed for huggingface-hub 1.x compatibility)
uv pip install --upgrade transformers
Patch 1 - RoPE cache fix in src/heartlib/heartmula/modeling_heartmula.py:
In the setup_caches method of the HeartMuLa class, add RoPE reinitialization after the reset_caches try/except block and before the with device: block:
# Re-initialize RoPE caches that were skipped during meta-device loading
from torchtune.models.llama3_1._position_embeddings import Llama3ScaledRoPE
for module in self.modules():
    if isinstance(module, Llama3ScaledRoPE) and not module.is_cache_built:
        module.rope_init()
        module.to(device)
Why: from_pretrained first creates the model on the meta device. Llama3ScaledRoPE.rope_init() skips building its cache on meta tensors and never rebuilds it after the weights are loaded onto the real device.
Patch 2 - HeartCodec loading fix in src/heartlib/pipelines/music_generation.py:
Add ignore_mismatched_sizes=True to ALL HeartCodec.from_pretrained() calls (there are 2: the eager load in __init__ and the lazy load in the codec property).
Why: The VQ codebook "initted" buffers have shape [1] in the checkpoint but shape [] in the model. They hold the same data, stored as a one-element 1-D tensor versus a 0-d scalar tensor, so the mismatch is safe to ignore.
cd heartlib # project root
hf download --local-dir './ckpt' 'HeartMuLa/HeartMuLaGen'
hf download --local-dir './ckpt/HeartMuLa-oss-3B' 'HeartMuLa/HeartMuLa-oss-3B-happy-new-year'
hf download --local-dir './ckpt/HeartCodec-oss' 'HeartMuLa/HeartCodec-oss-20260123'
All 3 can be downloaded in parallel. Total size is several GB.
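As a rough illustration of downloading the three checkpoints in parallel, the sketch below launches the same `hf download` commands from a thread pool. The helper names `build_command` and `download_all` are hypothetical, and the script assumes the `hf` CLI is on PATH and that it is run from the heartlib project root.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# (local_dir, repo_id) pairs copied from the three commands above.
DOWNLOADS = [
    ("./ckpt", "HeartMuLa/HeartMuLaGen"),
    ("./ckpt/HeartMuLa-oss-3B", "HeartMuLa/HeartMuLa-oss-3B-happy-new-year"),
    ("./ckpt/HeartCodec-oss", "HeartMuLa/HeartCodec-oss-20260123"),
]


def build_command(local_dir: str, repo_id: str) -> list[str]:
    """Return the argv for a single `hf download` call."""
    return ["hf", "download", "--local-dir", local_dir, repo_id]


def download_all() -> list[int]:
    """Run all downloads concurrently and return their exit codes in order."""
    commands = [build_command(d, r) for d, r in DOWNLOADS]
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        # Each worker blocks on one subprocess; the threads only wait on I/O.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Calling `download_all()` returns one exit code per repository; any non-zero value means that download should be retried.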
HeartMuLa uses CUDA by default (--mula_device cuda --codec_device cuda). No extra setup needed if the user has an NVIDIA GPU with PyTorch CUDA support installed.
torch==2.4.1 includes CUDA 12.1 support out of the box.
torchtune may report version 0.4.0+cpu. This is just package metadata; it still uses CUDA via PyTorch.
CPU fallback: --mula_device cpu --codec_device cpu works, but expect generation to be extremely slow (potentially 30-60+ minutes for a single song vs ~4 minutes on GPU). CPU mode also requires significant RAM (~12GB+ free). If the user has no NVIDIA GPU, recommend a cloud GPU service (Google Colab free tier with T4, Lambda Labs, etc.) or the online demo at https://heartmula.github.io/ instead.
cd heartlib
. .venv/bin/activate
python ./examples/run_music_generation.py \
--model_path=./ckpt \
--version="3B" \
--lyrics="./assets/lyrics.txt" \
--tags="./assets/tags.txt" \
--save_path="./assets/output.mp3" \
--lazy_load true
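The device flags mentioned in this guide can be chosen from the number of visible GPUs. The sketch below is one possible policy, not something heartlib ships: `device_flags` and `count_cuda_gpus` are hypothetical helpers, and the rule "two GPUs means split, one GPU means lazy load" is an assumption layered on the flags documented here.

```python
def count_cuda_gpus() -> int:
    """Return the number of CUDA devices PyTorch can see (0 if torch is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def device_flags(cuda_gpus: int) -> list[str]:
    """Map a GPU count to the CLI flags described in this guide."""
    if cuda_gpus >= 2:
        # Two or more GPUs: place HeartMuLa and HeartCodec on separate cards.
        return ["--mula_device", "cuda:0", "--codec_device", "cuda:1"]
    if cuda_gpus == 1:
        # One GPU: share it and load the models sequentially to save VRAM.
        return ["--mula_device", "cuda", "--codec_device", "cuda",
                "--lazy_load", "true"]
    # No GPU: CPU fallback, which the guide warns is extremely slow.
    return ["--mula_device", "cpu", "--codec_device", "cpu"]
```

The returned list can be appended to the argument list of the run_music_generation.py command, for example `device_flags(count_cuda_gpus())`.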
Tags (comma-separated, no spaces):
piano,happy,wedding,synthesizer,romantic
or
rock,energetic,guitar,drums,male-vocal
Lyrics (use bracketed structural tags):
[Intro]
[Verse]
Your lyrics here...
[Chorus]
Chorus lyrics...
[Bridge]
Bridge lyrics...
[Outro]
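The two input formats above can be checked before a long generation run. The following is a minimal sketch with hypothetical helpers: `format_tags` joins tags with commas and no spaces (turning inner whitespace into hyphens is an assumption based on the `male-vocal` example), and `section_headers` lists the bracketed structural markers found in a lyrics string.

```python
import re

# A structural marker is a line that consists only of one bracketed label.
SECTION_RE = re.compile(r"^\[[^\[\]]+\]$")


def format_tags(tags: list[str]) -> str:
    """Join tags into the comma-separated, space-free form used in tags.txt."""
    cleaned = []
    for tag in tags:
        tag = tag.strip()
        if not tag:
            continue  # skip empty entries
        if "," in tag:
            raise ValueError(f"tag must not contain a comma: {tag!r}")
        # Assumption: inner whitespace becomes a hyphen, as in "male-vocal".
        cleaned.append(re.sub(r"\s+", "-", tag))
    return ",".join(cleaned)


def section_headers(lyrics: str) -> list[str]:
    """Return the bracketed section markers in the lyrics, in order."""
    stripped = (line.strip() for line in lyrics.splitlines())
    return [line for line in stripped if SECTION_RE.match(line)]
```

An empty result from `section_headers` suggests the lyrics file is missing its structural tags.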
| Parameter | Default | Description |
|---|---|---|
| --max_audio_length_ms | 240000 | Max length in ms (240000 ms = 240 s = 4 min) |
| --topk | 50 | Top-k sampling |
| --temperature | 1.0 | Sampling temperature |
| --cfg_scale | 1.5 | Classifier-free guidance scale |
| --lazy_load | false | Load/unload models on demand (saves VRAM) |
| --mula_dtype | bfloat16 | Dtype for HeartMuLa (bf16 recommended) |
| --codec_dtype | float32 | Dtype for HeartCodec (fp32 recommended for quality) |
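To make the defaults in the table easier to override from a script, the sketch below turns keyword overrides into extra command-line flags. `DEFAULTS` mirrors the table; `build_args` is a hypothetical helper, and emitting only flags that differ from the defaults is a design choice of this example rather than a requirement of the script.

```python
# Defaults copied from the parameter table above.
DEFAULTS = {
    "max_audio_length_ms": 240000,
    "topk": 50,
    "temperature": 1.0,
    "cfg_scale": 1.5,
    "lazy_load": "false",
    "mula_dtype": "bfloat16",
    "codec_dtype": "float32",
}


def build_args(**overrides) -> list[str]:
    """Return `--name=value` flags for every override that differs from its default."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    args = []
    for name, default in DEFAULTS.items():  # keep the table's order
        value = overrides.get(name, default)
        if value != default:
            args.append(f"--{name}={value}")
    return args
```

For instance, `build_args(max_audio_length_ms=60000, temperature=0.8)` yields the two flags needed for a one-minute clip with a lower sampling temperature.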