$pwd:

multimodal-llm

// Vision, audio, video generation, and multimodal LLM integration patterns. Use when processing images, transcribing audio, generating speech, generating AI video (Kling v3, Sora 2, Veo 3.1 std/lite/fast, Runway Gen-4.5 via `gen4_turbo`), or building multimodal AI pipelines.

$ git log --oneline --stat
stars:171
forks:15
updated:April 17, 2026 at 15:49
File Explorer
13 files
SKILL.md