ck-ai-multimodal

Analyze images, audio, and video with the Gemini API (better vision than Claude). Generate images (Imagen 4, Nano Banana 2, MiniMax), videos (Veo 3, Hailuo), speech (MiniMax TTS), and music (MiniMax). Use for vision analysis, transcription, OCR, design extraction, and other multimodal AI tasks.

stars: 1,141
forks: 378
updated: May 9, 2026 at 17:04
files: 26 (including SKILL.md)