Media - Agent Skills | SkillsMP
Media
Explore agent skills for image, video, and audio processing. Edit and convert media files programmatically.
qqbot-media.md
363.2k
from
"openclaw/openclaw"
QQBot rich-media sending and receiving. Use the <qqmedia> tag; the system automatically detects the type (image/voice/video/file) from the file extension.
2026-03-31
camsnap.md
363.2k
from
"openclaw/openclaw"
Capture frames or clips from RTSP/ONVIF cameras.
2026-01-31
gifgrep.md
363.2k
from
"openclaw/openclaw"
Search GIF providers with CLI/TUI, download results, and extract stills/sheets.
2026-01-31
openai-whisper-api.md
363.2k
from
"openclaw/openclaw"
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
2026-03-30
songsee.md
363.2k
from
"openclaw/openclaw"
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
2026-01-31
video-frames.md
363.2k
from
"openclaw/openclaw"
Extract frames or short clips from videos using ffmpeg.
2026-03-11
fal-ai-media.md
165.9k
from
"affaan-m/everything-claude-code"
Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.
2026-03-12
video-editing.md
165.9k
from
"affaan-m/everything-claude-code"
AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.
2026-03-12
fal-ai-media.md
165.9k
from
"affaan-m/everything-claude-code"
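Unified media generation via fal.ai MCP: image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.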
2026-03-22
video-editing.md
165.9k
from
"affaan-m/everything-claude-code"
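AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, and fal.ai to final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.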
2026-03-22
videodb.md
165.9k
from
"affaan-m/everything-claude-code"
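See, understand, and act on video and audio. See: ingest content from local files, URLs, RTSP/live feeds, or live desktop recording; return real-time context and playable stream links. Understand: extract frames, build visual/semantic/temporal indexes, and search moments by timestamp with auto-clips. Act: transcode and normalize (codec, frame rate, resolution, aspect ratio), perform timeline edits (subtitles, text/image overlays, branding, audio overlays, dubbing, translation), generate media assets (image, audio, video), and create real-time alerts for events from live streams or desktop capture.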
2026-03-30
videodb.md
165.9k
from
"affaan-m/everything-claude-code"
See, understand, and act on video and audio. See: ingest from local files, URLs, RTSP/live feeds, or live desktop recording; return real-time context and playable stream links. Understand: extract frames, build visual/semantic/temporal indexes, and search moments with timestamps and auto-clips. Act: transcode and normalize (codec, fps, resolution, aspect ratio), perform timeline edits (subtitles, text/image overlays, branding, audio overlays, dubbing, translation), generate media assets (image, audio, video), and create real-time alerts for events from live streams or desktop capture.
2026-03-30
ascii-video.md
114.8k
from
"NousResearch/hermes-agent"
Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output.
2026-04-10
songsee.md
114.8k
from
"NousResearch/hermes-agent"
Generate spectrograms and audio feature visualizations (mel, chroma, MFCC, tempogram, etc.) from audio files via CLI. Useful for audio analysis, music production debugging, and visual documentation.
2026-03-13
image-enhancer.md
56.0k
from
"ComposioHQ/awesome-claude-skills"
Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.
2025-10-17
youtube-downloader.md
56.0k
from
"ComposioHQ/awesome-claude-skills"
Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.
2025-12-29
fal-image-edit.md
34.9k
from
"sickn33/antigravity-awesome-skills"
AI-powered image editing with style transfer and object removal
2026-04-13
fal-upscale.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Upscale and enhance image and video resolution using AI
2026-04-13
ffuf-claude-skill.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Web fuzzing with ffuf
2026-04-13
seo-images.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Image optimization analysis for SEO and performance. Checks alt text, file sizes, formats, responsive images, lazy loading, and CLS prevention. Use when user says "image optimization", "alt text", "image SEO", "image size", or "image audit".
2026-04-13
videodb.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Video and audio perception, indexing, and editing. Ingest files/URLs/live streams, build visual/spoken indexes, search with timestamps, edit timelines, add overlays/subtitles, generate media, and create real-time alerts.
2026-04-13
videodb-skills.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Upload, stream, search, edit, transcribe, and generate AI video and audio using the VideoDB SDK.
2026-04-13
mmx-cli.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Use mmx to generate text, images, video, speech, and music via the MiniMax AI platform. Use when the user wants to create media content, chat with MiniMax models, perform web search, or manage MiniMax API resources from the terminal.
2026-04-14
seek-and-analyze-video.md
34.9k
from
"sickn33/antigravity-awesome-skills"
Seek and analyze video content using Memories.ai Large Visual Memory Model for persistent video intelligence
2026-03-20
cli-anything-videocaptioner.md
32.5k
from
"HKUDS/CLI-Anything"
AI-powered video captioning — transcribe speech, optimize/translate subtitles, and burn them into video via the stable VideoCaptioner backend. Free ASR and translation included.
2026-04-20
image-manipulation-image-magick.md
31.1k
from
"github/awesome-copilot"
Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata. Use when working with images, creating thumbnails, resizing wallpapers, or performing batch image operations.
2026-01-21
java-add-graalvm-native-image-support.md
31.1k
from
"github/awesome-copilot"
GraalVM Native Image expert that adds native image support to Java applications, builds the project, analyzes build errors, applies fixes, and iterates until successful compilation using Oracle best practices.
2026-02-24
nano-banana-pro-openrouter.md
31.1k
from
"github/awesome-copilot"
Generate or edit images via OpenRouter with the Gemini 3 Pro Image model. Use for prompt-only image generation, image edits, and multi-image compositing; supports 1K/2K/4K output.
2026-02-10
transloadit-media-processing.md
31.1k
from
"github/awesome-copilot"
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
2026-02-17
sora.md
25.1k
from
"davila7/claude-code-templates"
Use when the user asks to generate, remix, poll, list, download, or delete Sora videos via OpenAI’s video API using the bundled CLI (`scripts/sora.py`), including requests like “generate AI video,” “Sora,” “video remix,” “download video/thumbnail/spritesheet,” and batch video generation; requires `OPENAI_API_KEY` and Sora API access.
2026-02-08
weixin-file-send.md
22.5k
from
"iOfficeAI/AionUi"
Use when the user wants a local file or image sent back, such as "send me the file" or "发给我".
2026-03-31
hyper-fast-youtube-transcript.md
19.6k
from
"kortix-ai/suna"
Use when the user wants a YouTube transcript from a single URL or video ID. Optimized for one input and one output: fetch the transcript fast, default to plain transcript text only, and avoid extra commentary unless the user asks for timestamps, JSON, or metadata. Triggers on: youtube transcript, transcript from this video, get captions, extract transcript from YouTube, summarize this YouTube transcript after fetching it.
2026-03-31
whisper.md
19.6k
from
"kortix-ai/suna"
Transcribe any audio or video file to text using Whisper (Groq or OpenAI). Use when the agent receives voice messages, audio files, video messages, or any media with speech. Triggers on: 'transcribe', 'what does this say', 'voice message', 'speech to text', 'audio', any file path ending in .ogg .mp3 .mp4 .wav .webm .m4a .flac .oga.
2026-04-04
camsnap.md
18.2k
from
"elizaOS/eliza"
Capture frames or clips from RTSP/ONVIF cameras. Grabs snapshots, video clips, and motion events from IP cameras, security cameras, and video streams. Use when the user wants to take a snapshot from a camera, record a clip from an RTSP stream, monitor motion on a security camera, discover ONVIF devices on the network, or configure camera access for automated surveillance capture.
2026-03-17
nano-banana-pro.md
18.2k
from
"elizaOS/eliza"
Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro). Use when the user asks to create an image, generate a picture, produce AI-generated artwork, edit a photo, compose multiple images, or upscale an image to higher resolution. Supports text-to-image generation, single-image editing, and multi-image composition using the Gemini API.
2026-03-17
sora.md
17.4k
from
"openai/skills"
Use when the user asks to generate, edit, extend, poll, list, download, or delete Sora videos, create reusable non-human Sora character references, or run local multi-video queues via the bundled CLI (`scripts/sora.py`); includes requests like: (i) generate AI video, (ii) edit this Sora clip, (iii) extend this video, (iv) create a character reference, (v) download video/thumbnail/spritesheet, and (vi) Sora batch planning; requires `OPENAI_API_KEY` and Sora API access.
2026-03-18
clip-hand-skill.md
17.0k
from
"RightNow-AI/openfang"
Expert knowledge for AI video clipping — yt-dlp downloading, whisper transcription, SRT generation, and ffmpeg processing
2026-03-04
baoyu-compress-image.md
16.3k
from
"JimLiu/baoyu-skills"
Compresses images to WebP (default) or PNG with automatic tool selection. Use when user asks to "compress image", "optimize image", "convert to webp", or reduce image file size.
2026-04-19
remotion-best-practices.md
14.4k
from
"vercel-labs/json-render"
Best practices for Remotion - Video creation in React
2026-02-05
videocaptioner.md
14.2k
from
"WEIFENG2333/VideoCaptioner"
Process video subtitles — transcribe speech, optimize/translate text, burn styled subtitles into video. Use when you need to add subtitles to a video, transcribe audio, translate subtitles, or customize subtitle styles.
2026-03-29
demo-video.md
12.6k
from
"alirezarezvani/claude-skills"
Use when the user asks to create a demo video, product walkthrough, feature showcase, animated presentation, marketing video, or GIF from screenshots or scene descriptions. Orchestrates playwright, ffmpeg, and edge-tts MCPs to produce polished video content.
2026-04-04
audioeditor.md
11.7k
from
"danielmiessler/Personal_AI_Infrastructure"
AI-powered audio/video editing — transcription, intelligent cut detection, automated editing with crossfades, and optional cloud polish. USE WHEN clean audio, edit audio, remove filler words, clean podcast, remove ums, fix audio, cut dead air, polish audio, clean recording, transcribe and edit.
2026-02-28
mmx-cli.md
11.2k
from
"MiniMax-AI/skills"
Use mmx to generate text, images, video, speech, and music via the MiniMax AI platform. Use when the user wants to create media content, chat with MiniMax models, perform web search, or manage MiniMax API resources from the terminal.
2026-04-08
analyzing-videos.md
9.8k
from
"yikart/AiToEarn"
Analyzes video content and extracts highlights. Use when user wants to analyze video, extract highlights, create video summary, generate video keywords, understand video content, find best moments, create trailer, extract exciting clips, get video insights, or identify viral moments. Keywords: video analysis, highlight extraction, video summary, video understanding, highlight reel, video keywords, best-of cuts, content analysis, trending clips.
2026-03-17
composing-videos.md
9.8k
from
"yikart/AiToEarn"
Combines multiple videos/images into a single video with optional background audio. Use when user wants to merge clips, concatenate videos, create slideshow from images, stitch videos together, combine media files, add background music to video, mix video with audio, create video montage, or join multiple video segments. Keywords: merge videos, stitch videos, images to video, add background music, video stitching, multi-image video generation, video mashup, asset compositing.
2026-03-17
editing-images.md
9.8k
from
"yikart/AiToEarn"
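Image editing using Sharp. Supports compositing (QR codes, logos, watermarks), resizing, cropping, rotating, flipping, brightness/contrast/saturation adjustment, blur, sharpen. Keywords: image editing, image compositing, add QR code, add logo, add watermark, image resizing, image cropping, image rotation, image flipping, brightness/contrast/saturation adjustment, blur, sharpen.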
2026-03-17
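Several entries above describe triggers based on a file's extension; the whisper.md entry ("kortix-ai/suna"), for instance, lists the extensions it reacts to. As a rough illustration only, such a check might look like the sketch below. The function name `should_transcribe` and the constant name are hypothetical and not taken from any listed skill; only the extension list itself comes from the entry.

```python
from pathlib import Path

# Extensions copied from the whisper.md entry's trigger description.
TRANSCRIBABLE_EXTENSIONS = {
    ".ogg", ".mp3", ".mp4", ".wav", ".webm", ".m4a", ".flac", ".oga",
}


def should_transcribe(file_path: str) -> bool:
    """Hypothetical helper: True if the path ends in a listed extension.

    The comparison is case-insensitive, which is an assumption; the
    entry does not say how the skill treats upper-case extensions.
    """
    return Path(file_path).suffix.lower() in TRANSCRIBABLE_EXTENSIONS


if __name__ == "__main__":
    for candidate in ("voice/note.OGG", "clip.mp4", "notes.txt"):
        print(candidate, should_transcribe(candidate))
```

Only the final suffix is inspected, so a name such as `archive.mp3.zip` would not match.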