Trim, cut, split, segment, and concatenate media with ffmpeg (stream copy when possible, re-encode across cut boundaries). Use when the user asks to trim a video, cut a clip, extract a segment by timestamps, remove a section, split into parts, join/merge videos, concatenate files, or build a segmented HLS-style playlist.
Install and configure OBS Studio programmatically: install via Homebrew cask / winget / Flatpak / apt, author profiles (basic.ini, streamEncoder.json, recordEncoder.json), author scene collections (scenes JSON), manage global.ini, set defaults for encoder / output / audio / hotkeys, and resolve cross-platform config paths. Use when the user asks to install OBS, set up an OBS profile, create a scene collection from code, configure OBS defaults without the GUI, edit basic.ini, manage multiple OBS profiles, or script a fresh OBS install with known-good settings.
Deep media inspection + automated QC — ffprobe stream details, MediaInfo diagnostics, VMAF/PSNR/SSIM quality metrics, PySceneDetect scene cuts, crop/silence/black-frame/interlacing detection, ffplay scope debugging, NAL/SEI bitstream forensics, metadata audits, loudness compliance against Spotify/Apple/ATSC/EBU specs, and automated CI QC gates. Use when the user says "QC this file", "run VMAF", "compare encoders", "detect scene cuts", "check loudness compliance", "validate delivery spec", "automated QC pipeline", or any deep inspection / quality gating.
Routes media production requests to the right Media OS specialist subagent. ALWAYS use this skill when the user expresses ANY media production intent — "go live", "start streaming", "OBS broadcast", "wire up live rig", "NDI to stream", "PTZ camera setup", "DeckLink capture stream", "HLS deliver", "DASH package", "encode for streaming", "CDN upload", "make a HLS manifest", "multi-bitrate ladder", "CMAF package", "LL-HLS", "low-latency stream", "Widevine package", "DRM package", "package for streaming", "broadcast deliver", "MXF master", "IMF package", "ProRes master", "DPP deliver", "AS-11 deliver", "Netflix deliver", "broadcast spec deliver", "deliver for air", "create IMF", "Premiere to Resolve", "round-trip", "OTIO export", "editorial conform", "XML round-trip", "FCPXML", "EDL export", "AAF export", "convert timeline", "Avid to Premiere", "Resolve to Premiere", "upscale this", "interpolate frames", "denoise AI", "remove background", "rotoscope", "matte", "depth estimate", "AI upscale", "RIFE", "Real-ESRGAN"
Ingest from every source — web (yt-dlp), screen / webcam / mic capture, SDI (DeckLink), DSLR tether (gphoto2), NDI network sources, RTSP / IP cameras via MediaMTX, PTZ — then verify integrity (SHA-256 + full-decode pass), preserve metadata (EXIF / XMP / IPTC), normalize to MKV or FFV1 / J2K / ProRes archival formats, and push to cold-storage cloud (Glacier / B2 / Archive.org). Use when the user says "download YouTube playlist", "capture SDI for 24 hours", "archive IP cameras", "preserve VHS rips", "tether DSLR timelapse", "cold-storage upload", or any ingest-to-archive workflow.
Restore, upscale, and enhance existing footage using 2026 open-source AI models — Real-ESRGAN/SwinIR/HAT super-resolution, RIFE/FILM interpolation, DeepFilterNet/RNNoise audio denoise, rembg/BiRefNet/RVM matting, Depth-Anything v2 depth — with a strict OSI-open, commercial-safe license filter. Use when the user says "upscale old footage", "remaster", "enhance quality", "30 to 60fps", "AI denoise", "restore VHS", "remove background from video", or anything about AI-driven footage restoration.
Generate media from scratch with 2026 open-source AI — TTS voiceover (Kokoro / OpenVoice / Piper), image gen (FLUX-schnell / Kolors / Sana / ComfyUI), video gen (LTX-Video / CogVideoX / Mochi / Wan), music (Riffusion / YuE), lipsync talking heads (LivePortrait / LatentSync), OCR (PaddleOCR / Tesseract 5 / TrOCR), zero-shot tagging (CLIP / SigLIP / BLIP-2 / LLaVA). Strict commercial-safe license filter. Use when the user says "generate a video", "TTS voiceover", "AI explainer video", "clone my voice", "generate music", "AI image", "digital human", or anything about from-scratch AI media.
Full-stack audio routing across macOS / Linux / Windows, MIDI + OSC control surfaces, DAW integration over JACK, DSP (EQ / compression / loudness / spatial / binaural HRTF), AI denoise (DeepFilterNet / RNNoise / Resemble Enhance), stem separation (Demucs), TTS, Whisper transcription, and EBU R128 / Spotify / Apple / ATSC loudness certification. Use when the user says "set up DAW", "route audio through OBS", "MIDI control surface", "binaural ASMR", "stem separation", "loudness to −16 LUFS", "JACK transport", or anything audio-stack-related.