| name | slack-media-analysis |
| description | Analyze screenshots, MP4/video/audio attachments, raw PDFs, screen recordings, and voice notes from Slack, Telegram, and other inbound channels with preprocessing, transcript extraction, keyframes, thread context, and Gemini-first multimodal reasoning. Use when Sunil asks Linus what is happening in a recording, screenshot, PDF, voice note, or other media evidence. |
| allowed-tools | Bash, Read, Write |
Slack Media Analysis
Use this skill when a Slack thread, Telegram chat, or other inbound channel contains media evidence:
- screenshots
- MP4 screen recordings
- short product demos
- voice notes
- audio clips
- videos with narration
- raw PDFs
Goals:
- produce a grounded synopsis
- extract timestamps and visible states
- preserve transcript/keyframes as evidence
- identify likely errors, reproduction steps, and next actions
Slack audience
Anyone in the company Slack workspace may ask Linus to analyze media.
Slack is still split into two modes:
- direct Slack DM with Sunil (sunil@tribble.ai, Slack user U0528KFHAE8) may use Sunil-private context when the task is actually personal or operator-private
- every other Slack surface, including channels, shared threads, group DMs, and DMs with other coworkers, must stay product/engineering-only
Whichever mode applies, do not use private owner context or personal-memory context for this skill. In Slack, operate as an engineering/product operator only.
Backend policy
Default architecture:
- preserve the raw Slack file and message context
- preprocess with ffprobe / ffmpeg
- transcribe audio when possible
- use Gemini on Vertex as the primary multimodal analyzer when Vertex credentials are available
- fall back to direct Gemini API key auth only if Vertex auth is unavailable
- save structured JSON and a Markdown report
Do not rely on "sample every 3 seconds and guess" as the only path.
Keyframes are evidence and fallback, not the primary understanding engine.
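The preference order above can be sketched as a small selector. This is a minimal illustration, not the logic in analyze_slack_media.py: the function name and the returned labels are hypothetical, and it only inspects environment variables, whereas the real script also looks in known local credential paths.

```python
import os
from typing import Mapping, Optional


def choose_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick an analysis backend in the documented preference order.

    Returns "vertex", "gemini_api", or "packet_only". These labels are
    illustrative and are not values used by the real scripts.
    """
    env = os.environ if env is None else env

    # Vertex is preferred whenever a service-account credential is configured.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "vertex"

    # Direct Gemini API key auth is only a fallback.
    if env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"):
        return "gemini_api"

    # No Gemini access: only the transcript and keyframes remain as evidence.
    return "packet_only"
```

A "packet_only" result is the case where the report should say the analysis was packet-based rather than full native-video analysis.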
Thread packet first
When the ask starts from a Slack thread, build the packet first:
python3 /root/.openclaw/workspace/skills/slack-media-analysis/scripts/build_slack_media_packet.py \
--channel <CHANNEL_ID> \
--thread-ts <THREAD_TS> \
--out /tmp/slack-media-<thread-ts>
That packet preserves:
- thread context
- raw media metadata
- raw document metadata
- ffprobe output
- transcript when available
- keyframes for video
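As a rough illustration of the ffprobe step, the sketch below builds a JSON-emitting ffprobe command and reduces its output to the facts that decide whether transcription or keyframe extraction is worth attempting. The helper names and the summary keys are assumptions for this example; the packet builder's actual probe handling may differ.

```python
import json
import subprocess
from typing import Any, Dict, List


def ffprobe_command(path: str) -> List[str]:
    """Build an ffprobe invocation that prints container and stream info as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]


def run_probe(path: str) -> Dict[str, Any]:
    """Run ffprobe (must be on PATH) and parse its JSON output."""
    done = subprocess.run(ffprobe_command(path), capture_output=True, text=True, check=True)
    return json.loads(done.stdout)


def summarize_probe(probe: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce ffprobe JSON to stream kinds and duration."""
    kinds = {s.get("codec_type") for s in probe.get("streams", [])}
    duration = probe.get("format", {}).get("duration")  # ffprobe reports a string
    return {
        # Note: ffprobe can report a still image as a "video" stream too.
        "has_video": "video" in kinds,
        "has_audio": "audio" in kinds,
        "duration_s": float(duration) if duration is not None else None,
    }
```

For example, `summarize_probe(run_probe("/path/to/local/video.mp4"))` would show whether there is an audio track to transcribe before any model call is made.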
For a local smoke test:
python3 /root/.openclaw/workspace/skills/slack-media-analysis/scripts/build_slack_media_packet.py \
--file /path/to/local/video.mp4 \
--title "local-smoke-test" \
--out /tmp/local-media-smoke
Inputs
The main script accepts:
- a local media path
- a Slack private download URL
- an optional focused question
- an optional bug mode for UI / troubleshooting clips
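The two input kinds need different handling: a local path is read directly, while a private Slack URL needs a bearer token on the download request. The sketch below shows one way to tell them apart. It assumes private Slack file URLs are served from files.slack.com, and both function names are hypothetical rather than taken from the script.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def classify_input(value: str) -> str:
    """Return "slack_url", "other_url", or "local_path" for an --input value."""
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https"):
        # Assumption: private Slack file URLs live on files.slack.com.
        host = (parsed.hostname or "").lower()
        return "slack_url" if host == "files.slack.com" else "other_url"
    return "local_path"


def download_headers(kind: str, slack_bot_token: Optional[str]) -> Dict[str, str]:
    """Headers for fetching the input; only Slack URLs need authorization."""
    if kind != "slack_url":
        return {}
    if not slack_bot_token:
        raise ValueError("SLACK_BOT_TOKEN is required for private Slack file URLs")
    return {"Authorization": f"Bearer {slack_bot_token}"}
```

Failing early on a missing token avoids saving Slack's HTML login page as if it were the media file.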
Script
Primary helper:
python3 /root/.openclaw/workspace/skills/slack-media-analysis/scripts/analyze_slack_media.py \
--input "<local-path-or-slack-url>" \
--mode bug \
--question "What is failing in this recording, and what are the likely reproduction steps?"
Local repo path:
python3 deploy/openclaw-skills/slack-media-analysis/scripts/analyze_slack_media.py \
--input "<local-path-or-slack-url>" \
--mode bug \
--question "What is failing in this recording, and what are the likely reproduction steps?"
Output
The script writes an artifact directory with:
- analysis.json
- report.md
- probe.json
- transcript.txt when available
- keyframes/ when available
For PDFs, the raw file is sent directly to Gemini. There is no ffprobe/audio/keyframe stage, but the same analysis/report artifacts are still written.
When the packet builder is used first, also expect:
- manifest.json
- messages.json
- summary.txt
- files/<id-or-name>/metadata.json
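A quick completeness check on a finished run might look like the sketch below. The file names come from the lists above. Two things are assumptions: that probe.json is skipped for PDFs (inferred from there being no ffprobe stage), and that the packet files sit in the same directory being checked. The function name is hypothetical.

```python
from typing import Iterable, List

# Always-expected artifacts for audio/video/image inputs.
EXPECTED = ("analysis.json", "report.md", "probe.json")
# Extra files expected when the packet builder ran first.
PACKET_EXTRAS = ("manifest.json", "messages.json", "summary.txt")


def missing_artifacts(present: Iterable[str], *, is_pdf: bool = False,
                      from_packet: bool = False) -> List[str]:
    """Return expected artifact names that are absent from `present`."""
    names = set(present)
    # Assumption: PDFs skip the ffprobe stage, so probe.json is not expected.
    expected = [n for n in EXPECTED if not (is_pdf and n == "probe.json")]
    if from_packet:
        expected += list(PACKET_EXTRAS)
    return [n for n in expected if n not in names]
```

Typical use would be `missing_artifacts(os.listdir(out_dir))`. transcript.txt and keyframes/ are left out on purpose because they are only written when available.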
The report should capture:
- summary
- timeline
- visible UI states
- errors observed
- likely reproduction steps
- likely root causes
- confidence
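To make the timeline section concrete, here is a minimal sketch of turning timestamped observations into report lines. The TimelineEvent shape and the line format are assumptions for illustration; the actual schema of analysis.json is not specified here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TimelineEvent:
    t_seconds: float   # offset from the start of the recording
    observation: str   # what is visible or audible at that moment


def format_timestamp(t_seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS for recordings of an hour or more."""
    total = int(t_seconds)  # truncate fractional seconds
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours:d}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def render_timeline(events: Iterable[TimelineEvent]) -> str:
    """Render events in chronological order as Markdown list items."""
    ordered = sorted(events, key=lambda e: e.t_seconds)
    return "\n".join(f"- {format_timestamp(e.t_seconds)} {e.observation}" for e in ordered)
```

Sorting before rendering keeps the timeline readable even if the model returns observations out of order.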
Environment
Preferred:
- Vertex auth through a service-account credential file or GOOGLE_APPLICATION_CREDENTIALS
- default local project: tribble-ai
- default Vertex location: global
- default frontier model: gemini-3.1-pro-preview unless explicitly overridden
Fallback:
- GEMINI_API_KEY or GOOGLE_API_KEY for direct Gemini API access
Optional:
- OPENAI_API_KEY for audio transcription fallback
- DEEPGRAM_API_KEY for audio transcription fallback
- SLACK_BOT_TOKEN when the input is a private Slack file URL
The script also checks common local secret file locations for the Slack bot token and known local Vertex credential paths.
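The defaults and optional keys above could be resolved along these lines. This is a sketch under stated assumptions: the override variable names (GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, MEDIA_ANALYSIS_MODEL) are illustrative and may not be the ones the script reads, and preferring OpenAI over Deepgram when both keys exist is an arbitrary choice for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    project: str
    location: str
    model: str
    transcriber: Optional[str]  # "openai", "deepgram", or None


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Apply the documented defaults, allowing explicit overrides."""
    env = os.environ if env is None else env

    # Assumption: the order between the two transcription fallbacks is not
    # documented; OpenAI is tried first here only for illustration.
    if env.get("OPENAI_API_KEY"):
        transcriber: Optional[str] = "openai"
    elif env.get("DEEPGRAM_API_KEY"):
        transcriber = "deepgram"
    else:
        transcriber = None

    return Settings(
        # Hypothetical override variable names; defaults are from this document.
        project=env.get("GOOGLE_CLOUD_PROJECT") or "tribble-ai",
        location=env.get("GOOGLE_CLOUD_LOCATION") or "global",
        model=env.get("MEDIA_ANALYSIS_MODEL") or "gemini-3.1-pro-preview",
        transcriber=transcriber,
    )
```

Resolving everything once into an immutable object makes it easy to record the effective model and location alongside the analysis.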
DS9 / product bug workflow
If the media is a DS9 / Tribble bug recording:
- run this skill first
- preserve the evidence and extract the likely failing UI step
- then route code/debug work through ds9-triage
- if a PR exists and Sunil asks whether it works, route real local validation through ds9-pr-testing
Shared-channel behavior
In shared Slack threads:
- do not post internal file paths, private Slack URLs, or raw infra details
- do not mention family, household, travel, health, private contacts, or unrelated personal owner context
- summarize the evidence in plain language
- attach screenshots only after the evidence is actually grounded
- do not call it fixed just because the model produced a plausible summary
- if Gemini is unavailable, say the analysis was packet-based rather than pretending it was full native-video analysis
Use labels like:
- media analyzed
- timeline extracted
- backend validated locally
- fully locally tested
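The rule about not posting internal file paths or private Slack URLs can be backed by a last-step scrub of the outgoing summary. The sketch below is a safety net, not a replacement for writing the summary carefully. The patterns are assumptions: Slack file links on files.slack.com, and internal paths as absolute paths under a few common roots.

```python
import re

# Assumption: private Slack file links live on files.slack.com.
_SLACK_URL = re.compile(r"https?://files\.slack\.com/\S+")
# Assumption: internal paths are absolute POSIX paths under these roots.
# The lookbehind avoids matching path segments inside ordinary URLs.
_INTERNAL_PATH = re.compile(r"(?<![\w/])/(?:root|tmp|home|var)/[^\s)\]]+")


def redact_for_shared_channel(text: str) -> str:
    """Replace private Slack URLs and internal paths with neutral placeholders."""
    text = _SLACK_URL.sub("[private Slack file]", text)
    text = _INTERNAL_PATH.sub("[internal path]", text)
    return text
```

Running this on the final message just before posting catches paths copied in from report.md or tool output.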
Example questions
- What is actually happening in this bug video?
- What is happening in this screenshot thread?
- Where does the workflow fail?
- What error text is visible, and at what timestamp?
- What does the voice note actually say?
- What are the likely reproduction steps from this recording?