| name | audio-voice-recovery |
| description | Audio forensics and voice recovery guidelines for CSI-level audio analysis. This skill should be used when recovering voice from low-quality or low-volume audio, enhancing degraded recordings, performing forensic audio analysis, or transcribing difficult audio. Triggers on tasks involving audio enhancement, noise reduction, voice isolation, forensic authentication, or audio transcription. |
Forensic Audio Research: Audio Voice Recovery Best Practices
Comprehensive audio forensics and voice recovery guide providing CSI-level capabilities for recovering voice from low-quality, low-volume, or damaged audio recordings. Contains 45 rules across 8 categories, prioritized by impact to guide audio enhancement, forensic analysis, and transcription workflows.
When to Apply
Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech
Rule Categories by Priority
| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Signal Preservation & Analysis | CRITICAL | signal- | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | noise- | 5 |
| 3 | Spectral Processing | HIGH | spectral- | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | voice- | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | temporal- | 5 |
| 6 | Transcription & Recognition | MEDIUM | transcribe- | 5 |
| 7 | Forensic Authentication | MEDIUM | forensic- | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | tool- | 7 |
Essential Tools
| Tool | Purpose | Install |
|---|---|---|
| FFmpeg | Format conversion, filtering | `brew install ffmpeg` |
| SoX | Noise profiling, effects | `brew install sox` |
| Whisper | Speech transcription | `pip install openai-whisper` |
| librosa | Python audio analysis | `pip install librosa` |
| noisereduce | ML noise reduction | `pip install noisereduce` |
| Audacity | Visual editing | `brew install audacity` |
Workflow Scripts (Recommended)
Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.
- `scripts/preflight_audio.py` - Generate a forensic preflight report (JSON or Markdown).
- `scripts/plan_from_preflight.py` - Create a workflow plan template from the preflight report.
- `scripts/compare_audio.py` - Compare objective metrics between baseline and processed audio.
Example usage:
```sh
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
Forensic Preflight Workflow (Do This Before Any Changes)
Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001).
Establish an objective baseline state and plan the workflow so processing does not introduce clipping, artifacts, or false "done" confidence.
Use scripts/preflight_audio.py to capture baseline metrics and preserve the report with the case file.
Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: LUFS/LKFS, true peak, peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
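Several of the baseline level checks above (peak, RMS, DC offset, clipped-sample percentage) can be computed with the standard library alone. A minimal sketch for 16-bit mono PCM; the function name and the clip threshold are illustrative assumptions, not part of the bundled `preflight_audio.py`:

```python
import array
import math
import wave

def baseline_metrics(path, clip_threshold=32767):
    """Return peak/RMS (dBFS), DC offset, and clipped-sample percentage.

    Sketch only: assumes 16-bit mono PCM WAV. Samples at or beyond
    clip_threshold are counted as clipped.
    """
    with wave.open(path, "rb") as wf:
        assert wf.getsampwidth() == 2, "sketch assumes 16-bit PCM"
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h", frames)
    n = len(samples)
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return {
        "peak_dbfs": 20 * math.log10(peak / 32768) if peak else float("-inf"),
        "rms_dbfs": 20 * math.log10(rms / 32768) if rms else float("-inf"),
        "dc_offset": sum(samples) / n,  # non-zero mean indicates DC bias
        "clipped_pct": 100.0 * sum(1 for s in samples if abs(s) >= clip_threshold) / n,
    }
```

A full-scale sine, for example, should report roughly -3 dBFS RMS and a non-zero clipped percentage, which is exactly the kind of red flag that should pause processing per the clipping rule below.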
Procedure:
- Prepare a forensic working copy, verify hashes, and preserve the original untouched.
- Locate ROI and target signal; document exact time ranges and changes across the recording.
- Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
- Identify required processing and plan a workflow order that avoids unwanted artifacts. Generate a plan draft with `scripts/plan_from_preflight.py` and complete it with case-specific decisions.
- Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
- Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
- Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
- If stereo, evaluate channel correlation and phase; document anomalies.
- Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
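The SNR estimate called for in the steps above can be roughed out by splitting frames on energy: treat the quietest frames as the noise floor and the loudest as target signal. A stdlib sketch, assuming 16-bit mono PCM; the 20%/5% percentile split is an illustrative guess, not a calibrated voice-activity detector:

```python
import array
import math
import wave

def estimate_snr(path, frame_ms=20):
    """Crude SNR estimate in dB: loud-frame RMS vs quiet-frame RMS."""
    with wave.open(path, "rb") as wf:
        sr = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    frame_len = max(1, sr * frame_ms // 1000)
    rms_per_frame = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms_per_frame.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    rms_per_frame.sort()
    # Assumption: quietest 20% of frames approximate the noise floor,
    # loudest 5% approximate the target signal.
    noise = rms_per_frame[: max(1, len(rms_per_frame) // 5)]
    signal = rms_per_frame[-max(1, len(rms_per_frame) // 20):]
    noise_rms = sum(noise) / len(noise)
    signal_rms = sum(signal) / len(signal)
    if noise_rms == 0:
        return float("inf")
    return 20 * math.log10(signal_rms / noise_rms)
```

This only holds for stationary noise with clear quiet gaps; for non-stationary noise, record the limitation in the preflight report rather than trust the number.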
Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
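Reviewing the removed signal, as the guardrails require, can be as simple as subtracting the processed audio from the unprocessed input and auditioning the difference: if the residue contains speech, the chain is eating target signal. A sketch assuming both files are 16-bit mono PCM, equal length, and time-aligned (run it on the intermediate before any loudness normalization, since a gain- or duration-changing stage defeats the subtraction):

```python
import array
import wave

def write_residue(original_path, processed_path, residue_path):
    """Write (original - processed) so the removed material can be auditioned.

    Sketch only: assumes 16-bit mono PCM inputs that are time-aligned and
    equal length. Differences are clamped to the int16 range.
    """
    def read(path):
        with wave.open(path, "rb") as wf:
            return wf.getparams(), array.array("h", wf.readframes(wf.getnframes()))
    params, orig = read(original_path)
    _, proc = read(processed_path)
    n = min(len(orig), len(proc))
    residue = array.array(
        "h", (max(-32768, min(32767, orig[i] - proc[i])) for i in range(n))
    )
    with wave.open(residue_path, "wb") as wf:
        wf.setparams(params._replace(nframes=n))
        wf.writeframes(residue.tobytes())
```

Listen to the residue file on the same devices used for the baseline listening log; it should sound like noise only.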
Quick Enhancement Pipeline
```sh
# 1. Capture the objective baseline before touching anything.
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2. Preserve the original: work only on a copy, and record its hash.
cp evidence.wav working.wav
sha256sum evidence.wav > evidence.sha256

# 3. Enhance: high-pass rumble removal, declick, FFT denoise,
#    presence boost around 2.5 kHz, then EBU R 128 loudness normalization.
ffmpeg -i working.wav -af "\
highpass=f=80,\
adeclick=w=55:o=75,\
afftdn=nr=12:nf=-30:nt=w,\
equalizer=f=2500:t=q:w=1:g=3,\
loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe the enhanced copy.
whisper enhanced.wav --model large-v3 --language en

# 5. Verify the original evidence file is still untouched.
sha256sum -c evidence.sha256

# 6. Compare objective metrics before vs. after.
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```
How to Use
Read individual reference files for detailed explanations and code examples:
Reference Files