ab-testing

Stars: 1

Forks: 1

Updated: March 10, 2026, 10:04

Production A/B testing lifecycle for design variants. Covers hypothesis formation, feature flags, variant comparison, analytics tracking, statistical significance analysis, experiment setup, and cleanup.

Source

dlabs

dlabs/claude-marketplace

Related occupations (SOC)

Based on the SOC occupational classification

Software Developers · Computer and Mathematical Occupations · SOC 15-1252

SKILL.md


name: ab-testing
description: Production A/B testing lifecycle for design variants. Covers hypothesis formation, feature flags, variant comparison, analytics tracking, statistical significance analysis, experiment setup, and cleanup.
user-invocable: false

A/B Testing

This skill provides the complete lifecycle for production A/B testing of design variants. Variants are real, production-quality code — not mockups.

Lifecycle

CREATE (/design) → DEPLOY (trunk + flags) → MEASURE (analytics) → DECIDE (/ab-decide) → CLEANUP (/ab-cleanup)

1. CREATE

/blueprint-dev:bp:design uses the design-variant-generator to create 2-3 real component variants, the design-critic to evaluate them, and the ab-test-engineer to wire up flags and tracking.

2. DEPLOY

Variants ship to trunk behind feature flags. Compatible with trunk-based development — no long-lived branches needed.
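One common way to serve variants behind a flag is deterministic bucketing, so a given user always sees the same variant. The following is a minimal sketch under that assumption, not the skill's actual flag wiring. The function name `assign_variant` and the experiment and variant names are hypothetical.

```python
import hashlib
from typing import Sequence


def assign_variant(user_id: str, experiment: str, variants: Sequence[str]) -> str:
    """Deterministically map a user to one variant of an experiment.

    Hypothetical helper: hashing the experiment name together with the
    user ID keeps bucket assignments independent across experiments.
    """
    if not variants:
        raise ValueError("at least one variant is required")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer and pick a bucket.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # The same user always lands in the same variant for a given experiment.
    print(assign_variant("user-42", "checkout-layout", ["control", "variant-b"]))
```

Because the assignment depends only on the inputs, it needs no stored state, which fits shipping all variants to trunk and switching between them at runtime.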

3. MEASURE

Analytics tracking fires at key interaction points. Users monitor their analytics dashboard for results.
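As an illustration of what a tracking call at an interaction point might carry, here is a small sketch. The event fields, the `ExperimentEvent` type, and the `track` function are assumptions made for this example. The skill's real templates live in references/code-templates.md and may look different.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class ExperimentEvent:
    """Hypothetical event shape: which user saw which variant and what happened."""
    experiment: str
    variant: str
    user_id: str
    name: str  # for example "exposure" or "cta_click"
    timestamp: float = field(default_factory=time.time)


def track(event: ExperimentEvent, sink: Callable[[str], None]) -> None:
    """Serialize the event as JSON and hand it to a sink (a stand-in for an analytics client)."""
    sink(json.dumps(asdict(event), sort_keys=True))


if __name__ == "__main__":
    # Printing stands in for sending the event to an analytics backend.
    track(ExperimentEvent("checkout-layout", "variant-b", "user-42", "exposure"), print)
```

Recording the variant on every event is what later lets the results be split per variant.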

4. DECIDE

/blueprint-dev:bp:ab-decide uses the design-decision-analyst to interpret results and recommend a winner based on statistical significance.
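The source does not say which statistical test the design-decision-analyst applies. For conversion-style metrics, one standard option is a two-sided two-proportion z-test, sketched below with a hypothetical function name and made-up example counts.

```python
import math


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates.

    Uses the pooled two-proportion z-test with a normal approximation,
    which is reasonable only when each group has plenty of samples.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one observation")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    p = two_proportion_p_value(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print("significant at 0.05" if p < 0.05 else "not significant")
```

A p-value below the chosen threshold only says the difference is unlikely under the null hypothesis. The decision should still weigh effect size and guardrail metrics.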

5. CLEANUP

/blueprint-dev:bp:ab-cleanup follows the decision document's cleanup plan to remove the losing variant, promote the winner, and clean up flags/tracking.

Key Principles

Meaningful differences: Variants must differ in layout, interaction, hierarchy, density, or navigation — not just cosmetics
Statistical rigor: significance threshold of p < 0.05, 80% statistical power, and calculated sample sizes
Guardrail metrics: Tests auto-stop if critical metrics degrade
Clean cleanup: Every test ends with a clean codebase — no lingering dead code
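Two of the principles above lend themselves to a short sketch: calculating a sample size for p < 0.05 at 80% power, and checking a guardrail metric. The formula is the usual normal-approximation estimate for comparing two proportions. The function names, the 5% guardrail tolerance, and the example rates are assumptions for illustration, not values from the skill.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline: float, expected: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per variant to detect baseline -> expected."""
    if not (0 < baseline < 1 and 0 < expected < 1) or baseline == expected:
        raise ValueError("rates must be distinct and strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    n = (z_alpha + z_power) ** 2 * variance / (expected - baseline) ** 2
    return math.ceil(n)


def guardrail_breached(baseline: float, observed: float,
                       max_relative_drop: float = 0.05) -> bool:
    """True when a critical metric fell more than the tolerated fraction.

    The 5% default tolerance is an assumed value for this sketch.
    """
    if baseline <= 0:
        return False
    return (baseline - observed) / baseline > max_relative_drop


if __name__ == "__main__":
    # Detecting a lift from a 10% to a 12% conversion rate.
    print(sample_size_per_variant(0.10, 0.12))
    # A critical metric falling from 10% to 9% is a 10% relative drop.
    print(guardrail_breached(0.10, 0.09))
```

Smaller expected lifts drive the required sample size up quickly, which is why the calculation belongs in experiment setup rather than after the fact.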

References

references/tracking-plan-template.md — Template for tracking plans
references/code-templates.md — Stack-specific code templates for wrappers, flags, and tracking
