experiment-analyzer

Stars: 10

Forks: 3

Last updated: 17 June 2026 at 15:34

Design and analyze A/B tests and experiments — sample size, statistical significance, and result interpretation

Source

UitbreidenOS

UitbreidenOS/Claudient

Related occupations (SOC)

Based on the SOC occupational classification

Data Scientists · Computer and Mathematical Occupations · SOC 15-2051

SKILL.md

name	experiment-analyzer
description	Design and analyze A/B tests and experiments — sample size, statistical significance, and result interpretation
allowed-tools	["Read","Write","Grep"]
effort	medium

When to activate

Designing A/B tests with proper sample size and duration
Analyzing experiment results for statistical significance
Interpreting metric lift, confidence intervals, and guardrail metrics
Setting up experimentation frameworks (Statsig, LaunchDarkly, custom)

When NOT to use

For qualitative user research
For feature flagging without measurement
For simple before/after comparisons

Instructions

Pre-experiment. Define primary metric, guardrail metrics, minimum detectable effect (MDE), and required sample size.
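Power analysis. Calculate the per-group sample size for a two-group comparison: n = 2 × (Z_α/2 + Z_β)² × σ² / δ², where σ² is the variance of the metric and δ is the absolute MDE. Typical: 80% power, 5% significance.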
Run experiment. Randomize at user/session level. Check for sample ratio mismatch (SRM). Monitor guardrails daily.
Analyze results. Compute p-value, confidence interval, relative lift. Check for novelty/primacy effects.
Interpret. Statistical significance ≠ practical significance. Consider effect size and business impact.
Report. Executive summary: metric, lift %, CI, p-value, recommendation (ship/iterate/kill).
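
The power-analysis and SRM steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not part of the skill itself: the function names `sample_size_per_group` and `srm_p_value` are hypothetical, the metric is assumed to be a conversion rate (so σ² ≈ p(1 − p) at the baseline), and the sample-size formula is the two-group, per-group form, which carries a factor of 2.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided test on two proportions.

    Implements n = 2 * (z_{alpha/2} + z_beta)^2 * sigma^2 / delta^2,
    with sigma^2 = p * (1 - p) evaluated at the baseline rate (assumption).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # 0.84 for 80% power
    delta = baseline_rate * relative_mde            # absolute effect to detect
    variance = baseline_rate * (1 - baseline_rate)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * variance / delta ** 2)


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for SRM."""
    total = n_control + n_treatment
    expected_c = total * expected_control_share
    expected_t = total - expected_c
    chi2 = ((n_control - expected_c) ** 2 / expected_c
            + (n_treatment - expected_t) ** 2 / expected_t)
    # For 1 df, P(chi2 > x) = P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    print(sample_size_per_group(baseline_rate=0.10, relative_mde=0.10))
    print(srm_p_value(50_000, 51_000))
```

Under this approximation the required sample grows with 1/δ², so small relative MDEs on low baseline rates get expensive quickly: a 3.21% baseline with a 2% relative MDE works out to roughly 1.2 million users per group. A very small SRM p-value (a common convention is below 0.001) suggests the assignment split is broken and the results should not be trusted until the cause is found.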

Example

Experiment: New checkout flow
Primary metric: Conversion rate
MDE: 2% relative lift
Sample size: 50K users/group (14 days at current traffic)

Results:
  Control:    3.21% conversion
  Treatment:  3.34% conversion
  Lift:       +4.1% (95% CI: [+1.2%, +7.0%])
  p-value:    0.008 (< 0.05 threshold)
  Guardrails: Page load time unchanged, error rate unchanged

Recommendation: Ship. The lift is statistically significant, its 95% CI excludes zero, and guardrails are unchanged.
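
The analysis step behind a report like this can be sketched as a two-proportion z-test. The helper `analyze_ab` is a hypothetical name, and the conversion counts are assumptions back-calculated from the example's rates and its stated 50K users per group (1,605 and 1,670 conversions). With those counts the test returns a p-value of roughly 0.25 rather than 0.008, so the figures in the example are best read as illustrating the report layout, not as a worked calculation.

```python
import math
from statistics import NormalDist


def analyze_ab(conv_c, n_c, conv_t, n_t, alpha=0.05):
    """Two-sided two-proportion z-test with a CI on the relative lift."""
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    diff = p_t - p_c

    # Pooled standard error for the test of H0: equal conversion rates.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the interval on the absolute difference.
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lo, hi = diff - z_crit * se, diff + z_crit * se

    return {
        "relative_lift": diff / p_c,
        # Simplification: divides by p_c as if the control rate were known exactly.
        "ci_relative": (lo / p_c, hi / p_c),
        "p_value": p_value,
    }


if __name__ == "__main__":
    # Hypothetical counts: 3.21% and 3.34% of 50,000 users per group.
    result = analyze_ab(conv_c=1605, n_c=50_000, conv_t=1670, n_t=50_000)
    print(result)
```

Dividing the absolute interval by the control rate is a rough way to express the CI in relative terms; a delta-method or bootstrap interval would account for uncertainty in the control rate as well.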
