| name | ice-scorer |
| description | Automatically score growth experiments using the ICE framework (Impact × Confidence × Ease). Use when the user creates a new experiment, mentions scoring or prioritization, or when analyzing experiment backlogs. Helps prioritize experiments by evaluating Impact (1-10), Confidence (1-10), and Ease (1-10). |
| allowed-tools | ["Read","Write"] |
ICE Scorer Skill
Automatically score growth experiments using the ICE (Impact, Confidence, Ease) prioritization framework.
When to Activate
This skill should activate when:
- User creates a new experiment without providing ICE scores
- User mentions "score", "prioritize", or "ICE"
- User asks "which experiment should I run first?"
- User wants to evaluate experiment backlog
- User compares multiple experiments
ICE Framework Scoring Guidelines
Impact (1-10): How much will this move the key metric?
Score 8-10: High Impact
- Affects North Star metric directly
- Expected change ≥15%
- Targets large user segment
- Critical business metric
Score 4-7: Medium Impact
- Affects important but secondary metrics
- Expected change 5-15%
- Targets meaningful user segment
- Supports key business goals
Score 1-3: Low Impact
- Affects minor or vanity metrics
- Expected change <5%
- Targets small user segment
- Nice-to-have improvement
Confidence (1-10): How certain are we this will work?
Score 8-10: High Confidence
- Strong quantitative data supporting hypothesis
- User research validates the problem
- Similar experiments succeeded elsewhere
- Multiple sources of evidence
- Detailed rationale (>100 characters)
Score 4-7: Medium Confidence
- Some supporting data or research
- Analogous experiments showed promise
- Logical reasoning with limited evidence
- Moderate rationale (50-100 characters)
Score 1-3: Low Confidence
- Speculative or gut feeling
- No supporting data
- Untested assumption
- Minimal rationale (<50 characters)
Ease (1-10): How easy is this to implement?
Score 8-10: High Ease
- < 1 day of work
- No engineering required, or minimal changes
- No external dependencies
- Can be done with existing tools
Score 4-7: Medium Ease
- 1-2 days of work
- Some engineering work required
- May need design support
- Uses existing infrastructure
Score 1-3: Low Ease
- >2 days of work
- Significant engineering effort
- Requires design and multiple teams
- Needs external resources or new tools
Scoring Process
When scoring an experiment:
1. Read the experiment file from the experiments folder
2. Analyze the hypothesis components:
   - Proposed change
   - Target audience
   - Expected outcome (look for specific percentages)
   - Rationale (check length and evidence quality)
3. Evaluate Impact:
   - Is this a North Star metric or a secondary metric?
   - What's the expected percentage change?
   - How many users will this affect?
   - Consider the experiment category (acquisition, activation, etc.)
4. Evaluate Confidence:
   - How much evidence supports the hypothesis?
   - Is there user research or data mentioned?
   - How detailed is the rationale?
   - Are there comparable experiments?
5. Evaluate Ease:
   - Estimate implementation time
   - Does it need engineering? Design? External resources?
   - How complex is the proposed change?
   - Look for keywords: "redesign" (low ease), "copy change" (high ease)
6. Calculate the total ICE score: Impact × Confidence × Ease
7. Interpret the score:
   - 700+: Critical Priority - implement immediately
   - 500-699: High Priority - strong candidate
   - 300-499: Medium Priority - good experiment
   - 150-299: Low Priority
   - <150: Very Low Priority - deprioritize
8. Update the experiment JSON with the ICE scores
9. Move the experiment to the pipeline if its score is ≥ 300 (steps 6-9 are sketched in code below)
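A minimal sketch of steps 6-9 in Python. The `ice` field layout and the `score_experiment` helper are illustrative assumptions, not a fixed schema; adapt them to the actual experiment file format:

```python
import json

# Priority bands from the interpretation table above, as (floor, label).
PRIORITY_BANDS = [
    (700, "Critical Priority"),
    (500, "High Priority"),
    (300, "Medium Priority"),
    (150, "Low Priority"),
    (0, "Very Low Priority"),
]


def score_experiment(path: str, impact: int, confidence: int, ease: int):
    """Compute the ICE total, label its priority band, and write it back."""
    total = impact * confidence * ease  # Impact × Confidence × Ease
    priority = next(label for floor, label in PRIORITY_BANDS if total >= floor)

    with open(path) as f:
        experiment = json.load(f)

    # Hypothetical field layout; match it to the real experiment schema.
    experiment["ice"] = {
        "impact": impact,
        "confidence": confidence,
        "ease": ease,
        "total": total,
        "priority": priority,
    }

    with open(path, "w") as f:
        json.dump(experiment, f, indent=2)

    return total, priority, total >= 300  # final flag: move to pipeline?
```

For instance, `score_experiment("experiments/onboarding.json", 7, 6, 9)` returns `(378, "Medium Priority", True)`, matching Example 1 below.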
Scoring Examples
Example 1: Onboarding Progress Indicators
Experiment: Add progress indicators to 5-step onboarding flow
Analysis:
- Impact: 7 - Activation is important, expected 15% increase
- Confidence: 6 - User research supports it, but not tested yet
- Ease: 9 - Simple UI element, <1 day of work
- Total: 378 - Medium Priority
Reasoning:
- Impact: Activation is a key metric but not the only North Star
- Confidence: User research provides evidence but no previous tests
- Ease: Adding progress bar is straightforward UI work
Example 2: Social Proof on Pricing Page
Experiment: Add customer logos and testimonials to pricing page
Analysis:
- Impact: 7 - Affects acquisition and conversion
- Confidence: 8 - Strong industry evidence for B2B social proof
- Ease: 9 - Design change only, no engineering
- Total: 504 - High Priority
Reasoning:
- Impact: Pricing page is high-traffic, affects key conversion
- Confidence: Multiple case studies show 10-15% improvement
- Ease: Simple asset placement, quick implementation
Example 3: Complete Platform Redesign
Experiment: Redesign entire user interface
Analysis:
- Impact: 9 - Could affect all metrics significantly
- Confidence: 4 - No data supporting specific improvements
- Ease: 2 - Months of work, multiple teams
- Total: 72 - Very Low Priority
Reasoning:
- Impact: Broad changes could have major impact
- Confidence: Too vague, no specific hypothesis about what will improve
- Ease: Massive undertaking, not a growth "experiment"
Keywords to Watch
Low Ease indicators:
- redesign, rebuild, refactor, overhaul, migration, infrastructure
High Ease indicators:
- copy change, button, color, image, text, email, simple
High Confidence indicators:
- "data shows", "research indicates", "we tested", "similar experiment"
High Impact indicators:
- North Star, conversion, activation, retention, revenue
- Specific percentages (e.g., "15% increase")
- Large user segments
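The lists above can be operationalized as a simple keyword scan over the hypothesis text. A sketch, with keyword sets taken from this section (the function name and return shape are illustrative):

```python
LOW_EASE = {"redesign", "rebuild", "refactor", "overhaul", "migration", "infrastructure"}
HIGH_EASE = {"copy change", "button", "color", "image", "text", "email", "simple"}
HIGH_CONFIDENCE = {"data shows", "research indicates", "we tested", "similar experiment"}


def keyword_signals(hypothesis: str) -> dict:
    """Report which heuristic keywords appear in the hypothesis text."""
    text = hypothesis.lower()
    return {
        "low_ease": sorted(k for k in LOW_EASE if k in text),
        "high_ease": sorted(k for k in HIGH_EASE if k in text),
        "high_confidence": sorted(k for k in HIGH_CONFIDENCE if k in text),
    }
```

These signals should only nudge a score up or down; the rubric above remains the deciding factor.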
Output Format
When providing ICE scores, explain your reasoning:
ICE Score Analysis for: [Experiment Title]
Impact: [score]/10
Reasoning: [Why this score based on metric importance, expected change, audience size]
Confidence: [score]/10
Reasoning: [Why this score based on evidence, data, research quality]
Ease: [score]/10
Reasoning: [Why this score based on time, resources, complexity]
Total ICE Score: [Impact × Confidence × Ease] = [total]
Priority: [Critical/High/Medium/Low/Very Low]
Recommendation: [What to do with this experiment]
[If score >= 300:]
✓ Moving to pipeline based on strong ICE score
Integration with Commands
This skill works automatically when:
- /experiment-create completes - offer to score immediately
- /hypothesis-generate creates ideas - suggest preliminary scores
- User asks about prioritization
Continuous Learning
After experiments complete:
- Compare predicted Impact vs actual results
- Adjust scoring calibration based on outcomes
- Learn patterns for better Confidence scoring
- Refine Ease estimates based on actual time taken
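As a sketch of how Impact calibration might look, assuming each completed experiment records a predicted and an actual lift (the `predicted_lift_pct` and `actual_lift_pct` fields are hypothetical):

```python
def impact_calibration(completed: list[dict]) -> float | None:
    """Average ratio of actual to predicted lift across completed experiments.

    A result near 1.0 means Impact predictions are well calibrated;
    below 1.0 means predictions are systematically optimistic.
    """
    ratios = [
        e["actual_lift_pct"] / e["predicted_lift_pct"]
        for e in completed
        if e.get("predicted_lift_pct")
    ]
    return sum(ratios) / len(ratios) if ratios else None
```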