ab-testing-ecommerce
Run controlled experiments on product pages, checkout flows, and pricing to find what converts best using statistical significance testing
| Field | Value |
|---|---|
| name | ab-testing-ecommerce |
| description | Run controlled experiments on product pages, checkout flows, and pricing to find what converts best using statistical significance testing |
| category | data-analytics |
| risk | safe |
| source | curated |
| date_added | 2026-03-12 |
| tags | ["ab-testing","experimentation","statistical-significance","feature-flags","checkout","pricing","conversion","hypothesis-testing"] |
| triggers | ["A/B testing","ab test","experimentation platform","split testing","feature flags","statistical significance","conversion test","pricing test"] |
| tools | ["claude-code","cursor","gemini-cli","copilot","codex-cli","kiro","opencode"] |
| platforms | ["shopify","woocommerce","bigcommerce","custom"] |
| difficulty | intermediate |
A/B testing (split testing) runs controlled experiments where a random subset of visitors sees a variant while the rest see the control. Statistical analysis then determines whether any difference is real or due to chance. Good testing disciplines — calculating required sample size before starting, running tests for at least two full weeks, and never stopping early — separate genuine insights from noise.
This skill guides you through running A/B tests on your specific platform, choosing the right tools, and interpreting results correctly.
| Platform | Recommended Tool | Why |
|---|---|---|
| Shopify | Convert.com or Intelligems (Google Optimize was discontinued in September 2023) | Intelligems is built specifically for Shopify and supports pricing tests with sticky assignment; Convert integrates via Shopify's theme |
| Shopify (pricing tests) | Intelligems | The only tool that does true server-side price testing on Shopify without flickering |
| WooCommerce | Nelio A/B Testing plugin (Google Optimize, formerly an alternative, was discontinued in September 2023) | Nelio integrates natively with WordPress/WooCommerce; tracks WooCommerce conversion events automatically |
| BigCommerce | Convert.com or VWO (via script injection) | Both integrate via the BigCommerce storefront script manager |
| Custom / Headless | LaunchDarkly (feature flags + experiments) or build with GrowthBook (open source) | Server-side assignment with no flickering; GrowthBook is free and self-hostable |
Never launch a test without knowing how many visitors each variant needs. Running a test without a pre-determined stopping rule leads to peeking and false positives.
Use the free calculator at https://www.evanmiller.org/ab-testing/sample-size.html or follow this guide:
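Example: A Shopify store with a 2.5% CVR that wants to detect a 0.3pp absolute lift (2.5% to 2.8%) needs roughly 45,000 sessions per variant at a 5% two-sided significance level and 80% power. At 500 sessions/day per variant, that is about 90 days; if that is too long, test for a larger minimum detectable effect.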
Write down the required sample size before the test starts. This is your mandatory stopping rule.
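The calculation behind the stopping rule can be sketched as follows. This is a minimal sketch using the standard two-proportion z-test approximation; the function names are hypothetical, and the default z-values (1.96 for a 5% two-sided significance level, 0.8416 for 80% power) are assumptions that the source does not state. The online calculator uses a slightly different variance term, so expect its figures to differ somewhat.

```javascript
// Sessions needed per variant to detect a change from baselineRate to
// baselineRate + absoluteLift with a two-sided two-proportion z-test.
// zAlpha = 1.96 (5% two-sided significance) and zBeta = 0.8416 (80% power)
// are assumed defaults, not values taken from the source.
function sampleSizePerVariant(baselineRate, absoluteLift, zAlpha = 1.96, zBeta = 0.8416) {
  const p1 = baselineRate;
  const p2 = baselineRate + absoluteLift;
  if (!(p1 > 0 && p1 < 1 && p2 > 0 && p2 < 1) || absoluteLift === 0) {
    throw new RangeError("rates must be in (0, 1) and the lift must be non-zero");
  }
  // Sum of the Bernoulli variances of the two arms
  const varianceSum = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * varianceSum) / absoluteLift ** 2);
}

// Days needed when each variant receives sessionsPerVariantPerDay sessions.
function daysToRun(sampleSize, sessionsPerVariantPerDay) {
  return Math.ceil(sampleSize / sessionsPerVariantPerDay);
}

// 2.5% baseline CVR, 0.3pp absolute lift, 500 sessions/day per variant
const requiredSessions = sampleSizePerVariant(0.025, 0.003);
console.log(requiredSessions, daysToRun(requiredSessions, 500));
```

Because the required sample grows with the inverse square of the lift, halving the minimum detectable effect roughly quadruples the traffic you need.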
Option A: Theme-based tests with Convert.com
Option B: Pricing tests with Intelligems
For Shopify checkout tests (Shopify Plus only):
Using Nelio A/B Testing (recommended)
Alternative: Google Optimize (discontinued in September 2023; no longer available)
For headless storefronts, use server-side assignment to avoid flickering and to support pricing tests:
Using GrowthBook (open source, recommended)
```bash
npm install @growthbook/growthbook
```

```javascript
import { GrowthBook } from "@growthbook/growthbook";

// userId, customerId, analytics, and order come from your application.
const gb = new GrowthBook({
  apiHost: "https://cdn.growthbook.io",
  clientKey: process.env.GROWTHBOOK_CLIENT_KEY,
  attributes: {
    id: userId, // stable user ID for consistent assignment
    loggedIn: !!customerId,
  },
  // Register the callback before any feature is evaluated so the
  // first exposure is logged.
  trackingCallback: (experiment, result) => {
    analytics.track("Experiment Viewed", {
      experimentId: experiment.key,
      variationId: result.key,
    });
  },
});

await gb.loadFeatures();

// Assign variant: deterministic for the same userId
const checkoutButtonVariant = gb.getFeatureValue("checkout-button-color", "blue");

// On order completion:
analytics.track("Purchase", { revenue: order.total });
```
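To illustrate what "deterministic for the same userId" means, here is a minimal sketch of hash-based bucketing. It is not GrowthBook's actual algorithm; the function names, the choice of FNV-1a plus a MurmurHash3-style finalizer, and the `experimentKey:userId` salt format are all assumptions made for illustration.

```javascript
// Hash a string to a number in [0, 1). FNV-1a over UTF-16 code units,
// followed by the MurmurHash3 32-bit finalizer to mix the bits.
function hashToUnitInterval(input) {
  let h = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // FNV prime
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return (h >>> 0) / 2 ** 32;
}

// Map (experimentKey, userId) to a variant. The same inputs always yield the
// same variant; salting with the experiment key keeps a user's assignments
// in different experiments independent of each other.
function assignVariant(experimentKey, userId, variants = ["control", "variant"]) {
  const u = hashToUnitInterval(`${experimentKey}:${userId}`);
  return variants[Math.floor(u * variants.length)];
}

console.log(assignVariant("checkout-button-color", "user-123"));
```

Because the assignment is a pure function of the experiment key and the user ID, no assignment table needs to be stored, and a returning user sees the same variant on every session and device where the ID is known.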
When reviewing results:
| Metric | What to Check |
|---|---|
| Primary | Revenue per visitor (not CVR alone) |
| Guardrail | Return rate (variant should not increase returns) |
| Guardrail | Cart abandonment rate |
| Confidence | p < 0.05 AND minimum sample size reached |
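The confidence check in the table can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the function names and the input shape (`visitors`, `conversions`, `revenue` per arm) are hypothetical, and the p-value comes from a pooled two-proportion z-test on conversion rate. A significance test on revenue per visitor itself would need per-visitor revenue variance, which is not shown here.

```javascript
// Standard normal CDF via the Abramowitz-Stegun erf approximation (7.1.26).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
    t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-sided p-value for a difference in conversion rate (pooled z-test).
function twoProportionPValue(control, variant) {
  const pC = control.conversions / control.visitors;
  const pV = variant.conversions / variant.visitors;
  const pooled = (control.conversions + variant.conversions) /
    (control.visitors + variant.visitors);
  const se = Math.sqrt(pooled * (1 - pooled) *
    (1 / control.visitors + 1 / variant.visitors));
  if (se === 0) return 1; // no conversions (or all conversions) in both arms
  return 2 * (1 - normalCdf(Math.abs((pV - pC) / se)));
}

// A result is conclusive only when BOTH conditions hold:
// the pre-calculated sample size is reached and p < alpha.
function evaluateTest(control, variant, requiredPerVariant, alpha = 0.05) {
  const pValue = twoProportionPValue(control, variant);
  const sampleSizeReached =
    control.visitors >= requiredPerVariant && variant.visitors >= requiredPerVariant;
  return {
    rpvControl: control.revenue / control.visitors,
    rpvVariant: variant.revenue / variant.visitors,
    pValue,
    sampleSizeReached,
    conclusive: sampleSizeReached && pValue < alpha,
  };
}

// Made-up totals for illustration only
const report = evaluateTest(
  { visitors: 10000, conversions: 250, revenue: 20000 },
  { visitors: 10000, conversions: 300, revenue: 23000 },
  45000
);
console.log(report);
```

Gating on `sampleSizeReached` is what prevents a test that "looks significant" early from being called a win.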
| Problem | Solution |
|---|---|
| Test ends early because it "looks significant" — then the lift disappears | Use pre-calculated sample size as a mandatory stopping rule; configure your testing tool to lock results until sample size is reached |
| Same user sees different variants on different sessions | Use server-side assignment keyed on a stable user ID (not session ID); Intelligems and GrowthBook handle this correctly by default |
| Checkout test shows lift in CVR but drop in AOV | Always measure revenue per visitor as your primary metric; CVR and AOV can move in opposite directions |
| Price flickering on Shopify pricing tests | Use Intelligems instead of client-side tools — it assigns prices server-side before the page renders |
| Novelty effect inflates variant results in the first week | Report results with and without the first 3 days of data; a large week-1 spike that fades is usually novelty |
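The novelty-effect check in the last row can be sketched as a comparison of the lift with and without the opening days. This is a minimal sketch; the function names and the daily-totals input shape are hypothetical, and the example numbers are made up for illustration.

```javascript
// Relative lift in revenue per visitor (variant vs. control) over a set of days.
function rpvLift(days) {
  const sum = (arm, field) => days.reduce((total, d) => total + d[arm][field], 0);
  const controlRpv = sum("control", "revenue") / sum("control", "visitors");
  const variantRpv = sum("variant", "revenue") / sum("variant", "visitors");
  return variantRpv / controlRpv - 1;
}

// Report the lift over the full test and again with the first skipDays removed.
function noveltyReport(days, skipDays = 3) {
  if (days.length <= skipDays) {
    throw new RangeError("need more days of data than skipDays");
  }
  return {
    fullLift: rpvLift(days),
    trimmedLift: rpvLift(days.slice(skipDays)),
  };
}

// Made-up data: 14 days, with a variant spike in the first 3 days that fades.
const exampleDays = Array.from({ length: 14 }, (_, i) => ({
  day: i + 1,
  control: { visitors: 500, revenue: 1000 },
  variant: { visitors: 500, revenue: i < 3 ? 1300 : 1020 },
}));

console.log(noveltyReport(exampleDays));
```

A large gap between `fullLift` and `trimmedLift` suggests the headline result is driven by the first few days rather than by a durable change in behavior.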