| name | checkout-verifier |
| description | Use when an API-credits checkout or paid-plan upgrade needs to be proven end-to-end against Stripe test mode — confirming a card charge actually creates the invoice and subscription in the right state, reproducing a "I paid but my credits didn't show up" report, checking that a declined or 3DS card fails the way the UI claims, or wiring a billing smoke test into CI so a checkout regression is caught before a customer's money is. |
Checkout Verifier
Overview
Billing is the one flow where a silent bug costs real money and real trust. The
checkout button can return a cheerful "Success!" while the Stripe invoice is
stuck open, the subscription never activated, or the credits grant webhook
never fired. A 200 from your own backend proves nothing — the source of truth is
Stripe, and the only honest test reads it back.
This skill drives the API-credits checkout UI with Stripe test cards, then
asserts the resulting invoice and subscription state via the Stripe API. It
covers the happy path (4242…), a hard decline (4000000000009995), and a 3DS
challenge card, and it polls Stripe for webhook-driven state changes instead
of sleeping and hoping.
The whole point: the assertion is made against Stripe's record of what happened,
not against your app's optimistic UI. If the invoice isn't paid and the
subscription isn't active, the run fails — even if the page said "Thank you."
When to Use
Reach for this when:
- A PR touches checkout, the Stripe integration, webhook handlers, the
credit-grant logic, or the pricing/plan config, and you need proof a real
charge lands the customer in the right billing state.
- Someone reports "I was charged but got no credits" or "my card was declined but
the UI let me through" — reproduce it deterministically against test mode.
- You're adding a billing smoke test to CI and need Stripe-side state assertions
(invoice paid, subscription active, credits granted), not just a happy HTTP
response from your own server.
- You changed how 3DS / SCA is handled and need to confirm the challenge actually
gates the charge.
Do NOT use this when:
- You're in live mode. This drives real-looking flows; it must only ever run
against sk_test_… keys. A live key here charges real cards, so refuse to run.
- You only need to unit-test webhook signature handling — construct a signed
event fixture and call the handler directly; a browser + real Stripe is
overkill.
- You're load-testing checkout. This verifies one purchase at a time and is
deliberately assertive and slow.
Running it
cd .claude/skills/checkout-verifier
pip install playwright stripe && playwright install chromium
export CHECKOUT_BASE_URL="https://staging.example.com"
export STRIPE_SECRET_KEY="sk_test_..."
export STRIPE_PRICE_ID="price_..."
export TEST_LOGIN_EMAIL="billing-test@example.com"
export TEST_LOGIN_PASSWORD="..."
python scripts/verify_checkout.py --scenario success
python scripts/verify_checkout.py --scenario decline
python scripts/verify_checkout.py --scenario 3ds
python scripts/verify_checkout.py --scenario all
result.json and a screenshot per step land in artifacts/<run-id>/. Exit code
is non-zero on any failed assertion, so it drops straight into CI.
Test cards this skill uses
| Scenario | Card number | Expected outcome |
|---|---|---|
| success | 4242 4242 4242 4242 | invoice paid, subscription active, credits granted |
| decline | 4000 0000 0000 9995 | charge fails (insufficient_funds), no credits, UI shows error |
| 3ds | 4000 0027 6000 3184 | 3DS challenge appears; on confirm, invoice paid |
Use any future expiry date, any CVC, and any ZIP. These are Stripe's published
test numbers; they only work with a sk_test_… / pk_test_… pair.
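As a rough illustration, the table above could be encoded as data so each scenario carries its own expectations. The `Scenario` class, its field names, and `select_scenarios` are hypothetical and not taken from scripts/verify_checkout.py. The table does not say what the subscription looks like after a decline or a 3DS confirm, so those values are assumptions flagged in the comments.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Scenario:
    card_number: str                    # Stripe test card, digits only
    invoice_statuses: FrozenSet[str]    # acceptable terminal invoice statuses
    subscription_active: bool           # should the subscription end up active?
    credits_granted: bool               # should the credit grant have happened?
    decline_code: Optional[str] = None  # expected decline code, if any
    needs_3ds: bool = False             # True when a challenge must be completed


SCENARIOS = {
    "success": Scenario("4242424242424242", frozenset({"paid"}), True, True),
    # Assumption: a declined first charge leaves the subscription not active.
    "decline": Scenario(
        "4000000000009995",
        frozenset({"open", "uncollectible"}),
        subscription_active=False,
        credits_granted=False,
        decline_code="insufficient_funds",
    ),
    # Assumption: the table only promises "invoice paid" after the challenge;
    # an active subscription and granted credits are assumed to follow.
    "3ds": Scenario(
        "4000002760003184", frozenset({"paid"}), True, True, needs_3ds=True
    ),
}


def select_scenarios(name: str) -> List[str]:
    """Expand a --scenario value; "all" means every scenario in table order."""
    if name == "all":
        return list(SCENARIOS)
    if name not in SCENARIOS:
        raise ValueError(f"unknown scenario: {name!r}")
    return [name]
```

Keeping expectations next to the card number means a new scenario is one more table row rather than a new branch in the assertion code.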
How the verifier works
Each scenario follows the same act → poll → assert-against-Stripe shape:
- Guard the key. Refuse to start unless STRIPE_SECRET_KEY begins with
sk_test_. Mixing a live key into this is the most dangerous mistake here.
- Drive the UI. Log in, open the API-credits checkout, fill the Stripe
Payment Element (it lives in an iframe — the script targets the frame, not the
top document), and submit. For the 3DS card, complete the challenge iframe.
- Find the Stripe objects. Look up the customer by TEST_LOGIN_EMAIL, then the
latest subscription / invoice / payment intent for that customer, keyed off
the run's idempotency key so reruns don't double-charge or read a stale
object.
- Poll for the terminal state. Invoice paid and subscription active are
reached via webhook, asynchronously. The script polls Invoice.retrieve on an
interval with a deadline; it never sleeps for a fixed time and assumes the
webhook has landed.
- Assert. Invoice status, subscription status, amount, and the credit-grant
side effect must all match the scenario's expectation. Anything off fails the
run with the actual Stripe object dumped into result.json.
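The key guard in the first step can be sketched in a few lines. This is a minimal illustration, not the script's actual code: `require_test_key` and `UnsafeKeyError` are hypothetical names, and the only rule taken from the text is that the key must begin with sk_test_.

```python
import os
from typing import Optional


class UnsafeKeyError(RuntimeError):
    """Raised when the configured Stripe key is not a test-mode secret key."""


def require_test_key(key: Optional[str]) -> str:
    """Return the key unchanged if it is a test secret key, else refuse."""
    if not key or not key.startswith("sk_test_"):
        # Never echo the key itself: it may be a live secret.
        raise UnsafeKeyError(
            "STRIPE_SECRET_KEY must start with sk_test_; refusing to run"
        )
    return key


if __name__ == "__main__":
    # Run the guard before any browser or Stripe call is made.
    require_test_key(os.environ.get("STRIPE_SECRET_KEY"))
```

Checking the prefix before anything else runs means a misconfigured CI secret fails in the first line of the job rather than after a charge attempt.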
The Stripe-side helpers live in scripts/verify_checkout.py; adapt the
customer-lookup and credit-grant checks to your own data model where noted.
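The poll step might look roughly like the helper below. It is a sketch under assumptions: `poll_until` and `PollTimeout` are hypothetical names, and the fetch function is passed in so the same loop can wrap a call such as `stripe.Invoice.retrieve(invoice_id)` or a fake in a unit test.

```python
import time
from typing import Any, Callable


class PollTimeout(AssertionError):
    """Deadline passed; carries the last object seen for the failure report."""

    def __init__(self, message: str, last: Any) -> None:
        super().__init__(message)
        self.last = last


def poll_until(
    fetch: Callable[[], Any],
    is_done: Callable[[Any], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() every interval_s until is_done(result) or the deadline."""
    deadline = clock() + timeout_s
    while True:
        last = fetch()
        if is_done(last):
            return last
        if clock() >= deadline:
            raise PollTimeout(f"not terminal after {timeout_s}s", last)
        sleep(interval_s)


# Possible use with the Stripe client (invoice_id is a placeholder):
#   invoice = poll_until(
#       lambda: stripe.Invoice.retrieve(invoice_id),
#       lambda inv: inv.status == "paid",
#   )
```

On timeout the exception keeps the last fetched object in `last`, which is the piece to write into result.json. The `clock` and `sleep` parameters exist only so tests can run without real waiting.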
Gotchas
- Test mode vs live mode key mixups are the dangerous one. A sk_live_… key
here drives a real charge flow against real cards. The script hard-refuses any
key that isn't sk_test_…, so keep that guard. Likewise the publishable key in
the page must be pk_test_…; a test-secret / live-publishable mismatch makes
the Payment Element reject every card with a confusing error.
- Webhooks are asynchronous: poll, never sleep. The invoice flips to paid
and the subscription to active only after Stripe POSTs your webhook and your
handler processes it. That can take 1–15s in test mode. A sleep(3) followed by
an assert flakes constantly. Poll Invoice.retrieve / Subscription.retrieve
on an interval with a deadline and fail loudly on timeout, dumping the last
object you saw.
- Always pass an idempotency key. A rerun CI job or a double-submit must not
create two subscriptions / two charges. Generate one idempotency key per run
and pass it on every mutating Stripe call and on the checkout submit, so a
retry is a no-op rather than a second charge.
- The Payment Element is in an iframe (often nested). You cannot fill the card
field from the top frame. Locate the Stripe iframe (and, for 3DS, the nested
challenge iframe) and type into it. The card number, expiry, and CVC may each
live in their own frame depending on the integration.
- Decline ≠ error. A 4000…9995 decline is the expected outcome for the
decline scenario: the test passes when the charge fails cleanly, the invoice
is open/uncollectible, no credits were granted, and the UI surfaced the
decline. Treating a decline as a script failure inverts the test.
- 3DS needs the challenge handled. The 3DS card pops an authentication iframe;
the payment intent sits in requires_action until you complete it. Click the
"Complete authentication" control in the challenge frame, then poll the payment
intent until it reaches succeeded. Don't assert paid before the challenge is
done.
- Clean up test customers/subscriptions. Repeated runs accumulate test-mode
subscriptions on the same customer. Cancel the subscription created by the run
in a finally block (test-mode cancellation is free) so the next run reads a
clean latest-subscription rather than a stale active one.
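To make the "decline ≠ error" point concrete, here is a small sketch of a check that passes only when the decline happened cleanly. The function name and its three inputs are hypothetical; how you obtain them (Stripe invoice status, your own credit ledger, a UI locator) depends on your data model.

```python
from typing import List

# Invoice statuses the decline scenario accepts, per the gotcha above.
DECLINE_INVOICE_STATUSES = {"open", "uncollectible"}


def check_decline(
    invoice_status: str, credits_granted: bool, ui_error_shown: bool
) -> List[str]:
    """Return the failed expectations; an empty list means the scenario passed."""
    failures = []
    if invoice_status not in DECLINE_INVOICE_STATUSES:
        failures.append(
            f"invoice status is {invoice_status!r}, expected open or uncollectible"
        )
    if credits_granted:
        failures.append("credits were granted despite the decline")
    if not ui_error_shown:
        failures.append("UI did not surface the decline")
    return failures
```

Returning every failed expectation at once, rather than stopping at the first, gives a fuller picture when a "charged but no credits" style bug breaks more than one invariant.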
Files
scripts/verify_checkout.py — Playwright + Stripe driver. Guards against
non-test keys, drives the checkout UI with the scenario's test card (filling
the Stripe Payment Element iframe, handling the 3DS challenge), then looks up
the customer/invoice/subscription via the Stripe API and polls for the
webhook-driven terminal state before asserting invoice/subscription/credit
state. Cancels the run's subscription in finally. Referenced by Running it
and How the verifier works above.
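The cancel-in-finally behaviour described above could be structured roughly as follows. This is an illustrative sketch, not the script's code: `run_with_cleanup` and the `register` callback are hypothetical, and `cancel` stands in for whatever Stripe cancellation call your client version provides.

```python
import sys
from typing import Any, Callable, List


def run_with_cleanup(
    run: Callable[[Callable[[str], None]], Any],
    cancel: Callable[[str], Any],
) -> Any:
    """Run a scenario, then cancel every subscription it registered.

    run receives a register(subscription_id) callback and calls it for each
    subscription the scenario creates.
    """
    created: List[str] = []
    try:
        return run(created.append)
    finally:
        for sub_id in created:
            try:
                cancel(sub_id)
            except Exception as exc:
                # A cleanup failure must not mask the scenario's own result.
                print(f"warning: could not cancel {sub_id}: {exc}", file=sys.stderr)
```

Because the cancellation sits in `finally`, it runs whether the assertions passed or raised, and the original assertion error still propagates to the caller.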