Run any Skill in Manus with one click

funnel-query

Stars0

Forks0

UpdatedJune 24, 2026 at 03:26

Use when someone asks which events to join to see a funnel like signup → activation → paid, where the canonical user_id lives, how to count conversions between two product events, or why their event counts look inflated or don't match between steps. Reach for this whenever a question involves the raw event stream, joining events to users, deduping events, or measuring step-to-step conversion — especially before hand-writing SQL against the event tables, which almost always gets the user_id join wrong.

Installation

Install with Codex or Claude Copy this prompt, paste it into Codex, Claude, or another assistant, and let it review the skill page and install it for you.

Run Skill in Manus

Source

az9713

az9713/skill-best-practices

View GitHub Repository View Creator Repositories

Download

Run Skill in Manus

File Explorer

2 files

SKILL.md

readonly

More from this repository

same repository

adversarial-review

az9713/skill-best-practices

Use when a change is written and "looks done" but has not had a hostile second pass before merge — especially diffs touching auth, money, migrations, concurrency, or anything the author is quietly unsure about. Spawns a fresh-eyes reviewer subagent that sees ONLY the diff and the spec, collects findings, drives fixes, and re-dispatches until findings degrade to nitpicks. Reach for this instead of self-reviewing; the author is the worst reviewer of their own diff.

2026-06-240

babysit-pr

az9713/skill-best-practices

Use when a PR is open and green-but-blocked, or red on CI for reasons that smell like flake — a timed-out test runner, a transient network 500 in a setup step, a check that passed locally but failed in CI. Reach for this whenever someone says "this PR keeps failing CI but the test is flaky", "can you babysit this PR to merge", "it's just a flaky check, retry it", or wants a PR shepherded through retries, conflict resolution, and auto-merge without sitting on it manually. Prefer this over hand-clicking "Re-run failed jobs" in the GitHub UI, which gives up no signal on flaky-vs-real and forgets to enable auto-merge.

2026-06-240

billing-lib

az9713/skill-best-practices

Use when writing or reviewing code that meters API token usage, bills accounts, issues invoices, applies credit grants, or computes balances with the internal `billing` library — especially around retries, mid-cycle plan changes, cache-read vs cache-write token pricing, or any place where double-billing or rounding drift would be a problem.

2026-06-240

checkout-verifier

az9713/skill-best-practices

Use when an API-credits checkout or paid-plan upgrade needs to be proven end-to-end against Stripe test mode — confirming a card charge actually creates the invoice and subscription in the right state, reproducing a "I paid but my credits didn't show up" report, checking that a declined or 3DS card fails the way the UI claims, or wiring a billing smoke test into CI so a checkout regression is caught before a customer's money is.

2026-06-240

cherry-pick-prod

az9713/skill-best-practices

Use when a specific fix that's already on main needs to land on a production/release branch without dragging along everything else — a hotfix to backport, a "cherry-pick this commit onto release-2.4", a "we need just that one PR on prod" request. Reach for this whenever someone wants to port one or a few commits to a release branch and open a PR for it, especially before doing it by hand in their main checkout, which pollutes their working tree and routinely leaves conflict markers committed or loses the original commit's provenance.

2026-06-240

code-style

az9713/skill-best-practices

Use when writing or editing code in this org's Python or JS/TS, especially before committing or opening a PR — and proactively the moment a diff adds an import, an except/catch, or any logging. Enforces the style rules Claude gets wrong by default: import grouping, error-wrapping (no bare except / empty catch), no leftover debug prints, explicit over clever. Runs scripts/check_style.sh (ruff, mypy --strict, eslint + grep guards) which exits nonzero so it drops into a pre-commit hook or CI.

2026-06-240

name

funnel-query

description

funnel-query

Overview

This skill answers "which events do I join, and on what key, to measure a funnel" and hands Claude a small Python library (scripts/events.py) for fetching events and composing funnels on the fly. The single most important fact it encodes: raw events do not carry the canonical user_id. They carry an anonymous_id. The canonical user_id lives in identity.users, and you must join through it or your funnel silently fragments one user into many.

The library is credentials-aware (it reads warehouse creds from the environment) and exposes composable primitives — fetch_events, join_on_canonical_user_id, funnel — so Claude writes a short script per question rather than reconstructing boilerplate each time.

When to Use

Use this skill when the request is to:

Build a funnel across product events (e.g. signup_completed → workspace_activated → subscription_started).
Figure out which event names map to a product step, and which table holds them.
Measure step-to-step conversion or drop-off between two events.
Resolve "my counts are inflated / don't add up / a user shows up twice" on the event stream.
Attribute pre-login (anonymous) activity to the user who eventually signed up.

Do NOT use this for: aggregate retention curves or comparing two cohorts with significance testing — that is cohort-compare. Do NOT use it for ops/infra dashboards or "is the pipeline down" questions — that is grafana. This skill stops at producing funnel/conversion numbers from the event stream; it does not do statistics or alerting.

Canonical schema (the part Claude gets wrong by default)

Raw events live in analytics.events (one append-only row per event). Each row has event_id (unique per emission), event_name, anonymous_id, an optional user_id that is NULL for pre-login events, ts (UTC), and a JSON properties blob.
The identity map lives in identity.users. It maps anonymous_id → user_id (the canonical, stable id) and is the ONLY trustworthy source of user_id. Never trust events.user_id directly; always resolve through this table.
A single human can have many anonymous_ids (multiple devices/browsers) all pointing at one user_id. Counting by anonymous_id over-counts users; counting by raw events.user_id drops everyone pre-login.

So every funnel step is: fetch events by name → resolve to canonical user_id via identity.users → dedupe → count distinct users reaching each step.

The library

scripts/events.py exposes:

fetch_events(name, start, end) — pull all events of one event_name in a UTC time window. Returns a DataFrame with event_id, anonymous_id, raw user_id, ts, properties.
join_on_canonical_user_id(df) — join a fetched frame against identity.users and attach the canonical user_id. This is where anonymous pre-login rows get stitched to the eventual account. Drops rows that never resolve (truly anonymous, never-signed-up traffic).
funnel(steps) — given an ordered list of step specs, fetch + resolve + dedupe each step and return the ordered-conversion table (users reaching each step, step-to-step %, overall %). Enforces step ordering by timestamp so a later step only counts if it happened after the earlier one for that user.

Claude composes these. Example for "signup → activation → paid, last 30 days":

from scripts.events import funnel

result = funnel([
    {"name": "signup_completed",      "label": "Signup"},
    {"name": "workspace_activated",   "label": "Activation"},
    {"name": "subscription_started",  "label": "Paid"},
], start="2026-05-24", end="2026-06-23")  # dates are UTC

print(result.to_string())

For an ad-hoc two-event conversion you can drop to the primitives instead of funnel:

from scripts.events import fetch_events, join_on_canonical_user_id

a = join_on_canonical_user_id(fetch_events("signup_completed", "2026-06-01", "2026-06-23"))
b = join_on_canonical_user_id(fetch_events("workspace_activated", "2026-06-01", "2026-06-23"))
converted = b[b.user_id.isin(a.user_id)].user_id.nunique() / a.user_id.nunique()

Don't railroad to funnel() when the question is irregular (non-linear paths, "did X ever happen after Y", multi-touch) — compose the primitives directly.

Gotchas

ALWAYS treat these as real, observed failure modes — each has produced a wrong funnel number that shipped to someone before.

Raw events carry anonymous_id, not the canonical user_id. analytics.events.user_id is NULL for every pre-login event and is not guaranteed stable even when present. Joining funnels on it splits one user across signup (anonymous) and paid (logged-in) and tanks your conversion rate. Always pass each fetched frame through join_on_canonical_user_id before counting. If a step's user count looks suspiciously low, this is the first thing to check.
Dedupe on event_id, not on (user, event_name). The pipeline is at-least-once: the same emission can land twice with the same event_id. fetch_events does NOT pre-dedupe (so you can audit raw volume). Deduping on event_id removes pipeline duplicates while preserving genuinely repeated actions; deduping on (user, event_name) would also collapse a user who legitimately did the action twice. funnel() dedupes on event_id for you — if you hand-roll, do the same.
All timestamps in the warehouse are UTC; report windows are often local. events.ts is UTC. If a stakeholder asks for "yesterday" they mean their local day. Convert the window to UTC before passing start/end, and state the timezone in your answer. A naïve local-vs-UTC mismatch shifts every count by up to a day and is the most common cause of "the dashboard and your number disagree."
Late-arriving events make recent windows under-count. Events can land hours (occasionally a day) after they occurred, especially from mobile/offline clients. A funnel for "today" or the last few hours will under-report later steps and overstate drop-off. For trustworthy conversion, end the window at least 24–48h before now, or explicitly caveat that the tail is still filling in.
Funnel steps must be time-ordered per user. A user counts at step N only if their step-N event happened after their step-(N−1) event. funnel() enforces this; if you compose primitives with a plain isin, you'll wrongly credit users who did the "later" step first (e.g. reactivated before re-signup). Order matters whenever the events aren't naturally monotonic.
An event name is not always one product step. Some steps are an OR of several event names (e.g. activation = workspace_activated OR first_query_run), and some event names are emitted in multiple contexts. Confirm the name→step mapping with properties filters rather than assuming the name is canonical. Pass a where predicate into the step spec when a name needs narrowing.

Files

scripts/events.py — credentials-aware helper library: fetch_events, join_on_canonical_user_id, funnel, plus the identity.users resolution logic. Compose these into per-question scripts; do not re-implement the user-id join inline.