| name | released-cards |
| description | Verify instrumentation, build measurement insights, close Slack loop for Released cards |
| disable-model-invocation | true |
Released Cards
For cards that moved to Released since the last run: verify instrumentation, build
measurement insights, close the loop in Slack, and flag gaps. The natural unit is
"cards released since the last processed time." The play is designed for
one-at-a-time depth but is fed a batch.
Also Read
Before starting, read these shared sections from box/shortcut-ops.md:
- Workspace Constants: Shortcut IDs, workflow states, mutation scripts
- API Quirks: Shortcut and Slack API gotchas
Constraints
- Mutation gate: A PreToolUse hook blocks all Slack and Shortcut mutations
through Bash. Route them through agenterminal.execute_approved, or present the
command for the user to run.
- Human-in-the-loop: Present the scoped card list at the step 4 checkpoint (end
of Phase 1 scope determination). Confirm before proceeding to notifications.
- Insight creation in main context: PostHog insight creation is a durable
write. Keep it under direct control rather than delegating.
- Before Shortcut or Slack API calls: check reference/tooling-logistics.md for
tested recipes. Don't re-derive payload shapes or endpoint paths from scratch.
Constants
| Constant | Value |
|---|---|
| #feature-launch | C0E586F33 |
| #ideas | C0ADJ4ATJE4 |
| Released state ID | 500000021 |
| PostHog project ID | 161414 |
| PostHog insight URL | https://us.posthog.com/project/161414/insights/{short_id} |
Steps
1. Check watermark and query Released cards
python3 box/watermark.py get play6 --days
If it prints a number, use it as the lookback in days and compute the --since
date as that many days before today. If there is no watermark, fall back to 2 weeks.
python3 box/shortcut-cards.py --state Released --since YYYY-MM-DD --exclude-comment-matching "Release impact" --summary
Cards with any comment containing "Release impact" (insight links or "No insight
needed") are excluded automatically. If zero cards, stop: no Play 6 needed.
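The lookback-to-date conversion in this step can be sketched as below. This is a minimal sketch, not the behavior of box/watermark.py itself: since_date is a hypothetical helper, and it assumes the script prints either a plain whole number of days or something non-numeric when no watermark exists.

```python
from datetime import date, timedelta
from typing import Optional

DEFAULT_LOOKBACK_DAYS = 14  # the two-week fallback from step 1


def since_date(watermark_output: str, today: Optional[date] = None) -> str:
    """Turn the watermark script's output into a --since value (YYYY-MM-DD)."""
    today = today or date.today()
    text = watermark_output.strip()
    # Assumption: a usable watermark prints as a plain non-negative integer.
    # Anything else (empty output, "none", ...) triggers the fallback.
    days = int(text) if text.isdecimal() else DEFAULT_LOOKBACK_DAYS
    return (today - timedelta(days=days)).isoformat()
```

The returned string is what would be substituted for YYYY-MM-DD in the shortcut-cards.py command.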
2. Scan #feature-launch for existing announcements
Scan #feature-launch (C0E586F33) for bot "shipped this!" posts AND human
announcements. Cards with an existing bot post have already been through this play.
Human announcements match by feature keywords, not SC number.
Do not filter the channel history by keyword. Fetch all recent messages and
cross-reference each card's feature area against the full list. Human posts use
natural language ("X is live", "just launched Y") that won't match bot patterns.
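The cross-reference in this step can be sketched as follows. Card, its keywords field, and candidate_set are hypothetical names, and the SC-NNN pattern for bot posts is inferred from the notification formats in this play. Substring matching on keywords is only a crude first pass: it surfaces likely human announcements and does not replace reading the messages.

```python
import re
from dataclasses import dataclass, field

# Bot posts carry the story number as "SC-NNN" inside the Slack link text.
SC_TAG = re.compile(r"\bSC-(\d+)\b", re.IGNORECASE)


@dataclass
class Card:
    sc_id: int
    # Hypothetical: feature-area terms a human announcement would likely use.
    keywords: list[str] = field(default_factory=list)


def candidate_set(cards: list[Card], messages: list[str]) -> list[Card]:
    """Return cards with neither a bot post nor a likely human announcement.

    `messages` is the full, unfiltered list of recent channel messages.
    """
    announced_ids = {int(n) for m in messages for n in SC_TAG.findall(m)}
    lowered = [m.lower() for m in messages]
    candidates = []
    for card in cards:
        if card.sc_id in announced_ids:
            continue  # a post already references this SC number
        if any(kw.lower() in m for kw in card.keywords for m in lowered):
            continue  # a message mentions the feature area
        candidates.append(card)
    return candidates
```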
Cross-reference to identify the candidate set: Released cards with no bot
"shipped" post and no matching human announcement.
3. Check #ideas threads for existing bot replies
For the candidate set, check #ideas threads (via external_links) for existing
"This shipped!" bot replies:
python3 box/slack-scanner.py --channel C0ADJ4ATJE4 --threads <permalink_urls>
A card may need only some phases (e.g. missing #feature-launch post but already
has #ideas reply).
4. Checkpoint
Present the scoped card list to the user with per-card status (what's done, what's
needed). Confirm scope before proceeding.
4.5. Link PRs to cards
For each confirmed card, find and attach the GitHub PR if not already linked.
This serves downstream plays (Play 7, Play 8) that need the PR without
searching git logs.
- Check the card's external_links for an existing GitHub PR URL
(github.com/tailwind/aero/pull/). If found, skip.
- Search for the PR: git log origin/main --oneline --grep="SC-{id}".
Extract the PR number from the merge commit's "Merge pull request #NNN"
message. If no match, try branch name patterns (sc-{id}/).
- If found, add the PR URL as an external link via
python3 box/shortcut-mutate.py add-link STORY_ID PR_URL
(route through execute_approved).
- If no PR found, flag it. The card may have shipped via config change,
manual deploy, or the PR title didn't reference the SC number.
Note: Shortcut's first-class pull_requests field requires the GitHub
integration webhook to be active (currently not populating). Until that's
fixed, external_links is the mechanism.
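A minimal sketch of the link check and PR-number extraction described in this step. has_pr_link and pr_url_from_log are hypothetical helpers, and the https:// scheme on the URL prefix is an assumption; the input is the text printed by the git log command above.

```python
import re
from typing import Optional

MERGE_RE = re.compile(r"Merge pull request #(\d+)")
PR_PATH = "github.com/tailwind/aero/pull/"
PR_URL_PREFIX = "https://" + PR_PATH  # assumption: links are stored with https://


def has_pr_link(external_links: list[str]) -> bool:
    """True if the card already carries a GitHub PR URL for this repo."""
    return any(PR_PATH in link for link in external_links)


def pr_url_from_log(log_output: str) -> Optional[str]:
    """Return the PR URL from the first merge commit in `git log --oneline` output."""
    for line in log_output.splitlines():
        match = MERGE_RE.search(line)
        if match:
            return PR_URL_PREFIX + match.group(1)
    return None  # no merge commit matched: flag the card instead of guessing
```

A None result corresponds to the "no PR found" branch, where the card is flagged.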
4a. Ship-phase checkpoint (notifications)
Before posting any notifications, present the batch plan:
"Entering ship phase (notifications). [N] 'This shipped!' #ideas replies
and [M] #feature-launch cross-posts planned. Each verified via channel
history read-back after posting."
Wait for user acknowledgment before the first execute_approved call.
Run python3 box/ship-gate.py enter before the first production
execute_approved call (the hook will block production mutations until the
plan is declared).
5. Notify: #ideas "This shipped!" replies
For each confirmed card that needs a "This shipped!" #ideas reply, submit via
execute_approved (formulaic, no approve_content needed).
Format: This shipped! <shortcut_url|SC-NNN: Story title>
Cards without Slack threads (no external_links) skip this step but still go
through #feature-launch cross-post and all remaining phases.
6. Notify: #feature-launch cross-posts
For each card that needs a cross-post, submit via execute_approved (formulaic).
Skip cards where a human already announced the feature (found in step 2).
Format: {Owner} shipped this! <shortcut_url|SC-NNN: Story title>
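A minimal sketch of how the cross-post text for this step could be assembled, including the multiple-owner and no-owner cases. The function names are hypothetical, and the comma handling for three or more owners is an assumption: this play only specifies the single-owner, two-owner ("Bill and Logan") and no-owner forms.

```python
def join_names(names: list[str]) -> str:
    """Join first names as "A", "A and B", or "A, B and C"."""
    if len(names) <= 1:
        return "".join(names)
    # Assumption: three or more owners are comma-separated before the final "and".
    return ", ".join(names[:-1]) + " and " + names[-1]


def feature_launch_message(
    owner_names: list[str], shortcut_url: str, sc_id: int, title: str
) -> str:
    """Build the #feature-launch cross-post using Slack's <url|text> link form."""
    link = f"<{shortcut_url}|SC-{sc_id}: {title}>"
    if not owner_names:
        return f"This shipped! {link}"  # no-owner fallback
    return f"{join_names(owner_names)} shipped this! {link}"
```

The owner_names argument is expected to hold first names already resolved from owner_ids.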
Map card owner_ids to first names via the Shortcut members API. Multiple owners:
"Bill and Logan shipped this!" No owners: fall back to "This shipped!"
7. Instrumentation check (delegated)
Delegate per card via agenterminal.delegate. PR diffs and card descriptions are
large; reading them in main context wastes budget. Each subagent receives one card
(or 2-3 clustered by product area) and returns a structured summary.
Use the prompt template in box/instrumentation-check-prompt.md. Don't write
the prompt from scratch.
Four outcome categories:
- Tracking implemented: proceed to insight creation.
- Tracking specified but not implemented: flag the gap.
- Tracking specified but feature surface not yet live: note it, revisit when
the surface ships.
- No tracking specified ("Nothing new"): note it and check whether $pageview
autocapture or existing events provide a usable proxy.
8. Build insights (main context)
- Before creating insights: check each card's existing Shortcut comments for
"Release impact insight:" lines. Cards re-entering Released (e.g. after a
rollback and re-deploy) may already carry insights from a prior cycle. If
insights exist, verify they still point to valid PostHog data and skip creation
unless the prior insight is stale or deleted.
- For cards with implemented tracking (or usable proxies), query PostHog to verify
events are firing.
Verify ALL subagent outcomes, including negatives. A subagent "not found"
means "my grep didn't match," not "the event doesn't exist." Query PostHog for
the event name before reporting a gap. Check Shortcut comments for developer
deploy verification notes.
Before reporting any tracking gap externally, query PostHog for recent events
and check live data. A gap in the PR diff does not mean a gap in production:
follow-up commits can close the gap after the subagent's snapshot.
- Save each query as a named PostHog insight. Every number needs a linkable saved
insight.
- Add insights to a dashboard. Check the Dashboards section of
box/posthog-events.md for the current Release Impact dashboard ID. If the month
has rolled over and no dashboard exists, create one and update the table in that
section.
After creating or updating any PostHog insight or dashboard, verify it landed:
python3 box/posthog-verify.py insight SHORT_ID [--name "Expected Name"]
or python3 box/posthog-verify.py dashboard DASHBOARD_ID [--name "Expected Name"].
After adding insights to a dashboard, verify tile placement:
python3 box/posthog-verify.py dashboard DASHBOARD_ID --contains SHORT_ID1,SHORT_ID2,...
Exit 0 = verified, 1 = verification failed, 2 = fetch failed.
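The three exit codes can be handled explicitly rather than as pass/fail. The sketch below wraps the tile-placement command shown above; VerifyResult, interpret_exit, and verify_dashboard_tiles are hypothetical names, and treating any other exit code as an error is an assumption.

```python
import subprocess
from enum import Enum


class VerifyResult(Enum):
    VERIFIED = 0             # the insight or dashboard matched expectations
    VERIFICATION_FAILED = 1  # fetched, but did not match
    FETCH_FAILED = 2         # could not fetch, so nothing was checked


def interpret_exit(code: int) -> VerifyResult:
    """Map a posthog-verify.py exit code to a named result."""
    try:
        return VerifyResult(code)
    except ValueError:
        # Assumption: codes other than 0, 1, 2 indicate a crash or misuse.
        raise RuntimeError(f"unexpected exit code {code}") from None


def verify_dashboard_tiles(dashboard_id: str, short_ids: list[str]) -> VerifyResult:
    """Run the tile-placement check and return the named result."""
    cmd = [
        "python3", "box/posthog-verify.py", "dashboard", dashboard_id,
        "--contains", ",".join(short_ids),
    ]
    return interpret_exit(subprocess.run(cmd).returncode)
```

Keeping FETCH_FAILED separate from VERIFICATION_FAILED may be useful: the first says nothing about whether the write landed, so re-running the check is a more natural next step than repeating the write.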
Dashboard override: If the user specifies that a card should get its own
dedicated dashboard (e.g. split tests, multi-metric experiments), create a
standalone dashboard with all of the card's insights instead of adding them to
the Release Impact dashboard. Use the dashboard comment format in step 10
instead of per-insight comments.
- For dual-path events (Schedule post, Add drafts, etc., routed via
tailwindForwarder -> bach -> PHP PostHog SDK): the events exist in PostHog but
won't show up as JS SDK capture calls. Check the PHP allowlist in PostHog.php if
a PR diff shows no tracking but the card's feature area uses the backend path.
Checkpoint counts: before any checkpoint that states a total (e.g.
"N Shortcut comments planned"), tally from the actual artifact list — don't
compute from memory. Proved 2026-04-23: stated "10 comments" across 6
checkpoints; actual was 8 (IDs 1787-1794).
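One way to make the tally mechanical is to derive the stated total from the artifact list at the moment the checkpoint text is built. The helper names below are hypothetical; the sample uses the comment IDs from the incident above.

```python
def checkpoint_line(label: str, artifacts: list) -> str:
    """Build a checkpoint statement whose count comes from the list itself."""
    return f"{len(artifacts)} {label} planned"


def check_stated_total(stated: int, artifacts: list) -> None:
    """Raise if a total written from memory disagrees with the artifact list."""
    actual = len(artifacts)
    if stated != actual:
        raise ValueError(f"stated {stated}, but the artifact list has {actual}")


comment_ids = list(range(1787, 1795))  # IDs 1787-1794 inclusive
print(checkpoint_line("Shortcut comments", comment_ids))
# prints "8 Shortcut comments planned"
```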
8a. Ship-phase checkpoint (findings + insight links)
Before posting findings or insight link comments, present the batch plan:
"Entering ship phase (findings). [N] Slack findings messages and [M]
Shortcut insight link comments planned. Slack posts verified via channel
history read-back. Shortcut comments verified via shortcut-comment.py
(re-fetches and confirms comment_id)."
Wait for user acknowledgment before the first execute_approved call.
Run python3 box/ship-gate.py enter before the first production
execute_approved call (the hook will block production mutations until the
plan is declared).
9. Report findings
- Present findings per card: what's measurable, what the early data shows, what
gaps exist.
- Draft Slack updates for cards with notable findings. One message per card,
including direct links to saved PostHog insights. Present each via
approve_content with content_type: "slack-message",
filename: "scNNN-findings". After approval, submit via execute_approved.
9a. Tracking opportunities (batch-level)
After reporting per-card findings, review the full batch for tracking gaps worth
filling. Two sources: the unmeasured_surfaces notes from step 7 subagents
(per-card raw material) and your own observations from step 8 insight-building
(where you saw the data and know what's missing).
- Review all unmeasured_surfaces notes alongside step 8 findings. Look for
cross-cutting patterns: single-card gaps are less interesting than blindness
across a product area.
- For each candidate, articulate what product question the tracking would answer
and why the answer matters now. "We can't tell if X" is not sufficient —
"We can't tell if X, which means we don't know Y" is.
- Present candidates to the user as a ranked list. Include: what's unmeasured,
what signal it would give, why that signal matters (what decision it informs),
and rough scope (new event, new property on existing event, pageview proxy, etc.).
- The user picks which candidates (if any) become cards. Propose each selected
card one at a time through normal card approval.
- Some runs produce nothing. That's fine — this is an opportunity, not a mandatory
deliverable. Don't pad the list to fill it.
10. Link insights to cards
Post a Shortcut comment on each card linking to its saved insight. Use
box/shortcut-comment.py via execute_approved (formulaic, no approve_content
needed). The script posts the comment and verifies it landed (re-fetches comments,
confirms comment_id present). Exit 0/1/2.
Example: python3 box/shortcut-comment.py SC-NNN "Release impact insight: [Name](url)"
Insight comment format:
Release impact insight: [Insight Name](https://us.posthog.com/project/161414/insights/{short_id})
Dashboard comment format (for split tests, multi-metric experiments):
Release impact dashboard: [Dashboard Name](https://us.posthog.com/project/161414/dashboard/{id})
Play 7's gather script fetches all insights from the dashboard automatically.
- If an insight covers multiple cards, post the same comment on each card.
- If a card has multiple insights, post one comment per insight.
- For cards reviewed but not needing an insight (infrastructure chores, cosmetic
changes), post:
Release impact: No insight needed — [brief reason]
Examples:
Release impact: No insight needed — cosmetic change (removed Alpha tag)
Release impact: No insight needed — infrastructure upgrade, no user-facing tracking
Release impact: No insight needed — bug fix, backend data integrity only
This marks the card as reviewed so Play 7's coverage gap detection can
distinguish "skipped intentionally" from "never ran."
11. Update watermark
This must be the final step — after all findings are presented and all comments
are posted. The watermark is a durable gate that determines what future Play 6
runs see. Setting it before findings are reviewed inverts the control loop: the
user can no longer re-scope or re-process cards.
Before updating watermark (point of no return — mechanical gate):
- Present findings summary to user and wait for explicit confirmation
- Spot-check at least one Shortcut comment (read back via API, not from OK response)
- Verify dashboard tile count matches expected insight count
- Spot-check one #ideas reply and one #feature-launch post via slack-scanner
(defense in depth — verifies the tool's verification is working)
Only then:
python3 box/watermark.py set play6
The production mutation gate blocks this command. Route through
agenterminal.execute_approved so the user sees and approves the watermark
advancement.
After any execute_approved failure (partial-state risk):
A failed batch script may have partially executed. Check actual state of ALL
items in the failed batch before retrying. Retry only unposted items.
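The retry rule can be sketched as a filter over the failed batch. items_to_retry is a hypothetical helper; already_posted is assumed to come from reading back actual state (channel history or card comments), not from the failed script's own output.

```python
def items_to_retry(batch: list[str], already_posted: set[str]) -> list[str]:
    """Return the batch items that did not land, preserving batch order."""
    return [item for item in batch if item not in already_posted]


# Usage with made-up identifiers: two of three posts landed before the failure.
batch = ["SC-1", "SC-2", "SC-3"]
landed = {"SC-1", "SC-3"}
print(items_to_retry(batch, landed))  # prints ['SC-2']
```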
Idempotency
Before each mutation:
- Check thread for existing "This shipped!" reply before posting a duplicate
- Check dashboard for existing insight before creating a duplicate
- Before posting an insight comment, check the card's existing comments for one
containing the insight's short_id URL. If it already exists, skip.
- Re-fetch card state from Shortcut before each operation (don't use stale cache)
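The insight-comment check in this list can be sketched as below. insight_already_linked is a hypothetical helper that takes the text of the card's freshly fetched comments; the URL pattern is the one from the Constants table. The trailing lookahead is a design choice so that a short_id does not match a longer ID sharing the same prefix.

```python
import re

INSIGHT_URL = "https://us.posthog.com/project/161414/insights/{short_id}"


def insight_already_linked(comment_texts: list[str], short_id: str) -> bool:
    """True if any existing comment already contains this insight's URL."""
    url = INSIGHT_URL.format(short_id=short_id)
    # Require the ID to end where the URL ends, not merely to be a prefix.
    pattern = re.compile(re.escape(url) + r"(?![A-Za-z0-9])")
    return any(pattern.search(text) for text in comment_texts)
```

If this returns True, the comment post for that card and insight is skipped.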
Key behaviors
- Cards without merged PRs get flagged (may have been released via config change,
manual deploy, or the PR is in a different repo)
- High-volume events (>10k/day) are useful for "nothing broke" checks but not for
attributing small UI changes. Note this when building insights for minor UI cards.
- This play owns "This shipped!" replies. Sync-ideas (Play 1) posts "Tracked"
replies only; "This shipped!" notifications live here.