| name | daily-digest |
| description | Produce a daily "What We're Hearing" digest from Intercom conversations and post to Slack |
| disable-model-invocation | true |
/daily-digest [date]
Produce a "What We're Hearing" digest for the specified date (default: prior
business day, computed by the script — do not compute day-of-week yourself)
and post to #daily-digest on Slack.
Quick reference
- Channel: C0ASG05F1SB (#daily-digest)
- Format: parent title + threaded body (Slack splits at ~3900 chars; thread makes this harmless)
- Research artifacts: box/research/digest-{date}-*.json and digest-{date}-draft.md
- Script: python3 box/daily-digest.py (steps 1-3, computes default date)
Steps
1. Run the data pipeline
python3 box/daily-digest.py [--sync]
Do not pass a date argument unless the user specifies a non-default date.
The script computes the prior business day automatically and prints the
resolved date with the day of week. If an explicit date is passed that
differs from the computed default, the script prints a warning.
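For orientation only, here is a minimal sketch of what "prior business day" means, assuming weekends are the only non-business days (holidays are ignored). The function name is hypothetical and this is not the script's actual code; box/daily-digest.py remains the source of truth for the resolved date.

```python
from datetime import date, timedelta


def prior_business_day(today: date) -> date:
    """Return the most recent weekday strictly before `today`.

    Assumption: only Saturday and Sunday are skipped; holidays are ignored.
    """
    candidate = today - timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate -= timedelta(days=1)
    return candidate


if __name__ == "__main__":
    resolved = prior_business_day(date.today())
    # Print the date together with its weekday, as the script is described to do.
    print(resolved.isoformat(), resolved.strftime("%A"))
```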
Use --sync if the target date is recent (within 24h) to ensure full coverage.
The script:
- Checks index freshness (hard gate, exits 2 if >36h stale)
- Queries conversations with CANNED_PATTERNS + operator bot filter applied
- Fetches Intercom API metadata (contact name, state, assignee, AI participation)
- Saves filtered conversations and metadata to box/research/
- Prints conversation IDs for classification
If the script blocks on freshness, run python3 box/intercom-sync.py --since {date}
first, then retry.
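The freshness gate described above can be pictured roughly as follows. This is a sketch under assumptions: the function name and the idea that the script compares a single "last indexed" timestamp against the current time are illustrative, and only the 36h threshold and exit code 2 come from this document.

```python
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(hours=36)  # threshold stated in this skill
EXIT_STALE = 2                       # exit code stated in this skill


def freshness_exit_code(last_indexed_at: datetime, now: datetime) -> int:
    """Return 0 when the index is fresh enough, 2 when it is more than 36h old."""
    return EXIT_STALE if now - last_indexed_at > MAX_STALENESS else 0


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical timestamp: an index last refreshed 40 hours ago is blocked.
    print(freshness_exit_code(now - timedelta(hours=40), now))
```

A non-zero result is the cue to run the sync command and retry rather than to work around the gate.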
2. Delegate classification
Split the conversation IDs from the script output into 3 roughly equal groups.
Delegate each to a sonnet subagent with this prompt template:
You are classifying Intercom customer support conversations for a daily digest.
**Database**: `psql postgresql://localhost:5432/feedforward`
**Your assigned conversation IDs** ({N} conversations):
{comma-separated IDs}
**For each conversation**: Run this SQL to get the full text:
SET client_min_messages TO WARNING;
SELECT conversation_id, contact_email, full_text
FROM conversation_search_index WHERE conversation_id = '{id}';
Read the full_text and classify with:
- **summary**: One sentence describing what the customer is asking/reporting
- **area**: Product area (Billing, Pin Scheduler, SmartPin, Turbo, SmartBio,
Account, General, Onboarding, Credits, Brand Kit, Keyword Research,
Ghostwriter, Design Pins, or other)
- **skip**: true if NOT a real customer conversation (spam, outreach, sales pitch,
automated email, directory listing, internal outreach, phishing). false for real
customer conversations.
- **notable**: true if this is a specific bug report, interesting product signal,
bot failure, or something unusual. false for routine questions.
- **themes**: Array of sub-theme tags at granular level. Use lowercase-hyphenated
format. Not "billing" but "surprise-renewal-no-notification".
- **bot_quality**: One of: "appropriate" (bot handled well or escalated promptly),
"unhelpful" (gave wrong/irrelevant instructions, repeated without adapting),
"wrong_info" (stated incorrect product facts or fabricated capabilities),
"escalation_failure" (customer asked for human, bot didn't escalate).
Only assess the bot's behavior, not the outcome.
**Important**: Read the FULL text of each conversation, not just the first message.
Output instructions for each delegate:
Return a JSON array of objects. Each: conversation_id (string), summary (string),
area (string), skip (boolean), notable (boolean), themes (string array),
bot_quality (string). No prose, just the JSON array.
Use model: claude-sonnet-4-6. Don't set timeout_ms — the default is the maximum (30 min).
Collect all three in parallel. Merge results and save to
box/research/digest-{date}-classifications.json.
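The split and merge can be sketched as below. The helper names are hypothetical; the field names and the four bot_quality values are taken from the prompt template above. The validation step is an added suggestion, not something this skill currently requires.

```python
import json

ALLOWED_BOT_QUALITY = {"appropriate", "unhelpful", "wrong_info", "escalation_failure"}
REQUIRED_FIELDS = {
    "conversation_id": str, "summary": str, "area": str,
    "skip": bool, "notable": bool, "themes": list, "bot_quality": str,
}


def split_into_groups(ids, n_groups=3):
    """Split ids into n_groups contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(ids), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = base + (1 if i < extra else 0)
        groups.append(ids[start:start + size])
        start += size
    return groups


def validate_row(row):
    """Return a list of schema problems for one classification object."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(row.get(field), expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    quality = row.get("bot_quality")
    if isinstance(quality, str) and quality not in ALLOWED_BOT_QUALITY:
        problems.append(f"bot_quality: unknown value {quality!r}")
    return problems


def merge_results(delegate_outputs):
    """Parse each delegate's JSON array and concatenate, rejecting bad or duplicate rows."""
    merged, seen = [], set()
    for raw in delegate_outputs:
        for row in json.loads(raw):
            problems = validate_row(row)
            if problems:
                raise ValueError(f"invalid row {row.get('conversation_id')}: {problems}")
            if row["conversation_id"] in seen:
                raise ValueError(f"duplicate conversation_id: {row['conversation_id']}")
            seen.add(row["conversation_id"])
            merged.append(row)
    return merged
```

Failing loudly on a malformed row keeps a delegate's formatting mistake from silently shrinking the day's totals.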
Don't quote counts from delegate output as facts before spot-checking.
Subagent area/skip/notable tallies are proxy data. Spot-check at least one
spam and one notable before synthesizing counts into narration.
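One way to choose the spot-check targets is a small sampler over the merged classifications. This is a sketch: the function name and the seeded random choice are assumptions, and any selection method that yields at least one spam and one notable conversation satisfies the rule above.

```python
import random


def spot_check_candidates(classifications, seed=0):
    """Pick one skip=true row and one notable row to verify by hand.

    Returns None for a category with no rows. Seeded so a rerun picks the same IDs.
    """
    rng = random.Random(seed)
    spam = [r["conversation_id"] for r in classifications if r["skip"]]
    notable = [r["conversation_id"] for r in classifications
               if r["notable"] and not r["skip"]]
    return {
        "spam": rng.choice(spam) if spam else None,
        "notable": rng.choice(notable) if notable else None,
    }
```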
3. Primary reads
Quality gate: every conversation linked in the digest must be primary-read.
Subagent classifications are filters, not findings.
Read priority:
- ALL conversations flagged notable: true. Use python3 box/intercom-search.py --read {id1},{id2},...
- Enough of the large theme clusters to verify sub-theme characterization
- Any non-notable conversation you plan to link in the digest
Do not head-limit primary reads (no | head, no limit parameter on
Read calls). Signal appears in unexpected conversation positions. If a batch
read is too large for context, split it into smaller batches rather than
truncating. Proved 2026-04-28: head -400 on a 5-conversation batch happened
to fit, but the practice risks silent truncation.
As you read, note:
- Subagent classifications to correct (notable status, area, skip)
- Conversations the subagent missed as notable
- Theme patterns emerging from the actual text (not from subagent labels)
- Bot behavior: wrong instructions, escalation failures, fabricated info,
repeated unhelpful responses. Use bot_quality classifications as a starting
filter but verify during primary reads.
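The read priority and the "split, don't truncate" rule can be sketched as a queue builder plus a batcher. The helper names and the default batch size of 5 are assumptions; the --read flag and its comma-separated IDs come from this skill.

```python
def primary_read_queue(classifications, planned_links):
    """Order IDs for primary reads: every notable first, then any other planned link."""
    queue = [r["conversation_id"] for r in classifications if r["notable"]]
    queued = set(queue)
    for cid in planned_links:
        if cid not in queued:
            queue.append(cid)
            queued.add(cid)
    return queue


def read_commands(ids, batch_size=5):
    """Build one intercom-search.py invocation per batch; output is never head-limited."""
    commands = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        commands.append(["python3", "box/intercom-search.py", "--read", ",".join(batch)])
    return commands
```

If a batch still overflows context, lower batch_size and rerun instead of piping through head.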
4. Draft the digest
Before composing: Output an accounting block as visible text in the
conversation (not embedded in the draft file). List every conversation ID
that will be linked in the digest. For each, note whether it was primary-read
and any classification corrections. This enumeration precedes the draft Write —
don't compose first and account after. The block's purpose is auditability:
if it's inside the artifact, it was composed alongside the synthesis rather
than functioning as a pre-synthesis gate. Proved 2026-04-16: draft linked 5
conversations not yet primary-read; accounting after the Write missed the gap.
Proved 2026-04-22: accounting block embedded in draft file rather than visible
output; Monitor caught as effort_substitution (medium).
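A sketch of how the accounting block might be assembled before any Write call. The function name, the output layout, and the "NOT READ" marker are assumptions; the only requirement taken from the text is that every linked ID is listed with its primary-read status and any corrections.

```python
def accounting_block(linked_ids, primary_read, corrections):
    """Render the pre-draft accounting text and return it with the list of unread IDs.

    linked_ids:   conversation IDs the digest will link, in order
    primary_read: set of IDs that were actually primary-read
    corrections:  dict mapping ID to a note on classification corrections
    """
    lines = ["Accounting (pre-draft):"]
    unread = []
    for cid in linked_ids:
        if cid in primary_read:
            status = "primary-read"
        else:
            status = "NOT READ"
            unread.append(cid)
        note = corrections.get(cid, "no corrections")
        lines.append(f"- {cid}: {status}; {note}")
    return "\n".join(lines), unread
```

A non-empty unread list means the draft is not ready to be written: read those conversations first, then regenerate the block and output it in the conversation.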
Write to box/research/digest-{date}-draft.md. Re-read the classification file
and metadata file before composing — don't work from memory.
Format (v2, theme-first):
_What We're Hearing_ ({day of week}, {month} {day})
_{N} customer conversations ({M} spam filtered). Here's what stood out._
*{Theme 1 title}*
{Count} of {total}. {Characterization of what people are saying.}
• *{Sub-theme}* ({count}) — {1-2 sentences}. Link · Link
• *{Sub-theme}* — {1-2 sentences}. Link
*{Theme 2 title}*
...
:red_circle: *Notable bugs*
• *{Bug title}* — {repro context}. Link
...
:eyes: *Signal worth watching*
• *{Signal}* — {why it matters}. Link
...
:robot_face: *Bot observations*
• *{Pattern}* — {what happened, which conversations}. Link
_By the numbers:_ {Area} {count} · {Area} {count} · ...
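The closing "By the numbers" line can be derived from the corrected classifications rather than typed from memory. A minimal sketch, assuming spam (skip: true) rows are excluded from the area counts and areas are listed from most to least frequent; both choices are assumptions, as is the function name.

```python
from collections import Counter


def by_the_numbers(classifications):
    """Build the '_By the numbers:_' line from non-skipped classification rows."""
    counts = Counter(r["area"] for r in classifications if not r["skip"])
    parts = [f"{area} {count}" for area, count in counts.most_common()]
    return "_By the numbers:_ " + " · ".join(parts)
```

Run it on the classifications after primary-read corrections have been applied, since the raw subagent tallies are proxy data.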
Bot observations that assert product correctness (e.g. "bot fabricated a
capability," "bot gave wrong explanation") require codebase verification before
the label goes into the draft. Conversation text shows what the bot said; it
does not establish whether the bot was factually wrong. Grep or read the
relevant code/docs before using labels like "fabricated" or "wrong."
Proved 2026-04-23: "fabricated capability" label for Turbo messaging was
accurate but was applied from memory before codebase verification; user
correction on Failed/Missed Posts tab exposed the class of error.
Link format: <https://app.intercom.com/a/inbox/_/inbox/conversation/{id}|{Name}>
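A small helper for producing links in this format. The function name is hypothetical. Escaping &, < and > in the label follows Slack's documented rule for message text; the URL pattern is the one given above.

```python
def slack_link(conversation_id: str, name: str) -> str:
    """Return a Slack mrkdwn link to an Intercom conversation, labelled with the contact name."""
    # Slack treats &, < and > as control characters in message text, so escape them.
    safe_name = name.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    url = f"https://app.intercom.com/a/inbox/_/inbox/conversation/{conversation_id}"
    return f"<{url}|{safe_name}>"
```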
Theme granularity: sub-themes within categories ("surprise renewals without
notification" not "billing"). Let themes emerge from the data — don't copy
prior day's structure.
5. Present for approval
Before presenting: Re-read the draft file and spot-check at least 2
attributed quotes against their source conversations. Don't vouch for accuracy
from memory — the verification must precede the claim. Proved 2026-04-21:
Billing 23/24 count error caught by reflection, but 3 quote attributions
and 26 link mappings asserted without re-reads.
Show the exact text to the user. This is the approval gate. The user reviews
the framing, theme selection, and linked conversations.
In headless sessions, the user cannot see Read tool output. "Show the
exact text" means copy the full draft into a conversation message. Reading the
file is verification; posting it in the conversation is presentation. These
are separate steps. Proved 2026-04-28: agent read the draft twice, told user
"ready for review" twice, never showed the text.
6. Post to Slack
Two mutations via execute_approved, one at a time:
- Parent message:
python3 box/slack-mutate.py post C0ASG05F1SB '{title text}'
Record the returned ts value.
- Thread reply:
python3 box/slack-mutate.py reply C0ASG05F1SB {ts} "$(cat /tmp/ff-digest-body.txt)"
Save body to a temp file first to avoid shell escaping issues.
Verify the thread after posting (read back via Slack API).
slack-mutate.py's VERIFIED output covers one fragment. If Slack split the
body (>3900 chars), use slack_read_thread to confirm all sections landed.
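To decide whether a thread read-back is needed and to check the result, something like the following may help. It is a sketch: 3900 is the approximate limit quoted in this skill, how Slack chooses its split points is not specified here, and both helper names are hypothetical.

```python
import math

SLACK_SPLIT_CHARS = 3900  # approximate limit quoted in this skill


def expected_fragments(body: str) -> int:
    """Estimate how many thread replies Slack will produce for this body."""
    return max(1, math.ceil(len(body) / SLACK_SPLIT_CHARS))


def missing_sections(thread_text: str, section_headings) -> list:
    """Return the section headings from the draft that do not appear in the read-back text."""
    return [heading for heading in section_headings if heading not in thread_text]
```

When expected_fragments returns more than 1, concatenate the replies returned by the thread read and pass them to missing_sections together with the draft's section titles; an empty list means every section landed.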
Known gotchas
- Slack splits at ~3900 chars. Thread format makes this harmless (just
becomes 2-3 thread replies). Don't try to compress the digest to fit.
- source.author.name for operator-initiated conversations returns the
bot name. The metadata file may show bot names for some contacts. Check
the conversation text if a contact name looks like a bot.
- Sync lag. The 4h launchd cron means evening conversations may be missing
if running the morning after. Use --sync for recent dates.
- Subagent theme tags guide reads, they don't replace them. The notable
flag has ~90% accuracy. Read the conversation before trusting the classification.
- Don't copy prior day's theme structure. Each day's themes should emerge
from fresh reads. Using yesterday's sections as a template biases what you see.