ワンクリックで
blog-feedback
// Post-publication assessment for blog posts classified by the tier system. Records whether the classification was correct, captures engagement data, and feeds the calibration loop. Trigger with /blog-feedback.
// Post-publication assessment for blog posts classified by the tier system. Records whether the classification was correct, captures engagement data, and feeds the calibration loop. Trigger with /blog-feedback.
Backfill daily blog posts to startaitools.com for a date range. Auto-classifies content into 3 tiers (Field Note / Deep-Dive / Case Study). Tracks methodology decisions in JSONL for calibration analysis. Dual-publishes to tonsofskills.com/blog, cross-posts to Dev.to, Hashnode, Medium, and Substack. Generates weekly recaps and monthly retrospectives on cadence. Use when backfilling missed blog posts. Trigger with /blog-backfill.
Monthly calibration report for the blog content tier classification system. Analyzes decisions.jsonl and feedback.jsonl to detect tier inflation, assess calibration accuracy, and surface emergent patterns. Trigger with /blog-calibrate.
| name | blog-feedback |
| description | Post-publication assessment for blog posts classified by the tier system. Records whether the classification was correct, captures engagement data, and feeds the calibration loop. Trigger with /blog-feedback. |
| version | 1.0.0 |
| author | Jeremy Longshore <jeremy@intentsolutions.io> |
| license | Proprietary |
| tags | ["content","blog","methodology","feedback","calibration"] |
| allowed-tools | Read,Write,Edit,Glob,Grep,Bash(command:*),Agent,AskUserQuestion |
| argument-hint | <slug> [--correct|--wrong TIER] [--notes 'text'] |
| model | opus |
| compatibility | Designed for Claude Code |
Records post-publication assessments for posts classified by the blog-backfill tier system. Each feedback record captures whether the original classification was correct, optional engagement data, and qualitative notes. This data feeds /blog-calibrate for Brier score calculation and decision quality analysis.
Decoupled from classification time by design — you can assess a post days or weeks after publication, once you've seen how it performs.
content/posts/ with a matching decision record in methodology/decisions.jsonl/blog-calibrate uses this data — more feedback = better calibration$ARGUMENTS parsing:
<slug> — required. The post slug (kebab-case, matches filename without .md)--correct — flag indicating the original classification was correct--wrong <TIER> — the tier it should have been (1, 2, or 3)--notes "text" — optional qualitative notesExamples:
/blog-feedback knowledge-os-bootstrap --correct
/blog-feedback wcag-color-audit --wrong 2 --notes "Had more teaching value than expected"
/blog-feedback plugin-registry-design --correct --notes "Good call on Tier 3, got 2 backlinks"
If no flags provided, enter interactive mode (ask the user).
Search methodology/decisions.jsonl for the slug:
grep "\"slug\": \"$SLUG\"" /home/jeremy/000-projects/blog/startaitools/.claude/skills/blog-backfill/methodology/decisions.jsonl
If not found, check if the post exists in content/posts/$SLUG.md. If the post exists but has no decision record, it was published before the tier system — offer to create a retroactive classification.
Show the user:
If flags provided in arguments:
--correct or --wrong <TIER>--notes if presentIf interactive mode: Ask the user:
Append to methodology/feedback.jsonl:
{
"slug": "the-post-slug",
"date_assessed": "2026-04-09",
"original_tier": 2,
"correct_tier": null,
"was_correct": 1,
"reasoning": "User confirmed classification was appropriate",
"year_from_now_useful": null,
"engagement_data": null
}
Schema fields:
slug — matches decisions.jsonldate_assessed — today's dateoriginal_tier — from the decision recordcorrect_tier — null if correct, integer if wrongwas_correct — 1 if correct, 0 if wrongreasoning — why the user agrees/disagreesyear_from_now_useful — 1/0/null, assessed retrospectivelyengagement_data — JSON object with available metrics, or nullPrint summary:
Feedback recorded for "{title}":
Original: Tier {N} ({name}) @ {confidence}
Assessment: {CORRECT / WRONG → Tier N}
Notes: {notes or "none"}
Feedback records total: {count in feedback.jsonl}
Quick check:
For assessing multiple posts at once:
/blog-feedback --batch 2026-04-01 2026-04-07
Shows each post from the date range with its classification and asks for quick assessment (correct/wrong + optional tier). Efficient for weekly review sessions.
The original flow assumed human-judgment-is-truth. At ~30 posts/month with autonomous publishing, manual sweep proved unrealistic — feedback.jsonl sat empty from 2026-04-09 onward, and /blog-calibrate lost its Brier-score input. Auto-sweep fixes that with a deterministic, no-LLM rubric grader.
Script: /home/jeremy/000-projects/blog/startaitools/.claude/skills/blog-backfill/scripts/feedback-sweep.py
Cron wrapper: ~/bin/blog-feedback-sweep.sh (Sunday 10:00 local)
Logic (deterministic, no LLM, no API calls):
decisions.jsonl without a matching feedback.jsonl entry,feedback.jsonl entry with source: "structural_auto_confirm",User override: any auto-sweep record can be overridden by running /blog-feedback <slug> --correct or /blog-feedback <slug> --wrong N. feedback.jsonl is append-only and the latest record for a slug wins for calibration purposes.
Why deterministic (not LLM): the inflation problem this fixes was caused by an LLM-classifier reading anchors generously. Pasting another LLM on top to grade the first LLM's output reproduces the failure mode. The structural rubric is crude but auditable — when it's wrong, the user can see exactly why (line count, title words) and adjust the heuristic.
v2 (future): layer Umami engagement signal on top of structural grading. High views + low bounce on a Tier 1 → maybe under-classified. Low engagement on a Tier 3 → likely over-classified. Use engagement as a tiebreaker for borderline structural calls. Requires Umami API creds in cron env (currently MCP-only).
| Artifact | Location |
|---|---|
| Feedback record | methodology/feedback.jsonl (appended) |
| Console summary | Printed to screen |
Quick correct:
/blog-feedback knowledge-os-bootstrap --correct
Records that the classification was correct. Takes 2 seconds.
Wrong classification:
/blog-feedback wcag-color-audit --wrong 2 --notes "The WCAG debugging journey was more transferable than I expected"
Records that Tier 1 was wrong, should have been Tier 2. Valuable calibration data.
Interactive:
/blog-feedback plugin-registry-design
No flags — enters interactive mode, walks through the assessment questions.
Batch review:
/blog-feedback --batch 2026-04-01 2026-04-07
Reviews all posts from that week sequentially.