| name | context-memory-review |
| description | Weekly review of an investigation tenant-context memory file against the most recent SOC scan reports (e.g. Threat Pulse). Surfaces candidate ADD / MODIFY / FLAG changes to the context file as a propose-only review document for human approval — it NEVER edits the context file, commits, or opens a PR. Trigger on 'review my context file', 'review tenant context', 'propose context updates', 'what should I add to my context memory'. |
Context Memory Review — Instructions
Purpose
Investigation workflows in this project lean on a tenant-context memory file — a local, gitignored
living document that records environment-specific ground truth (known automation/orchestration
fingerprints, known-good IPs, account classifications, honeypot/field-device inventory, validated
personnel, and documented false-positive rules). Scan automations (e.g. the daily Threat Pulse) read
that file to render accurate verdicts.
Over a week of scans, drill-down investigations validate new ground truth — new IPs, new personas,
new FP classes, new device classes — that is not yet captured in the context file. This skill reads the
last N days of scan reports, compares them against the current context file, and produces a
propose-only review document: a list of discrete, human-reviewable candidate changes (ADD / MODIFY /
FLAG) with section anchors, proposed text, supporting evidence, recurrence counts, and confidence.
This skill is the first half of a deliberate two-phase, human-in-the-loop workflow:
| Phase | Who | Action |
|---|---|---|
| 1. Propose (this skill) | Automation / interactive | Read reports + context file → emit review doc. No edits. |
| 2. Apply (separate, manual) | Human-directed interactive session | Operator reviews the doc, says "apply items X, Y, Z" → surgical edits to the context file. |
🔴 CRITICAL RULES — READ FIRST
- PROPOSE-ONLY. NEVER edit the context file in this skill. Do not write, append to, or modify the
context memory file. Do not git commit, push, or open a PR. The only file this skill writes is the
review document in the output directory.
- Read-only against the tenant. If any live queries are needed to corroborate a candidate change,
they MUST be read-only (per the Remediation Output Policy). Prefer evidence already present in the
reports; query the tenant only to disambiguate a contradiction.
- ⛔ Feedback-loop guard (the single most important rule). Scan reports are partly downstream of
the context file: a scan verdict may simply echo an existing context entry rather than
independently confirm it. You MUST distinguish:
  - First-party validation: a drill-down in the report actually ran a query/enrichment and
    confirmed the fact (e.g. "enriched IP 203.0.113.10 → datacenter ASN, 0 abuse reports, recurred on
    3 days"). This CAN drive a High-confidence proposal.
  - Context-derived echo: the report verdict only restated something the context file already
    said ("🟢 known orchestration IP per tenant context"). This must NOT be promoted into a new or
    strengthened entry. Promoting echoes entrenches errors. When unsure, classify as echo.
- Never propose weakening or removing a documented FP/safety guardrail based solely on its absence
from the week's reports. Absence of a finding ≠ obsolescence of a guardrail. Staleness candidates are
FLAG-only, Low confidence, for human judgment; never auto-REMOVE.
- Evidence-based only. Every proposed change cites the specific report file(s), date(s), and
finding it derives from. Never invent entities, counts, IPs, UPNs, or dates. If the reports don't
support a change, don't propose it.
- PII stays local. The review document will contain live tenant entities (IPs, UPNs, device names).
Write it ONLY to the gitignored output directory. Never commit it, never include it in a PR, never
paste tenant PII into any git artifact.
Inputs (supplied by the invoking prompt / workflow)
The invoking workflow or user supplies these. If invoked interactively without them, ask once, then
proceed with the defaults shown.
| Input | Meaning | Default |
|---|---|---|
| `context_file` | Absolute path to the tenant-context memory file to review | (must be provided) |
| `reports_dir` | Directory (or glob) holding the scan reports to review | (must be provided) |
| `reports_glob` | Filename pattern for the reports of interest | `*.md` |
| `lookback_days` | How far back to include reports (by filename date or mtime) | `7` |
| `output_dir` | Where to write the review document (must be gitignored) | `reports/context-reviews` |
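As an illustration only, the inputs and defaults in the table could be carried in a small record like the one below. The class name `ReviewInputs` and the `lookback_days >= 1` check are assumptions made for this sketch and are not defined by this skill.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ReviewInputs:
    """Hypothetical container for the five inputs listed in the table."""

    context_file: Path   # required, no default
    reports_dir: Path    # required, no default
    reports_glob: str = "*.md"
    lookback_days: int = 7
    output_dir: Path = Path("reports/context-reviews")

    def __post_init__(self) -> None:
        # The table asks for an absolute path, so reject relative ones early.
        if not self.context_file.is_absolute():
            raise ValueError("context_file must be an absolute path")
        # Assumption for this sketch: a window shorter than one day is a mistake.
        if self.lookback_days < 1:
            raise ValueError("lookback_days must be at least 1")
```

Supplying only the two required values leaves the other three at the defaults shown in the table.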
Execution Workflow
Phase 0 — Load inputs and current state
- Read the context file in full (`context_file`). Build an internal index of its structure: every
section heading (the anchor targets for proposals), and within sections the discrete entries — table
rows (e.g. IP tables), bullet points, labelled sub-notes (e.g. "A.2", "Section C"), device entries.
Note any validated YYYY-MM-DD provenance stamps.
- Enumerate the reports in window. List files in `reports_dir` matching `reports_glob`, select those
whose date (from filename YYYYMMDD if present, else file mtime) falls within `lookback_days`. Sort
oldest→newest. If zero reports are in window, STOP and report "no reports in window — nothing to
review" (a normal quiet-week outcome, not a failure).
- Read each in-window report. For large reports, read in ranges. Extract structured signal:
  - Concrete entities that appeared with a verdict: IPs, UPNs/accounts, device/host names, OAuth apps,
    incident IDs, CVEs.
  - For each: was the verdict reached by a first-party drill-down (a query/enrichment was executed
    in the report) or an echo of existing context? Capture the distinction; it gates confidence.
  - New FP classes / tuning notes the report's drill-downs articulated.
  - Any contradiction: a drill-down that concluded the opposite of an existing context entry.
- Note in each report whether the context file was successfully loaded/applied during that scan (the
reports state this) — echoes only count as echoes if context was actually applied.
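The report-enumeration step can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, the window is assumed to be inclusive of both the cutoff day and today, and an eight-digit run that is not a real date is assumed to fall back to mtime.

```python
import re
from datetime import date, datetime, timedelta
from pathlib import Path

# An isolated run of exactly eight digits in the filename, e.g. report_20240108.md
_DATE_IN_NAME = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def report_date(path: Path) -> date:
    """Date from a YYYYMMDD run in the filename, else the file's mtime."""
    match = _DATE_IN_NAME.search(path.name)
    if match:
        try:
            return datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            pass  # eight digits that are not a valid date: fall back to mtime
    return datetime.fromtimestamp(path.stat().st_mtime).date()


def reports_in_window(
    reports_dir: Path, reports_glob: str, lookback_days: int, today: date
) -> list[Path]:
    """In-window reports sorted oldest to newest (empty list on a quiet week)."""
    cutoff = today - timedelta(days=lookback_days)
    dated = [(report_date(p), p) for p in reports_dir.glob(reports_glob) if p.is_file()]
    in_window = [(d, p) for d, p in dated if cutoff <= d <= today]
    return [p for _, p in sorted(in_window, key=lambda pair: (pair[0], pair[1].name))]
```

An empty return value corresponds to the "nothing to review" outcome and should be reported as such rather than treated as an error.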
Phase 1 — Correlate across the week
Aggregate signal across all in-window reports:
- Recurrence — For each candidate entity/pattern, count how many distinct report-days it appeared on
with a consistent first-party classification. Recurrence is the backbone of confidence.
- Match against the context file index — For each candidate, determine whether it is:
  - Absent from the context file → ADD candidate.
  - Present but refined by the reports (role/volume/scope changed, new regional sibling, expanded
    persona list) → MODIFY candidate.
  - Present and merely echoed (no new first-party info) → NOT a candidate (drop it; feedback-loop
    guard). At most, it may justify refreshing a `validated` date if a first-party drill-down
    re-confirmed it, and that is a Low/Medium MODIFY, clearly labelled "provenance refresh only".
  - Contradicted by a first-party drill-down → FLAG candidate (never auto-resolve).
- Staleness sweep (FLAG-only) — Identify context entries that were NOT referenced by ANY in-window
report. These are candidates for human review, not removal. Low confidence. Exclude documented
safety/FP guardrails from staleness flags entirely (their value is in preventing future errors, not
in weekly hit-rate).
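The correlation rules above can be sketched roughly as below. Everything here is an assumption made for illustration: the `Observation` record, its field names, and the `correlate` function are hypothetical, the entity values in any usage are placeholders, and the sketch simplifies "consistent classification" to a count of distinct first-party report-days.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    entity: str                # an IP, UPN, device name, etc. as seen in a report
    report_day: str            # YYYY-MM-DD of the report
    first_party: bool          # True only if the report ran its own query/enrichment
    refines: bool = False      # drill-down changed role/volume/scope of an entry
    contradicts: bool = False  # drill-down disagreed with the context entry


def correlate(observations, context_entities, guardrails=frozenset()):
    """Return {entity: (candidate kind, distinct first-party report-days)}."""
    by_entity = defaultdict(list)
    for obs in observations:
        by_entity[obs.entity].append(obs)

    candidates = {}
    for entity, group in by_entity.items():
        first_party = [o for o in group if o.first_party]
        if not first_party:
            continue  # context-derived echo only: dropped by the feedback-loop guard
        days = len({o.report_day for o in first_party})
        if entity not in context_entities:
            kind = "ADD"
        elif any(o.contradicts for o in first_party):
            kind = "FLAG (contradiction)"
        elif any(o.refines for o in first_party):
            kind = "MODIFY"
        else:
            kind = "MODIFY (provenance refresh only)"
        candidates[entity] = (kind, days)

    # Staleness sweep: entries that no in-window report mentioned at all.
    # Documented guardrails are excluded entirely.
    stale = set(context_entities) - set(by_entity) - set(guardrails)
    for entity in sorted(stale):
        candidates[entity] = ("FLAG (staleness)", 0)
    return candidates
```

Note that an entity seen only as an echo is neither proposed nor flagged as stale: it was referenced during the week, just not independently validated.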
Phase 2 — Score and assemble proposals
Assign each candidate a type and confidence:
| Type | When |
|---|---|
| ADD | New, first-party-validated fact absent from the context file. |
| MODIFY | Existing entry that a first-party drill-down refined/expanded, or a provenance-refresh. |
| FLAG | A contradiction needing human judgment, or a staleness candidate. Never an auto-edit. |
| Confidence | Criteria |
|---|---|
| High | First-party validated AND recurred on ≥3 report-days (or a single explicit, thorough validated drill-down with enrichment/queries). Consistent classification, no contradicting evidence. |
| Medium | First-party validated on 2 report-days, OR 1 strong drill-down without recurrence. |
| Low | Single weak signal, provenance-refresh only, or any FLAG/staleness candidate. |
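One possible reading of the confidence table as a function is sketched below. The function name and the three drill-down labels (`"weak"`, `"strong"`, `"thorough"`) are assumptions introduced here to separate the table's "single explicit, thorough validated drill-down" from its "1 strong drill-down"; the skill itself does not define such labels. The sketch follows the table in treating a provenance refresh as Low.

```python
def score_confidence(
    candidate_type: str,
    first_party_days: int,
    strongest_drilldown: str = "weak",
    provenance_refresh_only: bool = False,
) -> str:
    """Map a candidate to High / Medium / Low per the criteria table.

    candidate_type is "ADD", "MODIFY" or "FLAG"; first_party_days counts the
    distinct report-days with a consistent first-party classification.
    """
    # Any FLAG, any provenance-only refresh, and anything without
    # first-party validation stays Low.
    if candidate_type == "FLAG" or provenance_refresh_only or first_party_days < 1:
        return "Low"
    if first_party_days >= 3 or strongest_drilldown == "thorough":
        return "High"
    if first_party_days == 2 or strongest_drilldown == "strong":
        return "Medium"
    return "Low"  # single weak signal
```

Because contradictions are typed as FLAG before scoring, the "no contradicting evidence" condition for High is satisfied by construction in this sketch.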
For every proposal, produce:
- ID — sequential (`P1`, `P2`, …).
- Type + Confidence.
- Target section — the exact heading/anchor in the context file where it belongs (for ADD), or the
exact existing entry text being changed (for MODIFY/FLAG).
- Proposed text — for ADD/MODIFY, the literal line/table-row/bullet to insert or the
before→after change, written in the context file's existing style and including a
(validated <today's date>) stamp where the file uses that convention.
- Rationale — one or two sentences.
- Evidence — the report file name(s) + date(s) + the specific finding, and an explicit note of
whether it was first-party or echo (only first-party drives ADD/MODIFY).
- Recurrence — "appeared on N of M report-days".
- Apply instruction — precise enough for a later interactive session to make a surgical edit (which
section, insert-after-which-line, exact text). For FLAG items, the question the human must answer.
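The fields listed above could be held in a record that refuses to exist when the critical rules are violated. This is a hedged sketch: the class names `Evidence` and `Proposal`, the field names, and the specific checks are illustrative choices, not part of the skill.

```python
from dataclasses import dataclass, field

VALID_TYPES = {"ADD", "MODIFY", "FLAG"}
VALID_CONFIDENCE = {"High", "Medium", "Low"}


@dataclass(frozen=True)
class Evidence:
    report_file: str
    report_date: str   # YYYY-MM-DD
    finding: str
    first_party: bool  # False means the report only echoed the context file


@dataclass
class Proposal:
    id: str                 # "P1", "P2", ...
    type: str               # ADD / MODIFY / FLAG
    confidence: str         # High / Medium / Low
    target: str             # section anchor (ADD) or exact existing entry (MODIFY/FLAG)
    rationale: str
    recurrence: str         # "appeared on N of M report-days"
    apply_instruction: str  # for FLAG: the question the human must answer
    proposed_text: str = ""
    evidence: list[Evidence] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.type not in VALID_TYPES:
            raise ValueError(f"unknown type: {self.type}")
        if self.confidence not in VALID_CONFIDENCE:
            raise ValueError(f"unknown confidence: {self.confidence}")
        if self.type == "FLAG" and self.confidence != "Low":
            raise ValueError("FLAG items are always Low confidence")
        if self.type in {"ADD", "MODIFY"}:
            if not self.proposed_text:
                raise ValueError("ADD/MODIFY need literal proposed text")
            # Feedback-loop guard: echoes alone never drive ADD/MODIFY.
            if not any(e.first_party for e in self.evidence):
                raise ValueError("ADD/MODIFY need first-party evidence")
```

Failing at construction time keeps an echo-only ADD from ever reaching the review document.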
Phase 3 — Write the review document
Write the document to `output_dir` (create the folder if needed) as:
`<output_dir>/context-review_<YYYYMMDD>_<HHMMSS>.md`
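A minimal sketch of building that path, assuming a hypothetical helper named `review_doc_path` and a timestamp taken from the local clock by the caller:

```python
from datetime import datetime
from pathlib import Path


def review_doc_path(output_dir: Path, now: datetime) -> Path:
    """Create output_dir if needed and return the timestamped review path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir / f"context-review_{now:%Y%m%d_%H%M%S}.md"
```

For example, `review_doc_path(Path("reports/context-reviews"), datetime.now())` returns a path inside the default output directory. The helper does not check that the directory is gitignored; that remains the caller's responsibility.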
Use this structure:
# Context Memory Review — <today's date>
**Context file reviewed:** <context_file>
**Reports reviewed:** <N> file(s) over <lookback_days>d (<earliest date> → <latest date>)
**Proposed changes:** <A> ADD · <M> MODIFY · <F> FLAG
**Confidence mix:** <High count> High · <Medium count> Medium · <Low count> Low
> ⚠️ PROPOSE-ONLY. No changes have been made to the context file. To apply, open an interactive
> session and say e.g. "apply items P1, P3, P7" — those edits will be made surgically with a
> validated-date stamp. Review each item's evidence before approving.
## Reports in this review window
| Date | File | Context applied during scan? |
|------|------|------------------------------|
| ... | ... | yes / no |
## Proposed changes
### P1 — [ADD · High] <short title>
- **Target section:** <heading/anchor>
- **Proposed text:**
> <literal text to add, in file style, with (validated <date>)>
- **Rationale:** ...
- **Evidence:** <report file(s) + date(s) + finding>; first-party drill-down.
- **Recurrence:** appeared on N of M report-days.
- **Apply instruction:** Insert under "<section>" after "<anchor line>".
### P2 — [MODIFY · Medium] ...
...
### P3 — [FLAG · Low] <contradiction or staleness> ...
- **Question for human:** ...
## Items considered but NOT proposed (feedback-loop guard)
Brief list of candidate signals that were only context-echoes (already in the file, no new first-party
evidence) and were therefore intentionally dropped — so the reviewer can confirm nothing was missed.
## Summary
One paragraph: the week's theme, the highest-value proposed addition, any contradiction needing
attention, and the count of staleness flags.
Phase 4 — Report to chat
End your response with a concise summary: context file + report window reviewed, counts of ADD/MODIFY/FLAG
by confidence, the single highest-value proposed change, any contradictions surfaced, the output document
path, and a reminder that nothing was applied and how to apply (interactive "apply items …").
Quality Checklist
Before finishing, verify:
Notes
- This skill is environment-agnostic. All tenant-specific values (which context file, which reports,
output location) are supplied by the invoking workflow or user — keep this file free of any
tenant-specific identifiers, hostnames, UPNs, or environment names.
- Apply is intentionally out of scope here. Keeping propose and apply as separate phases — with apply
driven by an explicit human instruction — is the safety boundary that prevents an unattended run from
silently rewriting the ground-truth the scans depend on.