best-practices-skills
Best practices for designing and structuring agent skills: SKILL.md frontmatter rules, triggers, progressive disclosure, and when to use scripts vs references.
| name | best-practices-skills |
| description | Best practices for designing and structuring agent skills: SKILL.md frontmatter rules, triggers, progressive disclosure, and when to use scripts vs references. |
| triggers | ["best practices skills","skill structure","skill design","skill frontmatter","skill template","skill checklist"] |
| metadata | {"short-description":"Skill structure and design patterns"} |
| provides | ["skill-validation","skill-scaffolding","composition-rules","misuse-guard-template","project-state-readiness-pattern"] |
| composes | ["task-monitor","monitor-misuse","memory"] |
| taxonomy | ["validation","compliance","composition","self-improvement"] |
Use this skill when creating or reviewing skills under .pi/skills/.
- /memory is the ONLY skill that accesses ArangoDB directly.
- /ops-arango handles admin ops (backups, indexes, migrations).
- monitor-memory has a read-only exception for health probes (documented).
memory/run.sh subcommands:
- memory recall: semantic + BM25 search
- memory learn: store lessons/data
- memory sample: random document sampling
- memory tag: post-insert tag stamping
- memory count: collection statistics
- memory archive-session: episodic archival
Direct-access patterns that belong only inside /memory:
- from arango import ArangoClient
- sys.path.insert(0, MEMORY_PATH)
- /_api/cursor calls
The root NVMe is for CODE ONLY. All heavy artifacts MUST live on the 12TB drive
and be symlinked back. This is enforced by /skills-broadcast and /ops-workstation.
Storage root: /mnt/storage12tb
| Category | Examples | Storage Path |
|---|---|---|
| Model weights | .safetensors, .gguf, .bin, .pt | /mnt/storage12tb/skills/<skill-name>/models/ |
| Training logs | RVC logs, checkpoints, tensorboard | /mnt/storage12tb/skills/<skill-name>/logs/ |
| Extracted data | extracted_runs/, PDF extractions | /mnt/storage12tb/skills/<skill-name>/extracted_runs/ |
| Generated outputs | batch results, GRPO outputs | /mnt/storage12tb/skills/<skill-name>/outputs/ |
| Datasets | training data, WAV files, corpora | /mnt/storage12tb/skills/<skill-name>/data/ |
| Work dirs | temp processing, intermediate files | /mnt/storage12tb/skills/<skill-name>/work/ |
| Backups | .backups/, snapshots | /mnt/storage12tb/backups/<project>/ |
/skills-broadcast exclusions: these directories are excluded from rsync and must not exist as real directories
in skill folders (only as symlinks to /mnt/storage12tb/):
.venv, node_modules, __pycache__, models, rvc, outputs, logs, data,
pods, extracted_runs, work, weights, checkpoints, artifacts, sessions,
papers, datasets, *.safetensors, *.gguf, *.bin, *.pt
# 1. Create the storage location on 12TB drive
mkdir -p /mnt/storage12tb/skills/<skill-name>/models
# 2. Move existing data (if any)
mv /path/to/skill/models/* /mnt/storage12tb/skills/<skill-name>/models/
# 3. Remove the directory and create symlink
rmdir /path/to/skill/models
ln -s /mnt/storage12tb/skills/<skill-name>/models /path/to/skill/models
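The policy above can also be checked before broadcasting. The following is a minimal sketch, not the actual /skills-broadcast check: the function name `find_real_heavy_dirs` and the `HEAVY_DIR_NAMES` subset are assumptions made for illustration, and the sketch ignores the >100MB size threshold and only inspects the top level of a skill folder.

```python
import tempfile
from pathlib import Path

# Hypothetical subset of the excluded names listed above; extend to match your policy.
HEAVY_DIR_NAMES = {"models", "outputs", "logs", "data", "work", "checkpoints"}


def find_real_heavy_dirs(skill_dir: Path) -> list[Path]:
    """Return heavy-artifact directories that are real directories, not symlinks."""
    violations = []
    for child in sorted(skill_dir.iterdir()):
        if child.name in HEAVY_DIR_NAMES and child.is_dir() and not child.is_symlink():
            violations.append(child)
    return violations


def demo() -> list[str]:
    """Build a throwaway skill folder with one compliant and one violating directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        storage = root / "storage" / "models"  # stands in for the 12TB drive
        storage.mkdir(parents=True)
        skill = root / "my-skill"
        skill.mkdir()
        (skill / "models").symlink_to(storage, target_is_directory=True)  # compliant
        (skill / "outputs").mkdir()  # violation: a real directory
        return [p.name for p in find_real_heavy_dirs(skill)]


if __name__ == "__main__":
    print(demo())
```

A symlinked `models` directory passes, while a real `outputs` directory is reported.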
- /skills-broadcast sanity FAILS if any skill has non-symlinked dirs >100MB
- /ops-workstation slim reports storage policy violations
- .gitignore in every skill should exclude heavy artifact patterns
- SKILL.md at the root.
- SKILL.md must start with YAML frontmatter (no code fences).
- Frontmatter delimiters: --- on line 1 and closing --- on its own line.
- name and description.
- description should contain explicit trigger contexts (what users will say).
- Keep SKILL.md concise; move large content into references/ or scripts/.
- provides: (composable primitives with human developer audiences). SKILL.md is for agents; README.md is for humans browsing the directory.
Skills are like chemical elements: they bind to each other through defined interfaces.
The provides: and composes: frontmatter fields declare a skill's valence shell:
what it offers to others and what it needs from others.
---
name: my-skill
description: >
  What this skill does and trigger phrases.
triggers:
  - natural language phrase users will say
  - another trigger phrase
provides:
  - capability-a   # What this skill outputs/offers
  - capability-b
composes:
  - memory         # Skills this delegates to (by name)
  - scillm
  - extractor
---
| Field | Type | Required | Description |
|---|---|---|---|
| triggers | list[str] | Yes | Natural-language phrases users will say. Parsed at runtime by skill-selector extension for BM25-style matching. Skills without triggers are invisible to implicit routing. |
| provides | list[str] | Yes | Capabilities this skill makes available. Used by /skill-lab gap detector. |
| composes | list[str] | Yes | Skills this skill delegates to via subprocess/import. Empty list [] if self-contained. Parsed at runtime by skill-selector extension for dependency expansion: when a skill is selected, its composes deps are automatically included in context. |
| taxonomy | list[str] | Recommended | Federated taxonomy bridge tags for multi-hop discovery via /memory. Uses standard vocabulary: precision, resilience, fragility, corruption, loyalty, stealth, plus domain tags. |
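To make the `triggers` row concrete, here is a rough Python illustration of routing through an inverted token index. It is a sketch under stated assumptions, not the skill-selector extension itself (which is a TypeScript file): scoring here is plain token overlap rather than BM25, and the `extractor` trigger phrase and the `no-triggers` skill are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(triggers_by_skill: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each trigger token to the skills whose triggers contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for skill, triggers in triggers_by_skill.items():
        for phrase in triggers:
            for token in tokenize(phrase):
                index[token].add(skill)
    return index


def select(prompt: str, index: dict[str, set[str]]) -> list[str]:
    """Rank skills by how many prompt tokens hit their triggers."""
    scores: dict[str, int] = defaultdict(int)
    for token in tokenize(prompt):
        for skill in index.get(token, ()):
            scores[skill] += 1
    return sorted(scores, key=lambda name: (-scores[name], name))


SKILLS = {
    "best-practices-skills": ["skill structure", "skill checklist"],
    "extractor": ["extract pdf"],  # hypothetical trigger phrase
    "no-triggers": [],             # hypothetical skill with no triggers
}
INDEX = build_index(SKILLS)

if __name__ == "__main__":
    print(select("review my skill checklist", INDEX))
```

A skill with an empty `triggers` list never enters the index, which is why it cannot be reached by implicit routing.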
The .pi/extensions/skill-selector.ts extension reads frontmatter at session start:
- triggers → Built into an inverted token index. When users type natural language
  (no /skill-name ref), the extension scores the prompt against triggers+descriptions
  to select relevant skills. Skills without triggers are invisible to implicit routing.
- composes → Parsed into a dependency map. When a skill is selected (explicitly or
  via trigger match), all its composes dependencies are automatically pulled into context.
  This replaced a hardcoded static map (Feb 2026); the extension now reads live frontmatter.
- provides → Used by /skill-lab for gap detection and capability graph traversal.
  Not yet consumed by skill-selector (future: reverse-index for "I need X capability" queries).
- composes:.
- provides:.
- composes: [].
- provide: [skill-creation].
| Capability | Skills that provide it |
|---|---|
| llm-completion | scillm, codex |
| embedding | embedding |
| memory-recall | memory |
| memory-learn | memory |
| web-search | brave-search, dogpile |
| pdf-extraction | extractor, review-pdf |
| security-scan | hack, security-scan |
| skill-creation | prompt-lab, gpt-lab, classifier-lab |
| skill-validation | skills-ci, best-practices-skills |
| competitive-selection | battle |
| hardening | anvil |
| docker-isolation | battle, hack |
| human-interview | interview |
| task-planning | plan |
| task-orchestration | orchestrate |
| taxonomy-tagging | taxonomy |
| progress-tracking | task-monitor |
New capabilities should be added to references/capability_vocabulary.yml.
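The capability table is effectively a reverse index from capability to providers. A minimal sketch of building that index from `provides` lists and spotting gaps might look like the following; it is not the /skill-lab implementation, the function names are made up for illustration, and `audio-transcription` is a hypothetical capability that no skill in the table provides.

```python
def build_capability_index(provides_by_skill: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert skill -> provides into capability -> sorted list of providing skills."""
    index: dict[str, list[str]] = {}
    for skill, capabilities in provides_by_skill.items():
        for capability in capabilities:
            index.setdefault(capability, []).append(skill)
    return {capability: sorted(skills) for capability, skills in index.items()}


def find_gaps(needed: set[str], index: dict[str, list[str]]) -> list[str]:
    """Return the needed capabilities that no registered skill provides."""
    return sorted(capability for capability in needed if capability not in index)


# A few rows taken from the capability table above.
PROVIDES = {
    "scillm": ["llm-completion"],
    "codex": ["llm-completion"],
    "memory": ["memory-recall", "memory-learn"],
    "extractor": ["pdf-extraction"],
    "review-pdf": ["pdf-extraction"],
}
CAPABILITIES = build_capability_index(PROVIDES)

if __name__ == "__main__":
    print(CAPABILITIES["llm-completion"])
    print(find_gaps({"pdf-extraction", "audio-transcription"}, CAPABILITIES))
```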
Skills SHOULD be registered in /memory as nodes in the knowledge graph.
This enables multi-hop traversal — when /skill-lab needs a capability,
it can traverse composes edges to find transitive dependencies, just like
/memory traverses relates_to edges for knowledge discovery.
skill:extractor ──composes──► skill:memory
                ──composes──► skill:scillm
                ──provides──► capability:pdf-extraction
skill:learn-datalake ──composes──► skill:extractor
                     ──composes──► skill:review-pdf
                     ──composes──► skill:memory
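The multi-hop traversal over `composes` edges can be sketched as a breadth-first walk. This is an in-memory illustration using the edges from the diagram above, not the /memory graph query itself; the function name `transitive_composes` is an assumption, and `review-pdf` is treated as having no dependencies because the diagram does not list any.

```python
from collections import deque


def transitive_composes(skill: str, composes: dict[str, list[str]]) -> list[str]:
    """Return every skill reachable over composes edges, in breadth-first order."""
    seen = {skill}
    order: list[str] = []
    queue = deque(composes.get(skill, []))
    while queue:
        dep = queue.popleft()
        if dep in seen:  # already visited; this also guards against cycles
            continue
        seen.add(dep)
        order.append(dep)
        queue.extend(composes.get(dep, []))
    return order


# Edges copied from the diagram above.
COMPOSES = {
    "extractor": ["memory", "scillm"],
    "learn-datalake": ["extractor", "review-pdf", "memory"],
}

if __name__ == "__main__":
    print(transitive_composes("learn-datalake", COMPOSES))
```

Selecting `learn-datalake` pulls in `scillm` even though it is only a dependency of `extractor`.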
This is analogous to chemical bonding — the graph reveals which elements
naturally form molecules. /taxonomy tags provide the bridge keywords
that enable cross-domain discovery (a security skill and an extraction skill
might share taxonomy:validation tags).
Registration pattern:
from common.memory_client import learn, MemoryScope
# Register skill as a knowledge node
learn(
    problem=f"What does {skill_name} provide?",
    solution=f"Provides: {', '.join(provides)}. Composes: {', '.join(composes)}",
    scope=MemoryScope.OPERATIONAL,
    tags=["skill_registry", skill_name] + provides,
)
See references/rules.yml for the complete machine-parseable rule set
that /skills-ci and /skill-lab validate against.
See references/composition_manifest.yml for the schema /skill-lab uses
when planning new composite skills.
Run ./sanity.sh in this skill to enforce the strict frontmatter gate across all skills.
Progressive disclosure
- Frontmatter (name, description) for routing.
- SKILL.md body for the workflow map.
- scripts/, references/, assets/ for details on demand.
Guardrails vs freedom
- SKILL.md + references/scripts.
Single source of truth
- references/.
- SKILL.md should point to references, not duplicate them.
Project state transparency
- NOT_TESTED, NOT_ESTABLISHED, NEEDS_ATTENTION, or BLOCKED when evidence is missing.
Skills that orchestrate other skills, external services, Docker stacks, or long-running agent workflows SHOULD provide a machine-readable project state report and a human HTML view. The goal is to make current project state inspectable at a glance and prevent agents from hallucinating readiness that was not proven.
| Concept | Requirement |
|---|---|
| Overall readiness | One of READY, USABLE_WITH_GAPS, NOT_READY, NOT_ESTABLISHED |
| Profile | Explicit profile such as smoke, core-live, or release |
| User attention | Missing config, credentials, review gates, or ambiguous decisions |
| Feature readiness | One row per user-facing feature, not only one row per command |
| Claim coverage | README/SKILL claims mapped to cases and evidence |
| Project knowledge | Skill-specific current-state document maintained by /project-knowledge |
| Execution status | Did the command run? |
| Assertion status | Did the expected checks pass? |
| Feature readiness | Is the user-facing feature usable? |
| Coverage gaps | Untested claims and skipped cases, separate from bugs |
| Artifact validation | Existence, schema, and status registration checks |
| Liveness | Event-tail/SSE/progress liveness, not only subprocess timeout |
<skill-artifacts>/readiness/<run-id>/
  report.json   # source of truth, schema-versioned
  index.html    # human view over report.json
  report.md     # optional text handoff
index.html is a view, not the source of truth. Other tools should consume
report.json.
{
  "schema": "skill.readiness_report.v1",
  "profile": "release",
  "overall_readiness": "not_ready",
  "release_readiness": "not_ready",
  "needs_attention": [
    {
      "reason": "missing_config",
      "safe_default": "do_not_claim_release_ready",
      "resume_hint": "./run.sh config init"
    }
  ],
  "features": [
    {
      "id": "argue",
      "readiness": "partial",
      "required_cases": 2,
      "passed_cases": 1,
      "coverage_gaps": ["No evidence-backed successful verdict case"]
    }
  ],
  "cases": [
    {
      "id": "argue-insufficient-evidence-fail-closed",
      "feature": "argue",
      "case_type": "negative-control",
      "execution_status": "pass",
      "assertion_status": "pass",
      "readiness_contribution": "safe_failure_only"
    }
  ]
}
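A consumer of report.json can refuse readiness claims that the evidence in the same file does not support. The checker below is a minimal sketch: the function name `readiness_problems` is hypothetical, the lowercase value `"ready"` is an assumption (the example above only shows `"not_ready"` and `"partial"`), and the rules are one plausible reading of the requirements, not a published schema validator.

```python
def readiness_problems(report: dict) -> list[str]:
    """Return reasons why a report's readiness claims are not backed by its evidence."""
    problems = []
    claims_ready = report.get("overall_readiness") == "ready"  # assumed spelling
    if claims_ready and report.get("needs_attention"):
        problems.append("overall_readiness is ready but needs_attention is not empty")
    for feature in report.get("features", []):
        fid = feature.get("id", "?")
        short = feature.get("passed_cases", 0) < feature.get("required_cases", 0)
        gaps = bool(feature.get("coverage_gaps"))
        if feature.get("readiness") == "ready" and (short or gaps):
            problems.append(f"feature {fid} claims ready without full evidence")
        if claims_ready and feature.get("readiness") != "ready":
            problems.append(f"overall_readiness is ready but feature {fid} is not")
    return problems


# Trimmed copy of the example report above.
HONEST = {
    "overall_readiness": "not_ready",
    "needs_attention": [{"reason": "missing_config"}],
    "features": [{
        "id": "argue",
        "readiness": "partial",
        "required_cases": 2,
        "passed_cases": 1,
        "coverage_gaps": ["No evidence-backed successful verdict case"],
    }],
}
# Same evidence, but with an unsupported top-level claim.
OPTIMISTIC = {**HONEST, "overall_readiness": "ready"}

if __name__ == "__main__":
    print(readiness_problems(HONEST))
    print(readiness_problems(OPTIMISTIC))
```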
| Profile | Purpose | Release implication |
|---|---|---|
| smoke | Fast local sanity checks | Never establishes release readiness |
| core-live | Main interactive live paths | Can show usable-with-gaps only |
| release | Full user-facing readiness | No skipped required checks allowed |
| feature:<name> | Focused debug profile | Establishes only that feature |
Reports may run smaller profiles, but the top banner must still state release
readiness honestly. For example: Release readiness: NOT_ESTABLISHED because SPARTA and deployment checks were not run.
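One way to keep the banner honest is to cap the release claim by profile. The sketch below encodes the profile table as a function. The name `release_readiness`, its parameters, and the choice to map failing checks to `NOT_READY` are assumptions for illustration; only the per-profile ceilings come from the table.

```python
def release_readiness(profile: str, all_passed: bool, skipped_required: list[str]) -> str:
    """Return the strongest release-readiness claim a run of this profile may make."""
    if profile == "smoke" or profile.startswith("feature:"):
        # Neither profile can establish release readiness, whatever passed.
        return "NOT_ESTABLISHED"
    if profile == "core-live":
        return "USABLE_WITH_GAPS" if all_passed else "NOT_READY"
    if profile == "release":
        if skipped_required:
            return "NOT_ESTABLISHED"  # no skipped required checks allowed
        return "READY" if all_passed else "NOT_READY"
    raise ValueError(f"Unknown profile: {profile!r}")


def banner(profile: str, all_passed: bool, skipped_required: list[str]) -> str:
    """Format the top banner, naming skipped checks when there are any."""
    status = release_readiness(profile, all_passed, skipped_required)
    if skipped_required:
        return f"Release readiness: {status} because {', '.join(skipped_required)} checks were not run."
    return f"Release readiness: {status}"


if __name__ == "__main__":
    print(banner("release", True, ["SPARTA", "deployment"]))
```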
Complex skills SHOULD include a config layer:
<skill>.config.yml.example # documented defaults, no secrets
<skill>.config.yml # local non-secret config, gitignored when appropriate
.env # secrets only
Required commands:
./run.sh config doctor --json # non-interactive, CI-safe
./run.sh config init # interactive; may call /interview
config doctor MUST NOT prompt. It returns needs_attention with a
safe_default and resume_hint when config is missing. config init MAY use
/interview to collect missing values from the human.
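A non-interactive `config doctor` could be sketched as below. This is an illustration only: the function name, the `ok` key, the `missing_secret:` reason prefix, and the choice to check required environment variables are assumptions. The `needs_attention` entry shape (`reason`, `safe_default`, `resume_hint`) follows the report example earlier in this document.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def config_doctor(config_path: Path, required_env: list[str],
                  env: Optional[Mapping[str, str]] = None) -> dict:
    """Check config without prompting; report gaps as needs_attention entries."""
    env = os.environ if env is None else env
    needs_attention = []
    if not config_path.exists():
        needs_attention.append({
            "reason": "missing_config",
            "safe_default": "do_not_claim_release_ready",
            "resume_hint": "./run.sh config init",
        })
    for name in required_env:
        if not env.get(name):
            needs_attention.append({
                "reason": f"missing_secret:{name}",  # hypothetical reason format
                "safe_default": "do_not_claim_release_ready",
                "resume_hint": "./run.sh config init",
            })
    return {"ok": not needs_attention, "needs_attention": needs_attention}


if __name__ == "__main__":
    result = config_doctor(Path("my-skill.config.yml"), ["MY_SKILL_API_KEY"])
    print(json.dumps(result, indent=2))
```

The function never reads stdin, so it is safe to call from CI; collecting the missing values is left to `config init`.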
Complex skills SHOULD maintain a skill-specific project knowledge document:
docs/PROJECT_KNOWLEDGE.md
This document is the curated current-state projection for the skill: recent architecture decisions, known gaps, active readiness blockers, companion skill assumptions, deployment notes, and validation evidence. It prevents agents from reconstructing project state from stale README text, stale memory snippets, or optimistic inference.
Required practices:
- /project-knowledge after durable readiness or architecture changes.
- docs/PROJECT_KNOWLEDGE.md.
- NOT_ESTABLISHED or NEEDS_ATTENTION.
If a skill claims to be usable by other developers, release readiness SHOULD include Docker deployment:
- docker-compose.yml or documented compose include.
- needs_attention.
- /mnt/storage12tb or developer-configured external volumes.
- config doctor gates over optimistic startup logs.
- READY because its command merely exited zero.
- found, observed, executed, not established.
- needs_attention, not inferred defaults.
- --- on standalone lines.
- name matches the directory name.
- description contains clear trigger phrases, uses YAML fold syntax (>), never inline.
- triggers list contains natural-language phrases users will say. Required: skills without triggers are invisible to implicit routing via skill-selector.
- provides list declares capabilities this skill outputs. Required.
- composes list declares all skills this delegates to. Required (use [] if self-contained). Parsed at runtime for automatic dependency inclusion.
- run.sh exists only if the skill needs execution.
- sanity.sh exists if the skill runs non-trivial scripts.
- sanity.sh for non-trivial skills MUST include behavioral acceptance gates,
  not only import/CLI smoke checks: at least one positive-control fixture, one
  negative-control/noise fixture, safety-boundary assertions for forbidden side
  effects, and concrete artifact/schema assertions for the skill's claimed
  outputs.
- memory, dogpile, ask, scillm, or surf MUST also provide an
  opt-in live E2E gate (sanity-live.sh, sanity-e2e.sh, sanity-webgpt.sh,
  or scripts/live_e2e.py). The live gate must call the real downstream skill
  entrypoints, persist machine-readable proof artifacts, and fail closed when a
  required downstream receipt is missing. A generated request file is not a live
  proof.
- typer. NEVER argparse or click.
- pyyaml (not a fallback regex parser). The > and | YAML block scalars, nested
  objects, and multi-line strings are only reliably parsed by a real YAML parser.
See PATTERNS.md for anti-patterns, runtime integration patterns, task-monitor, NDJSON streaming, self-correction, quality gates, memory integration, human-in-the-loop, and templates.
Bad (no triggers, so the skill is invisible to implicit routing):
---
name: my-skill
description: Does something useful
provides: [my-capability]
composes: [memory]
---
Good (triggers declared):
---
name: my-skill
description: Does something useful
triggers:
  - do the useful thing
  - run my-skill
provides: [my-capability]
composes: [memory]
---
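The two frontmatter examples above differ only in `triggers`, which is the kind of gap a strict frontmatter gate is meant to catch. The following is a minimal sketch of such a check, assuming the frontmatter has already been parsed into a dict by a real YAML parser. The function name and messages are hypothetical and this is not the ./sanity.sh implementation.

```python
def frontmatter_problems(meta: dict, dir_name: str) -> list[str]:
    """Return human-readable problems found in an already-parsed frontmatter dict."""
    problems = []
    for key in ("name", "description"):
        value = meta.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key}: must be a non-empty string")
    for key in ("triggers", "provides", "composes"):
        value = meta.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key}: must be a list of strings")
        elif key == "triggers" and not value:
            # composes may be [], but an empty triggers list hides the skill.
            problems.append("triggers: must not be empty")
    if isinstance(meta.get("name"), str) and meta["name"] != dir_name:
        problems.append(f"name: {meta['name']!r} does not match directory {dir_name!r}")
    return problems


BAD = {
    "name": "my-skill",
    "description": "Does something useful",
    "provides": ["my-capability"],
    "composes": ["memory"],
}
GOOD = {**BAD, "triggers": ["do the useful thing", "run my-skill"]}

if __name__ == "__main__":
    print(frontmatter_problems(BAD, "my-skill"))
    print(frontmatter_problems(GOOD, "my-skill"))
```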
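# BAD: direct ArangoDB access from a skill other than /memory
from arango import ArangoClient
client = ArangoClient(hosts="http://127.0.0.1:8529")
# GOOD: go through the /memory entrypoint
.pi/skills/memory/run.sh recall --q "query" --collections lessons
.pi/skills/memory/run.sh learn --problem "X" --solution "Y"
# BAD: re-implementing search that /memory already provides
def custom_bm25_search(query, docs):  # /memory already does this
    ...
# GOOD: declare the dependency instead
composes:
  - memory  # use /memory recall for search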
Project agents in 2026 do not thoroughly read SKILL.md files. Skills that expose HTTP endpoints or APIs MUST implement server-side misuse detection with helpful error messages. Documentation alone is insufficient — the API itself must teach correct usage.
| Tier | Type | Examples | Misuse Handling |
|---|---|---|---|
| 1 | One-shot script | /create-icon, /png-svg-converter | None — fail fast, error is obvious |
| 2 | CLI with params | /extractor, /embedding | Typer handles it; add --help examples |
| 3 | Daemon/socket | /memory, /inference | Basic input validation in handler |
| 4 | HTTP API | /scillm, /fetcher | Full misuse detection required |
| 5 | Multi-agent orchestrator | /orchestrate, /battle | Full detection + state machine guards |
Rule: If agents call it programmatically (not via slash command), it needs defensive handling.
For Tier 4-5 skills, use the reusable template: references/misuse_guard_template.py
When a caller misuses your API, don't return a generic error. Return an error that:
| Misuse Category | Detection | Response |
|---|---|---|
| Missing required params | Presence check | 400 + list required params + example |
| Wrong param types | Type check | 400 + expected type + example |
| Unknown params | Schema validation | Warning log (don't reject, be tolerant) |
| Batch overload | Queue depth/concurrency | 429 + chunk size recommendation |
| Timeout-prone input | Size/complexity check | Warning header or 413 + size limits |
| Provider mismatch | Feature/provider compatibility | 400 + correct provider/format hint |
| Service not running | (client-side preflight) | Document in SKILL.md troubleshooting |
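The first three rows of the table can be sketched without any web framework. This is an illustration under stated assumptions: `MisuseError` and `check_params` are hypothetical names, and the schema shape (`required`, `type`, `example`) mirrors the schema dict used in the integration steps later in this document.

```python
class MisuseError(Exception):
    """Error carrying an HTTP-style status and a message that teaches correct usage."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


SCHEMA = {
    "input": {"required": True, "type": str, "example": "your input here"},
    "format": {"required": False, "type": str, "example": "json"},
}


def check_params(body: dict, schema: dict) -> list[str]:
    """Raise on missing or mistyped params; return unknown params (tolerated)."""
    required = [key for key, spec in schema.items() if spec["required"]]
    missing = [key for key in required if key not in body]
    if missing:
        example = {key: schema[key]["example"] for key in required}
        raise MisuseError(
            400, f"Missing required params: {missing}. Required: {required}. Example: {example}"
        )
    unknown = []
    for key, value in body.items():
        spec = schema.get(key)
        if spec is None:
            unknown.append(key)  # log a warning in a real handler; do not reject
        elif not isinstance(value, spec["type"]):
            raise MisuseError(
                400,
                f"Param '{key}' must be {spec['type'].__name__}, "
                f"got {type(value).__name__}. Example: {spec['example']!r}",
            )
    return unknown


if __name__ == "__main__":
    try:
        check_params({"format": "json"}, SCHEMA)
    except MisuseError as err:
        print(err.status, err)
```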
/scillm implements comprehensive misuse detection — use it as a template:
# In your validation middleware or endpoint handler.
# Assumes a FastAPI-style HTTPException; has_inline_data() and get_queue_depth()
# are skill-specific helpers that are not shown here.
import logging
from fastapi import HTTPException
logger = logging.getLogger(__name__)
def validate_request(body: dict, model: str, provider: str) -> None:
    """Validate a request in place and raise helpful errors on misuse."""
    # 1. Auto-fix common mistakes (tolerant)
    if isinstance(body.get("messages"), str):
        logger.warning("Auto-wrapping string messages as list")
        body["messages"] = [{"role": "user", "content": body["messages"]}]
    # 2. Strip problematic params with warning (tolerant)
    if "max_tokens" in body:
        logger.warning("Stripping max_tokens: it causes empty output")
        del body["max_tokens"]
    # 3. Reject incompatible combinations (strict with guidance)
    if has_inline_data(body) and not model.startswith("gemini"):
        raise HTTPException(
            400,
            f"inlineData only works with Gemini. You used model='{model}'. "
            f"For other providers, use image_url format. See SKILL.md."
        )
    # 4. Detect resource exhaustion (early warning)
    queue_depth = get_queue_depth(provider)
    if queue_depth > 100:
        raise HTTPException(
            429,
            f"BATCH MISUSE: {queue_depth} requests queued. "
            f"Use chunked processing: process 4 at a time, wait, repeat. "
            f"Example: for chunk in chunks(items, 4): await gather(*chunk)"
        )
# BAD: generic error
raise HTTPException(400, "Bad Request")
# GOOD: says what was wrong and how to call the API
raise HTTPException(
    400,
    f"Unknown model '{model}'. Available: text, vlm, local-text. "
    f"Usage: POST /v1/chat/completions with model='text'. "
    f"This is an HTTP API. Call it via httpx, not import."
)
# BAD: silent failure
if not valid:
    return None  # Caller has no idea what went wrong
# GOOD: explain the expected format
if not valid:
    raise HTTPException(
        400,
        f"Invalid format. Expected: {expected_format}. "
        f"Got: {actual_format}. See SKILL.md examples."
    )
# BAD: 400 requests fire at once, most timeout after 5 minutes
results = await gather(*[call(x) for x in all_400_items])
# GOOD: reject overload early and point the caller to chunking
if queue_depth > THRESHOLD:
    raise HTTPException(429, f"Too many queued ({queue_depth}). Use chunking.")
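On the caller side, the chunking that these 429 messages recommend can be sketched with asyncio. The helper name `process_in_chunks` and the `fake_call` stand-in are assumptions for illustration; a real caller would pass its own coroutine that performs the HTTP request.

```python
import asyncio

CHUNK_SIZE = 4  # chunk size recommended by the error message above


async def process_in_chunks(items, call, chunk_size: int = CHUNK_SIZE) -> list:
    """Await call(item) for every item, running at most chunk_size at a time."""
    results = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        # Wait for the whole chunk before starting the next one.
        results.extend(await asyncio.gather(*(call(item) for item in chunk)))
    return results


async def _demo() -> tuple[list[int], int]:
    in_flight = 0
    peak = 0

    async def fake_call(x: int) -> int:
        """Stand-in for an HTTP request; tracks how many calls overlap."""
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0)
        in_flight -= 1
        return x * 2

    results = await process_in_chunks(list(range(10)), fake_call)
    return results, peak


def demo() -> tuple[list[int], int]:
    return asyncio.run(_demo())


if __name__ == "__main__":
    print(demo())
```

Results come back in input order, and the number of overlapping calls never exceeds the chunk size.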
# BAD: This list becomes stale as collections are added/removed
VALID_COLLECTIONS = {"sparta_qra", "lessons_v2", "personas", "checkpoints", ...}
def validate(collection):
    if collection not in VALID_COLLECTIONS:  # ← False negatives for new collections
        raise HTTPException(400, "Unknown collection")
# GOOD: Only catches specific mistakes agents make, never false negatives
COLLECTION_CORRECTIONS = {
    "sparta_qras": "sparta_qra",          # plural → singular
    "sparta_control": "sparta_controls",  # singular → plural (this one IS plural)
    "lesons": "lessons_v2",               # typo
}
def validate(collection):
    if collection in COLLECTION_CORRECTIONS:
        correct = COLLECTION_CORRECTIONS[collection]
        raise HTTPException(400, f"'{collection}' is wrong. Use '{correct}'.")
    # Unknown collections pass through; actual existence is checked by db.has_collection()
Do NOT create a shared misuse guard module imported across skills. Skills should be self-contained and portable. The correct pattern is copy and adapt from the template.
| Concern | Problem with shared script |
|---|---|
| Coupling | Skills should be self-contained and portable |
| Different projects | Skills may live in separate repos/directories |
| Import paths | sys.path hacks are fragile across containers/venvs |
| Skill-specific validators | Each skill has unique misuse patterns |
| Versioning | Change to shared breaks all skills simultaneously |
~/.pi/skills/best-practices-skills/references/
└── misuse_guard_template.py ← TEMPLATE (copy and adapt)
Each Tier 4-5 skill that needs it:
├── skill-a/src/.../app/_misuse_guard.py ← local copy, skill-specific validators
├── skill-b/src/.../proxy/validation.py ← inline validation (alternative)
└── skill-c/.../_misuse_guard.py ← local copy, different validators
The template at references/misuse_guard_template.py includes:
- MisuseGuard class: copy as-is, configure skill_name and thresholds
- require_non_empty(field, example): required param validation
- auto_wrap_string_as_list(field): tolerant type coercion
- reject_incompatible(condition, message): feature combination guards
- warn_and_strip(field, reason): strip problematic params
- detect_batch_abuse(get_queue_depth, threshold): queue overload detection
- correct_value(field, corrections): map misuse patterns → corrections
- /memory for nightly analysis
All misuse events are logged to the misuse_events collection in /memory. The
nightly /monitor-misuse job analyzes these events across all skills to:
Storage location: POST /store with collection: "misuse_events"
Two implementations depending on context:
| Context | Implementation | Why |
|---|---|---|
| Inside /memory service | Direct ArangoDB write | No HTTP round-trip, already have db connection |
| Other skills (template) | httpx POST /store to memory socket | Standard API access |
# Template version (for skills OUTSIDE /memory):
import hashlib
import time
def log_misuse_event(skill, endpoint, error_type, sent_value, correct_value=None):
    import httpx
    transport = httpx.HTTPTransport(uds="/run/user/1000/embry/memory.sock")
    with httpx.Client(transport=transport, base_url="http://localhost", timeout=2.0) as client:
        client.post("/store", json={
            "document": {
                # Deterministic key so repeated events dedupe to one document
                "_key": hashlib.sha256(f"{skill}:{endpoint}:{error_type}:{sent_value}".encode()).hexdigest()[:16],
                "skill": skill,
                "endpoint": endpoint,
                "error_type": error_type,
                "sent_value": sent_value,
                "correct_value": correct_value,
                "was_known": correct_value is not None,
                "ts": int(time.time()),
            },
            "collection": "misuse_events",
        })
# /memory version (direct ArangoDB, avoids HTTP):
def _log_misuse_event(endpoint, error_type, sent_value, correct_value=None):
    from ...arango_client import get_db
    db = get_db()
    coll = db.collection("misuse_events")
    # ... direct insert/update
Event schema:
| Field | Type | Description |
|---|---|---|
| _key | str | Hash of skill:endpoint:error_type:sent_value (dedupes) |
| skill | str | Skill name (memory, scillm, fetcher) |
| endpoint | str | Endpoint path (/store, /v1/chat/completions) |
| error_type | str | Category (wrong_collection, missing_param, batch_abuse) |
| sent_value | str | What the caller sent |
| correct_value | str? | What they should have sent (None if unknown) |
| was_known | bool | True if we had a correction pattern |
| ts | int | Unix timestamp |
| count | int | Occurrence count (incremented on duplicates) |
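The `_key` and `count` fields work together: the same misuse hashes to the same key, so a repeat increments a counter instead of creating a new document. The sketch below shows that behavior with a plain dict standing in for the misuse_events collection. `record_event` is a hypothetical helper, not the /memory implementation, while the key derivation follows the hashing used in the template code above.

```python
import hashlib
import time


def event_key(skill: str, endpoint: str, error_type: str, sent_value: str) -> str:
    """Derive the 16-hex-character dedupe key from the identifying fields."""
    raw = f"{skill}:{endpoint}:{error_type}:{sent_value}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]


def record_event(store: dict, skill: str, endpoint: str, error_type: str,
                 sent_value: str, correct_value=None) -> dict:
    """Insert a new event or increment count on a duplicate (in-memory stand-in)."""
    key = event_key(skill, endpoint, error_type, sent_value)
    doc = store.get(key)
    if doc is None:
        doc = {
            "_key": key,
            "skill": skill,
            "endpoint": endpoint,
            "error_type": error_type,
            "sent_value": sent_value,
            "correct_value": correct_value,
            "was_known": correct_value is not None,
            "count": 0,
        }
        store[key] = doc
    doc["count"] += 1
    doc["ts"] = int(time.time())  # timestamp of the latest occurrence
    return doc


if __name__ == "__main__":
    events: dict = {}
    record_event(events, "memory", "/store", "wrong_collection", "lesons", "lessons_v2")
    doc = record_event(events, "memory", "/store", "wrong_collection", "lesons", "lessons_v2")
    print(len(events), doc["count"])
```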
# 1. Copy template to your skill
cp ~/.pi/skills/best-practices-skills/references/misuse_guard_template.py \
    ./src/my_skill/_misuse_guard.py
# 2. Import and configure
from ._misuse_guard import MisuseGuard, require_non_empty, reject_incompatible
guard = MisuseGuard(skill_name="my-skill")
# 3. Add skill-specific validators
guard.add_validator("input", require_non_empty("input", "example text"))
guard.add_validator("no_conflicting_flags", reject_incompatible(
    lambda body: body.get("flag_a") and body.get("flag_b"),
    "flag_a and flag_b are mutually exclusive"
))
# 4. Define skill-specific schema
schema = {
    "input": {"required": True, "type": str, "example": "your input here"},
    "format": {"required": False, "type": str, "example": "json"},
}
# 5. Use in endpoint
@app.post("/v1/my-endpoint")
async def endpoint(body: dict):
    body = guard.validate(body, schema=schema)
    # ... rest of handler
When you discover a new misuse pattern in any skill:
- _misuse_guard.py
- best-practices-skills/references/misuse_guard_template.py
This is the copy-up pattern: local innovation, then standardize for future skills.
Misuse events are logged to the misuse_events collection. The /monitor-misuse
skill runs nightly to:
- _misuse_guard.py
All skills with misuse guards
        │
        │ log_misuse_event()
        ▼
misuse_events collection
        │
        │ nightly /monitor-misuse analyze
        ▼
misuse_corrections collection
        │
        │ human review + /monitor-misuse apply
        ▼
skill/_misuse_guard.py updated
To enable auto-apply for your skill, register it in /monitor-misuse/scripts/skill_registry.py:
SKILL_GUARDS = {
    "memory": SkillMisuseGuard(
        skill_name="memory",
        guard_path=Path("/path/to/skill/_misuse_guard.py"),
        corrections_var="COLLECTION_CORRECTIONS",  # or MODEL_CORRECTIONS, etc.
    ),
}
This closes the loop: agents make mistakes → events logged → patterns detected → corrections proposed → guards updated → future agents get helpful errors.
# DON'T DO THIS
import sys
sys.path.insert(0, "/home/user/.pi/skills/shared-utils/")
from shared_misuse_guard import MisuseGuard # Fragile, not portable
# Each skill has its own copy with skill-specific validators
from ._misuse_guard import MisuseGuard, require_non_empty
guard = MisuseGuard(skill_name="this-skill")
# Overkill for the current skill ecosystem
from pi_skill_utils import MisuseGuard # Adds dependency management overhead
# Template is self-contained, just copy it
cp ~/.pi/skills/best-practices-skills/references/misuse_guard_template.py \
./my_skill/_misuse_guard.py
Infrastructure skills that other projects call programmatically (scillm, memory, fetcher, etc.)
SHOULD provide an assess capability that audits external code for correct usage.
Each infrastructure skill documents:
Then exposes an assess command that checks external code against these patterns.
./run.sh assess <file>
# scillm assess: checks LLM API usage
./run.sh assess /path/to/script.py
# Output:
# ✅ HTTP API (not import)
# ✅ Model aliases used (text, vlm)
# ✅ No max_tokens
# ✅ Chunked batching (CHUNK_SIZE=4)
# ⚠️ Missing response_format on JSON call (line 89)
# ⚠️ No preflight check before batch
# memory assess — checks memory API usage
./run.sh assess /path/to/script.py
# Output:
# ✅ Uses Unix socket transport
# ✅ Reads data["items"] (not "results")
# ⚠️ subprocess.run in loop (line 78) — use httpx client
# ⚠️ /store without tags (line 134) — multi-hop won't find it
| Service | Misuse Patterns to Detect |
|---|---|
| scillm | Import instead of HTTP; fire-all-at-once batching; max_tokens; missing response_format; no preflight |
| memory | data["results"] instead of items; TCP instead of Unix socket; subprocess loops; raw AQL; /learn (deprecated) |
| fetcher | Missing timeout; no retry logic; blocking calls in async context |
| embedding | Wrong dimension (384 required); batch size too large; missing normalization |
assess subcommand in run.sh that greps/parses external code
# In run.sh, add: assess) python3 scripts/assess_usage.py "$2" ;;
# scripts/assess_usage.py
import re
import sys
import json
from pathlib import Path
PATTERNS = [
    {
        "name": "fire_all_at_once",
        "pattern": r"asyncio\.gather\(\*\[.*for.*in.*all_",
        "severity": "error",
        "message": "Fires all requests at once: use CHUNK_SIZE batching",
        "fix": "for i in range(0, len(items), CHUNK_SIZE): await gather(*chunk)"
    },
    {
        "name": "missing_response_format",
        # Heuristic: a span from one "model": key to the next one (or end of file)
        # that contains "messages" but never mentions "response_format".
        "pattern": (
            r'"model"\s*:(?:(?!"response_format"|"model"\s*:).)*?"messages"'
            r'(?:(?!"response_format"|"model"\s*:).)*(?="model"\s*:|\Z)'
        ),
        "severity": "warning",
        "message": "JSON expected but response_format not set",
        "fix": 'Add "response_format": {"type": "json_object"}'
    },
]
def assess(file_path: str) -> list[dict]:
    """Scan one file and return a list of usage issues with line numbers."""
    content = Path(file_path).read_text()
    issues = []
    for p in PATTERNS:
        for m in re.finditer(p["pattern"], content, re.MULTILINE | re.DOTALL):
            line = content[:m.start()].count("\n") + 1
            issues.append({
                "file": file_path,
                "line": line,
                "pattern": p["name"],
                "severity": p["severity"],
                "message": p["message"],
                "fix": p["fix"],
            })
    return issues
if __name__ == "__main__":
    issues = assess(sys.argv[1])
    print(json.dumps({"issues": issues, "passed": len(issues) == 0}, indent=2))
# CI workflow: assess all files that changed and use scillm
git diff --name-only main | xargs grep -l "localhost:4001\|SCILLM" | while read -r f; do
    ~/.pi/skills/scillm/run.sh assess "$f"
done
# Agent self-check before running batch
if ! ~/.pi/skills/scillm/run.sh assess scripts/generate_qras.py | jq -e '.passed'; then
    echo "Fix usage issues before running batch"
    exit 1
fi
| Skill | Status | Location |
|---|---|---|
| scillm | Implemented | ~/.pi/skills/scillm/scripts/assess_usage.py |
| memory | Implemented | ~/.pi/skills/memory/scripts/assess_usage.py |
| fetcher | Planned | — |
| embedding | Planned | — |