تشغيل أي مهارة في Manus بنقرة واحدة

$pwd:

context-engineering

Name: Context Engineering
Author: damionrashford

// Context engineering for building production LLM applications: context window management, degradation patterns, optimization strategies, memory system selection, multi-agent architecture, filesystem context patterns, and tool design principles. Use when building LLM apps, RAG pipelines, AI agents, multi-agent systems, or when designing memory, tool APIs, or context strategies for any language model application.

تشغيل في Manus

$ git log --oneline --stat

stars:٢

forks:٠

updated:٩ أبريل ٢٠٢٦ في ٠١:٢١

مستكشف الملفات

3 ملفات

SKILL.md

readonly

related-skills.json

نفس المستودع

claude-code.md

from "damionrashford/mlx"

Comprehensive Claude Code knowledge base — plugins, hooks, skills, agents, MCP, channels, headless mode, permissions, settings, and all extensibility features. Use when building, configuring, debugging, or extending Claude Code.

2026-04-092

skill-creator.md

from "damionrashford/mlx"

Use when building a skill, creating a SKILL.md, packaging a workflow, making a slash command, or asked "how do I make a skill". Scaffolds the folder, generates SKILL.md from a template, validates against spec. Produces a complete ready-to-deploy skill folder: scripts, references, assets. Also use to review or improve an existing skill.

2026-04-092

analyze.md

from "damionrashford/mlx"

Statistical analysis, hypothesis testing, A/B testing, cohort analysis, segmentation, trend detection, business metrics, pre-delivery validation, and data visualization. Use when the user asks to "analyze this data", "run a statistical test", "compare groups", "find trends", "do A/B test analysis", "segment customers", "calculate KPIs", "validate this analysis", "check my work", "sanity check", "review my numbers", "make a chart", "create a dashboard", "plot the data", "visualize results", or mentions hypothesis testing, cohort analysis, business analytics, data validation, bar charts, line charts, heatmaps, scatter plots, or data storytelling.

2026-04-092

autoexperiment.md

from "damionrashford/mlx"

Autonomous time-budget experiment loop. Modify a training script, train for a fixed wall-clock budget, evaluate, record, repeat. Inspired by karpathy/autoresearch. Use for overnight architecture search, systematic hyperparameter sweeps, or any iterative model improvement workflow.

2026-04-092

data-prep.md

from "damionrashford/mlx"

Explore, clean, and engineer datasets end-to-end: statistical profiling, distribution checks, missing value analysis, duplicate detection, outlier removal, type fixing, encoding, create features, encode categories, transform columns, add rolling windows, build interaction terms, and feature engineering. Supports pandas, polars, and PySpark. Use when the user wants to explore data, profile columns, understand a dataset, clean data, handle missing values, remove duplicates, fix data types, preprocess a dataset before modeling, create features, encode categories, transform columns, add rolling windows, build interaction terms, or do feature engineering.

2026-04-092

drift-detect.md

from "damionrashford/mlx"

Detect data drift, concept drift, and model performance degradation in production. Uses PSI, KS-test, and chi-squared for statistical drift, plus evidently and nannyml for automated reports. Use when monitoring a deployed model or comparing training vs production data distributions.

2026-04-092

package.json

"author": "damionrashford"

"repository": "damionrashford/mlx"

فتح مستودع GitHub عرض مستودعات المنشئ

$ install --global

$ download --local

تشغيل في Manus

$ useful --forSOC

مطوّرو البرمجياتمهن الحاسوب والرياضيات15-1252L4

name	context-engineering
description	Context engineering for building production LLM applications: context window management, degradation patterns, optimization strategies, memory system selection, multi-agent architecture, filesystem context patterns, and tool design principles. Use when building LLM apps, RAG pipelines, AI agents, multi-agent systems, or when designing memory, tool APIs, or context strategies for any language model application.
allowed-tools	Bash, Read, Write, Glob, Grep
argument-hint	describe the LLM system you are building (e.g. "RAG pipeline" or "multi-agent orchestrator")
user-invocable	false
disable-model-invocation	true
model	sonnet
effort	low
compatibility	>=1.0
metadata	{"category":"ai-engineering","tags":["llm","context-window","rag","memory","multi-agent","prompt-engineering","tool-design"],"phase":"build"}

Context Engineering for LLM Applications

Production reference for building reliable, efficient LLM-powered systems. Context engineering is the discipline of curating what enters the model's attention window — not prompt writing, but holistic management of system prompts, tool definitions, retrieved documents, message history, and tool outputs.

The Attention Budget

Context is a finite resource with diminishing returns. Every token added depletes an attention budget shared across all content. The engineering goal is the smallest high-signal token set that achieves the desired outcome.

Critical attention pattern — lost-in-the-middle: Information buried in the center of context receives 10-40% lower recall accuracy than content at the beginning or end. Always place critical instructions and key facts at the start or end of context.

Context ordering for KV-cache efficiency (stable → reusable → unique):

1. System prompt (never changes)
2. Tool definitions (rarely change)
3. Retrieved documents / examples (reused across requests)
4. Conversation history + current query (unique per request)

This ordering maximizes cache hits, dramatically reducing cost and latency.

5 Degradation Patterns

Pattern	What happens	Mitigation
Lost-in-middle	Center content gets less attention	Put critical info at edges
Context poisoning	Hallucination or error enters context, compounds	Validate tool outputs; restart with clean context
Context distraction	Irrelevant info competes for attention budget	Filter ruthlessly; exclude what's not needed now
Context confusion	Model can't tell which context applies to current task	Segment tasks; clear section boundaries
Context clash	Multiple correct sources contradict each other	Priority rules; version filtering

Trigger for compaction: at 70-80% context utilization. Don't wait for failure.

4 Optimization Strategies

1. Compaction

Summarize old context when approaching limits. Preserve: key decisions, file paths, error messages, current task. Never compress the system prompt. Use structured sections to prevent silent loss:

## Session Intent
## Files Modified  
## Decisions Made
## Current State
## Next Steps

Correct metric: tokens-per-task, not tokens-per-request. A compression saving 0.5% more tokens but causing 20% more re-fetching costs more overall.

2. Observation Masking

Tool outputs can be 80%+ of context in agent trajectories. Replace verbose outputs with compact references once their purpose is served:

if len(output) > 2000:
    ref_id = store(output)
    return f"[Result in scratch/{ref_id}.txt — key finding: {summary}]"

3. KV-Cache Optimization

Reorder context with stable elements first. Avoid dynamic content (timestamps, random IDs) in stable sections. Result: 70%+ cache hit rates on repeated requests.

4. Context Partitioning (Sub-Agents)

The most aggressive strategy: split work across isolated context windows. Each sub-agent operates with a clean, focused context — no accumulated history from other tasks. This is the primary reason to use multi-agent architectures.

Memory System Selection

Start simple. Add complexity only when retrieval quality fails:

Stage	Implementation	When to use
Prototype	Plain files + JSON	Always start here — Letta filesystem agents score 74% on LoCoMo, beating Mem0's 68.5%
Scale	Vector store / Mem0	When semantic search is needed or multi-tenant isolation required
Complex reasoning	Zep/Graphiti temporal KG	When relationships + time-travel queries matter (facts that change)
Full control	Letta, Cognee	When agents need to self-manage memory with deep introspection

Key insight: Tool complexity matters less than reliable retrieval. A filesystem agent with good naming conventions outperforms specialized memory tools for most use cases.

Memory layers:

Working: Context window scratchpad — place at attention-favored positions
Short-term: Session files, in-memory cache
Long-term: Key-value store → graph DB for cross-session knowledge
Temporal: Facts with validity periods (prevents context clash from stale data)

Anti-patterns: Stuffing everything in context; ignoring temporal validity; no consolidation strategy.

Multi-Agent Patterns

The 3 Architectures

Supervisor/Orchestrator — central agent delegates to specialists

User → Supervisor → [Researcher, Coder, Fact-checker] → Synthesis → Output

Use when: clear task decomposition, human oversight important. Weakness: supervisor context becomes bottleneck; "telephone game" fidelity loss. Fix: forward_message tool lets sub-agents respond directly to users, bypassing supervisor synthesis.

Peer-to-Peer/Swarm — agents hand off directly

def transfer_to_coder(): return coder_agent  # direct handoff

Use when: flexible exploration, emergent requirements. Benefit: slightly outperforms supervisor when direct response matters.

Hierarchical — strategy → planning → execution layers Use when: large-scale projects with clear organizational structure.

Context Isolation Rule

Sub-agents exist primarily to isolate context, not to simulate organizational roles. Each sub-agent operates in a clean window focused on its subtask.

Token Economics

Architecture	Token multiplier
Single agent chat	1×
Single agent with tools	~4×
Multi-agent system	~15×

Token budget accounts for 80% of performance variance. Upgrading the model provides larger gains than doubling token budget. Design for realistic budgets.

Failure Modes

Supervisor bottleneck: implement output schema constraints so workers return distilled summaries
Divergence: time-to-live limits on agent execution
Error propagation: validate outputs before passing between agents

Filesystem Context Patterns

The filesystem provides unlimited context via just-in-time loading. 6 patterns:

1. Scratch Pad (Tool Output Offloading)

Write large tool outputs to files instead of keeping in context:

# Web search returns 10k tokens → write to file, return 100-token reference
"[Results in scratch/search_001.txt. Key finding: rate limit is 1000 req/min]"
# Agent greps file when needing specific details

2. Plan Persistence

Write plans to structured YAML files. Re-read at each turn to prevent losing track:

# scratch/current_plan.yaml
objective: "Build RAG pipeline"
steps:
  - id: 1
    description: "Embed documents"
    status: completed
  - id: 2
    description: "Build retrieval layer"
    status: in_progress

3. Sub-Agent Communication via Files

Sub-agents write findings directly to files; coordinator reads without message-passing loss:

workspace/agents/researcher/findings.md
workspace/agents/coder/changes.md
workspace/coordinator/synthesis.md

4. Dynamic Skill Loading

Include only skill names in static context. Load full content when relevant:

Available skills (load with read_file when needed):
- rag-patterns: Chunking, embedding, retrieval strategies
- prompt-engineering: Few-shot, chain-of-thought, system prompt design

5. Terminal / Log Persistence

Persist terminal output to files; use grep for targeted extraction rather than loading entire logs.

6. Self-Modification (Use with caution)

Agents write learned preferences to their own instruction files, loaded next session.

Tool Design Principles

The Consolidation Principle

If a human engineer can't definitively say which tool to use in a given situation, an agent can't either. Prefer one comprehensive tool over multiple narrow tools that cover similar ground.

Description Engineering

Every tool description must answer four questions:

What does it do? (specific, not vague)
When to use it? (triggers and contexts)
What inputs does it accept? (types, formats, examples)
What does it return? (format, fields, error conditions)

Architectural Reduction

Sometimes fewer, more primitive tools outperform sophisticated tool collections. Direct filesystem/bash access + good documentation can replace elaborate specialized tools. Ask: does this tool enable new capabilities or constrain reasoning the model could handle?

Tool Collection Limits

10-20 tools is the practical ceiling for most applications. Beyond that, use namespacing to create logical groupings. Tool description overlap causes model confusion.

MCP Naming

Always use fully-qualified names: ServerName:tool_name. Without the server prefix, agents fail to locate tools when multiple MCP servers are available.

Rules

Treat context as a finite resource — exclude anything not needed for the current decision
Place critical instructions at the beginning or end of context (never buried in middle)
Trigger compaction at 70-80% context utilization
Start with filesystem memory; add vector stores or graphs only when retrieval quality demands it
Use sub-agents for context isolation, not for organizational metaphor
Optimize for tokens-per-task, not tokens-per-request
Build minimal tool architectures that benefit from model improvements

context-engineering

المزيد من هذا المستودع

المزيد من هذا المستودع

Context Engineering for LLM Applications

The Attention Budget

5 Degradation Patterns

4 Optimization Strategies

1. Compaction

2. Observation Masking

3. KV-Cache Optimization

4. Context Partitioning (Sub-Agents)

Memory System Selection

Multi-Agent Patterns

The 3 Architectures

Context Isolation Rule

Token Economics

Failure Modes

Filesystem Context Patterns

1. Scratch Pad (Tool Output Offloading)

2. Plan Persistence

3. Sub-Agent Communication via Files

4. Dynamic Skill Loading

5. Terminal / Log Persistence

6. Self-Modification (Use with caution)

Tool Design Principles

The Consolidation Principle

Description Engineering

Architectural Reduction

Tool Collection Limits

MCP Naming

Rules

Context Engineering for LLM Applications

The Attention Budget

5 Degradation Patterns

4 Optimization Strategies

1. Compaction

2. Observation Masking

3. KV-Cache Optimization

4. Context Partitioning (Sub-Agents)

Memory System Selection

Multi-Agent Patterns

The 3 Architectures

Context Isolation Rule

Token Economics

Failure Modes

Filesystem Context Patterns

1. Scratch Pad (Tool Output Offloading)

2. Plan Persistence

3. Sub-Agent Communication via Files

4. Dynamic Skill Loading

5. Terminal / Log Persistence

6. Self-Modification (Use with caution)

Tool Design Principles

The Consolidation Principle

Description Engineering

Architectural Reduction

Tool Collection Limits

MCP Naming

Rules