| id | 65-context-token-optimization-retrieval-playbook-for-ai |
| name | Retrieval Playbook for AI |
| description | Strategies for efficient context retrieval for AI - what to include, what to exclude, and how to prioritize |
| version | 1.0.0 |
| status | Active |
| owner | AI Engineering Team |
| last_updated | "2025-02-15T00:00:00.000Z" |
| category | AI-RAG |
| tags | ["retrieval","context","ai","token-optimization","prioritization"] |
| stack | ["Not Applicable"] |
| difficulty | Intermediate |
Retrieval Playbook for AI
Skill Profile
Overview
Playbook for deciding what context to retrieve for an AI model: select only what the task needs, exclude what is irrelevant, and prioritize what remains.
Why This Matters
- Relevance: Include only what the task needs
- Token efficiency: Don't spend tokens on information the task doesn't need
- Better results: The model can focus on the material that matters
- Faster: Less context generally means faster responses
Core Concepts & Rules
1. Decision Tree
Task received
  ↓
Is it code-related?
├─ Yes → Retrieve code files
└─ No → Skip code
  ↓
Need current state?
├─ Yes → Include recent changes
└─ No → Skip history
  ↓
Need examples?
├─ Yes → Include 1-2 examples
└─ No → Skip examples
  ↓
Assemble context
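The tree above can be sketched as a small planning function. This is a minimal illustration, not part of the playbook itself: `TaskProfile`, `ContextSource`, and `planRetrieval` are hypothetical names, and the three boolean flags are assumed to be answered by whoever classifies the task.

```typescript
// Hypothetical task descriptor; each flag mirrors one question in the tree.
interface TaskProfile {
  isCodeRelated: boolean;
  needsCurrentState: boolean;
  needsExamples: boolean;
}

type ContextSource = "code-files" | "recent-changes" | "examples";

// Walks the tree top to bottom and collects what should be retrieved.
function planRetrieval(task: TaskProfile): ContextSource[] {
  const plan: ContextSource[] = [];
  if (task.isCodeRelated) plan.push("code-files");
  if (task.needsCurrentState) plan.push("recent-changes");
  if (task.needsExamples) plan.push("examples"); // limit to 1-2 examples when assembling
  return plan;
}

const retrievalPlan = planRetrieval({
  isCodeRelated: true,
  needsCurrentState: false,
  needsExamples: true,
});
console.log(retrievalPlan); // ["code-files", "examples"]
```

Keeping the plan as a list of sources, rather than fetching inside each branch, leaves the actual retrieval step free to apply the rules that follow.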
2. Retrieval Rules
Rule 1: Only Direct Dependencies
Don't: Include entire codebase
Do: Include only files directly related
Example:
Task: Fix bug in auth.ts
Include:
- auth.ts (the file)
- types.ts (if auth.ts imports it)
- config.ts (if auth.ts uses it)
Don't include:
- unrelated files
- test files (unless debugging tests)
- documentation
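One way to find the direct dependencies of a file such as auth.ts is to read its relative imports. The sketch below assumes TypeScript/ES module syntax and uses a regular expression as a rough approximation; `directImports` is a hypothetical helper, and a real tool would more likely use the compiler's parser.

```typescript
// Returns the relative module specifiers imported by a source string.
// Package imports such as "express" are ignored: they are not project files.
function directImports(source: string): string[] {
  const pattern = /import\s+(?:[^'"]*?\s+from\s+)?['"](\.{1,2}\/[^'"]+)['"]/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    found.add(match[1]);
  }
  return Array.from(found);
}

// Illustrative contents for auth.ts; not taken from a real project.
const authSource = [
  'import { User } from "./types";',
  'import config from "./config";',
  'import express from "express";',
].join("\n");

console.log(directImports(authSource)); // ["./types", "./config"]
```

Only the first level of imports is followed here, which matches the rule: dependencies of dependencies are left out unless the task turns out to need them.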
Rule 2: Snippets Over Full Files
Don't: Include entire 500-line file
Do: Include 20-line relevant function
Example:
Task: Fix validateToken function
Include:
- validateToken function (lines 45-60)
- Related types (5 lines)
- Helper functions used (10 lines)
Total: ~35 lines vs 500 lines
Savings: 93%
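The arithmetic behind that figure is 1 - 35/500 = 0.93. A minimal sketch of both steps, cutting a line range out of a file and computing the saving, might look like this; `extractLines` and `savingsPercent` are hypothetical helper names, and line counts stand in for token counts.

```typescript
// Returns lines start..end (1-indexed, inclusive) of a file's text.
function extractLines(content: string, start: number, end: number): string {
  return content.split("\n").slice(start - 1, end).join("\n");
}

// Percentage of lines saved by sending a snippet instead of the full file.
function savingsPercent(snippetLines: number, fileLines: number): number {
  return Math.round((1 - snippetLines / fileLines) * 100);
}

console.log(extractLines("a\nb\nc\nd", 2, 3)); // prints "b" and "c" on separate lines
console.log(savingsPercent(35, 500)); // 93
```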
Rule 3: Recent Over Old
Do: Last 3 commits
Don't: Full git history
Do: Current implementation
Don't: Deprecated code
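As a rough sketch of the "last 3 commits" cut-off, assuming commit metadata has already been loaded into memory: the `Commit` shape, the `latestCommits` helper, and the sample hashes are all hypothetical.

```typescript
// Hypothetical commit record; a real one would come from the version control system.
interface Commit {
  hash: string;
  date: Date;
  message: string;
}

// Returns the newest commits first, capped at `limit` (3 by default).
function latestCommits(commits: Commit[], limit = 3): Commit[] {
  return [...commits]
    .sort((a, b) => b.date.getTime() - a.date.getTime())
    .slice(0, limit);
}

const sampleHistory: Commit[] = [
  { hash: "a1", date: new Date("2025-02-01"), message: "Add token refresh" },
  { hash: "b2", date: new Date("2025-02-10"), message: "Fix expiry check" },
  { hash: "c3", date: new Date("2025-02-05"), message: "Rename helper" },
  { hash: "d4", date: new Date("2025-01-20"), message: "Initial auth module" },
];

console.log(latestCommits(sampleHistory).map(c => c.hash)); // ["b2", "c3", "a1"]
```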
Rule 4: Examples Only When Needed
Include examples when:
- New pattern/concept
- Complex logic
- User explicitly asks
Skip examples when:
- Simple CRUD
- Standard patterns
- Self-explanatory
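Rule 4 reduces to a single predicate: include examples if any of the three "include" conditions holds. The sketch below assumes those conditions have already been judged for the task; `ExampleSignals` and `shouldIncludeExamples` are hypothetical names.

```typescript
// Hypothetical flags, one per "include examples when" condition.
interface ExampleSignals {
  newPattern: boolean;
  complexLogic: boolean;
  userRequested: boolean;
}

// Simple CRUD, standard patterns, and self-explanatory tasks set none of the flags.
function shouldIncludeExamples(signals: ExampleSignals): boolean {
  return signals.newPattern || signals.complexLogic || signals.userRequested;
}

console.log(shouldIncludeExamples({ newPattern: false, complexLogic: false, userRequested: true })); // true
```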
3. Context Prioritization
Priority 1: Critical (Always Include)
- File with bug/feature
- Error messages
- Relevant types/interfaces
- Direct dependencies
Priority 2: Important (Include if Space)
- Related functions
- Configuration
- Recent changes
- Test cases
Priority 3: Nice-to-Have (Usually Skip)
- Full documentation
- Examples
- Comments
- Historical context
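The three levels can be applied as a greedy fill against a token budget. This is a minimal sketch under stated assumptions: `ContextItem`, `selectByPriority`, and the token figures are hypothetical, and token counts are assumed to be estimated beforehand.

```typescript
type Priority = 1 | 2 | 3; // 1 = critical, 2 = important, 3 = nice-to-have

// Hypothetical unit of context with a pre-computed token estimate.
interface ContextItem {
  label: string;
  priority: Priority;
  tokens: number;
}

// Always keeps priority 1; adds priority 2, then 3, only while the budget allows.
function selectByPriority(items: ContextItem[], budget: number): ContextItem[] {
  const ordered = [...items].sort((a, b) => a.priority - b.priority);
  const selected: ContextItem[] = [];
  let used = 0;
  for (const item of ordered) {
    if (item.priority === 1 || used + item.tokens <= budget) {
      selected.push(item);
      used += item.tokens;
    }
  }
  return selected;
}

const candidates: ContextItem[] = [
  { label: "full documentation", priority: 3, tokens: 800 },
  { label: "auth.ts", priority: 1, tokens: 400 },
  { label: "test cases", priority: 2, tokens: 300 },
  { label: "error message", priority: 1, tokens: 100 },
];

console.log(selectByPriority(candidates, 1000).map(i => i.label));
// ["auth.ts", "error message", "test cases"]
```

Critical items are kept even if they alone exceed the budget, on the assumption that a caller would rather shrink them to snippets than drop them.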
Inputs / Outputs / Contracts
Inputs
- Task or problem description
- Code repository
- File system
- Documentation
- Error messages and stack traces
Outputs
- Optimized context for AI
- Relevant code snippets
- Prioritized information
- Token-efficient retrieval
- Quality maintained
Contracts
- Input Validation: All inputs must be valid task descriptions
- Output Format: Context follows retrieval playbook standards
- Token Budget: Retrieved context respects configured limits
- Relevance Guarantee: Only directly relevant information included
- Priority Enforcement: Critical information prioritized over nice-to-have
Skill Composition
Quick Start
Quick Checklist
Before retrieving, ask:
- Is this directly related to the task?
- Have I cut it down to a snippet where the full file isn't needed?
- Is this the most recent version?
- Will the AI actually use this?
- Am I under my token budget?
If any answer is "no", reconsider including it.
Assumptions
- AI models have token limits
- Context needs to be efficient
- Relevance is more important than completeness
- Token cost is a consideration
- Quality must be maintained
Compatibility
- AI Models: GPT-4, Claude, etc.
- Repositories: Git, SVN, etc.
- File Types: Code, docs, configs
- Languages: All programming languages
Test Scenario Matrix
| Scenario | Input | Expected Output | Verification |
|---|---|---|---|
| Bug fix | Error message | Relevant code snippet | Token count reduced |
| Feature dev | Feature spec | Similar features + types | Token count reduced |
| Code review | PR diff | Changed files only | Token count reduced |
| Refactoring | File to refactor | File + dependencies | Token count reduced |
Technical Guardrails
Retrieval Requirements
- All context MUST be directly relevant
- All context MUST use snippets over full files
- All context MUST be recent and accurate
- All context MUST follow priority levels
Filtering Requirements
- All irrelevant files MUST be excluded
- All outdated information MUST be excluded
- All redundant information MUST be excluded
- All unnecessary examples MUST be excluded
Quality Requirements
- All critical info MUST be included
- All context MUST be accurate
- All context MUST be complete for task
- All context MUST be well-organized
Security Threat Model
Threats Addressed
- Token waste: Efficient retrieval
- Information overload: Prioritization
- Context overflow: Size limits enforced
- Quality degradation: Structured retrieval
Mitigation Strategies
- Use decision tree for retrieval
- Implement filtering by relevance
- Follow priority levels
- Monitor token usage
- Validate context quality
Domain-Specific Modules
Context Retriever Module
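// Assumed shapes; the playbook does not define File or ContextPack.
export interface File {
  path: string;
  content: string;
  lines: number;
  lastModified: Date;
  priority: number; // higher means more important
}
export interface ContextPack {
  summary: string;
  code: string[];
  types: string[];
}
export interface ContextRetriever {
  retrieve(task: string, repository: string): Promise<ContextPack>;
}
export class SmartRetriever implements ContextRetriever {
  async retrieve(task: string, repository: string): Promise<ContextPack> {
    const files = await this.getFiles(repository);
    const relevant = this.filterByRelevance(files, task);
    const prioritized = this.prioritize(relevant);
    return this.assemble(prioritized);
  }
  // Placeholder: replace with a real repository listing.
  private async getFiles(repo: string): Promise<File[]> {
    return [];
  }
  // calculateRelevance is the function from the Relevance Scorer Module.
  private filterByRelevance(files: File[], task: string): File[] {
    return files.filter(f => calculateRelevance(f, task) > 0.7);
  }
  // Copy before sorting so the caller's array is not mutated.
  private prioritize(files: File[]): File[] {
    return [...files].sort((a, b) => b.priority - a.priority);
  }
  private assemble(files: File[]): ContextPack {
    return {
      summary: this.generateSummary(files),
      code: this.extractSnippets(files),
      types: this.extractTypes(files),
    };
  }
  // Placeholder helpers; real versions would parse each file and return snippets only.
  private generateSummary(files: File[]): string {
    return files.map(f => f.path).join(", ");
  }
  private extractSnippets(files: File[]): string[] {
    return files.map(f => f.content);
  }
  private extractTypes(files: File[]): string[] {
    return [];
  }
}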
Relevance Scorer Module
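// Scores a file from 0 to 1. The weights (0.5 path, 0.2 recency, 0.3 size) are illustrative.
// File is assumed to provide path (string), lastModified (Date), and lines (number).
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
export function calculateRelevance(file: File, task: string): number {
  let score = 0;
  // Match task keywords against the path; the full task text rarely appears in a path verbatim.
  const keywords = task.toLowerCase().split(/\W+/).filter(w => w.length > 2);
  if (keywords.some(w => file.path.toLowerCase().includes(w))) score += 0.5;
  if (file.lastModified.getTime() > Date.now() - SEVEN_DAYS_MS) score += 0.2;
  // Smaller files score higher; scaled so that size alone cannot pass a 0.7 threshold.
  score += 0.3 * Math.max(0, 1 - file.lines / 500);
  return Math.min(1, score);
}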
Token Budget Module
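export interface TokenBudget {
  maxTokens: number;
  usedTokens: number;
  remainingTokens: number; // negative when the context is over budget
}
// Rough estimate assuming about 4 characters per token; use the target model's tokenizer for exact counts.
// Assumes ContextPack has summary: string, code: string[], types: string[].
function estimateTokens(context: ContextPack): number {
  const text = [context.summary, ...context.code, ...context.types].join("\n");
  return Math.ceil(text.length / 4);
}
export function checkBudget(context: ContextPack, budget: number): TokenBudget {
  const tokens = estimateTokens(context);
  return {
    maxTokens: budget,
    usedTokens: tokens,
    remainingTokens: budget - tokens,
  };
}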
Release, Rollback & Ops Notes
Release Process
- Define retrieval strategies
- Implement filtering logic
- Create prioritization rules
- Test with sample tasks
- Deploy to production
- Monitor retrieval efficiency
- Adjust strategies as needed
Rollback Procedure
- Revert retrieval changes
- Restore previous retrieval logic
- Monitor quality impact
- Roll back if necessary
Operational Procedures
- Retrieval monitoring: Track token usage
- Quality checks: Verify AI comprehension
- Strategy updates: Improve based on feedback
- Budget management: Enforce token limits
Code Quality & Documentation
Retrieval Standards
- Use decision tree for selection
- Filter by relevance and recency
- Prioritize critical information
- Use snippets over full files
- Track token savings
Documentation Requirements
- Document retrieval strategies
- Provide examples for each rule
- Include prioritization guidelines
- Track token savings
- Document best practices
Agent Directives & Error Recovery
Agent Behavior Rules
- Always use decision tree for retrieval
- Always filter by relevance
- Always prioritize critical information
- Always use snippets over full files
- Always track token usage
Error Recovery Patterns
| Error Type | Detection | Recovery |
|---|---|---|
| Poor retrieval | AI can't complete task | Add more context |
| Token limit | Context too large | Remove nice-to-have items |
| Quality issue | AI response degraded | Add critical information |
Agent Prompt Pack
Retrieval Prompts
"Retrieve context for {task} that:
- Uses decision tree for selection
- Filters by relevance and recency
- Prioritizes critical information
- Uses snippets over full files
- Stays under token budget"
Filtering Prompts
"Filter files for {task} that:
- Includes only directly related files
- Uses snippets over full files
- Excludes outdated information
- Excludes redundant information
- Prioritizes by relevance"
Prioritization Prompts
"Prioritize context for {task} that:
- Puts critical info first
- Includes important info if space allows
- Excludes nice-to-have items
- Follows hierarchy levels
- Maintains quality"
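The prompts above use a `{task}` placeholder. A minimal sketch of filling it in, assuming simple `{name}` placeholders with no nesting or escaping; `fillPrompt` is a hypothetical helper, not part of any prompt library.

```typescript
// Replaces each {name} placeholder with its value; unknown placeholders are left as-is.
function fillPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : placeholder
  );
}

const retrievalPrompt = fillPrompt(
  "Retrieve context for {task} that:\n- Stays under token budget",
  { task: "fixing validateToken in auth.ts" }
);
console.log(retrievalPrompt);
```

Leaving unknown placeholders untouched makes a missing value visible in the final prompt instead of silently producing an empty string.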
Definition of Done
Retrieval is complete when:
Anti-patterns
- Kitchen sink approach: Include everything just in case
- No filtering: Retrieve all files that mention keyword
- Full file dumps: Including entire files when snippets suffice
- No prioritization: All information at same level
- Old information: Including deprecated or outdated code
- No examples: Missing examples when needed
- Too many examples: Including redundant examples
- No hierarchy: Critical and nice-to-have treated equally
Reference Links
Versioning & Changelog
v1.0.0 (2025-02-15)
- Initial release of Retrieval Playbook for AI skill
- Decision tree for retrieval
- Retrieval rules (4 rules)
- Context prioritization (3 levels)
- Retrieval strategies by task type
- Smart filtering (recency, relevance, size)
- Context assembly
- Metrics to track
- Anti-patterns
- Quick checklist