| name | osgrep-reference |
| description | Comprehensive CLI reference and search strategies for osgrep semantic code search. Use for detailed CLI options, index management commands, search strategy guidance (architectural vs targeted queries), and troubleshooting. Complements the osgrep plugin which handles daemon lifecycle. |
| version | 0.5.16 |
| last_updated | "2025-12-10T00:00:00.000Z" |
| allowed-tools | Bash(osgrep:*), Read |
osgrep: Semantic Code Search
ALWAYS prefer osgrep over grep/rg for code exploration. It finds concepts, not just strings.
Overview
osgrep is a natural-language semantic code search tool that finds code by concept rather than by keyword matching. Unlike grep, which matches literal strings, osgrep understands code semantics using local AI embeddings.
Version 0.5.16 (Dec 2025) highlights:
- skeleton command: Compress files to function/class signatures (~85% token reduction)
- trace command: Show what calls a symbol and what it calls (call graph)
- symbols command: List all indexed symbols with definitions
- doctor command: Health/integrity verification
- list command: Display all indexed repositories
- Per-project .osgrep/ directories (no longer global ~/.osgrep/data)
- V2 architecture with improved performance (~20% token savings, ~30% speedup)
- Go language support
- --reset flag for clean re-indexing
- ColBERT reranking for better result relevance
- Role detection: distinguishes orchestration logic from type definitions
- Split searching: separate "Code" and "Docs" indices
When to use osgrep:
- Exploring unfamiliar codebases ("where is the auth logic?")
- Finding conceptual patterns ("show me error handling")
- Locating cross-cutting concerns ("all database migrations")
- User explicitly asks to search code semantically
When to use traditional tools:
- Searching for exact strings or identifiers (use Grep)
- Finding files by name pattern (use Glob)
- Reading a file whose exact location you already know (use Read)
Quick Start
IMPORTANT: You must cd into the project directory before running osgrep commands.
osgrep uses per-project .osgrep/ indexes, so it only searches the repo you're currently in.
cd /path/to/project
osgrep "your query"
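When osgrep is driven from a script rather than an interactive shell, the same "cd first" rule can be honored by setting the working directory of the child process. The following is a minimal sketch, not part of osgrep itself: the helper names `build_search_argv` and `run_search` are hypothetical, and it assumes the `osgrep search "<query>" [scope]` form shown in this reference.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_search_argv(query: str, scope: Optional[str] = None) -> List[str]:
    """Build the argv for `osgrep search "<query>" [scope]` (hypothetical helper)."""
    argv = ["osgrep", "search", query]
    if scope:
        argv.append(scope)
    return argv


def run_search(project_dir: str, query: str, scope: Optional[str] = None) -> str:
    """Run the search with the project directory as the working directory."""
    root = Path(project_dir)
    if not root.is_dir():
        raise NotADirectoryError(project_dir)
    completed = subprocess.run(
        build_search_argv(query, scope),
        cwd=root,  # same effect as `cd /path/to/project` before the command
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the query as a single argv element avoids shell quoting problems with natural-language queries that contain spaces or question marks.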
Basic Search
osgrep "your semantic query"
osgrep search "your query" path/to/scope
osgrep skeleton src/file.py
osgrep trace functionName
osgrep symbols
Examples:
osgrep "user registration flow"
osgrep "webhook signature validation"
osgrep "database transaction handling"
osgrep "how are plugins loaded" packages/src
Output Format
Returns results in this format:
IMPLEMENTATION path/to/file:line
Score: 0.95
Preamble:
[code snippet or content preview]
...
- IMPLEMENTATION: Tag indicating the type of match
- Score: Relevance score (0-1, higher is better)
- ...: Truncation marker. The snippet is incomplete; use Read for full context
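If the results need to be post-processed, the layout above can be parsed with a few lines of Python. This is a sketch under assumptions: it takes the sample layout literally (an uppercase tag followed by `path:line`, then a `Score:` line, then the snippet), and the real output may contain extra decoration that this parser does not handle. The `Result` class and `parse_results` function are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumed layout, taken from the sample above: "<TAG> <path>:<line>"
HEADER = re.compile(r"^(?P<tag>[A-Z_]+) (?P<path>.+):(?P<line>\d+)$")
SCORE = re.compile(r"^Score: (?P<score>\d+(?:\.\d+)?)$")


@dataclass
class Result:
    tag: str
    path: str
    line: int
    score: float = 0.0
    snippet: List[str] = field(default_factory=list)

    @property
    def truncated(self) -> bool:
        # The last non-blank snippet line being "..." marks a cut-off snippet.
        lines = [s for s in self.snippet if s.strip()]
        return bool(lines) and lines[-1].strip() == "..."


def parse_results(text: str) -> List[Result]:
    results: List[Result] = []
    for raw in text.splitlines():
        line = raw.rstrip()
        header = HEADER.match(line)
        if header:
            results.append(Result(header["tag"], header["path"], int(header["line"])))
            continue
        if not results:
            continue  # ignore anything before the first header line
        score = SCORE.match(line)
        if score:
            results[-1].score = float(score["score"])
        elif line != "Preamble:":
            results[-1].snippet.append(line)
    return results
```

A result whose `truncated` property is true is a candidate for a follow-up Read of the full file.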
Search Strategy
For Architectural/System-Level Questions
Use for: auth, integrations, file watching, cross-cutting concerns
- Search broadly first to map the landscape:
  osgrep "authentication authorization checks"
- Survey the results - look for patterns across multiple files:
  - Are checks in middleware? Decorators? Multiple services?
  - Do file paths suggest different layers (gateway, handlers, utils)?
- Read strategically - pick 2-4 files that represent different aspects:
  - Read the main entry point
  - Read representative middleware/util files
  - Follow imports if architecture is unclear
- Refine with specific searches if one aspect is unclear:
  osgrep "session validation logic"
  osgrep "API authentication middleware"
For Targeted Implementation Details
Use for: specific function, algorithm, single feature
- Search specifically about the precise logic:
  osgrep "logic for merging user and default configuration"
- Evaluate the semantic match:
  - Does the snippet look relevant?
  - If it ends in ... or cuts off mid-logic, read the file
- One search, one read: Use osgrep to pinpoint the best file, then read it fully.
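The "one search, one read" rule can be sketched as picking the highest-scoring hit and loading that file in full. The snippet below is illustrative only: it assumes the results have already been reduced to `(path, score)` pairs, and `best_match` and `read_best` are hypothetical helper names.

```python
from pathlib import Path
from typing import Iterable, Tuple


def best_match(results: Iterable[Tuple[str, float]]) -> str:
    """Return the path with the highest relevance score.

    Raises ValueError if there are no results.
    """
    path, _score = max(results, key=lambda item: item[1])
    return path


def read_best(results: Iterable[Tuple[str, float]], root: str = ".") -> str:
    """Read the best-matching file in full, relative to the project root."""
    return (Path(root) / best_match(results)).read_text(encoding="utf-8")
```

Reading the whole file once is usually cheaper than issuing several follow-up searches to reassemble logic that a truncated snippet cut off.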
CLI Reference
Search Options
Control result count:
osgrep "validation logic" -m 20
osgrep "validation logic" --per-file 3
Output formats:
osgrep "API endpoints" --compact
osgrep "API endpoints" --content
osgrep "API endpoints" --scores
osgrep "API endpoints" --plain
Sync before search:
osgrep "validation logic" -s
osgrep "validation logic" -d
Index Management
osgrep index
osgrep index -r
osgrep index -p /path/to/repo
osgrep index -d
Advanced Commands (v0.5+)
Skeleton - Compress files to signatures:
osgrep skeleton src/server.py
osgrep skeleton src/server.py --no-summary
osgrep skeleton "auth logic" -l 5
Output shows: function signatures with # → calls | C:N | ORCH summaries inside bodies.
Trace - Show call graph:
osgrep trace handleRequest
Symbols - List all indexed symbols:
osgrep symbols
osgrep symbols "Request"
osgrep symbols -p src/api/ -l 50
Other Commands
osgrep list
osgrep doctor
osgrep setup
osgrep serve
osgrep serve -p 8080
osgrep serve -b
osgrep serve status
osgrep serve stop
osgrep serve stop --all
Serve endpoints:
- GET /health - Health check
- POST /search - Search with { query, limit, path, rerank }
- Lock file: .osgrep/server.json with port/pid
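A client can discover the running server from the lock file and then call the search endpoint. The sketch below rests on several assumptions that this reference does not confirm: that the lock file is JSON with a key named `port`, that the server listens on 127.0.0.1, and that `/search` accepts a JSON body. The response shape is not documented here, so the example only builds the request. The helper names are hypothetical.

```python
import json
import urllib.request
from pathlib import Path


def read_server_port(project_dir: str) -> int:
    """Read the port from .osgrep/server.json (the "port" key name is assumed)."""
    lock = Path(project_dir) / ".osgrep" / "server.json"
    info = json.loads(lock.read_text(encoding="utf-8"))
    return int(info["port"])


def build_search_request(port: int, query: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST /search request with a JSON body."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"http://127.0.0.1:{port}/search",  # host is an assumption
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass it to `urllib.request.urlopen(request, timeout=10)` and inspect the response body before relying on any particular field names.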
Claude Code Integration
osgrep install-claude-code
osgrep install-opencode
Both plugins automatically manage the background server lifecycle during sessions.
Common Search Patterns
Architecture Exploration
osgrep "mental processes that orchestrate conversation flow"
osgrep "subprocesses that learn about the user"
osgrep "cognitive steps using structured output"
osgrep "where do we fetch data in components?"
osgrep "custom hooks for API calls"
osgrep "protected route implementation"
osgrep "request validation middleware"
osgrep "authentication flow"
osgrep "rate limiting logic"
Business Logic
osgrep "payment processing"
osgrep "notification sending"
osgrep "user permission checks"
osgrep "order fulfillment workflow"
Cross-Cutting Concerns
osgrep "error handling patterns"
osgrep "logging configuration"
osgrep "database migrations"
osgrep "environment variable usage"
Tips for Effective Queries
Trust the Semantics
You don't need exact names. Conceptual queries work better than exact identifiers.
Effective:
osgrep "how does the server start"
osgrep "component state management"
Less effective:
osgrep "server.init"
osgrep "useState"
Be Specific
Too vague:
osgrep "code"
Specific:
osgrep "user registration validation logic"
Use Natural Language
osgrep "how do we handle payment failures?"
osgrep "what happens when a webhook arrives?"
osgrep "where is user input sanitized?"
Watch for Distributed Patterns
If results span 5+ files in different directories, the feature is likely architectural—survey before diving deep.
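This rule of thumb is easy to express as a check over the result paths. The sketch below is one possible reading of it: "5+ files in different directories" is interpreted as at least five distinct files spread over more than one parent directory. The function name `looks_architectural` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable


def looks_architectural(paths: Iterable[str], min_files: int = 5) -> bool:
    """Return True when hits span many files across more than one directory."""
    files = set(paths)  # count each file once, even with several hits
    directories = {str(PurePosixPath(p).parent) for p in files}
    return len(files) >= min_files and len(directories) > 1
```

When the check returns true, survey the spread of files first; when it returns false, a single targeted read is more likely to be enough.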
Don't Over-Rely on Snippets
For architectural questions, snippets are signposts, not answers. Read the key files.
Technical Details
- 100% Local: Uses transformers.js embeddings (no remote API calls)
- Auto-Isolated: Each repo gets its own index in a .osgrep/ directory (v0.5+)
- Adaptive Performance: Bounded concurrency keeps system responsive
- Index Location: .osgrep/ in project root (was ~/.osgrep/data/ in v0.4.x)
- Model Download: ~150MB on first run (run osgrep setup to pre-download)
- Chunking Strategy: Tree-sitter parses code into function/class boundaries
- Deduplication: Identical code blocks are deduplicated
- Dual Channels: Separate "Code" and "Docs" indices with ColBERT reranking
- Structural Boosting: Functions/classes prioritized over test files
- Skeleton Compression: ~85% token reduction when viewing file structure
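Because the index lives in a .osgrep/ directory at the project root, a script can check which project a given path belongs to by walking up the directory tree. This is a minimal sketch of that check, not a description of how osgrep itself resolves the index; `find_index_root` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def find_index_root(start: str) -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) containing .osgrep/."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".osgrep").is_dir():
            return candidate
    return None  # no index found on the way up to the filesystem root
```

A `None` result suggests the project has not been indexed yet or the script is running outside the intended repository.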
Troubleshooting
"Still Indexing..." message:
- Indexing is still in progress. Results will be partial until it completes.
- Alert the user and ask if they wish to proceed.
Slow first search:
- Expected: indexing takes 30-60s for medium repos
- Use osgrep setup to pre-download models
Index out of date:
- Run osgrep index to refresh
- Run osgrep index --reset for a complete re-index
- osgrep usually auto-detects changes
Installation issues:
osgrep doctor
npm install -g osgrep
No results found:
- Try broader queries ("authentication" vs "JWT middleware")
- Ensure index is up to date (osgrep index)
- Verify you're in the correct repository directory