// Semantic code search using osgrep for understanding codebases, finding implementations, and navigating large projects. Use when asked "where is", "how does", "find the code that", or any question about code location or implementation.
| name | codebase-navigator |
| description | Semantic code search using osgrep for understanding codebases, finding implementations, and navigating large projects. Use when asked "where is", "how does", "find the code that", or any question about code location or implementation. |
Semantic code search powered by osgrep - find code by meaning, not just keywords.
Invoke when user:
Before searching, check if index is stale (>4 hours):
# Find store for current repo
osgrep list
# Check age of relevant store (macOS)
STORE=~/.osgrep/data/YOUR-STORE.lance
STORE_AGE=$(( $(date +%s) - $(stat -f %m "$STORE") ))
# If older than 4 hours (14400 seconds), refresh
if [ $STORE_AGE -gt 14400 ]; then
echo "Index is $(( STORE_AGE / 3600 )) hours old - refreshing..."
osgrep index
fi
Quick version: If unsure, just use --sync:
osgrep search "query" --sync # Always safe, updates before searching
If no store exists for current repo:
osgrep list # See available stores
osgrep doctor # Verify setup is healthy
osgrep index # Index current directory (takes ~30s-2min)
Basic search:
osgrep search "natural language description of what you're looking for"
Tuned search:
osgrep search "query" --max-count 10 # Limit total results
osgrep search "query" --per-file 3 # Multiple matches per file
osgrep search "query" --content # Show full chunk content
osgrep search "query" --compact # File paths only
osgrep search "query" --scores # Show relevance scores
osgrep search "query" --json # Machine-readable output
DO NOT dump raw osgrep output. Instead:
Semantic queries work best. Transform user questions:
| User asks | osgrep query |
|---|---|
| "Where's the auth?" | "authentication logic and user login" |
| "How do we handle errors?" | "error handling and exception management" |
| "Find the API endpoints" | "HTTP routes and API endpoint definitions" |
| "Database queries" | "database queries and SQL execution" |
| "Config loading" | "configuration loading and environment variables" |
Tips for better queries:
Shows snippet preview with line numbers:
📂 src/auth/login.ts
1 │ export async function login(username: string, password: string) {
2 │ const user = await findUser(username);
--content)Shows full chunk content for deeper context.
--compact)File paths only - useful for getting quick overview:
📂 src/auth/login.ts
📂 src/auth/session.ts
📂 src/middleware/auth.ts
--json)Machine-readable for programmatic use.
--scores)Shows relevance scores (0-1) - useful for understanding match quality.
osgrep indexes can become stale. Refresh regularly, especially after:
osgrep search "query" --sync # Update index then search
osgrep index # Full re-index if --sync isn't enough
Symptom of stale index: Known files not appearing in results, or deleted files still showing up.
For large codebases with frequent changes:
osgrep serve # Runs on port 4444
osgrep serve --port 8080 # Custom port
Work with specific indexed stores:
osgrep --store myproject.lance search "query"
When first search returns too many/wrong results:
Step 1: Check result quality
osgrep search "query" --scores # Low scores (<0.15) = poor matches
Step 2: Narrow with domain terms
❌ "packaging workflow" → finds ArtifactsBuilder, MCPBuilder
✅ "skill packaging automation" → finds SkillPackager
Step 3: Add specificity
❌ "validation" → too broad (25+ files)
✅ "YAML frontmatter validation for skills" → targeted
Step 4: Try synonyms if nothing found
❌ "auth" → too terse
✅ "authentication login session user credentials" → covers variations
| Use osgrep when... | Use grep/rg when... |
|---|---|
| Searching by concept | Searching for exact strings |
| "Where is auth handled?" | "Find TODO:" |
| "How does caching work?" | "Find sha256" |
| Unknown function names | Known function names |
| Architecture questions | Error message lookup |
| Understanding code purpose | Finding specific identifiers |
Rule of thumb: If you could type the exact string, use grep. If you're describing what code does, use osgrep.
osgrep finds code that mentions Python, not just .py files:
# Find Python data processing
osgrep search "python data processing" --compact # May include .md files
# Then filter:
# Use Glob tool with pattern "**/*.py" for actual scripts
# Step 1: Find relevant area
osgrep search "checksum verification" # May miss literal "sha256"
# Step 2: grep for specific term
grep -r "sha256" --include="*.sh" # Finds exact matches
# Step 1: Find files
osgrep search "error handling middleware" --compact
# Step 2: Read to understand
# Use Read tool on top results
DON'T:
DO:
osgrep list if searches return nothing--content when you need more context--scores to assess match qualityUser: "Where do we calculate shipping costs?"
Process:
osgrep search "shipping cost calculation and pricing logic"
Results show: src/orders/shipping.ts, src/utils/pricing.ts
Response:
"Shipping costs are calculated in src/orders/shipping.ts:45-67, which uses the calculateShipping() function. This calls pricing utilities from src/utils/pricing.ts for rate lookups. The calculation considers weight, distance, and shipping method."
For query patterns and examples:
references/query-patterns.md - Common query formulationsreferences/troubleshooting.md - Common issues and fixes