| name | mcp-semantic-search |
| description | Intent-based code discovery for CLI AI agents using semantic search MCP tools. Use when finding code by what it does (not what it's called), exploring unfamiliar areas, or understanding feature implementations. Mandatory for code discovery tasks when you have MCP access. |
| allowed-tools | ["Grep","Read","Glob"] |
| version | 1.0.0 |
Semantic code search for CLI AI agents that enables AI-powered codebase exploration using natural language queries instead of keyword searches. Available exclusively for CLI AI agents with MCP (Model Context Protocol) support.
This file (SKILL.md): Core workflow and essential guidance
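Reference Files (detailed documentation): references/tool_comparison.md, references/architecture.md, references/query_patterns.md
Assets (examples): assets/query_examples.md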
Use this skill when:
Exploring unfamiliar code
Finding by behavior/intent
Understanding patterns
Discovering cross-file relationships
Code discovery tasks for CLI AI agents
Use different tools instead:
Known exact file paths → Use Read tool
❌ search_codebase("Find hero_video.js content")
✅ Read("src/hero/hero_video.js")
Specific symbol searches → Use Grep tool
❌ search_codebase("Find all calls to initVideoPlayer")
✅ Grep("initVideoPlayer", output_mode="content")
Simple keyword searches → Use Grep tool
❌ search_codebase("Find all TODO comments")
✅ Grep("TODO:", output_mode="content")
File structure exploration → Use Glob tool
❌ search_codebase("Show me all JavaScript files")
✅ Glob("**/*.js")
IDE integrations → NOT SUPPORTED
Activate this skill when user asks:
Do NOT activate for:
| Document | Purpose | Key Insight |
|---|---|---|
| MCP Semantic Search - Intent-Based Code Discovery | Enable CLI AI agents to search codebases by intent using natural language queries | Finds code by what it does, not what it's called |
| Document | Purpose | Key Insight |
|---|---|---|
| references/tool_comparison.md | Decision framework for semantic search vs grep vs glob | When to use each tool based on knowledge and intent |
| references/architecture.md | System architecture and data flow | Two components (Indexer and MCP Server) around a shared Vector DB |
| references/query_patterns.md | Effective query writing guide | Describe behavior in natural language for best results |
| assets/query_examples.md | Categorized example queries | 9 categories of real-world query patterns |
User Request
  ↓
[Detect Intent]
  ↓
Know exact file path? ── YES ──→ [Use Read tool] ──→ DONE
  ↓
  NO
  ↓
Know what code does? ── YES ──→ [Use search_codebase] ──→ Parse Results ──→ Return ranked code snippets
  ↓
  NO
  ↓
Know exact symbol? ── YES ──→ [Use Grep tool] ──→ DONE
  ↓
  NO
  ↓
Exploring file structure? ── YES ──→ [Use Glob tool] ──→ DONE
  ↓
  NO
  ↓
[Default: Use search_codebase]
  ↓
COMPLETE
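The decision flow above can be sketched as a small function. This is only an illustration: the `Request` class, its field names, and `choose_tool` are hypothetical and not part of the skill or any MCP API. The checks run in the same order as the flow, so a known file path wins over everything else.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical model of what the agent already knows about its target."""
    knows_file_path: bool = False
    knows_behavior: bool = False
    knows_exact_symbol: bool = False
    exploring_structure: bool = False


def choose_tool(req: Request) -> str:
    """Walk the decision flow top to bottom and return the tool name."""
    if req.knows_file_path:
        return "Read"
    if req.knows_behavior:
        return "search_codebase"
    if req.knows_exact_symbol:
        return "Grep"
    if req.exploring_structure:
        return "Glob"
    # Nothing matched: the flow falls back to semantic search.
    return "search_codebase"
```

For example, `choose_tool(Request(knows_exact_symbol=True))` selects `Grep`, while an empty `Request()` falls through to `search_codebase`.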
Three semantic search MCP tools available:
search_codebase - Search current project semantically
search_commits - Search git commit history
search_other_workspace - Search other indexed projects
Query structure - describe what code does:
// Good: Natural language, behavior-focused
search_codebase("Find code that validates email addresses in contact forms")
// Good: Question format
search_codebase("How do we handle page transitions?")
// Good: Feature discovery
search_codebase("Find cookie consent implementation")
// Bad: Grep syntax
search_codebase("grep validateEmail") // Use the Grep tool instead
// Bad: Known file path
search_codebase("Show me hero_video.js") // Use the Read tool instead
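One way to catch the two bad patterns above before a query is sent is a small pre-check on the query text. The sketch below is a rough heuristic under stated assumptions: `suggest_tool`, the regular expressions, and the list of file extensions are all made up for illustration and will produce false positives (for example, a query that mentions "Node.js").

```python
import re

# Hypothetical heuristics; adjust the patterns and extensions to your project.
_GREP_SYNTAX = re.compile(r"^\s*(?:grep|rg)\s")
_FIND_SYNTAX = re.compile(r"^\s*find\s+[./-]")
_FILE_NAME = re.compile(r"\b[\w./-]+\.(?:js|ts|css|html|py|md)\b", re.IGNORECASE)


def suggest_tool(query: str) -> str:
    """Return the tool a query text appears to be meant for."""
    if _GREP_SYNTAX.search(query):
        return "Grep"  # query is written as a grep command
    if _FIND_SYNTAX.search(query):
        return "Glob"  # query is written as a find command
    if _FILE_NAME.search(query):
        return "Read"  # query names a concrete file
    return "search_codebase"  # natural language, behavior-focused
```

The patterns are case-sensitive for `grep` and `find` on purpose, so a natural-language query that starts with "Find code that..." is still routed to semantic search.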
Goal: Find email validation logic
// Step 1: Use semantic search
search_codebase("Find code that validates email addresses in contact forms")
// Expected results:
// - src/form/form_validation.js (ranked #1)
// - src/utils/email_validator.js (ranked #2)
// - Code snippets with validation logic
// Step 2: Read full context
Read("src/form/form_validation.js")
// Step 3: Analyze and make changes
Edit(...) or Write(...)
Why it works: The query describes both the behavior (validates email) and the context (contact forms), which gives semantic search enough signal to find the relevant code.
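The search-then-read steps in this example can be expressed as one helper. This is a sketch, not the skill's implementation: `discover_then_read` is a hypothetical name, the `search` and `read` callables stand in for the real tools, and the result shape (a ranked list of dicts with a `"path"` key) is an assumption, since the actual `search_codebase` output format is not specified here.

```python
from typing import Callable, Dict, List


def discover_then_read(
    query: str,
    search: Callable[[str], List[Dict]],
    read: Callable[[str], str],
    top_n: int = 2,
) -> Dict[str, str]:
    """Run a semantic query, then read the top-ranked files in full."""
    hits = search(query)  # assumed to be ordered best match first
    # Keep rank order while dropping repeated paths
    # (several snippets may come from the same file).
    paths: List[str] = []
    for hit in hits:
        path = hit["path"]
        if path not in paths:
            paths.append(path)
    return {path: read(path) for path in paths[:top_n]}
```

Passing the tools in as callables keeps the helper testable with fakes and avoids assuming how a particular agent exposes its MCP tools.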
Goal: Find what code depends on video player
// Use relationship query
search_codebase("What code depends on the video player?")
// Expected results:
// - src/components/hero_section.js (uses video player)
// - src/animations/hero_animations.js (triggers on video events)
// - Code snippets showing imports and usage
// Follow up: Read specific files
Read("src/components/hero_section.js")
Why it works: Semantic search matches on meaning rather than exact names, so it can surface related code across files, such as modules that use the video player or react to its events.
Do:
Don't:
For more query patterns, see: query_patterns.md
Results are reranked for relevance:
ALWAYS use for intent-based discovery
ALWAYS use natural language
ALWAYS provide context in queries
ALWAYS combine with Read tool
ALWAYS check for MCP availability
NEVER use for known file paths
NEVER use for exact symbol searches
NEVER use grep/find syntax
NEVER skip validation of MCP access
NEVER use for file structure exploration
ESCALATE IF MCP server unavailable
ESCALATE IF results consistently irrelevant
ESCALATE IF uncertain about tool selection
ESCALATE IF IDE integration requested
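The availability and escalation rules above can be checked before any query is issued. The sketch below assumes the agent can obtain the set of tool names it has access to; how that set is obtained is agent-specific and not covered here. `plan_discovery` and its return strings are hypothetical.

```python
REQUIRED_TOOL = "search_codebase"


def plan_discovery(available_tools: set, ide_integration: bool = False) -> str:
    """Return 'proceed: ...' or 'escalate: ...' following the rules above."""
    if ide_integration:
        # IDE integrations are listed as not supported.
        return "escalate: IDE integrations are not supported"
    if REQUIRED_TOOL not in available_tools:
        # Never skip validation of MCP access.
        return "escalate: MCP semantic search server unavailable"
    return "proceed: use search_codebase, then Read the top results"
```

For instance, `plan_discovery({"Read", "Grep", "Glob"})` escalates because `search_codebase` is missing from the available tools.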
Task complete when:
Required MCP tools:
search_codebase - Semantic code search
search_commits - Semantic commit history search
search_other_workspace - Cross-project search
MCP server: semantic-search (Python)
Availability: CLI AI agents only (Claude Code AI, GitHub Copilot CLI, Opencode, Kilo CLI)
NOT available: IDE integrations (GitHub Copilot in VS Code/IDEs)
Read tool:
Grep tool:
Glob tool:
mcp-code-mode:
Indexer: codebase-index-cli (Node.js)
Vector Database: SQLite (.codebase/vectors.db)
Voyage AI API:
Must be indexed first:
Run codesql index to create .codebase/vectors.db
See specs/025-semantic-search-setup/README.md for setup
Current anobel.com index:
✅ Works with (CLI AI agents):
❌ Does NOT work with (IDE integrations):
Reason: Different systems - semantic search is for CLI AI agents helping you via chat, not autocomplete while typing.
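Because the project must be indexed before any search works, an agent can check for the index file first. The sketch below only tests that `.codebase/vectors.db` exists and is non-empty; it does not verify that the index is current or valid. The function name `index_exists` is hypothetical.

```python
from pathlib import Path


def index_exists(project_root: str) -> bool:
    """True if the vector database file is present and non-empty."""
    db = Path(project_root) / ".codebase" / "vectors.db"
    return db.is_file() and db.stat().st_size > 0
```

If this returns `False`, the index has not been created yet and semantic search results should not be expected.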
// Feature discovery
search_codebase("Find cookie consent implementation")
// Behavior search
search_codebase("Find code that validates email addresses")
// Relationship discovery
search_codebase("What code depends on Motion.dev library?")
// Problem solving
search_codebase("How do we prevent duplicate form submissions?")
// Commit history
search_commits("Find commits related to contact form")
Ask yourself:
Do I know the exact file path? → Read(path)
Do I know what the code does? → search_codebase("what it does")
Am I searching for exact text/symbol? → Grep("symbol")
Am I exploring file structure? → Glob("**/*.js")
None of the above? → search_codebase() (default)
| Scenario | Use This Tool | Example |
|---|---|---|
| Know file path | Read() | Read("src/hero/hero_video.js") |
| Find by behavior | search_codebase() | search_codebase("Find video playback code") |
| Find function calls | Grep() | Grep("initVideoPlayer", output_mode="content") |
| Find all files of type | Glob() | Glob("**/*.js") |
| Understand feature | search_codebase() | search_codebase("How do we handle forms?") |
Setup: specs/025-semantic-search-setup/README.md
Two-component system:
Data flow:
For detailed architecture, see: architecture.md
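To give a feel for how a vector index can rank code chunks against a query, here is a minimal sketch of similarity-based retrieval. It is an illustration under assumptions, not the system's actual pipeline: the similarity metric is not stated in this document (cosine similarity is assumed), the vectors are tiny toy values rather than real embeddings, and the reranking stage is omitted. `cosine` and `top_k` are hypothetical helpers.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: List[float], chunks: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Score every chunk against the query and keep the k best."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in chunks.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

In a real deployment the query and chunk vectors would come from the embedding model, and the candidates returned here would then be passed to a reranker.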
| Metric | Value | Notes |
|---|---|---|
| Search latency | ~200-400ms | End-to-end including reranking |
| Indexed files | 249 files | anobel.com project |
| Code blocks | 496 chunks | Semantic units |
| Vector dimensions | 1024 | Voyage AI embeddings |
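The figures in the table allow a rough estimate of the raw vector payload. The calculation below assumes each vector component is stored as a 4-byte float32, which this document does not confirm; SQLite overhead, metadata, and the stored code text are not included.

```python
CHUNKS = 496             # code blocks, from the table above
DIMENSIONS = 1024        # vector dimensions, from the table above
BYTES_PER_FLOAT32 = 4    # assumption: components stored as float32


def raw_vector_bytes(chunks: int, dimensions: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Size of the embedding payload alone, in bytes."""
    return chunks * dimensions * bytes_per_value


size = raw_vector_bytes(CHUNKS, DIMENSIONS)
print(f"{size} bytes = {size / 2**20:.2f} MiB")  # 2031616 bytes = 1.94 MiB
```

Under that assumption the embeddings for the whole project take about 2 MiB.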
Core principle: Semantic search is for understanding, not navigation. Use it to find what you don't know exists. When you know the file path, use Read. When you know the exact text, use Grep. When you know the intent but not the location, use semantic search.