ChunkHound

Modern RAG for your codebase - Semantic and Regex Search via MCP

LLMs like Claude and GPT don’t know your codebase - they only know what they were trained on. Every time they help you code, they need to search your files to understand your project’s specific patterns and terminology.

ChunkHound integrates with AI assistants via the Model Context Protocol (MCP) to give them two ways to explore your code:

  • Semantic search - Finds code by meaning, so when the AI looks for “user authentication” it also finds your validateLogin() and checkCredentials() functions
  • Regex search - Pattern matching for precise code structures
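
To make the regex side concrete, here is a minimal sketch of the kind of pattern an assistant might search for. It uses Python's standard `re` module on a made-up source snippet; the snippet and the helper `find_auth_functions` are hypothetical, and ChunkHound's own regex dialect and tool interface may differ from what is shown.

```python
import re

# Hypothetical source text standing in for a file in your project.
SOURCE = '''\
def validateLogin(user, password):
    return checkCredentials(user, password)

def checkCredentials(user, password):
    return bool(user and password)

def render_page():
    pass
'''

# Match top-level Python function definitions whose names start with
# "validate" or "check"; the capture group holds the function name.
PATTERN = re.compile(r"^def\s+((?:validate|check)\w*)\s*\(", re.MULTILINE)


def find_auth_functions(source: str) -> list[str]:
    """Return the names of matching function definitions, in file order."""
    return PATTERN.findall(source)


print(find_auth_functions(SOURCE))
```

A pattern like this is exact about structure (a `def` at the start of a line) but only finds names it was told to look for, which is the gap semantic search is meant to cover.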

Traditional search was built for humans who know what they’re looking for. But AI assistants start with zero knowledge about your codebase. Semantic search bridges this gap by understanding that “database timeout” and “SQL connection lost” are related concepts, even though they share no keywords.
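
The difference can be sketched in a few lines. The example below compares keyword overlap with cosine similarity over embedding vectors. The three-dimensional vectors are hand-picked toy values chosen only for illustration, not output from any real embedding model, and this is not ChunkHound's actual implementation.

```python
import math


def keyword_overlap(a: str, b: str) -> set[str]:
    """Words shared by two phrases, compared case-insensitively."""
    return set(a.lower().split()) & set(b.lower().split())


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Toy vectors (assumption): a real embedding model produces hundreds or
# thousands of dimensions, and these numbers are invented for the demo.
TOY_EMBEDDINGS = {
    "database timeout": [0.9, 0.8, 0.1],
    "SQL connection lost": [0.8, 0.9, 0.2],
    "render login button": [0.1, 0.2, 0.9],
}

query = "database timeout"
for phrase, vector in TOY_EMBEDDINGS.items():
    shared = keyword_overlap(query, phrase)
    score = cosine_similarity(TOY_EMBEDDINGS[query], vector)
    print(f"{phrase!r}: shared words={sorted(shared)}, similarity={score:.2f}")
```

With these toy values, "SQL connection lost" shares no words with the query yet scores far higher than the unrelated phrase, which is the behavior a keyword search cannot reproduce.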

ChunkHound supports 22 languages with structured parsing:

  • Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Bash, MATLAB, Makefile
  • Configuration (via Tree-sitter): JSON, YAML, TOML, Markdown
  • Text-based (custom parsers): Text files, PDF
# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install ChunkHound
uv tool install chunkhound

ChunkHound works without configuration for regex search. For semantic search, create .chunkhound.json in your project root:

Recommended: VoyageAI (fastest, most accurate, and most cost-effective)

{
  "embedding": {
    "provider": "voyageai",
    "api_key": "pa-your-voyage-key"
  }
}

Get an API key from the VoyageAI Console, and see the VoyageAI documentation for details.
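
If you would rather not type the key into the file by hand, the config can be generated. The sketch below writes the same structure shown above; the helper names `build_config` and `write_config` and the `VOYAGE_API_KEY` environment variable are assumptions made for this example, not something ChunkHound requires.

```python
import json
import os
import tempfile
from pathlib import Path


def build_config(api_key: str) -> dict:
    """Build the embedding config with the same keys as the example above."""
    return {"embedding": {"provider": "voyageai", "api_key": api_key}}


def write_config(project_root: Path, api_key: str) -> Path:
    """Write .chunkhound.json into project_root and return its path."""
    config_path = project_root / ".chunkhound.json"
    config_path.write_text(json.dumps(build_config(api_key), indent=2) + "\n")
    return config_path


# Demo against a throwaway directory so no real project is touched.
# VOYAGE_API_KEY is an assumed variable name; the fallback is the
# placeholder from the example config, not a working key.
with tempfile.TemporaryDirectory() as tmp:
    key = os.environ.get("VOYAGE_API_KEY", "pa-your-voyage-key")
    path = write_config(Path(tmp), key)
    print(path.read_text())
```

To use it for real, call `write_config(Path("/path/to/project"), key)`. Because the file holds a secret, you may want to keep it out of version control.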

Configure ChunkHound as an MCP server in your AI assistant:

Add to ~/.claude.json:

{
  "mcpServers": {
    "chunkhound": {
      "command": "chunkhound",
      "args": ["mcp"]
    }
  }
}
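
If ~/.claude.json already exists, pasting the block above over it would discard whatever else is in the file. A small merge script avoids that. This is a sketch under assumptions: the helper names are hypothetical, the "other-tool" entry in the demo is invented, and you should back up the file before letting any script rewrite it.

```python
import json
from pathlib import Path

# The server entry from the example above.
CHUNKHOUND_SERVER = {"command": "chunkhound", "args": ["mcp"]}


def add_chunkhound_server(config: dict) -> dict:
    """Return a copy of config with the chunkhound MCP entry added.

    Other top-level keys and other MCP servers are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["chunkhound"] = dict(CHUNKHOUND_SERVER)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at path, creating it if missing."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_chunkhound_server(existing), indent=2) + "\n")


# Demo on an in-memory config; "other-tool" is a made-up existing server.
sample = {"mcpServers": {"other-tool": {"command": "other-tool", "args": []}}}
print(json.dumps(add_chunkhound_server(sample), indent=2))
```

To apply it to the real file, call `update_config_file(Path.home() / ".claude.json")`.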
# Index your codebase (respects .gitignore automatically)
cd /path/to/project && chunkhound index