ChunkHound uses a 5-level configuration hierarchy. Each source can override the previous ones:
--api-key, --model, --debug - CLI arguments
.chunkhound.json - Project-specific config in target directory
--config path or CHUNKHOUND_CONFIG_FILE - Config file at an explicit location
CHUNKHOUND_* prefixed variables - Environment variables
Example configuration:
{
  "database": {
    "provider": "duckdb",
    "path": "/path/to/database"
  },
  "embedding": {
    "provider": "voyageai",
    "model": "voyage-3.5",
    "api_key": "pa-your-key",
    "base_url": "https://api.voyageai.com/v1",
    "rerank_model": "rerank-lite-1",
    "rerank_url": "/rerank"
  },
  "indexing": {
    "include": ["**/*.py", "**/*.js", "**/*.ts"],
    "exclude": ["**/node_modules/**", "**/__pycache__/**"]
  },
  "mcp": {
    "transport": "stdio",
    "host": "0.0.0.0",
    "port": 3000
  },
  "debug": false
}
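The layering described above can be illustrated with a small deep-merge sketch. It is not ChunkHound's actual loader: the function names, the sample layers, and the assumed precedence (later layers in the list win) are all hypothetical and only show how one source can override another key by key.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where values from `override` replace those in `base`.

    Nested dicts are merged key by key; any other value is replaced wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(layers: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge layers in order; later layers override earlier ones."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical layers, listed from lowest to highest priority (assumed order).
defaults = {"database": {"provider": "duckdb", "path": ".chunkhound"}, "debug": False}
project_file = {"embedding": {"provider": "voyageai", "model": "voyage-3.5"}}
cli_args = {"embedding": {"model": "voyage-code-3"}, "debug": True}

config = resolve_config([defaults, project_file, cli_args])
print(config["embedding"])  # {'provider': 'voyageai', 'model': 'voyage-code-3'}
```

Merging nested sections key by key means a CLI override of the model leaves the provider from the project file untouched.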
DuckDB (Default)
File: Single .db file
Performance: Excellent for code search
Storage: Efficient columnar format
Setup: Zero configuration required
LanceDB (Alternative)
File: Directory with multiple files
Performance: Optimized for vector operations
Storage: Native vector format
Setup: Set "provider": "lancedb"
Field | Type | Default | Description |
---|---|---|---|
provider | "duckdb" or "lancedb" | "duckdb" | Database engine |
path | string | .chunkhound | Database directory path |
Environment Variables:
CHUNKHOUND_DATABASE__PROVIDER - Database provider
CHUNKHOUND_DATABASE__PATH - Database directory path
CLI Arguments:
--database-provider - Choose database provider
--db, --database-path - Set database path

Best for: Accuracy, cost efficiency, code understanding
VoyageAI Documentation | API Reference
{ "embedding": { "provider": "voyageai", "api_key": "pa-your-voyage-key", "model": "voyage-3.5", "rerank_model": "rerank-lite-1" }}
Available Models (full list):
voyage-3.5 (default) - General purpose, 1024 dimensions
voyage-code-3 - Optimized for code, 1024 dimensions
voyage-3-large - Higher accuracy, 1024 dimensions
voyage-law-2 - Legal documents, 1024 dimensions

Best for: Wide compatibility and ecosystem support
OpenAI Documentation | Embeddings Guide
{ "embedding": { "provider": "openai", "api_key": "sk-your-openai-key", "model": "text-embedding-3-small" }}
Available Models (pricing):
text-embedding-3-small (default) - Fast, 1536 dimensions
text-embedding-3-large - Higher accuracy, 3072 dimensions
text-embedding-ada-002 - Legacy model, 1536 dimensions

Best for: Privacy, custom models, local deployment
Uses OpenAI-compatible API format for maximum compatibility.
{ "embedding": { "provider": "openai", "base_url": "http://localhost:11434/v1", "model": "nomic-embed-text" }}
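To make "OpenAI-compatible API format" concrete, the sketch below builds the kind of request such a server expects: a POST to the `/embeddings` path under `base_url` with `model` and `input` fields. The helper name `build_embedding_request` is hypothetical, and the code only constructs the request without sending it; it does not show how ChunkHound itself calls the server.

```python
import json
from urllib.request import Request


def build_embedding_request(base_url, model, texts, api_key=None):
    """Build (but do not send) a POST request for an OpenAI-style /embeddings endpoint."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Local servers often need no key, so the header is optional here.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    url = base_url.rstrip("/") + "/embeddings"
    return Request(url, data=body, headers=headers, method="POST")


request = build_embedding_request(
    "http://localhost:11434/v1", "nomic-embed-text", ["def hello(): pass"]
)
print(request.full_url)  # http://localhost:11434/v1/embeddings
```

Passing the request to `urllib.request.urlopen` against a running server should return JSON whose `data` list holds one `embedding` vector per input text.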
Compatible Servers:
http://localhost:11434/v1 (API docs)
http://localhost:8080/v1 (setup guide)
http://localhost:1234/v1 (local server docs)

Field | Type | Default | Description |
---|---|---|---|
provider | "openai" or "voyageai" | None | Embedding provider |
model | string | Provider default | Model name |
api_key | string | None | API key for authentication |
base_url | string | Provider default | Custom API base URL |
rerank_model | string | None | Reranking model |
rerank_url | string | "/rerank" | Rerank endpoint path |
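One way to read the `rerank_url` default of "/rerank" is as a path appended to `base_url`. The sketch below works on that assumption, which the table does not confirm; the function name `resolve_rerank_url` and the handling of absolute URLs are hypothetical.

```python
def resolve_rerank_url(base_url: str, rerank_url: str = "/rerank") -> str:
    """Return the full rerank endpoint.

    Assumption: an absolute rerank_url is used as-is, while a path such as
    "/rerank" is appended to base_url.
    """
    if rerank_url.startswith(("http://", "https://")):
        return rerank_url
    return base_url.rstrip("/") + "/" + rerank_url.lstrip("/")


print(resolve_rerank_url("https://api.voyageai.com/v1"))
# https://api.voyageai.com/v1/rerank
```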
ChunkHound automatically respects .gitignore files and includes comprehensive defaults. File discovery uses Tree-sitter for language detection:
Default Include Patterns:
[ "**/*.py", "**/*.js", "**/*.ts", "**/*.tsx", "**/*.jsx", "**/*.go", "**/*.rs", "**/*.java", "**/*.c", "**/*.cpp", "**/*.h", "**/*.hpp", "**/*.cs", "**/*.php", "**/*.rb", "**/*.swift", "**/*.kt", "**/*.scala", "**/*.clj", "**/*.sh", "**/*.bash", "**/*.zsh", "**/*.fish", "**/*.sql", "**/*.json", "**/*.yaml", "**/*.yml", "**/*.toml", "**/*.xml", "**/*.html", "**/*.css", "**/*.scss", "**/*.sass", "**/*.less", "**/*.md", "**/*.rst", "**/*.txt", "**/*.dockerfile", "**/Dockerfile*", "**/Makefile*", "**/*.mk"]
Default Exclude Patterns:
[ "**/node_modules/**", "**/.git/**", "**/__pycache__/**", "**/venv/**", "**/.venv/**", "**/dist/**", "**/build/**", "**/target/**", "**/.vscode/**", "**/.idea/**", "**/*.tmp*", "**/*.swp", "**/*.swo", "**/*.min.js", "**/*.min.css", "**/package-lock.json", "**/yarn.lock"]
Field | Type | Default | Description |
---|---|---|---|
include | string[] | Comprehensive list | File patterns to include |
exclude | string[] | Comprehensive list | File patterns to exclude |
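The interaction between include and exclude patterns can be sketched as follows. The matching rules here are an assumption (gitignore-style globs where `**/` spans any number of directories and `*` stays within one path segment); ChunkHound's real matcher may differ, and `glob_to_regex` and `should_index` are hypothetical helper names.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex (assumed semantics, not ChunkHound's own)."""
    i, parts = 0, []
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # one character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def should_index(path: str, include: list, exclude: list) -> bool:
    """A file is indexed if it matches any include and no exclude pattern."""
    included = any(glob_to_regex(p).fullmatch(path) for p in include)
    excluded = any(glob_to_regex(p).fullmatch(path) for p in exclude)
    return included and not excluded


include = ["**/*.py", "**/*.js", "**/*.ts"]
exclude = ["**/node_modules/**", "**/__pycache__/**"]

print(should_index("src/app.py", include, exclude))  # True
print(should_index("web/node_modules/lib/index.js", include, exclude))  # False
print(should_index("README", include, exclude))  # False
```

In this sketch an exclude match always wins over an include match, which is why the file under node_modules is skipped even though it ends in .js.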
Environment Variables:
CHUNKHOUND_INDEXING__INCLUDE - Comma-separated include patterns
CHUNKHOUND_INDEXING__EXCLUDE - Comma-separated exclude patterns
CLI Arguments:
--force-reindex - Force reindexing all files
--include PATTERN - Add include pattern (can be used multiple times)
--exclude PATTERN - Add exclude pattern (can be used multiple times)

MCP transport mode is controlled via CLI arguments when starting the server, not through configuration files.
Best for: IDE integrations (Claude Desktop, Claude Code, Cursor, VS Code)
Follows MCP specification for standard I/O transport.
# Default stdio mode
chunkhound mcp
# Explicit stdio mode
chunkhound mcp --stdio
Uses standard input/output for communication. Most IDE integrations expect this mode.
Best for: Web applications, VS Code extensions, debugging
Uses MCP over HTTP transport.
# HTTP mode with default port (3000)
chunkhound mcp --http
# HTTP mode with custom port and host
chunkhound mcp --http --port 8000 --host 127.0.0.1
Runs an HTTP server for MCP communication. Easier to debug and test.
Argument | Description | Example |
---|---|---|
--stdio | Use stdio transport (default) | chunkhound mcp --stdio |
--http | Use HTTP transport | chunkhound mcp --http |
--host HOST | Set HTTP server host | chunkhound mcp --http --host localhost |
--port PORT | Set HTTP server port | chunkhound mcp --http --port 8000 |
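When the server is launched from a script, the arguments in the table can be assembled programmatically. This is a minimal sketch using only the flags listed above; the helper name `build_mcp_command` is hypothetical, and rejecting `--host`/`--port` in stdio mode is a choice made for this example rather than documented CLI behavior.

```python
def build_mcp_command(transport="stdio", host=None, port=None):
    """Return the argv list for starting the MCP server with the given transport."""
    if transport not in ("stdio", "http"):
        raise ValueError(f"unknown transport: {transport}")
    command = ["chunkhound", "mcp", f"--{transport}"]
    if transport == "http":
        if host is not None:
            command += ["--host", host]
        if port is not None:
            command += ["--port", str(port)]
    elif host is not None or port is not None:
        # Example-only rule: host and port are described for HTTP mode.
        raise ValueError("host and port only apply to the HTTP transport")
    return command


print(build_mcp_command())
# ['chunkhound', 'mcp', '--stdio']
print(build_mcp_command("http", host="127.0.0.1", port=8000))
# ['chunkhound', 'mcp', '--http', '--host', '127.0.0.1', '--port', '8000']
```

The resulting list can be passed to `subprocess.run` or `subprocess.Popen` without shell quoting concerns.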
Environment Variables (for HTTP mode):
CHUNKHOUND_MCP__HOST - Default HTTP server host
CHUNKHOUND_MCP__PORT - Default HTTP server port

ChunkHound uses a standardized naming pattern:
Prefix: CHUNKHOUND_
Section delimiter: __ (double underscore)
Example: CHUNKHOUND_EMBEDDING__API_KEY
# Main Configuration
CHUNKHOUND_DEBUG=true # Enable debug mode
CHUNKHOUND_CONFIG_FILE=/path/to/config.json # Config file path

# Database Configuration
CHUNKHOUND_DATABASE__PROVIDER=duckdb # Database provider
CHUNKHOUND_DATABASE__PATH=/custom/db/path # Database directory

# Embedding Configuration
CHUNKHOUND_EMBEDDING__PROVIDER=voyageai # Embedding provider
CHUNKHOUND_EMBEDDING__API_KEY=pa-your-key # API key
CHUNKHOUND_EMBEDDING__BASE_URL=https://api... # Custom base URL
CHUNKHOUND_EMBEDDING__MODEL=voyage-3.5 # Model name

# Indexing Configuration
CHUNKHOUND_INDEXING__INCLUDE="*.py,*.js" # Include patterns
CHUNKHOUND_INDEXING__EXCLUDE="*/tests/*" # Exclude patterns

# MCP Configuration (HTTP mode only)
CHUNKHOUND_MCP__HOST=localhost # Default HTTP server host
CHUNKHOUND_MCP__PORT=8080 # Default HTTP server port

# Provider Fallback Variables
OPENAI_API_KEY=sk-your-key # OpenAI API key fallback
OPENAI_BASE_URL=https://api.openai.com/v1 # OpenAI base URL fallback
VOYAGE_API_KEY=pa-your-key # VoyageAI API key fallback
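The naming pattern maps naturally onto nested configuration sections. The sketch below shows one plausible reading of it: strip the `CHUNKHOUND_` prefix, split on the double underscore, and lowercase the parts. The function `env_to_config` is hypothetical, values are left as strings, and the fallback variables without the prefix are simply ignored here.

```python
PREFIX = "CHUNKHOUND_"
DELIMITER = "__"


def env_to_config(environ):
    """Turn CHUNKHOUND_* variables into a nested dict (values stay strings)."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        keys = name[len(PREFIX):].lower().split(DELIMITER)
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


example_env = {
    "CHUNKHOUND_DEBUG": "true",
    "CHUNKHOUND_DATABASE__PROVIDER": "duckdb",
    "CHUNKHOUND_EMBEDDING__API_KEY": "pa-your-key",
    "CHUNKHOUND_INDEXING__INCLUDE": "*.py,*.js",
    "OPENAI_API_KEY": "sk-your-key",  # no CHUNKHOUND_ prefix, so ignored here
}

config = env_to_config(example_env)
# Include and exclude patterns are comma-separated, so split them into a list.
config["indexing"]["include"] = config["indexing"]["include"].split(",")
print(config["embedding"])  # {'api_key': 'pa-your-key'}
```

In practice `os.environ` would be passed instead of `example_env`, and string values such as "true" would still need converting to their proper types.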