agentd-index Service Documentation

The index service provides semantic code search over registered repositories. It chunks source files using tree-sitter, generates embeddings via Ollama (or OpenAI, depending on the configured provider), and stores them in LanceDB for fast approximate nearest-neighbour (ANN) search. Optional BM25 full-text search via tantivy can be combined with vector search using Reciprocal Rank Fusion (RRF).

Base URL

http://127.0.0.1:17012

The port defaults to 17012 and can be changed with AGENTD_PORT.

Architecture

source files
     │
     ▼
  Chunker (tree-sitter)         ← semantic / hierarchical split
     │
     ▼
  Embedder (Ollama / OpenAI)    ← nomic-embed-text by default
     │
     ▼
  LanceDB vector store          ← ANN index for fast retrieval
     │
     ▼
  Search endpoint               ← vector / hybrid / keyword

Search Modes

Mode	Algorithm	Description
`vector`	ANN similarity	Pure semantic search (default)
`hybrid`	Vector + BM25 via RRF	Combines semantic and keyword relevance
`keyword`	BM25 (tantivy)	Exact term / identifier matching
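
As a rough illustration of how hybrid mode can merge a vector ranking with a BM25 ranking, the sketch below implements Reciprocal Rank Fusion over lists of chunk IDs. The function name `rrf_fuse`, the constant `k = 60` (a value commonly used for RRF), the tie-breaking rule, and the chunk IDs are all assumptions; the constant and tie-breaking the service actually uses are not documented here.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk IDs: one list from vector search, one from BM25.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

In this example `chunk_b` (ranked second and first) ends up ahead of `chunk_a` (ranked first and third), because RRF rewards consistent placement across both lists rather than a single top hit.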

Environment Variables

Variable	Default	Description
`AGENTD_PORT`	`17012`	HTTP listen port
`AGENTD_INDEX_EMBEDDING_PROVIDER`	`ollama`	Embedding provider (`ollama`/`openai`)
`AGENTD_INDEX_EMBEDDING_MODEL`	`nomic-embed-text`	Embedding model name
`AGENTD_INDEX_EMBEDDING_ENDPOINT`	`http://localhost:11434/v1`	Ollama / OpenAI API endpoint
`AGENTD_INDEX_LANCE_PATH`	XDG data dir / `lancedb`	LanceDB directory path
`AGENTD_INDEX_LANCE_TABLE`	`code_chunks`	LanceDB table name
`AGENTD_INDEX_WATCH_INTERVAL`	`30`	File watch poll interval (seconds)
`AGENTD_INDEX_LANGUAGES`	`rust,python,javascript,typescript`	Comma-separated languages to index
`AGENTD_INDEX_IGNORE_PATTERNS`	`.git,target,node_modules,dist`	Comma-separated patterns to skip
`AGENTD_INDEX_SUMMARY_ENABLED`	`false`	Enable LLM-generated chunk summaries
`AGENTD_INDEX_SUMMARY_MODEL`	`qwen2.5-coder:7b`	Ollama model for summaries
`AGENTD_INDEX_RERANK_ENABLED`	`false`	Enable cross-encoder reranking
`AGENTD_INDEX_RERANK_MODEL`	`qwen2.5-coder:7b`	Ollama model for reranking
`AGENTD_INDEX_RERANK_CANDIDATES`	`30`	Candidate count before reranking
`RUST_LOG`	`info`	Log level filter
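
To make the defaults concrete, here is a minimal sketch of resolving a subset of these variables from the environment. The helper names (`load_config`, `split_csv`) and the rule for parsing booleans are assumptions for illustration; this is not the service's own configuration code.

```python
import os

# Defaults copied from the table above. The loader is a hypothetical helper,
# not the service's own configuration code.
DEFAULTS = {
    "AGENTD_PORT": "17012",
    "AGENTD_INDEX_EMBEDDING_PROVIDER": "ollama",
    "AGENTD_INDEX_EMBEDDING_MODEL": "nomic-embed-text",
    "AGENTD_INDEX_EMBEDDING_ENDPOINT": "http://localhost:11434/v1",
    "AGENTD_INDEX_WATCH_INTERVAL": "30",
    "AGENTD_INDEX_LANGUAGES": "rust,python,javascript,typescript",
    "AGENTD_INDEX_IGNORE_PATTERNS": ".git,target,node_modules,dist",
    "AGENTD_INDEX_RERANK_ENABLED": "false",
    "AGENTD_INDEX_RERANK_CANDIDATES": "30",
}


def split_csv(value):
    """Split a comma-separated value, dropping blanks and whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_config(env=None):
    """Resolve a subset of the documented variables against their defaults."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "port": int(raw["AGENTD_PORT"]),
        "embedding_provider": raw["AGENTD_INDEX_EMBEDDING_PROVIDER"],
        "embedding_model": raw["AGENTD_INDEX_EMBEDDING_MODEL"],
        "embedding_endpoint": raw["AGENTD_INDEX_EMBEDDING_ENDPOINT"],
        "watch_interval_seconds": int(raw["AGENTD_INDEX_WATCH_INTERVAL"]),
        "languages": split_csv(raw["AGENTD_INDEX_LANGUAGES"]),
        "ignore_patterns": split_csv(raw["AGENTD_INDEX_IGNORE_PATTERNS"]),
        # Assumption: only the literal "true" (any case) enables a flag.
        "rerank_enabled": raw["AGENTD_INDEX_RERANK_ENABLED"].lower() == "true",
        "rerank_candidates": int(raw["AGENTD_INDEX_RERANK_CANDIDATES"]),
    }
```

Calling `load_config({})` yields the documented defaults, while `load_config()` reads the current process environment.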

Endpoints

Health Check

GET /health

Response:

{
  "status": "ok",
  "service": "agentd-index",
  "version": "0.2.0"
}

Search

POST /search

Searches indexed code chunks using the search mode named in the request's search_mode field (vector if omitted).

Request body:

{
  "query": "authentication middleware",
  "search_mode": "vector",
  "repo_id": "agentd",
  "language": "rust",
  "file_pattern": "src/auth/**",
  "hierarchy_level": "symbol",
  "limit": 10
}

Field	Type	Required	Default	Description
`query`	string	yes	—	Natural-language or identifier query
`search_mode`	string	no	`vector`	`vector`, `hybrid`, or `keyword`
`repo_id`	string	no	—	Filter to a specific repository
`language`	string	no	—	Filter by language (`rust`, `python`, …)
`file_pattern`	string	no	—	Glob pattern for file path filtering
`hierarchy_level`	string	no	—	`symbol`, `file`, `directory`, or `repository`
`limit`	integer	no	`10`	Max results (clamped to 1–100)
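
A client may want to build the request body programmatically. The sketch below mirrors the field rules in the table: a required query, one of three modes, a limit clamped to 1-100, and optional filters that are left out when unset. The helper name `build_search_body` and the client-side validation are assumptions; the service performs its own validation regardless.

```python
VALID_MODES = ("vector", "hybrid", "keyword")
FILTER_FIELDS = ("repo_id", "language", "file_pattern", "hierarchy_level")


def build_search_body(query, search_mode="vector", limit=10, **filters):
    """Build a JSON-serialisable body for POST /search."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if search_mode not in VALID_MODES:
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    unknown = set(filters) - set(FILTER_FIELDS)
    if unknown:
        raise ValueError(f"unknown filter fields: {sorted(unknown)}")
    body = {
        "query": query,
        "search_mode": search_mode,
        # The service clamps to 1-100; doing it here keeps requests tidy.
        "limit": max(1, min(100, limit)),
    }
    # Optional filters are omitted entirely when not set.
    body.update({key: value for key, value in filters.items() if value is not None})
    return body


example_body = build_search_body(
    "authentication middleware", limit=500, language="rust"
)
```

Here `example_body["limit"]` is 100 because the requested 500 exceeds the documented maximum.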

Response:

{
  "results": [
    {
      "id": "chunk_abc123_0",
      "file_path": "src/auth/middleware.rs",
      "language": "rust",
      "chunk_type": "function",
      "symbol_name": "authenticate_request",
      "start_line": 42,
      "end_line": 68,
      "content": "pub async fn authenticate_request(...) {",
      "summary": "Validates JWT tokens and attaches the user context.",
      "score": 0.92,
      "repo_id": "agentd"
    }
  ],
  "total": 1,
  "query_time_ms": 14
}

Error responses:

422 Unprocessable Entity: empty or invalid query
500 Internal Server Error: embedding or store failure
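
The following sketch shows one way a Python client could call the endpoint and react to the documented status codes, using only the standard library. The function names, the 10-second timeout, the `Content-Type` header, and the one-line result format are assumptions; the shape of error response bodies is not documented here, so the sketch does not parse them.

```python
import json
import urllib.error
import urllib.request


def post_search(body, base_url="http://127.0.0.1:17012"):
    """POST a search body and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 422:
            raise ValueError("search rejected: empty or invalid query") from err
        # 500 and anything else: embedding or store failure, or unexpected.
        raise RuntimeError(f"search failed with HTTP {err.code}") from err


def format_hit(hit):
    """Render one result as 'path:start-end symbol (score)'."""
    return (
        f"{hit['file_path']}:{hit['start_line']}-{hit['end_line']} "
        f"{hit['symbol_name']} ({hit['score']:.2f})"
    )
```

With the service running, `for hit in post_search({"query": "authentication middleware"})["results"]: print(format_hit(hit))` would print one line per matching chunk.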

Agentic Search (grep fallback)

POST /search/agentic

Grep-based search over raw source files. It does not require the vector index, which makes it useful for exact identifier lookup or for files that have not yet been indexed.

Request body:

{
  "query": "authenticate_request",
  "path": "crates/index/src",
  "file_pattern": "*.rs",
  "context_lines": 2,
  "limit": 20
}

Field	Type	Required	Default	Description
`query`	string	yes	—	Basic regex passed to `grep`
`path`	string	no	`.`	Directory to search (relative to service cwd)
`file_pattern`	string	no	`*`	`--include` glob for file filtering
`context_lines`	integer	no	`2`	Lines of context before/after each match
`limit`	integer	no	`20`	Max matches (capped at 200)
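
To show how these fields relate to a `grep` invocation, here is a hypothetical mapping from request fields to a command line. The exact flags the service passes are not documented, so this is only an approximation built from standard `grep` options (`-r`, `-n`, `-C`, `--include`, `-e`); `build_grep_argv` and `effective_limit` are illustrative names.

```python
def build_grep_argv(query, path=".", file_pattern="*", context_lines=2):
    """Assemble a grep command line roughly equivalent to a request.

    Assumption: the service may use different flags internally.
    """
    return [
        "grep",
        "-r",                         # recurse into `path`
        "-n",                         # prefix matches with line numbers
        "-C", str(context_lines),     # context lines before and after
        f"--include={file_pattern}",  # only search files matching the glob
        "-e", query,                  # pattern, treated as a basic regex
        path,
    ]


def effective_limit(limit=20):
    """Apply the documented cap of 200 matches."""
    return min(limit, 200)


argv = build_grep_argv("authenticate_request", "crates/index/src", "*.rs")
```

Passing the pattern through `-e` keeps a query that starts with `-` from being read as an option.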

Response:

{
  "matches": [
    {
      "file_path": "crates/index/src/api.rs",
      "line_number": 88,
      "content": "async fn authenticate_request(",
      "context_before": ["", "/// Authenticates an incoming request."],
      "context_after": ["    let token = req.headers().get(\"Authorization\");", "}"]
    }
  ],
  "total": 1,
  "query_time_ms": 12
}

Register Repository

POST /repositories

Register a local repository for background indexing.

Request body:

{
  "name": "agentd",
  "path": "/home/user/agentd"
}

Response (201 Created):

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "agentd",
  "path": "/home/user/agentd",
  "status": "pending",
  "created_at": "2024-01-01T00:00:00Z",
  "updated_at": "2024-01-01T00:00:00Z",
  "last_indexed": null,
  "error_message": null
}
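
A minimal sketch of registering a repository from Python with the standard library is shown here. The helper names, the timeout, and the `Content-Type` header are assumptions; the endpoint, body fields, and base URL come from this page.

```python
import json
import urllib.request


def build_register_request(name, path, base_url="http://127.0.0.1:17012"):
    """Build (but do not send) the POST /repositories request."""
    body = json.dumps({"name": name, "path": path}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/repositories",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_repository(name, path, base_url="http://127.0.0.1:17012"):
    """Send the request and return the created repository record."""
    request = build_register_request(name, path, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

The `id` field of the returned record is what the other `/repositories/{id}` endpoints expect.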

List Repositories

GET /repositories

Response:

{
  "repositories": [
    {
      "id": "550e8400-...",
      "name": "agentd",
      "path": "/home/user/agentd",
      "status": "ready",
      "created_at": "2024-01-01T00:00:00Z",
      "updated_at": "2024-01-01T01:00:00Z",
      "last_indexed": "2024-01-01T01:00:00Z"
    }
  ],
  "total": 1
}

Get Repository

GET /repositories/{id}

Returns a single repository record. 404 Not Found if the ID is unknown.

Delete Repository

DELETE /repositories/{id}

Removes a repository record.

204 No Content on success
404 Not Found if the ID is unknown

Note: Indexed vector chunks for this repository are NOT automatically removed from LanceDB when a repository is deleted.

Repository Status

GET /repositories/{id}/status

Returns the current indexing lifecycle status.

Response:

{
  "id": "550e8400-...",
  "status": "ready",
  "last_indexed": "2024-01-01T01:00:00Z",
  "error_message": null
}

Status Values

Status	Description
`pending`	Registered; waiting for the background indexer to start
`indexing`	Currently being indexed
`ready`	Successfully indexed; available for search
`error`	Last index run failed; see `error_message`
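
Because indexing runs in the background, a client typically polls the status endpoint until the repository reaches a terminal state. The sketch below waits for `ready`, raises on `error`, and gives up after a timeout. The function name, the polling interval, and the timeout are assumptions; `fetch_status` stands for any callable that returns the decoded JSON from GET /repositories/{id}/status.

```python
import time


def wait_until_indexed(fetch_status, poll_seconds=5.0, timeout_seconds=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the repository leaves the pending/indexing states."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status["status"] == "ready":
            return status
        if status["status"] == "error":
            # error_message is documented as the place to look on failure.
            raise RuntimeError(status.get("error_message") or "indexing failed")
        if clock() >= deadline:
            raise TimeoutError("repository was not indexed in time")
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.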

Trigger Re-index

POST /repositories/{id}/reindex

Marks the repository as pending so the background watcher will schedule a full re-index pass.

Response (202 Accepted):

{ "status": "pending" }

404 Not Found if the ID is unknown

CLI Usage

The agent index subcommand provides a user-friendly interface to all endpoints. The service URL is read from AGENTD_INDEX_SERVICE_URL (default: http://localhost:17012).

# Health
agent index health

# Repository management
agent index add-repo --name agentd --path /home/user/agentd
agent index list-repos
agent index status <repo-id>
agent index reindex <repo-id>
agent index remove-repo <repo-id>

# Code search
agent index search "authentication middleware"
agent index search "error handling" --mode hybrid --language rust
agent index search "ConnectionPool" --mode keyword --limit 5
agent index search "deploy handler" --file-pattern "src/api/**" --json

See agent index --help for all options.

Claude Code Integration

The agent-index Claude Code skill (.claude/skills/agent-index/) enables agents to search indexed code directly from conversations:

Use the agent-index skill to find the authentication middleware implementation.

Running the Service

# Development
cargo run -p index
RUST_LOG=debug cargo run -p index

# With custom embedding endpoint
AGENTD_INDEX_EMBEDDING_ENDPOINT=http://my-ollama:11434/v1 cargo run -p index

# Production (via xtask)
cargo xtask start-services

Metrics

The service exposes Prometheus metrics at GET /metrics.

Metric	Description
`service_info`	Service version and name gauge

Standard HTTP metrics (request count, latency, error rate) are emitted via the tower-http tracing layer and captured by the Prometheus scraper.
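
For quick checks without a full Prometheus setup, a script can fetch /metrics and pick out a single sample. The sketch below parses the Prometheus text exposition format for one metric name. The helper `find_metric` is hypothetical, and the label names carried by `service_info` are not documented here, so whatever labels the service emits are returned as-is. Escape sequences inside label values are not decoded.

```python
import re

# name{label="value",...} value   (labels are optional)
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def find_metric(text, name):
    """Return (labels, value) for the first sample of `name`, or None."""
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match and match.group("name") == name:
            labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
            return labels, float(match.group("value"))
    return None
```

Given the body of GET /metrics as a string, `find_metric(body, "service_info")` returns the labels and value of that gauge, or `None` if it is absent.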