A search engine for agents. Crawl the open internet, index agent descriptions, and let anyone search by capability, price, or trust score. Not a monopoly — anyone can run one.
Agents publish their descriptions at /.well-known/agent-descriptions on their own domains. The indexer is a crawler that visits those endpoints, stores what it finds in a full-text search database, and exposes a REST API so buyers can find sellers.
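The exact description schema is defined by the protocol spec; as a rough illustration only (the field names here are assumptions, not normative), a /.well-known/agent-descriptions document could look like:

```json
{
  "did": "did:web:translator.example.com",
  "name": "Translator",
  "description": "Translates text between major languages",
  "category": "translation",
  "services": [
    {
      "name": "translate",
      "description": "Translate a document, priced per request",
      "pricePerRequest": 0.05
    }
  ]
}
```

The crawler only needs what the index stores: the DID (which must match the serving domain), the searchable text, and pricing.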
This is not a centralized registry. Anyone can run their own indexer, crawl whatever agents they want, and serve their own search API. Multiple indexers can coexist, compete, and overlap. The protocol does not depend on any particular indexer being online.
```shell
npm install @dan-protocol/indexer
```

The package provides four main exports: a database layer, a crawler, a Hono REST API factory, and a trend tracker.
```typescript
import {
  IndexerDatabase,
  crawlAgent,
  crawlAgents,
  createIndexerApi,
  TrendTracker,
} from '@dan-protocol/indexer'
```

The database layer wraps SQLite with FTS5 full-text search. A single file, no external infrastructure.
```typescript
import { IndexerDatabase } from '@dan-protocol/indexer'

// In-memory (for testing)
const memDb = new IndexerDatabase(':memory:')

// Persistent file (for production)
const db = new IndexerDatabase('./agents.db')
```

The constructor creates the tables and the FTS5 index automatically on first run. The schema stores agents keyed by DID, with full-text search over name, description, and service capabilities.
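The exact DDL is internal to the package; purely as a sketch (table and column names here are assumptions), the schema is on the order of:

```sql
-- Hypothetical sketch; the actual table and column names are internal to the package
CREATE TABLE IF NOT EXISTS agents (
  did TEXT PRIMARY KEY,      -- did:web:<domain>, the unique key for upserts
  name TEXT NOT NULL,
  description TEXT,
  endpoint TEXT NOT NULL,    -- URL the description was crawled from
  category TEXT,
  price_per_request REAL,
  trust_score INTEGER        -- 0-100, used by the minTrust filter
);

-- FTS5 virtual table over the searchable text fields
CREATE VIRTUAL TABLE IF NOT EXISTS agents_fts USING fts5(
  name, description, services
);
```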
| Method | Signature | Description |
|---|---|---|
| `upsert()` | `(agent: AgentRecord) => void` | Insert or update an agent. Keyed by DID: if the DID exists, the record is overwritten. |
| `search()` | `(query: SearchQuery) => AgentRecord[]` | Full-text search with filters for capability, category, maxPrice, minTrust, limit, offset. |
| `get()` | `(did: string) => AgentRecord \| null` | Fetch a single agent by DID. Returns `null` if not found. |
| `remove()` | `(did: string) => void` | Remove an agent from the index. |
| `close()` | `() => void` | Close the database connection. Call this on shutdown. |
```typescript
interface SearchQuery {
  capability?: string // Full-text search term (uses FTS5 prefix matching)
  category?: string   // Exact category filter
  maxPrice?: number   // Maximum price per request
  minTrust?: number   // Minimum trust score (0-100)
  limit?: number      // Results per page (default: 20)
  offset?: number     // Pagination offset (default: 0)
}
```

The search uses SQLite FTS5, which supports prefix matching out of the box:
"translat" matches "translation", "translator", "translating""code rev" matches "code review", "code reviewer""summar" matches "summarizer", "summarization", "summary"The FTS5 index covers the agent's name, description, and all service names and descriptions.
The crawler fetches agent descriptions from /.well-known/agent-descriptions endpoints and upserts them into the database.
```typescript
import { crawlAgent, crawlAgents, IndexerDatabase } from '@dan-protocol/indexer'

const db = new IndexerDatabase('./agents.db')

// Crawl a single agent
const result = await crawlAgent('https://translator.example.com', db)
console.log(result)
// { success: true, did: 'did:web:translator.example.com' }
// or: { success: false, error: 'Connection refused' }

// Crawl many agents with concurrency control
const results = await crawlAgents(
  [
    'https://translator.example.com',
    'https://summarizer.example.com',
    'https://code-review.example.com',
  ],
  db,
  { timeout: 10000, retries: 2 },
  5, // concurrency limit
)

for (const r of results) {
  if (r.success) {
    console.log('Indexed:', r.did)
  } else {
    console.log('Failed:', r.error)
  }
}
```

The crawler accepts arbitrary URLs from the network, so it includes built-in SSRF (Server-Side Request Forgery) protection. Before making any HTTP request, the crawler validates the target:
- `127.0.0.1`, `::1`, `localhost`
- Private ranges: `10.x.x.x`, `172.16-31.x.x`, `192.168.x.x`, and link-local ranges
- `169.254.169.254` (the AWS/GCP/Azure metadata service)

Only `https://` URLs are accepted in production mode. If a URL resolves to a blocked address, the crawl returns `{ success: false, error: 'SSRF: blocked address' }` and nothing is written to the database.
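The exact checks are internal to the package. A simplified sketch of the host test (not the package's actual implementation, and skipping the DNS-resolution step the real crawler performs):

```typescript
// Simplified SSRF host check. The real crawler resolves DNS first and checks
// the resolved IP as well, which this sketch omits.
function isBlockedHost(host: string): boolean {
  if (host === 'localhost' || host === '::1') return true
  if (host === '169.254.169.254') return true // cloud metadata service
  const octets = host.split('.').map(Number)
  if (octets.length === 4 && octets.every((n) => Number.isInteger(n) && n >= 0 && n <= 255)) {
    if (octets[0] === 127) return true                                       // loopback
    if (octets[0] === 10) return true                                        // 10.0.0.0/8
    if (octets[0] === 172 && octets[1] >= 16 && octets[1] <= 31) return true // 172.16.0.0/12
    if (octets[0] === 192 && octets[1] === 168) return true                  // 192.168.0.0/16
    if (octets[0] === 169 && octets[1] === 254) return true                  // link-local
  }
  return false
}

console.log(isBlockedHost('169.254.169.254')) // → true
console.log(isBlockedHost('93.184.216.34'))   // → false
```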
```typescript
import { isSSRFSafe } from '@dan-protocol/indexer'

// Check a URL before crawling
const safe = await isSSRFSafe('https://agent.example.com')
console.log(safe) // true

const unsafe = await isSSRFSafe('http://169.254.169.254/latest/meta-data/')
console.log(unsafe) // false
```

The crawler prevents impersonation. When an agent is crawled from https://translator.example.com, its DID must be did:web:translator.example.com. If the DID does not match the crawled domain, the agent is rejected.
This ensures that evil.com cannot publish a description claiming to be did:web:trusted-agent.com. The DID is derived from the domain, and the domain is verified by TLS. No central authority needed — DNS and TLS provide the trust anchor.
```typescript
// The crawler does this automatically, but here is the logic:
import { validateDIDDomain } from '@dan-protocol/indexer'

const isValid = validateDIDDomain(
  'did:web:translator.example.com', // DID from the agent description
  'https://translator.example.com'  // URL the description was crawled from
)
// true — DID matches the crawled domain

const isInvalid = validateDIDDomain(
  'did:web:trusted-agent.com', // Claims to be trusted-agent.com
  'https://evil.com'           // But was crawled from evil.com
)
// false — DID does not match, agent rejected
```

The createIndexerApi() function returns a Hono app with all the standard indexer endpoints.
```typescript
import { createIndexerApi, IndexerDatabase, TrendTracker } from '@dan-protocol/indexer'
import { serve } from '@hono/node-server'

const db = new IndexerDatabase('./agents.db')
const trends = new TrendTracker()
const app = createIndexerApi(db, trends)

serve({ fetch: app.fetch, port: 4000 })
```

| Method | Path | Description |
|---|---|---|
| GET | `/agents` | Search agents. Query params: `capability`, `category`, `maxPrice`, `minTrust`, `limit` (default 20), `offset` (default 0). |
| GET | `/agents/:did` | Get a single agent by DID. Returns 404 if not found. |
| POST | `/agents/crawl` | Submit a URL to be crawled. Body: `{ "url": "https://..." }`. Triggers an immediate crawl. |
| GET | `/trends` | Market demand trends. Query params: `category`, `period` (e.g. `30d`, `7d`). |
| GET | `/health` | Health check. Returns `{ "status": "ok" }`. |
| GET | `/stats` | Index statistics: total agents, agents by category, last crawl time. |
```shell
# Search for translation agents with trust >= 50 and price <= 10
curl "http://localhost:4000/agents?capability=translat&minTrust=50&maxPrice=10&limit=5"

# Get a specific agent by DID
curl "http://localhost:4000/agents/did:web:translator.example.com"

# Submit a new agent URL to crawl
curl -X POST http://localhost:4000/agents/crawl \
  -H "Content-Type: application/json" \
  -d '{"url": "https://new-agent.example.com"}'

# Check market demand trends for translation
curl "http://localhost:4000/trends?category=translation&period=30d"

# Get index statistics
curl "http://localhost:4000/stats"
```

The TrendTracker records search queries and exposes aggregate demand trends per category. Agents can consume these trends to adjust pricing dynamically (Hermeticism: Rhythm — prices rise and fall with demand).
```typescript
import { TrendTracker } from '@dan-protocol/indexer'

const trends = new TrendTracker()

// Record queries (called automatically by the API on each search)
trends.recordQuery('translation')
trends.recordQuery('code-review')
trends.recordQuery('translation')

// Get trend for a category over the last 30 days
const trend = trends.getTrend('translation', 30)
console.log(trend)
// {
//   category: 'translation',
//   period: 30,
//   queryCount: 847,
//   trend: 'rising' // 'rising' | 'falling' | 'stable'
// }

// Get all active trends
const all = trends.getAllTrends()
// [
//   { category: 'translation', queryCount: 847, trend: 'rising' },
//   { category: 'code-review', queryCount: 312, trend: 'stable' },
// ]
```

The GET /trends endpoint exposes this data over the REST API:
GET /trends?category=translation&period=30d:

```json
{
  "category": "translation",
  "period": 30,
  "queryCount": 847,
  "trend": "rising",
  "dataPoints": [
    { "date": "2026-03-08", "count": 12 },
    { "date": "2026-03-09", "count": 18 },
    { "date": "2026-03-10", "count": 24 }
  ]
}
```

This information is not centrally controlled. It is data that agents use freely to make their own pricing decisions (Hayekian knowledge problem: distributed information is more efficient than central planning).
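How the tracker decides between 'rising', 'falling', and 'stable' is internal to the package. One plausible sketch (the 20% threshold and the half-window split are illustrative assumptions, not the package's actual values) compares the average daily count in the second half of the period against the first half:

```typescript
type TrendDirection = 'rising' | 'falling' | 'stable'

// Hypothetical sketch: classify demand by comparing average daily query
// counts in the later half of the window against the earlier half.
function classifyTrend(dailyCounts: number[]): TrendDirection {
  const mid = Math.floor(dailyCounts.length / 2)
  const avg = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / Math.max(xs.length, 1)
  const earlier = avg(dailyCounts.slice(0, mid))
  const later = avg(dailyCounts.slice(mid))
  if (later > earlier * 1.2) return 'rising'  // 20%+ increase
  if (later < earlier * 0.8) return 'falling' // 20%+ decrease
  return 'stable'
}

console.log(classifyTrend([12, 18, 24, 30])) // → 'rising'
```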
Agents update their descriptions over time (new services, price changes, updated trust scores). The indexer should re-crawl known agents periodically to keep the index fresh.
```typescript
// Re-crawl all known agents every 30 minutes
setInterval(async () => {
  const allAgents = db.search({ limit: 10000 })
  const urls = allAgents.map(a => a.endpoint)

  const results = await crawlAgents(urls, db, { timeout: 10000 }, 10)

  const succeeded = results.filter(r => r.success).length
  const failed = results.filter(r => !r.success).length
  console.log(`Re-crawl complete: ${succeeded} updated, ${failed} failed`)

  // Remove agents that have been unreachable for 7+ days
  for (const agent of allAgents) {
    if (agent.consecutiveFailures > 7 * 48) { // 48 crawls/day * 7 days
      db.remove(agent.did)
      console.log(`Removed unreachable agent: ${agent.did}`)
    }
  }
}, 30 * 60 * 1000)
```

A complete indexer process combines all of the above:

```typescript
import {
  IndexerDatabase,
  crawlAgents,
  createIndexerApi,
  TrendTracker,
} from '@dan-protocol/indexer'
import { serve } from '@hono/node-server'

async function main() {
  // 1. Initialize database and trend tracker
  const db = new IndexerDatabase('./agents.db')
  const trends = new TrendTracker()

  // 2. Seed with known agents
  const seedUrls = [
    'https://translator.example.com',
    'https://summarizer.example.com',
    'https://code-review.example.com',
    'https://data-analysis.example.com',
  ]
  console.log('Crawling seed agents...')
  const results = await crawlAgents(seedUrls, db, { timeout: 10000 }, 5)
  for (const r of results) {
    if (r.success) {
      console.log(`  Indexed: ${r.did}`)
    } else {
      console.log(`  Failed: ${r.error}`)
    }
  }

  // 3. Start the REST API
  const app = createIndexerApi(db, trends)
  serve({ fetch: app.fetch, port: 4000 })
  console.log('Indexer running at http://localhost:4000')
  console.log('Search: http://localhost:4000/agents?capability=translate')
  console.log('Trends: http://localhost:4000/trends?category=translation&period=30d')
  console.log('Stats:  http://localhost:4000/stats')

  // 4. Re-crawl all known agents every 30 minutes
  setInterval(async () => {
    const allAgents = db.search({ limit: 10000 })
    const urls = allAgents.map(a => a.endpoint)
    await crawlAgents(urls, db, { timeout: 10000 }, 10)
    console.log(`Re-crawl complete. ${allAgents.length} agents refreshed.`)
  }, 30 * 60 * 1000)

  // 5. Graceful shutdown
  process.on('SIGINT', () => {
    console.log('Shutting down indexer...')
    db.close()
    process.exit(0)
  })
}

main().catch(console.error)
```

The indexer is a lightweight Node.js process with a single SQLite file. It can run anywhere:
```dockerfile
# Docker
FROM node:20-slim
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
CMD ["node", "index.js"]
# Mount a volume for ./agents.db persistence
```

```shell
# Fly.io
fly launch --name my-indexer
fly volumes create indexer_data --size 1
fly deploy
```

```shell
# Any VPS
npm install @dan-protocol/indexer
node index.js
```