AgentDNS (Implementation)

agentdns is the open-source Go binary that powers zns01.zynd.ai and, typically, any other production registry. It is a single process backed by PostgreSQL, and optionally Redis, that runs the HTTP API, the TCP gossip mesh, the Kademlia DHT, the search engine, and the trust calculator.

If Registry Spec is the protocol, this section is the implementation: which Go files own which responsibility, what the tables look like, and which background loops fire.

When to read this

  • You're operating a self-hosted registry node.
  • You're debugging gossip propagation, search ranking, or DHT lookups.
  • You're contributing to the binary.
  • You're porting another runtime that conforms to the same wire protocol.

What's in the binary

A single Go process. Everything ships in one binary; subsystems are wired in cmd/agentdns/main.go.

Subsystem map

| Layer | Source | Responsibility |
| --- | --- | --- |
| HTTP API | internal/api | REST endpoints, Swagger docs, middleware (rate limits, CORS, logging), WebSocket heartbeat at /v1/entities/{id}/ws, activity stream at /v1/ws/activity |
| Mesh transport | internal/mesh/transport.go | TCP + TLS listener on port 4001, length-prefixed JSON wire protocol |
| Gossip protocol | internal/mesh/gossip.go | Dedup, signature verification, key pinning, hop counting, broadcast |
| Peer manager | internal/mesh/peers.go | Connection set, max-peer eviction, bloom-scored peer selection |
| Bootstrap | internal/mesh/bootstrap.go | Dial bootstrap peers with exponential backoff, reconnect loop |
| Bloom filters | internal/mesh/bloom.go | FNV double-hashing, periodic rebuild for query routing |
| DHT | internal/dht | Kademlia routing table, iterative lookup, republish/expire |
| Search engine | internal/search | BM25 keyword + semantic vectors + pluggable embedders |
| Ranking | internal/ranking | Weighted-linear and RRF scoring |
| Card fetcher | internal/card | HTTP fetch with .well-known fallbacks + LRU |
| Cache | internal/cache | Redis adapter (optional) |
| Identity | internal/identity | Ed25519 keypairs, derivation proofs, signing |
| Trust | internal/trust | EigenTrust calculator |
| Store | internal/store | PostgreSQL persistence (pgxpool) |
| Event bus | internal/events | In-process pub/sub fan-out |
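
The bloom filter row mentions FNV double-hashing. As a rough illustration of that technique (not the code in internal/mesh/bloom.go), the sketch below derives k probe positions from two FNV hashes of the same key. The type and function names, the choice of FNV-1a plus FNV-1 as the two base hashes, and the filter size are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a fixed-size bit set probed k times per key.
// Hypothetical type; the real filter layout is not documented here.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // probes per key
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// baseHashes returns two hashes of the key: FNV-1a and FNV-1.
// The second is forced odd so the probe step is never zero.
func baseHashes(key string) (uint64, uint64) {
	a := fnv.New64a()
	a.Write([]byte(key))
	b := fnv.New64()
	b.Write([]byte(key))
	return a.Sum64(), b.Sum64() | 1
}

// Add sets the k bits at positions (h1 + i*h2) mod m.
func (b *Bloom) Add(key string) {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		b.bits[idx/64] |= 1 << (idx % 64)
	}
}

// MayContain reports false only if the key was definitely never added.
func (b *Bloom) MayContain(key string) bool {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		if b.bits[idx/64]&(1<<(idx%64)) == 0 {
			return false
		}
	}
	return true
}

// Rebuild creates a fresh filter from the current key set, which is how a
// periodic rebuild drops keys that no longer exist.
func Rebuild(keys []string, m, k uint64) *Bloom {
	f := NewBloom(m, k)
	for _, key := range keys {
		f.Add(key)
	}
	return f
}

func main() {
	f := Rebuild([]string{"translation", "weather"}, 1<<12, 4)
	fmt.Println(f.MayContain("weather")) // added keys always match
	fmt.Println(f.MayContain("image-gen"))
}
```

A bloom filter cannot delete individual keys, which is one reason a periodic rebuild from the current agent set is a natural fit.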

Startup sequence

cmd/agentdns/main.go triggers the following on start:

  1. Load ~/.zynd/config.toml (or --config override).
  2. Load node identity from ~/.zynd/identity.json (Ed25519 keypair generated by agentdns init).
  3. Connect to PostgreSQL — pgxpool with min 2 / max 20 connections, 30 min lifetime, 5 min idle.
  4. Connect to Redis (optional). Connection failures fail open: the binary keeps running without a cache.
  5. Initialise the search engine with the configured embedder (hash / onnx / http).
  6. Create the peer manager and gossip handler.
  7. Create the EigenTrust calculator.
  8. Start the mesh transport (TCP listener on :4001).
  9. Wire federated search into the mesh.
  10. Initialise the DHT (if [dht].enabled = true).
  11. Start background loops (next section).
  12. Start the HTTP API server.
  13. Block on SIGINT / SIGTERM for graceful shutdown.

Shutdown drains in-flight requests, closes the mesh listener, and lets pgxpool finish active queries before exiting.

Background loops

Once boot completes, these tickers run for the lifetime of the process:

| Loop | Interval | What it does |
| --- | --- | --- |
| DHT republish | 1 hour | Re-stores all locally-owned records at the K closest nodes |
| Mesh heartbeat | 30 s | Broadcasts a MsgHeartbeat with bloom filter + peer addresses |
| Peer reconnect | 30 s | Re-dials disconnected bootstrap peers (exponential backoff capped at 60 s) |
| Tombstone GC | 1 hour | Drops expired tombstones from tombstones and gossip_entries |
| Liveness sweep | 60 s | Marks active agents inactive if last_heartbeat < now - threshold (default 5 min) and gossips an agent_status announcement |
| Bloom rebuild | 5 min | Rebuilds the local bloom filter from all current local + gossip agents |

Wire protocol

The mesh transport is length-prefixed JSON over TCP+TLS:

```
[4 bytes big-endian length][JSON payload]
```

  • Max message size: 1 MB
  • Write timeout: 10 s
  • Read timeout: 90 s (3× heartbeat interval)

Message types:

| Type | Purpose |
| --- | --- |
| MsgHello | Handshake: registry ID, public key, current agent count |
| MsgHeartbeat | Periodic peer heartbeat with bloom filter + peer list |
| MsgGossip | Announcement propagation |
| MsgSearch | Federated search query |
| MsgSearchAck | Federated search results |
| MsgDHT | Kademlia STORE / FIND_VALUE / FIND_NODE / PING |

The handshake is two HELLOs carrying registry ID, public key, and current agent count; self-connections and duplicates are rejected.

Why TLS from Ed25519

The mesh uses self-signed TLS certificates derived from each node's Ed25519 key. CA trust is irrelevant on the mesh port: verification happens at the application layer in the HELLO handshake, against keys learned from gossip, the registry identity proof, or DNS TXT. The minimum accepted version is TLS 1.3.

The HTTP API on :8080 is the opposite: it uses CA-issued certs (Let's Encrypt is expected) because clients use TLS to verify the domain.

Identity — IDs from public keys

Every entity ID is derived deterministically:

```
agent ID     = "zns:"     + sha256(pubkey)[:16].hex()
service ID   = "zns:svc:" + sha256(pubkey)[:16].hex()
developer ID = "zns:dev:" + sha256(pubkey)[:16].hex()
```

For HD-derived agent keys, the formula is:

```
seed = SHA-512(dev_priv_key || "zns:agent:" || uint32_be(index))[:32]
agent_kp = Ed25519(seed)
```

The verification logic in internal/identity reproduces both formulas to validate a developer_proof on POST /v1/entities.

Storage schema (PostgreSQL)

| Table | Purpose |
| --- | --- |
| agents | Registry record per agent |
| services | Registry record per service |
| developers | Developer profiles |
| handles | ZNS handle claims |
| zns_names | ZNS name bindings (handle → entity) |
| zns_versions | Version history of name bindings |
| gossip_entries | Replication log for cross-node propagation |
| tombstones | Deregistered entity markers; suppressed during gossip dedup |
| peers | Currently-known peer addresses + public keys |
| card_cache | Optional Postgres cache when Redis is unavailable |

Each table is owned by internal/store/<table>.go.

Card cache

internal/card fetches /.well-known/agent-card.json from each entity's entity_url:

  • TTL: 1 hour.
  • LRU bound: 1000 cards by default (configurable).
  • Fall-back paths: /agent-card.json, /.well-known/agent.json (for older entities), /well-known/agent.json.
  • Validates Ed25519 signature against the registry record's public_key.

Internal event bus

Every subsystem publishes lifecycle events into an in-process pub/sub bus (internal/events/bus.go):

  • Each subscriber gets a 256-buffered channel.
  • Slow subscribers drop events once their buffer is full; the bus never applies backpressure to publishers.
  • The WebSocket activity stream at /v1/ws/activity is just one subscriber.

Categories: agent_*, gossip_*, search_*, peer_*, handle_*, name_*. Used for dashboards, metrics exporters, and tests.

CLI

The agentdns binary itself ships these subcommands (separate from the user-facing zynd CLI):

| Command | Purpose |
| --- | --- |
| agentdns init | Generate node identity + write a default config.toml |
| agentdns start | Run the registry process |
| agentdns migrate | Run pending Postgres migrations |
| agentdns peer add <addr> | Manually add a peer to the persistent peer list |
| agentdns peer list | Print the current peer set |
| agentdns trust list | Print top-N trust scores |
| agentdns version | Build version |

The full operator's flow lives at Run a Registry Node.

Configuration

Config is a single TOML file (default ~/.zynd/config.toml). Top-level sections:

```toml
[server]
http_port  = 8080
mesh_port  = 4001
external_url = "https://my-registry.example.com"

[postgres]
url = "postgres://user:pass@localhost:5432/agentdns"

[redis]
url = "redis://localhost:6379"

[search]
embedder = "onnx"
onnx_model = "bge-small-en-v1.5"

[mesh]
bootstrap_peers = ["zns-boot.zynd.ai:4001"]
listen_port = 4001
max_peers = 64

[dht]
enabled = true
k = 20
alpha = 3

[onboarding]
mode = "open"          # or "restricted"
# auth_url is left unset here; TOML has no null value

[heartbeat]
ttl_minutes = 5
```

agentdns init generates a default file with sensible values.

Observability

| Surface | Endpoint |
| --- | --- |
| Health | GET /health |
| Node info | GET /v1/info |
| Network status | GET /v1/network/status |
| Network stats | GET /v1/network/stats |
| Peer list | GET /v1/network/peers |
| Live activity stream | WSS /v1/ws/activity |
| Prometheus (optional) | GET /metrics if [metrics].enabled = true |

For an operator's monitoring playbook see Metrics & Monitoring.

Released under the MIT License.