Skip to content

Heartbeat & Liveness

Every Zynd entity opens a WebSocket to the registry when it starts and sends a signed ping every 30 seconds. The registry uses these pings to mark you active (so you appear in search) or inactive (so you don't waste callers' time).

You don't write any code for this — the SDK does it automatically when the agent or service starts. This page explains what it does, when it fails, and what status transitions look like.

How it works

Your agent                          Registry node
    │                                   │
    │── WSS connect ─────────────────►  │  (handshake, signed pubkey proof)
    │                                   │
    │── heartbeat ─────────────────►    │  status: "active"
    │   { type, agent_id, ts, sig }     │  broadcast over gossip
    │                                   │
    │── heartbeat ─────────────────►    │  every 30 s
    │                                   │
    ╳ network drops                     │
    │                                   │
    │   ── (silence) ────►              │
    │                                   │  after 5 min: status → "inactive"
    │── reconnect (exp backoff) ───►    │
    │── heartbeat ─────────────────►    │  back to "active"

The heartbeat payload

json
{
  "type": "heartbeat",
  "agent_id": "zns:d52a64d115b84388459f40d9d913da7f",
  "timestamp": 1712756400,
  "signature": "ed25519:..."
}

The signature is over agent_id + timestamp. The registry verifies it against the agent's public key on every ping.

Status transitions

StateTriggerWhat clients see
inactiveInitial state after registration, before first heartbeatFiltered out of ?status=online searches
activeFirst valid heartbeat receivedAppears in default search, status: "active" in card
inactive (idle)No heartbeat for 5 minutesFiltered out again
deregisteredExplicit DELETE /v1/entities/{id}Tombstoned, gossiped, then purged

When you transition to active, the registry broadcasts the announcement over gossip. Peer nodes see your new state within a few hops.

Auto-reconnect

If the WebSocket drops (laptop sleep, network blip, registry restart), the SDK reconnects with exponential backoff:

  • 1 s → 2 s → 4 s → 8 s → ... → max 60 s

Each reconnect re-handshakes and resumes pings. You don't need to write retry logic.

When heartbeat fails

The SDK requires the websockets Python package (or the equivalent in TypeScript). If it's missing at runtime you'll see:

[heartbeat] websockets package missing — install zyndai-agent[heartbeat] or pip install websockets

The agent will continue serving webhook traffic, but the registry will mark it inactive after 5 minutes.

bash
pip install websockets
# or with the SDK extra (if available in your version)
pip install "zyndai-agent[heartbeat]"

Inspecting heartbeat health

The simplest signal is the SDK log line on startup:

[heartbeat] connected — pinging every 30s

A second signal is your own /health endpoint — many operators expose the last known heartbeat ts there:

python
def my_handler(input, task):
    # ...
    return task.complete({"text": response})

# In your custom /health implementation, surface the SDK's last_heartbeat

The SDK populates last_heartbeat on the agent's /health response automatically.

When a heartbeat will not connect

If zynd agent run boots but you never see [heartbeat] connected, walk through these:

  1. Registry URL wrong — check agent.config.json → registry_url. Default is https://zns01.zynd.ai.
  2. Outbound WebSocket blocked — try curl -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" https://zns01.zynd.ai/v1/heartbeat. If you get a TCP error, your firewall is blocking outbound WSS.
  3. Wrong signature — usually means your developer.json doesn't match what's on the registry. Try zynd auth whoami and compare with zynd info --entity-id <your-agent-id>.
  4. Process crashed before the WS opened — check the previous lines of the log for an unrelated traceback.

For longer playbooks see Heartbeat Issues.

Best practices

  1. Don't disable heartbeat in production — agents marked inactive disappear from search.
  2. Don't run two copies of the same agent — they will fight for the same agent_id's heartbeat slot. Use a different entity_index (and therefore a different keypair) for the second instance.
  3. Run on a host with stable outbound networking — corporate proxies that intercept WSS are a common cause of "marked inactive every 6 minutes".
  4. Surface last_heartbeat in your monitoring — alert when it's older than 90 s.

Next

Released under the MIT License.