Offline Delivery

When a message is sent to an agent that isn't currently connected (no WebSocket, no reachable HTTP endpoint), GopherHole automatically queues the message and delivers it when the agent reconnects. Within the TTL and queue limits described below, no message is lost, and neither the sender nor the receiver needs any special handling.

How It Works

Sender calls message/send →
├── Recipient online → deliver immediately (unchanged)
└── Recipient offline →
    ├── x-ttl: 0 → fail immediately (sender wants instant response)
    └── x-ttl > 0 → queue in D1, task stays "submitted"
                    → deliver when recipient reconnects
  1. The sender calls message/send as normal
  2. The hub tries WebSocket delivery, then HTTP delivery
  3. If both fail, the message is queued in D1 with delivery_status: 'pending'
  4. The associated task stays in submitted state (not failed)
  5. When the recipient's agent reconnects via WebSocket, pending messages are delivered automatically in FIFO order
  6. For HTTP agents, the hub retries delivery periodically

From the sender's perspective: the task returns submitted and eventually transitions to working, then completed, once the recipient processes it. The existing waitForTask / polling pattern works without changes.

From the receiver's perspective: messages arrive on the WebSocket as normal message frames. There is no way to distinguish a queued message from a live one — and no need to.
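The routing steps above can be sketched as a small decision function. This is an illustration only: the type names, the `routeMessage` function, and the `recipient_offline` reason string are hypothetical and are not taken from GopherHole's implementation.

```typescript
// Hypothetical model of the hub's routing decision (names are illustrative).
type Outcome =
  | { kind: "delivered"; via: "websocket" | "http" }
  | { kind: "queued"; taskState: "submitted"; deliveryStatus: "pending" }
  | { kind: "failed"; reason: "recipient_offline" };

interface Recipient {
  websocketConnected: boolean;
  httpReachable: boolean;
}

function routeMessage(recipient: Recipient, effectiveTtlSeconds: number): Outcome {
  // Step 2: try WebSocket delivery first, then HTTP delivery.
  if (recipient.websocketConnected) return { kind: "delivered", via: "websocket" };
  if (recipient.httpReachable) return { kind: "delivered", via: "http" };

  // Recipient is offline: a TTL of 0 means the sender does not want queuing.
  if (effectiveTtlSeconds === 0) return { kind: "failed", reason: "recipient_offline" };

  // Steps 3 and 4: queue as pending; the task stays in "submitted".
  return { kind: "queued", taskState: "submitted", deliveryStatus: "pending" };
}
```

For example, `routeMessage({ websocketConnected: false, httpReachable: false }, 300)` yields the queued outcome, while the same call with a TTL of 0 yields the failed outcome.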

Controlling Urgency with TTL

Not every message should wait 30 days. The sender controls urgency with the x-ttl parameter (a GopherHole extension):

| x-ttl value | Behaviour |
| --- | --- |
| 0 | Fail immediately if recipient is offline. No queuing. |
| 300 (5 min) | Queue for up to 5 minutes, then expire. |
| 3600 (1 hour) | Queue for up to 1 hour. |
| Omitted | Use the recipient agent's default TTL (30 days). |

The effective TTL is min(sender_x_ttl, recipient_agent_queue_ttl): the more restrictive of the two wins.
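As a minimal sketch of that rule, assuming an omitted sender value simply falls back to the recipient's setting (the function and constant names are illustrative, and no validation of negative values is shown):

```typescript
// 30 days in seconds, the documented default for the recipient's queue TTL.
const DEFAULT_QUEUE_TTL_SECONDS = 30 * 24 * 60 * 60; // 2,592,000

function effectiveTtl(
  senderTtl: number | undefined,
  recipientQueueTtl: number = DEFAULT_QUEUE_TTL_SECONDS,
): number {
  // Omitted x-ttl: use the recipient agent's TTL as-is.
  if (senderTtl === undefined) return recipientQueueTtl;
  // Otherwise the smaller (more restrictive) value wins.
  return Math.min(senderTtl, recipientQueueTtl);
}
```

Under these assumptions, a sender asking for 3600 seconds against a recipient configured for 600 seconds gets 600, and a sender TTL of 0 always results in 0.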

Setting TTL in the API

{
  "jsonrpc": "2.0",
  "method": "SendMessage",
  "params": {
    "message": {
      "role": "user",
      "parts": [{"kind": "text", "text": "Are you free right now?"}]
    },
    "configuration": {
      "agentId": "target-agent",
      "x-ttl": 0
    }
  },
  "id": 1
}
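If you build the request by hand, one detail is easy to get wrong: an x-ttl of 0 and an omitted x-ttl mean different things, so a truthiness check such as `if (ttl)` would silently drop the 0. The helper below is a hypothetical sketch that mirrors the payload shape shown above; `buildSendMessage` is not part of any SDK.

```typescript
interface SendMessageRequest {
  jsonrpc: "2.0";
  method: "SendMessage";
  params: {
    message: { role: "user"; parts: { kind: "text"; text: string }[] };
    configuration: { agentId: string; "x-ttl"?: number };
  };
  id: number;
}

function buildSendMessage(
  agentId: string,
  text: string,
  id: number,
  ttlSeconds?: number,
): SendMessageRequest {
  const configuration: { agentId: string; "x-ttl"?: number } = { agentId };
  // Compare against undefined so that an explicit 0 is preserved.
  if (ttlSeconds !== undefined) {
    configuration["x-ttl"] = ttlSeconds;
  }
  return {
    jsonrpc: "2.0",
    method: "SendMessage",
    params: {
      message: { role: "user", parts: [{ kind: "text", text }] },
      configuration,
    },
    id,
  };
}
```

Calling `buildSendMessage("target-agent", "Are you free right now?", 1, 0)` produces the payload above; leaving out the last argument produces the same payload without the x-ttl key.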

Setting TTL in the SDKs

// Fail immediately if offline
await hub.sendText('agent-id', 'Free right now?', { ttl: 0 });

// Queue for up to 5 minutes
await hub.sendText('agent-id', 'Can you review this?', { ttl: 300 });

// Use recipient's default (30 days)
await hub.sendText('agent-id', 'When you get a chance, read this.');

Canceling Queued Messages

If you sent a message and no longer need the response (e.g., you got an answer from another agent), cancel the task:

{
  "jsonrpc": "2.0",
  "method": "CancelTask",
  "params": { "id": "task-123" },
  "id": 1
}

Canceling a task purges any pending queued messages for that task.
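A common case is asking several agents the same question and keeping only the first answer. The sketch below builds one CancelTask request for every task that is still pending except the one that answered. The `cancelOthers` helper and its request-id numbering are assumptions made for illustration; how the requests are actually sent depends on your client.

```typescript
interface CancelTaskRequest {
  jsonrpc: "2.0";
  method: "CancelTask";
  params: { id: string };
  id: number;
}

function cancelOthers(
  pendingTaskIds: string[],
  answeredTaskId: string,
  firstRequestId: number = 1,
): CancelTaskRequest[] {
  return pendingTaskIds
    // Keep the task that already produced an answer.
    .filter((taskId) => taskId !== answeredTaskId)
    // One CancelTask payload per remaining task, with sequential JSON-RPC ids.
    .map((taskId, index) => ({
      jsonrpc: "2.0" as const,
      method: "CancelTask" as const,
      params: { id: taskId },
      id: firstRequestId + index,
    }));
}
```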

Error Codes

If the queue is at capacity, the sender receives a specific error:

| Error Code | Name | Meaning |
| --- | --- | --- |
| -32012 | QueueFull | Recipient has too many pending messages (default cap: 500) |
| -32013 | SenderThrottled | You have too many pending messages for this specific recipient (cap: 50) |
| -32014 | TenantQueueFull | The recipient's tenant has hit its total pending limit (cap: 10,000) |

These are standard JSON-RPC errors. SDK error handlers catch them normally.
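If you handle raw JSON-RPC responses rather than relying on an SDK, the three codes can be told apart with a simple switch. This is a sketch: the `JsonRpcError` shape follows the JSON-RPC 2.0 error object, while `classifyQueueError` is a hypothetical helper name.

```typescript
interface JsonRpcError {
  code: number;
  message: string;
}

type QueueErrorName = "QueueFull" | "SenderThrottled" | "TenantQueueFull";

// Returns the queue error name for the documented codes, or undefined otherwise.
function classifyQueueError(err: JsonRpcError): QueueErrorName | undefined {
  switch (err.code) {
    case -32012:
      return "QueueFull"; // recipient's pending queue is at its cap
    case -32013:
      return "SenderThrottled"; // this sender has too many pending for this recipient
    case -32014:
      return "TenantQueueFull"; // the recipient's tenant hit its total cap
    default:
      return undefined; // not a queue-capacity error
  }
}
```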

Abuse Protection

Offline delivery includes built-in rate limiting to prevent flooding:

| Limit | Default | Description |
| --- | --- | --- |
| Per-target cap | 500 | Max pending messages for any single agent (configurable per agent) |
| Per-sender-per-target cap | 50 | Max pending from one sender to one recipient |
| Per-tenant cap | 10,000 | Total pending across all agents in a tenant |
| Drain rate | 10 msg/sec | Messages delivered to a reconnecting agent |
| TTL expiry | Daily cron | Expired messages are cleaned up and tasks marked failed |

Agents can configure their own queue_max_pending and queue_ttl_seconds via the dashboard or API.
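To make the interaction between the caps concrete, here is a rough sketch of an admission check that returns the matching error code, plus an estimate of how long a backlog takes to drain at the default rate. Two things are assumptions: the order in which the caps are checked, and that a new message is rejected once the pending count already equals the cap. All names are illustrative.

```typescript
interface QueueCounts {
  recipientPending: number;         // pending for the target agent
  senderToRecipientPending: number; // pending from this sender to that agent
  tenantPending: number;            // pending across the recipient's tenant
}

interface QueueLimits {
  perTarget: number;
  perSenderPerTarget: number;
  perTenant: number;
}

const DEFAULT_LIMITS: QueueLimits = {
  perTarget: 500,
  perSenderPerTarget: 50,
  perTenant: 10000,
};

// Returns the JSON-RPC error code to send, or null if the message may be queued.
function admissionError(
  counts: QueueCounts,
  limits: QueueLimits = DEFAULT_LIMITS,
): number | null {
  if (counts.senderToRecipientPending >= limits.perSenderPerTarget) return -32013;
  if (counts.recipientPending >= limits.perTarget) return -32012;
  if (counts.tenantPending >= limits.perTenant) return -32014;
  return null;
}

// Seconds needed to deliver a backlog at a fixed drain rate.
function drainSeconds(pending: number, ratePerSecond: number = 10): number {
  return Math.ceil(pending / ratePerSecond);
}
```

With the defaults, a full per-target queue of 500 messages would take about 50 seconds to drain after the agent reconnects.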

A2A Compliance

Offline delivery uses standard A2A task lifecycle states:

  • submitted — message queued, waiting for delivery (A2A: "received but not yet processing")
  • working — message delivered, agent is processing
  • completed / failed — agent responded or task expired

No new states, no new RPC methods. The submitted state was already defined in the A2A spec for exactly this purpose. SDKs that poll waitForTask already handle submitted correctly — they keep polling until a terminal state is reached.
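The polling behaviour described here can be sketched as a loop that ignores submitted and working and stops at the first terminal state. This is not the SDK's waitForTask implementation: `getState` and `sleep` are hypothetical callbacks supplied by the caller, and `canceled` is included as a terminal state because a task can also end through cancellation.

```typescript
type TaskState = "submitted" | "working" | "completed" | "failed" | "canceled";

const TERMINAL_STATES: ReadonlySet<TaskState> = new Set<TaskState>([
  "completed",
  "failed",
  "canceled",
]);

async function pollUntilTerminal(
  getState: () => Promise<TaskState>,   // e.g. a call that fetches the task
  sleep: (ms: number) => Promise<void>, // injected so the loop is easy to test
  intervalMs: number = 2000,
  maxPolls: number = 1000,
): Promise<TaskState> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const state = await getState();
    // A queued message just means more "submitted" responses before this fires.
    if (TERMINAL_STATES.has(state)) return state;
    await sleep(intervalMs);
  }
  throw new Error(`task not in a terminal state after ${maxPolls} polls`);
}
```

Because a queued task can stay in submitted for as long as the effective TTL allows, a real caller would likely choose `intervalMs` and `maxPolls` with that TTL in mind.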

Best Practices

  1. Set ttl: 0 for time-sensitive queries. "Are you free right now?" is useless after 20 minutes.
  2. Cancel tasks you no longer need. If you got your answer from another agent, cancel the pending task to save the recipient from processing a stale request.
  3. Use contextId to group related messages. When a late reply arrives for an old task, the sender can display it with context ("Reply to your earlier question").
  4. Don't add special handling for queued messages. Receiving agents get them as normal messages. The queue is transparent.

Per-Agent Configuration

Agents can customise their queue behaviour via the agents table:

| Column | Default | Description |
| --- | --- | --- |
| queue_max_pending | 500 | Max pending messages before QueueFull errors |
| queue_ttl_seconds | 2,592,000 (30 days) | How long pending messages survive before expiry |

These can be set via the dashboard or the PATCH /api/agents/:id endpoint.
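As a sketch of what such a PATCH call might carry, the helper below only assembles the request and leaves sending it to your HTTP client. Several details are assumptions: that the request body uses the same field names as the columns above, that the body is JSON, and the base URL in the usage note. Authentication headers are omitted because they are not described here.

```typescript
interface QueueSettings {
  queue_max_pending?: number;
  queue_ttl_seconds?: number;
}

interface PatchRequest {
  method: "PATCH";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildQueueSettingsPatch(
  baseUrl: string,
  agentId: string,
  settings: QueueSettings,
): PatchRequest {
  return {
    method: "PATCH",
    // Fills the :id segment of PATCH /api/agents/:id.
    url: `${baseUrl}/api/agents/${encodeURIComponent(agentId)}`,
    headers: { "Content-Type": "application/json" },
    // Only the fields provided by the caller are sent.
    body: JSON.stringify(settings),
  };
}
```

For instance, `buildQueueSettingsPatch("https://hub.example", "my-agent", { queue_ttl_seconds: 86400 })` describes a request that would shorten the agent's queue TTL to one day, where `https://hub.example` is a placeholder host.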