The local queue
Why the agent buffers status pushes locally, and how the buffer behaves under cloud unreachability.
Status pushes are written to a local SQLite file before the agent attempts to deliver them to the cloud. This file is the agent's durability layer: queued pushes survive container restarts, daemon restarts, and periods when the cloud is temporarily unreachable.
Behaviour
- Every status push enqueues a row in the local SQLite file (BUFFER_PATH, default ./observer-agent-buffer.db).
- A background drain controller pulls batches from the queue and posts them to the cloud's /api/agent/receiver endpoint.
- Successful posts ack and remove rows from the queue.
- Failed posts back off exponentially. The queue continues to accept new pushes during the outage.
- When the queue reaches BUFFER_MAX_ROWS (default 10000), oldest entries are evicted to admit new ones.
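The enqueue, ack, and eviction behaviour listed above can be sketched with Python's standard sqlite3 module. This is a minimal illustration, not the agent's actual implementation: the pushes table, its columns, and the function names open_buffer, enqueue, next_batch, and ack are all hypothetical, and the batch size of 100 is an assumption. Only the file path default and the 10000-row cap come from the text.

```python
import sqlite3
import time

BUFFER_MAX_ROWS = 10000  # documented default for the row cap


def open_buffer(path="./observer-agent-buffer.db"):
    """Open (or create) the buffer file. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pushes ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " enqueued_at REAL NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn, payload, max_rows=BUFFER_MAX_ROWS, now=None):
    """Insert one push, then evict the oldest rows beyond max_rows."""
    now = time.time() if now is None else now
    with conn:  # insert and eviction commit together as one transaction
        conn.execute(
            "INSERT INTO pushes (enqueued_at, payload) VALUES (?, ?)",
            (now, payload),
        )
        # Keep only the newest max_rows rows; anything older is dropped.
        conn.execute(
            "DELETE FROM pushes WHERE id NOT IN ("
            " SELECT id FROM pushes ORDER BY id DESC LIMIT ?)",
            (max_rows,),
        )


def next_batch(conn, size=100):
    """Return up to `size` of the oldest pending rows as (id, payload)."""
    return conn.execute(
        "SELECT id, payload FROM pushes ORDER BY id LIMIT ?", (size,)
    ).fetchall()


def ack(conn, ids):
    """Remove rows that the cloud accepted."""
    with conn:
        conn.executemany("DELETE FROM pushes WHERE id = ?", [(i,) for i in ids])
```

A drain loop would call next_batch, post the payloads, and call ack only after a successful post, so a failed post leaves the rows in place for the next attempt.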
The cloud is the source of truth for historical data. The local queue is a write-ahead log that protects against transient cloud failures, not a long-term store.
What the operator sees
The agent dashboard's queue panel shows three live numbers:
- depth: rows currently waiting.
- oldest_age_seconds: age of the oldest pending row.
- drain_backoff_ms: current backoff between drain attempts.
A growing depth combined with a non-zero backoff is the signature of cloud unreachability. Once the cloud is reachable again, the queue drains and depth returns to near zero.
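As a rough illustration of where these three numbers could come from, the sketch below derives depth and oldest_age_seconds from a queue table and drain_backoff_ms from a count of consecutive failed posts. It assumes a hypothetical pushes table with an enqueued_at column holding Unix timestamps. The base delay of 500 ms and the 60-second cap are assumptions as well; the source only says that the backoff is exponential.

```python
import sqlite3
import time

BASE_BACKOFF_MS = 500    # assumed starting delay, not from the docs
MAX_BACKOFF_MS = 60_000  # assumed ceiling, not from the docs


def backoff_ms(consecutive_failures):
    """Exponential backoff: 0 when healthy, doubling per failure, capped."""
    if consecutive_failures <= 0:
        return 0
    return min(MAX_BACKOFF_MS, BASE_BACKOFF_MS * 2 ** (consecutive_failures - 1))


def queue_stats(conn, consecutive_failures, now=None):
    """Compute the three queue panel numbers from the buffer table."""
    now = time.time() if now is None else now
    depth, oldest = conn.execute(
        "SELECT COUNT(*), MIN(enqueued_at) FROM pushes"
    ).fetchone()
    return {
        "depth": depth,
        # MIN() yields NULL on an empty table, so report an age of zero.
        "oldest_age_seconds": 0 if oldest is None else max(0, int(now - oldest)),
        "drain_backoff_ms": backoff_ms(consecutive_failures),
    }
```

Under these assumptions, a healthy agent reports a backoff of zero, so any non-zero drain_backoff_ms means the most recent drain attempt failed.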
Queue-driven alerts
The agent reports the queue numbers on every heartbeat. A sustained
high queue (depth above 1000, or an oldest pending push older than
300 seconds) raises an agent.lag_high alert that surfaces on the
agent detail page and as a webhook event when subscribed. Once the
queue drains, the alert clears.
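The alert condition can be sketched as a small state machine evaluated on each heartbeat. The two thresholds come from the text above. The LagAlert class is hypothetical, as is reading "sustained" as three consecutive breaching heartbeats, since the source does not define the window. Clearing on the first heartbeat back under both thresholds is also an assumption.

```python
from dataclasses import dataclass

DEPTH_THRESHOLD = 1000        # from the text: depth above 1000
OLDEST_AGE_THRESHOLD_S = 300  # from the text: oldest push older than 300 s
SUSTAINED_HEARTBEATS = 3      # assumption: the docs do not define "sustained"


@dataclass
class LagAlert:
    """Tracks whether agent.lag_high should be raised for one agent."""

    breaches: int = 0
    active: bool = False

    def on_heartbeat(self, depth, oldest_age_seconds):
        """Update state from one heartbeat and return whether the alert is active."""
        high = (
            depth > DEPTH_THRESHOLD
            or oldest_age_seconds > OLDEST_AGE_THRESHOLD_S
        )
        # Count consecutive breaching heartbeats; any healthy one resets it.
        self.breaches = self.breaches + 1 if high else 0
        if self.breaches >= SUSTAINED_HEARTBEATS:
            self.active = True
        elif not high:
            self.active = False  # queue has drained, so the alert clears
        return self.active
```

Requiring several consecutive breaches keeps a short burst of pushes from raising the alert, at the cost of a few heartbeats of delay before a real outage is flagged.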