
Configure ICMP ping probes

Ping a host for reachability, latency, or packet loss. The simplest layer-3 network health check.

ICMP probes ping a host on the configured interval and report one of: average round-trip latency, packet loss percentage, or plain reachability. This is the lowest-level "is this host up?" check, below HTTP / TCP / DNS.

The agent shells out to the system ping binary, so sending ICMP requires raw-socket privilege on the agent host (see "Privilege requirement" below). The agent never falls back to a TCP probe when ICMP can't run: a TCP connect tests a different layer and would hide the real failure.
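As an illustration of what "shelling out to ping" can look like, here is a minimal sketch that builds a ping command line from the probe settings. The agent's real command line is not documented here: the function names are hypothetical, and the `-c` and `-W` flags assume a Linux iputils-style ping, where `-W` is given in seconds.

```python
import math
import shutil


def build_ping_argv(host: str, count: int = 3, timeout_ms: int = 1000) -> list[str]:
    """Build an argv list for a Linux iputils-style ping (flags are an assumption)."""
    if host.startswith("-"):
        # A leading dash would be parsed by ping as an option, not as a host.
        raise ValueError("host must not start with '-'")
    # iputils documents -W in seconds, so round the millisecond timeout up.
    timeout_s = max(1, math.ceil(timeout_ms / 1000))
    # -c sets the number of pings, -W the time to wait for each reply.
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def ping_available() -> bool:
    """Return True when a ping binary can be found on PATH."""
    return shutil.which("ping") is not None
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell interpretation of the host value. Rounding up to whole seconds is a simplification of this sketch and does not reflect the sub-second timeouts the probe accepts.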

When to use ICMP vs a TCP probe

  • ICMP tests pure network reachability to a host (router, switch, bastion, internal host). It doesn't care whether any service is listening.
  • TCP tests that a specific port accepts connections, and needs no special privilege. If you only care that a service is up, prefer the TCP probe.
  • Many networks block ICMP entirely (cloud security groups, corporate firewalls). If pings are filtered, use TCP.
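To make the difference concrete, the following is a rough sketch of what a TCP connect check does, using only the Python standard library. It is not the agent's TCP probe; `tcp_port_open` is a hypothetical helper you could run by hand from the agent host.

```python
import socket


def tcp_port_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Unlike ICMP, this only succeeds when something is listening on the port, and it runs without any raw-socket privilege.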

Privilege requirement

Sending ICMP needs a raw socket, which is a privileged operation. How that privilege is granted depends on the platform:

  • Linux: CAP_NET_RAW capability (or root).
  • macOS: the system ping is setuid and works without extra setup.
  • Windows: administrator.

In a container, the agent's ping binary must be allowed to send ICMP. Grant the CAP_NET_RAW capability rather than running the whole agent as root.

systemd

[Service]
AmbientCapabilities=CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_RAW

Docker

docker run --cap-add=NET_RAW ...

Kubernetes

securityContext:
  capabilities:
    add: ["NET_RAW"]

Some base images don't ship ping. On Debian/Ubuntu, install the iputils-ping package (apt-get install iputils-ping); on Alpine, run apk add iputils. Without the binary, the probe reports icmp_unavailable.

When the agent lacks the privilege, the probe reports icmp_privilege_denied with a pointer to this page. It does not silently degrade to a different check.
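On Linux, one way to see whether a process actually holds the capability is to read the `CapEff` mask from its `/proc/<pid>/status` file. The sketch below is a manual diagnostic under that assumption, not something the agent is documented to do. The helper names are hypothetical; the bit position 13 for CAP_NET_RAW comes from the kernel's capability numbering.

```python
CAP_NET_RAW_BIT = 13  # CAP_NET_RAW in the Linux capability numbering


def has_cap_net_raw(cap_eff_hex: str) -> bool:
    """Return True if a hex CapEff mask includes CAP_NET_RAW."""
    return bool(int(cap_eff_hex, 16) & (1 << CAP_NET_RAW_BIT))


def read_cap_eff(status_path: str = "/proc/self/status") -> str:
    """Read the CapEff value from a /proc/<pid>/status file (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found in " + status_path)
```

To check a running agent, pass `/proc/<agent pid>/status` as `status_path` and feed the result to `has_cap_net_raw`.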

Configuration shape

{
  "host": "db.internal",
  "count": 3,
  "timeout_ms": 1000,
  "interpretation": "latency"
}

Field reference

| Field | Default | Notes |
|---|---|---|
| host | required | Hostname or IPv4 address. Must not start with -. ICMPv4 only. |
| count | 3 | Pings per probe, 1 to 10. Latencies are averaged across the successful ones. |
| timeout_ms | 1000 | Per-ping timeout, 100 to 5000 ms. |
| interpretation | latency | What the metric value represents: latency, packet_loss, or reachability. |
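The defaults and ranges in the field reference can be expressed as a small validation routine. This is a sketch of the documented rules only; `normalize_icmp_config` is a hypothetical name, and the agent's own validation and error messages may differ.

```python
ALLOWED_INTERPRETATIONS = {"latency", "packet_loss", "reachability"}


def normalize_icmp_config(raw: dict) -> dict:
    """Apply the documented defaults and range checks; raise ValueError on bad input."""
    host = raw.get("host")
    if not isinstance(host, str) or not host:
        raise ValueError("host is required")
    if host.startswith("-"):
        raise ValueError("host must not start with '-'")

    count = raw.get("count", 3)
    if not isinstance(count, int) or not 1 <= count <= 10:
        raise ValueError("count must be an integer from 1 to 10")

    timeout_ms = raw.get("timeout_ms", 1000)
    if not isinstance(timeout_ms, int) or not 100 <= timeout_ms <= 5000:
        raise ValueError("timeout_ms must be an integer from 100 to 5000")

    interpretation = raw.get("interpretation", "latency")
    if interpretation not in ALLOWED_INTERPRETATIONS:
        raise ValueError("interpretation must be latency, packet_loss, or reachability")

    return {
        "host": host,
        "count": count,
        "timeout_ms": timeout_ms,
        "interpretation": interpretation,
    }
```

For example, `normalize_icmp_config({"host": "db.internal"})` fills in a count of 3, a timeout of 1000 ms, and the latency interpretation.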

Interpretations

| Interpretation | Value | Threshold idea |
|---|---|---|
| latency | Average RTT in ms across successful pings. no_data (icmp_all_timeout) when every ping is lost. | Healthy: under 50, Unhealthy: over 200 |
| packet_loss | Percentage of pings that didn't return (0 to 100). | Healthy: under 1, Unhealthy: over 10 |
| reachability | 1 if any ping succeeded, else 0. | Healthy: over 0, Unhealthy: under 1 |

Pick reachability for "is this host up at all", latency for network quality, packet_loss for flaky links.
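The three interpretations differ only in how one set of ping results is reduced to a single value. The sketch below shows that arithmetic as described above. The function name, the `None`-for-lost-ping convention, and the returned dictionary shape are assumptions made for illustration.

```python
from typing import Optional


def interpret(rtts_ms: list[Optional[float]], interpretation: str) -> dict:
    """Reduce per-ping results to one metric value.

    rtts_ms holds one entry per ping sent; None marks a ping that got no reply.
    """
    if not rtts_ms:
        raise ValueError("at least one ping result is required")
    ok = [r for r in rtts_ms if r is not None]

    if interpretation == "latency":
        if not ok:
            # Nothing to average: report no data with the all-timeout reason.
            return {"value": None, "reason": "icmp_all_timeout"}
        return {"value": sum(ok) / len(ok), "reason": None}
    if interpretation == "packet_loss":
        lost = len(rtts_ms) - len(ok)
        return {"value": 100.0 * lost / len(rtts_ms), "reason": None}
    if interpretation == "reachability":
        return {"value": 1 if ok else 0, "reason": None}
    raise ValueError("unknown interpretation: " + interpretation)
```

With results of 10 ms, lost, and 20 ms, latency averages the two successful pings (15 ms), packet_loss is one lost out of three, and reachability is 1.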

Reason codes

  • icmp_privilege_denied: the agent can't open a raw socket. Grant CAP_NET_RAW and restart the agent.
  • icmp_unavailable: no usable ping binary on the agent host. Install iputils-ping / iputils.
  • icmp_dns_failed: the hostname didn't resolve. Distinct from "host unreachable". Check the name and the agent's DNS.
  • icmp_all_timeout: every ping timed out (latency interpretation only). The host is down or ICMP is filtered. With packet_loss this reads as 100; with reachability it reads as 0.
  • icmp_error: an uncategorised ping failure. Check the agent log.

Troubleshooting

  • icmp_privilege_denied even after adding CAP_NET_RAW. The capability must reach the agent process, not just the container. For systemd use AmbientCapabilities (not only CapabilityBoundingSet). For Kubernetes confirm the pod's securityContext adds NET_RAW and no policy strips it.
  • icmp_all_timeout but the host is up. The network likely blocks ICMP. Confirm with a TCP probe against a known-open port; if TCP succeeds, switch this metric to TCP.
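  • icmp_dns_failed for an IP literal. IP literals never go through DNS, so this code suggests the value isn't being read as a valid IPv4 address. Check for typos such as an out-of-range octet or stray characters.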
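The last item above can be reproduced by hand. The sketch below assumes, without confirmation from the agent's source, that a host value is treated as a literal only when it parses as an IPv4 address and is otherwise resolved through DNS. Both function names are hypothetical.

```python
import ipaddress
import socket


def is_ipv4_literal(host: str) -> bool:
    """Return True if host parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


def resolve_ipv4(host: str) -> str:
    """Return an IPv4 address for host, skipping DNS for literals."""
    if is_ipv4_literal(host):
        return host
    try:
        infos = socket.getaddrinfo(host, None, family=socket.AF_INET, type=socket.SOCK_DGRAM)
    except socket.gaierror as exc:
        # Mirrors the documented reason code for a failed lookup.
        raise LookupError("icmp_dns_failed") from exc
    return infos[0][4][0]
```

A value such as `10.0.0.256` fails `is_ipv4_literal`, so under this assumption it would be looked up as a hostname and fail resolution.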