Configure ICMP ping probes

Ping a host for reachability, latency, or packet loss. A layer-3 network health check.

ICMP probes ping a host on the configured interval and report one of: average round-trip latency, packet loss percentage, or plain reachability. This is the lowest-level "is this host up?" check, below HTTP / TCP / DNS.

The agent shells out to the system ping binary, so ICMP needs raw-socket privilege on the agent (see below). The agent never falls back to a TCP probe when ICMP can't run: a TCP connect tests a different layer and would hide the real failure.

When to use ICMP vs a TCP probe#

ICMP tests pure network reachability to a host (router, switch, bastion, internal host). It doesn't care whether any service is listening.
TCP tests that a specific port accepts connections, and runs unprivileged. If you only care that a service is up, prefer the TCP probe (no special privilege needed).
Many networks block ICMP entirely (cloud security groups, corporate firewalls). If pings are filtered, use TCP.

Privilege requirement#

Sending ICMP needs a raw socket, which is privileged on every platform:

Linux: CAP_NET_RAW capability (or root).
macOS: the system ping is setuid and works without extra setup.
Windows: administrator.

In a container the agent's ping binary must be allowed to send ICMP. Grant the capability rather than running the whole agent as root.

systemd#

[Service]
AmbientCapabilities=CAP_NET_RAW
CapabilityBoundingSet=CAP_NET_RAW

Docker#

docker run --cap-add=NET_RAW ...

Kubernetes#

securityContext:
  capabilities:
    add: ["NET_RAW"]

Some base images don't ship ping. On Debian/Ubuntu install iputils-ping; on Alpine, apk add iputils. Without the binary the probe reports icmp_unavailable.

When the agent lacks the privilege, the probe reports icmp_privilege_denied with a pointer to this page. It does not silently degrade to a different check.

Configuration shape#

{
  "host": "db.internal",
  "count": 3,
  "timeout_ms": 1000,
  "interpretation": "latency"
}

Field reference#

Field	Default	Notes
`host`	required	Hostname or IPv4 address. Must not start with `-`. ICMPv4 only.
`count`	`3`	Pings per probe, 1 to 10. Latencies are averaged across the successful ones.
`timeout_ms`	`1000`	Per-ping timeout, 100 to 5000 ms.
`interpretation`	`latency`	What the metric value represents: `latency`, `packet_loss`, or `reachability`.

Interpretations#

Interpretation	Value	Threshold idea
`latency`	Average RTT in ms across successful pings. `no_data` (`icmp_all_timeout`) when every ping is lost.	Healthy: under 50, Unhealthy: over 200
`packet_loss`	Percentage of pings that didn't return (0 to 100).	Healthy: under 1, Unhealthy: over 10
`reachability`	`1` if any ping succeeded, else `0`.	Healthy: over 0, Unhealthy: under 1

Pick reachability for "is this host up at all", latency for network quality, packet_loss for flaky links.

Reason codes#

icmp_privilege_denied: the agent can't open a raw socket. Grant CAP_NET_RAW and restart.
icmp_unavailable: no usable ping binary on the agent host. Install iputils-ping / iputils.
icmp_dns_failed: the hostname didn't resolve. Distinct from "host unreachable". Check the name and the agent's DNS.
icmp_all_timeout: every ping timed out (latency interpretation only). The host is down or ICMP is filtered. With packet_loss this reads as 100; with reachability it reads as 0.
icmp_error: an uncategorised ping failure. Check the agent log.

Troubleshooting#

icmp_privilege_denied even after adding CAP_NET_RAW. The capability must reach the agent process, not just the container. For systemd use AmbientCapabilities (not only CapabilityBoundingSet). For Kubernetes confirm the pod's securityContext adds NET_RAW and no policy strips it.
icmp_all_timeout but the host is up. The network likely blocks ICMP. Confirm with a TCP probe against a known-open port; if TCP succeeds, switch this metric to TCP.
icmp_dns_failed for an IP literal. Check the value isn't mistyped; IP literals never go through DNS.