
How Does Server Health Monitoring Influence CDN Traffic Routing Decisions?

Edward Tsinovoi
Traffic Routing
February 25, 2026

Server health monitoring is the CDN’s steering wheel. When an edge server, PoP, or your origin starts failing checks, timing out, or slowing down, the CDN changes its routing so less traffic goes there and more goes to healthier locations.

That shift can happen through CDN DNS answers (different IPs returned), through Anycast and internal load balancing (a different PoP wins), or right at the edge while handling the request (retries, failover, stale cache). In practice, “closest” is only the starting guess. 

Health and speed decide who actually serves the user.

How The Health-To-Routing Loop Works

A CDN keeps making one decision: “Where will this request succeed quickly right now?” Server health monitoring turns that into a live feedback loop:

  • Measure availability and performance in multiple places
  • Label targets as healthy, degraded, or failing
  • Update routing weights and eligibility
  • Watch results and adjust again
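The loop above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CDN does it: the thresholds, the `Measurement` fields, and the weight values are all assumptions I picked for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
DEGRADED_ERROR_RATE = 0.02
FAILING_ERROR_RATE = 0.20
DEGRADED_P95_MS = 800.0

# Hypothetical mapping from health label to routing weight.
WEIGHT_BY_LABEL = {"healthy": 100, "degraded": 25, "failing": 0}


@dataclass
class Measurement:
    target: str            # PoP, edge server, or origin name
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th percentile latency
    probe_ok: bool         # result of the latest active check


def label(m: Measurement) -> str:
    """Step 2: label a target as healthy, degraded, or failing."""
    if not m.probe_ok or m.error_rate >= FAILING_ERROR_RATE:
        return "failing"
    if m.error_rate >= DEGRADED_ERROR_RATE or m.p95_latency_ms >= DEGRADED_P95_MS:
        return "degraded"
    return "healthy"


def routing_weights(measurements):
    """Step 3: turn labels into weights; weight 0 means ineligible."""
    return {m.target: WEIGHT_BY_LABEL[label(m)] for m in measurements}
```

Running `routing_weights` again on every fresh batch of measurements is the "watch results and adjust again" step.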

I think of it like GPS rerouting. The map is static, but the route changes when traffic or road closures show up.

What Health Signals The CDN Watches

“Healthy” is not just “ping works.” CDNs blend active probes and real user telemetry so they do not get fooled by one perfect-looking endpoint.

Here are the signals that most often influence routing:

  • Active checks: HTTP/TCP/TLS probes that confirm reachability and correct responses
  • Passive telemetry: real request error rates (mainly 5xx, plus abnormal spikes in 4xx), timeouts, connection failures, cache hit ratio, and upstream behavior
  • Performance monitoring: latency, time-to-first-byte, queueing, saturation, and retry rates that indicate “up, but struggling”
  • Network health: packet loss, route instability, and congestion that can break one region while everything looks fine elsewhere

This is where real-time monitoring matters. If loss or errors spike for 90 seconds, the difference between reacting in seconds versus minutes is the difference between “minor slowdown” and “people think you’re offline.”
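To make "reacting in seconds" concrete, here is a rough sketch of a sliding-window error rate, which is one common way to notice a short spike quickly. The class name and the window length are my own assumptions, not something a specific CDN exposes.

```python
from collections import deque


class SlidingErrorRate:
    """Error rate over the most recent `window_seconds` of requests."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, is_error), oldest first

    def _evict(self, now: float) -> None:
        # Drop events that fell out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def record(self, timestamp: float, is_error: bool) -> None:
        self._events.append((timestamp, is_error))
        self._evict(timestamp)

    def rate(self, now: float) -> float:
        self._evict(now)
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)
```

A short window reacts fast but is noisy; a long window is stable but slow. That trade-off is exactly the seconds-versus-minutes gap described above.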

Where Routing Decisions Actually Change

Health data has to turn into an actual steering action. CDNs usually apply it at three layers, and you benefit when all three exist.

  1. DNS layer (CDN DNS steering)
    The CDN’s authoritative DNS decides which PoP or edge IPs to return for your hostname. If a PoP is failing, it stops being returned. If it is degraded, it might still be returned, but with a lower weight or lower priority. DNS is great for broad regional steering, but resolvers cache answers, so a change only takes full effect once cached answers expire, which is roughly bounded by the record’s TTL.
  2. Network layer (Anycast and BGP behavior)
    With Anycast, the same IP can land users in different locations. Health can influence route announcements or preferences so an unhealthy PoP attracts less traffic. Inside a PoP, unhealthy edge machines can be removed from rotation immediately, which is much faster than waiting for DNS caches.
  3. Request layer (edge-time decisions)
    Even after a user reaches an edge location, the edge can decide what to do next. If an origin fetch fails, it can retry, pick a different upstream, fail over to a backup origin or shield, or serve stale content if your cache rules allow it. This layer is often what makes incidents feel “contained.”
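As a small illustration of the DNS layer, here is a sketch of weighted answer selection where failing PoPs (weight 0) are never returned and degraded PoPs are returned less often. The function name and the fail-open fallback are assumptions for the example; real CDN mapping systems are far more involved.

```python
import random


def pick_pop(weights, rng=random):
    """Pick one PoP name, proportionally to its routing weight.

    `weights` maps PoP name to a non-negative weight; 0 means ineligible.
    """
    if not weights:
        raise ValueError("no PoPs configured")
    eligible = {pop: w for pop, w in weights.items() if w > 0}
    if not eligible:
        # Assumption: if every PoP looks unhealthy, fail open and treat
        # them all as equal rather than returning no answer at all.
        eligible = {pop: 1 for pop in weights}
    pops = list(eligible)
    return rng.choices(pops, weights=[eligible[p] for p in pops], k=1)[0]
```

For example, `pick_pop({"pop-a": 100, "pop-b": 25, "pop-c": 0})` returns `pop-a` most of the time, `pop-b` occasionally, and never `pop-c`.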

Health Signals And The Routing Outcomes You Get

| Health Signal Observed    | What The CDN Assumes               | Typical Routing Move                         | User Impact                   |
|---------------------------|------------------------------------|----------------------------------------------|-------------------------------|
| Edge PoP 5xx rate climbs  | Edge services or upstreams failing | Lower weight or remove PoP from mapping      | Fewer hard failures           |
| Origin timeouts increase  | Origin slow/down or path broken    | Fail over to backup origin or shield         | More latency, but pages load  |
| TLS handshakes fail       | Network or TLS stack trouble       | Stop returning those IPs, reroute internally | Fewer “cannot connect” errors |
| Latency spikes, low errors | Congestion or overload            | Prefer alternate PoPs, cap new connections   | More consistent speed         |

Notice what’s missing: “closest wins.” Proximity is helpful, but health is decisive.

Origin Health And Failover Decisions

If the origin is unhealthy, routing gets expensive fast, because cache misses and dynamic requests have nowhere good to go. Health-driven routing commonly does a few things:

  • Switch from a primary origin to a backup origin when failure thresholds are met
  • Spill over across regions when one origin cluster is down
  • Adjust origin shield selection so the shield does not become a single bad choke point
  • Serve stale-on-error for cacheable content, so users still get something useful while you recover

You do not need to memorize vendor terms. The idea is simple: the CDN prefers the healthiest upstream that can satisfy the request with the least damage.
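Here is a minimal sketch of that preference order: try the primary origin, then the backup, then fall back to a stale copy. Everything named here (`OriginError`, `fetch_with_failover`, the retry count, the dict used as a stale cache) is hypothetical and stands in for whatever your CDN actually configures.

```python
from typing import Callable, Dict, Sequence, Tuple


class OriginError(Exception):
    """Raised by a fetcher when an origin times out or returns an error."""


def fetch_with_failover(
    path: str,
    origins: Sequence[Callable[[str], bytes]],  # ordered: primary first
    stale_cache: Dict[str, bytes],
    attempts_per_origin: int = 2,               # assumed retry budget
) -> Tuple[bytes, str]:
    """Return (body, source) where source is "fresh" or "stale"."""
    for fetch in origins:
        for _ in range(attempts_per_origin):
            try:
                body = fetch(path)
            except OriginError:
                continue  # retry, then move on to the next origin
            stale_cache[path] = body  # remember the last good copy
            return body, "fresh"
    if path in stale_cache:
        # Stale-on-error: an old copy beats an error page.
        return stale_cache[path], "stale"
    raise OriginError(f"all origins failed for {path}")
```

The stale branch only helps for content that was fetched successfully at least once, which is why it covers cacheable content and not first-time or fully dynamic requests.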

Settings That Make Health-Based Routing Work

Most routing failures are not “CDN bugs.” They come from weak checks, bad thresholds, or slow propagation.

A practical checklist that keeps decisions sane:

  • Make health checks represent real readiness, not just “the process is running.”
  • Run checks from multiple regions so you can tell local path issues from global outages.
  • Separate degraded from dead so the CDN can shift traffic gradually instead of flipping everything at once.
  • Keep DNS TTL aligned with your tolerance for reroute delay, and rely on Anycast and request-layer logic to cover DNS caching.
  • Choose server health monitoring tools that alert quickly, avoid flapping, and integrate cleanly with your CDN controls.
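One item on that list, avoiding flapping, is easy to show in code. A common approach is to require several consecutive failures before marking a target down and several consecutive successes before bringing it back. The sketch below assumes that approach; the class name and the default `fall` and `rise` counts are illustrative choices.

```python
class FlapDampedState:
    """Health state that only flips after a streak of contrary results."""

    def __init__(self, fall: int = 3, rise: int = 5):
        self.fall = fall      # consecutive failures needed to mark unhealthy
        self.rise = rise      # consecutive successes needed to mark healthy
        self.healthy = True
        self._streak = 0      # consecutive results contradicting current state

    def observe(self, check_passed: bool) -> bool:
        """Feed one check result; return the (possibly updated) state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state
        else:
            self._streak += 1
            needed = self.fall if self.healthy else self.rise
            if self._streak >= needed:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy
```

Setting `rise` higher than `fall` makes the target quick to leave rotation and slower to return, so one lucky check during an incident does not send traffic back too early.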

Do this right and the CDN routes around trouble automatically. Do it wrong and I promise you the CDN will keep sending users to the place that is failing, because you never gave it a trustworthy signal to stop.
