Building an Enterprise Anycast CDN at the Network Edge
This series is a theory — my theory. It is not presented as a standard, a prescription, or a finished product, but as a deliberate exploration of an idea that emerges from operating large networks over time. Some parts are well‑understood practices; others are hypotheses tested through reasoning, experience, and constraint. Like any good theory, it is meant to be examined, challenged, adapted, and occasionally rejected. What follows is an attempt to think clearly and honestly about what might be possible, not to declare what must be done.

Section 5 — Advertising Service Truth (and Why Withdrawal Matters More Than Selection)
Up to this point, the architecture has focused on where traffic enters and how decisions are made after it arrives. What remains is the question of what information those decisions are based on.
In many systems, this is where complexity creeps in. Engineers attempt to build increasingly clever selection logic: choosing the "best" site, the "fastest" backend, or the "least loaded" cache. While these approaches can work in tightly controlled environments, they tend to degrade badly when distributed across many sites.
This design takes a different approach. Rather than trying to be clever about selection, it focuses on being precise about truth.
Services Advertise Themselves
In this model, services are responsible for advertising their own availability into the overlay routing domain.
If a service is healthy at a site, that site advertises a specific service identity — typically a /32 — into the overlay. If the service becomes unhealthy, the advertisement is withdrawn.
There is no central controller inferring state. There is no external system making guesses.
The closest component to the service declares the truth.
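As a minimal sketch of this idea, the loop below turns a local health probe into announce and withdraw signals. Everything named here is hypothetical: the `ServiceAdvertiser` class, the example prefix, and the command strings, which are loosely modeled on the text commands that ExaBGP-style helper processes accept. Treat the exact syntax as an assumption rather than a reference.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ServiceAdvertiser:
    """Tracks one service identity and emits a command only on state change."""

    prefix: str                  # hypothetical service identity, e.g. a /32
    check: Callable[[], bool]    # local health probe supplied by the service
    advertised: bool = False     # nothing is advertised until proven healthy

    def step(self) -> Optional[str]:
        """Run one health check; return a command string if the state changed."""
        try:
            healthy = bool(self.check())
        except Exception:
            healthy = False      # a probe that crashes counts as unhealthy

        if healthy and not self.advertised:
            self.advertised = True
            return f"announce route {self.prefix} next-hop self"
        if not healthy and self.advertised:
            self.advertised = False
            return f"withdraw route {self.prefix} next-hop self"
        return None              # no change, nothing to signal
```

In use, a small agent next to the service would call `step()` on a timer and hand any returned string to whatever speaks the routing protocol. The point of the sketch is only that the decision is made locally and emitted as a state change, not polled by a remote controller.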
Why Withdrawal Is More Important Than Selection
It is tempting to think that the hard problem is choosing the best destination. In practice, the far more important problem is knowing when a destination should not be used at all.
Explicit withdrawal has several advantages:
- It is unambiguous
- It converges quickly, at the speed of the routing protocol rather than of a polling or scoring loop
- It avoids partial or stale state
- It aligns naturally with routing behavior
When a service withdraws its advertisement, it is simply no longer a candidate. No special logic is required elsewhere.
By contrast, attempting to rank or score destinations requires shared assumptions, synchronized metrics, and careful tuning — all of which become brittle at scale.
Partial Failures Become First-Class
Because service reachability is signaled explicitly, partial failures are handled cleanly.
For example:
- A node may remain reachable via anycast
- Other services at that node may remain healthy
- Only one specific service is withdrawn
Traffic for that service will naturally flow elsewhere, without disturbing unrelated traffic.
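This is difficult to achieve when health is inferred indirectly, for example from node-level reachability: a node that still answers probes looks healthy even when one of its services is not.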
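The partial-failure case can be illustrated with a small sketch. The service names and prefixes are invented for the example, and the two helper functions are hypothetical; the only claim is that per-service advertisement lets one identity be withdrawn while the others stay untouched.

```python
from typing import Dict, Set, Tuple


def advertised(services: Dict[str, str], health: Dict[str, bool]) -> Set[str]:
    """Return the prefixes this node should advertise: one per healthy service."""
    return {
        prefix
        for name, prefix in services.items()
        if health.get(name, False)   # unknown health is treated as unhealthy
    }


def diff(before: Set[str], after: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_announce, to_withdraw) between two advertisement sets."""
    return after - before, before - after


# Hypothetical services hosted on one node, each with its own identity.
services = {
    "video": "192.0.2.10/32",
    "images": "192.0.2.11/32",
    "api": "192.0.2.12/32",
}

all_healthy = advertised(services, {"video": True, "images": True, "api": True})
one_failed = advertised(services, {"video": True, "images": False, "api": True})

to_announce, to_withdraw = diff(all_healthy, one_failed)
```

Here `to_withdraw` contains only the prefix for `images`, and `to_announce` is empty: the node stays reachable and its other two services keep their advertisements.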
What the Overlay Sees
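From the overlay's perspective, there is no concept of service "load" or application-level "preference." There are only routes that exist and routes that do not, and any choice among existing routes is made on ordinary routing attributes.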
If multiple sites advertise the same service identity:
- All are considered valid
- The overlay may choose among them based on topology or cost
If no site advertises the service:
- The service is unavailable
- That fact is explicit and visible
The overlay does not guess.
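A toy model of the overlay's view may make this concrete. The `OverlayTable` class, site names, and cost values are assumptions for illustration; a real overlay would hold this state in its routing protocol, not in a dictionary. Note that `cost` here stands for a topological routing metric, not service load.

```python
from typing import Dict, Optional


class OverlayTable:
    """Toy routing view: a prefix either has advertising sites or it does not."""

    def __init__(self) -> None:
        # prefix -> {site name: topological cost to reach that site}
        self._routes: Dict[str, Dict[str, int]] = {}

    def advertise(self, prefix: str, site: str, cost: int) -> None:
        self._routes.setdefault(prefix, {})[site] = cost

    def withdraw(self, prefix: str, site: str) -> None:
        sites = self._routes.get(prefix)
        if sites is None:
            return                      # nothing advertised, nothing to do
        sites.pop(site, None)
        if not sites:
            del self._routes[prefix]    # last advertiser gone: route is absent

    def select(self, prefix: str) -> Optional[str]:
        """Pick the lowest-cost advertising site, or None if the route is absent."""
        sites = self._routes.get(prefix)
        if not sites:
            return None                 # explicit: the service is unavailable
        # Ties are broken by site name so the choice is deterministic.
        return min(sites, key=lambda s: (sites[s], s))
```

For example, if hypothetical sites `fra` (cost 10) and `ams` (cost 20) both advertise `192.0.2.10/32`, `select` returns `fra`; after `fra` withdraws it returns `ams`; after both withdraw it returns `None`. At no point does the table score, rank, or estimate anything about the service itself.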
Aligning Routing With Reality
Routing works best when it reflects reality rather than attempting to predict it.
By reducing service state to a simple binary signal — present or absent — the system avoids many subtle failure modes:
- Stale health information
- Split-brain decisions
- Oscillation based on marginal metrics
This simplicity is intentional.
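One way to derive such a binary signal from noisy probe results is a rise/fall counter, sketched below. This is an assumption about how the signal might be produced, not something the design prescribes; the `BinaryHealth` class and its thresholds are hypothetical.

```python
class BinaryHealth:
    """Reduce a stream of probe results to a present/absent signal.

    The state flips only after `rise` consecutive successes (absent -> present)
    or `fall` consecutive failures (present -> absent), so a single marginal
    probe result cannot cause an advertise/withdraw flap.
    """

    def __init__(self, rise: int = 2, fall: int = 3) -> None:
        self.rise = rise
        self.fall = fall
        self.present = False   # start absent until health is demonstrated
        self._streak = 0       # consecutive results disagreeing with the state

    def observe(self, ok: bool) -> bool:
        """Record one probe result and return the current binary state."""
        if ok == self.present:
            self._streak = 0   # result agrees with the state: reset the streak
            return self.present
        self._streak += 1
        needed = self.rise if ok else self.fall
        if self._streak >= needed:
            self.present = ok
            self._streak = 0
        return self.present
```

The thresholds trade a little withdrawal latency for stability: a larger `fall` value tolerates more transient probe failures but delays the withdrawal of a service that is really down.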
In the next section, we will look at how private transport fits into this picture — and how it can be used to improve performance without becoming a dependency or a source of implicit trust.