Tech Notes

1. 27.1 - Anycast with BGP /32

Anycast publishes the same service IP from multiple locations. For a DNS resolver, CDN edge, or DDoS scrubber, the application sees one address while BGP policy decides which PoP receives each client's packets.

Failover is a routing event: a health agent withdraws the host route, upstream routers run normal best-path selection, and new packets move to another PoP after convergence.

Minimal C Demo - Anycast Route Selection

2. 27.2 - LVS / IPVS Modes

LVS is a kernel load balancer. The Director owns the VIP and IPVS picks a real server. The key design choice is where the return packet goes and whether the real server can share L2, hold the VIP locally, or live behind a tunnel.

Mode	Header action	Return path	Main constraint
NAT	DNAT request, SNAT response.	Through Director.	Director handles both directions.
DR	Destination MAC rewrite; IP destination remains VIP.	Direct from RS to client.	Same L2 and ARP suppression on RS.
TUN	IP-in-IP encapsulation to RS.	Direct from RS to client.	RS must terminate tunnel and hold VIP on loopback.
FULLNAT	Source and destination NAT.	Often through LB tier.	Original client IP needs TOA or proxy metadata.

Minimal C Demo - LVS Header Transform

Minimal C Demo - LVS Scheduling

3. 27.3 and 27.4 - L4/L7 Load Balancers

Maglev and Katran solve high-scale L4 distribution with stable flow hashing. Maglev uses a large lookup table so every LB instance makes the same backend choice; Katran pushes the fast path into XDP and BPF maps.

Minimal C Demo - Maglev Slot Stability

HAProxy and Envoy operate higher in the stack. HAProxy is direct and operationally compact: frontends, backends, ACLs, health checks, stick tables, and a runtime socket. Envoy is a programmable proxy with xDS, filter chains, clusters, retries, outlier detection, circuit breakers, and rich telemetry.

Minimal C Demo - HAProxy Stick Table

Minimal C Demo - Envoy Circuit Breaker

4. 27.5 - conntrackd

Active/backup firewalls and NAT load balancers need more than VIP movement. Existing TCP sessions also depend on Linux conntrack tuples, sequence state, and NAT mappings. conntrackd replicates that state to the standby while the active node is still healthy, so the standby already holds it when the failure happens.

Minimal C Demo - Conntrack State Sync

5. 27.6 - Pacemaker / Corosync

Corosync supplies cluster messaging, membership, and quorum. Pacemaker decides where resources should run, calls resource agents to start or monitor them, and requires fencing before a survivor takes over shared state.

STONITH is not ceremony. If a two-node cluster loses communication, both nodes can believe they are the survivor. Fencing proves the old owner is dead before promoting a database, VIP, or shared filesystem elsewhere.

Minimal C Demo - Split-Brain Prevention

6. 27.7 - DNS Failover

DNS failover is useful for regional or provider-level steering, but it is not instant. For a client that cached the old record just before the failure, recovery takes the sum of health-check detection, the authoritative record change, the TTL remaining in recursive resolver caches, and the application's next retry.

Technique	Best at	Weakness
DNS failover	Coarse regional steering and primary/secondary service records.	Bounded by caches and client retry behavior.
BGP anycast	Infrastructure-level nearest PoP selection and DDoS absorption.	Needs routing control and careful state handling.
L4 LB failover	Fast local service failover with health checks.	Stateful sessions need sync or reconnect logic.

Minimal C Demo - DNS Recovery Timer

7. Core Mechanism Walkthrough

Background: A public HTTPS service uses a BGP anycast VIP. Inside each PoP, two keepalived LVS Directors front a pool of real servers in DR mode. Stateful NAT is not on the hot path, but firewall state exists on the edge pair.

Plan: Fail small first. Remove bad real servers with local health checks, promote the standby Director with conntrack state if the active node dies, and withdraw the BGP /32 only when the whole PoP is unhealthy.

Failure	Detector	Recovery action	Blast radius
One real server fails	LVS or HAProxy health check.	Remove RS from scheduler.	Only flows to that RS reconnect or drain.
Active Director fails	VRRP or BFD via keepalived.	Backup owns VIP; conntrackd commits state if needed.	Local PoP only.
PoP service fails	External health checker.	Withdraw anycast /32 or change DNS answer.	Regional clients reroute.
Cluster split-brain risk	Corosync membership and quorum.	Fence old owner before promotion.	Protects shared state from dual ownership.

8. Source and Tooling Pointers

ipvsadm -Ln --stats shows VIPs, real servers, schedulers, and counters.
conntrack -L lists the kernel connection-tracking table; conntrackd -s reports the daemon's cache and synchronization statistics.
crm_mon -1 summarizes Pacemaker resource placement, quorum, and failed actions.
show ip bgp 203.0.113.53/32 or BMP telemetry proves anycast advertisement state.
echo "show table" through HAProxy's stats socket lists stick tables; "show servers state" on the same socket reports server state.

9. Interview Prep

Questions and concise answers

Why is LVS DR faster than NAT mode?	The Director handles only inbound traffic; real servers reply directly to clients.
Why does DR mode require ARP tuning?	Real servers hold the VIP on loopback but must not answer LAN ARP for it; otherwise neighbors send VIP traffic straight to a real server and bypass the Director.
How does Maglev reduce disruption?	Surviving backends keep most of their lookup-table slots; the slots owned by the removed backend move, along with a small number of others.
Why sync conntrack state?	Without replicated tuples and NAT state, standby takeover breaks existing stateful TCP flows.
Why is STONITH mandatory for serious Pacemaker designs?	It prevents two nodes from owning the same writable resource after a membership split.
Why is DNS failover slow?	Health checks, authoritative changes, resolver caches, client caches, and retry intervals all add delay.

27. Server / Service HA
