Network Protocols

Deep Dive — Study Catalog

26 parts · 300+ sections · CCNP/CCIE routing & switching, TCP/IP & congestion control, QUIC/KCP, IPv6, multicast, NAT traversal, HA, RDMA/RoCE, AI training fabrics, SPDK, libev and the full select/poll/epoll path.

Reference Books

Code | Book | Focus
TCPIP1 | TCP/IP Illustrated Vol.1 (2nd ed.) — Fall & Stevens | Wire-level walk of every protocol
TCPIP2 | TCP/IP Illustrated Vol.2 — Wright & Stevens | BSD kernel implementation
TCPIP3 | TCP/IP Illustrated Vol.3 — Stevens | T/TCP, HTTP, NNTP — historical
UNP | UNIX Network Programming Vol.1 (3rd ed.) — Stevens | Sockets API bible — select/poll patterns
KUROSE | Computer Networking: A Top-Down Approach — Kurose & Ross | University-level CN reference
DOYLE1 | Routing TCP/IP Vol.1 (2nd ed.) — Jeff Doyle | IGPs — RIP, EIGRP, OSPF, IS-IS
DOYLE2 | Routing TCP/IP Vol.2 — Jeff Doyle | BGP, multicast, NAT, IPv6
HALABI | Internet Routing Architectures (2nd ed.) — Sam Halabi | BGP design — the BGP bible
MPLS | MPLS-Enabled Applications — Minei & Lucek | MPLS, L3VPN, EVPN, SR
IPV6 | IPv6 Essentials (3rd ed.) — Silvia Hagen | Practical IPv6 reference
MCAST | Developing IP Multicast Networks — Beau Williamson | PIM-SM, MSDP, RP design
HPBN | High Performance Browser Networking — Ilya Grigorik | TLS, HTTP/2, HTTP/3, WebRTC
CCNP | CCNP Enterprise ENCOR / ENARSI Official Cert Guides — Cisco Press | CCNP exam-aligned, hands-on
CCIE | CCIE Enterprise Infrastructure / SP / DC v1.x Blueprint — Cisco | Lab-aligned curriculum
IBTA | InfiniBand Architecture Spec Vol.1 & 2 — IBTA | RDMA verbs, transports, packet format
MELLA | RDMA Aware Networks Programming Manual — NVIDIA/Mellanox | Verbs API and ConnectX behavior
NCCL | NCCL Documentation & Source — NVIDIA | Collective algorithms, channels, transports
SPDK | SPDK Programmer's Guide — Intel/Linux Foundation | User-space NVMe & NVMe-oF target
LIBEV | libev Documentation — Marc Lehmann | Definitive source on libev internals
LIBEVT | The libevent Book — Nick Mathewson | Event-loop design patterns
TLPI | The Linux Programming Interface — Michael Kerrisk | select/poll/epoll chapters

Part I · Cisco CCNP / CCIE Foundations

Doyle Vol.1 · CCNP ENCOR · CCIE EI v1.1 blueprint

  • §1.1OSI vs TCP/IP Layered Model (encapsulation, MTU/MSS, fragmentation, ethertype)
  • §1.2Cisco IOS / IOS-XE / NX-OS / IOS-XR (CLI modes, AAA, SSH, NETCONF/YANG)
  • §1.3Certification Path (CCNA → CCNP Enterprise → CCIE EI; CCNP DC → CCIE DC; CCNP SP → CCIE SP)
  • §1.4Lab Tooling (GNS3, EVE-NG, Cisco CML/VIRL, Containerlab, Packet Tracer)
  • §1.5Network Automation (Ansible, NETCONF, RESTCONF, gNMI, Cisco DNA Center)
  • §1.6Reading the CCIE Lab Topology (control plane vs data plane, stateful vs stateless)

Part II · Layer 2 Switching

CCNP ENCOR · Doyle Vol.1 Ch.4-7

  • §2.1Ethernet Frame & MAC Forwarding (CAM/TCAM, MAC aging, unicast flooding)
  • §2.2VLAN (access/trunk, native VLAN, 802.1Q tagging, voice VLAN, QinQ stacking)
  • §2.3VTP (server/client/transparent, pruning, VTPv3 password)
  • §2.4STP / RSTP / MSTP (root election, BPDU, port states, edge/PortFast, BPDU guard, root guard, loop guard)
  • §2.5EtherChannel / LACP (active/passive, PAgP auto/desirable, load-balance hash, min-links)
  • §2.6Switchport Security (port-security, DHCP snooping, DAI, IPSG, storm-control)
  • §2.7First-Hop Security (RA Guard, DHCPv6 Guard, IPv6 Source Guard, ND inspection)
  • §2.8TRILL / SPB (modern L2 multipath alternatives to STP)

Part III · Stack & De-stack Architectures

Cisco StackWise / VSS / vPC / MLAG · interview-critical

  • §3.1StackWise / StackWise-480 / StackWise-1T (Catalyst 3750/3850/9300/9300X stacking ring; active/standby/member roles)
  • §3.2StackWise Virtual / SVL (Catalyst 9500/9600 — two physical switches as one logical)
  • §3.3VSS (Virtual Switching System on Catalyst 6500/6800 — VSL link, dual-active detection, RPR/SSO)
  • §3.4vPC (Nexus 9K/7K Virtual Port Channel — peer-link, peer-keepalive, orphan ports, vPC roles)
  • §3.5MLAG / MC-LAG (Arista, Juniper MC-LAG, peer-gateway, ARP sync)
  • §3.6De-stack Architecture: Pure L3 ECMP Spine-Leaf (no MLAG, BGP-only, server multi-homing via routing)
  • §3.7Failure Domain Comparison (stack vs MLAG vs vPC vs L3-only)
  • §3.8ISSU / GIR (in-service software upgrade, Graceful Insertion & Removal)
  • §3.9Dual-Active / Split-Brain Detection (enhanced PAgP, fast-hello, BFD-based detection)
  • §3.10Migration Stories (StackWise → SVL → leaf-spine — operational lessons)

Part IV · Routing Protocols — IGP

Doyle Vol.1 · CCIE Routing TCP/IP

  • §4.1Static & Floating Static Routes (administrative distance, recursive lookup, IP SLA tracking)
  • §4.2RIPv2 / RIPng (distance vector, split horizon, poison reverse, hold-down — legacy but useful baseline)
  • §4.3OSPFv2 (LSA Type 1-7, areas, NSSA / Totally NSSA, virtual link, DR/BDR election, SPF & iSPF)
  • §4.4OSPFv3 (per-link LSAs, IPv6 transport, address-family for v4)
  • §4.5OSPF Optimization (LSA throttle, SPF throttle, prefix suppression, fast-hello, BFD)
  • §4.6EIGRP (DUAL algorithm, feasible successor, FD/RD, named mode, classic vs wide metric)
  • §4.7IS-IS (Level-1/Level-2, NET addressing, CSNP/PSNP, wide metric, multi-topology — common in ISP/DC)
  • §4.8Route Redistribution (mutual redistribution, route-map, tag-based loop prevention)
  • §4.9Policy Routing (PBR, route-map on interface, BFD-tracked next-hop)
  • §4.10Convergence Tuning (BFD sub-second, fast-hello, LFA, RLFA, TI-LFA; only TI-LFA requires segment routing)
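
The feasible-successor rule from §4.6 can be sketched in a few lines. This is a minimal illustration, not Cisco's implementation: the `Path` class and `classify` function are hypothetical names, metrics are plain integers rather than the composite EIGRP metric, and FD is simplified to the current best total metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    neighbor: str
    reported_distance: int  # RD: the neighbor's own metric to the destination
    link_cost: int          # metric of the link from this router to that neighbor

    @property
    def total(self) -> int:
        return self.reported_distance + self.link_cost


def classify(paths):
    """Return (successor, feasible_successors) using the rule RD < FD."""
    successor = min(paths, key=lambda p: p.total)
    fd = successor.total  # simplification: FD taken as the current best metric
    feasible = [p for p in paths
                if p is not successor and p.reported_distance < fd]
    return successor, feasible


paths = [Path("R2", 10, 10), Path("R3", 15, 10), Path("R4", 25, 5)]
successor, backups = classify(paths)
print(successor.neighbor, [p.neighbor for p in backups])
```

R4 is excluded even though it is reachable: its reported distance (25) is not below the feasible distance (20), so DUAL cannot prove the path is loop-free.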

Part V · Routing Protocols — BGP

Halabi · RFC 4271/7606 · CCIE SP / DC

  • §5.1BGP Fundamentals (eBGP vs iBGP, full mesh, AS, TCP/179)
  • §5.2Path Attributes (AS_PATH, NEXT_HOP, LOCAL_PREF, MED, ORIGIN, COMMUNITIES, AGGREGATOR)
  • §5.3Best Path Selection (13-step algorithm — weight → local-pref → AS-path → origin → MED → eBGP/iBGP → IGP → router-id)
  • §5.4Route Reflector & Confederation (scaling iBGP)
  • §5.5BGP Communities (well-known, extended, large communities, community-based policy)
  • §5.6BGP Security (RPKI/ROA, BGPsec, max-prefix, TTL security via GTSM RFC 5082)
  • §5.7BGP Convergence (BGP-PIC, ADD-PATH, BFD, graceful restart)
  • §5.8Multipath BGP (eBGP multipath, iBGP multipath, AS-PATH multipath-relax)
  • §5.9BGP for Data Center (eBGP unnumbered, allowas-in, FRR, GoBGP, Bird)
  • §5.10BGP Monitoring (BMP RFC 7854, gNMI streaming telemetry)
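
The ordering in §5.3 maps naturally onto a sort key. The sketch below covers only the steps named in that entry; the `Route` fields and function names are hypothetical, and it assumes MED is always compared (as with an always-compare-med style setting), which is not the default on most platforms.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}  # lower is preferred


@dataclass
class Route:
    weight: int = 0
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def preference_key(r: Route):
    """Smaller tuple wins; 'higher is better' attributes are negated."""
    return (
        -r.weight,                        # highest weight
        -r.local_pref,                    # highest LOCAL_PREF
        len(r.as_path),                   # shortest AS_PATH
        ORIGIN_RANK[r.origin],            # IGP < EGP < incomplete
        r.med,                            # lowest MED (assumed always comparable)
        0 if r.is_ebgp else 1,            # eBGP over iBGP
        r.igp_metric,                     # lowest IGP metric to NEXT_HOP
        int(IPv4Address(r.router_id)),    # lowest router ID
    )


def best_path(routes):
    return min(routes, key=preference_key)
```

Because tuples compare left to right, a later attribute is consulted only when every earlier one ties, which mirrors the step-by-step elimination described above.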

Part VI · MPLS, VPN & Segment Routing

MPLS-Enabled Applications (Minei) · CCIE SP

  • §6.1MPLS Forwarding (label stack, EXP/TC, S bit, TTL, PHP)
  • §6.2Label Distribution (LDP, RSVP-TE, BGP-LU, downstream unsolicited vs on-demand)
  • §6.3MPLS L3VPN / RFC 4364 (VRF, RD, RT, MP-BGP VPNv4/v6, PE-CE protocols)
  • §6.4MPLS L2VPN (VPWS pseudo-wire, VPLS, H-VPLS, control word)
  • §6.5EVPN (RFC 7432 — Type 1-5 routes, VXLAN/MPLS data plane, EVPN-VPWS)
  • §6.6Traffic Engineering (RSVP-TE, FRR/MPLS-FRR, auto-bandwidth)
  • §6.7Segment Routing — SR-MPLS (SID, prefix-SID, adj-SID, anycast SID, IGP shortest path)
  • §6.8SRv6 (locator + function + arg, micro-SID/uSID, end.x, end.dt4/dt6, SRv6 BE vs Policy)
  • §6.9TI-LFA (Topology-Independent LFA — sub-50ms protection over SR)
  • §6.10VXLAN (RFC 7348, head-end replication, flood-and-learn vs EVPN control plane)
  • §6.11Geneve / NVGRE / STT (alternative overlays — Geneve is the modern winner)
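
The label stack entry listed in §6.1 is a single 32-bit word: 20-bit label, 3-bit TC (formerly EXP), 1-bit bottom-of-stack flag, 8-bit TTL. A small sketch of packing and walking a stack follows; the function names are illustrative, not from any library.

```python
import struct


def pack_lse(label: int, tc: int, s: int, ttl: int) -> bytes:
    """Encode one MPLS label stack entry in network byte order."""
    if not (0 <= label < 2**20 and 0 <= tc < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def unpack_lse(data: bytes) -> dict:
    (word,) = struct.unpack("!I", data[:4])
    return {"label": word >> 12, "tc": (word >> 9) & 0x7,
            "s": (word >> 8) & 0x1, "ttl": word & 0xFF}


def parse_stack(data: bytes) -> list:
    """Read entries until the bottom-of-stack (S) bit is set."""
    entries = []
    for off in range(0, len(data) - 3, 4):
        entry = unpack_lse(data[off:off + 4])
        entries.append(entry)
        if entry["s"]:
            break
    return entries


stack = pack_lse(100, 0, 0, 255) + pack_lse(16, 5, 1, 64)
print([e["label"] for e in parse_stack(stack)])
```

Only the last entry carries S=1, which is how a receiver knows where the label stack ends and the payload begins.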

Part VII · Campus Network Design

Cisco SAFE · CCNP ENCOR / ENARSI

  • §7.1Three-Tier Hierarchy (access / distribution / core)
  • §7.2Two-Tier Collapsed Core (medium campus, distribution = core)
  • §7.3SD-Access (LISP control plane, VXLAN data plane, ISE for policy, fabric edge/border/control)
  • §7.4 802.1X / MAB / Profiling (Cisco ISE, dot1x event-driven, change-of-authorization)
  • §7.5Wireless Integration (CAPWAP local-mode vs FlexConnect, WLC, anchor controller, AP groups)
  • §7.6Wi-Fi 6 / 6E / 7 (OFDMA, BSS coloring, 6GHz channels, MLO multi-link operation)
  • §7.7PoE / PoE+ / UPoE / 802.3bt (15.4 W → 90 W, LLDP power negotiation)
  • §7.8Campus Multicast (PIM-SM, IGMP snooping, AutoRP / BSR)
  • §7.9Campus QoS (DSCP marking trust boundary, queueing on access/uplink)
  • §7.10Cisco DNA Center / Catalyst Center (assurance, automation, fabric provisioning)

Part VIII · Data Center & Cloud Network

RFC 7938 · Clos · ACI / NSX-T / OVN · interview-critical

  • §8.1Spine-Leaf Clos Topology (k-ary fat-tree, ECMP, oversubscription ratio, rail design)
  • §8.2BGP-Only Underlay (RFC 7938, eBGP unnumbered, allowas-in, ECMP load-balance)
  • §8.3EVPN-VXLAN Overlay (Type 2 MAC/IP, Type 3 IMET, Type 5 IP prefix, anycast gateway)
  • §8.4Cisco ACI (APIC controller, EPG, contracts, bridge domain, VRF, policy model)
  • §8.5NSX-T (T0/T1 routers, segments, distributed firewall, GENEVE)
  • §8.6Open vSwitch / OVN (flow tables, OpenFlow, OVN northbound DB, logical routers/switches)
  • §8.7Container CNI (Calico BGP, Cilium eBPF, Flannel VXLAN, Antrea, Multus)
  • §8.8Hyperscale Fabrics (Facebook F16/Disaggregated Backbone, Google Jupiter, Azure SONiC)
  • §8.9AWS VPC Internals (mapping service, ENI, Hyperplane, GWLB, Transit Gateway)
  • §8.10Lossless DC Fabric for RoCE (PFC, ECN, DCQCN, headroom buffering, Spectrum-X)
  • §8.11DCI (Data Center Interconnect — OTV, EVPN-VXLAN over DWDM, VPLS legacy)

Part IX · Service Provider / ISP Network

Doyle Vol.2 · CCIE SP · MEF · ITU-T

  • §9.1ISP Topology (PE-P-PE backbone, ASBR, route reflector hierarchy, IGP design)
  • §9.2Internet Peering (transit, settlement-free peering, full table vs partial vs default-only)
  • §9.3IXP (Internet Exchange Points — DE-CIX, AMS-IX, route servers, BGP Looking Glass)
  • §9.4BGP/MPLS L3VPN at Carrier Scale (RD/RT design, inter-AS option A/B/C, CSC)
  • §9.5 6PE / 6VPE (carrying IPv6 over an MPLS IPv4 core)
  • §9.6Carrier Ethernet & MEF (E-Line, E-LAN, E-Tree, E-Access, OAM 802.1ag/Y.1731)
  • §9.7Optical Transport (SDH/SONET legacy, OTN ODU/OTU, DWDM, ROADM, coherent optics)
  • §9.8 5G Transport (xHaul: fronthaul/midhaul/backhaul, eCPRI, IETF network slicing, SR-aware transport)
  • §9.9SD-WAN (Cisco Viptela/Meraki, VMware VeloCloud, Versa, Fortinet — overlay tunneling)
  • §9.10NFV (VNF chaining, NFV-MANO, OpenStack Tacker, ONAP)
  • §9.11Anycast Services (DNS root servers, CDN POPs, BGP-injected /32-/24)
  • §9.12DDoS Mitigation (BGP Flowspec, RTBH, scrubbing centers, BCP38)

Part X · TCP State Machine & Connection Lifecycle

TCP/IP Illustrated Vol.1 Ch.12-13 · RFC 9293 · interview-critical

  • §10.1TCP Header Anatomy (seq, ack, flags S/A/F/R/P/U/E/C, window, urg-ptr, options TLV)
  • §10.2TCP State Machine — 11 States (CLOSED, LISTEN, SYN-SENT, SYN-RCVD, ESTABLISHED, FIN-WAIT-1/2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT)
  • §10.3Three-Way Handshake (SYN → SYN-ACK → ACK; ISN selection, RFC 6528 hash-based)
  • §10.4Four-Way Close (FIN, half-close, simultaneous close → CLOSING)
  • §10.5TIME_WAIT Deep Dive (2*MSL rationale: orphan dup segments + reliable last ACK; tw_reuse vs tw_recycle hazards)
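  • §10.6SYN Flood & SYN Cookie (MSS index and a time counter encoded into the ISN with a keyed hash, SipHash in current Linux; tradeoff: wscale and SACK survive only when the client sent timestamps, which carry them)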
  • §10.7SYN Queue vs Accept Queue (somaxconn, tcp_max_syn_backlog, listen() backlog meaning)
  • §10.8RST Handling & RST Attacks (blind reset, off-path attacker, RFC 5961 challenge ACK)
  • §10.9Half-Open / Half-Closed Connections (keepalive vs application heartbeat to detect)
  • §10.10Connection Table Sizing (ephemeral port range, conntrack table, source port reuse with SO_REUSEADDR)
  • §10.11Linux TCP Tracing (ss -tnipo, /proc/net/tcp, tcptrace, bpftrace tcp:* tracepoints)
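
The 11 states in §10.2 and the open/close sequences in §10.3 and §10.4 can be expressed as a transition table. The sketch below is a simplification of the RFC 9293 state diagram: the event names are made up for illustration, and RST handling, timeouts other than 2*MSL, and retransmissions are left out.

```python
# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open"): "SYN-SENT",        # sends SYN
    ("LISTEN", "rcv_syn"): "SYN-RCVD",            # sends SYN-ACK
    ("SYN-SENT", "rcv_syn_ack"): "ESTABLISHED",   # sends ACK
    ("SYN-SENT", "rcv_syn"): "SYN-RCVD",          # simultaneous open
    ("SYN-RCVD", "rcv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN-WAIT-1",       # active close, sends FIN
    ("ESTABLISHED", "rcv_fin"): "CLOSE-WAIT",     # passive close
    ("FIN-WAIT-1", "rcv_ack"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "rcv_fin"): "CLOSING",         # simultaneous close
    ("FIN-WAIT-1", "rcv_fin_ack"): "TIME-WAIT",
    ("FIN-WAIT-2", "rcv_fin"): "TIME-WAIT",
    ("CLOSING", "rcv_ack"): "TIME-WAIT",
    ("CLOSE-WAIT", "close"): "LAST-ACK",          # sends FIN
    ("LAST-ACK", "rcv_ack"): "CLOSED",
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",
}

STATES = {s for (s, _) in TRANSITIONS} | set(TRANSITIONS.values())


def run(events, state="CLOSED"):
    """Replay events and return the list of states visited."""
    path = [state]
    for ev in events:
        state = TRANSITIONS[(state, ev)]  # KeyError means an illegal transition
        path.append(state)
    return path


# Client that opens, then closes first (the side that ends up in TIME-WAIT).
print(run(["active_open", "rcv_syn_ack", "close", "rcv_ack", "rcv_fin", "timeout_2msl"]))
```

Only the side that closes first passes through TIME-WAIT, which is why busy servers that actively close connections are the ones that accumulate TIME_WAIT sockets.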

Part XI · TCP Reliability & Flow Control

TCP/IP Illustrated Vol.1 Ch.14-17 · RFC 5681/6675/9293

  • §11.1Sliding Window — Sender (cwnd, snd.una, snd.nxt, snd.wnd; pipe = bytes in flight)
  • §11.2Sliding Window — Receiver (rcv.wnd advertise, zero-window probe, silly-window-syndrome avoidance)
  • §11.3Cumulative ACK vs SACK vs D-SACK (RFC 2018, RFC 2883 — duplicate detection)
  • §11.4RTT & RTO Estimation (Karn's algorithm, Jacobson SRTT/RTTVAR, RFC 6298)
  • §11.5Fast Retransmit & Fast Recovery (3 dup ACK trigger, NewReno partial ACK handling)
  • §11.6RACK Loss Detection (RFC 8985 — time-based, replaces dup-ACK threshold)
  • §11.7Nagle Algorithm vs TCP_NODELAY (small-packet coalescing trade-off, why interactive apps disable it)
  • §11.8Delayed ACK (200ms typical, ACK-every-other-segment; the Nagle ↔ delayed-ACK 200ms stall classic)
  • §11.9TCP_CORK / MSG_MORE (block partial sends until uncorked; sendfile + cork pattern for HTTP)
  • §11.10TCP Keepalive (TCP_KEEPIDLE / KEEPINTVL / KEEPCNT; default 7200s — almost useless, app-layer heartbeat preferred)
  • §11.11Path MTU Discovery (DF bit, ICMP need-frag, blackhole detection, PLPMTUD RFC 4821; datagram variant RFC 8899)
  • §11.12TCP Fast Open (TFO cookie, 0-RTT data on subsequent connects, middlebox interference)
  • §11.13Window Scaling, Timestamps, PAWS (RFC 7323 — large windows over high-BDP links)
  • §11.14Urgent Pointer & OOB Data (legacy, broken interop — never use it)
  • §11.15TCP Linger / SO_LINGER (close behavior, abortive vs graceful)
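
The RTT/RTO estimator from §11.4 follows the RFC 6298 update rules (alpha = 1/8, beta = 1/4, K = 4) plus Karn's rule of ignoring samples from retransmitted segments. A minimal sketch, with times in seconds; the class name and the `min_rto` and `max_rto` parameters are illustrative (RFC 6298 suggests a 1 s floor, while Linux uses a lower one).

```python
class RtoEstimator:
    ALPHA = 1 / 8
    BETA = 1 / 4
    K = 4

    def __init__(self, granularity=0.001, min_rto=1.0, max_rto=60.0):
        self.g = granularity
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial RTO before any measurement

    def on_rtt_sample(self, r, retransmitted=False):
        if retransmitted:
            return self.rto  # Karn's algorithm: ambiguous sample, skip it
        if self.srtt is None:
            self.srtt = r            # first measurement
            self.rttvar = r / 2
        else:
            # RTTVAR must be updated with the old SRTT, so it goes first.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto

    def on_timeout(self):
        self.rto = min(self.rto * 2, self.max_rto)  # exponential backoff
        return self.rto
```

With `min_rto=0.0`, a first sample of 100 ms gives an RTO of 300 ms (SRTT 100 ms plus 4 × RTTVAR of 50 ms), and a second identical sample tightens it to 250 ms.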

Part XII · TCP Congestion Control

RFC 5681/8312/9438 · BBR papers · interview-critical

  • §12.1Framework (cwnd, ssthresh, slow start, congestion avoidance, AIMD, fast recovery)
  • §12.2Tahoe (slow start + cong-avoid, no fast recovery — historical)
  • §12.3Reno (fast retransmit + fast recovery on triple-dup-ACK)
  • §12.4NewReno (RFC 6582 — handles multiple losses in one window without SACK)
  • §12.5BIC (Binary Increase) — predecessor of CUBIC
  • §12.6CUBIC (RFC 8312 — cubic-function cwnd vs time-since-loss; Linux default since 2.6.19)
  • §12.7Westwood / Westwood+ (bandwidth estimate to set ssthresh after loss; for wireless lossy links)
  • §12.8Vegas (delay-based, RTT increase = congestion; mostly displaced by BBR)
  • §12.9Compound TCP (Microsoft — combines AIMD with delay component)
  • §12.10BBR v1 (max-bw × min-RTT, 4-state machine: Startup/Drain/ProbeBW/ProbeRTT, pacing)
  • §12.11BBR v2 / v3 (ECN integration, fairness with CUBIC, loss tolerance)
  • §12.12DCTCP (RFC 8257 — ECN-marked fraction → multiplicative decrease; data center workhorse)
  • §12.13PCC / Copa (PCC: utility-driven online learning; Copa: delay-based target rate; neither relies on hand-tuned loss thresholds)
  • §12.14ECN (RFC 3168, ECT/CE codepoint, AccECN) & AQM (RED, CoDel, FQ-CoDel, PIE)
  • §12.15Bufferbloat & Pacing (TSO/GSO interaction with pacing, sch_fq pacing)
  • §12.16Pluggable CC in Linux (net.ipv4.tcp_congestion_control, /proc/sys/net/ipv4/tcp_available_congestion_control)
  • §12.17CC Selection Cheat Sheet (DC = DCTCP/BBRv2; long-haul = BBR; mobile = BBR/Westwood; small RTT LAN = CUBIC)
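
The cubic window curve named in §12.6 is W(t) = C·(t − K)³ + W_max with K = ∛(W_max·(1 − β)/C), using the RFC 8312 constants C = 0.4 and β = 0.7. The sketch below evaluates only that curve, in segments and seconds; the TCP-friendly region, fast convergence, and HyStart are deliberately omitted.

```python
C = 0.4      # scaling constant from RFC 8312
BETA = 0.7   # multiplicative decrease factor


def cubic_k(w_max: float) -> float:
    """Time (s) for the window to climb back to w_max after a loss."""
    return (w_max * (1 - BETA) / C) ** (1 / 3)


def w_cubic(t: float, w_max: float) -> float:
    """Congestion window (segments) t seconds after the last loss event."""
    return C * (t - cubic_k(w_max)) ** 3 + w_max


w_max = 100.0
for t in (0.0, 2.0, cubic_k(w_max), 6.0):
    print(f"t={t:5.2f}s  cwnd={w_cubic(t, w_max):7.2f}")
```

At t = 0 the window equals β·W_max (70 segments here), it flattens out near W_max around t = K, and then accelerates again to probe for new bandwidth.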

Part XIII · Modern Transports — QUIC, KCP, SCTP, MPTCP

RFC 9000-9002 · KCP / kcptun · interview-critical

  • §13.1Why Move Off TCP — middlebox ossification, head-of-line blocking, slow handshake, no migration
  • §13.2QUIC History (Google QUIC 2013 → IETF QUIC RFC 9000 in 2021)
  • §13.3QUIC Wire Image (long header — Initial/0-RTT/Handshake/Retry; short header for 1-RTT; Connection ID; varint encoding)
  • §13.4QUIC Streams (per-stream flow control, no HoL between streams, server/client + uni/bidi 4 stream types)
  • §13.5QUIC Crypto / TLS 1.3 Integration (CRYPTO frames, key updates, packet protection)
  • §13.6QUIC Handshake (1-RTT new conn, 0-RTT with a cached TLS 1.3 session ticket — replay-safety constraints)
  • §13.7QUIC Connection Migration (stable Connection ID survives NAT rebinding / Wi-Fi → cellular)
  • §13.8QUIC Loss Recovery (RFC 9002 — packet number monotonic, ACK frame with ranges, time threshold + packet threshold)
  • §13.9QUIC Congestion Control (NewReno default, BBR/CUBIC pluggable, separate from TCP stack)
  • §13.10HTTP/3 (QUIC + QPACK header compression — replaces HPACK; H3 frame layer)
  • §13.11QUIC Implementations (Cloudflare quiche, Google quiche, Microsoft msquic, Linux kernel UDP GSO+GRO support)
  • §13.12KCP — ARQ over UDP (selective repeat, fast retransmit on N skip, FEC, kcptun + crypto wrapping)
  • §13.13KCP Tuning Knobs (nodelay, interval, resend, nc — interactive vs throughput presets)
  • §13.14SCTP (RFC 9260 — multi-streaming, multi-homing, message-oriented; used in telecom signaling)
  • §13.15MPTCP (RFC 8684 — multi-path TCP, subflows, scheduler, fallback to plain TCP, used by Apple Siri / Korea KT GiGA)
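
The varint encoding mentioned in §13.3 uses the top two bits of the first byte to signal a length of 1, 2, 4, or 8 bytes (RFC 9000, Section 16). A short sketch with illustrative function names:

```python
def encode_varint(value: int) -> bytes:
    """Encode using the shortest of the four QUIC varint lengths."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("QUIC varints carry at most 62 bits")
    for prefix, length in ((0, 1), (1, 2), (2, 4), (3, 8)):
        usable_bits = 8 * length - 2
        if value < 1 << usable_bits:
            return (value | (prefix << usable_bits)).to_bytes(length, "big")


def decode_varint(buf: bytes):
    """Return (value, bytes_consumed) for the varint at the start of buf."""
    length = 1 << (buf[0] >> 6)          # 00 -> 1, 01 -> 2, 10 -> 4, 11 -> 8
    if len(buf) < length:
        raise ValueError("truncated varint")
    raw = int.from_bytes(buf[:length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), length


print(encode_varint(15293).hex(), decode_varint(bytes.fromhex("9d7f3e7d")))
```

The sample values come from the worked examples in RFC 9000 Appendix A.1; note that a decoder must also accept non-minimal encodings such as `0x4025` for 37.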

Part XIV · UDP & UDP-Based Protocols

RFC 768 · TCP/IP Illustrated Vol.1 Ch.10

  • §14.1UDP Header (8 bytes — src/dst port, length, checksum); pseudo-header for checksum
  • §14.2UDP Socket API (sendto/recvfrom, connect() on UDP for filtering, getsockname for ephemeral port)
  • §14.3Batched I/O (recvmmsg / sendmmsg — single syscall for N packets)
  • §14.4UDP-Lite (RFC 3828 — partial checksum coverage for tolerant codecs)
  • §14.5UDP GSO / GRO (GSO since Linux 4.18, GRO since 5.0; segmentation offload for large UDP sends, key to QUIC throughput)
  • §14.6SO_REUSEPORT (kernel hashing across listening sockets — UDP load balancing without an LB)
  • §14.7UDP Fragmentation Risks (DF + ICMP needed; black-holed by middleboxes; QUIC restricts to PMTU)
  • §14.8RTP / RTCP (real-time media, sequence + timestamp, SR/RR reports, SSRC)
  • §14.9DTLS (TLS over UDP — used by WebRTC media, Cisco AnyConnect / OpenConnect VPN, CoAP)
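
The header layout and pseudo-header checksum from §14.1 can be reproduced with the standard library. This sketch covers IPv4 only; the helper names are illustrative and the addresses in the example are documentation prefixes.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input for summing only
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_length))


def build_udp(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    length = 8 + len(payload)  # 8-byte header + data
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    covered = pseudo_header(src_ip, dst_ip, length) + header + payload
    csum = ~ones_complement_sum(covered) & 0xFFFF
    csum = csum or 0xFFFF  # 0 on the wire means "no checksum" in IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, csum) + payload


def verify_udp(src_ip, dst_ip, datagram: bytes) -> bool:
    covered = pseudo_header(src_ip, dst_ip, len(datagram)) + datagram
    return ones_complement_sum(covered) == 0xFFFF


dgram = build_udp("192.0.2.1", "198.51.100.2", 12345, 53, b"hello")
print(dgram.hex(), verify_udp("192.0.2.1", "198.51.100.2", dgram))
```

Because the pseudo-header includes both IP addresses, a NAT that rewrites an address must also patch the UDP checksum, even though it never touches the UDP payload.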

Part XV · IP Layer & ICMP

RFC 791/792/4443 · TCP/IP Illustrated Vol.1 Ch.5-8

  • §15.1IPv4 Header (version, IHL, TOS/DSCP/ECN, total len, ID/flags/frag-offset, TTL, proto, checksum, options)
  • §15.2IP Fragmentation (DF flag, MF flag, identification, reassembly buffer, fragment overlap attacks)
  • §15.3ICMP Types & Codes (Echo Request/Reply, Dest Unreachable codes, Time Exceeded, Redirect, Source Quench legacy)
  • §15.4ICMP Probe Tools (ping, traceroute UDP/ICMP/TCP variants, MTR, paris-traceroute)
  • §15.5ICMP Errors (orig packet header in payload, rate-limiting per RFC 4443)
  • §15.6PMTUD via ICMP (Need Frag with next-hop MTU) — and why it often breaks (firewalls drop ICMP)
  • §15.7PLPMTUD (RFC 8899 — search MTU at transport layer, no ICMP dependency, used by QUIC)
  • §15.8ICMP Attacks (smurf, ping of death, ICMP redirect injection, ICMP tunneling)
  • §15.9ICMPv6 (RFC 4443 — types reorganized, multicast for NDP/MLD, ICMPv6 essential not blockable)
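
The fragmentation arithmetic behind §15.2 is easy to get wrong because the fragment-offset field counts 8-byte units. A small sketch that plans the fragments for a given payload and MTU; `plan_fragments` is a hypothetical helper and assumes a header without options unless `ihl_bytes` says otherwise.

```python
def plan_fragments(payload_len: int, mtu: int, ihl_bytes: int = 20):
    """Return a list of (offset_in_8_byte_units, data_bytes, more_fragments)."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - ihl_bytes) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small for the IP header plus 8 data bytes")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len      # MF flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# 4000 bytes of IP payload over a 1500-byte MTU link.
for frag in plan_fragments(4000, 1500):
    print(frag)
```

With a 1500-byte MTU each full fragment carries 1480 data bytes, so the offsets advance in steps of 185 units and only the final fragment has MF cleared.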

Part XVI · ARP & Layer 2 Discovery

RFC 826/3927/5227 · TCP/IP Illustrated Vol.1 Ch.4

  • §16.1ARP Operation (request broadcast, unicast reply, ARP cache aging, GC threshold)
  • §16.2Gratuitous ARP — GARP (announce IP move; MAC change; used by VRRP/keepalived takeover; duplicate-IP detection RFC 5227)
  • §16.3Proxy ARP (router answers ARP for hosts in another subnet; use cases: PPP, mobile IP, transparent firewall)
  • §16.4ARP Spoofing & Defenses (DAI on switches, static ARP, arpwatch, arp-scan)
  • §16.5ARP Sponging (a daemon answers ARP for unreachable addresses to damp ARP storms on large L2 segments, e.g. IXP peering LANs)
  • §16.6RARP / InARP (legacy — Reverse ARP for diskless boot; Inverse ARP on Frame Relay)
  • §16.7Linux ARP Table Tuning (gc_thresh1/2/3, base_reachable_time, when ARP table overflows in HPC clusters)

Part XVII · IPv6 Deep Dive

RFC 8200/4861/4862/8106/8415 · IPv6 Essentials (Hagen)

  • §17.1IPv6 Header (40-byte fixed; flow label; no checksum; no header-level fragmentation)
  • §17.2Address Architecture (Global Unicast 2000::/3, Link-Local fe80::/10, ULA fc00::/7, Multicast ff00::/8, Anycast)
  • §17.3Address Types & Scopes (interface-local, link-local, site-local deprecated, global, multicast scopes)
  • §17.4EUI-64 Interface ID (MAC-derived, U/L bit flip; privacy concerns)
  • §17.5SLAAC — Stateless Address Autoconfig (RFC 4862 — RA + on-link prefix → host generates address; DAD)
  • §17.6Privacy Extensions (RFC 8981 — random IID rotation; RFC 7217 stable but opaque IID)
  • §17.7NDP — Neighbor Discovery (RFC 4861 — RS, RA, NS, NA, Redirect; replaces ARP + ICMP Redirect)
  • §17.8RA — Router Advertisement (M/O flags, prefix info option, MTU option, route info option)
  • §17.9DAD — Duplicate Address Detection (NS to solicited-node multicast before claiming)
  • §17.10Optimistic DAD (RFC 4429) — start using address before DAD completes
  • §17.11RDNSS / DNSSL (RFC 8106 — DNS resolver via RA, replaces stateless DHCPv6 for DNS)
  • §17.12DHCPv6 Stateful (IA_NA assignment, RFC 8415, replaces DHCPv4 for managed networks)
  • §17.13DHCPv6 Stateless (only options like NTP/SIP, address still SLAAC)
  • §17.14DHCPv6-PD — Prefix Delegation (IA_PD; how home routers get a /56 or /60 from ISP)
  • §17.15IPv6 Extension Headers (Hop-by-Hop, Routing, Fragment, Destination, AH/ESP — ordering rules)
  • §17.16IPv6 Transition Mechanisms (dual-stack, 6to4 deprecated, 6rd, Teredo legacy, NAT64+DNS64, MAP-T/MAP-E, 464XLAT)
  • §17.17IPv6 Multicast & MLD (MLDv1/v2, replaces IGMP, runs over ICMPv6)
  • §17.18IPv6 Security (RA Guard, DHCPv6 Guard, ND inspection, SAVI, why disabling IPv6 hurts more than helps)
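
Two of the address derivations above, the EUI-64 interface ID (§17.4) and the solicited-node multicast group used by DAD and NDP (§17.7, §17.9), are short enough to compute by hand. The sketch uses only the `ipaddress` module; the function names and the example MAC and prefix are illustrative.

```python
import ipaddress


def eui64_interface_id(mac: str) -> bytes:
    """Insert ff:fe in the middle of the MAC and flip the U/L bit."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC with EUI-64 expects a /64 prefix")
    iid = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


def solicited_node(address: str) -> ipaddress.IPv6Address:
    """ff02::1:ffXX:XXXX built from the low 24 bits of the unicast address."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


addr = slaac_address("2001:db8::/64", "00:1a:2b:3c:4d:5e")
print(addr, solicited_node(str(addr)))
```

The MAC is plainly recoverable from the resulting address, which is the privacy concern that motivates the random and opaque interface IDs in §17.6.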

Part XVIII · Multicast

RFC 3376/4604/7761/7450 · Doyle Vol.2

  • §18.1IPv4 Multicast Addressing (224.0.0.0/4, 224.0.0.x link-local, 232/8 SSM, 233/8 GLOP, 239/8 admin-scoped)
  • §18.2MAC Mapping for Multicast (01:00:5e:00:00:00 + lower 23 bits of group; 32→1 collision)
  • §18.3IGMP v1 / v2 / v3 (host membership; v3 adds source filter for SSM)
  • §18.4IGMP Snooping (L2 switch tracks IGMP joins; multicast not flooded; querier election)
  • §18.5PIM-DM — Dense Mode (flood-and-prune, state refresh, only for small dense groups)
  • §18.6PIM-SM — Sparse Mode (RP, shared tree (*,G), source tree (S,G), Register encapsulation, SPT switchover)
  • §18.7PIM-SSM — Source-Specific Multicast (no RP, IGMPv3/MLDv2 host signals (S,G) directly; for IPTV)
  • §18.8RP Discovery (static, AutoRP, BSR — Bootstrap Router; Anycast-RP via MSDP)
  • §18.9RPF — Reverse Path Forwarding Check (loop prevention, unicast routing table by default)
  • §18.10Source Tree vs Shared Tree (latency vs state trade-off; SPT-threshold knob)
  • §18.11MSDP — Multicast Source Discovery Protocol (inter-domain SA messages, Anycast-RP within a domain)
  • §18.12MLD v1/v2 (IPv6 host membership over ICMPv6; MLDv2 is SSM-capable)
  • §18.13Bidir-PIM (RFC 5015 — many-to-many, single shared tree, no source state)
  • §18.14BIER — Bit Indexed Explicit Replication (RFC 8279 — stateless multicast, ingress encodes bitstring)
  • §18.15Multicast in EVPN-VXLAN (Type 6/7/8 routes, head-end vs underlay multicast replication)
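
The 32→1 collision in §18.2 falls straight out of the mapping: only the low 23 bits of the group address are copied into the MAC, so the 5 variable high bits of the group are lost. A minimal sketch with an illustrative function name:

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its 01:00:5e Ethernet address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not in 224.0.0.0/4")
    mac = 0x01005E000000 | (int(ip) & 0x7FFFFF)  # keep the low 23 bits only
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


for g in ("224.0.0.5", "224.128.0.5", "239.255.0.1"):
    print(g, multicast_mac(g))
```

224.0.0.5 and 224.128.0.5 land on the same MAC, so a NIC filter that accepts one will also deliver the other and the IP layer has to discard it.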

Part XIX · NAT & NAT Traversal

RFC 4787/5128/5389/5766/8489 · WebRTC ICE

  • §19.1NAT Types (Full Cone, Restricted Cone, Port-Restricted Cone, Symmetric — STUN-defined behaviors)
  • §19.2NAPT / PAT — Port Address Translation (port overload, conntrack tuple, port allocation strategies)
  • §19.3NAT Conntrack (Linux nf_conntrack — tuple, expectations, helpers for FTP/SIP)
  • §19.4Hairpin / NAT Loopback (internal client → public IP → back to internal server)
  • §19.5Carrier-Grade NAT — CGN / NAT444 (port-block allocation, IPv4 exhaustion mitigation, logging volume)
  • §19.6NAT64 + DNS64 (RFC 6146/6147 — IPv6-only client to IPv4 server)
  • §19.7464XLAT (RFC 6877 — CLAT on phone + NAT64 in network; T-Mobile US)
  • §19.8NPTv6 (stateless 1:1 IPv6 prefix translation, RFC 6296)
  • §19.9STUN (RFC 8489 — discover external mapping, NAT type detection)
  • §19.10TURN (RFC 8656 — relay server when direct fails; allocation, permissions)
  • §19.11ICE (RFC 8445 — gather candidates → pair → connectivity checks → nominate; trickle ICE)
  • §19.12UDP Hole Punching (mutual STUN, simultaneous send; works for cone NATs, fails for symmetric)
  • §19.13TCP Hole Punching (TCP simultaneous open; SYN crossing; sequence & state machine challenges)
  • §19.14UPnP IGD / NAT-PMP / PCP (router-mediated mapping; PCP is the modern winner)
  • §19.15Tailscale / WireGuard / Nebula NAT Traversal (DERP relays, peer-to-peer establishment)
  • §19.16WebRTC End-to-End (signaling out-of-band, ICE for media, DTLS-SRTP for security)

Part XX · DHCP

RFC 2131 · RFC 8415 (DHCPv6)

  • §20.1DHCPv4 DORA (Discover broadcast → Offer → Request → Ack; xid correlation; lease renewal T1/T2)
  • §20.2DHCP Options (1=mask, 3=router, 6=DNS, 12=hostname, 43=vendor, 51=lease, 55=PRL, 60/61=class/client-id)
  • §20.3DHCP Option 82 (relay agent info — circuit-ID, remote-ID; used by DHCP snooping & ISP CGNAT)
  • §20.4DHCP Option 121 / 249 (classless static routes; how to push routes via DHCP)
  • §20.5DHCP Relay (UDP broadcast on access VLAN → unicast to server; ip helper-address)
  • §20.6DHCP Snooping (security — only trusted ports may answer; binding table feeds DAI/IPSG)
  • §20.7DHCP Server Implementations (ISC dhcpd legacy, Kea modern, dnsmasq for SOHO, Windows DHCP)
  • §20.8DHCP HA / Failover (ISC failover protocol, Kea HA hooks, primary/secondary lease split)
  • §20.9DHCPv6 (UDP/546-547, multicast ff02::1:2; SOLICIT/ADVERTISE/REQUEST/REPLY; M/O flags interplay with SLAAC)
  • §20.10DHCPv6 Prefix Delegation (IA_PD — how home routers get /56 from ISP)
  • §20.11PXE Boot / iPXE (next-server option 66, boot file 67, UEFI HTTP boot)
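
The classless static route option from §20.4 uses a compact encoding (RFC 3442): prefix length, then only the significant octets of the destination, then the 4-byte router. The sketch below encodes and decodes option 121; the function names are illustrative and the routes are example values.

```python
import ipaddress


def encode_option_121(routes) -> bytes:
    """routes: iterable of (destination_cidr, router_ip) pairs."""
    body = b""
    for dest, router in routes:
        net = ipaddress.IPv4Network(dest)
        significant = (net.prefixlen + 7) // 8   # octets actually transmitted
        body += (bytes([net.prefixlen])
                 + net.network_address.packed[:significant]
                 + ipaddress.IPv4Address(router).packed)
    if len(body) > 255:
        raise ValueError("option body exceeds 255 bytes")
    return bytes([121, len(body)]) + body


def decode_option_121(option: bytes):
    body = option[2:2 + option[1]]
    routes, i = [], 0
    while i < len(body):
        prefixlen = body[i]
        significant = (prefixlen + 7) // 8
        dest = body[i + 1:i + 1 + significant] + bytes(4 - significant)
        router = body[i + 1 + significant:i + 5 + significant]
        routes.append((f"{ipaddress.IPv4Address(dest)}/{prefixlen}",
                       str(ipaddress.IPv4Address(router))))
        i += 5 + significant
    return routes


opt = encode_option_121([("10.0.0.0/8", "10.0.0.1"), ("0.0.0.0/0", "192.168.1.1")])
print(opt.hex(), decode_option_121(opt))
```

A default route is a zero-length prefix with no destination octets at all. RFC 3442 tells clients to ignore the Router option (3) when option 121 is present, so the default route normally has to be repeated here.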

Part XXI · High Availability

RFC 5798 (VRRP) · RFC 5880 (BFD) · keepalived docs

  • §21.1HSRP (Cisco — virtual IP/MAC 0000.0c07.acXX, active/standby, group, priority, preempt)
  • §21.2VRRP (RFC 5798 — open standard, virtual MAC 0000.5e00.01XX, master/backup election)
  • §21.3GLBP (Cisco — load-balancing FHRP, AVG + AVF, weighted vs round-robin vs host-dependent)
  • §21.4keepalived (Linux VRRPv2/v3 daemon, healthcheck scripts, IPVS director integration)
  • §21.5BFD — Bidirectional Forwarding Detection (RFC 5880, sub-second protocol-agnostic detection; async/demand/echo modes)
  • §21.6BFD Multi-hop (RFC 5883 — for iBGP / RR / IPsec tunnels)
  • §21.7Anycast HA (BGP-injected /32 from healthy node; DNS root, public DNS 1.1.1.1 / 8.8.8.8)
  • §21.8MC-LAG / MLAG (multi-chassis link aggregation, dual-active forwarding, peer-link, peer-keepalive)
  • §21.9LVS / IPVS Modes (DR — direct routing same L2; NAT — return through LB; TUN — IP-in-IP)
  • §21.10L4 LB Architectures (Maglev consistent hashing, Katran XDP, GitHub GLB, Cloudflare Unimog)
  • §21.11L7 LB / Proxy (HAProxy, NGINX, Envoy — health checks, retries, circuit breaker, outlier detection)
  • §21.12Stateful Firewall / NAT HA (conntrackd, pacemaker, session sync)
  • §21.13DNS-Based Failover (low TTL, GeoDNS, weighted policy)
  • §21.14Cluster Resource Manager (Pacemaker + Corosync, Linux-HA, fencing/STONITH)
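
The virtual MAC patterns in §21.1 and §21.2, and the VRRP election rule (highest priority, then highest primary IP), can be written down directly. This is a sketch with illustrative function names; it covers HSRP version 1 only and ignores preemption and the reserved priority values 0 and 255.

```python
import ipaddress


def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """00:00:5e:00:01:{VRID} for IPv4, 00:00:5e:00:02:{VRID} for IPv6."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be 1..255")
    return f"00:00:5e:00:{'02' if ipv6 else '01'}:{vrid:02x}"


def hsrp_v1_virtual_mac(group: int) -> str:
    """00:00:0c:07:ac:{group} for HSRP version 1."""
    if not 0 <= group <= 255:
        raise ValueError("HSRPv1 group must be 0..255")
    return f"00:00:0c:07:ac:{group:02x}"


def elect_master(routers):
    """routers: iterable of (priority, primary_ip); returns the winning tuple."""
    return max(routers, key=lambda r: (r[0], int(ipaddress.IPv4Address(r[1]))))


print(vrrp_virtual_mac(10), hsrp_v1_virtual_mac(1))
print(elect_master([(100, "10.0.0.2"), (100, "10.0.0.3"), (90, "10.0.0.9")]))
```

Because the virtual MAC depends only on the group number, two unrelated VRRP groups with the same VRID on one VLAN will collide, which is a common cause of flapping after a merge of L2 domains.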

Part XXII · RDMA & RoCE

IBTA Vol.1/2 · RoCE v2 (IBTA Annex A17) · Mellanox / NVIDIA docs · interview-critical

  • §22.1Why RDMA — kernel bypass, zero-copy, CPU offload (motivation: 100/200/400/800 GbE saturating CPU memcpy)
  • §22.2RDMA Operations (SEND/RECV — two-sided; RDMA WRITE / RDMA READ — one-sided; ATOMIC fetch-add / compare-swap)
  • §22.3RDMA Verbs API — libibverbs (PD, MR, CQ, QP, WR, WC; ibv_post_send / ibv_post_recv)
  • §22.4Queue Pair States (RESET → INIT → RTR → RTS → SQD/SQE → ERR; transitions via ibv_modify_qp)
  • §22.5Memory Registration (MR, lkey/rkey, ODP — On-Demand Paging, FRWR — Fast Reg WR)
  • §22.6Transport Types (RC — Reliable Connection, UC, UD — Unreliable Datagram, XRC — eXtended Reliable Connection)
  • §22.7InfiniBand Fundamentals (HCA, subnet manager OpenSM, LID/GID, GUID, SL/VL — service level / virtual lanes)
  • §22.8RoCE v1 — RDMA over Ethernet (L2-only, ethertype 0x8915, no IP routing)
  • §22.9RoCE v2 — RDMA over UDP/IP (UDP/4791, routable, used in cloud DC fabrics; uses BTH header)
  • §22.10iWARP — RDMA over TCP (RFC 5040 — older, less performant, but tolerates lossy networks)
  • §22.11PFC — Priority Flow Control (802.1Qbb — per-priority pause, lossless class for RoCE)
  • §22.12ECN & DCQCN (Data Center QCN — RoCE congestion control, ConnectX hardware-offloaded reaction)
  • §22.13PFC Deadlock & Headroom Buffering (cyclic buffer dependency between paused queues; DCBX exchange; CC vs PFC roles)
  • §22.14Lossless Ethernet Design (DCB stack — PFC + ETS + DCBX; mlnx_qos tooling)
  • §22.15Adaptive Routing & Flowlet (NVIDIA AR; per-packet vs per-flow vs flowlet-level spraying)
  • §22.16RDMA in Storage (NVMe-oF over RDMA, NFS over RDMA, SMB Direct, Ceph async messenger over RDMA)
  • §22.17RDMA Diagnostics (perftest ib_send_bw / ib_write_bw, ibv_devinfo, ibstat, mlx5dump, NVIDIA NEO/UFM)
  • §22.18RDMA in K8s (SR-IOV, Multus, RDMA CNI, GPU-Operator, Network Operator)

Part XXIII · AI Training & Inference Networking

NCCL docs · NVIDIA Spectrum-X · Meta RoCE · OCP HPN — interview-critical, intentionally deep

  • §23.1Why AI Networking Is Different (bulk-synchronous traffic, all-to-all incast, microsecond tail-latency sensitivity)
  • §23.2Collective Communication Primitives (Broadcast, Reduce, AllReduce, AllGather, ReduceScatter, AllToAll, Scatter, Gather, Barrier)
  • §23.3AllReduce Algorithms — Ring (2(N-1) bandwidth-optimal steps, used by NCCL default for large messages)
  • §23.4AllReduce Algorithms — Tree (latency-optimal, log N depth; NCCL Tree for small messages)
  • §23.5AllReduce Algorithms — Halving-Doubling, Recursive Doubling, Hierarchical (multi-node + intra-node split)
  • §23.6AllToAll Patterns (used in MoE expert routing, sequence parallelism — most network-stressful collective)
  • §23.7NCCL — NVIDIA Collective Communications Library (architecture: comm, channel, proxy thread, work queue)
  • §23.8NCCL Topology Detection (PCIe / NVLink / NVSwitch / CPU NUMA / NIC affinity → graph search → optimal channels)
  • §23.9NCCL Transports (Shared Memory between processes on one node, P2P over PCIe/NVLink, IB Verbs RDMA, Sockets fallback)
  • §23.10NCCL Tuning (NCCL_ALGO ring/tree, NCCL_PROTO LL/LL128/Simple, NCCL_NTHREADS, NCCL_BUFFSIZE, NCCL_IB_HCA)
  • §23.11NCCL Plugin API (Networking plugin, e.g. AWS OFI plugin for EFA, Microsoft MSCCL plugin for custom algos)
  • §23.12MSCCL / MSCCLPP (Microsoft programmable collectives — XML algo description, GPU-driven for inference)
  • §23.13RCCL (AMD ROCm fork of NCCL for MI300/MI250)
  • §23.14Gloo / MPI Alternatives (Gloo CPU, OpenMPI, MVAPICH2-GDR, UCX, OneCCL — when not NCCL)
  • §23.15NVLink / NVSwitch (intra-node fabric — 5th-gen NVLink 1.8 TB/s, NVL72 rack-scale, NVLink-C2C for Grace-Hopper)
  • §23.16GPUDirect RDMA (GPU memory ↔ NIC without host bounce — ConnectX + NVIDIA driver path)
  • §23.17GPUDirect Storage / GDS (GPU ↔ NVMe direct via cuFile + nvidia-fs)
  • §23.18Rail-Optimized Fat-Tree / Clos for AI (per-GPU rail to dedicated leaf, 8 rails per server, no rail crossing)
  • §23.19NVIDIA Spectrum-X (Spectrum-4 + BlueField-3 + adaptive routing + congestion control + Direct Data Placement, DDP)
  • §23.20Meta AI Backend Network (RoCE-based, 24K-GPU clusters, FBOSS, dual-rail design)
  • §23.21OCP Hyperscale Network for AI / SONiC AI Optimizations (open AI fabric reference)
  • §23.22Adaptive Routing in AI Fabrics (per-packet spraying with reordering tolerance via NIC, flowlet, IB AR)
  • §23.23ECN/PFC Tuning for AI (DCQCN target rate, headroom, watchdog timer; lossless gotchas)
  • §23.24Congestion Control for AI — HPCC, Swift, Annapurna Labs SRD, EQDS (INT-based, delay-based, multipath, and receiver-driven designs)
  • §23.25Inference Networking — Disaggregated Prefill/Decode (PD disaggregation, KV-cache transfer over RDMA)
  • §23.26Tensor Parallelism / Pipeline Parallelism / Expert Parallelism Traffic Patterns (TP all-reduce per-layer, PP send/recv, EP all-to-all)
  • §23.27Communication Frameworks (PyTorch DDP / FSDP, Megatron-LM, DeepSpeed ZeRO — what each demands of the network)
  • §23.28AWS EFA / Google JCT / Azure SDN-AI (cloud AI fabric implementations)
  • §23.29Backend AI vs Frontend AI (training cluster vs inference serving — different latency/throughput profiles)
  • §23.30Worked Example — Tracing One AllReduce (8-GPU node, 2-node, 16 GPUs total: ring chunks, schedule, RDMA WR queue)
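
The ring algorithm from §23.3 can be simulated in plain Python to see why it takes 2(N-1) steps: N-1 reduce-scatter steps followed by N-1 all-gather steps. This is a toy model, not NCCL code: each rank's buffer is split into N chunks of one number each, and a "send" is a list update.

```python
def ring_allreduce(data):
    """data[r] is rank r's buffer of n chunks; returns (buffers, steps)."""
    n = len(data)
    if any(len(row) != n for row in data):
        raise ValueError("expected n ranks with n chunks each")
    bufs = [list(row) for row in data]
    steps = 0

    # Reduce-scatter: at step s, rank r sends chunk (r - s) mod n to rank r+1,
    # which adds it to its own copy. Sends are snapshotted so that all ranks
    # act on the state from the start of the step, as on real hardware.
    for s in range(n - 1):
        sends = [(r, (r - s) % n, bufs[r][(r - s) % n]) for r in range(n)]
        for r, chunk, value in sends:
            bufs[(r + 1) % n][chunk] += value
        steps += 1

    # Rank r now owns the fully reduced chunk (r + 1) mod n.
    # All-gather: circulate the finished chunks, overwriting instead of adding.
    for s in range(n - 1):
        sends = [(r, (r + 1 - s) % n, bufs[r][(r + 1 - s) % n]) for r in range(n)]
        for r, chunk, value in sends:
            bufs[(r + 1) % n][chunk] = value
        steps += 1

    return bufs, steps


ranks = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]
result, steps = ring_allreduce(ranks)
print(steps, result[0])
```

Each rank sends one chunk (1/N of the buffer) per step, so the total bytes sent per rank approach twice the buffer size regardless of N, which is the sense in which the ring is bandwidth-optimal.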

Part XXIV · SPDK — Storage Performance Dev Kit

SPDK docs · DPDK shared model · Intel/NVMe

  • §24.1SPDK Motivation (kernel I/O stack overhead, polling > interrupts at >1M IOPS, kernel-bypass storage parallel to DPDK)
  • §24.2SPDK Architecture (event framework, reactor per lcore, message-passing thread model, no shared mutable state)
  • §24.3User-Space NVMe Driver (PCI BAR mmap, SQ/CQ doorbells, MSI-X interrupts via VFIO eventfd, polling preferred)
  • §24.4SPDK BDEV Layer (block device abstraction, drivers for NVMe / AIO / virtio-blk / Ceph RBD / iSCSI / NVMe-oF initiator)
  • §24.5NVMe-oF Target — Transports (TCP per NVM Express TP 8000, RDMA, FC; subsystems, namespaces, controllers)
  • §24.6BlobStore & BlobFS (lightweight storage abstraction; not a POSIX FS — used by RocksDB backend)
  • §24.7vhost-user-blk / vhost-user-scsi (zero-copy VM I/O — virtqueue shared mem with QEMU)
  • §24.8SPDK + DPDK Shared Model (memory allocator rte_malloc, mempool, ring; runs as DPDK secondary or unified)
  • §24.9SPDK NVMe-oF Performance (1M+ IOPS per CPU core, sub-10µs latency over RDMA)
  • §24.10Real-World Deployments (Alibaba PolarStore, Ceph BlueStore + SPDK, AWS Nitro storage, Azure Premium SSD v2)
  • §24.11SPDK vs io_uring (when each wins — full bypass vs in-kernel batched async)

Part XXV I/O Multiplexing — select / poll / epoll

TLPI Ch.63 · Linux source fs/select.c, fs/eventpoll.c · interview-critical

  • §25.1Five I/O Models (blocking, non-blocking, I/O multiplexing, signal-driven, async I/O — Stevens UNP Ch.6)
  • §25.2select(2) — fd_set bitmap, FD_SETSIZE=1024, copy in/out every call, O(n) full scan; sys_select() in kernel
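  • §25.3select Internals (fs/select.c: do_select() loop calls each fd's poll method, which registers on the fd's wait queue via poll_wait()/__pollwait; sets result bits, sleeps until wakeup or timeout; restartable timeout)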
  • §25.4select Limitations (1024 fd cap, expensive setup, no edge-trigger, returns count not list)
  • §25.5poll(2) — pollfd array, no FD_SETSIZE limit, still O(n) scan, still copy-in/out every call
  • §25.6poll Internals (do_poll() builds wait queues per fd, walks list)
  • §25.7epoll Architecture — Big Picture (interest set persistent in kernel; ready list maintained on event; O(1) wait)
  • §25.8epoll Kernel Data Structures (struct eventpoll: rbr RB-tree of registered fds, rdllist ready list, ovflist; struct epitem)
  • §25.9epoll_create / epoll_create1 (creates anon inode; returns fd; CLOEXEC flag)
  • §25.10epoll_ctl ADD / MOD / DEL (insert/update/remove epitem; hooks into target fd's wait queue via ep_ptable_queue_proc)
  • §25.11epoll_wait — How Events Reach Ready List (target fd's poll callback ep_poll_callback fires, splices epitem onto rdllist, wakes waiters)
  • §25.12Level-Triggered (LT) — Default Semantics (re-reports as long as condition holds; safer; equivalent to poll)
  • §25.13Edge-Triggered (EPOLLET) — One Notify Per Transition (must drain till EAGAIN; non-blocking fd required)
  • §25.14EPOLLONESHOT (one-shot, must rearm via EPOLL_CTL_MOD; clean handoff between threads)
  • §25.15EPOLLEXCLUSIVE (added in Linux 4.5; wakes at least one waiter instead of all; mitigates accept thundering herd in multi-process listeners)
  • §25.16epoll Drain Rule (with ET: read until EAGAIN; with LT: optional but faster with batching)
  • §25.17Common Pitfalls (close() of fd auto-removes from epoll only if last ref; dup'd fds trap; TOCTOU on unregister)
  • §25.18epoll vs kqueue vs IOCP (BSD/macOS unified event filters; Windows completion-based vs readiness-based)
  • §25.19epoll vs io_uring (readiness vs true async; io_uring SQ/CQ shared rings; multishot + zero-copy)
  • §25.20Reactor Pattern (epoll_wait → dispatch → handler — one thread per loop)
  • §25.21Proactor Pattern (true async completion — io_uring, IOCP)
  • §25.22Worked Example — Echo Server Progression (select → poll → epoll-LT → epoll-ET; benchmarks; pitfalls at each step)
  • §25.23Worked Example — High-Concurrency HTTP Server with epoll-ET + accept4 + SO_REUSEPORT
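
The edge-triggered drain rule from §25.13 and §25.16 can be sketched in a few lines. This is a minimal illustration, assuming Linux (Python's `select.epoll` is Linux-only) and a local `socketpair`; the helper names `drain` and `demo` are made up for this sketch and are not part of any library.

```python
import select
import socket


def drain(sock, chunk=4096):
    """Read from a non-blocking socket until EAGAIN; return everything read."""
    data = bytearray()
    while True:
        try:
            piece = sock.recv(chunk)
        except BlockingIOError:   # EAGAIN / EWOULDBLOCK: receive buffer is empty
            break
        if not piece:             # empty read means the peer closed
            break
        data += piece
    return bytes(data)


def demo():
    a, b = socket.socketpair()
    b.setblocking(False)          # ET is only safe on non-blocking fds
    ep = select.epoll()
    ep.register(b.fileno(), select.EPOLLIN | select.EPOLLET)

    a.sendall(b"x" * 10000)
    first = ep.poll(0.5)          # one notification for the readable transition
    partial = b.recv(100)         # deliberately under-read
    second = ep.poll(0.1)         # data remains, but no new edge is reported
    rest = drain(b)               # the drain rule: keep reading until EAGAIN

    ep.unregister(b.fileno())
    ep.close()
    a.close()
    b.close()
    return len(first), len(partial), len(second), len(rest)
```

Calling `demo()` returns the number of events seen by each `poll` call along with the byte counts. The second `poll` timing out while bytes are still buffered is the stall that an under-reading ET handler runs into.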

Part XXVI Event Loop Libraries — libev / libevent / libuv

libev manual · libevent book · libuv design · interview-critical

  • §26.1Why Wrap epoll/kqueue/IOCP — portability, watcher abstraction, timer wheel, signal safety
  • §26.2Library Landscape (libev — minimalist by Marc Lehmann; libevent — older, more features; libuv — Node.js, cross-platform incl. Windows)
  • §26.3libev Architecture — Loops & Watchers (one struct ev_loop, many ev_* watchers such as ev_io and ev_timer, embedded into user structs)
  • §26.4libev Watcher Types (ev_io fd readiness, ev_timer relative, ev_periodic absolute/repeating, ev_signal, ev_child SIGCHLD, ev_stat inotify, ev_idle, ev_prepare/check loop hooks, ev_async cross-thread wakeup, ev_embed nested loop, ev_fork)
  • §26.5libev Backend Selection (auto: epoll on Linux, kqueue on BSD/Mac, port on Solaris, poll/select fallback; EVBACKEND_* flags)
  • §26.6libev Core Loop (ev_run / ev_loop) — Phases (1. fork check → 2. invoke prepare → 3. fd reify → 4. compute timeout → 5. backend poll → 6. queue I/O, timers, periodics, idle, check → 7. invoke pending → repeat)
  • §26.7libev Timer Implementation (4-heap min-heap; O(log n) insert/extract; ev_now caching to avoid repeated clock_gettime)
  • §26.8libev fd-to-Watchers Map (ANFD array indexed by fd; multiple watchers per fd via linked list; reify on next loop iteration)
  • §26.9libev Priority Queue (priority -2..+2; pending events queued by priority; invoke_pending walks from highest)
  • §26.10libev Signal Handling — Safe Async (signalfd if available; pipe-based wakeup fallback; ev_signal watcher coalesces deliveries)
  • §26.11libev Fork Handling (ev_loop_fork: re-arm epoll fd in child, re-register signals; child should not call old loop)
  • §26.12libev Threading Model (one loop per thread; ev_async only safe cross-thread API; ev_loop is NOT thread-safe)
  • §26.13libev Embed Watcher (run a child loop inside parent — used to mix backends, e.g. select inside epoll loop)
  • §26.14libev vs libevent (libev = simpler, faster, less feature creep; libevent = HTTP/RPC helpers, evbuffer, evdns; its pre-event_base global API is deprecated)
  • §26.15libuv Internals — Cross-Platform (epoll/kqueue/IOCP/event ports; thread pool for FS + DNS; req-based async file I/O)
  • §26.16Worked Example — libev Echo Server (ev_io accept watcher → spawn ev_io read watcher per conn; ev_timer idle reaper)
  • §26.17Worked Example — Tearing Down libev for an Interviewer (loop-by-loop walkthrough; how each watcher type maps to a kernel mechanism)
  • §26.18Common Pitfalls (forgetting ev_io_stop on close; ET vs LT mismatch — libev assumes LT; not draining means infinite wakeups)
  • §26.19Choosing Library (libev for embedded / minimalist; libevent for HTTP; libuv for cross-platform Node-like; raw epoll if you want zero abstraction)

Appendix — Common Protocols & Well-Known Ports

Protocol | Transport / Port | Notes
DNS | UDP/53, TCP/53, DoT TCP/853, DoH TCP/443 | UDP for queries, TCP for >512B / zone xfer
DHCP / BOOTP | UDP/67 server, UDP/68 client | Broadcast at L2 then unicast
DHCPv6 | UDP/547 server, UDP/546 client | ff02::1:2 link-scoped multicast
HTTP / HTTPS | TCP/80, TCP/443; HTTP/3 UDP/443 | QUIC over UDP for HTTP/3
SSH | TCP/22 | Default for sshd, scp, sftp, ssh tunnels
BGP | TCP/179 | MD5 / TCP-AO authentication
OSPF | IP proto 89 | 224.0.0.5 (AllSPFRouters), 224.0.0.6 (AllDRouters)
EIGRP | IP proto 88 | 224.0.0.10
IS-IS | L2 directly (no IP) | AllL1ISs / AllL2ISs MAC
VRRP | IP proto 112 | 224.0.0.18
PIM | IP proto 103 | 224.0.0.13
IGMP | IP proto 2 | v2 gen-query 224.0.0.1
LDP | TCP/646, UDP/646 hello | Targeted hellos for remote LDP
GRE | IP proto 47 | Generic encap; classic tunnel
IPsec ESP | IP proto 50 | AH IP proto 51, IKE UDP/500, NAT-T UDP/4500
VXLAN | UDP/4789 (RFC 7348) | Linux historically used 8472 (pre-IANA)
Geneve | UDP/6081 | Variable-length TLV options
WireGuard | UDP (configurable, 51820 default) | Single UDP port
NVMe-oF / TCP | TCP/4420 (I/O), TCP/8009 (discovery) | NVM Express transport spec, not an RFC; NVMe/RDMA also uses service port 4420, while UDP/4791 is the RoCE v2 encapsulation port
RoCE v2 | UDP/4791 | BTH header inside UDP
RDMA CM | No well-known port; application-chosen port in the RDMA port space (RDMA_PS_TCP) | CM messages exchange QP numbers; the verbs provider allocates them
NTP | UDP/123 | Stratum hierarchy, leap seconds
Syslog | UDP/514, TCP/6514 TLS | RFC 5424 structured data
SNMP | UDP/161 query, UDP/162 trap | v3 has authPriv security
NetFlow / IPFIX | UDP/2055 / 4739 | Templated flow records
BFD | UDP/3784 single-hop, UDP/4784 multi-hop, UDP/3785 echo | Sub-second liveness

Appendix — TCP State Machine Quick Reference

State | Side | Triggered by | Next on normal path
CLOSED | both | Initial / after teardown | LISTEN (server) / SYN-SENT (client)
LISTEN | server | listen() | SYN-RCVD on incoming SYN
SYN-SENT | client | connect() sends SYN | ESTABLISHED on SYN-ACK
SYN-RCVD | server | Got SYN, sent SYN-ACK | ESTABLISHED on ACK
ESTABLISHED | both | Handshake complete | FIN-WAIT-1 (active close) / CLOSE-WAIT (passive close)
FIN-WAIT-1 | active closer | close() sends FIN | FIN-WAIT-2 (ACK only) / CLOSING (FIN crosses) / TIME-WAIT (FIN+ACK)
FIN-WAIT-2 | active closer | Peer ACKed our FIN | TIME-WAIT on peer's FIN
CLOSE-WAIT | passive closer | Got peer's FIN | LAST-ACK after own close()
LAST-ACK | passive closer | Sent FIN after peer's | CLOSED on peer's ACK
CLOSING | both (rare) | Simultaneous close (FINs cross) | TIME-WAIT on ACK
TIME-WAIT | active closer | Final ACK sent | CLOSED after 2*MSL
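
The table above can be transcribed into a small lookup-driven state machine. The sketch below is only an illustration: the event labels (`recv_synack`, `timeout_2msl`, and so on) are names invented here, and it covers only the normal-path transitions listed in the table, not resets or retransmissions.

```python
# (state, event) -> next state, transcribed from the quick-reference table.
# Event names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("CLOSED", "listen"): "LISTEN",
    ("CLOSED", "connect"): "SYN-SENT",
    ("LISTEN", "recv_syn"): "SYN-RCVD",
    ("SYN-SENT", "recv_synack"): "ESTABLISHED",
    ("SYN-RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN-WAIT-1",
    ("ESTABLISHED", "recv_fin"): "CLOSE-WAIT",
    ("FIN-WAIT-1", "recv_ack"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "recv_fin"): "CLOSING",
    ("FIN-WAIT-1", "recv_fin_ack"): "TIME-WAIT",
    ("FIN-WAIT-2", "recv_fin"): "TIME-WAIT",
    ("CLOSE-WAIT", "close"): "LAST-ACK",
    ("LAST-ACK", "recv_ack"): "CLOSED",
    ("CLOSING", "recv_ack"): "TIME-WAIT",
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",
}


def walk(events, state="CLOSED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for ev in events:
        key = (state, ev)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {ev}")
        state = TRANSITIONS[key]
        path.append(state)
    return path
```

For example, `walk(["connect", "recv_synack", "close", "recv_ack", "recv_fin", "timeout_2msl"])` traces a client that opens and then actively closes a connection, ending back in CLOSED after TIME-WAIT.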

Appendix — Congestion Control Algorithms Cheat Sheet

Algo | Signal | cwnd Behavior | Best For
Tahoe | Loss (3 dup ACK or RTO) | cwnd = 1, slow start to ssthresh = cwnd/2 | Historical baseline
Reno | Loss (3 dup ACK) | cwnd = ssthresh = cwnd/2 + fast recovery | Low-loss small-RTT links
NewReno | Loss + partial ACK | Stay in fast recovery for multiple losses | Pre-SACK era; still default fallback
CUBIC | Loss (cubic function of t since loss) | Cubic concave then convex around W_max | Long-haul high-BDP TCP; Linux default
BBR v1 | Bandwidth × min-RTT (model-based) | Pace at estimated BtlBw; cap inflight near BtlBw × min-RTT; isolated loss does not collapse cwnd | Long-haul, lossy, video/CDN
BBR v2 | BBR signal + ECN + loss | Adds ECN response and CUBIC-fairness | DC + WAN mixed traffic
Vegas | RTT increase (delay-based) | Reduce on RTT growth, no loss needed | Low-loss links; loses to Reno in mixed
Westwood | Loss + bandwidth estimate | ssthresh = bw * min-RTT after loss | Wireless / lossy links
DCTCP | ECN-CE fraction | α-weighted multiplicative decrease per round | Data center fabrics with ECN-marking switches
CTCP | Loss + delay | AIMD + delay-based component (Microsoft) | Windows long-haul
HTCP | Loss, time-since-loss | Aggressive cwnd growth on long no-loss periods | Very high-BDP scientific links
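
A few of the window reactions in the cheat sheet can be written out as arithmetic. The sketch below is a simplification in units of segments: the function names are made up for illustration, the 2-segment floor on ssthresh and the gain g = 1/16 are assumptions taken from common descriptions of Reno and DCTCP, and real stacks track considerably more state.

```python
def tahoe_on_loss(cwnd):
    """Tahoe: restart from one segment; ssthresh becomes half the old window."""
    ssthresh = max(cwnd / 2.0, 2.0)
    return 1.0, ssthresh


def reno_on_triple_dupack(cwnd):
    """Reno fast recovery entry: cwnd and ssthresh both drop to half."""
    ssthresh = max(cwnd / 2.0, 2.0)
    return ssthresh, ssthresh


def dctcp_update(cwnd, alpha, marked, acked, g=1.0 / 16):
    """One DCTCP round: update alpha from the ECN-marked fraction, then cut cwnd.

    alpha <- (1 - g) * alpha + g * F, and on a marked round
    cwnd <- cwnd * (1 - alpha / 2).
    """
    frac = marked / acked if acked else 0.0
    alpha = (1.0 - g) * alpha + g * frac
    if marked:
        cwnd = max(cwnd * (1.0 - alpha / 2.0), 1.0)
    return cwnd, alpha
```

With alpha near 1 (persistent marking) the DCTCP cut approaches Reno's halving, while a single marked round starting from alpha = 0 trims the window by only about 3%. That proportional reaction is what the "α-weighted" entry in the table refers to.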

Appendix — I/O Multiplexing API Comparison

API | OS | Style | Complexity | Limit | Notes
select | POSIX everywhere | Readiness | O(n) scan, O(n) copy | FD_SETSIZE = 1024 | Bitmap in/out, oldest, broken at scale
poll | POSIX everywhere | Readiness | O(n) scan, O(n) copy | RLIMIT_NOFILE | Better than select; still O(n)
epoll | Linux 2.6+ | Readiness | O(1) wait, O(log n) ctl | RLIMIT_NOFILE | ET / LT, EPOLLEXCLUSIVE, persistent kernel state
kqueue | BSD / macOS | Readiness + filters | O(1) wait | kern.maxfilesperproc | Filters on fs, signals, timers, processes
IOCP | Windows | Completion | O(1) | - | True async: kernel completes I/O, posts to queue
io_uring | Linux 5.1+ | Completion (true async) | O(1) batched | RLIMIT_NOFILE | SQ/CQ shared rings, SQPOLL, multishot, registered FDs/buffers
AIO (libaio) | Linux | Completion | O(1) batched | - | Only O_DIRECT; effectively replaced by io_uring
POSIX AIO | POSIX | Completion | User-thread emulation | - | Slow; glibc emulates with threads

Appendix — High Availability Mechanisms

Mechanism | Layer | Failover Time | Common Use
HSRP | L3 first-hop (Cisco) | ~3-10s default, sub-second tuned | Default-gateway redundancy on access network
VRRP | L3 first-hop (RFC) | ~3s default, sub-second tuned | Open-standard FHRP, used by keepalived
GLBP | L3 first-hop + LB (Cisco) | Like HSRP | Active-active gateway load balancing
BFD | L3-agnostic liveness | <50ms typical | Speed up OSPF/BGP/static convergence
MC-LAG / vPC | L2 | Sub-second | Server multi-homing without STP blocking
StackWise / VSS | Chassis | ISSU sub-second; RPR/SSO sub-second | Two physical → one logical control plane
Anycast (BGP) | L3 routed | BGP convergence (1-30s) | DNS, CDN, public services
LFA / TI-LFA | IGP | <50ms | IGP-driven sub-50ms protection
MPLS FRR | MPLS | <50ms | RSVP-TE backup tunnels
Pacemaker / Corosync | Service | Seconds | Resource manager + STONITH
Keepalived + IPVS | L4 LB | Sub-second | Linux virtual server with VRRP failover
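
The VRRP and BFD failover figures in the table follow from simple timer arithmetic. The sketch below assumes the VRRP relationship Master_Down_Interval = 3 × advertisement interval + skew, with skew scaled by (256 − priority)/256, and the BFD asynchronous-mode detection time of multiplier × the slower of the two negotiated intervals. Function and parameter names are illustrative, not taken from any implementation.

```python
def vrrp_master_down_interval(priority, advert_interval=1.0):
    """Seconds a VRRP backup waits before declaring the master down."""
    skew = (256 - priority) * advert_interval / 256.0
    return 3.0 * advert_interval + skew


def bfd_detection_time(detect_mult, local_rx_min, remote_tx_min):
    """Seconds without BFD packets before the session is declared down."""
    return detect_mult * max(local_rx_min, remote_tx_min)
```

A backup with priority 100 and 1-second advertisements takes over after roughly 3.6 s, which is where the "~3s default" entry comes from. A BFD session at 15 ms intervals with a multiplier of 3 detects failure in about 45 ms.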

Appendix — AI Collective Operations Cheat Sheet

Op | What it does | Bandwidth Cost (N ranks, M bytes) | When used
Broadcast | 1 → all (root sends to everyone) | M (per link in tree) | Initial weight distribution
Reduce | all → 1 (sum/max/min at root) | M | Aggregating loss / metrics to rank 0
AllReduce | all → all of reduced value | 2M(N-1)/N (ring; bandwidth-optimal) | DDP / FSDP gradient sync (most common)
AllGather | concat tensors from all → all | M(N-1)/N | FSDP unshard, sequence parallelism gather
ReduceScatter | elementwise reduce + scatter slices | M(N-1)/N | FSDP gradient pre-shard; half of AllReduce
AllToAll | rank i sends slice j to rank j | M(N-1)/N, but between every pair of ranks | MoE expert dispatch, sequence parallelism
Scatter | 1 → all (each gets a slice) | M(N-1)/N | Initial data partitioning
Gather | all → 1 (concatenate slices) | M(N-1)/N | Collect outputs to rank 0
Barrier | synchronize without data | log N rounds | Phase boundaries
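
The 2M(N-1)/N cost quoted for ring AllReduce can be checked with a small simulation. This is a pure-Python sketch that counts elements rather than bytes, assumes the vector length divides evenly by the number of ranks, and models the ring as in-process list copies; it shows the schedule only and is not how NCCL or any other library implements it.

```python
def ring_allreduce(vectors):
    """Simulate a sum AllReduce over a ring; return (results, elems_sent_per_rank)."""
    n = len(vectors)
    size = len(vectors[0])
    assert size % n == 0, "sketch assumes the vector splits evenly into N chunks"
    chunk = size // n
    data = [list(v) for v in vectors]
    sent = [0] * n

    def seg(c):
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) mod n
    # to rank r + 1, which adds it into its own copy of that chunk.
    for s in range(n - 1):
        msgs = [(r, (r - s) % n, data[r][seg((r - s) % n)]) for r in range(n)]
        for r, c, payload in msgs:
            dst = (r + 1) % n
            data[dst][seg(c)] = [x + y for x, y in zip(data[dst][seg(c)], payload)]
            sent[r] += chunk

    # Rank r now owns the fully reduced chunk (r + 1) mod n.
    # Phase 2, all-gather: in step s, rank r forwards chunk (r + 1 - s) mod n
    # to rank r + 1, which overwrites its copy.
    for s in range(n - 1):
        msgs = [(r, (r + 1 - s) % n, data[r][seg((r + 1 - s) % n)]) for r in range(n)]
        for r, c, payload in msgs:
            dst = (r + 1) % n
            data[dst][seg(c)] = payload
            sent[r] += chunk

    return data, sent
```

With 4 ranks and 8-element vectors, each rank sends 2 × 3 × 2 = 12 elements, matching 2M(N-1)/N for M = 8 and N = 4. The first phase on its own is the ReduceScatter row and the second phase is the AllGather row, which is why each costs half of AllReduce.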

Appendix — RDMA Verb Operations

Operation | Sided | Receiver CPU? | Notes
SEND / RECV | Two-sided | Yes (must post RECV) | Like sockets; needs matching RECV WR posted
RDMA WRITE | One-sided | No | Initiator writes into peer's pre-registered MR using rkey
RDMA WRITE with IMM | One-sided + signal | Yes (consumes RECV) | Write + 4-byte immediate value triggers receive completion
RDMA READ | One-sided | No | Initiator reads from peer's MR; lower throughput than WRITE
ATOMIC FETCH_ADD | One-sided RMW | No | 8-byte atomic, consistent across HCA & host CPU only on certain hw
ATOMIC CMP_SWP | One-sided RMW | No | Compare-and-swap on remote 8 bytes
SEND with INVALIDATE | Two-sided | Yes | Invalidates a receiver-side rkey atomically with delivery

Appendix — libev Watcher Types Quick Reference

Watcher | Triggered by | Mapped to
ev_io | fd readable / writable | epoll_ctl ADD on backend
ev_timer | Relative timeout (after X seconds, optional repeat) | Min-heap; loop computes nearest deadline for epoll_wait timeout
ev_periodic | Absolute time / cron-like reschedule callback | Min-heap with reschedule cb
ev_signal | POSIX signal received | signalfd or pipe + sigaction handler
ev_child | SIGCHLD for a specific PID | Internal signal watcher + waitpid
ev_stat | File stat changes (path-based) | inotify if available, else periodic stat()
ev_idle | No other events pending | Run after all ready events processed
ev_prepare | Before each loop iteration's poll | Hook used by glue layers (Perl, etc.)
ev_check | After each poll, before invoke | Hook for glue layers
ev_async | ev_async_send() from another thread | eventfd / pipe wakeup; ONLY safe cross-thread API
ev_embed | Inner ev_loop made pollable as one fd | Run a kqueue inside an epoll loop, etc.
ev_fork | After fork() in child | Cleanup on fork
ev_cleanup | Loop destroyed | Final teardown hook
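
The ev_timer row describes a min-heap of deadlines from which the loop derives its poll timeout. The sketch below shows that idea with Python's binary `heapq` standing in for libev's 4-heap; the class name `TimerHeap`, its methods, and the `max_block` cap are all hypothetical names for illustration, not libev API.

```python
import heapq
import itertools


class TimerHeap:
    """Deadline-ordered timers; the earliest deadline bounds the next poll."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker so callbacks are never compared

    def add(self, deadline, callback):
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))

    def next_timeout(self, now, max_block=60.0):
        """Seconds the backend may block before the earliest timer is due."""
        if not self._heap:
            return max_block
        return min(max(self._heap[0][0] - now, 0.0), max_block)

    def expire(self, now):
        """Pop and return callbacks whose deadline has passed, earliest first."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            fired.append(callback)
        return fired
```

An event loop built this way would pass `next_timeout(now)` as the timeout of its `epoll_wait` call and then run whatever `expire(now)` returns once the wait ends. Insert and extract are both O(log n), matching the complexity listed in §26.7.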

Appendix — Cisco Certification Path Quick Reference

Track | CCNA | CCNP (core + concentration) | CCIE (lab)
Enterprise | 200-301 CCNA | ENCOR 350-401 + ENARSI / ENSDWI / ENSLD / ENWLSD / ENWLSI / ENAUTO / etc. | CCIE Enterprise Infrastructure (Lab v1.x)
Data Center | 200-301 CCNA | DCCOR 350-601 + DCID / DCACI / DCACIA / DCAUI | CCIE Data Center
Service Provider | 200-301 CCNA | SPCOR 350-501 + SPRI / SPVI / SPCNI / SPAUI | CCIE Service Provider
Security | 200-301 CCNA | SCOR 350-701 + SISE / SNCF / SVPN / SWSA / SAUTO | CCIE Security
Collaboration | 200-301 CCNA | CLCOR 350-801 + CLICA / CLACCM / CLCEI / CLAUTO | CCIE Collaboration
DevNet | DEVASC | DEVCOR 350-901 + concentration | CCDE / DevNet Expert