Tech Notes

1. § 3.1 — StackWise Ring

StackWise turns several Catalyst switches into one logical switch through a proprietary ring backplane. The active master owns configuration, protocols, and management; the standby is ready to take over, while member switches contribute ports and forwarding ASICs.

Role	Selection	Operational Meaning
Active master	Highest priority, then tie-breakers ending with the lowest MAC address	Owns CLI, STP, routing protocols, and stack configuration
Standby	Next best candidate	Keeps enough state to take over during master failure
Member	Remaining switches	Provides local ports and hardware forwarding under master control

2. § 3.2 — StackWise Virtual

StackWise Virtual keeps the single-switch management model but stretches it across two chassis joined by a StackWise Virtual Link (SVL) bundle. A separate dual-active detection (DAD) path matters because the worst failure is not a dead chassis; it is two live chassis advertising the same identity.

stackwise-virtual domain binds the pair into one logical system.
SVL carries control synchronization and traffic that must cross between chassis.
DAD should use an independent path so SVL failure can be distinguished from peer death.

3. § 3.3 — VSS

VSS on Catalyst 6500/6800 merges two chassis into one switch through the Virtual Switch Link (VSL). The active supervisor handles protocols; the standby is synchronized with SSO/NSF, while both chassis forward data and support multichassis EtherChannel downstream.

Dual-active detection can use Enhanced PAgP, IP BFD, or fast hello over a dedicated link. Once dual-active is confirmed, the previously active chassis enters recovery mode and shuts down its data ports instead of letting duplicate gateway MAC/IP state corrupt the network.

4. § 3.4 — vPC

vPC lets two Nexus switches present one LACP system to a downstream server or access switch without merging both control planes. The peer-link carries VLAN traffic plus MAC/ARP synchronization; the keepalive is only a separate heartbeat used to avoid split-brain decisions.

The important interview case is peer-link failure. If the keepalive remains up, the primary keeps forwarding while the secondary suspends its vPC member ports (and any orphan ports configured to be suspended along with them), so the two peers never forward as independent owners of the same LAG.

Minimal C Demo — vPC Failure Walk-through

5. § 3.5 — MLAG and MC-LAG

MLAG is the same operational idea as vPC in vendor-specific form: two switches synchronize enough state to appear as one LAG endpoint. Peer-gateway avoids needless peer-link hairpinning by letting each switch route frames addressed to its peer's gateway MAC locally.

Arista MLAG uses a peer-link and a local VLAN interface for peer adjacency.
Juniper MC-LAG uses ICCP over TCP for inter-chassis coordination.
ARP/MAC sync is what keeps failover from forcing every host to rediscover the gateway.

6. § 3.6 — De-stack into Pure L3 ECMP

The de-stack trend removes the peer-link as a shared fate point. Servers or ToR leaves use routed links and BGP ECMP, often with host routing through BIRD or GoBGP, so every uplink is active and a single leaf failure stays local to directly attached hosts.

Minimal C Demo — ECMP Path Selection

7. § 3.7 — Failure Domain Comparison

Architecture	Failure Domain	Complexity	Bandwidth Efficiency	Convergence
StackWise	Ring/control failure affects the stack as one logical switch	Medium	Good through stack backplane	Fast with standby state
VSS / SVL	VSL/SVL failure can become dual-active	High	Good with local switching	Fast with SSO and DAD
vPC / MLAG	Peer-link failure suspends the secondary peer's member ports	Medium	Good; both access links active	Fast with keepalive/BFD
Pure L3 ECMP	One leaf failure affects only attached hosts or routes	Low at L2, higher host/routing requirement	Excellent; all routed paths active	Fast with BGP and BFD

8. § 3.8 — ISSU and GIR

ISSU (In-Service Software Upgrade) depends on redundant control planes and synchronized forwarding state: upgrade the standby side, switch over with SSO, then upgrade the old active side. GIR (Graceful Insertion and Removal) is the broader maintenance pattern: drain the node by gracefully withdrawing routes before touching it.

9. § 3.9 — Dual-Active Detection

Split-brain means the inter-switch control link failed while both halves are still alive. Detection methods such as Enhanced PAgP, BFD, or a dedicated fast-hello link prove the peer still exists even though the control link is down; recovery then isolates one side by suspending its non-management ports.

Minimal C Demo — Dual-Active Recovery Choice

10. § 3.10 — Migration Stories

A clean migration keeps old and new designs side by side long enough to move one failure domain at a time. StackWise to SVL mainly changes cabling and chassis identity; SVL/vPC to L3 leaf-spine changes the operating model by replacing L2 adjacency with routed reachability.

11. Interview Prep

What is the difference between vPC peer-link and keepalive? The peer-link carries VLAN traffic and state sync; keepalive is a separate heartbeat for peer-liveness decisions.
What happens when a vPC peer-link fails but keepalive stays up? Primary keeps forwarding; secondary suspends its vPC member ports and any orphan ports configured to be suspended with them.
Why is dual-active dangerous? Both halves advertise the same logical switch identity, creating duplicate MAC/IP ownership and blackholes.
Why move to pure L3 ECMP? It removes STP/MLAG split-brain domains and uses all links, at the cost of host or application L3 awareness.
What does ISSU require? SSO/NSF-capable redundancy, compatible images, synchronized forwarding state, and strict pre/post checks.

§ 3 StackWise, VSS, vPC, MLAG, and L3 ECMP
