Part XIV — Debugging

§14.1–14.6 Debugging & Profiling Tools

A field guide for moving from packet loss, slow syscalls, hot CPU, kernel crashes, and memory leaks to a concrete root cause, using capture, tracing, profiling, and debugging tools.

1. Overview

Kernel debugging is a layer-selection problem: first decide whether the symptom lives in packets, syscalls, CPU execution, kernel control flow, or memory ownership, then choose the narrowest tool that can observe that layer without changing the system too much.

2. §14.1 — Network Debugging

Packet capture sits beside the normal receive path: AF_PACKET receives a copy of matching packets while the driver still hands the original skb to the protocol stack.

Tool                     Purpose                                Command shape
-----------------------  -------------------------------------  -------------
tcpdump / Wireshark      Capture and decode packets             tcpdump -i eth0 -nn host 10.0.0.1 and port 80
ss                       Inspect sockets and TCP state          ss -tulpn; ss -s
iperf3                   Measure TCP or UDP throughput          iperf3 -c 10.0.0.2 -P 8; iperf3 -c 10.0.0.2 -u -b 1G
ping / traceroute / mtr  Find latency, loss, and path changes   mtr -ezbw 8.8.8.8
ethtool -S               Read driver and per-queue counters     ethtool -S eth0 | grep -i drop
ip -s link               Read interface RX/TX drops and errors  ip -s link show dev eth0

Filters and counters

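  • host 10.0.0.1 and port 80 keeps only port 80 traffic (normally HTTP) to or from one host.
  • tcp[tcpflags] & tcp-syn != 0 matches every segment with SYN set, including SYN-ACK replies; it is useful for connection storms or SYN cookie checks. Use tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn to match only the opening SYN.
  • ss -tulpn maps listening TCP and UDP sockets to processes; ss -s shows socket and TCP state totals.
  • ethtool -S eth0 | grep -i drop shows NIC and driver drops, often per queue; protocol-level drops appear in nstat or netstat -s instead.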
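The SYN filter above is a single bit test on byte 13 of the TCP header. A minimal sketch of that test in C, assuming the caller has already located the start of the TCP header; the function names and constants are hypothetical and not part of libpcap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TCP_FLAGS_OFFSET = 13, TCP_MIN_HDR = 20 };
enum { TCP_FLAG_SYN = 0x02, TCP_FLAG_ACK = 0x10 };

/* Mirrors `tcp[tcpflags] & tcp-syn != 0`: true for SYN and for SYN-ACK. */
int tcp_has_syn(const uint8_t *tcp_hdr, size_t len)
{
    assert(tcp_hdr != NULL);
    if (len < TCP_MIN_HDR)
        return 0; /* truncated header: the flags byte cannot be read safely */
    return (tcp_hdr[TCP_FLAGS_OFFSET] & TCP_FLAG_SYN) != 0;
}

/* Stricter variant: SYN set and ACK clear, i.e. the client's opening segment. */
int tcp_is_initial_syn(const uint8_t *tcp_hdr, size_t len)
{
    if (!tcp_has_syn(tcp_hdr, len))
        return 0;
    return (tcp_hdr[TCP_FLAGS_OFFSET] & TCP_FLAG_ACK) == 0;
}
```

During a SYN flood the two predicates diverge: the first also counts the server's SYN-ACK replies, while the second counts only inbound connection attempts.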

3. §14.2 — System Call Tracing

strace uses ptrace to stop a task at each syscall entry and exit, which gives precise arguments and return values at the cost of many context switches.

What to run

  • strace -e trace=network -T -p <pid> prints socket syscalls and per-call duration, for example connect(...) = -1 EINPROGRESS <0.000042>.
  • ltrace -p <pid> traces shared library calls, which is useful when libc, OpenSSL, or allocator behavior is the suspected layer.
  • perf trace -p <pid> uses kernel tracing infrastructure and is usually lower overhead than ptrace-based strace.
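The cost of ptrace-based tracing comes from stopping the tracee twice per syscall. A minimal Linux-only sketch that counts those stops for a child process, assuming ptrace is permitted in the current environment; count_syscall_stops is a hypothetical helper, and a real tracer such as strace would also read registers at each stop to decode arguments:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that issues n_calls getpid() syscalls and count the
 * syscall-entry and syscall-exit stops the tracer observes.
 * Returns the stop count, or -1 if tracing could not be set up.
 */
long count_syscall_stops(int n_calls)
{
    assert(n_calls >= 0);
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                         /* tracee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);                     /* pause until the tracer is ready */
        for (int i = 0; i < n_calls; i++)
            syscall(SYS_getpid);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;                          /* child exited: ptrace was refused */

    /* Report syscall stops as SIGTRAP|0x80 so they differ from real signals. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long stops = 0;
    long sig = 0;                           /* signal to deliver on resume */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)sig) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;                        /* one entry or exit stop */
            sig = 0;
        } else {
            sig = WSTOPSIG(status);         /* forward an ordinary signal */
        }
    }
    return stops;
}
```

Each getpid() call contributes two stops, and every stop is a round trip through the scheduler between tracee and tracer. The total is usually slightly above 2 * n_calls because the C library and process exit issue syscalls of their own.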

4. §14.3 — Performance Profiling

perf turns hardware events into sampled instruction pointers: a PMU counter overflows, the overflow interrupt (an NMI on x86) records the instruction pointer and, with -g, the call chain, and later symbolization aggregates hot paths.

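Flame graphs merge sampled stacks so that a frame's width is proportional to the number of samples containing it (a proxy for CPU time) and vertical depth is call depth; the horizontal order is alphabetical, not chronological. The widest frames along the top edge are the first optimization candidates.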

Tool                Answers                                Command
------------------  -------------------------------------  -------
perf stat           Whole-program counters                 perf stat -e cycles,instructions,cache-misses ./app
perf record/report  Sample CPU stacks                      perf record -g -p <pid>; perf report --stdio
perf top            Live hottest symbols                   perf top -g
perf sched          Scheduler latency and switches         perf sched record ./app; perf sched latency
perf lock           Lock contention                        perf lock record ./app; perf lock report
perf c2c            False sharing and cache line bouncing  perf c2c record ./app; perf c2c report

IPC is instructions / cycles. Low IPC usually means the CPU is waiting on memory, branch recovery, locks, or kernel scheduling instead of retiring useful instructions.
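The ratio is simple enough to compute by hand from two perf stat counters. A small sketch, assuming the counts have already been read from perf stat output; the helper name and the numbers in the usage note are illustrative, not measurements:

```c
#include <assert.h>
#include <stdint.h>

/* Instructions retired per cycle; returns 0.0 when no cycles were counted. */
double ipc(uint64_t instructions, uint64_t cycles)
{
    if (cycles == 0)
        return 0.0; /* avoid dividing by zero on an empty measurement */
    return (double)instructions / (double)cycles;
}
```

For example, 3,000,000,000 instructions over 2,000,000,000 cycles gives an IPC of 1.5, while the same instruction count over 6,000,000,000 cycles gives 0.5, which would point toward stalls rather than useful work.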

5. §14.4 — Kernel Debugging

KGDB gives source-level debugging of a target kernel from a separate host GDB session; it is invasive, but it is the direct tool when control flow must stop exactly at a breakpoint.

An OOPS starts at a CPU exception, then the kernel records registers, fault context, stack frames, and loaded modules before killing the current task or panicking.

Crash artifacts

  • OOPS anatomy: faulting PC or RIP, link register on ARM, stack pointer, fault address, register dump, call trace, taint flags, and module list.
  • crash vmlinux vmcore opens a dump; bt, ps, vm, and log are the first commands.
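  • addr2line -e vmlinux <address> maps a raw kernel text address back to a source file and line when debug info is available (with KASLR the address must first be adjusted by the load offset); for the symbol+offset form printed in a call trace, scripts/faddr2line in the kernel tree does the same job.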
  • KASAN reports include access type, byte count, bad address, allocation/free stack, and shadow bytes that explain the poisoned region.
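The shadow bytes in a KASAN report are easier to read with the encoding in mind. The sketch below is a userspace toy model of generic KASAN's scheme, assuming one shadow byte per 8-byte granule where 0 means fully addressable, 1 to 7 means only that many leading bytes are addressable, and a negative value means poisoned. All names and the poison value are hypothetical; the kernel's own constants live in mm/kasan and vary by version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GRANULE = 8, HEAP_BYTES = 64, SHADOW_BYTES = HEAP_BYTES / GRANULE };
#define TOY_POISON ((int8_t)-1) /* hypothetical: any negative shadow byte */

struct toy_heap { int8_t shadow[SHADOW_BYTES]; };

/* Mark bytes [0, size) addressable and everything after them as redzone. */
void toy_alloc(struct toy_heap *h, size_t size)
{
    assert(h != NULL && size <= HEAP_BYTES);
    memset(h->shadow, TOY_POISON, sizeof h->shadow);
    size_t full = size / GRANULE;
    memset(h->shadow, 0, full);                 /* fully valid granules */
    if (size % GRANULE)
        h->shadow[full] = (int8_t)(size % GRANULE); /* partial granule */
}

/* Poison the whole region so later accesses look like use-after-free. */
void toy_free(struct toy_heap *h)
{
    assert(h != NULL);
    memset(h->shadow, TOY_POISON, sizeof h->shadow);
}

/* Return 1 if every byte in [off, off + len) is addressable, else 0. */
int toy_access_ok(const struct toy_heap *h, size_t off, size_t len)
{
    assert(h != NULL);
    for (size_t i = off; i < off + len; i++) {
        if (i >= HEAP_BYTES)
            return 0;
        int8_t s = h->shadow[i / GRANULE];
        if (s < 0)
            return 0;                           /* poisoned granule */
        if (s > 0 && (int8_t)(i % GRANULE) >= s)
            return 0;                           /* past the valid prefix */
    }
    return 1;
}
```

For a 13-byte object the shadow reads 00 05 followed by poison bytes, so a 14-byte access fails on the last byte, which is the same pattern a real out-of-bounds report prints around the bad address.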

6. §14.5 — Kernel Tracing

ftrace uses compiler-inserted function entry hooks and per-CPU ring buffers, then exposes the stream through tracefs for live reads or offline trace-cmd analysis.
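The ring buffer explains why old events vanish from a long trace: in the default overwrite mode, new events replace the oldest ones once the buffer is full. A toy single-CPU model of that behavior, assuming a fixed number of slots; the struct and function names are hypothetical, and the kernel's buffer is page-based and lockless rather than a plain array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RING_SLOTS = 4 };

struct trace_event { uint64_t ts; uint32_t func_id; };

struct toy_ring {
    struct trace_event slot[RING_SLOTS];
    size_t head;    /* total events ever written */
    size_t overrun; /* events lost because a newer one overwrote them */
};

/* Append one event, overwriting the oldest when the ring is full. */
void ring_write(struct toy_ring *r, uint64_t ts, uint32_t func_id)
{
    assert(r != NULL);
    if (r->head >= RING_SLOTS)
        r->overrun++;
    r->slot[r->head % RING_SLOTS] = (struct trace_event){ ts, func_id };
    r->head++;
}

/* Copy the surviving events oldest-first; returns how many were copied. */
size_t ring_read(const struct toy_ring *r, struct trace_event *out, size_t cap)
{
    assert(r != NULL && out != NULL);
    size_t n = r->head < RING_SLOTS ? r->head : RING_SLOTS;
    size_t first = r->head - n; /* sequence number of the oldest survivor */
    size_t i;
    for (i = 0; i < n && i < cap; i++)
        out[i] = r->slot[(first + i) % RING_SLOTS];
    return i;
}
```

Writing six events into four slots leaves events three through six readable and an overrun count of two, which is the situation a larger buffer size or a tighter function filter is meant to avoid.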

eBPF makes tracing programmable: a program is compiled to BPF bytecode, verified for safety at load time, optionally JIT compiled to native code, and attached to hooks such as kprobes, tracepoints, tc, or XDP.

A kprobe instruments almost any kernel instruction (code marked NOKPROBE_SYMBOL is excluded) by patching in a trap or an optimized jump, running a handler, then executing the displaced instruction out of line and resuming the original instruction stream.

Tracing commands

  • bpftrace -e 'kprobe:tcp_retransmit_skb { @[comm] = count(); }' counts TCP retransmit calls by process name.
  • biolatency from bcc records block I/O latency as a histogram.
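  • cd /sys/kernel/tracing; echo tcp_v4_connect > set_ftrace_filter; echo function > current_tracer; cat trace is the minimal ftrace workflow. Set the filter first so the tracer does not briefly record every function; older systems mount tracefs at /sys/kernel/debug/tracing.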

7. §14.6 — Memory Debugging

/proc/meminfo separates free memory from reclaimable cache; MemAvailable is the practical pressure signal because it estimates what can be allocated without swapping.
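A monitoring check built on this usually reads two fields and compares them. A minimal sketch that extracts fields from /proc/meminfo text already held in a buffer, assuming the usual "Key:   value kB" line layout; the helper names are hypothetical and the figures in the usage note are made up:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the kB value for an exact key such as "MemAvailable", or -1. */
long meminfo_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t klen = strlen(key);
    for (const char *line = text; line && *line; ) {
        /* Require the colon so "Mem" does not match "MemTotal". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            return sscanf(line + klen + 1, "%ld", &kb) == 1 ? kb : -1;
        }
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;
    }
    return -1;
}

/* MemAvailable as a percentage of MemTotal, or -1 if a field is missing. */
int mem_available_pct(const char *text)
{
    long long total = meminfo_kb(text, "MemTotal");
    long long avail = meminfo_kb(text, "MemAvailable");
    if (total <= 0 || avail < 0)
        return -1;
    return (int)(avail * 100 / total);
}
```

Given a buffer with MemTotal at 16000000 kB, MemFree at 500000 kB, and MemAvailable at 4000000 kB, the result is 25 percent even though only about 3 percent is strictly free, which is the gap between free memory and reclaimable cache.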

Valgrind Memcheck maintains shadow state for application memory, so every load and store can be checked for addressability and initialization.

  • valgrind --leak-check=full --track-origins=yes ./app finds userspace leaks and uninitialized reads.
  • cat /proc/slabinfo shows kernel object cache growth; suspicious caches can explain unreclaimable memory.
  • echo scan > /sys/kernel/debug/kmemleak; cat /sys/kernel/debug/kmemleak asks kmemleak to report unreachable kernel allocations.
  • vmstat 1 connects memory pressure to swap, reclaim, I/O wait, and runnable queue pressure.

8. Minimal C Demo

This tiny program models what perf script | stackcollapse-perf.pl prepares for a flame graph: identical stack prefixes accumulate, and the widest path is the best starting point.

Perf Samples to Flame-Graph Intuition — C Demo
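A minimal sketch of such a program, assuming each input line is one sampled stack written root-first with frames joined by semicolons (for example main;parse;read) and without the trailing sample count that stackcollapse-perf.pl emits. The struct, function names, table sizes, and the FLAME_DEMO_MAIN switch are all hypothetical choices for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MAX_NODES = 256, MAX_STACK = 256 };

struct flame {
    char path[MAX_NODES][MAX_STACK]; /* a stack prefix such as "main;parse" */
    unsigned long width[MAX_NODES];  /* samples that passed through it */
    int n;
};

/* Increment the width of one prefix, creating its entry on first sight. */
static void flame_bump(struct flame *f, const char *prefix, size_t len)
{
    for (int i = 0; i < f->n; i++)
        if (strlen(f->path[i]) == len && memcmp(f->path[i], prefix, len) == 0) {
            f->width[i]++;
            return;
        }
    if (f->n == MAX_NODES || len >= MAX_STACK)
        return; /* table full or frame too long: drop the sample */
    memcpy(f->path[f->n], prefix, len);
    f->path[f->n][len] = '\0';
    f->width[f->n++] = 1;
}

/* Add one sample: every prefix of the stack gets one unit of width. */
void flame_add(struct flame *f, const char *stack)
{
    assert(f != NULL && stack != NULL);
    size_t len = strcspn(stack, "\n");
    for (size_t i = 1; i <= len; i++)
        if (i == len || stack[i] == ';')
            flame_bump(f, stack, i);
}

/* Width of a prefix in samples, or 0 if it never appeared. */
unsigned long flame_width(const struct flame *f, const char *prefix)
{
    assert(f != NULL && prefix != NULL);
    for (int i = 0; i < f->n; i++)
        if (strcmp(f->path[i], prefix) == 0)
            return f->width[i];
    return 0;
}

#ifdef FLAME_DEMO_MAIN
/* Build with -DFLAME_DEMO_MAIN to read stacks from stdin and print widths. */
int main(void)
{
    static struct flame f;
    char line[MAX_STACK];
    while (fgets(line, sizeof line, stdin))
        flame_add(&f, line);
    for (int i = 0; i < f.n; i++)
        printf("%-40s %lu\n", f.path[i], f.width[i]);
    return 0;
}
#endif
```

With the samples main;parse;read twice, main;parse;alloc once, and main;idle once, main has width 4, main;parse has width 3, and main;parse;read has width 2, so the parse subtree is where to look first.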

9. Kernel Source Pointers

Area                  Files and functions
--------------------  -------------------
perf events           kernel/events/core.c: perf_event_open, perf_sample_event_took
ptrace syscall stops  kernel/ptrace.c, arch/*/kernel/ptrace.c
ftrace                kernel/trace/ftrace.c, kernel/trace/trace.c
kprobes               kernel/kprobes.c, arch/*/kernel/kprobes* (x86: arch/x86/kernel/kprobes/core.c)
BPF verifier          kernel/bpf/verifier.c, kernel/bpf/syscall.c
OOPS and panic        kernel/panic.c, arch/*/mm/fault.c, arch/*/kernel/traps.c
KASAN                 mm/kasan/report.c, mm/kasan/shadow.c
kmemleak              mm/kmemleak.c

10. Interview Prep

Question                             Concise answer
-----------------------------------  --------------
What does perf stat measure?         Hardware and software counters such as cycles, instructions, cache misses, context switches, and faults. IPC is instructions divided by cycles.
How do you build a CPU flame graph?  Record stacks with perf record -g, convert perf script output with stackcollapse-perf.pl, then render with flamegraph.pl.
How do you detect false sharing?     Use perf c2c to find cache lines bouncing between cores, then inspect the structs and align or shard the hot fields.
What does KASAN catch?               Kernel out-of-bounds, use-after-free, invalid frees, and related memory bugs by poisoning shadow memory.
How do you analyze an OOPS?          Identify the faulting address and PC, symbolize the call trace, check taint and modules, then map the top real frame back to source.
How does a kprobe work?              The kernel patches a target instruction with a trap or optimized jump, runs the probe handler with register context, then resumes execution.
What does the BPF verifier prove?    It checks bounded execution, safe memory access, helper permissions, pointer types, and map access rules at load time, before the program can be attached.
How do you diagnose a kernel leak?   Watch slab growth, enable kmemleak, trigger a scan, then inspect unreachable allocation stacks.