§14.1–14.6 Debugging & Profiling Tools
A field guide for moving from packet loss, slow syscalls, hot CPU, kernel crashes, tracepoints, and memory leaks to a concrete root cause.
1. Overview
Kernel debugging is a layer-selection problem: first decide whether the symptom lives in packets, syscalls, CPU execution, kernel control flow, or memory ownership, then choose the narrowest tool that can observe that layer without changing the system too much.
2. §14.1 — Network Debugging
Packet capture sits beside the normal receive path: AF_PACKET receives a copy of matching packets while the driver still hands the original skb to the protocol stack.
| Tool | Purpose | Command shape |
|---|---|---|
| tcpdump / Wireshark | Capture and decode packets | tcpdump -i eth0 -nn host 10.0.0.1 and port 80 |
| ss | Inspect sockets and TCP state | ss -tulpn; ss -s |
| iperf3 | Measure TCP or UDP throughput | iperf3 -c 10.0.0.2 -P 8; iperf3 -u -b 1G |
| ping / traceroute / mtr | Find latency, loss, and path changes | mtr -ezbw 8.8.8.8 |
| ethtool -S | Read driver and per-queue counters | ethtool -S eth0 \| grep -i drop |
| ip -s link | Read interface RX/TX drops and errors | ip -s link show dev eth0 |
Filters and counters
- `host 10.0.0.1 and port 80` keeps only port-80 traffic to or from one host.
- `tcp[tcpflags] & tcp-syn != 0` catches every segment with SYN set (SYN and SYN-ACK), useful for connection storms or SYN cookie checks.
- `ss -tulpn` maps listening sockets to processes; `ss -s` shows TCP state totals.
- `ethtool -S eth0 | grep -i drop` shows NIC and per-queue drop counters, which separates driver-level drops from drops higher in the protocol stack. Counter names vary by driver.
3. §14.2 — System Call Tracing
strace uses ptrace to stop a task at each syscall entry and exit, which gives precise arguments and return values at the cost of many context switches.
What to run
- `strace -e trace=network -T -p <pid>` prints socket syscalls and per-call duration, for example `connect(...) = -1 EINPROGRESS <0.000042>`.
- `ltrace -p <pid>` traces shared library calls, which is useful when libc, OpenSSL, or allocator behavior is the suspected layer.
- `perf trace -p <pid>` uses kernel tracing infrastructure and is usually lower overhead than ptrace-based strace.
4. §14.3 — Performance Profiling
perf turns hardware events into sampled instruction pointers: a PMU counter overflows, the resulting interrupt (an NMI on x86) records the instruction pointer and, with -g, the call stack, and later symbolization aggregates hot paths.
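Flame graphs merge sampled stacks so that a frame's width is the share of samples in which it appears (its on-CPU time, callees included) and its height is call depth; the widest frames along the top edge, where the function itself was running, are the first optimization candidates.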
| Tool | Answers | Command |
|---|---|---|
| perf stat | Whole-program counters | perf stat -e cycles,instructions,cache-misses ./app |
| perf record/report | Sample CPU stacks | perf record -g -p <pid>; perf report --stdio |
| perf top | Live hottest symbols | perf top -g |
| perf sched | Scheduler latency and switches | perf sched record ./app; perf sched latency |
| perf lock | Lock contention | perf lock record ./app; perf lock report |
| perf c2c | False sharing and cache line bouncing | perf c2c record ./app; perf c2c report |
IPC is instructions / cycles. Low IPC usually means the CPU is waiting on memory, branch recovery, locks, or kernel scheduling instead of retiring useful instructions.
5. §14.4 — Kernel Debugging
KGDB gives source-level debugging of a target kernel from a separate host GDB session; it is invasive, but it is the direct tool when control flow must stop exactly at a breakpoint.
An OOPS starts at a CPU exception, then the kernel records registers, fault context, stack frames, and loaded modules before killing the current task or panicking.
Crash artifacts
- OOPS anatomy: faulting PC or RIP, link register on ARM, stack pointer, fault address, register dump, call trace, taint flags, and module list.
- `crash vmlinux vmcore` opens a dump; `bt`, `ps`, `vm`, and `log` are the first commands.
- `addr2line -e vmlinux <address>` maps a raw code address from the call trace back to a source file and line when debug info is available.
- KASAN reports include access type, byte count, bad address, allocation/free stack, and shadow bytes that explain the poisoned region.
6. §14.5 — Kernel Tracing
ftrace uses compiler-inserted function entry hooks and per-CPU ring buffers, then exposes the stream through tracefs for live reads or offline trace-cmd analysis.
eBPF makes tracing programmable: a program is compiled to bytecode, verified for safety at load time, optionally JIT compiled, and attached to hooks such as kprobes, tracepoints, tc, or XDP.
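A kprobe instruments almost any kernel instruction (some functions are blacklisted) by replacing it with a breakpoint or, when optimized, a jump; the handler runs, the displaced instruction is executed out of line, and control returns to the original instruction stream.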
Tracing commands
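- `bpftrace -e 'kprobe:tcp_retransmit_skb { @[comm] = count(); }'` counts TCP retransmit calls by the name of the task that was current when the probe fired, which for timer-driven retransmits may not be the socket's owner.
- `biolatency` from bcc records block I/O latency as a histogram.
- `echo function > current_tracer; echo tcp_v4_connect > set_ftrace_filter; cat trace` is the minimal ftrace workflow, run from the tracefs directory (commonly /sys/kernel/tracing).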
7. §14.6 — Memory Debugging
/proc/meminfo separates free memory from reclaimable cache; MemAvailable is the practical pressure signal because it estimates what can be allocated without swapping.
Valgrind Memcheck maintains shadow state for application memory, so every load and store can be checked for addressability and initialization.
- `valgrind --leak-check=full --track-origins=yes ./app` finds userspace leaks and uninitialized reads.
- `cat /proc/slabinfo` shows kernel object cache growth; suspicious caches can explain unreclaimable memory.
- `echo scan > /sys/kernel/debug/kmemleak; cat /sys/kernel/debug/kmemleak` asks kmemleak to report unreachable kernel allocations.
- `vmstat 1` connects memory pressure to swap, reclaim, I/O wait, and runnable queue pressure.
8. Minimal C Demo
This tiny program models what perf script | stackcollapse-perf.pl prepares for a flame graph: identical stack prefixes accumulate, and the widest path is the best starting point.
9. Kernel Source Pointers
| Area | Files and functions |
|---|---|
| perf events | kernel/events/core.c: perf_event_open, perf_sample_event_took |
| ptrace syscall stops | kernel/ptrace.c, arch/*/kernel/ptrace.c |
| ftrace | kernel/trace/ftrace.c, kernel/trace/trace.c |
| kprobes | kernel/kprobes.c, arch/*/kernel/kprobes.c (arch/x86/kernel/kprobes/core.c on x86) |
| BPF verifier | kernel/bpf/verifier.c, kernel/bpf/syscall.c |
| OOPS and panic | kernel/panic.c, arch/*/mm/fault.c, arch/*/kernel/traps.c |
| KASAN | mm/kasan/report.c, mm/kasan/shadow.c |
| kmemleak | mm/kmemleak.c |
10. Interview Prep
| Question | Concise answer |
|---|---|
| What does perf stat measure? | Hardware and software counters such as cycles, instructions, cache misses, context switches, and faults. IPC is instructions divided by cycles. |
| How do you build a CPU flame graph? | Record stacks with perf record -g, convert perf script output with stackcollapse-perf.pl, then render with flamegraph.pl. |
| How do you detect false sharing? | Use perf c2c to find cache lines bouncing between cores, then inspect the structs and align or shard the hot fields. |
| What does KASAN catch? | Kernel out-of-bounds, use-after-free, invalid frees, and related memory bugs by poisoning shadow memory. |
| How do you analyze an OOPS? | Identify the faulting address and PC, symbolize the call trace, check taint and modules, then map the top real frame back to source. |
| How does a kprobe work? | The kernel patches a target instruction with a trap or optimized jump, runs the probe handler with register context, then resumes execution. |
| What does the BPF verifier prove? | At program load, before anything can be attached, it checks bounded execution, safe memory access, helper permissions, pointer types, and map access rules. |
| How do you diagnose a kernel leak? | Watch slab growth, enable kmemleak, trigger a scan, then inspect unreachable allocation stacks. |