Debugging Tips

Purpose

This page offers help and information on debugging in 3210.

Debugging is one of the most challenging aspects of 3210. Students who use better debugging techniques generally finish labs much faster!

General Debugging Tips

The first thing you should do when your code doesn’t work is gather more information.

I strongly discourage spending hours (or even half an hour) reading over your code trying to find the bug statically. Run your code. Add debugging statements. Get more information.

Static debugging (debugging without runtime information) is hard! Computers are bad at it, and so are humans! If you want to become a great debugger (and save yourself tons of time), then learn how to ask the right questions, and add debugging information to your code to help you answer them!

Some things to ask yourself (or your code):

  • What was the last place I knew my code was working?
    • Can I add prints around that point?
    • Can I think of other interesting points to add prints that might tell me more (e.g. verify that something happened, or that it didn’t)?
  • Where could it have gone from there?
    • Can I add extra prints to my code to help determine where the program went before it crashed?
  • Is the code stuck in an infinite loop?
    • If so, GDB can help: use Ctrl-c to stop your program, then the ‘bt’ or ‘backtrace’ command to see where you are!
  • Is the kernel panicking?
    • A kernel panic will print the faulting address. You can use the build/kernel/kernel.asm file to search for the line of code associated with the faulting address!
  • Is the xv6 machine constantly rebooting?
    • QEMU’s -d option may help! (Note that our launch script xv6-qemu doesn’t support -d, so you may have to modify it.)
    • Otherwise, try figuring out the last thing your code does before rebooting.
    • Finally, a constant reboot is typically caused by a triple fault, which resets the CPU. The most common culprit is a corrupted page table!
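
Searching kernel.asm by hand for a faulting address gets tedious, so here is a minimal sketch of a helper that does the lookup. It assumes kernel.asm is in objdump -S style, where a symbol starts with a line like "80102e40 <kfree>:" and each instruction line starts with its address followed by a colon; check a few lines of your own file before relying on it. The name find_fault_site is made up for this example and is not part of the course tooling.

```python
import re

# Assumed objdump -S style lines (an assumption about kernel.asm):
#   "80102e40 <kfree>:"                  starts a symbol
#   "80102e41:\t89 e5 \tmov %esp,%ebp"   one disassembled instruction
SYMBOL_RE = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:\s*$")
INSN_RE = re.compile(r"^\s*([0-9a-fA-F]+):\s")


def find_fault_site(asm_lines, fault_addr):
    """Return (symbol, line_number, text) for the instruction at fault_addr.

    asm_lines is any iterable of lines (a list or an open file).
    line_number is 1-based so it matches what an editor shows.
    Returns None if no instruction line starts at exactly that address.
    """
    current_symbol = None
    for lineno, line in enumerate(asm_lines, start=1):
        sym = SYMBOL_RE.match(line)
        if sym:
            # Remember the most recent symbol so we can report it later.
            current_symbol = sym.group(2)
            continue
        insn = INSN_RE.match(line)
        if insn and int(insn.group(1), 16) == fault_addr:
            return current_symbol, lineno, line.strip()
    return None
```

To use it, open build/kernel/kernel.asm, pass the file object as asm_lines, and pass the address from the panic message as an integer (for example 0x80102e41, an address invented for this illustration). The interleaved C source lines just above the reported line number are usually where to start reading.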

Debugging Tools

Here we outline some of the course’s debugging tools, and some of the ways you can use them to help you when debugging your xv6 projects.

Kernel

GDB is your friend. Use xv6-qemu’s -g flag to start xv6 with gdb support (it will wait for you to attach a gdb instance). See the GDB reference below for some commands that are useful when debugging kernels.

If you’re getting unexpected interrupts, exceptions, or triple faults, you can ask QEMU to generate a detailed log of interrupts using the -d argument.

To debug virtual memory issues, try the QEMU monitor commands info mem (for a high-level overview) or info pg (for lots of detail). Note that these commands only display the current page table.

To debug multiple CPUs, use GDB’s thread-related commands like thread and info threads.

User environments

GDB also lets you debug user processes, but there are a few things you need to watch out for since GDB doesn’t know that there’s a distinction between multiple user environments, or between user and kernel.

You can symbolically debug user code, just like you can kernel code, but you have to tell GDB which symbol table to use with the symbol-file command since it can only use one symbol table at a time. The provided .gdbinit loads the kernel symbol table, kernel. The symbol table for a user environment is in its ELF binary, so you can load it using symbol-file <binary>. Don’t load symbols from any .o files, as those haven’t been relocated by the linker (libraries are statically linked into xv6 user binaries, so those symbols are already included in each user binary). Make sure you get the right user binary; library functions will be linked at different EIPs in different binaries and GDB won’t know any better!

Since GDB is attached to the virtual machine as a whole, it sees clock interrupts as just another control transfer. This makes it basically impossible to step through user code because a clock interrupt is virtually guaranteed the moment you let the VM run again. The stepi command works because it suppresses interrupts, but it only steps one assembly instruction. Breakpoints generally work, but watch out because you can hit the same EIP in a different environment (indeed, a different binary altogether!).

GDB

See the GDB manual for a full guide to GDB commands. Here are some particularly useful commands for cs3210, some of which don’t typically come up outside of OS development.

  • Ctrl-c
    • Halt the machine and break into GDB at the current instruction. If QEMU has multiple virtual CPUs, this halts all of them.
  • c (or continue)
    • Continue execution until the next breakpoint or Ctrl-c.
  • si (or stepi)
    • Execute one machine instruction.
  • b <function> or b <file:line> (or break)
    • Set a breakpoint at the given function or line.
  • b *addr (or break)
    • Set a breakpoint at the EIP addr.
  • set print pretty
    • Enable pretty-printing of arrays and structs.
  • info registers
    • Print the general-purpose registers, eip, eflags, and the segment selectors. For a much more thorough dump of the machine register state, see QEMU’s own info registers command.
  • x/Nx addr
    • Display a hex dump of N words starting at virtual address addr. If N is omitted, it defaults to 1. addr can be any expression.
  • x/Ni addr
    • Display the N assembly instructions starting at addr. Using $eip as addr will display the instructions at the current instruction pointer.
  • symbol-file <file>
    • Switch to symbol file file. When GDB attaches to QEMU, it has no notion of the process boundaries within the virtual machine, so we have to tell it which symbols to use.

QEMU represents each virtual CPU as a thread in GDB, so you can use all of GDB’s thread-related commands to view or manipulate QEMU’s virtual CPUs.

  • thread n
    • GDB focuses on one thread (i.e., CPU) at a time. This command switches that focus to thread n. GDB numbers threads starting from 1, so use info threads to see which thread ID corresponds to which CPU.
  • info threads
    • List all threads (i.e., CPUs), including their state (active or halted) and what function they’re in.

QEMU

QEMU includes a built-in monitor that can inspect and modify the machine state in useful ways. To enter the monitor, press Ctrl-a c in the terminal running QEMU. Press Ctrl-a c again to switch back to the serial console.

For a complete reference to the monitor commands, see the QEMU manual. Here are some particularly useful commands:

  • xp/Nx paddr
    • Display a hex dump of N words starting at physical address paddr. If N is omitted, it defaults to 1. This is the physical memory analog of GDB’s x command.
  • info registers
    • Display a full dump of the machine’s internal register state. In particular, this includes the machine’s hidden segment state for the segment selectors and the local, global, and interrupt descriptor tables, plus the task register. This hidden state is the information the virtual CPU read from the GDT/LDT when the segment selector was loaded. Here’s the CS when running a sample kernel and the meaning of each field:

      CS =0008 10000000 ffffffff 10cf9a00 DPL=0 CS32 [-R-]

      • CS =0008 The visible part of the code selector. We’re using segment 0x8. This also tells us we’re referring to the global descriptor table (0x8&4=0), and our CPL (current privilege level) is 0x8&3=0.

      • 10000000 The base of this segment. Linear address = logical address + 0x10000000.

      • ffffffff The limit of this segment. Linear addresses above 0xffffffff will result in segment violation exceptions.

      • 10cf9a00 The raw flags of this segment, which QEMU helpfully decodes for us in the next few fields.

      • DPL=0 The privilege level of this segment. Only code running with privilege level 0 can load this segment.

      • CS32 This is a 32-bit code segment. Other values include DS for data segments (not to be confused with the DS register), and LDT for local descriptor tables.

      • [-R-] This segment is read-only.

  • info mem
    • (Lab 2+) Display mapped virtual memory and permissions. For example,
      ef7c0000-ef800000 00040000 urw
      efbf8000-efc00000 00008000 -rw
      

      This tells us that the 0x00040000 bytes of memory from 0xef7c0000 to 0xef800000 are mapped read/write and user-accessible, while the memory from 0xefbf8000 to 0xefc00000 is mapped read/write, but only kernel-accessible.
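
The arithmetic in the two monitor examples above (the CS selector fields and the info mem ranges) is easy to check with a few lines of Python. This is a small sketch, not a course tool: the function names are made up for illustration, and the info mem parser assumes each line looks exactly like the sample output shown here ("start-end size flags", with "u" or "-" as the first flag character).

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "index": selector >> 3,                     # descriptor table index
        "table": "LDT" if selector & 4 else "GDT",  # table indicator bit
        "rpl": selector & 3,                        # low two bits (CPL for CS)
    }


def parse_info_mem_line(line):
    """Parse one 'start-end size flags' line from QEMU's info mem output."""
    addr_range, size_hex, flags = line.split()
    start_hex, end_hex = addr_range.split("-")
    start = int(start_hex, 16)
    end = int(end_hex, 16)
    size = int(size_hex, 16)
    # Sanity check: the size column should equal end - start.
    if end - start != size:
        raise ValueError("size column does not match range: " + line)
    return {
        "start": start,
        "end": end,
        "size": size,
        "user": flags[0] == "u",   # "-" in this position means kernel-only
        "readable": "r" in flags,
        "writable": "w" in flags,
    }
```

For example, decode_selector(0x8) reports descriptor index 1 in the GDT with privilege level 0, and parsing the first sample line confirms that 0xef800000 minus 0xef7c0000 is 0x40000 and that the mapping is user-accessible.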