rCore Notes: How a Debugger Restores the Call Stack

Introduction

Currently, I am studying Operating Systems by reading rCore Tutorial Book v3, a teaching operating system developed by the OS teaching group of Tsinghua University.

In the exercise of Chapter 1, a question is raised: how does a debugger restore the call stack when it is paused at a breakpoint?

The original question is as follows, originally written in Chinese, and I will translate it into English:

现代的很多编译器生成的代码,默认情况下不再严格保存/恢复栈帧指针。在这个情况下,我们只要编译器提供足够的信息,也可以完成对调用栈的恢复。 我们可以手动阅读汇编代码和栈上的数据,体验一下这个过程。例如,对如下两个互相递归调用的函数:

Many modern compilers no longer strictly save/restore the frame pointer by default. In this situation, as long as the compiler provides sufficient information, it is still possible to restore the call stack.

We can manually read the assembly code and the data on the stack to experience this process. For example, consider the following two functions that recursively call each other:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
void flip(unsigned n) {
    if ((n & 1) == 0) {
        flip(n >> 1);
    } else if ((n & 1) == 1) {
        flap(n >> 1);
    }
}

void flap(unsigned n) {
    if ((n & 1) == 0) {
        flip(n >> 1);
    } else if ((n & 1) == 1) {
        flap(n >> 1);
    }
}

在某种编译环境下,编译器产生的代码不包括保存和恢复栈帧指针 fp 的代码。以下是 GDB 输出的本次运行的时候,这两个函数所在的地址和对应地址指令的反汇编,为了方便阅读节选了重要的控制流和栈操作(省略部分不含栈操作):

In a certain compilation environment, the compiler-generated code does not include the instructions to save and restore the frame pointer (FP). Below is the GDB output showing the addresses of these two functions during this run, along with the disassembled instructions corresponding to these addresses. For readability, I’ve excerpted the important control flow and stack operations (omitting parts without stack operations):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
(gdb) disassemble flap
Dump of assembler code for function flap:
   0x0000000000010730 <+0>:     addi    sp,sp,-16    // 唯一入口 (the only entry)
   0x0000000000010732 <+2>:     sd      ra,8(sp)
   ...
   0x0000000000010742 <+18>:    ld      ra,8(sp)
   0x0000000000010744 <+20>:    addi    sp,sp,16
   0x0000000000010746 <+22>:    ret                  // 唯一出口 (the only exit)
   ...
   0x0000000000010750 <+32>:    j       0x10742 <flap+18>

(gdb) disassemble flip
Dump of assembler code for function flip:
   0x0000000000010752 <+0>:     addi    sp,sp,-16    // 唯一入口 (the only entry)
   0x0000000000010754 <+2>:     sd      ra,8(sp)
   ...
   0x0000000000010764 <+18>:    ld      ra,8(sp)
   0x0000000000010766 <+20>:    addi    sp,sp,16
   0x0000000000010768 <+22>:    ret                  // 唯一出口 (the only exit)
   ...
   0x0000000000010772 <+32>:    j       0x10764 <flip+18>
End of assembler dump.

启动这个程序,在运行的时候的某个状态将其打断。此时的 pc, sp, ra 寄存器的值如下所示。此外,下面还给出了栈顶的部分内容。(为阅读方便,栈上的一些未初始化的垃圾数据用 ??? 代替。)

Start the program and interrupt it at a certain state during execution. At this point, the values of the pc, sp, and ra registers are as follows. Additionally, a portion of the contents of the top of the stack is provided below. (For readability, some uninitialized garbage data on the stack is replaced with ???.)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
(gdb) p $pc
$1 = (void (*)()) 0x10752 <flip>

(gdb) p $sp
$2 = (void *) 0x40007f1310

(gdb) p $ra
$3 = (void (*)()) 0x10742 <flap+18>

(gdb) x/6a $sp
0x40007f1310:   ???     0x10750 <flap+32>
0x40007f1320:   ???     0x10772 <flip+32>
0x40007f1330:   ???     0x10764 <flip+18>

根据给出这些信息,调试器可以如何复原出最顶层的几个调用栈信息?假设调试器可以理解编译器生成的汇编代码。

Given this information, a debugger can restore the top few frames of the call stack by following these steps, assuming the debugger understands the assembly code generated by the compile.

Answer

At first, review the effect of the register points which are used in this question:

  • pc: Program Counter, the address of the next instruction to be executed.
  • sp: Stack Pointer, the address of the top of the stack.
  • ra: Return Address, the address to return to after the current function finishes.

Then, let’s analyze the current state of the program, pc register points is 0x10752, whose assembly code is addi sp,sp,-16. This means that the current function is flip, and the stack pointer is decreased by 16 bytes. However, don’t forget that pc points to the next instruction to be executed, so the assembly code addi sp,sp,-16 is not executed yet. So sp is pointing to caller’s stack frame.

The ra register points to 0x10742, which is in the middle of the flap function. This means that the flip function is called by the flap function. So the last call stack is flap -> flip.

In the above, sp points to the caller flap’s stack frame, and we can see that the return address of the flap function is stored at sp + 8 by the instruction sd ra,8(sp). So the return address of the flap function is 0x40007f1310 + 0x8 = 0x40007f1318, which is 0x10750. It’s also an address of flap. We can see that the call stack is flap -> flap -> flip.

After that we can continue to analyze the call stack, in assembly code, we can see that flap function’s stack size is 16 bytes, and the return address is stored at sp + 8. So the return address of the flap function is 0x40007f1320 + 0x8 = 0x40007f1328, which is 0x10772. It’s an instruction of flip. So the call stack is flip -> flap -> flap -> flip.

Finally, by using the same method, we can see the call stack is flip -> flip -> flap -> flap -> flip.

Summary

In this problem, we analyze the call stack by using the information of the pc, sp, and ra registers. By understanding the assembly code generated by the compiler, we can restore the call stack of the program. This is a basic skill for debugging and reverse engineering.

The points that are easy to overlook is that the pc register points to the next instruction to be executed, some people may forget this and think sp is pointing to the current function’s stack frame mistakenly. So we should be careful when analyzing the call stack.

At last, I think rCore Tutorial Book is a good book for learning operating systems, and I will continue to read it and write notes about it. If you are new to operating systems, I recommend you to read it.

Licensed under CC BY-NC-SA 4.0
Built with Hugo
Theme Stack designed by Jimmy