Comments on PLT and relocation

Different compilers generate different executable code. The hooking example shown in the lab relies on the fact that the value of a C function pointer for a standard library function (here, puts) is the address of the stub in the PLT, i.e. puts@plt as shown in the debugger. This is true for gcc-5 and clang (including 3.9), but different for gcc-6 (e.g., Debian 6.2.0-10).

Consider the following code, with a function that is just called (puts), another for which just the address is used (getchar), and another with both uses (printf).

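#include <stdio.h>

int main(void)
{
  puts("Just call");                             /* puts: only called */
  printf("Just use: %p\n", (void *)getchar);     /* getchar: only its address is taken */
  printf("Call and use: %p\n", (void *)printf);  /* printf: called, and its address is taken */
}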

In an executable compiled with either gcc-5 or clang-3.9, all three function symbols are in the .rela.plt section (displayed with readelf -r):

Relocation section '.rela.plt' at offset 0x3f8 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000601020  000400000007 R_X86_64_JUMP_SLO 0000000000400480 printf@GLIBC_2.2.5 + 0
000000601028  000500000007 R_X86_64_JUMP_SLO 0000000000400490 getchar@GLIBC_2.2.5 + 0

However, when compiled with gcc-6, only puts is in section .rela.plt, whereas the two functions for which the pointers are used have entries in the .rela.dyn section, which is the dynamic relocation section for data (variables).

Relocation section '.rela.dyn' at offset 0x488 contains 11 entries:
...
000000200fc8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 printf@GLIBC_2.2.5 + 0
000000200fd8  000500000006 R_X86_64_GLOB_DAT 0000000000000000 getchar@GLIBC_2.2.5 + 0
...

The locations named in the .rela.dyn section will be filled in at load time. Use objdump -d to disassemble the executable (or disas main in gdb right after loading, before running). You can see that the addresses of getchar and printf that are passed to printf are loaded from memory at offsets relative to %rip:

mov    0x2008a1(%rip),%rax        # 200fd8 <getchar@GLIBC_2.2.5>
...
mov    0x200876(%rip),%rax        # 200fc8 <printf@GLIBC_2.2.5>

and you can check with objdump -s that these addresses belong to the global offset table (GOT), whose entries are of course still filled with zeros at this point. You can see the same in the debugger immediately after loading and before starting the program:

$ gdb a.out 
(gdb) disas main
...
0x0000000000000730 <+16>:    mov    0x2008a1(%rip),%rax        # 0x200fd8
(gdb) x/a 0x200fd8
0x200fd8:       0x0

After setting a breakpoint and starting the program, you will see the actual addresses at which the program is loaded, and that the GOT entry for getchar (and printf) has already been filled in.

(gdb) b main
(gdb) r
(gdb) disas main
...
0x0000555555554730 <+16>:    mov    0x2008a1(%rip),%rax        # 0x555555754fd8
(gdb) x/a 0x555555754fd8
0x555555754fd8: 0x7ffff7aab550 <getchar>

We see the difference in the calls to puts and printf:

   0x000055555555472b <+11>:    callq  0x5555555545d0 <puts@plt>
(gdb) disas 0x5555555545d0
   0x00005555555545d0 <+0>:     jmpq   *0x200a42(%rip)        # 0x555555755018
(gdb) x/a 0x555555755018
   0x555555755018: 0x5555555545d6 <puts@plt+6>
...
   0x0000555555554746 <+38>:    callq  0x5555555545e0         # this is printf
(gdb) disas 0x5555555545e0,0x5555555545f0
   0x00005555555545e0:  jmpq   *0x2009e2(%rip)        # 0x555555754fc8
(gdb) x/a 0x555555754fc8
   0x555555754fc8: 0x7ffff7a8b160 <__printf>

For printf, the indirect jump already points to the actual library __printf code, whereas for puts, it points back to the PLT and the code that will resolve the symbol on first call.

Moreover, the C pointers for getchar and printf already point directly to the library code, and not to the PLT, so we can't use the same technique as in the function-hooking program from the lab. However, we can write a modified program that searches through its own code to find calls to puts, and thus the place where the indirect jump happens (it may be instructive to try to write it yourself). Since the function address is already resolved, the memory holding it has already been set to read-only, so we have to use mprotect before we can modify it.

Marius Minea

Last modified: Sun Oct 30 12:15:00 EET 2016