Understanding ELF using readelf and objdump
Contributed by Mulyadi Santosa in Misc on 2006-06-16 00:00:00
Page 3 of 3

D. How is a function referenced?

If a program calls a function that resides within its own executable, things are simple: it just calls the procedure. But what happens if it calls something like printf(), which is defined inside the glibc shared library?

Here, I won't go deep into how the dynamic linker really works. Instead, I focus on how the calling mechanism is implemented inside the executable itself. With this in mind, let's continue.

When a program wants to call such a function, it actually goes through the following flow:

  1. It jumps to the relevant entry in the PLT (Procedure Linkage Table).

  2. In the PLT, there is another jump to the address stored in the related entry in the GOT (Global Offset Table).

  3. If this is the first time the function is called, follow step #4. If it isn't, follow step #5.

  4. The related GOT entry contains an address that points back to the next instruction in the PLT. The program jumps to this address and then calls the dynamic linker to resolve the function's address. If the function is found, its address is put in the related GOT entry and then the function itself is executed.

    So, the next time the function is called, the GOT already holds its address and the PLT can jump directly to it. This procedure is called lazy binding: an external symbol is not resolved until it is really needed (in this case, when the function is called). Jump to step #6.

  5. Jump to the address stored in the GOT. It is the address of the function itself, so the rest of the PLT entry (the part that calls the dynamic linker) is no longer used.

  6. Execution of the function is finished. Jump back to the next instruction in the main program.
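The flow above can be mimicked with a toy model. This is only a sketch: a Python dictionary stands in for the GOT, and names such as LIBRARY, resolver and plt_call are made up for illustration. It is not how the real dynamic linker is implemented.

```python
# Toy model of lazy binding through the PLT and GOT.
# All names here (LIBRARY, GOT, resolver, plt_call) are illustrative only.

LIBRARY = {"printf": lambda text: f"printed: {text}"}  # stands in for glibc

resolver_calls = []  # records every time the "dynamic linker" is invoked


def resolver(name):
    """Stand-in for the dynamic linker: look the symbol up and patch the GOT."""
    resolver_calls.append(name)
    GOT[name] = LIBRARY[name]          # step 4: store the real address in the GOT
    return GOT[name]


def make_lazy_stub(name):
    """Initial GOT content: a pointer back into the PLT, which calls the resolver."""
    def stub(*args):
        func = resolver(name)
        return func(*args)             # run the function once it is resolved
    return stub


GOT = {"printf": make_lazy_stub("printf")}


def plt_call(name, *args):
    """PLT entry: an indirect jump through whatever the GOT currently holds."""
    return GOT[name](*args)


first = plt_call("printf", "hello")    # first call: goes through the resolver
second = plt_call("printf", "again")   # later calls: straight to the function
print(first, second, resolver_calls)
```

The resolver runs only once. After the first call, the GOT slot holds the function itself, which corresponds to step #5.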

As always, looking inside the executable is the best way to explain it. If you do:

$ objdump -d -j .text test

You will see the following code fragment:

.....
08048370 :
 .....
 804838f:       e8 1c ff ff ff          call   80482b0
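The target printed by objdump can be checked by hand. Opcode e8 is a call with a 32-bit signed displacement that is relative to the address of the next instruction. A small sketch (the helper name rel32_target is mine, not part of any tool):

```python
import struct


def rel32_target(insn_addr, insn_bytes):
    """Return the target of a 5-byte x86 call (e8) or jmp (e9) with a rel32 operand."""
    if len(insn_bytes) != 5 or insn_bytes[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte call/jmp rel32 instruction")
    # The operand is a signed 32-bit little-endian displacement.
    (disp,) = struct.unpack("<i", insn_bytes[1:5])
    next_insn = insn_addr + len(insn_bytes)
    return (next_insn + disp) & 0xFFFFFFFF


call_target = rel32_target(0x804838F, bytes.fromhex("e81cffffff"))
print(hex(call_target))
```

The bytes 1c ff ff ff decode to -0xe4, and 0x8048394 - 0xe4 gives 0x80482b0, the same address objdump shows.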

What we have at 0x80482b0 is:

 080482b0 :
 80482b0:       ff 25 ec 95 04 08       jmp    *0x80495ec
 80482b6:       68 08 00 00 00          push   $0x8
 80482bb:       e9 d0 ff ff ff          jmp    8048290 <_init+0x18>	

As you can see, the jump at 0x80482b0 is an indirect jump (note the '*' in front of the address). So, to see where it will land, we must look at the value stored at 0x80495ec. That address must lie either in the .got section or in .got.plt. Looking back at the SHT, it is clear that we must check .got.plt. I use readelf for the hexadecimal dump because it does the byte reordering for us:

$ readelf -x 21 test
Hex dump of section '.got.plt':
  0x080495dc 080482a6 00000000 00000000 08049510 ................
  0x080495ec                            080482b6 ....
(Note: the first column is the virtual address of the row. The word stored at that address is shown in the fifth column, not the second one! In other words, the addresses ascend from right to left.)

Bingo! We have "080482b6" here. In other words, we go back to the PLT, to the instruction right after the indirect jump, and from there we eventually jump to another address. This is where the work of the dynamic linker starts, so we will skip it. Assuming the dynamic linker has finished its magic, the related GOT entry now holds the address of printf().
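To make the "reordering" concrete, here is a sketch that decodes the same section from raw bytes. The byte string is my reconstruction from the dump above, assuming the right-to-left reading described in the note. It is not copied from the file itself.

```python
import struct

GOT_PLT_ADDR = 0x080495DC  # start address of .got.plt, taken from the dump

# Bytes as they would sit in memory (little-endian), reconstructed from the dump.
raw = bytes.fromhex(
    "10950408"   # 0x080495dc
    "00000000"   # 0x080495e0
    "00000000"   # 0x080495e4
    "a6820408"   # 0x080495e8
    "b6820408"   # 0x080495ec
)

# Five unsigned 32-bit little-endian words, keyed by their virtual address.
words = struct.unpack("<5I", raw)
got = {GOT_PLT_ADDR + 4 * i: word for i, word in enumerate(words)}

for addr, value in got.items():
    print(f"{addr:#010x}: {value:#010x}")

plt_entry = 0x080482B0
print(got[0x080495EC] == plt_entry + 6)
```

The slot at 0x80495ec holds 0x80482b6, which is the PLT entry address plus 6, that is, the push instruction that follows the 6-byte indirect jump.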

E. Alternative tool to inspect ELF structure.

Besides relying on readelf and objdump, there is another tool called Biew. It is actually a file viewer, but it is capable of parsing the ELF structure. You can grab the source from http://biew.sourceforge.net/ and compile it yourself. Biew is usually included in hacking-oriented Linux Live CDs such as Phlak. Refer to the website and the packaged documents on how to compile and install Biew.

I personally like Biew because it offers a curses-based interface. Navigating between sections, checking the ELF header, listing symbols and other tasks are now just a matter of pressing the right keyboard shortcut.

For example, you can list the symbols and directly jump to the symbol's address. Here, we try to jump to main(). First execute biew:

$ biew test

Press Ctrl+A followed by F7 to view the symbol table. To avoid wasting time traversing the table, press F7 again to open the "Find string" menu. Type "main" and press Enter. Once the highlighted entry is the one you're looking for, simply press Enter and Biew will jump to the address of main(). Don't forget to switch to disassembler mode (press F2 to select it) so you can see the opcodes interpreted as assembly instructions.

Figure 2. Biew lists all symbols

Since we usually refer to virtual addresses, not file offsets, it is better to switch to the virtual address view. Press Ctrl+C followed by F6 and select "Local". Now, what you see in the leftmost column is the virtual address.
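The relation between the two views is simple arithmetic on the section header fields sh_addr and sh_offset. Below is a sketch of the conversion. Apart from the .got.plt address and size, the numbers in SECTIONS are hypothetical and were not read from the test binary. On a real file, take them from the readelf -S output.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    addr: int     # sh_addr: virtual address of the section
    offset: int   # sh_offset: position of the section in the file
    size: int     # sh_size: length in bytes


# Hypothetical values for illustration only.
SECTIONS = [
    Section(".text", 0x080482D0, 0x0002D0, 0x1A0),
    Section(".got.plt", 0x080495DC, 0x0005DC, 0x14),
]


def offset_to_vaddr(file_offset, sections=SECTIONS):
    """Map a file offset to a virtual address, or None if no section covers it."""
    for s in sections:
        if s.offset <= file_offset < s.offset + s.size:
            return s.addr + (file_offset - s.offset)
    return None


print(hex(offset_to_vaddr(0x5EC)))
```

With these assumed values, file offset 0x5ec falls 0x10 bytes into .got.plt and therefore maps to virtual address 0x80495ec.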

Conclusion

This article is just an overview of how to study the ELF structure. With readelf and objdump, you are ready to take your first steps. If needed, a tool like Biew can help you explore a binary's internals faster. Use whatever tools you have, be creative and practice regularly, and you will soon master the technique. Happy exploring.
