| |
11-13-2008
|
#1 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| [SOLVED] Assembler and Dynamic libraries
Hi,
I'm trying to figure out how dynamic libraries work in Linux (or at least the concept) and I'm having a difficult time of it, so I created
this oversimplified piece of assembler code in the hope that it represents, at least in principle, a call to a dynamically linked function.
Note: CPU - AMD Athlon 64 bit
Now my question is: is this even close to how dynamic libraries work? (The Linux loader is simulated
by the statement movq $printit, myvar. Like I said, oversimplified.)
I know this is a gross oversimplification, but is it correct in spirit? Please see the code, and thanks for your input...Gerard4143
Note: there are no errors in the code...it will assemble and run correctly.
Code:
.section .data
mydata: .ascii "hello world!\n"
        .equ mylen, . - mydata
myvar:  .quad 0                 /* fixed slot that will hold the function address */
.section .bss
.section .text
.global _start
_start:
        movq $printit, myvar    /* simulated Linux loader: store the */
                                /* function address in a fixed location */
        call tohere             /* push the address of tohere */
tohere:                         /* onto the stack */
        jmp tryit               /* so we can return */
        movq $60, %rax          /* exit routine */
        movq $0, %rdi
        syscall
tryit:
        call *myvar             /* call through the fixed location */
        popq %rax               /* pop the address of tohere pushed above */
        addq $2, %rax           /* skip the 2-byte short jmp at tohere */
        jmp *%rax               /* and jump to the exit routine */
printit:                        /* a function with an arbitrary address */
                                /* it isn't here but it could have been */
        pushq %rbp
        movq %rsp, %rbp
        movq $1, %rax           /* write(1, mydata, mylen) */
        movq $1, %rdi
        movq $mydata, %rsi
        movq $mylen, %rdx
        syscall
        movq %rbp, %rsp
        popq %rbp
        ret
|
|
11-13-2008
|
#2 (permalink)
| | Linux Engineer
Join Date: Sep 2007 Location: Mariposa
Posts: 1,192
| I've done absolutely no assembler language coding since before Linux came along, but you can get a good start at understanding how dynamic libraries work by dynamically linking a C program which calls a function in a dynamic library written by you specifically for this test. You can observe what happens in two ways:
- by using the -S option while compiling, which shows the generated assembler; and
- by running the dynamically linked program under strace, which shows the system calls made while it loads.
I looked briefly at dynamically linked UNIX code a few years back, and at Linux code since then (too lazy to try it again tonight), and the loader seems to do an mmap() of the file that contains the library. If several programs do that with the same library, they'll share the same library code image, just as they'd share the same data if they weren't doing linking but had some other purpose for mmap()ping a file without using PROT_WRITE.
That's just my foggy memory speaking, though. Try it! :)
__________________ --
Bill
Old age and treachery will overcome youth and skill. |
| |
11-13-2008
|
#3 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| Thanks for the reply, Bill. Your answers are a little beyond what I was looking for right now; maybe if I state it another way...
Does the Linux loader write the dynamically assigned addresses into known locations in an object file, and are those known locations accessed through jump tables within the object file?
Again, thanks for your reply...I'll be investigating the information you provided as soon as I get this straightened out...Gerard4143 |
| |
11-13-2008
|
#4 (permalink)
| | Linux Engineer
Join Date: Sep 2007 Location: Mariposa
Posts: 1,192
| I don't really know. But will answer these questions for you.
| |
11-13-2008
|
#5 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| Actually you get a better listing if you use:
objdump -D filename
My understanding of this process right now is:
1. The Linux loader writes the dynamically assigned addresses into the GOT section, so that the user process
can find them at fixed, known locations.
2. The user process uses the PLT section as a jump table that jumps through the addresses stored in the GOT.
Now I know this is a simplified version of the process, but are my assumptions right? I can't pursue this
topic without a firm understanding of the basics, so if anybody has any input...it will be appreciated...
Thanks again for your time and effort. Gerard4143 |
| |
11-13-2008
|
#6 (permalink)
| | Linux Newbie
Join Date: Jan 2008 Location: UK
Posts: 211
| |
| |
11-13-2008
|
#7 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| Wowbag1, I've seen this web page before (Anatomy of Linux dynamic libraries) and it states:
"Relocation is handled through an indirection mechanism called the Global Offset Table (GOT) and the Procedure Linkage Table (PLT).
These tables provide the addresses of external functions and data, which ld-linux.so loads during the relocation process. This means that
the code that requires the indirection (that is, uses the tables) needs no changes: only the tables require adjustment. Relocation can occur
immediately upon load or whenever a given function is needed. (See more on this difference later in Dynamic loading with Linux.)"
A great blurb, but it really doesn't say anything specific...It doesn't answer the questions I posted above.
That is the problem with this topic: it's either too general or too technical to get any instruction from...not to worry, I'm stubborn, I'll figure this out and post it...
I'm going to assume that my understanding of this process is correct and start exploring the mmap() function that Bill alluded to...
Thanks again for all the help. If I find a definite answer I'll post it...Gerard4143 |
| |
11-14-2008
|
#8 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| Found the answer to the first question, I think.
The Makefile, with the -fPIC switch for position-independent code; this will engage the GOT:
Code:
test: test.o
	gcc test.o -o test
test.o: test.c
	gcc -fPIC -c test.c
clean:
	rm -f test.o
The output of objdump -D test, only the important parts. The GOT address for x
is in the comment at the right of the mov instruction: addr = 0x600fd8
Code:
00000000004005dc <getx>:
4005dc: 55 push %rbp
4005dd: 48 89 e5 mov %rsp,%rbp
4005e0: 48 8b 05 f1 09 20 00 mov 0x2009f1(%rip),%rax # 600fd8 <_DYNAMIC+0x1b0>
4005e7: c7 00 58 00 00 00 movl $0x58,(%rax)
4005ed: c9 leaveq
4005ee: c3 retq
Now the C code.
Note: I found the value of myvoid = 0x600fd8 from the above objdump.
Code:
#include <stdio.h>
#include <stdlib.h>

int x = 5;

int getx()   /* only sets x; the caller never uses a return value */
{
    x = 88;
}

void **myvoid = NULL;

int main(int argc, char **argv)
{
    /* address of the GOT slot that holds a pointer to the global
       variable x; taken from the objdump above, so it is only
       valid for this exact build */
    myvoid = (void **)0x600fd8;
    fprintf(stdout, "ptr->%p\n", *myvoid);
    fprintf(stdout, "x->%p\n", (void *)&x);
    getx();
    return x;
}
And the result of running the code:
the global variable x has the address 0x601020,
and the GOT holds a pointer equal to 0x601020.
Code:
ptr->0x601020
x->0x601020
So the GOT, I hope, is a place in memory that stores pointers to global variables and functions
so that they can reside anywhere |
| |
11-14-2008
|
#9 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| The second question answered.
The Makefile, with -fPIC to engage position-independent code:
Code:
test: test.o
	gcc test.o -o test
test.o: test.c
	gcc -fPIC -c test.c
clean:
	rm -f test.o
The objdump -D test output,
with the PLT section and the GOT for the function fprintf,
the dynamic function that I'll be investigating:
Code:
Disassembly of section .plt:
00000000004004f0 <fprintf@plt>:
4004f0: ff 25 12 0b 20 00 jmpq *0x200b12(%rip) # 601008 <_GLOBAL_OFFSET_TABLE_+0x20>
4004f6: 68 01 00 00 00 pushq $0x1
4004fb: e9 d0 ff ff ff jmpq 4004d0 <_init+0x18>
00000000004005ff <main>:
4005ff: 55 push %rbp
400600: 48 89 e5 mov %rsp,%rbp
400603: 48 83 ec 10 sub $0x10,%rsp
400607: 89 7d fc mov %edi,-0x4(%rbp)
40060a: 48 89 75 f0 mov %rsi,-0x10(%rbp)
40060e: 48 8b 05 ab 09 20 00 mov 0x2009ab(%rip),%rax # 600fc0 <_DYNAMIC+0x1a0>
400615: 48 c7 00 d8 0f 60 00 movq $0x600fd8,(%rax)
40061c: 48 8b 05 9d 09 20 00 mov 0x20099d(%rip),%rax # 600fc0 <_DYNAMIC+0x1a0>
400623: 48 8b 00 mov (%rax),%rax
400626: 48 8b 10 mov (%rax),%rdx
400629: 48 8b 05 b0 09 20 00 mov 0x2009b0(%rip),%rax # 600fe0 <_DYNAMIC+0x1c0>
400630: 48 8b 38 mov (%rax),%rdi
400633: 48 8d 35 62 01 00 00 lea 0x162(%rip),%rsi # 40079c <_IO_stdin_used+0x4>
40063a: b8 00 00 00 00 mov $0x0,%eax
40063f: e8 ac fe ff ff callq 4004f0 <fprintf@plt>/*********** fprintf *************/
400644: 48 8b 05 95 09 20 00 mov 0x200995(%rip),%rax # 600fe0 <_DYNAMIC+0x1c0>
40064b: 48 8b 38 mov (%rax),%rdi
40064e: 48 8b 15 7b 09 20 00 mov 0x20097b(%rip),%rdx # 600fd0 <_DYNAMIC+0x1b0>
400655: 48 8d 35 49 01 00 00 lea 0x149(%rip),%rsi # 4007a5 <_IO_stdin_used+0xd>
40065c: b8 00 00 00 00 mov $0x0,%eax
400661: e8 8a fe ff ff callq 4004f0 <fprintf@plt>/*********** fprintf *************/
400666: 48 8b 05 53 09 20 00 mov 0x200953(%rip),%rax # 600fc0 <_DYNAMIC+0x1a0>
40066d: 48 c7 00 08 10 60 00 movq $0x601008,(%rax)
400674: 48 8b 05 65 09 20 00 mov 0x200965(%rip),%rax # 600fe0 <_DYNAMIC+0x1c0>
40067b: 48 8b 38 mov (%rax),%rdi
40067e: 48 8b 15 53 09 20 00 mov 0x200953(%rip),%rdx # 600fd8 <_DYNAMIC+0x1b8>
400685: 48 8d 35 20 01 00 00 lea 0x120(%rip),%rsi # 4007ac <_IO_stdin_used+0x14>
40068c: b8 00 00 00 00 mov $0x0,%eax
400691: e8 5a fe ff ff callq 4004f0 <fprintf@plt>
400696: b8 00 00 00 00 mov $0x0,%eax
40069b: e8 4c ff ff ff callq 4005ec <getx>
4006a0: 48 8b 05 29 09 20 00 mov 0x200929(%rip),%rax # 600fd0 <_DYNAMIC+0x1b0>
4006a7: 8b 00 mov (%rax),%eax
4006a9: c9 leaveq
4006aa: c3 retq
4006ab: 90 nop
4006ac: 90 nop
4006ad: 90 nop
4006ae: 90 nop
4006af: 90 nop
Disassembly of section .got:
0000000000600fc0 <.got>:
...
Disassembly of section .got.plt:
0000000000600fe8 <_GLOBAL_OFFSET_TABLE_>:
600fe8: 20 0e and %cl,(%rsi)
600fea: 60 (bad)
...
600fff: 00 e6 add %ah,%dh
601001: 04 40 add $0x40,%al
601003: 00 00 add %al,(%rax)
601005: 00 00 add %al,(%rax)
601007: 00 f6 add %dh,%dh
601009: 04 40 add $0x40,%al
60100b: 00 00 add %al,(%rax)
60100d: 00 00 add %al,(%rax)
...
Disassembly of section .data:
0000000000601010 <__data_start>:
...
And the C code, with the address 0x601008 extracted from
the objdump above:
Code:
#include <stdio.h>
#include <stdlib.h>

int x = 5;

int getx()   /* only sets x; the caller never uses a return value */
{
    x = 88;
}

void **myvoid = NULL;

int main(int argc, char **argv)
{
    /* the test for the global variable x */
    myvoid = (void **)0x600fd0;  /* GOT slot that holds a pointer
                                    to the global variable x */
    fprintf(stdout, "got ptr to x->%p\n", *myvoid);
    fprintf(stdout, "x------------>%p\n", (void *)&x);

    /* the test for fprintf */
    myvoid = (void **)0x601008;  /* GOT slot that holds a pointer
                                    to the function fprintf */
    fprintf(stdout, "fprintf------------>%p\n", (void *)fprintf);
    fprintf(stdout, "got ptr to fprintf->%p\n", *myvoid);
    getx();
    return x;
}
And the answer:
Code:
got ptr to x->0x601020
x------------>0x601020
fprintf------------>0x7f8bc3cc9ec0
got ptr to fprintf->0x7f8bc3cc9ec0
What does this all mean to me? Well, the call to the function fprintf is made through a call to <fprintf@plt>,
which is a launching point: it jumps indirectly through the GOT slot where the address of fprintf resides...
Please, if this is incorrect let me know...Gerard4143 |
| |
11-14-2008
|
#10 (permalink)
| | Linux User
Join Date: Dec 2007 Location: Canada, Prince Edward Island
Posts: 464
| And the assembler code - well, a simple version anyway. CPU - AMD Athlon 64 bit
Code:
hello: hello.o
	ld hello.o -o hello
hello.o: hello.s
	as -gstabs hello.s -o hello.o
clean:
	rm -f hello.o
And the code:
Code:
.section .data
mydata:     .ascii "hello world!\n"
            .equ mylen, . - mydata
.section .got
myvar_got:  .quad 0                     /* GOT slot, filled in at run time */
.section .plt
printit_plt:
            jmp *myvar_got(%rip)        /* PLT entry: jump through the GOT slot */
.section .text
.global _start
_start:
            nop
            movq $printit, myvar_got    /* simulate loading the global offset table */
            call printit_plt
            movq $60, %rax              /* exit(0) */
            movq $0, %rdi
            syscall
printit:
            pushq %rbp
            movq %rsp, %rbp
            movq $1, %rax               /* write(1, mydata, mylen) */
            movq $1, %rdi
            movq $mydata, %rsi
            movq $mylen, %rdx
            syscall
            movq %rbp, %rsp
            popq %rbp
            ret
Now this code does work. If anyone sees any glaring errors please let me
know...if not I'll close this posting in 24 - 48 hours...Gerard4143 |