Old 04-20-2010, 11:41 AM   #1
Xaphiosis
Connoisseur
Posts: 52
Karma: 216
Join Date: Apr 2010
Device: PRS-T1
Tool for locating references to constant strings in ARM ELF executable/.so files

I really should get a blog or something so I can put these things up in more detail. As I don't, I'll just post here.

Short version:

This tool (it requires an ARM toolchain, which you can get from embedded Debian) goes through an ELF executable's .text section and, via simplistic symbolic execution, detects when a register obtains a value that is a reference to a string in the constant data section (.rodata).

Run python arm-disasm.py the_executable
It'll print some tracing data. Wherever you see STRING in that output, you get the location of the instruction that put a string's address into a register, as well as the string itself.

Caveats: it assumes the executable was compiled by something gcc-like, and it only deals with simple cases and no obfuscation at all. Hence it might miss strings.

Now the long version:

Basically, the problem is this. Say I write hello world in C:
Code:
#include <stdio.h>
int main(int argc, char* argv[]) {
    printf("Hello from C!\n");
    return 0;
}
Now let's compile it normally:
Code:
arm-linux-gnueabi-gcc -Wall -o hello hello.c
If we disassemble its .text
Code:
arm-linux-gnueabi-objdump -EL -j .text -d hello
we can clearly see its invocation of printf:
Code:
0000845c <main>:
    ... (preamble)
    8474:	e59f0014 	ldr	r0, [pc, #20]	; 8490 <main+0x34>
    8478:	ebffffaf 	bl	833c <_init+0x5c>
    ... (more unimportant stuff)
    8490:	0000852c 	.word	0x0000852c
We see that objdump figures out that r0 gets its value from address 0x8490, which we can see holds 0x852c. If we look that up in the .rodata section
Code:
arm-linux-gnueabi-objdump -EL -j .rodata -s hello

hello:     file format elf32-littlearm

Contents of section .rodata:
 8528 01000200 48656c6c 6f206672 6f6d2043  ....Hello from C
 8538 21000000                             !...
Well, we can clearly see which string it's referencing. 0x852c is clearly "Hello from C!"
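
The two lookups above can be reproduced in a few lines of Python. This is only a sketch: the bytes are copied from the objdump -s output above, and RODATA_BASE, pc_relative_target and c_string_at are names made up for illustration, not part of arm-disasm.py. It assumes ARM mode, where pc reads as the instruction address plus 8.

```python
# Bytes of .rodata copied from the objdump -s output above; the section starts at 0x8528.
RODATA_BASE = 0x8528
RODATA = bytes.fromhex("01000200" "48656c6c" "6f206672" "6f6d2043" "21000000")


def pc_relative_target(insn_addr, offset):
    """Address read by `ldr rX, [pc, #offset]` in ARM mode (pc = insn_addr + 8)."""
    return insn_addr + 8 + offset


def c_string_at(addr):
    """Return the NUL-terminated string stored at virtual address `addr`."""
    start = addr - RODATA_BASE
    end = RODATA.index(b"\x00", start)
    return RODATA[start:end].decode("ascii")


# ldr r0, [pc, #20] at 0x8474 reads the literal-pool slot objdump annotated as 8490.
print(hex(pc_relative_target(0x8474, 20)))
# That slot holds 0x852c, which points four bytes into .rodata.
print(repr(c_string_at(0x852c)))
```

Note that the stored string has no trailing newline, exactly as the hex dump shows.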

Ok, so what's the problem then? Well, if we're dealing with a .so file (shared library), it can be placed anywhere in memory, so the code must be entirely independent of its position in memory. To simulate this, we need to recompile hello.c as position-independent code:
Code:
arm-linux-gnueabi-gcc -fPIC -Wall -o hello hello.c
First, let's see where the string is in the executable:
Code:
arm-linux-gnueabi-objdump -EL -j .rodata -s hello

hello:     file format elf32-littlearm

Contents of section .rodata:
 853c 01000200 48656c6c 6f206672 6f6d2043  ....Hello from C
 854c 21000000                             !...
It moved to 0x8540. Nothing too exciting.

Let's take a look at what happened to the main function:
Code:
0000845c <main>:
    ... (preamble)
    846c:	e59f202c 	ldr	r2, [pc, #44]	; 84a0 <main+0x44>
    8470:	e08f2002 	add	r2, pc, r2
    8474:	e50b0010 	str	r0, [fp, #-16]
    8478:	e50b1014 	str	r1, [fp, #-20]
    847c:	e59f3020 	ldr	r3, [pc, #32]	; 84a4 <main+0x48>
    8480:	e0823003 	add	r3, r2, r3
    8484:	e1a00003 	mov	r0, r3
    8488:	ebffffab 	bl	833c <_init+0x5c>
    ... (not relevant)
    849c:	e12fff1e 	bx	lr
    84a0:	000081d8 	.word	0x000081d8
    84a4:	ffff7ef0 	.word	0xffff7ef0
You can still see the invocation of printf, but let's take a look at what string address it's putting in r0 as printf's first argument. Yeah, something strange is going on. Do you see 0x8540 anywhere? Do 0x000081d8 or 0xffff7ef0 mean anything directly? Nope.

So how does it get the address of the string? Well, let's look at what exactly contributes to generating the r0 value:
Code:
    846c:	e59f202c 	ldr	r2, [pc, #44]	; 84a0 <main+0x44>
    8470:	e08f2002 	add	r2, pc, r2
    847c:	e59f3020 	ldr	r3, [pc, #32]	; 84a4 <main+0x48>
    8480:	e0823003 	add	r3, r2, r3
    8484:	e1a00003 	mov	r0, r3
    8488:	ebffffab 	bl	833c <_init+0x5c>
    ...
    84a0:	000081d8 	.word	0x000081d8
    84a4:	ffff7ef0 	.word	0xffff7ef0
Here we go:
  • "ldr r2, blah" means load a word from blah into r2, and objdump tells us that word is at 0x84a0. That word is 0x81d8, so that's the value r2 acquires. Important observation: this is relative to the program counter (pc + 44). If we take away 44 from 0x84a0, we get 0x8474, but the instruction we're looking at is at 0x846c! Basically, on ARM, the program counter reads as being ahead of the current instruction: by 8 bytes in ARM mode (which is what we have here), and by 4 in Thumb mode.
  • So anyway, r2 = 0x81d8
  • Next: "add r2, pc, r2", which is r2 = pc + r2. What did we learn about pc? It's about 8 ahead of the current instruction address, so r2 += 0x8470 + 8. As a result, r2 = 0x10650
  • Next we have another pc-relative load, into r3. Objdump tells us it's from 0x84a4, so r3 = 0xffff7ef0.
  • Now we add r2 and r3 and store the result in r3. As a signed 32-bit number, 0xffff7ef0 is -0x8110, so the addition wraps around and is really a subtraction. 0x10650 + 0xffff7ef0 = 0x100008540, the low 32 bits of which are 0x8540.
  • So r3 becomes 0x8540, which gets moved to r0 and printf is invoked. We are already familiar with 0x8540... that's the offset at which "Hello from C!" lives! Wow, we did it!
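
The arithmetic in the steps above can be checked with a short Python sketch. The helper names add32 and arm_pc are made up here for illustration. The literal-pool words are the ones objdump printed at 0x84a0 and 0x84a4, and ARM mode (pc = instruction address + 8) is assumed.

```python
MASK32 = 0xFFFFFFFF  # ARM registers are 32 bits wide


def add32(a, b):
    """32-bit addition with wraparound, like the ARM `add` instruction."""
    return (a + b) & MASK32


def arm_pc(insn_addr):
    """Value of pc as read by an ARM-mode instruction located at insn_addr."""
    return insn_addr + 8


r2 = 0x000081D8                  # 846c: ldr r2, [pc, #44]  (word at 0x84a0)
r2 = add32(arm_pc(0x8470), r2)   # 8470: add r2, pc, r2
r3 = 0xFFFF7EF0                  # 847c: ldr r3, [pc, #32]  (word at 0x84a4)
r3 = add32(r2, r3)               # 8480: add r3, r2, r3
r0 = r3                          # 8484: mov r0, r3

print(hex(r2), hex(r0))          # r2 after the first add, and the final r0
```

Masking with 0xFFFFFFFF is what turns the "overflowing" addition into the subtraction described above.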

Now, this is already ridiculous for hello.c, and I was trying to look through ebookConfig.so searching for references to /dev/fb0. I'd have to be stupid to try to do it by hand. So let's try running my tool on hello:
Code:
python arm-disasm.py hello
Result (omitting the irrelevant parts):
Code:
846c: ldr r2, [pc, #44]
	r2 <- [84a0] (000081d8)
8470: add r2, pc, r2
	r2 <- pc (8478) + r2 (81d8) = 10650
8474: str r0, [fp, #-16]
8478: str r1, [fp, #-20]
847c: ldr r3, [pc, #32]
	r3 <- [84a4] (ffff7ef0)
8480: add r3, r2, r3
	r3 <- r2 (10650) + r3 (ffff7ef0) = 8540
8480:	STRING "Hello from C!"
8484: mov r0, r3
	r0 <- r3 (8540)
8484:	STRING "Hello from C!"
8488: bl 833c <_init+0x5c>
The lines with STRING are printed in bright green too, so you don't miss them.
Awesome, in two seconds we see that first r3 gets the address of our string, then it gets moved to r0. The tool may not be great, but it beats the manual process!
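
For the curious, the general idea of tracking register values can be sketched as below. This is not the code from arm-disasm.py, just a minimal Python 3 illustration of the same technique under heavy assumptions: the instructions are hand-decoded into tuples (the real tool works from the disassembly), only pc-relative ldr, register add and mov are handled, and PROGRAM, LITERALS and find_string_refs are hypothetical names. The .rodata range comes from the -fPIC objdump -s output above (0x853c, 20 bytes).

```python
MASK32 = 0xFFFFFFFF
RODATA_START, RODATA_END = 0x853C, 0x8550          # .rodata of the -fPIC build
LITERALS = {0x84A0: 0x000081D8, 0x84A4: 0xFFFF7EF0}  # literal pool after main()

# (address, mnemonic, destination, operands): hand-decoded subset of main()
PROGRAM = [
    (0x846C, "ldr", "r2", ("pc", 44)),
    (0x8470, "add", "r2", ("pc", "r2")),
    (0x847C, "ldr", "r3", ("pc", 32)),
    (0x8480, "add", "r3", ("r2", "r3")),
    (0x8484, "mov", "r0", ("r3",)),
]


def find_string_refs(program):
    """Track known register values; report writes that land inside .rodata."""
    regs = {}
    hits = []
    for addr, op, dest, args in program:
        regs["pc"] = addr + 8          # ARM mode: pc reads as instruction + 8
        if op == "ldr":                # only the [pc, #imm] form is handled
            value = LITERALS.get(regs["pc"] + args[1])
        elif op == "add":
            a, b = regs.get(args[0]), regs.get(args[1])
            value = None if a is None or b is None else (a + b) & MASK32
        elif op == "mov":
            value = regs.get(args[0])
        else:
            value = None               # unknown instruction: value is lost
        regs[dest] = value
        if value is not None and RODATA_START <= value < RODATA_END:
            hits.append((addr, dest, value))
    return hits


for addr, dest, value in find_string_refs(PROGRAM):
    print(f"{addr:x}: {dest} <- {value:x} (points into .rodata)")
```

On this input the sketch flags the same two instructions as the trace above, 8480 and 8484. Anything it does not understand simply makes the register value unknown, which is one reason an approach like this can miss strings.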

Finally, let's try it on what I originally wanted to do:
Code:
python arm-disasm.py ebookConfig.so|grep STRING
Result:
Code:
fc4:	STRING "@"
11c8:	STRING "/usr/local/sony/bin/nblconfig -bootdone"
12b0:	STRING "/sbin/shutdown -h now"
149c:	STRING "/"
15b0:	STRING "/Data/screenLock"
15bc:	STRING "[PreferencesDir]"
15f4:	STRING "/Data/tmp/preload.lst"
15f8:	STRING "/Data/tmp/preload.lst"
1614:	STRING "/Data/tmp/preload.lst"
1650:	STRING "/Data/tmp/preload.lst"
1748:	STRING "[mediaPath]"
177c:	STRING "[keyPath]"
178c:	STRING "/Data/tmp/checked"
1798:	STRING "[adeptKeyPath]"
17a8:	STRING "cp -Rp /Data/tmp/database/cache /Data/database/"
17b4:	STRING "/dev/fb0"
17c0:	STRING "/dev/fb0"
17f4:	STRING "/dev/fb0"
1804:	STRING "/dev/fb0"
1818:	STRING "/dev/fb0"
183c:	STRING "/dev/fb0"
184c:	STRING "/dev/fb0"
185c:	STRING "/dev/fb0"
186c:	STRING "/dev/fb0"
And that's all for tonight.
Attached Files
File Type: gz arm-disasm.py.gz (2.3 KB, 369 views)
Old 04-22-2010, 05:37 AM   #2
kartu
PRS+ author
Posts: 1,637
Karma: 2446233
Join Date: Dec 2007
Device: Sony PRS-300, 505, 600, 650, 950
Xaphiosis
I can give you access to the PRS+ project's wiki (and there is also wiki.mobileread.com).

By the way, check out the attached file (a slightly modified fsk decompiler by igorsk). Some .so files and all .xsb files contain compiled JavaScript (ECMAScript). Igorsk has managed to write a decompiler for it.
Attached Files
File Type: zip DeFsk.zip (164.6 KB, 194 views)
Old 04-22-2010, 06:39 AM   #3
Xaphiosis
kartu: thanks again. porkupan told me about DeFsk, but I didn't manage to find it, and I had bothered him with so many questions already. Glad to see igorsk is also a Python hacker.

As for the wiki, thank you for the offer, but I don't think I'm doing anything PRS+ would find useful. If you do need something compiled at the C level but don't want to mess around with Linux and cross-compilation, drop me a line.

Speaking of PRS+, I looked at the sources. I have to say that the parts that are in the repository are truly much nicer than what I've seen from Sony. Nice work!
Old 04-22-2010, 06:45 AM   #4
kartu
Xaphiosis
Quote:
As for the wiki, thank you for the offer, but I don't think I'm doing anything PRS+ would find useful.
I also think that MobileRead's wiki would be a much more appropriate place. Regarding stuff useful to PRS+, fixing this:
http://www.mobileread.com/forums/sho...29&postcount=9

would be a breakthrough.

Quote:
Speaking of PRS+, I looked at the sources. I have to say that the parts that are in the repository are truly much nicer than what I've seen from sony. Nice work!
Thanks.