Patching iphone-gcc binaries to armv7s
Theos Issue #53 is, as described:
and typing “$THEOS/bin/nic.pl”
my MobileTerminal executes:
Illegal Instruction: 4
So, this means that perl got us an Illegal Instruction error. Oh no!
The solution, posted below it, is:
sed -i'' 's/\x00\x30\x93\xe4/\x00\x30\x93\xe5/g;s/\x00\x30\xd3\xe4/\x00\x30\xd3\xe5/g;' /usr/bin/perl
ldid -s /usr/bin/perl
There’s the solution! But since it’s already out there, why this blog post?
Well, this issue re-surfaced on #theos earlier today, so I figured I’d go around and find out why this is so. I ended up learning a bit more about how ARM assembly and related things work, so well, why not put it in a blog post? This tumblr sucks anyway.
So, the second line of the previous script just re-signs the binary, so we can ignore it. The first line applies a binary patch. The question is, what does it patch?
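To make that first line concrete, here is a minimal Python sketch of the same two substitutions applied to an in-memory byte string. The helper name `patch_bytes` and the sample buffer are made up for illustration; only the byte patterns come from the sed expression.

```python
# Byte patterns copied from the sed expression: (old, new) pairs.
SUBSTITUTIONS = [
    (b"\x00\x30\x93\xe4", b"\x00\x30\x93\xe5"),
    (b"\x00\x30\xd3\xe4", b"\x00\x30\xd3\xe5"),
]


def patch_bytes(data: bytes) -> bytes:
    """Apply both substitutions wherever they occur (hypothetical helper)."""
    for old, new in SUBSTITUTIONS:
        data = data.replace(old, new)
    return data


# Made-up sample: one occurrence of each pattern with filler bytes around them.
sample = b"\xff\xff" + b"\x00\x30\x93\xe4" + b"\x01\x02" + b"\x00\x30\xd3\xe4"
patched = patch_bytes(sample)
print(sample.hex())
print(patched.hex())
```

Like the sed one-liner, this matches the four bytes anywhere in the data, whether or not they sit on an instruction boundary, so in principle it could also touch data that merely looks like the pattern.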
Since I don’t in fact have an iPhone 5, let’s assume the patch’s author ran the Illegal Instruction-crashing program in gdb and found the bad instruction by checking where pc pointed. As for us, we’ll have to do this by looking at the patch itself.
We shall take the ntpdate binary, which was cited as an example of something the patch fixed in the blog post where the patch was first introduced.
otool -tV ./ntpdate > ./ntpdate_disassembly
# apply binary patch
otool -tV ./ntpdate > ./new_ntpdate_disassembly
diff -u ./ntpdate_disassembly ./new_ntpdate_disassembly
With this diff, we get this change repeatedly:
-000026a0 e4933000 ldr r3, [r3]
+000026a0 e5933000 ldr r3, [r3]
-000027e0 e4d33000 ldrb r3, [r3]
+000027e0 e5d33000 ldrb r3, [r3]
So! The issue is in the encoding generated for the ldr and ldrb instructions, and when the top byte changes from e4 to e5 it magically appears to work! OK!
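To see what actually differs between the e4 and e5 words, one can split them into the fields of the ARM load/store (single data transfer) encoding. The sketch below is only an illustration; `decode_load` is a made-up helper, and the field layout is the standard one for 32-bit ARM ldr/ldrb/str/strb instructions.

```python
def decode_load(word: int) -> dict:
    """Split a 32-bit ARM single data transfer word into its fields."""
    return {
        "cond": (word >> 28) & 0xF,  # 0xE = always execute
        "I": (word >> 25) & 1,       # 0 = immediate offset, 1 = register offset
        "P": (word >> 24) & 1,       # 1 = offset/pre-indexed, 0 = post-indexed
        "U": (word >> 23) & 1,       # 1 = add offset, 0 = subtract it
        "B": (word >> 22) & 1,       # 1 = byte access (ldrb), 0 = word (ldr)
        "W": (word >> 21) & 1,       # writeback flag (meaningful when P = 1)
        "L": (word >> 20) & 1,       # 1 = load, 0 = store
        "Rn": (word >> 16) & 0xF,    # base register
        "Rd": (word >> 12) & 0xF,    # destination register
        "imm12": word & 0xFFF,       # immediate offset
    }


# The word pairs from the diff: (before the patch, after the patch).
for old, new in [(0xE4933000, 0xE5933000), (0xE4D33000, 0xE5D33000)]:
    a, b = decode_load(old), decode_load(new)
    changed = [name for name in a if a[name] != b[name]]
    print(f"{old:08x} -> {new:08x}: changed fields {changed}")
```

In both pairs only the P bit flips: the e4 words are the post-indexed form with a zero offset, the e5 words the plain offset form. If I read the ARM Architecture Reference Manual right, post-indexed loads always write the base register back, and a writeback load whose base and destination are the same register is listed as UNPREDICTABLE. That would fit the e4 words behaving differently across cores, but treat it as my reading rather than a confirmed explanation.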
There are still many questions surrounding this. After I talked with DHowett about the issue, he pointed out that this was in fact a known issue with the instructions in crt1.o from Telesphoreo’s csu.
Indeed! In the binary, ldr r3, [r3] instructions with the e4 opcode prefix only happened in functions exported by crt1.o (_crt_basename, __start). The same instruction when in other places in the binary has the other (correct) prefix!
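A check like this can be approximated without a disassembler. The sketch below lists the addresses of the e4 words in a chunk of code bytes, so they can be compared against symbol addresses from a tool such as nm or otool. It assumes the bytes of the text section have already been extracted somehow; `find_bad_words`, the base address and the sample section are all made up for illustration.

```python
import struct

# The two suspicious words from the diff, as stored in the file (little-endian).
BAD_WORDS = {
    struct.pack("<I", 0xE4933000): "ldr r3, [r3] (e4 form)",
    struct.pack("<I", 0xE4D33000): "ldrb r3, [r3] (e4 form)",
}


def find_bad_words(text: bytes, base_addr: int = 0) -> list:
    """Return (address, description) for each 4-byte-aligned suspicious word."""
    hits = []
    for offset in range(0, len(text) - 3, 4):
        label = BAD_WORDS.get(text[offset:offset + 4])
        if label is not None:
            hits.append((base_addr + offset, label))
    return hits


# Made-up section contents: one good word, then a bad ldr and a bad ldrb.
section = struct.pack("<III", 0xE5933000, 0xE4933000, 0xE4D33000)
for addr, label in find_bad_words(section, base_addr=0x2000):
    print(f"{addr:08x}  {label}")
```

Unlike the sed patch, this only looks at 4-byte boundaries, which is where ARM-mode instructions live.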
So as it turns out, the issue isn’t an obscure problem with iphone-gcc generating bad instructions that happened to work, undocumented, on armv6 and armv7 but not on armv7s, nor an obscure instruction that exists on armv6 and armv7 but not on armv7s. Rather, it’s an issue with the crt the toolchain used, whose machine code for some reason contains a non-standard opcode which worked on armv6 and armv7 but not on armv7s.
The task is now to replace it with the proper armv6 opcode.
The simplest way to find out the correct opcode is to assemble the two instructions ourselves:
cat >test.s <<EOF
start:
ldr r3, [r3]
ldrb r3, [r3]
EOF
as -arch armv6 test.s -o test.o
otool -tV test.o
And there we have it:
test.o:
(__TEXT,__text) section
start:
00000000 e5933000 ldr r3, [r3]
00000004 e5d33000 ldrb r3, [r3]
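Without an ARM assembler at hand, the same two words can be cross-checked by building them from the instruction fields. This is a sketch under the standard ARM load/store encoding; `encode_load` is a hypothetical helper that only covers the `ldr rd, [rn]` and `ldrb rd, [rn]` forms with a zero offset.

```python
def encode_load(rd: int, rn: int, byte: bool = False) -> int:
    """Encode `ldr rd, [rn]` (or `ldrb` when byte=True) as a 32-bit ARM word."""
    cond = 0xE                          # AL: execute unconditionally
    word = (cond << 28) | (0b01 << 26)  # single data transfer instruction class
    word |= 1 << 24                     # P = 1: offset addressing, not post-indexed
    word |= 1 << 23                     # U = 1: add the (zero) offset
    word |= int(byte) << 22             # B: byte access for ldrb
    word |= 1 << 20                     # L = 1: load
    word |= (rn & 0xF) << 16            # base register
    word |= (rd & 0xF) << 12            # destination register
    return word                         # imm12 stays 0


print(f"{encode_load(3, 3):08x}")             # ldr r3, [r3]
print(f"{encode_load(3, 3, byte=True):08x}")  # ldrb r3, [r3]
```

Both results match what the assembler produced.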
So finally, analyzing the patch closely:
s/\x00\x30\x93\xe4/\x00\x30\x93\xe5/g;s/\x00\x30\xd3\xe4/\x00\x30\xd3\xe5/g;
This, in fact, can be reduced to two substitutions:
00 30 93 e4 -> 00 30 93 e5
00 30 d3 e4 -> 00 30 d3 e5
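Knowing that ARM instruction words are written to the binary in reverse byte order (that is, if an opcode is e4 93 30 00, it’ll be written to the binary as 00 30 93 e4) [1], understanding why the patch works is easy: each substitution swaps one of the e4 words from the diff for the e5 word the assembler produces.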
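The byte order is easy to check in Python. A small sketch, with `file_bytes` as a made-up helper name, packs each instruction word the way a little-endian system stores it:

```python
import struct


def file_bytes(word: int) -> bytes:
    """Return the 4 bytes of a 32-bit word as stored on a little-endian system."""
    return struct.pack("<I", word)


# The four instruction words seen in the disassembly.
for word in (0xE4933000, 0xE5933000, 0xE4D33000, 0xE5D33000):
    print(f"{word:08x} -> {file_bytes(word).hex()}")
```

The resulting byte strings are exactly the patterns on both sides of the sed substitutions.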
The big question remains, and can’t be answered: How and why did crt1.o get built with that faulty opcode? We are left with:
[01:10:56] <@DHowett> anyway: [they] realized that [they] never knew where exactly that file came from
[1] As requested, more detail: This is because we’re on little-endian systems, which means data wider than 1 byte (8 bits), like 32-bit instruction words, is stored in memory and in files starting from the little end, that is, the least significant byte. For example, 64353 = 0xFB61 = 1111101101100001, which split into bytes is FB 61, or 11111011 01100001. Starting from the little end of the number, we get 01100001 11111011, or 61 FB; the same applies to instruction opcodes.