Understanding JIT spray

(written by Lawrence Krubner, however indented passages are often quotes)

A very clever hack:

As mentioned earlier, the JIT has to mark its own assembly buffers as executable. An attacker may look at using that fact to generate executable stage 0 shellcode, in order to bypass some of the pain inflicted by DEP. But how could you possibly use the JIT compilation process to make shellcode?

JIT spraying is the process of coercing the JIT engine to write many executable pages with embedded shellcode.

—Blazakis, 2010

Dion Blazakis wrote the seminal paper on JIT spray, in which he presented a jaw-dropping example. Blazakis noticed that the following ActionScript code:

var y = (
0x3c54d0d9 ^
0x3c909058 ^
0x3c59f46a ^
0x3c90c801 ^
0x3c9030d9 ^
0x3c53535b ^

)
was JIT-compiled into the following instruction sequence:

addr  op  imm       assembly
0     B8  D9D0543C  MOV EAX,3C54D0D9
5     35  5890903C  XOR EAX,3C909058
10    35  6AF4593C  XOR EAX,3C59F46A
15    35  01C8903C  XOR EAX,3C90C801
20    35  D930903C  XOR EAX,3C9030D9
25    35  5B53533C  XOR EAX,3C53535B

Check out the first line — it’s showing that the first instruction is a MOV that places the 32-bit immediate payload into the EAX register. The 32-bit immediate payload from that instruction (3C54D0D9) is exactly the immediate that was used as the left-hand-side to the long XOR sequence in the original ActionScript code.

Now, if we look at the subsequent lines, we see that the addr column, which is showing the address of instructions relative to the start of the sequence, goes up by 5 every time. That’s because each instruction after the initial MOV is performing an XOR against the original value in the accumulator register, EAX, exactly as the ActionScript program described.

Each of these instructions is exactly five bytes long: each instruction has a one-byte opcode, given under the op column, followed by a 32-bit immediate constant. The opcode for MOV EAX,[imm32] is 0xB8, and the opcode for XOR EAX,[imm32] is 0x35.
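The five-byte layout described above can be sketched in C. This only models the bytes shown in the table (one opcode byte plus a 32-bit immediate per instruction); it is not how any real JIT emits code, and the names emit_xor_chain, OP_MOV_EAX_IMM32, OP_XOR_EAX_IMM32 and INSN_LEN are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode bytes taken from the table above; names are illustrative. */
enum { OP_MOV_EAX_IMM32 = 0xB8, OP_XOR_EAX_IMM32 = 0x35, INSN_LEN = 5 };

/* Write "mov eax, imms[0]" followed by "xor eax, imms[i]" for i >= 1.
 * Each instruction is one opcode byte plus the immediate stored least
 * significant byte first. Returns the number of bytes written, or 0 if
 * the output buffer is too small. */
size_t emit_xor_chain(const uint32_t *imms, size_t count,
                      unsigned char *out, size_t out_cap)
{
    assert(imms != NULL && out != NULL);
    if (count > out_cap / INSN_LEN) {
        return 0;
    }
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        out[pos++] = (i == 0) ? OP_MOV_EAX_IMM32 : OP_XOR_EAX_IMM32;
        for (int b = 0; b < 4; b++) {
            out[pos++] = (unsigned char)((imms[i] >> (8 * b)) & 0xFF);
        }
    }
    return pos;
}
```

Feeding it the six constants from the ActionScript snippet should yield 30 bytes, with an opcode byte at every offset that is a multiple of 5, which is why the addr column advances in steps of 5.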

The immediate column may look confusing at a glance, but it’s actually just the little-endian equivalent of the 32-bit immediate given in the assembly: the “little end” (least significant byte) goes “in” (at the lowest memory address), which is why the byte order looks flipped around from the one given in the assembly (and in the original ActionScript program).
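To make the byte-order flip concrete, here is a minimal sketch of little-endian encoding and decoding of a 32-bit immediate. The function names encode_imm32_le and decode_imm32_le are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit immediate the way x86 lays it out in memory:
 * least significant byte at the lowest address. */
void encode_imm32_le(uint32_t imm, unsigned char out[4])
{
    assert(out != 0);
    for (int i = 0; i < 4; i++) {
        out[i] = (unsigned char)((imm >> (8 * i)) & 0xFF);
    }
}

/* Inverse operation: rebuild the value from four little-endian bytes. */
uint32_t decode_imm32_le(const unsigned char in[4])
{
    assert(in != 0);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Encoding 0x3C54D0D9 this way gives the bytes D9 D0 54 3C, matching the imm column of the first row.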

It may not look so sinister, but the above table is actually deceiving you! In the table, all of the instructions are the same number of bytes (5) in length. On x86 CPUs, however, instructions are actually a variable number of bytes in length: instructions can be as small as a single byte, but can get quite long: the nop instruction is just a 0x90 opcode byte with no operands, whereas the movl $0xdeadbeef, 0x12345678(%ebx,%edx,1) instruction is significantly larger.

As a result, when we look at this instruction sequence “crooked” — with a one-byte skew to the address — we decode a totally different sequence of instructions. I’ll show you what I mean.

Our instructions in memory look like the following buffer:

static const char buf[] = {
    0xB8, 0xD9, 0xD0, 0x54, 0x3C,
    0x35, 0x58, 0x90, 0x90, 0x3C,
    0x35, 0x6A, 0xF4, 0x59, 0x3C,
    0x35, 0x01, 0xC8, 0x90, 0x3C,
    0x35, 0xD9, 0x30, 0x90, 0x3C,
    0x35, 0x5B, 0x53, 0x53, 0x3C
};

When we load this up in GDB, and run the disassemble command, we confirm the instructions present in the above table:

(gdb) disassemble/r buf
Dump of assembler code for function buf:
0x08048460 <+0>: b8 d9 d0 54 3c mov eax,0x3c54d0d9
0x08048465 <+5>: 35 58 90 90 3c xor eax,0x3c909058
0x0804846a <+10>: 35 6a f4 59 3c xor eax,0x3c59f46a
0x0804846f <+15>: 35 01 c8 90 3c xor eax,0x3c90c801
0x08048474 <+20>: 35 d9 30 90 3c xor eax,0x3c9030d9
0x08048479 <+25>: 35 5b 53 53 3c xor eax,0x3c53535b

But then, if we look at the buffer with a one-byte offset, we see a totally different set of instructions! Note the use of buf+1 as the disassembly target:

(gdb) disassemble/r (buf+1), (buf+sizeof(buf))
Dump of assembler code from 0x8048461 to 0x804847e:
0x08048461 : d9 d0 fnop
0x08048463 : 54 push esp
0x08048464 : 3c 35 cmp al,0x35
0x08048466 : 58 pop eax
0x08048467 : 90 nop
0x08048468 : 90 nop
0x08048469 : 3c 35 cmp al,0x35
0x0804846b : 6a f4 push 0xfffffff4
0x0804846d : 59 pop ecx
0x0804846e : 3c 35 cmp al,0x35
0x08048470 : 01 c8 add eax,ecx
0x08048472 : 90 nop
0x08048473 : 3c 35 cmp al,0x35
0x08048475 : d9 30 fnstenv [eax]
0x08048477 : 90 nop
0x08048478 : 3c 35 cmp al,0x35
0x0804847a : 5b pop ebx
0x0804847b : 53 push ebx

If you look down the middle part of the two disassemblies, before the assembly mnemonics, you can read that the bytes are the same from left to right: the first line of the first disassembly reads b8 d9 d0 54 3c, and the second disassembly starts on the second byte of that same sequence with d9 d0 54 3c, straddling multiple instructions. This is the magic of variable-length instruction encoding: when you look at an instruction stream a little bit sideways, things can change very drastically.
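The effect of the starting offset can be illustrated with a toy length decoder. The sketch below is an assumption-laden simplification: it recognizes only the handful of 32-bit x86 opcodes that occur in this particular buffer, returns 0 for everything else, and is in no way a general disassembler. The names insn_length and count_insns are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the instruction starting at p[0], covering only the
 * opcodes that appear in this example. Returns 0 for any other opcode,
 * or when the instruction would run past the `avail` remaining bytes. */
size_t insn_length(const unsigned char *p, size_t avail)
{
    assert(p != NULL);
    if (avail == 0) {
        return 0;
    }
    unsigned char op = p[0];
    size_t len;
    if (op == 0xB8 || op == 0x35) {
        len = 5;                      /* mov eax,imm32 / xor eax,imm32 */
    } else if (op == 0x3C || op == 0x6A) {
        len = 2;                      /* cmp al,imm8 / push imm8 */
    } else if (op == 0x90 || (op >= 0x50 && op <= 0x5F)) {
        len = 1;                      /* nop, push reg, pop reg */
    } else if (op == 0x01 || op == 0xD9) {
        /* Opcode followed by a ModRM byte. Only the forms with no SIB
         * byte and no displacement are handled here. */
        if (avail < 2) {
            return 0;
        }
        unsigned mod = p[1] >> 6;
        unsigned rm = p[1] & 7;
        if (mod == 3 || (mod == 0 && rm != 4 && rm != 5)) {
            len = 2;
        } else {
            return 0;
        }
    } else {
        return 0;
    }
    return (len <= avail) ? len : 0;
}

/* Count the complete instructions decoded when walking buf from `start`. */
size_t count_insns(const unsigned char *buf, size_t size, size_t start)
{
    size_t n = 0;
    size_t pos = start;
    while (pos < size) {
        size_t len = insn_length(buf + pos, size - pos);
        if (len == 0) {
            break;                    /* unknown or truncated instruction */
        }
        pos += len;
        n++;
    }
    return n;
}
```

Walking the 30-byte buffer from offset 0 finds the six five-byte instructions, while walking the same bytes from offset 1 finds many more, mostly one and two bytes long, because each 3C byte swallows the following 35 opcode as its operand.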

Source