At M86 Security Labs, we research various attacks on a daily basis. Some of these attacks originate from malicious PDF files.
One distinctive characteristic of malicious PDF files is a chunk of JavaScript code that performs a heap spray in the client browser: it fills the heap with NOP (No OPeration) instructions (whose byte values also act as a valid heap memory address), followed by the attacker’s shellcode. It then triggers a bug in the PDF reader, which directs the flow of execution to a random memory location on the sprayed heap, executing the NOP sled followed by the shellcode.
While investigating the latest PDF 0day exploit [CVE-2010-4091, Extraexploit, VUPEN, Original Full-Disclosure post] that was published to the Full-Disclosure mailing list, we noticed something interesting. The shellcode part of the malicious JavaScript code was tiny:
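To see why jumping to a "random" location on the sprayed heap is usually good enough, here is a toy model in Python. It only counts offsets and contains no shellcode. The block layout and the sizes `SLED_LEN` and `PAYLOAD_LEN` are hypothetical values chosen for illustration, not figures taken from the sample.

```python
import random

# Hypothetical sizes for one sprayed block: a long NOP sled, then a short payload.
SLED_LEN = 4096
PAYLOAD_LEN = 64
BLOCK_LEN = SLED_LEN + PAYLOAD_LEN


def lands_in_sled(offset: int) -> bool:
    """Return True if a jump to this offset inside the spray hits a NOP sled.

    Landing anywhere in the sled means execution slides forward into the
    payload that follows it. Landing in the middle of a payload is counted
    as a miss in this simplified model.
    """
    return offset % BLOCK_LEN < SLED_LEN


def hit_rate(trials: int, blocks: int, seed: int = 0) -> float:
    """Estimate the fraction of uniformly random jump targets that hit a sled."""
    rng = random.Random(seed)
    spray_size = blocks * BLOCK_LEN
    hits = sum(lands_in_sled(rng.randrange(spray_size)) for _ in range(trials))
    return hits / trials
```

Under these assumptions the hit rate approaches `SLED_LEN / BLOCK_LEN`, which is why the sled is made much larger than the payload. The model ignores instruction alignment and assumes the jump lands somewhere inside the sprayed region.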

Here’s a disassembly view of the shellcode:
What we’re seeing is a known shellcode technique called egg hunting. The shellcode itself is very small (and usually free of null bytes), and its sole purpose is to search the memory space of the process for the real shellcode. More advanced versions search for one or more parts of the real shellcode, collect the pieces together, and then execute the reassembled shellcode.
It is used mainly in attacks where the attacker cannot place a large amount of shellcode at the point where they gain control of code execution, but can control data elsewhere in the memory space of the process without knowing the exact address of that data.
Notice how the egghunter shellcode uses int 0x2e to call the nt!NtDisplayString kernel function, passing it, on the stack, a pointer to the address to check (the edx register points to the user-land stack, while eax holds the System Service Code, an index into the nt!KiServiceTable pointer array that selects the nt!NtDisplayString function). You can read more about “How do Windows NT system calls really work?” in this great article.
If the memory address is unmapped in the address space of the process, an access violation occurs and the return value in the eax register is 0xc0000005 (STATUS_ACCESS_VIOLATION).
The egghunter shellcode compares the low byte of eax to 5, which indicates unmapped memory, and increments the address to check on each loop iteration.
Each mapped memory region is searched for the pattern \x90\x50\x90\x58, which translates to:
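The low-byte comparison can be illustrated with a few lines of Python. This is a sketch of the check only. The helper names `low_byte` and `looks_unmapped` are made up for this example, and the status value is the one quoted above.

```python
STATUS_ACCESS_VIOLATION = 0xC0000005  # value returned in eax for an unmapped address


def low_byte(status: int) -> int:
    """Return the lowest 8 bits of a 32-bit status code (what the al register holds)."""
    return status & 0xFF


def looks_unmapped(status: int) -> bool:
    """Mimic the egghunter's test: treat any status whose low byte is 5 as 'unmapped'."""
    return low_byte(status) == 0x05
```

Comparing only the low byte is looser than comparing the full 32-bit value, since any status code ending in 0x05 would match. A likely reason for the trade-off is size: on x86, `cmp al, 5` encodes in two bytes (3C 05), whereas comparing eax against the full constant needs five bytes and embeds null bytes (3D 05 00 00 C0).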
90 – NOP
50 – PUSH EAX
90 – NOP
58 – POP EAX
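This sequence is a ‘non-intrusive’ marker (effectively a NOP, since pushing and then popping eax leaves the registers unchanged) indicating the beginning of the real shellcode.
Once the marker is found, the egghunter jumps to its address and continues execution from there.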
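The search loop described above can be sketched as a simulation in Python. This is not the sample's code: the address space is modelled as a dictionary of hypothetical mapped pages, `probe` stands in for the int 0x2e system call, and skipping a whole page after a failed probe is an assumption of this sketch (a common egghunter optimisation) rather than something confirmed from the sample.

```python
from typing import Dict, Optional

PAGE_SIZE = 0x1000
STATUS_SUCCESS = 0x00000000
STATUS_ACCESS_VIOLATION = 0xC0000005
EGG = b"\x90\x50\x90\x58"  # NOP, PUSH EAX, NOP, POP EAX


def probe(memory: Dict[int, bytes], address: int) -> int:
    """Stand-in for the system call: report whether the address is mapped."""
    page = address & ~(PAGE_SIZE - 1)
    return STATUS_SUCCESS if page in memory else STATUS_ACCESS_VIOLATION


def read(memory: Dict[int, bytes], address: int, count: int) -> bytes:
    """Read up to `count` bytes from a mapped page (does not cross page boundaries)."""
    page = address & ~(PAGE_SIZE - 1)
    offset = address - page
    return memory[page][offset:offset + count]


def hunt(memory: Dict[int, bytes], start: int, end: int) -> Optional[int]:
    """Walk the simulated address range and return the address of the egg, if any."""
    address = start
    while address < end:
        if probe(memory, address) & 0xFF == 0x05:
            # Unmapped: move to the first byte of the next page (sketch assumption).
            address = (address | (PAGE_SIZE - 1)) + 1
            continue
        if read(memory, address, len(EGG)) == EGG:
            return address
        address += 1
    return None
```

For example, with a single mapped page at 0x3000 that holds the marker at offset 0x10, `hunt` skips the unmapped pages before it and returns 0x3010. A real egghunter would then jump to that address. The simulation only reports it, and it will not find a marker that straddles two pages.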
As a side note, this particular sample is not functional: when the vulnerability is triggered by executing the this.printSeps() JavaScript code, the egghunter shellcode is never executed and the browser crashes.
It would nevertheless be interesting to see where egg-hunting exploits decide to place the real shellcode in future PDF attacks.

