Mass mailing or targeted campaigns that use common files to host or exploit code have been and are a very popular vector of attack. In other words, a malicious PDF or MS Office document received via e-mail or opened trough a browser plug-in. In regards to malicious PDF files the security industry saw a significant increase of vulnerabilities after the second half of 2008 which might be related to Adobe Systems release of the specifications, format structure and functionality of PDF files.
From a security incident response perspective the knowledge about how to do a detailed analysis of such malicious files can be quite useful. When analyzing this kind of files an incident handler can determine the worst it can do, its capabilities and key characteristics. Furthermore, it can help to be better prepared and identify future security incidents and how to contain, eradicate and recover from those threats.
So, which steps could an incident handler or malware analyst perform to analyze such files?
- Extract the shellcode
- Create a shellcode executable
- Analyze shellcode and determine what is does.
A summary of tools and techniques using REMnux to analyze malicious documents are described in the cheat sheet compiled by Lenny, Didier and others. In order to practice these skills and to illustrate an introduction to the tools and techniques, below is the analysis of a malicious PDF using these steps.
Then looking deeper we can use pdf-parser.py to display the contents of the 6 objects. The output was reduced for the sake of brevity but in this case the Object 2 is the /XFA element that is referencing to Object 1 which contains a stream compressed and rather suspicious.
3rd Step – Extract the shellcode
These Unicode encoded strings need to be converted into binary. To perform this isolate the Unicode encoded strings into a separated file and convert it the Unicode (\u) to hex (\x) notation. To do this you need using a series of Perl regular expressions using a Remnux script called unicode2hex-escaped. The resulting file will contain the shellcode in a hex format (“\xeb\x06\x00\x00..”) that will be used in the next step to convert it into a binary
4th Step – Create a shellcode executable
Next with the shellcode encoded in hexadecimal format we can produce a Windows binary that runs the shellcode. This is achieved using a script called shellcode2exe.py written by Mario Vilas and later tweaked by Anand Sastry. As Lenny states ” The shellcode2exe.py script accepts shellcode encoded as a string or as raw binary data, and produces an executable that can run that shellcode. You load the resulting executable file into a debugger to examine its. This approach is useful for analyzing shellcode that’s difficult to understand without stepping through it with a debugger.”
5th Step – Analyze shellcode and determine what is does.
Final step is to determine what the shellcode does. To analyze the shellcode you could use a dissasembler or a debugger. In this case the a static analysis of the shellcode using the strings command shows several API calls used by the shellcode. Further also shows a URL pointing to an executable that will be downloaded if this shellcode gets executed
We now have a strong IOC that can be used to take additional steps in order to hunt for evil and defend the networks. This URL can be used as evidence and to identify if machines have been compromised and attempted to download the malicious executable. At the time of this analysis the file was no longer there but its known to be a variant of the Game Over Zeus malware.
The steps followed are manual but with practice they are repeatable. They just represent a short introduction to the multifaceted world of analyzing malicious documents. Many other techniques and tools exist and much deeper analysis can be done. The focus was to demonstrate the 5 Steps that can be used as a framework to discover indicators of compromise that will reveal machines that have been compromised by the same bad guys. However using these 5 steps many other questions could be answered. Using the mentioned and other tools and techniques within the 5 steps we can have a better practical understanding on how malicious documents work and which methods are used by Evil. Two great resource for this type of analysis is the Malware Analyst’s Cookbook : Tools and Techniques for Fighting Malicious Code book from Michael Ligh and the SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Technique authored by Lenny Zeltser.
Download link for the malicious PDF file: https://0x0.st/sZyY.zip . MD5: 4f275c936b0772c969b2daf4688b7fc9