In the first post of this series we introduced YARA and demonstrated a couple of use cases that can be used to justify the integration of this tool throughout the enterprise Incident Response life-cycle. In this post we will step through the requirements for developing YARA rules specially crafted to match patterns in Windows Portable Executable (PE) files. Additionally, we will learn how to take advantage of YARA modules in order to create simple but effective rules. Everything will be wrapped up in a use case where an incident responder (that will be you) creates YARA rules based on the static analysis of a PE file.
Specifically, the use case scenario will be split into two posts. In part 2 (this post) we will start with an incident report that introduces a simple rule development challenge, based solely on static analysis. In part 3 we will cover rule creation, performance tuning and troubleshooting.
Before we begin you will need a Linux distribution with the following tools:
- YARA 3.4.0 (get it here)
- pescanner.py (get it here)
If you are in a hurry, I advise you to pick REMnux, Lenny Zeltser’s popular Linux distro for malware analysis, which includes a generous number of tools and frameworks used in the dark art of malware analysis and reverse engineering. REMnux is available for download here.
Additionally, you will need a piece of malware to analyse. You can get your own copy of the sample from Malwr.com:
Malwr.com report link here Sample MD5: f38b0f94694ae861175436fcb3981061
WARNING: this is real malware. Make sure you do your analysis in a controlled, isolated and safe environment, such as a temporary virtual machine.
It’s Wednesday, 4:00 PM, when an incident report notification email drops into your mailbox. It seems that a network IPS signature was triggered by a suspicious HTTP file download; the MD5 hash of the file is f38b0f94694ae861175436fcb3981061. You check the details of the IPS alert to see whether it stored the sample in a temporary repository for in-depth analysis. You find that the file was successfully stored and that it is of type PE (an executable file), so it definitely deserves to be looked at. After downloading the file you do the usual initial static analysis: Google the MD5, look up the hash in VirusTotal, and analyse the PE header of the file looking for signs of malicious intent. Right off the bat the sample provides a handful of indicators that will help you understand how the file will behave during execution. Just what you needed to start developing your own YARA rules.
Create a YARA rule that matches the following conditions:
- a suspicious string that seems to be related with debug information
- the MD5 hash of the .text section
- the .rsrc section with high entropy
- the import of the GetTickCount symbol
- the rich signature XOR key
- must be a Windows executable file
Before we continue, let me note that the details concerning the structure of the PE file are omitted for the sake of brevity. Please see here and here for more information on the PE header structure. Onward!
The first challenge is to find a string related to debug information left by the linker. Specifically, we will be looking for a program database (PDB) file path. Let’s run the strings command against the sample to output its ASCII strings.
Amid the vast output, the dddd.pdb string stands out. This is probably what we are looking for. Note that it is important to print the file offset in decimal with the -t d option, so that you can pinpoint the string’s location within the file structure. If the string is indeed related to debug information, it should be part of the RSDS header. Let’s dump a few bytes of the sample using the 99136 offset as a pivot.
The presence of the RSDS string gives us the confidence to select dddd.pdb as the string related to the debug information.
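If you prefer to script this check, the two steps above can be sketched in Python. This is only an illustration under a few assumptions: the function names are mine, the string scan only approximates what strings -t d prints, and the buffer is synthetic rather than taken from the real sample. The RSDS record layout used here is the usual CodeView one: a 4-byte RSDS signature, a 16-byte GUID, a 4-byte age, then the NUL-terminated PDB path.

```python
import re
import struct


def ascii_strings(data, min_len=4):
    """Yield (decimal_offset, text) for runs of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def parse_rsds(data, offset):
    """Parse a CodeView RSDS record starting at `offset`.

    Returns (age, pdb_path), or None when the signature is not there.
    """
    if data[offset:offset + 4] != b"RSDS":
        return None
    (age,) = struct.unpack_from("<I", data, offset + 20)  # skip signature + GUID
    path_start = offset + 24
    path_end = data.index(b"\x00", path_start)  # path is NUL-terminated
    return age, data[path_start:path_end].decode("ascii", errors="replace")


# Synthetic bytes standing in for the sample (NOT the real file contents).
blob = (b"\x00" * 8 + b"RSDS" + bytes(range(16))
        + struct.pack("<I", 1) + b"dddd.pdb\x00")

for offset, text in ascii_strings(blob):
    if text.endswith(".pdb"):
        # Look backwards from the string for the closest RSDS signature.
        rsds_at = blob.rfind(b"RSDS", 0, offset)
        record = parse_rsds(blob, rsds_at) if rsds_at != -1 else None
        print(offset, text, record)
```

Against the real sample you would read the file into blob instead of building it by hand. A .pdb string that parses cleanly as the path of an RSDS record is a good sign that it really is linker debug information.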
Next we need to compute the hash of the .text section, which typically contains the executable code. For this task we will use hiddenillusion’s version of pescanner.py, passing the sample name as the argument.
pescanner.py outputs an extensive report about the PE header structure, which includes the list of sections along with their hashes. Take note of the .text section MD5 hash (2a7865468f9de73a531f0ce00750ed17), as we will need it later when creating the YARA rule.
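To make it clearer what a per-section hash covers, here is a minimal standard-library sketch that walks the PE section table and hashes each section’s raw data on disk. It is not how pescanner.py is implemented; the function name section_md5s and the file name in the usage comment are hypothetical, and I am assuming the hash is taken over the raw section bytes.

```python
import hashlib
import struct


def section_md5s(data):
    """Return {section_name: md5_hex} over each section's raw data."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    hashes = {}
    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = data[entry:entry + 8].rstrip(b"\x00").decode("ascii", "replace")
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20.
        raw_size, raw_ptr = struct.unpack_from("<II", data, entry + 16)
        hashes[name] = hashlib.md5(data[raw_ptr:raw_ptr + raw_size]).hexdigest()
    return hashes


# Usage (hypothetical file name):
# with open("sample.exe", "rb") as fh:
#     print(section_md5s(fh.read()).get(".text"))
```

If the value printed for .text differs from the one in the pescanner.py report, the first thing to check is whether both tools hash the same byte range.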
Also in the pescanner.py report we are informed that the .rsrc section has high entropy. This is a suspicious indicator of compressed, encrypted or otherwise heavily obfuscated content. Please keep this in mind when creating the rule, as this information will help us answer the third item in the challenge. The report also features the list of imported symbols, in which we can see GetTickCount, a well-known timing function often used for anti-debugging. This will be required to answer the fourth entry of the challenge. By the way, the report also mentions the file type, indicating that we are in the presence of a PE32 file, which matches the sixth item of the challenge.
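If you are wondering what “high entropy” means in numbers, the following sketch computes the Shannon entropy of a byte string in bits per byte, on a scale from 0.0 (a single repeated byte) to 8.0 (all byte values equally likely). The function name and the 7.0 threshold are my own assumptions for illustration; the threshold is a common rule of thumb, not a value taken from this sample’s report.

```python
import math
from collections import Counter

# Rule-of-thumb threshold (an assumption, tune it for your own data).
HIGH_ENTROPY_THRESHOLD = 7.0


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


padding = b"\x00" * 1024        # constant data, lowest possible entropy
uniform = bytes(range(256)) * 4  # every byte value equally frequent

for label, chunk in (("padding", padding), ("uniform", uniform)):
    value = shannon_entropy(chunk)
    print(label, round(value, 2), value > HIGH_ENTROPY_THRESHOLD)
```

Plain code and text usually land well below the top of the scale, which is why a section close to 8.0 deserves a second look.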
Lastly, we need to get our hands on the XOR key used to encode the Rich signature; read more about the Rich signature here. You can check the existence of this key in two ways. Traditionally, you would dump the first bytes of the sample, enough to cover the entire DOS header of the PE file. The Rich signature starts at file offset 0x80, and the XOR key is located in the dword that follows the Rich ASCII string.
Bear in mind that the x86 byte order is little-endian; therefore you need to byte-swap the dword value, so the XOR key value is 0x887f83a7, or 2290058151 in decimal.
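The byte swap is easy to get wrong by hand, so here is a small sketch that lets struct do it. The helper name rich_xor_key is hypothetical, and the header below is a synthetic buffer that only reproduces the four key bytes in the order they would appear in a hex dump for the key quoted above.

```python
import struct


def rich_xor_key(data):
    """Return the little-endian dword that follows the 'Rich' marker, or None."""
    pos = data.find(b"Rich")
    if pos == -1 or pos + 8 > len(data):
        return None
    (key,) = struct.unpack_from("<I", data, pos + 4)  # "<I" = little-endian dword
    return key


# Synthetic header: padding, the Rich marker, then the key bytes as stored on disk.
header = b"\x00" * 16 + b"Rich" + bytes.fromhex("a7837f88")

key = rich_xor_key(header)
print(hex(key), key)  # 0x887f83a7 2290058151
```

On a real file you would pass in the first few hundred bytes of the sample rather than a hand-built buffer.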
Now for the easy way. Remember when I mentioned in the first post of this series that the YARA scan engine is modular and feature rich? This means you can use YARA pretty much like pescanner.py to obtain valuable information on the PE header structure. Let’s start by creating a YARA rule file named rule.yar that imports the pe module.
Next, run YARA with rule.yar against the sample.
By using the --print-module-data argument, YARA will output the report of the PE module, which includes the rich_signature section along with the XOR key as a decimal value.
OK, we have now gathered all the information required to start creating the YARA rule and finish the challenge. In part 3 of this series we will cover the YARA rule creation process, using the information gathered from static analysis. Stay tuned!
- Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software, (page 22)