Tag Archives: REMnux

Unleashing YARA – Part 3

yara-logoIn the second post of this series we introduced an incident response challenge based on the static analysis of a suspicious executable file. The challenge featured 6 indicators that needed to be extracted from the analysis in order to create a YARA rule to match the suspicious file. In part 3 we will step through YARA’s PE, Hash and Math modules functions and how they can help you to meet the challenge objectives. Lets recap the challenge objectives and map it with the indicators we extracted from static analysis:

  1. a suspicious string that seems to be related with debug information
    • dddd.pdb
  2. the MD5 hash of the .text section
    • 2a7865468f9de73a531f0ce00750ed17
  3. the .rsrc section with high entropy
    • .rsrc entropy is 7.98
  4. the symbol GetTickCount import
    • Kernel32.dll GetTickCount is present in the IAT
  5. the rich signature XOR key
    • 2290058151
  6. must be a Windows executable file
    • 0x4D5A (MZ) found at file offset zero

In part 2 we created a YARA rule file named rule.yar, with the following content:

import "pe"

If you remember the exercise, we needed the PE module in order to parse the sample and extract the Rich signature XOR key. We will use this rule file to develop the remaining code.

The debug information string

In part 1 I have introduced YARA along with the rule format, featuring the strings and condition sections. When you add the dddd.pdb string condition the rule code should be something like:

yara_3_1

The code above depicts a simple rule object made of a single string variable named $str01 with the value set to the debug string we found.

The section hash condition

Next item to be added to the condition is the .text section hash, using both PE and HASH modules. To do so we will iterate over the PE file sections using two PE module functions: the number_of_sections and sections. The former will be used to iterate over the PE sections, the latter will allow us to fetch section raw_data_offset, or file offset, and raw_data_size, that will be passed as arguments to md5 hash function, in order to compute the md5 hash of the section data:

yara_3_2

The condition expression now features the for operator comprising two conditions: the section md5 hash and the section name. In essence, YARA will loop through every PE section until it finds a match on the section hash and name.

The resource entropy value

Its now time to add the resource entropy condition. To do so, we will rely on the math module, which will allow us to calculate the entropy of a given size of bytes. Again we will need to iterate over the PE sections using two conditions: the section entropy and the section name (.rsrc):

yara_3_3

Again we will loop until we find a match, that is a section named .rsrc with entropy above or equal to 7.0. Remember that entropy minimum value is 0.0 and maximum is 8.0, therefore 7.0 is considered high entropy and is frequently associated with packing [1]. Bear in mind that compressed data like images and other types of media can display high entropy, which might result in some false positives [2].

The GetTickCount import

Lets continue improving our YARA rule by adding the GetTickCount import to the condition. For this purpose lets use the PE module imports function that will take two arguments: the library and the DLL name. The GetTickCount function is exported by Kernel32.DLL, so when we passe these arguments to the pe.imports function the rule condition becomes:

yara_3_4

Please note that the DLL name is case insensitive [3].

The XOR key

Our YARA rule is almost complete, we now need to add the rich signature key to the condition. In this particular case the PE module provides the rich_signature function which allow us to match various attributes of the rich signature, in this case the key. The key will be de decimal value of dword used to encode the contents with XOR:

yara_3_5

Remember that the XOR key can be obtained either by inspecting the file with a hexdump of the PE header or using YARA PE module parsing capabilities, detailed in part 2 of this series.

The PE file type

Ok, we are almost done. The last condition will ensure that the file is a portable executable file. In part two of this series we did a quick hex dump of the samples header, which revealed the MZ (ASCII) at file offset zero, a common file signature for PE files. We will use the YARA int## functions to access data at a given position. The int## functions read 8, 16 and 32 bits signed integers, whereas the uint## reads unsigned integers. Both 16 and 32 bits are considered to be little-endian, for big-endian use int##be or uint##be.

Since checking only the first two bytes of the file can lead to false positives we can use a little trick to ensure the file is a PE, by looking for particular PE header values. Specifically we will check for the IMAGE_NT_HEADER Signature member, a dword with value “PE\0\0”. Since the signature file offset is variable we will need to rely on the IMAGE_DOS_HEADER e_lfanew field. e_lfanew value is the 4 byte physical offset of the PE Signature and its located at physical offset 0x3C [4].

With the conditions “MZ” and “PE\0\0” and respective offsets we will use uint16 and uint32 respectively:

yara_3_6

Note how we use the e_lfanew value to pivot the PE Signature, the first uint32 function output, the 0x3C offset, is used as argument in the second uint32 function, which must match the expected value “PE\0\0”.

Conclusion

Ok! We are done, last step is to test the rule against the file using the YARA tool and our brand new rule file rule.yar:

yara_3_7

YARA scans the file and, as expected, outputs the rule matched rule ID, in our case malware001.

A final word on YARA performance

While YARA performance might be of little importance if you are scanning a dozen of files, poorly written rules can impact significantly when scanning thousands or millions of files. As a rule of thumb you are advised to avoid using regex statements. Additionally you should ensure that false conditions appear first in the rules condition, this feature is named short-circuit evaluation and it was introduced in YARA 3.4.0 [5]. So how can we improve the rule we just created, in order to leverage YARA performance? In this case we can move the last condition, the PE file check signature, to the top of the statement, by doing so we will avoid checking for the PE header conditions if the file is an executable (i.e. PDF, DOC, etc). Lets see how the new rule looks like:

yara_3_8

If you like to learn more about YARA performance, check the Yara performance guidelines by Florian Roth, as it features lots of tips to keep your YARA rules resource friendly.

References

  1. Structural Entropy Analysis for Automated Malware Classification
  2. Practical Malware Analysis, The Hands-On Guide to Dissecting Malicious Software, Page 283.
  3. YARA Documentation v.3.4.0, PE Module
  4. The Portable Executable File Format
  5. YARA 3.4.0 Release notes
Tagged , ,

Unleashing YARA – Part 2

yarapart2In the first post of this series we uncovered YARA and demonstrated couple of use case that that can be used to justify the integration of this tool throughout the enterprise Incident Response life-cycle. In this post  we will step through the requirements for the development of YARA rules specially crafted to match patterns in Windows portable executable “PE” files. Additionally, we will learn how to take advantage of Yara modules in order to create simple but effective rules. Everything will be wrapped-up in a use case where an incident responder, that will be you, will create YARA rules based on the static analysis of a PE file.

Specifically, the use case scenario will be split into two posts. In part 2 we will start with an incident report that will introduce a simple rule development challenge, solely based on static analysis. In the part 3,  will cover rule creation, performance tuning and troubleshooting.

Prerequisites

Before we begin you will need a Linux distribution with the following tools:

If you are in a hurry I advise you to pick REMnux, Lenny Zeltser’s popular Linux distro for malware analysis, which include a generous amount of tools and frameworks used in the dark art of malware analysis and reverse engineering. REMnux is available for download here.

Additionally you will need a piece of malware to analyse, you can get your own copy of the sample from Malwr.com:

Malwr.com report link here

Sample MD5: f38b0f94694ae861175436fcb3981061

WARNING: this is real malware, ensure you will do your analysis in a controlled, isolated and safe environment, like a temporary virtual machine.

Incident Report

Its Wednesday 4:00PM when a incident report notification email drops on your mailbox. It seems that a Network IPS signature was triggered by a suspicious HTTP file download (f38b0f94694ae861175436fcb3981061) hash of a file. You check the details of the IPS alert to see if it stored the sample in a temporary repository for in-depth analysis. You find that the file was successfully stored and its of type PE (executable file), definitely deserves to be look at. After downloading the file you do the usual initial static analysis: Google for the MD5, lookup the hash in Virustotal, analyse the PE header of the file looking for malicious intent. Right of the bat the sample provides a handful of indicators that will help you to understanding how the file will behave during execution. Just what you needed to start developing your own YARA rules.

The challenge

Create a YARA rule that matches the following conditions:

  1. a suspicious string that seems to be related with debug information
  2. the MD5 hash of the .text section
  3. the .rsrc section with high entropy
  4. the symbol GetTickCount import
  5. the rich signature XOR key
  6. must be a Windows executable file

Static Analysis

Before we continue let me write that the details concerning the structure of the PE file are omitted for the sake of brevity. Please see here and here for more information on PE header structure. Onward!

The first challenge is to find a string related with debug information left by the linker [1], specifically we will be looking for a program database file path (i.e. PDB). Lets  run the strings command to output the ASCII strings:

yara_2.1_1

strings output

Amid the vast output the dddd.pdb string stands out. This is probably what we are looking for. Note that is important to output the file offset in decimal with -t d suffix so that you can pinpoint the string location within the file structure. If the string is indeed related to debug information it should be part of the RSDS header. Let’s dump a few bytes of the sample using the 99136 offset as a pivot:

yara_2.1_2

xxd output

The presence of RSDS string gives us the confidence to select the string dddd.pdb as the string related to the debug information.

Next we need to compute the hash of the .text section, that typically contains the executable code [2], for this task we will use hiddenillusion’s version of pescanner.py [3] using the sample name as argument:

yara_2.1_3

pescanner.py initial report

yara_2.1_4

pescanner.py report on sections, resources and imports on the PE file

pescanner.py outputs an extensive report about the PE header structure, on which it includes the list of sections along with the hash. Take note of the .text section MD5 hash (2a7865468f9de73a531f0ce00750ed17) as we will need to use it later when creating the YARA rule.

Also in the pescanner.py report we are informed that the .rsrc section as high entropy. This is a suspicious indicator for the presence of heavily obfuscated code. Please keep this in mind when creating the rule, as this info will help us answering the third item in the challenge. Lastly the report also features the list of imported symbols, in which we can see the presence of GetTickCount, a well known anti-debugging timing function [4]. This will be required to answer the fourth entry of the challenge. By the way, the report also mentions the file type, indicating we are in the presence of a PE32 file, which matches the sixth item of the challenge.

Lastly we need to get our hands on the XOR key used to encode the Rich signature, read more about the Rich signature here. You can check existence of this key in two ways: traditionally you would dump the first bytes of the sample, enough to cover all the DOS Header in the PE file, the Rich signature starts at file offset 0x80, and the XOR key will be located in the dword that follows the Rich ASCII string:

yara_2.1_5

Bear in mind that the x86 byte-order is little-endian [5], therefore you need to byte-swap the dword value, so the XOR key value is 0x887f83a7 or 2290058151 in decimal.

Now for the easy way. Remember when I have mentioned in the first post of this series that the YARA scan engine is modular and feature rich? This is because you can use YARA pretty much like pescanner.py, in order to obtain valuable information on the PE header structure. Let’s start by creating the YARA rule file named rule.yar with the following content:

 import “pe”

Next execute YARA as follows:

yara_2.1_6

strings command output

By using the –print-module-data argument YARA will output the report of the PE module, on which will include the rich_signature section along with the XOR key decimal value.

Ok, we now have gathered all the info required  to start creating the YARA rule and finish the challenge. In the part 3 of this series, we will cover the YARA rule creation process, featuring the information gathered from static analysis. Stay tuned!

References

  1. http://www.godevtool.com/Other/pdb.htm
  2. Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software, (page 22)
  3. https://github.com/hiddenillusion/AnalyzePE/blob/master/pescanner.py
  4. http://antukh.com/blog/2015/01/19/malware-techniques-cheat-sheet
  5. http://teaching.idallen.com/cst8281/10w/notes/110_byte_order_endian.html
Tagged , , ,

Malicious Office Document delivering Dyre

Phishing campaigns that distribute commodity malware are common and ongoing problem for end users and corporations. In this article would like give you an illustration about one email that was part of a phishing campaign that distributes a very successful banking trojan malware known as Dyre or Dyreza. This trojan is quite sophisticated. Among other things is capable of stealing all kinds of credentials from the victim’s computer. It can also redirect victim’s traffic to sites controlled by the threat actors using man-in-the-browser functionality. This allows interception and manipulation of traffic that is supposed to be delivered to legitimate sites. Dyre has remote access functionality that allows the threat actor to connect to websites trough the victim computer.

The distribution methods observed is the past are mainly done using phishing emails that include a malicious Microsoft Office document attached. It normally includes a Microsoft Word or Excel document with macros. The image below illustrates one of these samples where the sender falsely mimics a legitimate business transaction. This well-crafted email attempts to lure the recipient into opening the malicious attachment.

email-dyre

Noteworthy is that the threat actors behind the malware distribution operations go to great lengths ensuring the files are not detected as malicious  by the different security mechanism such as Anti-Spam, Anti-Virus and Sandboxes.  Is quite normal when  submitting this malicious Office documents to Virus Total to have less than 10% detection rate and being rated benign by Sandboxes.In this example as seen in the picture below, the user when opening the email attachment is presented with the information that the document is encrypted with RSA algorithm and the user needs to enable macros in order to see its content.

email-dyre2

The document is of course not encrypted and this is a social engineering technique to trick the users to run the malicious macro. Allowing macros to run causes the malware to be extracted and executed in the victim system.

So what makes this document different than the others recently seen? The difference is in the weaponization mechanism that was used. Traditionally, these malicious Office documents have encoded and obfuscated macros. These macros when executed will connect to a site and download the malicious executable. However, in this case the document contains an obfuscated macro but the executable is embedded within the document.  This reduces the steps needed for infection and might increase the infection rate.

What tools and techniques one could use to unravel what is inside the document?

For malicious document analysis, REMnux and the cheat sheet created by Lenny Zeltser are a fantastic companion. On REMnux v6 one of the new tools pre-installed is the python-oletools toolkit. This toolkit was created by Philippe Lagadec based on the work created by John William Davison on officeparser. The tool among other things can parse and extract VBA Macros from Microsoft Office documents. It supports a variety of Microsoft Office documents and it can be very handy for this type of malicious documents. We start with the usage of olemeta.py to view the document metadata. In this case we could see codepage 1252 which might indicate where the document was fabricated. Then we run oleid.py which gives us a very good overview about the capabilities inside the document. In this case we could observe that the document contain VBA macros and OLE embeded objects. These indicators definitely need a deeper look.

email-oletools

We start by looking at the VBA macro. The following picture illustrates the output given by olevba.py after analyzing the malicious Office document. It shows the actual macro and one could see that is obfuscated. In addition the tool gives its interpretation of the different functions used in the macro. Many of the functions used are considered suspicious.

 

email-olevba1

Based on this information we can see that the document contains a obfuscated VBA macro that presumably creates RTF files (300.rtf and 301.rtf) and extracts an executable called n1.exe.

To further dig into these artifacts we will use another great tool. Oledump created by Didier Stevens. This tool allows you to analyze OLE files. As stated in the manual page: “Many file formats are in fact OLE files, like Microsoft Office files, MSI files. Even the new Microsoft Office Open XML format uses OLE files for VBA macros”. Another powerful tool created by Didier that even supports decoders and plugins such as Yara rules. One of the Yara rules is based on the work made on OfficeMalscanner  by Frank Boldewin that can find shelcode, PE-files and other embedded streams inside Office documents.

To verify our suspicious we use oledump.py with a yara rule that detects the presence of PE files inside documents. The output confirms that an executable was found inside the document.

email-oledump1

We can now extract the executable. The stream that triggered the Yara rule is the stream 11. Once again using oledump.py we can extract the binary and redirected to a file for further analysis.

email-oledump2

With the executable extracted we can now start  using static and dynamic analysis techniques to determine the malware capabilities and extract IOCs that can be used to across our logs and network to find infected systems. This will be left for another blog post.

E-mail continues to be the weapon of choice for mass delivering malware. The tools and techniques used by attackers  continue to evolve and bypass all the security controls in place. From a defense perspective, the US-CERT put together excellent tips for detecting and preventing this type of malware and to avoid scams and phishing attempts applicable to home users and corporations.

 

MD5 of the files used in this exercise:
Evil.doc  : dd3cd493aa68f55d1df442873ad2b2e8
Evil.bin : 27079661fb498dcf18194f45a4171492

Tagged , , , ,

Memory Forensics with Volatility on REMnux v5 – Part 2

Before we continue our hands-on memory forensics exercise, will be good to refresh some concepts on how memory management works on Windows 32 bits operating systems running on x86 architecture. Windows operating systems have a complex and sophisticated memory management system. In particular the way the mapping works between physical and virtual addresses is a great piece of engineering.  On 32 bits x86 machines, virtual addresses are 32 bits long which means each process has 4GB of virtual addressable space. In a simplified manner the 4GB of addressable space is divided into two 2GB partitions. One for user mode processes – Low 2GB (0x00000000 through 0x7FFFFFFF) -and another for the Kernel mode – High 2GB (0x80000000 through 0xFFFFFFFF). This default configuration could be changed in the Windows boot configuration data store (BCD) to accommodate 3GB for user mode.  When a process is created and a region of virtual address space is allocated – trough VirtualAlloc API – the memory manager creates a VAD (Virtual Address Descriptor) for it. The VAD is essentially a data structure that contains the various elements of the process virtual addresses organized in a balanced tree. Elements such as the base address, access protection, the size of the region, name of mapped files and others are defined on each VAD entry.  The Microsoft Windows Internals book goes over in great detail about it.

Fortunately, Brendan Dolan-Gavitt, back in 2007 created a series of plugins for The Volatility Framework to query, walk and perform forensic examinations of the VAD tree. His original paper is available here.

That being said the next step in our quest to find the malicious code. We will use these great plugins to look at the memory ranges that are accessible by the suspicious process virtual address space. This will allows us to look for hidden and injected code. Worth to mention that a great book with excellent resources and exercises about this matter is the Malware Analyst Cookbook from Michael Ligh et. al. Another great resource that also goes into great detailed and was published last year is the Art of Memory Forensics also from Michael Ligh and the Volatility team.  Both books mention that among other things the VAD  might be useful to find malicious code hiding in another process. Essentially, by looking at the memory ranges that have an access protection marked as executable specially with the  PAGE_EXECUTE_READWRITE protection. This suggest that a particular memory range might be concealing executable code. This has been demonstrated by Michael Ligh and was translated into a Volatility plugin named Malfind. This plugin, among other things, detects code that might have been injected into a process address space by looking at its access protections.  For this particular case Malfind does not produce results for the explorer.exe (PID 1872). Might be due to the fact that permissions can be applied at a further granularity level than the VAD node level.

Our next hope is to start looking at the different VAD attributes and mapped files of the VAD tree for this suspicious process using the Vadinfo plugin. The Vadinfo plugin goes into great detail displaying info about the process memory and the output might be vast. For the sake of brevity only the first entry is displayed in the following picture.

memory-analysis-8

In this case the output of the Vadinfo plugin displays almost two hundred VAD nodes and its details. Because we are looking for specific IOCs that we found we want to further look into these memory regions. This is where the Vaddump plugin comes to great help. Vaddump is able to carve our each memory range specified on the VAD node and dump the data into a file.

memory-analysis-9

Then we can do some command line kung-fu to look into those files. We can determine what kind of files we have in the folder and also look for the indicators we found previously when performing dynamic  analysis.  In this case we could observe that the string “topic.php” appears in five different occasions using the default strings command. In addition it appears once when running strings with a different encoding criteria – and this exactly matches the found IOC.

memory-analysis-10

On a side note is interesting to see based on the file names the memory ranges that span the virtual address space from 0x00000000 to 0x7FFFFFFF. And you also could see that there is some groups of addresses more contiguous than others. This related on how the process memory layout looks like. The following image illustrates this layout on a x86 32bit Windows machine. This can be further visualized using VMMap tool from Mark Russinovich and available on the Sysinternals Suite.

memory-analysis-13

Back to our analysis, one first step is to look across dumped data from the Vaddump plugin see if there is any interesting executables files. We observed that the previous search found several Windows PE32 files. Running again our loop to look for Windows PE32 files we see that it contains 81 executables. However, we also know the majority of them are DLL’s. Lets scan trough the files to determine which ones are not DLL’s. This gives good results because only two files matched our search.

memory-analysis-14

Based on the two previous results we run again the Vadinfo plugin and looked deeper into those specific memory regions. The first memory region which starts at 0x01000000 contains normal attributes and a mapped file that matches the process name. The second memory regions which starts at 0x00090000 is  more interesting. Specially the node has the tag VadS. According to Michael Ligh and the Volatility Team this type of memory regions are strong candidates to have injected code.

memory-analysis-15

Next step? run strings into this memory region to see what comes up and don’t forget to run strings with a different encoding criteria.

As you could see below, on the left side, we have a small set of the strings found on the memory region. This information gives great insight into the malware capabilities, gives further hints that can be used to identify the malware on other systems and can further help our analysis. On the right side you have a subset of the strings found using a different encoding criteria. This goes together with previous strings and  gives you even further insight into what the malware does, its capabilities and the worst it can do.

memory-analysis-16

Security incident handlers might be satisfied with dynamic analysis performed on the malware because they can extract IOCs that might be used to identify the malware across the  defense systems and aid the  incident response actions. However, trough this exercise we gained intelligence about what the malware does by performing memory forensics. In addition, as malware evolves and avoids footprints in the hard drive the memory analysis will be needed to further understand the malware capabilities. This allows us to augment prevention controls and improve detection and response actions. Nonetheless, one might want to even go further  deep and not only determine what the malware does but also how.  That being said the next logical step would be to get the unpacked malware from memory and perform static analysis on the sample. Once again, Volatility Framework offers the capability of extracting PE files from memory. Then they can be loaded into a dissasembler and/or a debugger. To extract these files from memory there are a variety of plugins. We will use the dlldump plugin because it allows you to extract PE files from hidden or injected memory regions.

There are some caveats with the PE extraction plugins and there is a new sport around malware writers on how to manipulate PE files in memory or scramble the different sections so the malware analysts have a tougher time getting them. The Art of Memory Forensics and the The Rootkit Arsenal book goes into great detail into those aspects.

memory-analysis-17

With the malware sample unpacked and extracted from memory we can now go in and perform static analysis … this is left for another post.

That’s it – During the first part we defined the  steps needed to acquire the RAM data for a VMware virtual machine before and after infection.  Then, we started our quest of malware hunting in memory. We did this by benchmarking both memory captures against each other and by applying the intelligence gained during the dynamic analysis. On the second part we went deeper into the fascinating world of memory forensics. We manage to find the injected malware in memory that was hidden in a VAD entry of the explorer.exe process address space. We got great intel about the malware capabilities by looking at strings on memory regions and finally we extracted the hidden PE file. All this was performed  using the incredible Volatility Framework that does a superb job removing all the complex memory details from the investigation.

 

Malware sample used in this exercise:

remnux@remnux:~/Desktop$ pehash torrentlocker.exe
file:             torrentlocker.exe
md5:              4cbeb9cd18604fd16d1fa4c4350d8ab0
sha1:             eaa4b62ef92e9413477728325be004a5a43d4c83
ssdeep:           6144:NZ3RRJKEj8Emqh9ijN9MMd30zW73KQgpc837NBH:LhRzwEm7jEjz+/g2o7fH

 

References:

Ligh, M. H., Case, A., Levy, J., & Walters, A. (2014). Art of memory forensics: Detecting malware and threats in Windows, Linux, and Mac memory
Ligh, M. H. (2011). Malware analyst’s cookbook and DVD: Tools and techniques for fighting malicious code.
Russinovich, M. E., Solomon, D. A., & Ionescu, A. (2012). Windows internals: Part 1
Russinovich, M. E., Solomon, D. A., & Ionescu, A. (2012). Windows internals: Part 2
Blunden, B. (2009). The rootkit arsenal: Escape and evasion in the dark corners of the system.
Zeltser, Lenny (2013) SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques

Tagged , , , , , , ,

Static Malware Analysis – Find Malicious Intent

Any tool that you execute that analyzes a binary without running, it performs some kind of static analysis. Basic static analysis consist on producing information about a particular binary without running the code. On Win32 PE files there are a few steps that one can perform in order to find information about the malicious intent of a file in a very short amount of time.

First step, profile the file. Essentially this can be done by calculating the cryptographic hash of the file. This will compute a fingerprint that uniquely identifies the file. This unique digest can be used easily across different online tools like Virus Total, Malwr or Metascan-online to confirm malicious intent. Also, a file similarity index should be calculated using a process called fuzzy hashing or Context Triggered Piecewise Hashing (CTPH) – the original paper about this technique was written by Jesse Kornblum here -.   This fuzzy hashing technique is helpful to identify malicious files from the same family.

Second step is to run the strings command against the malicious binary.  The strings command will display printable ASCII and UNICODE strings that are embedded within the file. This can disclose information about the binary functionality.

Third step is to use some tool to analyze the Win32 PE headers. Extracting information from the PE headers can reveal information about API calls that are imported and exported by the program. It can also disclose date and time of the compilation and other embedded data of interest. Common Windows PE files extensions are .exe, .dll, .sys, .drv,.cpl, .ocx. When you compile a binary you can perform static linking or dynamic linking. Static linking means all the helper functions needed to execute the binary are inside the binary. On the other hand, dynamic linking is when the executable runs and resolve pointers to the helper functions at run time. In other words the binary contain library dependencies and these dependencies can be looked at in order to infer functionality trough static analysis. These dependencies are included in the Import Address Table (IAT) section of the PE structure so the Windows loader (ntdll.dll) can know which dll’s and functions are needed for the binary to properly run.

Let’s see how we can perform these steps in REMnux v5 using the PE file analysis toolkit.  The toolbox was created by the Brazilian security researcher Fernando Mercês and others. Is a very feature rich and usefully set of tools to analysis windows binaries. The toolkit runs on Windows, Linux and Mac. Some of the tools include:

pev70

To perform the first step, start by hashing the malicious binary in order to uniquely identify it. Pehash is a small tool that will produce the hashing of the file using different hashing algorithms including CTPH using ssdeep.

remnux@remnux:~/Desktop$ pehash torrentlocker.exe
file:             torrentlocker.exe
md5:              4cbeb9cd18604fd16d1fa4c4350d8ab0
sha1:             eaa4b62ef92e9413477728325be004a5a43d4c83
ssdeep:           6144:NZ3RRJKEj8Emqh9ijN9MMd30zW73KQgpc837NBH:LhRzwEm7jEjz+/g2o7fH

Then we can run Pestr to search for strings in the entire file.

remnux@remnux:~/Desktop$ pestr torrentlocker.exe

Another example would be to run the command “strings -a torrentlocker.exe”. The -a suffix will force the strings commands to scan the entire file and not only the data section.

static-analysis-strings

From the output we can see the string MSVCRT.dll. This is meaningful because its the name of a Windows library i.e., the Microsoft Visual C Run-Time Library. We can conclude the program has been written with Visual C++. Another interesting string that corroborates this conclusion is the program database (PDB) file which holds debugging and project state information for Visual Studio projects. This could be used to attribution to some extent. Other information that we can see is the _controlfp or _oneexit strings which are names of Windows functions exported from the MSVCRT.dll. In this case the strings output doesn’t produce any more meaningful results because the malware is obfuscated or encrypted or packed. This means we might want to run the strings command after the malware has been unpacked – this will produce much more interesting results such as name of functions that interact with network, registry, I/O, etc.

The final step is to use Readpe tool to identify the dll and functions the binary needs to perform its function. The functions used can be shown in the form of both import by ordinal and import by name as you can see in the below picture. Normally the binaries contain a large number of imported functions. If the number is low we can conclude the binary is somehow packed, obfuscated or encrypted.

static-analysis-imports

In this case the malicious file does not show much dependencies but, for example, it invokes GetProcAddress and LoadLibrary windows functions from the Kernel32.dll library. These two functions are very popular among packed malware because they allow the binary to load and get access to other Windows functions. In addition it invokes Windows functions from the MFC42.dll using its ordinal number. This can slow down the analyst but you can use tools like Dependency Walker to find out what the ordinal number corresponds to.

For Microsoft Windows enthusiasts, the same steps can be performed in order to achieve the same results. The first step on Windows can be performed using the Microsoft File Checksum Integrity Verifier. The tool can be downloaded here and below a sample usage

static-analysis-fciv

The second step can be performed on a Windows machine using the strings.exe utility that comes with SysInternals Suite.

static-analysis-sysinternals

And finally, the third step. One of the tools to perform this step is the Microsoft DUMPBIN utility that comes with the 32-bit version of Microsoft Visual C++. It is a command line tool to dump the different properties of a Win32 PE file for early evaluation of the file intentions. Below shows the DUMPBIN output of running the /dependents followed by the /Imports section.

static-analysis-dumpbin

The result is the same as the Readpe utility. The previous output show all dependencies from the malicious file and its Import Address Table.

Another alternative is to run the powerful and feature rich PEStudio tool created by Marc Ochsenmeie. The tool is free for non-commercial use and can perform static analysis on 32-bit and 64 bit binaries. The tool is actively maintained and contains great functionality that is worth to look. For example it can lookup Virustotal and query the different Anti-Virus engines using the file MD5. Other great feature is the Indicators. The Indicators show anomalies of the application being analysed e.g., the file not being digitally signed,

static-analysis-pestudio

We have showed the three basic steps that are used to perform an initial assessment in order to find the program malicious intentions without running its code. In this case the malware is packed and obfuscated so we will need to dig deeper but I will leave that for another blog post.

Other tools and techniques exist and a great summary of the available tools for Windows is written by Lenny Zeltser here. Besides this, for the brave who want to go deeper, one of the best resources available to learn about Windows PE internals is the course “The Life of Binaries” created by Xeno Kovah and available on opensecuritytraining.info. Other material of interest is the Microsoft Windows PE and COFF specification – available here . The PE Reference diagram created by Ero Carrera Ventura and the Black Hat 09 presentation “Undocumented PECOFF” from Mario Vuksan and Tomislav Pericin are also great material. Microsoft also maintains the Windows API reference site that can be helpful when looking at the Import Address Table.

 

Tagged , , , , , , ,

Dynamic Malware Analysis with REMnux v5 – Part 2

procdotDuring part one we created the environment to perform dynamic malware analysis with REMnux toolkit. Then the malware was executed and  all the interactions with the network were observed and captured. Essentially,  the malware was executed in a disposable virtual machine and all the traffic – including SSL – was intercepted.

Now, to further acquire information that would help us answering the questions made during part 1, it would be relevant to observe what happens at the system level.

Below recipe involves running Process Monitor from Sysinternals. Process Monitor is an advanced monitoring tool for Windows that shows real-time file system, Registry and process/thread activity, In parallel we run a packet capture with Wireshark on the disposable machine or we run tcpdump on REMnux. These tools will be started before malware executed and stopped moments after the infection. Then we will import this data – pcap and procmon – into ProcDOT. ProcDOT is a free tool created by Christian Wojne from CERT.at. It runs on Windows or Linux and is installed on REMNux This tool has the capability to parse the Procmon data. Then correlates it with the pcap  and transforms the data into a picture. This enables malware visual analysis. This method can be of great help for a effective and efficient dynamic malware analysis. The steps using part 1 model are:

 

  • Restore the VMware snapshot.
  • Start Process Monitor (need to be installed beforehand) and configure Columns fields to be compatible with ProcDOT.
  • Start Wireshark or tcpdump on REMnux and start capturing traffic.
  • Copy the malware to the virtual machine and run the malware.
  • Stop the wireshark capture and export the traffic into PCAP.
  • Stop the Process Monitor capture and export the results into CSV.
  • Move the PCAP and CSV file to REMnux.
  • Power off the infected machine.
  • Execute ProcDOT and import the procmon and pcap data.
  • Select the process and render the analysis.
  • Run trough the animation to understand the timing aspects.

The picture below demonstrate the result (for the sake of size elements such of registry key and paths have been removed):

procdot-torrentlocker

 

This visualization technique is very helpful in order to determine in a very quick way what the malware does. The graph produced by ProcDOT is interactive and you can play an animation of it in order to determine in which sequence the events occurred. In this case several relevant events happened. You can easily observe that:

  • The executable torrentlocker.exe starts execution and invokes a new thread id 1924
  • The thread id 1924 creates a new suspicious file named 01000000
  • Thread id 1924 creates a new process named “explorer.exe” with pid 1928
  • This new process creates several actions including
  • A suspicious executable file is created under c:\windows named izigajev.exe
  • A new AutoStart registry key is created invoking the just created executable file
  • The file Administrator.wab is read and data written to it.
  • A new process name vssadmin.exe is invoked

By looking at this sequence of events we can conclude that the malware performs process injection by injecting its malicious code into a benign process. Creates a copy of itself and drops it into Windows folder. Maintains persistence by adding the dropped executable into the auto start registry key and creates several suspicious files.

A very simple, fast and effective method that can speed up the malware analysis. Using this visual analytic technique and the one described on part one you can gather indicators that could now be used to identify the malware in motion (network) or at rest (file system/registry) across your network. Further they could be used to augment existing security controls and find additional infected systems.

Tagged , , , ,

Dynamic Malware Analysis with REMnux v5 – Part 1

REMnux-logo1 [Part 1 illustrates a series of very useful tools and techniques used for dynamic analysis. Security incident handlers and malware analysts can apply this knowledge to analyze a malware sample in a quick fashion using the multi-purpose REMnux v5. This way you can extract IOCs that might be used to identify the malware across your defense systems and aid your incident response actions. ~Luis]

Malware analysis is a interesting topic nowadays. It requires a fairly broad of knowledge and practical experience across different subjects. My background is in systems and infrastructure which means I am more confident with the dynamic analysis methodology than the static analysis one. Some of the readers have similar background. However, if you are willing to roll your sleeves and spend time in order to learn and be proficient with the different tools and techniques static analysis can done – hopefully will write about basic static analysis in a near future. Additionally is intellectually challenging.

One of the goals of performing malware analysis is to determine the malware actions and get insight into its behavior and inner workings by analyzing its code. By doing this we can find answers to pertinent questions such as:

  • What are the malware capabilities?
  • What is the worst it can do?
  • Which indicators of compromise (IOC) could be used identify this malware in motion (network), at rest (file system) or in use (memory)?  – These IOCs can then be used across our defense systems and in our incident response actions.

The process consists of executing the malware specimen in a safe, secure, isolated and controlled environment.The dynamic analysis methodology allows you to determine the malware behavior and how it interacts with the network, file system, registry and others. In this post I go trough a technique to determine its behaviour at the network level. In this way we can start answering the previous questions.

How?

A simple and effective manner to execute malware analysis in an safe, isolated and controlled fashion would be to use a second hand laptop with enough RAM and fast I/O like a SSD drive. Then on top of it a virtualization software. My personal preferences goes VMware Workstation due to the wide range of operating systems supported, and affordable price. Essentially two virtual machines. One machine running the resourceful and multi-purpose REMnux v5.

For those who don’t know, REMnux is a fantastic toolkit based on Ubuntu created by Lenny Zeltser that provides an enormous amount of tools preinstalled to perform static and dynamic malware analysis. The tools installed have the ability to analyze Windows and Linux malware variants. You can download it from either as a Live CD or a preconfigured virtual appliance for Vmware or VirtualBox from here.

The second machine will be running Windows XP or 7 32 Bits. That will get you started. Then configure the environment and install the required tools on the disposable – relying heavily on VMware snapshots – Windows machine.

In the first technique, I want REMnux to act as gateway, dns server and proxy – including SSL – . This will allow us to intercept all network communications originating from the infected machine. The following picture illustrates the methodology for dynamic analysis.

malware-analysis-framework

The illustration should be self-explanatory. In this manner, any DNS request made by the infected machine will be redirected to the REMnux. If the malware is not using DNS but using hardcoded IP addresses, the requests will go through the default gateway which is pointing to the REMnux. The REMnux by its turn will have iptables configured to redirect all received traffic either on port TCP 80 or 443 to TCP port 8080. On this port – TCP 8080 – Burp Suite is listening as a transparent Proxy. In this way you will have visibility and control into all network communications initiated by the infected machine.

On REMnux the steps to perform this configuration are:

  1. Define the Network adapter settings on VMware Workstation to be in a custom virtual networkg., VMnet5.
  2. Define a static IP
  3. Start FakeDNS to answer any DNS requests.
  4. Start HTTP daemon to answer HTTP requests.
  5. Redirect HTTP and HTTPS traffic to port TCP 8080 by configuring redirect rules via iptables.
  6. Intercept HTTP requests using BURP Suite in Invisible mode on port 8080
  7. Optionally you run tcpdump to capture all the networking traffic (allows you to create IDS signatures).

Te necessary commands to perform steps 3 to 6 are:

remnux@remnux:~$ sudo fakedns 192.168.1.23
dom.query. 60 IN A 192.168.1.23

Open another shell:

remnux@remnux:~$ httpd start
Starting web server: thttpd.
remnux@remnux:~$ sudo sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
remnux@remnux:~$ sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 8080
remnux@remnux:~$ sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -j REDIRECT --to-port 8080
remnux@remnux:~$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination        
REDIRECT   tcp  --  anywhere             anywhere            tcp dpt:https redir ports 
REDIRECT   tcp  --  anywhere             anywhere            tcp dpt:www redir ports 
REDIRECT   tcp  --  anywhere             anywhere            tcp dpt:www redir ports 
REDIRECT   tcp  --  anywhere             anywhere            tcp dpt:https redir ports 

remnux@remnux:~$ burpsuite
[1] 8912

malware-analysis-framework-burp

Then, on Windows the initial steps are:

  1. Define the Network adapter settings in the VMware to be in the same custom virtual network as the REMnux.
  2. Configure IP address in the same range as the REMnux
  3. Configure the DNS server to point to the REMnux
  4. Define the default GW as being the REMnux
  5. Test the network settings
  6. Create a VMware snapshot
  7. Move the malware sample to the machine
  8. Start necessary tools (if needed)
  9. Execute the malware sample

After having the machines ready you can move your malware sample to the disposable Windows machine and execute it. In this case I executed a malware variant of Torrentlocker. The result is shown in the following picture:

malware-analysis-framework-result1

  1. There is a query from the Windows machine to the DNS server asking the A record of the address allwayshappy.ru
  2. FakeDNS answers back with the IP of the REMnux
  3. Windows machines establishes a SSL connection to the IP REMnux on port 443 which is redirected trough iptables to port 8080
  4. The traffic is Intercept by Burp Suite and can be seen and manipulated in clear.
  5. The request can be forwarded to localhost on port 80 to fake an answer.

Following the first request, this malware performs a second request, potentially sending some more data. Unfortunately the request is encrypted – that would be a good challenge for static analysis!

malware-analysis-framework-burp2

As you could see in a quick manner you could determine that the malware tries to reach out to a C&C. This type of knowledge can then be used to find other compromised systems and start your incident response actions.

You might see this as a time-intensive process that does not scale – think a company that needs to analyse hundreds of samples per month, week or per day – solution is automation. Several automated malware analysis system have appeared over the last years such as CWSandbox, Norman Sandbox, Anubis, Cuckoo and others. Essentially these systems load the malicious binary into a virtual machine and execute it. During execution all the interactions with I/O, memory, registry and network are tracked and then a report is produced. This greatly reduces the costs of malware analysis. However, is good to understand how to do manual analysis because many times the malware samples only trigger on specific conditions or bypasses the sandboxes. In addition you start to be proficient on different tools and techniques!

 

References:
SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques

Tagged , , , , , , , ,