Any tool that you execute that analyzes a binary without running, it performs some kind of static analysis. Basic static analysis consist on producing information about a particular binary without running the code. On Win32 PE files there are a few steps that one can perform in order to find information about the malicious intent of a file in a very short amount of time.
First step, profile the file. Essentially this can be done by calculating the cryptographic hash of the file. This will compute a fingerprint that uniquely identifies the file. This unique digest can be used easily across different online tools like Virus Total, Malwr or Metascan-online to confirm malicious intent. Also, a file similarity index should be calculated using a process called fuzzy hashing or Context Triggered Piecewise Hashing (CTPH) – the original paper about this technique was written by Jesse Kornblum here -. This fuzzy hashing technique is helpful to identify malicious files from the same family.
Second step is to run the strings command against the malicious binary. The strings command will display printable ASCII and UNICODE strings that are embedded within the file. This can disclose information about the binary functionality.
Third step is to use some tool to analyze the Win32 PE headers. Extracting information from the PE headers can reveal information about API calls that are imported and exported by the program. It can also disclose date and time of the compilation and other embedded data of interest. Common Windows PE files extensions are .exe, .dll, .sys, .drv,.cpl, .ocx. When you compile a binary you can perform static linking or dynamic linking. Static linking means all the helper functions needed to execute the binary are inside the binary. On the other hand, dynamic linking is when the executable runs and resolve pointers to the helper functions at run time. In other words the binary contain library dependencies and these dependencies can be looked at in order to infer functionality trough static analysis. These dependencies are included in the Import Address Table (IAT) section of the PE structure so the Windows loader (ntdll.dll) can know which dll’s and functions are needed for the binary to properly run.
Let’s see how we can perform these steps in REMnux v5 using the PE file analysis toolkit. The toolbox was created by the Brazilian security researcher Fernando Mercês and others. Is a very feature rich and usefully set of tools to analysis windows binaries. The toolkit runs on Windows, Linux and Mac. Some of the tools include:
To perform the first step, start by hashing the malicious binary in order to uniquely identify it. Pehash is a small tool that will produce the hashing of the file using different hashing algorithms including CTPH using ssdeep.
remnux@remnux:~/Desktop$ pehash torrentlocker.exe file: torrentlocker.exe md5: 4cbeb9cd18604fd16d1fa4c4350d8ab0 sha1: eaa4b62ef92e9413477728325be004a5a43d4c83 ssdeep: 6144:NZ3RRJKEj8Emqh9ijN9MMd30zW73KQgpc837NBH:LhRzwEm7jEjz+/g2o7fH
Then we can run Pestr to search for strings in the entire file.
remnux@remnux:~/Desktop$ pestr torrentlocker.exe
Another example would be to run the command “strings -a torrentlocker.exe”. The -a suffix will force the strings commands to scan the entire file and not only the data section.
From the output we can see the string MSVCRT.dll. This is meaningful because its the name of a Windows library i.e., the Microsoft Visual C Run-Time Library. We can conclude the program has been written with Visual C++. Another interesting string that corroborates this conclusion is the program database (PDB) file which holds debugging and project state information for Visual Studio projects. This could be used to attribution to some extent. Other information that we can see is the _controlfp or _oneexit strings which are names of Windows functions exported from the MSVCRT.dll. In this case the strings output doesn’t produce any more meaningful results because the malware is obfuscated or encrypted or packed. This means we might want to run the strings command after the malware has been unpacked – this will produce much more interesting results such as name of functions that interact with network, registry, I/O, etc.
The final step is to use Readpe tool to identify the dll and functions the binary needs to perform its function. The functions used can be shown in the form of both import by ordinal and import by name as you can see in the below picture. Normally the binaries contain a large number of imported functions. If the number is low we can conclude the binary is somehow packed, obfuscated or encrypted.
In this case the malicious file does not show much dependencies but, for example, it invokes GetProcAddress and LoadLibrary windows functions from the Kernel32.dll library. These two functions are very popular among packed malware because they allow the binary to load and get access to other Windows functions. In addition it invokes Windows functions from the MFC42.dll using its ordinal number. This can slow down the analyst but you can use tools like Dependency Walker to find out what the ordinal number corresponds to.
For Microsoft Windows enthusiasts, the same steps can be performed in order to achieve the same results. The first step on Windows can be performed using the Microsoft File Checksum Integrity Verifier. The tool can be downloaded here and below a sample usage
The second step can be performed on a Windows machine using the strings.exe utility that comes with SysInternals Suite.
And finally, the third step. One of the tools to perform this step is the Microsoft DUMPBIN utility that comes with the 32-bit version of Microsoft Visual C++. It is a command line tool to dump the different properties of a Win32 PE file for early evaluation of the file intentions. Below shows the DUMPBIN output of running the /dependents followed by the /Imports section.
The result is the same as the Readpe utility. The previous output show all dependencies from the malicious file and its Import Address Table.
Another alternative is to run the powerful and feature rich PEStudio tool created by Marc Ochsenmeie. The tool is free for non-commercial use and can perform static analysis on 32-bit and 64 bit binaries. The tool is actively maintained and contains great functionality that is worth to look. For example it can lookup Virustotal and query the different Anti-Virus engines using the file MD5. Other great feature is the Indicators. The Indicators show anomalies of the application being analysed e.g., the file not being digitally signed,
We have showed the three basic steps that are used to perform an initial assessment in order to find the program malicious intentions without running its code. In this case the malware is packed and obfuscated so we will need to dig deeper but I will leave that for another blog post.
Other tools and techniques exist and a great summary of the available tools for Windows is written by Lenny Zeltser here. Besides this, for the brave who want to go deeper, one of the best resources available to learn about Windows PE internals is the course “The Life of Binaries” created by Xeno Kovah and available on opensecuritytraining.info. Other material of interest is the Microsoft Windows PE and COFF specification – available here . The PE Reference diagram created by Ero Carrera Ventura and the Black Hat 09 presentation “Undocumented PECOFF” from Mario Vuksan and Tomislav Pericin are also great material. Microsoft also maintains the Windows API reference site that can be helpful when looking at the Import Address Table.