JPEXS | Count Upon Security

Over the course of the last two articles (part 1 & part 2), I analyzed a recent drive-by-download campaign that was delivering the RIG Exploit Kit . In this article, I will complete the analysis by looking at the shellcode that is executed when the exploit code is successful. As mentioned in the previous articles, each one of the two exploits has shellcode that is used to run malicious code in the victims system. The shellcode objective is the same across the exploits: Download, decrypt and execute the malware.

In part 1, when analyzing the JavaScript that was extracted from the RIG landing page we see that at the end there was a function that contained a hex string of 2505 bytes. This is the shellcode for the exploit CVE-2013-2551. In a similar way but to exploit CVE-2015-5122 on the second stage Flash file, inside the DefineBinaryData tag 3, one of the decrypted strings contained a hex string of 2828 bytes.

So, how can we analyze this shellcode and determine what it does?

You can copy the shellcode and create a skeletal executable that can then be analyzed using a debugger or a dissassembler. First, the shellcode needs to be converted into hex notation (\x). This can be done by coping the shellcode string into a file and then running the following Perl one liner “$cat shellcode | perl -pe ‘s/(..)/\\x$1/g’ >shellcode.hex”. Then generate the skeletal shellcode executable with shellcode2exe.py script written by Mario Villa and later tweaked by Anand Sastry. The command is “$shellcode2exe.py –s shellcode shellcode.exe”. The result is a windows executable for the x86 platform that can be loaded into a debugger. Another way to convert shellcode is to use the converter tool from http://www.kahusecurity.com

Next step is to load the generated executable into OllyDbg. Stepping through the code one can see that the shellcode contains a deobfuscation routine. In this case, the shellcode author is using a XOR operation with key 0x84. After looping through the routine, the decoded shellcode shows a one liner command line.

After completing the XOR de-obfuscation routine the shellcode has to be able to dynamically resolve the Windows API’s in order to make the necessary system calls on the environment where is being executed. To make system calls the shellcode needs to know the memory address of the DLL that exports the required function. Popular API calls among shellcode writers are LoadLibraryandGetProcAddress. These are common functions that are used frequently because they are available in the Kernel32.dll which is almost certainly loaded into every Windows operating system. The author can then get the address of any user mode API call made.

Therefore, the first step of the shellcode is to locate the base address of the memory image of Kernel32.dll. It then needs to scan its export table to locate the address of the functions needed.

How does the shellcode locate the Kernel32.dll? On 32-bit systems, the malware authors use a well-known technique that takes advantage of a structure that resides in memory and is available for all processes. The Process Environment Block (PEB). This structure among other things contains linked lists with information about the DLLs that have been loaded into memory. How do we access this structure? A pointer exists to the PEB that resides insider another structure known as the Threat Information Block (TIB) which is always located at the FS segment register and can be identified as FS:[0x30](Zeltser, L.). Given the memory address of the PEB the shellcode author can then browse through the different PEB linked lists such as the InLoadOrderModuleList which contains the list of DLL’s that have been loaded by the process in load order. The third element of this list corresponds to the Kernel32.dll. The code can then retrieve the base address of the DLL. This technique was pioneered by one of the members of the well-known and prominent virus and worm coder group 29A and written in volume 6 of their e-zine in 2002. The figure below shows a snippet of the shellcode that contains the different sequence of assembly instructions in order for the code to find the Kernel32.dll.

The next step is to retrieve the address of the required function. This can be obtained by navigating through the Export Directory Table of the DLL. In order to find the right API there is a comparison made by the shellcode against a string. When it matches, it fetches its location and proceeds. This technique was pioneered and is well described in the paper “Win32 Assembly Components” written in 2002 by The Last Stage of Delirium Research Group (LSD). Finally, the code invokes the desired API. In this case, the shellcode uses the CreateProcessA API where it will spawn a new process that will carry out the command line specified in the command line string.

This command will launch a new instance of the Windows command interpreter, navigate to the users %tmp% folder and then redirect a set of JavaScript commands to a file. Finally it will invoke Windows Script Host and launch this JavaScript file with two parameters. One is the decryption key and the other is the URL from where to fetch the malicious payload. Essentially this shellcode is a downloader. The full command is shown in the figure below.

If the exploits are sucessfull and the shellcode is executed then the payload is downloaded. In this case the payload is variant of Locy ransomware and a user in Windows 7 box would see the User Account Control dialog box popping up asking to run it.

That’s it! We now have a better understanding how this variant of the RIG Exploit Kit works and what it does and how. All stages of the RIG Exploit Kit enforce different protection mechanisms that slow down analysis, prevent code reuse and evade detection. It begins with multiple layers of obfuscated JavaScript using junk code and string encoding that hides the code logic and launches a browser exploit. Then it goes further by having multiple layers of encrypted Flash files with obfuscated ActionScript. The ActionScript is then responsible to invoke an exploit with encoded shellcode that downloads encrypted payload. In addition, the modular backend framework allows the threat actors to use different distribution mechanisms to reach victims globally. Based on this modular backend different filtering rules are enforced and different payloads can be delivered based on the victim Geolocation, browser and operating system. This complexity makes these threats a very interesting case study and difficult to defend against. Against these capable and dynamic threats, no single solution is enough. The best strategy for defending against this type of attacks is to understand them and to use a defense in depth strategy – multiple security controls at different layers.

References:

SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques
Neutrino Exploit Kit Analysis and Threat Indicators

Continuing with the analysis of the RIG exploit kit, let’s start where we left off and understand the part that contains the malicious Adobe Flash file. We saw, in the last post, that the RIG exploit kit landing page contains heavily obfuscated and encoded JavaScript. One of the things the JavaScript code does is verifying if the browser is vulnerable to CVE-2013-2551. If it is, it will launch an exploit followed by shellcode, as we saw in the last post. If the browser is not vulnerable, it continues and the browser is instructed to download a malicious Flash file. The HTTP request made to fetch the Flash file is made to the domain add.alislameyah.org. As you could see in the figure below, the HTTP answer is of content type x-shockwave-flash and the data downloaded starts with CWS (characters ‘C’,’W’,’S’ or bytes 0x43, 0x57, 0x53). This is the signature for a compressed Flash file.

Next step of our analysis? Analyze this Flash file. Before we start let’s go over some overview about Flash. There are two main reasons why Flash is an attractive target for malware authors. One is because of its presence in every modern endpoint and available across different browsers and content displayers. The other is due to its features and capabilities.

Let’s go over some of the features. Adobe Flash supports the scripting language known as ActionScript. The ActionScript is interpreted by the Adobe ActionScript Virtual Machine (AVM). Current Flash versions support two different versions of the ActionScript scripting language. The Action Script (AS2) and the ActionScript 3 (AS3) that are interpreted by different AVM’s. The AS3 appeared in 2006 with Adobe Flash player 9 and uses AVM2. The creation of a Flash file consists in compiling ActionScript code into byte code and then packaging that byte code into a SWF container. The combination of the complex SWF file format and the powerful AS3 makes Adobe Flash an attractive attack surface. For example, SWF files contain containers called tag’s that could be used to store ActionScript code or data. This is an ideal place for exploit writers and malware authors to conceal their intentions and to use it as vehicle for launching attacks against client side vulnerabilities. Furthermore, both AS2 and AS3 have the capability to load SWF embedded files at runtime that are stored inside tags using the loadMovie and Loader class respectively. AS3 even goes further by allowing referencing objects from one SWF to another SWF. As stated by Wressnegger et al., in the paper “Analyzing and Detecting Flash-based Malware using Lightweight Multi-Path Exploration ” this allows sophisticated capabilities that can leverage encrypted payloads, polymorphism and runtime packers .

Now, let’s go over the analysis and dissection of the Flash file. This is achieved using a combination of dynamic and static analysis. First, we look at the file capabilities and functionality by looking at its metadata. The command line tool Exiftool created by Phill Harvey can display the metadata included in the analyzed file. In this case, it shows that it takes advantage of the Action Script 3.0 functionality. Information that is more comprehensive is available with the usage of the swfdump.exe tool that is part of the Adobe Flex SDK, which displays the different components of the Flash file. The output of swfdump displays that the SWF file contains the DoABC and DefineBinaryData tags. This suggests the usage of ActionScript 3.0 and binary data containing other elements that might contain malicious code executed at runtime.

Second, we will go deeper and dissect the SWF file. Open source tools to dissect SWF files exist such as Flare and Flasm written by Igor Kogan. Regrettably, they do not support ActionScript 3. Another option is the Adobe SWF Investigator. This tool was created by Peleus Uhley and released as open source by Adobe Labs. The tool can analyze and disassemble ActionScript 2 (AS2), ActionScript 3 (AS3) SWFs and include many other features. Unfortunately, sometimes the tool is unable to parse the SWF file in case has been packed using commercial tools like secureSWF and DoSWF.

One good alternative is to use JPEXS Flash File Decompiler (FFDec). FFDec is a powerful, feature rich and open source flash decompiler built in Java and originally written by Jindra Petřík. One key feature of FFDec is that it includes an Action Script debugger that can be used to add breakpoints to allow you to step into or over the code. Another feature is that it shows the decompiled ActionScript and its respective p-code.

One popular tool among Flash malware writers is doSWF. doSWF is a commercial product used to protect the intellectual property of different businesses that use Adobe Flash technology and want to prevent others copying it. Malware authors take advantage of this and use it for their own purposes. This tool can enforce different protections to the code level in order to defeat the decompiler. In addition to the different protections done to the code logic, doSWF can perform literal strings encryption using RC4 or AES. Also, it can be used to wrap an encrypted SWF inside another SWF file using the encrypted loader function. The decryption occurs at runtime and the decrypted file is loaded into memory.

Opening the SWF file using FFDec and observing its structure using Action Script you can see the different strings referencing doSWF which is an indication that the file has been obfuscated using doSWF. FFDec has a P-Code deobfuscation feature that can restore the control flow, remove traps and remove dead code. In addition, there is a plugin that can help rename invalid identifiers. The figure below shows a snippet of the ActionScript code after it has been deobfuscated by FFDec.

As seen in other Exploit Kits such as Angler and Neutrino, the first flash file is only used as carrier and malicious code such as exploit code or more Flash files are encrypted and obfuscated inside this first stage flash file.. The goal here is to perform static analysis of the Action Script code and determine what is happening behind the scenes. Normally, the DefineBinaryData contains further Flash files or exploit code but you need to understand the code and get the encryption keys in order to extract the data. Because my strengths are not in programming I tried to overcome this step before rolling up my sleeves and spending hours trying to understand the Action Script code like I did for Neutrino with the help of some friends. One way to carve the data is to use the Action Script debugger available in FFDec. Essentially, setting a breakpoint in the LoadBytes() method. Then running the Flash file and then when the breakpoint is triggered, use the FFDec Search SWF in memory plugin in order to find SWF files inside the FFDec process memory address space. But for this sample I used SULO.

During Black Hat USA 2014, Timo Hirvonen presented a novel tool to perform dynamic analysis of malicious Flash files. He released an open source tool named SULO. This tool uses the Intel Pin framework to perform binary instrumentation in order to analyze Flash files dynamically. This method enables automated unpacking of embedded Flash files that are either obfuscated or encrypted using commercial tools like secureSWF and DoSWF. The code is available for download on F-Secure GitHub repository (https://github.com/F-Secure/Sulo) and it should be compiled with Visual Studio 2010. The compilation process creates a .DLL file that can be used in conjunction with Intel Pin Kit for Visual Studio 2010. There are however limitations in the versions of Adobe Flash Player supported by SULO. At the time of writing only Flash versions 10.3.181.23 and 11.1.102.62 are supported. Nonetheless, one can use SULO with the aim to extract the packed Flash file in a simple and automated manner. In this case, I used the stand alone Flash player flashplayer11_1r102_62_win_sa_32bit.exe.

When using SULO to analyze the Flash file, the second stage Flash file is extracted automatically. The command shown in figure below will run and extract the packed SWF file.

In this particular case, SULO manages to extract 2 SWF files. So, the next step is to analyze these second stage SWF files. Once again, using FFDec and observing its structure and Action Script code. The second stage Flash files are interesting because one contains the code and the other makes extensive use of DefineBinaryData tag’s to store encrypted data. As a starting point, the analysis steps here are the same. Invoke the P-Code deobfuscation feature in order to restore the control flow, remove traps and remove dead code. In addition, the plugin to rename invalid identifiers was executed. After performing these two steps, the Action Script code is more readable.

In this post I won’t bother you with the details about the Action Script code. Nonetheless, one thing to mention about the code is that if you follow the site malware.dontneedcoffee.com and the amazing work done by Kaffeine on hunting down, analyzing and documenting Exploit Kits you might have noticed that he calls this version of RIG “RIG-v Neutrino-ish“. The reason might be due to the usage of the RC4 key to decrypt the payload and also the similarities in the way the Flash files are encoded and obfuscated.

Anyhow, understanding the code is important but is also important to understand what the Flash file is hiding from us. In a nutshell, one of the second stage Flash files contain several defineBinaryTags which contain encrypted strings that are used throughout the code in the other second stage Flash file. The data can be obtained by decrypting the data using RC4 keys that are also inside the defineBinaryTags. The figure below ilustrates this.

In summary DefineBinaryData tag 2 contains an array of one 16-byte RC4 key. DefineBinaryData tag 1 contains 19 (0x13) RC4-encrypted strings. The first dword contains the total number of strings. Then each string starts with a dword that contains the size of the string, followed by the RC4-encrypted data.

The DefineBinaryData tag 3 contains an array of three 16-byte RC4 keys. DefineBinaryData tag 4 contains 6 (0x06) RC4-encrypted strings. The first dword contains the total number of strings. Then each string starts with a dword that contains the size of the string, followed by the RC4-encrypted data. The RC4 decryption routine uses the 3 RC4 keys iteratively across the 19 strings.

The decrypted strings are used on different parts of the code. One of the strings is relevant because it contains shellcode that is nearly identical to the one seen in the JavaScript exploit inside the landing page.

Based on the decrypted strings it seems the Flash contains code to exploit CVE-2015-5122. This exploit has a CVSS score of 10 and is known as Adobe Flash ActionScript 3 opaque Background Use-After-Free Vulnerability. This exploit was found as a result of the public disclosure of the Hacking Team leak. In a matter of hours, the exploit was incorporated in the Angler Exploit Kit.

… and with a so lengthy post, that’s it for today. In the following post I will cover how to analyze the Shellcode to understand what is done behind the scenes when one of the exploits is successfully triggered.

First stage Flash file MD5: d11c936fecc72e44416dde49a570beb5
Second stage Flash file MD5: 574353ed63276009bc5d456da82ba7c1
Second stage Flash file MD5: e638fa878b6ea20fa8253d79b989fd7e
References:

Neutrino Exploit Kit Analysis and Threat Indicators

Count Upon Security

Increase security awareness. Promote, reinforce and learn security skills.

Tag Archives: JPEXS

RIG Exploit Kit Analysis – Part 3

RIG Exploit Kit Analysis – Part 2