Extract and use Indicators of Compromise from Security Reports

Every now and then a new security report is released, and with it comes a wealth of information about different threat actors, offering insight into the attackers’ tactics, techniques and procedures. For example, in this article I wrote back in 2014, you have a short summary of some of the reports that were released publicly throughout the year. These reports allow security companies to advertise their capabilities, but they are also a great resource for network defenders. Some reports are more structured than others, and they may contain different technical data. Nonetheless, almost all of them include IOCs (Indicators of Compromise).

For those who have never heard about indicators of compromise, let me give a brief summary. IOCs are pieces of information that can be used to search for and identify compromised systems. These pieces of information have been around for ages, but the security industry is now using them in a more structured and consistent fashion. All types of enterprises are moving from the traditional way of handling security incidents (wait for an alert to come in and then respond to it) to a more proactive approach, which consists of taking the necessary steps to hunt for evil in order to defend their networks. In this new strategy, IOCs are a key technical component.

When someone compromises a system, they leave evidence behind. That evidence, artifact or remnant piece of information left by an attacker can be used to identify the intrusion, the threat group or the malicious actor. Examples of IOCs are IP addresses, domain names, URLs, email addresses, file hashes, HTTP user agents, registry keys, a service configuration change, a deleted file, etc. With this information one can sweep the network and endpoints and look for indicators that a system might have been compromised. For more background you can read Lenny Zeltser’s summary. Will Gragido from RSA explained it well in his three-part blog series here, here and here. Mandiant also has this and this great article about it.

So, if almost all the published reports contain IOCs, is there a central place that collects all the reports released by the different security organizations? Kiran Bandla created a repository on GitHub named APTnotes where he does exactly that. He maintains a list of all security reports released by the different vendors: “APTnotes is a repository of publicly-available papers and blogs (sorted by year) related to malicious campaigns/activity/software that have been associated with vendor-defined APT (Advanced Persistent Threat) groups and/or tool-sets.” Among other methods, Kiran relies on the community to tell him when a new report is released, and he then adds it to the APTnotes repository. The list of reports is available in CSV and JSON format.
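As a minimal sketch of how such a report list could be consumed, the snippet below groups the entries of a CSV list by year. The sample rows and the column names (Title, Source, Link, Year) are assumptions made for illustration, so check them against the actual file before relying on them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data. The column names are an assumption,
# not a guaranteed description of the real APTnotes CSV layout.
SAMPLE_CSV = """Title,Source,Link,Year
Report A,Vendor One,https://example.com/a.pdf,2014
Report B,Vendor Two,https://example.com/b.pdf,2015
Report C,Vendor One,https://example.com/c.pdf,2015
"""


def group_reports_by_year(csv_text):
    """Return a dict mapping each year to the list of report rows for it."""
    by_year = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_year[row["Year"].strip()].append(row)
    return dict(by_year)


reports = group_reports_by_year(SAMPLE_CSV)
for year in sorted(reports):
    print(year, len(reports[year]))
```

Grouping by year first makes it easy to download or process the reports one year at a time.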

So, what can you do with all these reports? How can we, as network defenders, use this information?

Here are three example use cases. The first is the most likely and common; the other two are more typical of larger organizations with a higher budget and a bigger security appetite.

An organization that has a Security Operations Center in place could use the IOCs to augment its existing monitoring and detection capabilities.
An organization that has in-house CSIRT capabilities could leverage the IOCs proactively in order to have a higher probability of discovering something bad and, as such, reduce the business impact that a security incident might have on the organization.
An organization that has an in-house Cyber Threat Intelligence capability could collect, process, exploit and analyze these reports, and then disseminate actionable information to the threat intelligence consumers throughout the organization.

In simple terms, the process for the first scenario would look something like the following diagram:

How would this work in practical terms? Normally, you can split the IOCs into host-based and network-based indicators. For example, a DNS name or an IP address is more effective when searched across your network infrastructure, whereas a registry key or an MD5 hash is more likely to be searched for on the endpoints.
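The host-based versus network-based split can be sketched with a small classifier. This is only an illustration: the function name, the category labels and the simplified regular expressions are my own assumptions, and real reports contain many more indicator types and edge cases.

```python
import ipaddress
import re

MD5_RE = re.compile(r"^[a-fA-F0-9]{32}$")
# Simplified domain pattern, good enough for a sketch but not for validation.
DOMAIN_RE = re.compile(
    r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$", re.IGNORECASE
)
REGISTRY_PREFIXES = ("HKLM\\", "HKCU\\", "HKEY_")


def classify_indicator(value):
    """Return (scope, kind), where scope is 'network', 'host' or 'unknown'."""
    value = value.strip()
    try:
        ipaddress.ip_address(value)
        return ("network", "ip")
    except ValueError:
        pass  # not an IP address, keep checking
    if value.lower().startswith(("http://", "https://")):
        return ("network", "url")
    if MD5_RE.match(value):
        return ("host", "md5")
    if value.upper().startswith(REGISTRY_PREFIXES):
        return ("host", "registry_key")
    if DOMAIN_RE.match(value):
        return ("network", "domain")
    return ("unknown", "unknown")


samples = [
    "192.0.2.10",
    "evil.example.com",
    "d41d8cd98f00b204e9800998ecf8427e",
    r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run",
]
for sample in samples:
    print(sample, classify_indicator(sample))
```

Network-scoped indicators would then go to proxy, DNS or firewall log searches, while host-scoped ones go to whatever endpoint tooling is available.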

For this article, I will focus on MD5 hashes. Some reports offer file hashes in SHA-1 or SHA-256, but not many organizations have the capability to search for those, and MD5 is more common. It is worth noting that the value of an MD5 hash of a malicious file might be considered low. A great article about the value of IOCs and TTPs was written back in 2014 by David Bianco, titled The Pyramid of Pain. Following that, there is an article from Harlan Carvey with additional thoughts about it. Another point to take into consideration is that some of the MD5 hashes in the reports belong to legitimate files, either because of DLL hijacking or because they are Windows built-in commands used by threat actors. In addition, it is likely that a given piece of malware is used only once by a threat actor, so you might never see its MD5 hash a second time. Nonetheless, MD5 hashes are a starting point.

So, how can we collect the reports, extract the IOCs, convert them and use them?

First, you can use a Python script to download all the reports to a central place, separated by year. Then you can use the tool IOC parser, written by Armin Buescher, which parses PDF reports and extracts the IOCs into CSV format. From there you can pull out the relevant IOCs. For this example, I want to extract the MD5 hashes and then use IOC writer to create IOCs in OpenIOC 1.0/1.1 format, which can be used with a tool such as Redline. IOC writer is a Python library written by William Gibb that allows you to manipulate IOCs in OpenIOC 1.1 and 1.0 format; it was released at Black Hat 2013.
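The MD5 extraction step can be sketched with the standard library alone. The snippet below is not the exact procedure used for this article: it assumes the parser output is a CSV whose first column is the report file name, and it simply scans the remaining cells for 32-character hexadecimal strings. The sample rows are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

# Exactly 32 hex characters with word boundaries on both sides,
# so longer hashes such as SHA-1 or SHA-256 are not matched.
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")

# Hypothetical parser output: report file, page, indicator type, value.
SAMPLE_OUTPUT = """report_a.pdf,3,MD5,D41D8CD98F00B204E9800998ECF8427E
report_a.pdf,4,MD5,0123456789abcdef0123456789abcdef
report_b.pdf,1,MD5,0123456789abcdef0123456789abcdef
report_b.pdf,2,SHA1,da39a3ee5e6b4b0d3255bfef95601890afd80709
"""


def extract_md5s(csv_text):
    """Map each lowercase MD5 to the set of report names it appeared in."""
    sources = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        report = row[0]
        for cell in row[1:]:
            for md5 in MD5_RE.findall(cell):
                sources[md5.lower()].add(report)
    return dict(sources)


md5_sources = extract_md5s(SAMPLE_OUTPUT)
for md5, found_in in sorted(md5_sources.items()):
    print(md5, sorted(found_in))
```

Keeping the report name next to each hash makes it possible to trace a later hit back to the report it came from.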

The steps necessary to perform this are illustrated below. There are surely better ways to do it, but this was a quick way to get the job done.

With this we created a series of files with the .ioc extension that can be further edited with the ioc_writer Python library. However, not many tools support OpenIOC, so you might just take the MD5 hashes and feed them into whatever tool or format you are using. You can download below the Excel file with all the MD5 hashes separated by year. If you use them, please bear in mind that you might get false positives, and that you will need to go back to the CSV files to understand which report a given MD5 came from. For example, I already removed half a dozen MD5 hashes that belong to legitimate files seen in the reports, as well as the MD5 of an empty file, “d41d8cd98f00b204e9800998ecf8427e”.
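The clean-up of obvious false positives can be sketched as follows. The empty-file hash is computed with hashlib rather than hard-coded; the function name and the known-good list are hypothetical and would need to be filled with hashes you have verified yourself.

```python
import hashlib

# MD5 of zero bytes, i.e. the hash of any empty file.
EMPTY_FILE_MD5 = hashlib.md5(b"").hexdigest()


def filter_md5s(candidates, known_good=()):
    """Drop the empty-file hash, known-good hashes and duplicates.

    Order of first appearance is preserved and hashes are lowercased.
    """
    ignore = {EMPTY_FILE_MD5} | {h.strip().lower() for h in known_good}
    seen = set()
    kept = []
    for md5 in candidates:
        md5 = md5.strip().lower()
        if md5 in ignore or md5 in seen:
            continue
        seen.add(md5)
        kept.append(md5)
    return kept


# Hypothetical input: two placeholder hashes, one duplicate, one empty-file hash.
candidates = [
    "0123456789abcdef0123456789abcdef",
    "D41D8CD98F00B204E9800998ECF8427E",
    "0123456789ABCDEF0123456789ABCDEF",
    "ffffffffffffffffffffffffffffffff",
]
cleaned = filter_md5s(candidates, known_good=["ffffffffffffffffffffffffffffffff"])
print(cleaned)
```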

That’s it. With this you can create a custom IOC set containing the MD5 hashes of different tools, malware families and files, compiled by extracting them from the public reports about targeted attacks. From here you can start building more complex IOCs with different artifacts based on a specific report or threat actor. Maybe you get lucky and find evil on your network!
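If no dedicated endpoint tool is available, a very rough way to use such an MD5 set is to hash the files under a directory and compare. The sketch below is an assumption-laden illustration, not a replacement for a tool such as Redline: the function names are mine, and the demo runs against a temporary directory with made-up files.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sweep(root, ioc_md5s):
    """Return (path, md5) for every file under root whose MD5 is in ioc_md5s."""
    wanted = {h.strip().lower() for h in ioc_md5s}
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                file_md5 = md5_of_file(path)
            except OSError:
                continue  # unreadable file, skip it
            if file_md5 in wanted:
                hits.append((path, file_md5))
    return hits


# Demo on a throwaway directory with two made-up files.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "sample.bin"), "wb") as handle:
        handle.write(b"hello")
    with open(os.path.join(demo_dir, "other.bin"), "wb") as handle:
        handle.write(b"something else")
    demo_hits = sweep(demo_dir, [hashlib.md5(b"hello").hexdigest()])
    print([os.path.basename(path) for path, _md5 in demo_hits])
```

Any hit still needs to be checked against the report it came from, since some of the hashes in the reports belong to legitimate files.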

The year 2010 contains 81 unique MD5 hashes.
The year 2011 contains 96 unique MD5 hashes.
The year 2012 contains 718 unique MD5 hashes.
The year 2013 contains 2149 unique MD5 hashes.
The year 2014 contains 1306 unique MD5 hashes.
The year 2015 contains 1553 unique MD5 hashes.
The year 2016 contains 1173 unique MD5 hashes.

Excel : md5-aptnotes-2010-2016
