Extract and use Indicators of Compromise from Security Reports

apt-reports-1Every now and then there is a new security report released and with it comes a a wealth  of information about different threat actors offering insight about the attacker’s tactics, techniques and procedures. For example, in this article I wrote back in 2014, you have a short summary about some of the reports that were released publicly throughout the year. These reports allow the security companies to advertise their capabilities but on the other hand they are a great resource for network defenders. Some reports are more structured than others and they might contain different technical data. Nonetheless, almost all of them have IOC’s (Indicators of compromise).

For those who never heard about indicators of compromise let me give a brief summary. IOC’s are pieces of information that can be used to search and identify compromised systems. These pieces of information have been around since ages but the security industry is now using them in a more structural and consistent fashion. All types of enterprises are moving from the traditional way of handling security incidents – Wait for an alert to come in and then respond to it – to a more proactive approach which consists in taking the necessary steps to hunt for evil in order to defend their networks. In this new strategy, the IOCs are a key technical component. When someone compromises a system, they leave evidence behind.  That evidence, artifact or remnant piece of information left by an attacker can be used to identify the intrusion, threat group or the malicious actor. Examples of IOCs are IP addresses, domain names, URLs, email addresses, file hashes, HTTP user agents, registry keys, a service configuration change, a file is deleted, etc. With this information one could sweep the network/endpoints and look for indicators that the system might have been compromised. For more background about it you can read Lenny Zeltzer summary. Will Gragido from RSA explained it well in is 3 parts blog herehere and here. Mandiant also has this and this great articles about it.

So, if almost all the reports published contain IOC’s, is there a central place that contains all reports that were released by the different security organizations? Kiran Bandla created a repository on GitHub named APTnotes where he does exactly that. He maintains a list of all security reports released by the different vendors – “APTnotes is a repository of publicly-available papers and blogs (sorted by year) related to malicious campaigns/activity/software that have been associated with vendor-defined APT (Advanced Persistent Threat) groups and/or tool-sets.” Kiran, among other methods relies on the community who share with him when a report X or Y was released and he adds it to the APTnotes repository. The list of reports is available in CSV and JSON format.

So, what can you do with all this reports? How can we as network defenders use this information?

Three example use cases.  The first use case is more likely and common, where the other two might be more common in larger organizations with higher budget and bigger security appetite.

  • An organization that has a Security Operations Center in place could use the IOC’s to augment their existing monitoring and detection capabilities.
  • An organization that has in-house CSIRT capabilities could leverage the IOC’s in a proactive manner in order to have a higher probability of discovering something bad and, as such, reduce the business impact that a security incident might have in the organization.
  • An organization that has a Cyber Threat Intelligence capability in-house, could collect, process, exploit and analyze these reports. Then disseminate actionable information to the threat intelligence consumers throughout the organization.

In a simple manner, the process for the first scenario would look something like the following diagram:


How would this work in practical terms? Normally, you could split the IOC’s in host based or network based. For example, a DNS name or IP addresses will be more effective to search across your network infrastructure. However, a Registry Key or a MD5 will be more likely to be searched across the endpoint.

For this article, I will focus on MD5’s. Some reports offer file hashes using SHA-1 or SHA-256 but not many organizations have the capability to search for this. MD5 is more common. Noteworthy that the value of MD5 hashes about a malicious file might be considered low.  A great article about the value of IOC’s and TTP’s was written back in 2014 from David Bianco title The Pyramid of Pain. Following that there is an article from Harlan Carvey with additional thoughts about it. Another point to take into consideration about the MD5’s from the reports is that some might be from legitimate files due to the usage of DLL hijacking or they are windows built-in commands used by threat actors. In addition is likely that malware used by the different threat actors is only used once and you might not see a MD5 hash a second time. Nonetheless, the MD5’s is a starting point.

So, how can we collect the reports, extract the IOC’s and convert them and use them?

First you can use a python script to download all the reports in a central place separated per year. Then you can use the tool IOC parser written by Armin Buescher. This tool will be able to parse PDF reports and extract IOC’s into CSV format. From here you can extract the relevant IOC’s. For this example, I want to extract the MD5’s and then use IOC writer to create IOC’s in OpenIOC 1.0/1.1  format which could be used with a tool such as Redline.  IOC writer is a python library written by William Gibb that allows you to manipulate IOC’s in OpenIOC 1.1 and 1.0 format and was released on BlackHat 2013.

The steps necessary to perform this are illustrated below – for sure there are  other better ways to perform this but this was a quick way to do the job -.


With this we created a series of files with  .ioc extension that can be further edited with the ioc_writer Python library. However, not many tools support OpenIOC, so you might just use the MD5’s and feed that into whatever tool or format you are using. You can download below the excel with all the MD5’s separated per year.  If you use them, please bare in mind you might have false positives and you will need to go back to the csv files to understand from which report did the MD5 came from. For example I  already removed half dozen of MD5’s that are legitimate files but seen in the reports, and also the MD5 for an empty file “d41d8cd98f00b204e9800998ecf8427e”

That’t it, with this you can create a custom IOC set that contain MD5’s of different tools, malware families and files that was compiled by extracting the MD5’s from the public reports about targeted attacks. From here you can start building more complex IOC’s with different artifacts based on a specific report or threat actor. Maybe you get lucky and you find evil on your network!

The year 2010 contains 81 unique MD5’s.
The year 2011 contains 96 unique MD5’s.
The year 2012 contains 718 unique MD5’s.
The year 2013 contains 2149 unique MD5’s
The year 2014 contains 1306 unique MD5’s
The year 2015 contains 1553 unique MD5’s
The year 2016 contains 1173 unique MD5’s

Excel : md5-aptnotes-2010-2016


2 thoughts on “Extract and use Indicators of Compromise from Security Reports

  1. […] Luis Rocha at Count Upon Security shares his thoughts on how defenders can use information shared in security reports about the various APT’s. This also includes practical steps for extracting and utilising the reported IOCs. Extract And Use Indicators Of Compromise From Security Reports […]


  2. filippo mottini says:

    Great article!!!!! By Filippo


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

gb_master's /dev/null

... and I said, "Hello, Satan. I believe it's time to go."

Source Code Auditing, Reversing, Web Security

Finding Hidden codes in the software

BruteForce Lab's Blog

security, programming, devops, visualization, the cloud

Count Upon Security

Increase security awareness. Promote, reinforce and learn security skills.

Naked Security

Computer Security News, Advice and Research

Didier Stevens

(blog \'DidierStevens)


Adventures in double-clicking malware / by Anuj Soni

Rational Survivability

Hoff's Ramblings about Information Survivability, Information Centricity, Risk Management and Disruptive Innovation.

SANS Internet Storm Center, InfoCON: green

Increase security awareness. Promote, reinforce and learn security skills.


Increase security awareness. Promote, reinforce and learn security skills.

Schneier on Security

Increase security awareness. Promote, reinforce and learn security skills.

Technicalinfo.net Blog

Increase security awareness. Promote, reinforce and learn security skills.

Lenny Zeltser

Increase security awareness. Promote, reinforce and learn security skills.

Krebs on Security

In-depth security news and investigation

%d bloggers like this: