Unleashing YARA – Part 1

[Editor’s Note: In the article below, Ricardo Dias who is a SANS GCFA gold certified and a seasoned security professional demonstrates the usefulness of Yara – the Swiss Army knife for Incident Responders. This way you can get familiar with this versatile tool and develop more proactive and mature response practices against threats. ~Luis]

Intro

yara_logoI remember back in 2011 when I’ve first used YARA. I was working as a security analyst on an incident response (IR) team, doing a lot of intrusion detection, forensics and malware analysis. YARA joined the tool set of the team with the purpose to enhance preliminary malware static analysis of portable executable (PE) files. Details from the PE header, imports and strings derived from the analysis resulted in YARA rules and shared within the team. It was considerably faster to check new malware samples against the rule repository when compared to lookup analysis reports. Back then concepts like the kill chain, indicator of compromise (IOC) and threat intelligence where still at its dawn.

In short YARA is an open-source tool capable of searching for strings inside files (1). The tool features a small but powerful command line scanning engine, written in pure C, optimized for speed. The engine is multi-platform, running on Windows, Linux and MacOS X. The tool also features a Python extension providing access to the engine via python scripts. Last but not least the engine is also capable of scanning running processes. YARA rules resemble C code, generally composed of two sections: the strings definition and a, mandatory, boolean expression (condition). Rules can be expressed as shown:

rule evil_executable
{
    strings:
        $ascii_01 = "mozart.pdb"
        $byte_01  = { 44 65 6d 6f 63 72 61 63 79 }
    condition:
        uint16(0) == 0x5A4D and
        1 of ( $ascii_01, $byte_01 )
}

The lexical simplicity of a rule and its boolean logic makes it a perfect IOC. In fact ever since 2011 the number of security vendors supporting YARA rules is increasing, meaning that the tool is no longer limited to the analyst laptop. It is now featured in malware sandboxes, honey-clients, forensic tools and network security appliances (2). Moreover, with the growing security community adopting YARA format to share IOCs, one can easily foresee a wider adoption of the format in the cyber defence arena.

In the meantime YARA became a feature rich scanner, particularly with the integration of modules. In essence modules enable very fine grained scanning while maintaining the rule readability. For example the PE module, specially crafted for handling Windows executable files, one can create a rule that will match a given PE section name. Similarly, the Hash module allows the creation on hashes (i.e. MD5) based on portions of a file, say for example a section of a PE file.

YARA in the incident response team

So how does exactly a tool like YARA integrate in the incident response team? Perhaps the most obvious answer is to develop and use YARA rules when performing malware static analysis, after all this is when the binary file is dissected, disassembled and understood. This gives you the chance to cross-reference the sample with previous analysis, thus saving time in case of a positive match, and creating new rules with the details extracted from the analysis. While there is nothing wrong with this approach, it is still focused on a very specific stage of the incident response. Moreover, if you don’t perform malware analysis you might end up opting to rule out YARA from your tool set.

Lets look at the SPAM analysis use case. If your team analyses suspicious email messages as part of their IR process, there is great chance for you to stumble across documents featuring malicious macros or websites redirecting to exploit kits. A popular tool to analyse suspicious Microsoft Office documents Tools is olevba.py, part of the oletools package (3), it features YARA when parsing OLE embedded objects in order to identify malware campaigns (read more about it here). When dealing with exploit kits, thug (4), a popular low-interaction honey-client that emulates a web browser, also features YARA for exploit kit family identification. In both cases YARA rule interchanging between the IR teams greatly enhances both triage and analysis of SPAM.

Another use case worth mentioning is forensics. Volatility, a popular memory forensics tool, supports YARA scanning (5) in order to pinpoint suspicious artefacts like processes, files, registry keys or mutexes. Traditionally YARA rules created to parse memory file objects benefit from a wider range of observables when compared to a static file rules, which need to deal with packers and cryptors. On the network forensics counterpart, yaraPcap (6), uses YARA for scan network captures (PCAP) files. Like in the SPAM analysis use case, forensic analysts will be in advantage when using YARA rules to leverage the analysis.

Finally, another noteworthy use case is endpoint scanning. That’s right, YARA scanning at the client computer. Since YARA scanning engine is multi-platform, it poses no problems to use Linux developed signatures on a Windows operating system. The only problem one needs to tackle is on how to distribute the scan engine, pull the rules and push the positive matches to a central location. Hipara, a host intrusion prevention system developed in C, is able to perform YARA file based scans and report results back to a central server (7). Another solution would be to develop an executable python script featuring the YARA module along with REST libraries for pull/push operations. The process have been documented, including conceptual code,  in the SANS paper “Intelligence-Driven Incident Response with YARA” (read it here). This use case stands as the closing of the circle in IOC development, since it enters the realm of live IR, delivering and important advantage in the identification of advanced threats.

Conclusion

The key point lies in the ability for the IR teams to introduce the procedures for YARA rule creation and use. Tier 1 analysts should be instructed on how to use YARA to enhance incident triage, provide rule feedback, concerning false positives, and fine tuning to Tier 2 analyst. Additionally a repository should be created in order to centralize the rules and ensure the use of up-to-date rules. Last but not least teams should also agree on the rule naming scheme, preferably reflecting the taxonomy used for IR. These are some of the key steps for integrating YARA in the IR process, and to prepare teams for the IOC sharing process.

References:

  1. https://github.com/plusvic/yara
  2. https://plusvic.github.io/yara
  3. https://blog.didierstevens.com/2014/12/17/introducing-oledump-py
  4. https://github.com/buffer/thug
  5. https://github.com/volatilityfoundation/volatility
  6. https://github.com/kevthehermit/YaraPcap
  7. https://github.com/jbc22/hipara
Tagged , , ,

Leave a comment