Exploitation | Count Upon Security

Intro to American Fuzzy Lop – Fuzzing with ASAN and beyond

In the previous article I showed how easily you could run American Fuzzy Lop (AFL) to fuzz an open source target. I used a 64-bit system and used AFL to fuzz the 64-bit version of tcpdump. Today, I’m going to fuzz the 32-bit version of tcpdump. Why the 32-bit version? Because I would like to combine AFL with AddressSanitizer (ASAN). As written by Michal Zalewski in the “notes_for_asan.txt” document (AFL documentation), ASAN requires a lot of virtual memory. In certain circumstances, some inputs could cause heavy memory consumption and system instability. Therefore, Michal suggests using ASAN to fuzz a 32-bit target and telling AFL to enforce a memory limit. Other options exist to combine AFL with ASAN on a 64-bit target, such as the option “ASAN_OPTIONS=hard_rss_limit_mb=N”, but I will keep it simple and use the 32-bit approach.

And why use ASAN? Because ASAN is a fast memory error detector, created by Google (Konstantin Serebryany, Derek Bruening, Alexander Potapenko and Dmitry Vyukov) and released in 2011 [2][7], that is part of the LLVM compiler infrastructure. It performs compile-time instrumentation and uses a runtime library that replaces memory management functions such as malloc and free with custom code. With these properties, while fuzzing with AFL, you increase the likelihood of finding memory corruption errors such as heap buffer overflows, memory leaks, use-after-free and others [1]. The ASAN custom code that replaces the memory management functions uses a shadow memory technique that keeps track of all the malloc-ed and free-ed regions and marks the areas of memory around them as red zones, i.e. it poisons the memory regions surrounding the malloc-ed and free-ed regions [4]. Then, while fuzzing, if the input causes some sort of out-of-bounds read or write, ASAN will catch it.
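To make the red-zone idea more concrete, here is a toy model in Python. It is only a sketch under stated assumptions and not how the real ASAN runtime is implemented: the ToyHeap class, the 16-byte red zone and the 4096-byte arena are made up for illustration. The only detail borrowed from ASAN is the mapping of 8 application bytes to one shadow entry.

```python
# Toy model of ASAN-style red zones. Hypothetical names and sizes throughout.
SHADOW_SCALE = 3   # 8 application bytes share one shadow entry (addr >> 3)
REDZONE = 16       # illustrative red-zone size in bytes (assumption)


def round8(nbytes: int) -> int:
    """Round a size up to the 8-byte shadow granularity."""
    return (nbytes + 7) & ~7


def shadow_index(addr: int) -> int:
    """Map an application address to its shadow slot."""
    return addr >> SHADOW_SCALE


class ToyHeap:
    def __init__(self, size: int = 4096) -> None:
        # One flag per 8-byte granule; True means poisoned (not addressable).
        self.poisoned = [True] * (size >> SHADOW_SCALE)
        self.next_free = 0

    def malloc(self, nbytes: int) -> int:
        """Return a block whose neighbours on both sides stay poisoned."""
        start = self.next_free + REDZONE
        for addr in range(start, start + round8(nbytes), 8):
            self.poisoned[shadow_index(addr)] = False
        self.next_free = start + round8(nbytes)
        return start

    def free(self, start: int, nbytes: int) -> None:
        """Poison the block again so a later use-after-free is caught."""
        for addr in range(start, start + round8(nbytes), 8):
            self.poisoned[shadow_index(addr)] = True

    def check(self, addr: int) -> None:
        """Stand-in for the check that instrumentation adds before an access."""
        if self.poisoned[shadow_index(addr)]:
            raise MemoryError(f"invalid access at {addr:#x}")
```

With this model, heap.check(p + 32) on a 32-byte block raises because the access lands in the red zone, and any access to the block after heap.free(p, 32) raises as well.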

So, to use AFL with ASAN, instead of using the afl-gcc compiler wrapper, like I did in the previous article, I need to use the afl-clang-fast compiler, which is a wrapper for the Clang compiler. Note that even if you are not leveraging ASAN, you should use the AFL Clang compiler because it is much faster than the AFL GCC compiler.

So, let’s start. First, I install all dependencies for the 32-bit architecture. This step works well on a freshly installed 64-bit system; on an existing system you might need to resolve some dependencies between the different architectures.

Then, like in the previous article, I’m going to download AFL and compile it, but this time I also need to compile the LLVM mode support and give it the correct flags and version of Clang.

Next, to compile the 32-bit version of tcpdump, I also need to compile the 32-bit version of libpcap. At the time of this writing the latest version is 1.8.1. When compiling libpcap I use the “-m32” flag so it produces 32-bit code. When compiling tcpdump, I add the ASAN flags “-fsanitize=address” and “-fno-omit-frame-pointer” to add the AddressSanitizer run-time library and get better stack traces, respectively [3].

After that, I’m going to use the same corpus as the one outlined in the previous article and will put the corpus into a ramdisk. This is not a mandatory step but rather a good practice, because fuzzing is I/O intensive and could, eventually, reduce the lifetime of your drive, especially with SSDs.

Then, start fuzzing! When you start AFL for the first time, you might want to start it without screen, to make sure it executes properly, because AFL might print out instructions about tuning that needs to be performed on the operating system, such as checking core_pattern and the CPU scaling governor. After having the master node running, I start the slave nodes. In this case I’m using a system with 32 cores, so I should be able to run 32 instances of AFL.
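The master and slave instances can be launched with command lines along these lines. This is a sketch, not the exact commands used for this article: the directory names, the instance names, the 800 MB limit and the tcpdump arguments are assumptions chosen for illustration, while -i, -o, -m, -M and -S are standard afl-fuzz options.

```python
from typing import List

# Hypothetical target invocation; "@@" is replaced by AFL with the test case path.
TARGET = ["./tcpdump", "-nr", "@@"]


def afl_command(role: str, name: str, in_dir: str = "in",
                out_dir: str = "out", mem_mb: int = 800) -> List[str]:
    """Build the argv for one afl-fuzz instance in a parallel session."""
    if role not in ("master", "slave"):
        raise ValueError("role must be 'master' or 'slave'")
    sync_flag = "-M" if role == "master" else "-S"
    return ["afl-fuzz", "-i", in_dir, "-o", out_dir,
            "-m", str(mem_mb), sync_flag, name] + TARGET


def fleet(instances: int) -> List[List[str]]:
    """One master plus (instances - 1) slaves sharing the same output directory."""
    commands = [afl_command("master", "master")]
    for n in range(1, instances):
        commands.append(afl_command("slave", f"slave{n:02d}"))
    return commands
```

Each list returned by fleet(32) could be passed to subprocess.Popen, for example inside its own screen session, once the first instance has been started by hand and its tuning messages have been dealt with.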

AFL will start reading the different test cases from the input directory, fuzz them using the different deterministic and non-deterministic stages, find new test cases and queue them for future stage rounds. To check the status of the fuzzing session across the different instances, I could use afl-whatsup. After a while, you should, hopefully, start to see crashes on some of the fuzzer instances. Then you might want to start looking at the crashes across the different fuzzing sessions. The picture below illustrates a quick way to run the crash reports and grep for the ERROR line.
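As a rough equivalent of grepping for the ERROR line, the bug type can be pulled out of each ASAN report and tallied. This is a minimal sketch: the function names are hypothetical and the sample report in the usage note is made up, although it follows the usual “==PID==ERROR: AddressSanitizer: <type>” layout.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches the first line of an ASAN report, e.g.
# ==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
ERROR_RE = re.compile(r"^==\d+==\s*ERROR: AddressSanitizer: (\S+)", re.MULTILINE)


def asan_bug_type(report: str) -> Optional[str]:
    """Return the bug type named on the ERROR line, or None if there is none."""
    match = ERROR_RE.search(report)
    return match.group(1) if match else None


def summarize(reports: Iterable[str]) -> Counter:
    """Count how many reports fall into each bug type."""
    return Counter(t for t in map(asan_bug_type, reports) if t is not None)
```

Feeding it the stderr captured from running the target on each file under the crashes directories gives a quick overview, for example Counter({'heap-buffer-overflow': 2}) for two reports of that type.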

To get more information about where the crash happened and the stack trace, you could tell ASAN to output debug symbol information by using the options “ASAN_OPTIONS=symbolize=1” and “ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.5”. In the picture below I run the crash files with ASAN options so I could see symbols.
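If the crash files are replayed from a script, the two variables can be set in the environment handed to the target. A small sketch, assuming the symbolizer path quoted above; the helper name is hypothetical.

```python
import os
from typing import Dict, Optional


def symbolized_env(symbolizer: str = "/usr/bin/llvm-symbolizer-3.5",
                   base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with ASAN symbolization switched on."""
    env = dict(os.environ if base is None else base)
    existing = env.get("ASAN_OPTIONS")
    # ASAN_OPTIONS holds colon-separated key=value pairs; keep any that exist.
    env["ASAN_OPTIONS"] = f"{existing}:symbolize=1" if existing else "symbolize=1"
    env["ASAN_SYMBOLIZER_PATH"] = symbolizer
    return env
```

The result can be passed as the env argument of subprocess.run when replaying a crash file against the instrumented binary.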

Before I continue, it is worth mentioning that, apart from running AFL with ASAN, you could instead instrument the binary with MemorySanitizer (MSan). MSan is another sanitizer, one that helps detect uninitialized memory reads in C/C++ programs, which occur when a program tries to read allocated memory before it is written. For the sake of brevity I’m only showing how to instrument tcpdump with MSan and run a crash file report with a symbolized stack trace for debugging purposes. MSan needs a newer version of the LLVM compiler infrastructure.

So, after you find a crash condition, what are the next steps? You report the bug to the vendor, maintainer or entity and provide them with all the details. Then, they might work with you to find the root cause and finally they fix the code for the greater good of overall security. However, another path, instead of just reporting the fault, is determining the exploitability of a crash condition. This is another important aspect of the vulnerability analysis process, and nowadays the economic and professional incentives to perform this work are significant. However, in my opinion, this path is much harder. With the race between bugs being found and security measures being adopted by the different operating systems, a great depth of knowledge is needed to triage a crash that was likely produced by a fuzzer: debug the program, perform code analysis, understand its internals, assess its state when it crashes and determine if there is any chance of exploitability under the operating system environment where the crash occurred. That is a remarkable and unique skill to have.

However, even if you do not have that kind of skill, like myself, you could at least start to understand whether controlling the input allows you to obtain different outcomes. For example, are you able to, somehow, control the value of the memory address that causes a segmentation fault? AFL can help determine this to some degree. With AFL you can take one crash file and use the “crash exploration” mode, which is also known as the “peruvian were-rabbit” mode. Essentially you feed AFL with the test case that causes a crash condition and AFL will quickly determine the different code paths that can be reached while maintaining the crash state. With this, AFL will generate a number of crash files that could help you understand to which degree the input has control over the fault address.

In this case, for illustration purposes, I executed AFL in crash exploration mode using as input the file “stack-buffer-overflow-printf_common-hncp-cve2017-13044.pcap”. This is a packet capture that triggers CVE-2017-13044 : HNCP parser in tcpdump before 4.9.2 has a buffer over-read in print-hncp.c:dhcpv4_print(). After a few minutes, AFL was able to determine 42 unique crashes. This means 42 unique code paths that lead to the same crash state.

Quickly looking at the ASAN reports we can obtain two crashes that overrun the buffer boundary and allow us to READ different sizes of adjacent memory.

Picture below shows the ASAN stack trace report with Symbols when running the packet capture that triggers the CVE-2017-13044 on tcpdump version 4.9.0.

So, after that, in an attempt to understand more, you could identify the function, look at the tcpdump source code and also start to debug it with a debugger. Other than that, I wanted to understand whether I could control something by changing a particular byte in the input file. Therefore I mapped the packet capture to the protocol in question (HNCP), which is described in RFC 7788. After some trial and error, it seems there are 2 bytes in the “Prefix Length” field that cause the buffer over-read condition. The picture below shows the hex listing and mapping for the two crash conditions related to CVE-2017-13044. On the packet on the left, setting the HNCP “Prefix Length” to any value between 0x1f and 0x98 will make ndo_printf() print stack contents. Other values don’t produce the leak. On the packet on the right, the range is from 0x25 to 0x98. So, I could see that I could leak at least 2 memory addresses from the stack, but I could not find a way to control what is printed. Nonetheless, perhaps this packet capture could be merged with another one that has another bug that allows you to control, somehow, the saved return pointer, and use the leak to bypass ASLR.
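The trial and error over the “Prefix Length” value can be automated by generating one variant of the capture per candidate byte value. This is a sketch only: the offset of the field inside the pcap file is not given in this article, so it is a parameter that you have to find in the hex listing yourself, and the function names are hypothetical.

```python
from typing import Dict


def set_byte(data: bytes, offset: int, value: int) -> bytes:
    """Return a copy of data with the byte at offset replaced by value."""
    if not 0 <= offset < len(data):
        raise IndexError("offset outside of the capture")
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    return data[:offset] + bytes([value]) + data[offset + 1:]


def sweep(data: bytes, offset: int, low: int, high: int) -> Dict[int, bytes]:
    """Build one variant per value in the inclusive range [low, high]."""
    return {value: set_byte(data, offset, value) for value in range(low, high + 1)}
```

Writing each variant to its own file and replaying it against tcpdump shows which values trigger the over-read; sweeping 0x00 to 0xff covers every possible value of the field.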

The screenshot below shows the output of GDB (with the PEDA plugin) when setting a breakpoint on the function responsible for printing the fault address. The debugging session was done with tcpdump compiled without ASAN. Note that if the program is compiled with ASAN and you want to load it into GDB and view the program state before the ASAN crash report, you could set a breakpoint on “__asan_report_error”.

That’s it for today. This was the second episode about American Fuzzy Lop. I presented a small step-by-step guide on how you could start using AFL with the LLVM compiler infrastructure and, among other things, leverage sanitizers such as ASAN or MSan to fuzz an open source target and catch a bigger variety of bugs related to memory corruption. In addition, we started to look at the crashes and, with small steps, to understand whether they were exploitable.

Using the various techniques described in this article, during the last weeks I fuzzed a couple of tools (tcpdump, affutils, libevt and sleuthkit) and found some faults that I reported to the respective maintainers. I will update the list below as soon as all the faults are disclosed and acknowledged.

CVE-2018-8050 : The af_get_page() function in lib/afflib_pages.cpp in AFFLIB (aka AFFLIBv3) through 3.7.16 allows remote attackers to cause a denial of service (segmentation fault) via a corrupt AFF image that triggers an unexpected pagesize value. I reported this bug to Phillip Hellewell who promptly analyzed the crash condition and quickly released a fix for it.
CVE-2018-8754 : The libevt_record_values_read_event() function in libevt_record_values.c in libevt before 2018-03-17 does not properly check for out-of-bounds values of user SID data size, strings size, or data size. I reported this bug to Joachim Metz who promptly analyzed the crash condition and quickly released a fix for it.

Credits: Thanks to Aleksey Cherepanov for his enthusiasm, valued support and willingness to help me with my never ending questions and Rui Reis for his ideas and support to make this article better.

References:

[1] https://github.com/google/sanitizers/wiki/AddressSanitizer
[2] https://blog.chromium.org/2011/06/testing-chromium-addresssanitizer-fast.html
[3] https://clang.llvm.org/docs/AddressSanitizer.html
[4] http://devstreaming.apple.com/videos/wwdc/2015/413eflf3lrh1tyo/413/413_advanced_debugging_and_the_address_sanitizer.pdf (page 29)
[5] https://www.blackhat.com/docs/us-16/materials/us-16-Branco-DPTrace-Dual-Purpose-Trace-For-Exploitability-Analysis-Of-Program-Crashes-wp.pdf
[6] https://github.com/google/sanitizers/wiki/MemorySanitizer
[7] https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
Vulnerability Discovery and Triage Automation training material from Richard Johnson

Evolution of Stack Based Buffer Overflows

On 2 November 1988 the Morris Worm became the first blended threat affecting multiple systems on the Internet. One of the things the worm did was to exploit a buffer overflow against the fingerd daemon due to its usage of the gets() library function. In this particular case the fingerd program had a 512-byte buffer for gets(). However, this function would not verify if the input received was bigger than the allocated buffer, i.e., it would not perform boundary checking. Due to this, Morris was able to craft an exploit of 536 bytes which would fill the gets() buffer and overwrite parts of the stack. More precisely, it overwrote the return address stored in the stack frame with a new address. This new address would point into the stack where the crafted input had been stored. The shellcode consisted of a series of opcodes that would perform the execve(“/bin/sh”,0,0) system call. This would give a shell prompt to the attacker. A detailed analysis was written by Eugene Spafford, an American professor of computer science at Purdue University. This was a big event and made buffer overflows gain notoriety.
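The effect of an unchecked copy can be modelled in a few lines of Python. This is a toy illustration and not the real fingerd stack layout: the 20 bytes of saved state between the buffer and the return address, and all the addresses, are assumptions picked so that the frame adds up to the 536 bytes mentioned above.

```python
import struct

BUF_SIZE = 512      # the fingerd buffer size mentioned above
SAVED_STATE = 20    # hypothetical bytes between the buffer and the return address
RET_OFFSET = BUF_SIZE + SAVED_STATE


def make_frame(return_address: int) -> bytearray:
    """Model a frame as: buffer | saved state | 4-byte return address."""
    return bytearray(RET_OFFSET) + struct.pack("<I", return_address)


def unchecked_read(frame: bytearray, data: bytes) -> None:
    """Copy like gets(): the length is never compared with BUF_SIZE."""
    frame[0:len(data)] = data


def bounded_read(frame: bytearray, data: bytes) -> None:
    """Copy like fgets(): at most BUF_SIZE - 1 bytes reach the buffer."""
    chunk = data[:BUF_SIZE - 1]
    frame[0:len(chunk)] = chunk


def return_address(frame: bytearray) -> int:
    """Read back the saved return address from the modelled frame."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]
```

A 536-byte input whose last four bytes hold a stack address replaces the saved return address when it goes through unchecked_read, while bounded_read leaves the saved return address untouched.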

Time passed and the security community had to wait for information about the closely guarded technique to become publicly available. One of the first articles on how to exploit buffer overflows was written in the fall of 1995 by Peiter Zatko, a.k.a. Mudge, who at the time was one of the members of the prominent hacker group L0pht. One year later, in the summer of 1996, the 49th issue of the Phrack e-zine was published. With it came the notorious step-by-step article “Smashing the Stack for Fun and Profit” written by Elias Levy, a.k.a. Aleph1. This article is still today a reference for academia and for the industry in order to understand buffer overflows. In addition to these two articles, another one was written in 1997 by Nathan Smith, named “Stack Smashing vulnerabilities in the UNIX Operating System.” These 3 articles, especially the article from Aleph1, allowed the security community to learn and understand the techniques needed to perform such attacks.

Meanwhile, in April 1997 Alexander Peslyak, a.k.a. Solar Designer, posted on the Bugtraq mailing list a Linux patch to defeat this kind of attack. His work consisted of changing the memory permissions of the stack to read and write instead of read, write and execute. This would defeat buffer overflows where the malicious code resides in the stack and needs to be executed from there.

Nonetheless, Alexander went further and in August 1997 he was the first to demonstrate how to get around a non-executable stack using a technique known as return-to-libc. Essentially, when executing a buffer overflow, the limits of the original buffer are exceeded by the malicious input and the adjacent memory is overwritten, especially the saved return address. The return address is overwritten with a new address. This new address, instead of pointing to an address on the stack, points to a memory address occupied by the libc library, e.g., system(). Libc is the standard C library, which provides functions such as printf(), system() and exit() on Linux. This is an ingenious technique which bypasses the non-executable stack and doesn’t need shellcode. This technique can be achieved in three steps. As Linus Torvalds wrote in 1998, you do something like this:

Overflow the buffer on the stack, so that the return value is overwritten by a pointer to the “system()” library function.
The next four bytes are crap (a “return pointer” for the system call, which you don’t care about)
The next four bytes are a pointer to some random place in the shared library again that contains the string “/bin/sh” (and yes, just do a strings on the thing and you’ll find it).
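The three steps above translate into a payload layout that can be sketched with struct.pack. Everything numeric here is hypothetical: the offset to the saved return address and the two addresses depend on the target binary and on where libc is mapped, and the layout assumes a 32-bit little-endian x86 process.

```python
import struct


def ret2libc_payload(offset: int, system_addr: int, binsh_addr: int,
                     filler: bytes = b"JUNK") -> bytes:
    """Lay out: padding | &system | fake return address | &"/bin/sh"."""
    if len(filler) != 4:
        raise ValueError("the fake return address must be 4 bytes")
    return (
        b"A" * offset                      # step 1: fill the buffer up to the return address
        + struct.pack("<I", system_addr)   # ...and overwrite it with system()
        + filler                           # step 2: four bytes nobody cares about
        + struct.pack("<I", binsh_addr)    # step 3: pointer to the "/bin/sh" string
    )
```

For example, ret2libc_payload(76, 0xb7e5f430, 0xb7f7fd4c) yields 88 bytes, where all three numbers are placeholders to be replaced with values found in a debugger.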

Apart from pioneering the demonstration of this technique, Alexander also improved his previous non-executable stack patch with a technique called ASCII Armoring. ASCII Armoring makes buffer overflows more difficult to exploit because it maps the shared libraries at memory addresses that contain a zero byte, such as 0xb7e39d00. This was another clever defense, because one of the causes of buffer overflows is the way the C language handles strings in routines like strcpy(), gets() and many others. These routines are created to handle strings that terminate with a null byte, i.e., a NUL character. So, as an attacker crafting a malicious payload, you have to provide input that does not contain a NUL character, otherwise the string handling routine stops copying at that point. By introducing a null byte into the library addresses, any payload that has to carry such an address is cut short by the string handling routine and the exploit breaks.
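The effect of a zero byte inside a library address can be shown with a toy stand-in for strcpy(). This is a sketch: c_string_copy only imitates the “stop at the first NUL” behaviour, and the unarmored address used in the usage note is made up.

```python
import struct


def c_string_copy(src: bytes) -> bytes:
    """Imitate strcpy(): copy bytes until the first NUL byte is reached."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


def payload_with(address: int, padding: int = 16) -> bytes:
    """Padding followed by a little-endian 32-bit address and more data."""
    return b"A" * padding + struct.pack("<I", address) + b"JUNK"
```

With the armored address 0xb7e39d00 the little-endian encoding starts with a zero byte, so c_string_copy(payload_with(0xb7e39d00)) returns only the 16 bytes of padding and the address never reaches the stack. A hypothetical address without a zero byte, such as 0xb7e39d10, is copied in full.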

Based on the work from Alexander Peslyak, Rafal Wojtczuk a.k.a. Nergal, wrote in January 1998 to the Bugtraq mailing list another way to perform return-to-libc attacks in order to defeat the non-executable stack. This new technique presented a method that was not confined to return to system() libc and could use other functions such as strcpy() and chain them together.

Meanwhile, in October 1999, Taeho Oh wrote “Advanced Buffer Overflow Exploits”, describing novel techniques to create shellcode that could be used to exploit buffer overflow vulnerabilities.

Following all this activity, Crispin Cowan presented at the 7th USENIX Security Symposium in January 1998 a technology known as StackGuard. StackGuard was a compiler extension that introduced the concept of “canaries”. In order to prevent buffer overflows, binaries compiled with this technology have a special value that is created during the function prologue and pushed onto the stack next to the saved return address. This special value is referred to as the canary. When performing the epilogue of a function, StackGuard checks that the canary is intact, which indicates that the return address has been preserved. If the canary has been altered, the execution of the program is terminated.
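The canary check can be modelled with a small Python class. It is a toy under stated assumptions, not StackGuard itself: the Frame class, the 4-byte canary width and the frame layout are hypothetical and only mirror the idea of a value placed next to the return address and verified before returning.

```python
import os
import struct
from typing import Optional


class Frame:
    """Toy frame laid out as: buffer | canary | return address."""

    def __init__(self, buf_size: int, return_address: int,
                 canary: Optional[bytes] = None) -> None:
        # "Prologue": pick a canary and store it next to the return address.
        self.buf_size = buf_size
        self.canary = os.urandom(4) if canary is None else canary
        self.memory = (bytearray(buf_size) + self.canary
                       + struct.pack("<I", return_address))

    def write(self, data: bytes) -> None:
        """Unchecked copy into the buffer, as a vulnerable function would do."""
        self.memory[0:len(data)] = data

    def ret(self) -> int:
        """"Epilogue": refuse to return if the stored canary was altered."""
        stored = bytes(self.memory[self.buf_size:self.buf_size + 4])
        if stored != self.canary:
            raise RuntimeError("stack smashing detected")
        return struct.unpack_from("<I", self.memory, self.buf_size + 4)[0]
```

A write that stays inside the buffer lets ret() hand back the original return address. A write that runs past the buffer has to cross the canary before it reaches the return address, so ret() raises instead of returning to the attacker’s address.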

As always in the never-ending cat and mouse game of the security industry, after this new security technique was introduced, others had to innovate and take it to the next level in order to circumvent the implemented measures. The first information about bypassing StackGuard was discovered in November 1999 by the Polish hacker Mariusz Wołoszyn and posted on the Bugtraq mailing list. Following that, in January 2000, Mariusz, a.k.a. Kil3r, and Bulba published in Phrack 56 the article “Bypassing StackGuard and StackShield”. A further step forward was made in 2002 by Gerardo Richarte from CORE Security, who wrote the paper “Four different tricks to bypass StackShield and StackGuard protection”.

The non-executable stack patch developed by Alexander was not adopted by all Linux distributions and the industry had to wait until the year 2000 for something to be adopted more widely. In August 2000, the PaX team (now part of grsecurity) released a protection mechanism known as Page-eXec (PaX) that would make some areas of the process address space, i.e., the stack and the heap, not executable by changing the way memory paging is done. This mitigation technique is nowadays standard in the GNU Compiler Collection (GCC) and can be turned off with the flag “-z execstack”.

Then in 2001, the PaX team implemented and released another mechanism known as Address Space Layout Randomization (ASLR). This method defeats the predictability of addresses in virtual memory. ASLR randomly arranges the virtual memory layout for a process. With this, the addresses of shared libraries and the location of the stack and heap are randomized. This makes return-to-libc attacks more difficult because the addresses of C library functions such as system() cannot be determined in advance.

By 2001, the Linux Kernel had two measures to protect against unwarranted code execution: the non-executable stack and ASLR. Nonetheless, Rafal Wojtczuk (Nergal) wrote a breakthrough paper in issue 58 of Phrack in December 2001. The article was called “The Advanced return-into-lib(c) exploits” and introduced a new technique known as return-to-plt. This technique was able to defeat the first ASLR implementation. Then the PaX team strengthened the ASLR implementation and introduced a new feature to defend against return-to-plt. As expected, this defense didn’t last long without a comprehensive study on how to bypass it. In August 2002, Tyler Durden published an article in Phrack issue 59 titled “Bypassing PaX ASLR protection”.

Today, ASLR is adopted by many Linux distributions. It is built into the Linux Kernel and on Debian and Ubuntu based systems it is controlled by the parameter /proc/sys/kernel/randomize_va_space. This mitigation can be changed with the command “echo <value> > /proc/sys/kernel/randomize_va_space”, where value can be:

0 – Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
1 – Randomize the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
2 – Randomize the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.
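To check which of these values is active on a machine, the parameter can be read back from /proc. A minimal sketch, assuming a Linux system; the function names are hypothetical and the descriptions are shortened from the list above.

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "ASLR disabled",
    1: "stack, VDSO page and shared memory regions randomized",
    2: "stack, VDSO page, shared memory regions and data segment randomized",
}


def describe_aslr(value: int) -> str:
    """Translate a randomize_va_space value into a short description."""
    return ASLR_MODES.get(value, "unknown setting")


def current_aslr(path: Path = ASLR_PATH) -> int:
    """Read the current setting; only meaningful on Linux."""
    return int(path.read_text().strip())
```

On a Linux box, print(describe_aslr(current_aslr())) reports the active mode without needing root, since the file is world-readable.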

Interestingly, on 32-bit Linux machines an attacker with local access could disable ASLR just by running the command “ulimit -s unlimited”. A patch has just been released to fix this weakness.

Following the work of StackGuard, the IBM researcher Hiroaki Etoh developed ProPolice in 2000. ProPolice is known today as Stack Smashing Protection (SSP) and was created based on the StackGuard foundations. However, it brought new techniques, such as protecting not only the saved return address, as StackGuard did, but also the frame pointer, and a new way to generate the canary values. Nowadays this feature is standard in the GNU Compiler Collection (GCC) and can be turned on with the flag “-fstack-protector”. In 2006, Ben Hawkes presented at Ruxcon a technique to bypass the ProPolice/SSP stack canaries using brute force methods to find the canary value.

Time passed and in 2004, Jakub Jelinek from RedHat introduced a new technique known as RELRO. This mitigation technique was implemented in order to harden the data sections of ELF binaries. ELF internal data sections are reordered. In case of a buffer overflow in the .data or .bss section, the attacker will not be able to use the GOT-overwrite attack, because with Full RELRO the entire Global Offset Table is (re)mapped as read only, which prevents format string and 4-byte write attacks against it. Today this feature is standard in GCC and comes in two flavours: Partial RELRO (-z relro) and Full RELRO (-z relro -z now). More recently, Chris Rohlf wrote an article about it and Tobias Klein wrote about it in a blog post.

Also in 2004 a new mitigation technique was introduced by RedHat engineers. The technique is known as Position Independent Executable (PIE). PIE is ASLR but for ELF binaries. ASLR works at the Kernel level and makes sure shared libraries and memory segments are arranged in randomized addresses. However, binaries don’t have this property. This means the addresses of the compiled binary when loaded into memory are not randomized and become a weak spot for protection against buffer overflows. To mitigate this weakness, RedHat introduced the PIE flag in GCC (-pie). Binaries that have been compiled with this flag will be loaded at random addresses.

The combination of RELRO, ASLR, PIE and the non-executable stack significantly raised the bar in protecting against buffer overflows that use the return-to-libc technique and its variants. However, this didn’t last long. First, Sebastian Krahmer from SUSE developed a new variant of the return-to-libc attack for x86-64 systems. Sebastian wrote a paper called “x86-64 buffer overflows exploits and the borrowed code chunks exploitation technique”.

Then, with an innovative paper published by ACM in 2007, Hovav Shacham wrote “The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)”. Hovav introduced the concept of return oriented programming and what he called gadgets to extend the return-to-libc technique and bypass different mitigations enforced by the Linux operating system. This technique was based on the work from Solar Designer and Nergal. It does not need to inject code and takes advantage of existing instructions from the binary itself: it reuses existing instructions and chains them together using the RET instruction to achieve the end goal of manipulating the program control flow and executing code of the attacker’s choice. This is a difficult technique to perform but it is powerful and is known as ROP. A summary was presented by Hovav at Black Hat 2008.
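The chaining idea can be illustrated with a toy interpreter in which the “stack” is a list of gadget addresses and operands. This is a model of the concept only: the three gadgets, their addresses and the register set are hypothetical, and a real chain would be built from instruction sequences that already exist in the target binary.

```python
from typing import Callable, Dict, List

Cpu = Dict[str, int]


def pop_eax(cpu: Cpu, stack: List[int]) -> None:
    cpu["eax"] = stack.pop(0)          # models: pop eax ; ret


def pop_ebx(cpu: Cpu, stack: List[int]) -> None:
    cpu["ebx"] = stack.pop(0)          # models: pop ebx ; ret


def add_eax_ebx(cpu: Cpu, stack: List[int]) -> None:
    cpu["eax"] = (cpu["eax"] + cpu["ebx"]) & 0xFFFFFFFF   # models: add eax, ebx ; ret


# Hypothetical addresses at which the gadgets were "found".
GADGETS: Dict[int, Callable[[Cpu, List[int]], None]] = {
    0x08041000: pop_eax,
    0x08042000: pop_ebx,
    0x08043000: add_eax_ebx,
}


def run_chain(chain: List[int]) -> Cpu:
    """Each RET pops the next gadget address from the attacker-controlled stack."""
    cpu: Cpu = {"eax": 0, "ebx": 0}
    stack = list(chain)
    while stack:
        address = stack.pop(0)
        GADGETS[address](cpu, stack)
    return cpu
```

The chain [0x08041000, 40, 0x08042000, 2, 0x08043000] loads two values and adds them, all without introducing any new code: only addresses and data are placed on the stack.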

Also, in 2008, Tilo Müller wrote “ASLR Smack & Laugh Reference” explaining the different attacks against ASLR in a comprehensive study that outlines the various techniques. In 2009 the paper “Surgically returning to randomized lib(c)” from Giampaolo Fresi Roglia also explains how to bypass non-executable stack and ASLR.

In 2010, Black Hat had 3 talks about return-oriented exploitation. More recently, to facilitate ROP exploitation, the French security researcher Jonathan Salwan wrote a Python tool called ROPgadget. This tool supports many CPU architectures and allows the attacker to find the different gadgets needed to build a ROP chain. Jonathan also gives lectures and makes his material accessible, including a 2014 course lecture on Return Oriented Programming and ROP chain generation. ROP is the current attack method of choice for exploitation, and research is ongoing on mitigation and further evolution.

Hopefully, this gives you good reference material and a good overview of the evolution of the different attacks and mechanisms around stack based buffer overflows. There are other related bug classes, like format strings, integer overflows and heap based overflows, but those are more complex. Stack based buffer overflows are a good starting point before understanding those. Apart from all the material linked in this article, good resources for learning about this topic are the books Hacking: The Art of Exploitation by Jon Erickson, The Shellcoder’s Handbook: Discovering and Exploiting Security Holes by Chris Anley et al., and A Bug Hunter’s Diary: A Guided Tour Through the Wilds of Software Security by Tobias Klein.

