Buffer Overflows

Goals

Understanding how stack and heap memory is used

Understanding the concepts of buffer overflow

Exploiting a stack buffer overflow vulnerability

Understanding code reuse attacks (advanced buffer overflow attacks)

With the knowledge about a buffer overflow, launch an attack that exploits a stack buffer overflow vulnerability in a provided sort.c program.

Read up on and write about code reuse attacks.

Explain what a buffer overflow is and why are they dangerous?

Explain how an actual buffer overflow works on both the stack and the heap.

Analysis

Stack Buffer Overflow

Describe the stack in the address space of the VM (in generalities).

In general, a stack is an area where we allocate space for dynamically created variables, which is created when a new process is created. Then, the variables are discarded when the process finishes.

Addresses where in memory the stack would be located (specifically). Which direction, relative to overall memory, does a stack consume memory when it grows?

The stack grows from bottom to top relative to the overall memory. As more processes are called, the stack consumes more memory pushing local variables to lower parts of the memory.

Figure 1. Stack within Address Space

Explain how program control flow is implemented using the stack.

When a program is called, the very first thing that the program needs is to save the return address so it can return to control when it finishes. In addition, it needs to save the parameters while passing it to the function that is called. While the in the function, the function might have some local variables, which also needs to be saved. All of these data are saved in the stack.

How does the stack structure get affected when a buffer of size ‘non-binary’ is allocated by a function (i.e. – buffer size which causes misalignment within the stack)? [Also known as ‘non-binary’]

Given a non-binary size allocated by a function, the stack will create unused padding between the variables to make sure it is stored in the right location. Although it creates unused memory, it is more efficient.

Create a diagram that includes the following

What does the stack structure look like when data is pushed onto the stack and popped off the stack?

Figure 2. Example Stack Frame with Functions P and Q (Stack layout)

From the figure above, when a program is compiled, return address, parameters and local variables of the program are pushed to the stack in the reverse order of declaration. As the program exits, local variables, parameters and return address are pop out of the stack frame, while the stack pointer is moved accordingly.

Show what register values are placed onto and used with the stack.

Within a stack frame, there are two register values that are used with the stack. These are the frame pointer and the stack pointer, the frame pointer chains frames together so when a function is exiting, it can restore the stack frame for the calling function before giving control to the return address. On the other hand, the stack pointer, points to the last position of the stack, which it moves when variables are added or removed.

Where would arguments be placed on the stack?

Arguments are pushed into the stack in reverse order of declaration

Where are local user variables placed on the stack?

Local variables are pushed into the stack in reverse order of declaration

Testing Program – Stack Buffer Overflow

Write a testing program (in C) that contains a stack buffer overflow vulnerability. (You cannot use sort.c from task 2). You are not required to exploit it.


#include <stdio.h>

int main(int argc, char *argv[])
{
	char buf[8];
	memcpy(buf, argv[1], strlen(argv[1]));
	printf(buf);
	printf("\n");
}

Code Snippet 1. Stack Buffer Overflow Exploit

Show what the stack layout looks like and explain how to exploit it. (Include a diagram)

Given the piece of code above, the length of argv is not checked whenever is passed to the memcpy, which clearly shows a buffer overflow vulnerability. In order to exploit the program, first, it is needed to get the number of bytes required to overwrite the return address. To get that number of bytes, an input is given with a certain length to try and overwrite the return address. In this case, a segmentation fault occurs whenever the length reaches 20 bytes. By this point, we can use the GBD program to get the addresses of system, _exit and /bin/sh, in order to be able to append it to our input and make it so it can overwrite the return address with the address of system, transfer the control to system, add the address of _exit, which is the return address of the system call, and be able to run /bin/sh by placing it as a parameter to the system. Finally, the following input is ran to make the exploit happen: ./stack_overflow `perl -e 'print pack("H*", "58585858585858585858585858585858585858589071e5b7c4cbecb7247af7b7")'`

Figure 3. Buffer Overflow Exploit example

Heap Buffer Overflow

Where is the Heap located in a machine’s memory map?

The heap is located above the program code and global data, on the lower part of the memory and grows towards high addresses

Contrast this to Stack memory allocation (in general terms).

The heap in comparison with the stack, the heap grows from low addresses to high addresses. Also, the memory of the heap is dynamically allocated at runtime. On the other hand, the memory of the stack is allocated at compile time. Additionally, the heap memory has to be manually allocated compared to the stack, which is allocated automatically by the compiler. Finally, the access time of the heap takes more than the stack.

Describe the data structure implemented in a heap memory.

In terms of data structure, the heap is more like a linked list of records, compared to a stack which is like an array.

How are allocated and unallocated chunks structured? (Show a diagram)

This is allocated during runtime, and it is allocated manually by the programmer using malloc() or free() to deallocate such memory. Since heap memory is not contiguous, the structure of a chunk is different of an allocated chunk in comparison to an unallocated chunk.

Figure 4. Allocated Heap Chunk

Figure 5. Unallocated Heap Chunk

As seen in the figures 4 and 5 above, whenever memory from the heap is freed, the unallocated chunk is structured differently, creating two new pointers (Forward pointer and Back pointer), in order to keep track of the chunks within the heap.

Is heap memory contiguous within the memory architecture? (Yes or No and why?)

No, since the memory at the heap is allocated in any random order. which is reason of its main issue, fragmentation.

Testing Program – Heap Buffer Overflow

Write a testing program (in C) that contains a heap buffer overflow vulnerability. (Provide an example in the project. Copy/paste is fine. No Screenshot). Again, you do not have to exploit it.

Show what the heap layout looks like and explain how to exploit it. (Include a diagram)


#include <stdio.h>

struct Chunk {
    char buf[16];
    void (* message)(char *);
};

int main(int argc, char *argv[])
{
    var1 = malloc(sizeof(struct Chunk));
    var2 = malloc(sizeof(struct Chunk));

    var1->message = &print_var1;
    var2->message = &print_var2;

    printf("Input var1: ");
    fgets(var1->buf, 20, stdin);
    var1->buf[strcspn(var1->buf, "\n")] = 0;

    printf("Input var2: ");
    fgets(var2->buf, 10, stdin);
    var2->buf[strcspn(var2->buf, "\n")] = 0;

    var1->message(var1->buf);
    var2->message(var2->buf);
}

Code Snippet 2. Heap Buffer Overflow Exploit

Given the piece of code above, the length of buf is not checked whenever is passed to the fgets, which clearly shows a buffer overflow vulnerability. In order to exploit the program, first, we can see that after 16 bytes for var1, it overwrites the heap metadata of other chunks, making var2 unusable. It can be noted that the heap does have a return address like the stack.

Figure 6. Heap Overflow Exploit example

Exploiting Buffer Overflow

Laptop, OSX 10.14.3, 64-bit

Give an explanation about how you figured out the exploit.

By applying the same methodology in task 1; we can see from the code that the array is not checked for the length so, in order to exploit the sort program, since we are dealing with an array. First, it is needed to get the number of items required to overwrite the return address. To get that number of items, on the data.txt, I started adding items in ascending order to avoid the if statement of the program, which sorts the addresses. In this case, a segmentation fault occurred whenever there was 17 items, and the return address was overwritten on the 18th item. By this point, we can use the GBD program to get the addresses of system, _exit and /bin/sh, in order to be able to append it to our data.txt right after the 16th item and make it so it can transfer the control to system, be able to run /bin/sh as a parameter and exit cleanly. In this case, the addresses are: 0xb7e57190, 0xb7eccbc4 and 0xb7f77a24 for the system(), _exit() and /bin/sh respectively. It can be noted that in order for this to work, the output of the sorted list has to include the three addresses at the end in their respective order. So for the values that were chosen, all of the values are lower than the address 0xb7e57190 for it to make sure the three addresses needed to exploit the buffer overflow works.

Open Question

Explain the similarities and differences between Jump-Oriented Programing and Return- Oriented Programing.

Both Return-Oriented Programming (ROP) and Jump-Oriented Programming (JOP) are a class of code-reuse attack, which exploit the buffer overflow to load a set of addresses into memory to execute and take over the control flow. However, Return-Oriented Programming focuses on short code sequences ending in a ret instruction, which hijacks the control flow and perform malicious operations by taking control of the stack. These are found within the existing binaries. On the other hand, Jump-Oriented Programming is a new class of code reuse attack, which does not rely on the stack and ret instructions compared to the Return-Oriented Programming. This is done by a dispatcher gadget, which dispatch and executes the functional snippets of code. In addition, instead of ending in a ret instruction, JOP uses the jmp instruction to redirect the flow to the dispatcher gadget, to be able to execute the attack.

What protection measures do they overcome or are vulnerable to?

Return-Oriented Programming suffers from some vulnerabilities, which because it depends on the stack and ret instructions, a number of defenses has been implemented to detect or prevent ROP-based attacks. These are DynIMA, which detect the consecutive execution ending with ret instructions, and DROP, which observes executions that pops up return addresses that always point to the same specific memory space. On the other hand, Jump-Oriented Programming since it does not rely on the stack or the ret instructions for control flow, all known techniques to defend against ROP-based attacks are overcame. However, given the fact that JOP uses jmp instructions to base its attack, it performs an uni-directional control-flow to transfer to its target, which makes it difficult to get control back for the next jump-oriented gadget.

Why might you use one over the other?

While JOP seems to overcome all the defenses against ROP, JOP requires a dispatcher gadget, which is fundamental to keep track of all other functional gadgets. In addition, JOP requires gadgets that end in indirect branches, which constructing the attack code manually is very complex compared to ROP. However, while deciding which one might be used over the other, it rather depends on the behavior of allowing entry to any address in an executable program or library.