Flash CTF – Steg64

Overview

This challenge abuses the fact that Base64 does not always use every bit of its final data character. Encoders zero the leftover bits, but almost no libraries verify that they are zero when decoding. With that discrepancy, we can hide secret messages in the unused Base64 bits.

Initial Look

Reviewing the challenge file, we’re given a list of 52 strings, each of which appears to be Base64-encoded.

VEh=
QT==
Tl==
S5==
IF==
Wd==
Tx==
VY==
IF==
SA==
Qd==
Q1==
Sx==
RR==
Up==
Ie==
Cs==
Ct==
Qh==
VX==
VN==
IJ==
T4==
Vc==
Ut==
IN==
Rt==
TH==
Qc==
R8==
IN==
Se==
Ux==
IN==
SR==
Ts==
II==
Qd==
Th==
T3==
VN==
SI==
RY==
Us==
IF==
Q9==
QQ==
U9==
VF==
TP==
RU==
IQ==

Base64 decoding all of the lines and combining them gives us a message, but the message doesn’t have a flag!

THANK YOU HACKER!

BUT OUR FLAG IS IN ANOTHER CASTLE!
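The naive decoding step can be sketched as follows. This is a minimal illustration using Python's standard `base64` module on only the first four strings from the challenge file; the `lines` list and the variable names are assumptions made for the example, not part of the original solution.

```python
import base64

# First four strings from the challenge file (illustrative subset).
lines = ["VEh=", "QT==", "Tl==", "S5=="]

# Decode each line independently, then join the decoded pieces.
# With its default settings, b64decode silently discards the spare bits.
cover = b"".join(base64.b64decode(line) for line in lines)
print(cover)  # b'THANK'
```

Running the same loop over all 52 lines yields the full cover message, with no sign of the hidden bits.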

There’s no extra whitespace or any other bytes in the file, and since it’s a simple text file, there’s no metadata. Where could the flag be?

Understanding Base64

Fundamentally, Base64 maps one of 64 characters onto each 6-bit chunk of the input message. This allows any binary data to be represented with only 64 characters (65 with padding), most commonly A-Z, a-z, 0-9, + and /. Every three bytes of input (24 bits) are therefore encoded as four output characters. For example, the input bytes “ABC” are encoded as the Base64 output “QUJD”.

This works perfectly when the input length is a multiple of three bytes, because 24 bits split evenly into four 6-bit chunks. But what happens when the final group holds only one or two bytes? In this case, the last partially filled 6-bit chunk is completed with zero bits, and the character positions still missing from the four-character group are filled with the padding character ‘=’. This is where the signature = or == at the end of Base64 messages comes from. Continuing with the previous example, the input bytes “AB” are turned into the Base64 output “QUI=”.

Going through that example more carefully in binary reveals a very interesting property. “AB” is the binary data 01000001 01000010. Using the 6-bit mapping, it becomes 010000 010100 001000, plus one padding character. Notice the two extra 0 bits added to the end of the third output character. These bits do not contribute to the decoded output, but encoders almost always set them to zero. What if we wanted to encode extra information into the Base64 message by setting those bits to 11 instead of 00? Instead of QUI=, we get QUL=, and it still decodes to AB, despite the two additional bits of secret information!

QUL= still decodes into AB
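The “AB” walkthrough above can be checked with a short sketch. The helper name `sextets` and the `ALPHABET` constant are illustrative choices, and the decoding relies on the default, lenient behaviour of Python's `base64.b64decode`.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def sextets(b64_text):
    """Return the 6-bit group behind each non-padding Base64 character."""
    return [format(ALPHABET.index(c), "06b") for c in b64_text if c != "="]

canonical = sextets("QUI=")  # ['010000', '010100', '001000']
modified = sextets("QUL=")   # ['010000', '010100', '001011']

# Only the last two bits of the third character differ, and the decoder
# drops exactly those bits, so both strings decode to the same bytes.
decoded_canonical = base64.b64decode("QUI=")
decoded_modified = base64.b64decode("QUL=")
print(canonical, decoded_canonical)
print(modified, decoded_modified)
```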

This means that for messages whose final Base64 chunk holds two out of three bytes, we can hide two extra bits of information. The exact same principle holds true for chunks holding a single byte, except that we can hide up to four additional bits of information.
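To make the capacity rule concrete, here is a hedged sketch of how the hiding side might work. The function `embed` is hypothetical (the challenge author's generator is not shown in this writeup); it encodes a one- or two-byte chunk normally and then ORs the hidden bits into the zeroed low bits of the last data character.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def embed(cover, hidden_bits):
    """Encode 1 or 2 cover bytes and store hidden_bits in the unused low bits."""
    spare = {1: 4, 2: 2}[len(cover)]            # spare bits per chunk size
    if len(hidden_bits) != spare:
        raise ValueError(f"expected {spare} hidden bits")
    encoded = base64.b64encode(cover).decode("ascii")
    pad_at = encoded.index("=")                 # position of the first '='
    last = ALPHABET.index(encoded[pad_at - 1])  # last data character; low bits are zero
    stego_char = ALPHABET[last | int(hidden_bits, 2)]
    return encoded[:pad_at - 1] + stego_char + encoded[pad_at:]

two_byte = embed(b"AB", "11")   # two spare bits
one_byte = embed(b"A", "0011")  # four spare bits
print(two_byte, one_byte)       # QUL= QT==
```

Note that `QT==` is the second line of the challenge file: a cover byte “A” carrying the hidden bits 0011.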

Solution Script

# Base64 character set
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def base64_to_bits(b64_string):
    # Convert each Base64 character to its 6-bit binary representation
    bits = ''.join(format(BASE64_CHARS.index(char), '06b') for char in b64_string if char in BASE64_CHARS)
    return bits

# Read the challenge strings, one per line
with open('steg64.txt', 'r') as f:
    base64_strings = f.readlines()

total_unhidden = ''
total_hidden = ''

for s in base64_strings:
    bits = base64_to_bits(s)
    # Whole bytes form the cover message; the leftover 2 or 4 bits are the hidden payload
    unhidden = bits[:(len(bits) // 8) * 8]
    hidden = bits[len(unhidden):]
    total_unhidden += unhidden
    total_hidden += hidden

# Regroup each bit string into bytes (a short trailing group is parsed as its own byte)
cover_bytes = bytes([int(total_unhidden[i:i+8], 2) for i in range(0, len(total_unhidden), 8)])
print("Cover Message:", cover_bytes)
hidden_bytes = bytes([int(total_hidden[i:i+8], 2) for i in range(0, len(total_hidden), 8)])
print("Hidden Message:", hidden_bytes)

Running the script prints both messages:

python3 ./solve.py
Cover Message: b'THANK YOU HACKER!\n\nBUT OUR FLAG IS IN ANOTHER CASTLE!'
Hidden Message: b'MetaCTF{4_f3w_3xtr4_b1t5}\x00'

Further Reading

This challenge was heavily inspired by this blog post written by Gynvael Coldwind. Check it out if you found this challenge fun; it covers some more interesting things you can do with Base64 that this challenge doesn’t!