Flash CTF – Slithering Bytes

Overview

Python typically compiles .py source into bytecode before running it. Bytecode is a sequence of instructions (load a value, call a function, jump, compare, etc.) similar to assembly that the Python virtual machine executes.

This design gives Python several advantages. Bytecode is mostly platform-independent. It also enables dynamic features such as runtime type changes, reflection, dynamic imports, and on-the-fly code execution, all of which are difficult to support in compiled languages like C.

You can use the dis module in Python to generate the bytecode of any function and see it for yourself:

import dis

def some_function():
    ...

dis.dis(some_function)

In this challenge, you were presented with dis.dis() output of a flag checker function.

Reverse engineering the bytecode

The function validates an input string called flag. It rejects (returns False) on any mismatch and only succeeds (returns True) if every check matches.

Python bytecode operates on a stack. Instructions push values onto the stack, pop them off for operations, and compare or store results.

You can see that the code is broken into sections separated by a newline and a number in the first column. This is a debugging artifact that tells you which line number in the source code that bytecode corresponds to. This makes it easier for us to review.

The first number in each line for each instruction is the byte offset in the bytecode. It tells you the position (in bytes) where that instruction starts in the compiled bytecode sequence. We can ignore it for this.

Let’s walk through the first few instructions:

 2 LOAD_GLOBAL              1 (NULL + len)
12 LOAD_FAST                0 (flag)
14 CALL                     1
22 LOAD_CONST               1 (30)
24 COMPARE_OP              55 (!=)
28 POP_JUMP_IF_FALSE        1 (to 32)
30 RETURN_CONST             2 (False)

LOAD_GLOBAL pushes the len function onto the stack. LOAD_FAST pushes the flag parameter. CALL 1 pops both, calls len(flag), and pushes the result. LOAD_CONST pushes 30COMPARE_OP 55 pops both values and compares them with !=. If true, the jump is skipped and the function returns False. Otherwise execution continues.

In other words, this section corresponds to if len(flag) != 30: return False

Later in the bytecode:

 98 LOAD_GLOBAL              5 (NULL + ord)
108 LOAD_FAST                0 (flag)
110 LOAD_FAST                3 (i)
112 BINARY_SUBSCR
116 CALL                     1
124 LOAD_CONST               7 (181)
126 BINARY_OP               12 (^)

This loads ord, then loads flag[i] using BINARY_SUBSCR (subscript operator), calls ord(flag[i]), loads 181, and performs XOR (^). This section corresponds to ord(flag[i]) ^ 181

You can manually trace through the entire bytecode this way or use a tool to help.

In reconstructed Python, the function looks like this:

def check(flag):
    if len(flag) != 30:
        return False

    p2 = [208, 212, 225, 206, 219, 222, 234, 219, 193, 208, 215, 193, 214, 209, 200]
    p1 = [110, 87, 96, 101, 80, 66, 70, 74, 124, 75, 124, 90, 70, 76, 70]

    for i in range(1, len(flag), 2):
        if (ord(flag[i]) ^ 181) != p2[i // 2]:
            return False

    for i in range(0, len(flag), 2):
        if (ord(flag[i]) ^ 35) != p1[i // 2]:
            return False

    return True

Finding the flag

There are two loops. The first loop checks indices 1, 3, 5, and so on. The other loop checks indices 0, 2, 4, and so on.

Each loop uses i // 2 to map the string index into a 15-element table. That matches the fact that a 30-character string has 15 even indices and 15 odd indices.

Each character is converted to an integer with ord(), XORed with a constant, and compared to a stored value. XOR is reversible. If a ^ k = b, then a = b ^ k. That lets us recover the original characters from the tables:

even_chars = [chr(x ^ 35) for x in p1]
odd_chars  = [chr(x ^ 181) for x in p2]

Then interleave them to rebuild the full string:

out = []
for j in range(15):
    out.append(even_chars[j])
    out.append(odd_chars[j])

flag = "".join(out)
print(flag)

Running this produces the single 30-character string accepted by the bytecode, which is the flag.