Overview
Python typically compiles .py source into bytecode before running it. Bytecode is a sequence of instructions (load a value, call a function, jump, compare, etc.) similar to assembly that the Python virtual machine executes.
This design gives Python several advantages. Bytecode is mostly platform-independent, although it is tied to the interpreter version that produced it. It also makes dynamic features such as runtime type changes, reflection, dynamic imports, and on-the-fly code execution easier to support than in ahead-of-time compiled languages like C.
You can use the dis module from Python's standard library to disassemble any function and see its bytecode for yourself:
import dis

def some_function():
    ...

dis.dis(some_function)
In this challenge, you were presented with dis.dis() output of a flag checker function.
Reverse engineering the bytecode
The function validates an input string called flag. It rejects (returns False) on any mismatch and only succeeds (returns True) if every check matches.
Python bytecode operates on a stack. Instructions push values onto the stack, pop them off for operations, and compare or store results.
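To make the push/pop behaviour concrete, here is a minimal sketch of a toy stack machine. It is not CPython's interpreter: the `run` helper and the `CALL_LEN` and `COMPARE_NE` opcode names are hypothetical, invented for illustration, and only loosely mirror the real instructions.

```python
# Toy model of a stack machine; NOT CPython's real interpreter loop.
def run(instructions, variables):
    stack = []
    for op, arg in instructions:
        if op == "LOAD_CONST":
            stack.append(arg)                # push a constant
        elif op == "LOAD_FAST":
            stack.append(variables[arg])     # push a local variable's value
        elif op == "CALL_LEN":               # hypothetical stand-in for LOAD_GLOBAL len + CALL 1
            stack.append(len(stack.pop()))   # pop the argument, push len(argument)
        elif op == "COMPARE_NE":             # hypothetical stand-in for COMPARE_OP (!=)
            right = stack.pop()
            left = stack.pop()
            stack.append(left != right)      # push the boolean result
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()

# Evaluates len(flag) != 30 one instruction at a time.
program = [
    ("LOAD_FAST", "flag"),
    ("CALL_LEN", None),
    ("LOAD_CONST", 30),
    ("COMPARE_NE", None),
]

print(run(program, {"flag": "A" * 30}))  # False
print(run(program, {"flag": "short"}))   # True
```

Each instruction only ever touches the top of the stack, which is why a listing can be read top to bottom while keeping track of what is currently on the stack.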
In the full dis.dis() output, the code is broken into sections separated by a blank line, each starting with a number in the first column. That number is the line in the original source code that the following bytecode was compiled from, which makes the listing easier to review one statement at a time.
The number at the start of each instruction is its byte offset: the position (in bytes) where that instruction starts in the compiled bytecode sequence. We can mostly ignore it here, except when following a jump target such as "to 32".
Let’s walk through the first few instructions:
2 LOAD_GLOBAL 1 (NULL + len)
12 LOAD_FAST 0 (flag)
14 CALL 1
22 LOAD_CONST 1 (30)
24 COMPARE_OP 55 (!=)
28 POP_JUMP_IF_FALSE 1 (to 32)
30 RETURN_CONST 2 (False)
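LOAD_GLOBAL pushes the len function onto the stack (the NULL shown in the listing is a placeholder that CALL later consumes along with it). LOAD_FAST pushes the flag parameter. CALL 1 pops the function and its one argument, calls len(flag), and pushes the result. LOAD_CONST pushes 30. COMPARE_OP pops both values, compares them with !=, and pushes the boolean result. POP_JUMP_IF_FALSE pops that result: if it is true (the length is not 30), execution falls through to RETURN_CONST and the function returns False. If it is false, execution jumps to offset 32 and continues.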
In other words, this section corresponds to if len(flag) != 30: return False
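One way to sanity-check a reading like this is to compile the suspected source and disassemble it yourself. The sketch below assumes a small stand-in function, `length_check`, which is not part of the challenge. Opcode names and arguments differ between CPython versions, so the output may not match the listing above exactly.

```python
import dis

def length_check(flag):
    # Same guard as the first section of the challenge function.
    if len(flag) != 30:
        return False
    return True

# Print offset, opcode name, and a readable argument for each instruction.
for ins in dis.get_instructions(length_check):
    print(ins.offset, ins.opname, ins.argrepr)

# Collect the opcode names so they can be compared against the listing.
opnames = [ins.opname for ins in dis.get_instructions(length_check)]
```

If the opcode sequence lines up with the challenge listing (allowing for version differences), the reconstruction of that section is very likely right.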
Later in the bytecode:
98 LOAD_GLOBAL 5 (NULL + ord)
108 LOAD_FAST 0 (flag)
110 LOAD_FAST 3 (i)
112 BINARY_SUBSCR
116 CALL 1
124 LOAD_CONST 7 (181)
126 BINARY_OP 12 (^)
This loads ord, then loads flag[i] using BINARY_SUBSCR (subscript operator), calls ord(flag[i]), loads 181, and performs XOR (^). This section corresponds to ord(flag[i]) ^ 181
You can manually trace through the entire bytecode this way or use a tool to help.
In reconstructed Python, the function looks like this:
def check(flag):
    if len(flag) != 30:
        return False
    p2 = [208, 212, 225, 206, 219, 222, 234, 219, 193, 208, 215, 193, 214, 209, 200]
    p1 = [110, 87, 96, 101, 80, 66, 70, 74, 124, 75, 124, 90, 70, 76, 70]
    for i in range(1, len(flag), 2):
        if (ord(flag[i]) ^ 181) != p2[i // 2]:
            return False
    for i in range(0, len(flag), 2):
        if (ord(flag[i]) ^ 35) != p1[i // 2]:
            return False
    return True
Finding the flag
There are two loops. The first loop checks indices 1, 3, 5, and so on. The other loop checks indices 0, 2, 4, and so on.
Each loop uses i // 2 to map the string index into a 15-element table. That matches the fact that a 30-character string has 15 even indices and 15 odd indices.
Each character is converted to an integer with ord(), XORed with a constant, and compared to a stored value. XOR is reversible. If a ^ k = b, then a = b ^ k. That lets us recover the original characters from the tables:
even_chars = [chr(x ^ 35) for x in p1]
odd_chars = [chr(x ^ 181) for x in p2]
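The XOR round trip can be checked on a single value. This is a small sketch using the even-index constant 35. The character "M" is just an example chosen for illustration, and the variable names are not from the challenge.

```python
k = 35                     # XOR constant from the even-index loop
original = ord("M")        # 77, an example character
stored = original ^ k      # the value the checker would compare against
recovered = stored ^ k     # XOR with the same constant undoes the first XOR

print(stored, chr(recovered))  # 110 M
```

Because (a ^ k) ^ k == a for any integers a and k, the same expression both encodes and decodes.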
Then interleave them to rebuild the full string:
out = []
for j in range(15):
    out.append(even_chars[j])
    out.append(odd_chars[j])
flag = "".join(out)
print(flag)
Running this produces the single 30-character string accepted by the bytecode, which is the flag.
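As a final sanity check, the recovered string can be fed back into the reconstructed checker. The following is a self-contained sketch that repeats the tables and the reconstructed function from above. The `solve` helper and the upper-case table names are illustrative and not part of the original challenge.

```python
# Tables copied from the reconstructed function.
P1 = [110, 87, 96, 101, 80, 66, 70, 74, 124, 75, 124, 90, 70, 76, 70]             # even indices, XOR 35
P2 = [208, 212, 225, 206, 219, 222, 234, 219, 193, 208, 215, 193, 214, 209, 200]  # odd indices, XOR 181

def check(flag):
    """Reconstructed checker: True only if every character matches."""
    if len(flag) != 30:
        return False
    for i in range(1, len(flag), 2):
        if (ord(flag[i]) ^ 181) != P2[i // 2]:
            return False
    for i in range(0, len(flag), 2):
        if (ord(flag[i]) ^ 35) != P1[i // 2]:
            return False
    return True

def solve():
    """Invert the XOR on both tables and interleave the results."""
    even_chars = [chr(x ^ 35) for x in P1]
    odd_chars = [chr(x ^ 181) for x in P2]
    return "".join(e + o for e, o in zip(even_chars, odd_chars))

flag = solve()
print(len(flag), check(flag))         # 30 True
print(check(flag[:-1] + "?"))         # False: changing one character fails the check
```

Since every position is pinned to exactly one value by its table entry, the string returned by `solve` is the only input the checker accepts.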