Challenge overview
The binary counting reads 32 bytes from standard input and prints either:
Correct!if every per-byte check succeedsWrong!otherwise
Internally it maintains a global 64-bit state in rbx. For each input byte i (0 through 31) it runs a deterministic transformation that mutates rbx. Then it compares the resulting rbx against a built-in table of 32 expected values stored in .data. If the comparison matches for all indices, it prints Correct!.
Quick triage
From the command line:
file counting
It is an x86-64 ELF, statically linked.
To view the important code and the expected table, we can use objdump:
objdump -d -Mintel ./counting
objdump -s -j .data ./counting
What the binary does (start to finish)
1) Program start and input read
The entry point _start seeds the state and reads 32 bytes from stdin into a buffer in .bss:
0000000000401000 <_start>:
401000: xor r12,r12
401003: inc r12
401006: mov ebx,0x12345678
40100b: inc r12
40100e: xor r15,r15
401011: inc r12
401014: lea rsi,[rip+0x10f5] # 402110 <__bss_start>
40101b: inc r12
40101e: lea r14,[rip+0xfeb] # 402010 <expected>
401025: inc r12
401028: mov eax,0x0
40102d: inc r12
401030: mov edi,0x0
401035: inc r12
401038: mov edx,0x20
40103d: inc r12
401040: syscall
Important registers here:
rsipoints to the input buffer (at virtual address402110)r14points to the expected table (at virtual address402010)ebxis seeded with0x12345678r15is the byte index and starts at 0
The syscall at 401040 is read(0, rsi, 0x20):
eax = 0selects syscall number 0 on Linux (read)edi = 0selects file descriptor 0 (stdin)rdx = 0x20is the byte count (32)
After the read, the program clears the “how many indices matched” counter:
401045: xor r11,r11
2) Per-byte outer loop
The main loop runs i = r15 from 0 to 31:
000000000040104b <main_loop>:
40104b: cmp r15,0x20
40104f: jge 40113b <done>
Each iteration reads one input byte:
401058: movzx eax,BYTE PTR [rsi+r15*1]
Then it computes a loop counter r9 from the current global state rbx and the input byte:
401060: mov r8,rax
401066: xor r8,rbx
401072: and r8,0xff
40107c: add r8,0x5
401083: mov r9,r8
So r9 = ((rbx XOR byte) AND 0xff) + 5.
Also note this instruction:
401060: mov r13,r12
r13 snapshots r12 at the start of processing each byte.
The abnormal inc r12 counter
One striking property of this binary is that it executes inc r12 extremely frequently. In the disassembly above you can already see it right after the function prologue and repeated after most operations.
This is not dead code. The value in r12 is used as part of the state mixing:
- In
path_athe code doesxor rbx, r12 - In
path_bthe code doesadd rbx, r12and then multiplies byrbx - In the check it uses the current
r12value (mov rdx, r12) and subtracts the snapshot (sub rdx, r13)
So the effective transformation for byte index i depends on a rapidly changing counter that advances on almost every executed instruction. That is why the program feels “more procedural” than a normal hand-written checksum: r12 is acting like a time-like instruction counter that gets folded into the arithmetic.
3) Inner loop: mutating rbx based on the guessed byte
The inner loop runs while r9 != 0:
0000000000401089 <byte_loop>:
401089: test r9,r9
40108c: je 4010d7 <byte_done>
At each inner-loop iteration:
- It computes a temporary value
r10:
401091: mov r10,r9
401097: xor r10,rax
40109d: and r10,0x1
Here rax still holds the input byte.
- It branches on
r10:
4010a4: cmp r10,0x0
4010a8: jne 4010bf <path_b>
So:
- if
((r9 XOR byte) AND 1) == 0it takespath_a - otherwise it takes
path_b
path_a:
00000000004010ad <path_a>:
4010ad: xor rbx,r12
4010b3: rol rbx,0x3
path_b:
00000000004010bf <path_b>:
4010bf: add rbx,r12
4010c5: imul rbx,rbx,0x7
After either path, it decrements r9 and repeats:
00000000004010cc <path_end>:
4010cc: dec r9
4010d2: jmp 401089 <byte_loop>
4) “Finish mixing” and the actual per-byte check
Once the inner loop ends, it runs the check code at byte_done:
00000000004010d7 <byte_done>:
4010d7: mov rdx,r12
4010dd: sub rdx,r13
4010e3: mov rcx,r15
4010e9: and rcx,0x7
4010f0: rol rdx,cl
4010f6: xor rbx,rdx
4010fc: rol rbx,0x5
401103: mov rax,rbx
401109: mov r8,QWORD PTR [r14+r15*8]
401110: xor rax,r8
401116: cmp rax,0x0
40111a: sete al
401120: movzx r10,al
401127: add r11,r10
What this means:
rdx = r12 - r13- it rotates
rdxby(r15 & 7) - it applies
rbx = ROL( rbX XOR rdx, 5 ) - then it compares the resulting
rbxtoexpected[r15]
The comparison is implemented as:
rax = rbxr8 = expected[r15]rax ^= r8sete alsetsal = 1ifrax == 0, elseal = 0r10 = zero_extend(al)givesr10as 0 or 1- finally
r11 += r10
So r10 is the per-byte success bit for the current index, and r11 counts how many byte indices matched so far.
After that, it increments the outer-loop index:
40112d: inc r15
401133: jmp 40104b <main_loop>
5) Final decision: all 32 bytes must match
When the outer loop ends:
000000000040113b <done>:
40113b: cmp r11,0x20
40113f: jne 401170 <fail>
So it prints success only if r11 == 32.
On success, it prints Correct!:
0000000000401144 <success>:
401144: mov eax,0x1
40114c: mov edi,0x1
401154: lea rsi,[rip+0xea5] # 402000 <success_msg>
40115e: mov edx,0x9
401166: syscall
On failure, it prints Wrong!:
0000000000401170 <fail>:
401180: lea rsi,[rip+0xe82] # 402009 <fail_msg>
40118a: mov edx,0x7
401192: syscall
The expected table
The expected values are stored in .data starting at 0x402010 and consist of 32 little-endian 64-bit values (qwords). Extracting the raw table from the binary yields:
expected[0] = 0xe1a3610df5d93da8
expected[1] = 0x3e232e23b5dc0997
expected[2] = 0x81513ba5e927a45
expected[3] = 0xa1936838fa7261de
expected[4] = 0x5033cd38fb530b5
expected[5] = 0xe70ebb4a0fcea677
expected[6] = 0x6b187da7cdcc6df2
expected[7] = 0x2e797688f4a8a44d
expected[8] = 0x1d24cde94345ff4c
expected[9] = 0xa102f94e8cd3cd4a
expected[10] = 0x6c60ea6e41852aff
expected[11] = 0x3e818bf8f946bb78
expected[12] = 0xd6835df6aa72dc0b
expected[13] = 0xd89e795c51f8cbef
expected[14] = 0x50851bf3e0861d60
expected[15] = 0x4e4360593b6224b7
expected[16] = 0x764fba21d26da7e2
expected[17] = 0x4799e79c99c41ac7
expected[18] = 0x81d0af7c6b2613b4
expected[19] = 0x71274df9ca213b40
expected[20] = 0x6ba349848895752b
expected[21] = 0xacce40e065cadc7a
expected[22] = 0x68df30b65b850658
expected[23] = 0xf08dcaf63a96c712
expected[24] = 0x3015bcbbd307a0aa
expected[25] = 0xb0d564fa30e3b4d5
expected[26] = 0xeb53d0b48a2ac18
expected[27] = 0x290b07cb7da9d12
expected[28] = 0xc41c86b53ce5886c
expected[29] = 0xda740a01a04f1782
expected[30] = 0xaf7b5574fed2d5a0
expected[31] = 0x6e1f58ac94050c54
You do not need to reproduce these values to solve the challenge, but it helps confirm what is being compared.
How to solve it
Key observation
The binary never prints which byte was wrong. But in the check code it computes a 0/1 value and immediately adds it into r11:
40111a: sete al
401120: movzx r10,al
401127: add r11,r10
At the moment execution reaches address 0x401127 (the add r11,r10 instruction), register r10 is already set to:
1ifrbx == expected[r15]0otherwise
Also, at that same moment, $r15 is the current outer-loop index i.
So if we run the program under gdb and break exactly at 0x401127 with a condition if $r15 == i, then reading r10 tells us whether our current guess byte at index i is correct.
Why we can brute-force byte-by-byte
Even though rbx depends on all previous bytes, the program is deterministic. That means:
- if you already fixed bytes
0..i-1correctly - then for byte
iyou only need to search the remaining byte value so that the check at indexipasses
Because the check at each index is immediately verifiable with the breakpoint trick above, you can build the flag sequentially from left to right.
Brute-force strategy
- Fix bytes
0..i-1to the values we already found. - For each candidate character
c:- construct an input buffer of length 32 where byte
i = cand all other bytes are arbitrary filler - run the binary under gdb (the breakpoint triggers and the gdb script immediately exits)
- break at
0x401127only when$r15 == i - read
r10
- construct an input buffer of length 32 where byte
- The correct byte is the one where
r10 == 1. - Repeat for
i = 0..31.
The provided automated solver
The repository includes challenges/reven_WhosCounting/solve.py.
It implements exactly the strategy above, with two practical improvements:
- It auto-locates the breakpoint address.
- In AT&T syntax, the check instruction is printed as
add %r10,%r11. - The script uses
objdump -dand searches for that instruction to get the address.
- In AT&T syntax, the check instruction is printed as
- It uses gdb in batch mode for speed and determinism.
For each candidate byte it writes input.txt and runs gdb with a script like:
set pagination off
set confirm off
file ./counting
break *0x401127 if $r15 == 8
run < input.txt
printf "R10=%lx\n", $r10
quit
It then parses the printed R10=... line and keeps the byte when it equals 1.
By default, the script brute-forces all 32 bytes starting from index 0.
If you already know a prefix of the flag (for example, many CTFs start with MetaCTF{), you can still use that knowledge to speed things up. The brute-force method works left-to-right because once bytes 0..i-1 are correct, byte i becomes independent to determine via the r10 success bit at 0x401127. So you can either:
- start solving from byte index
kinstead of0(leave bytes0..k-1fixed to your known prefix), or - modify the script to pre-seed
resultwith the known prefix bytes and then continue brute-forcing fromk.
solve.py
#!/usr/bin/env python3
import argparse
import os
import re
import string
import subprocess
import sys
from typing import List, Optional
RE_BREAK = re.compile(r"^\s*([0-9a-fA-F]+):.*add\s+%r10,%r11\b")
def find_add_one_break_addr(binary_path: str) -> int:
"""
Locate the instruction that does `add %r10, %r11` in the disassembly.
That is the spot where `r10` is 0/1 depending on whether the current byte check passes.
"""
out = subprocess.check_output(["objdump", "-d", binary_path], text=True, stderr=subprocess.DEVNULL)
for line in out.splitlines():
m = RE_BREAK.match(line)
if not m:
continue
return int(m.group(1), 16)
raise RuntimeError("Could not locate `add %r10,%r11` in objdump output")
def run_gdb_check(
binary_path: str,
break_addr: int,
byte_index: int,
candidate_byte: int,
prefix_bytes: List[int],
total_len: int,
filler_byte: int,
) -> Optional[int]:
"""
Returns 0/1 value from r10 when the conditional breakpoint triggers for $r15 == byte_index.
If the breakpoint didn't trigger / parsing fails, returns None.
"""
buf = bytearray([filler_byte] * total_len)
buf[:byte_index] = bytes(prefix_bytes)
buf[byte_index] = candidate_byte
# Binary reads exactly total_len bytes from stdin, so appending a newline is harmless.
input_payload = bytes(buf) + b"\n"
with open("input.txt", "wb") as f:
f.write(input_payload)
gdb_script = f"""set pagination off
set confirm off
file {binary_path}
break *0x{break_addr:x} if $r15 == {byte_index}
run < input.txt
printf "R10=%lx\\n", $r10
quit
"""
with open("gdb.txt", "w", encoding="utf-8") as f:
f.write(gdb_script)
res = subprocess.run(
["gdb", "-q", "--batch", "-x", "gdb.txt"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
)
for line in res.stdout.splitlines():
if "R10=" in line:
# Example: `R10=0` or `R10=1`
try:
return int(line.split("R10=", 1)[1].strip(), 16)
except ValueError:
return None
return None
def brute_force_flag(
binary_path: str,
total_len: int,
charset: bytes,
filler_byte: int,
) -> bytes:
break_addr = find_add_one_break_addr(binary_path)
print(f"[+] Using breakpoint address: 0x{break_addr:x}")
result: List[int] = []
for i in range(total_len):
print(f"[+] Solving byte {i}...")
prefix = result[:] # bytes for indices [0, i)
for c in charset:
v = run_gdb_check(
binary_path=binary_path,
break_addr=break_addr,
byte_index=i,
candidate_byte=c,
prefix_bytes=prefix,
total_len=total_len,
filler_byte=filler_byte,
)
if v == 1:
result.append(c)
print(f" -> Byte {i}: {chr(c)!r}")
break
else:
raise RuntimeError(f"Failed to find a matching byte at index {i}")
return bytes(result)
def main() -> None:
ap = argparse.ArgumentParser(description="Solve `reven_WhosCounting` using gdb breakpoint brute-force.")
ap.add_argument("--len", dest="total_len", type=int, default=32, help="Number of bytes to read (default: 32)")
ap.add_argument(
"--charset",
default=None,
help="Charset to brute force (default: printable ASCII 0x20..0x7e).",
)
ap.add_argument("--filler", default="A", help="Fill for unknown bytes beyond the current index (default: 'A').")
args = ap.parse_args()
# Run from the challenge directory so that gdb can read input.txt/gdb.txt.
here = os.path.dirname(os.path.abspath(__file__))
os.chdir(here)
binary_path = os.path.join(here, "counting")
if not os.path.exists(binary_path):
raise FileNotFoundError(f"Binary not found: {binary_path}")
if args.charset is None:
charset = bytes(range(0x20, 0x7F)) # printable ASCII
else:
charset = args.charset.encode("latin-1", errors="strict")
if len(args.filler) != 1:
raise ValueError("--filler must be exactly one character")
filler_byte = ord(args.filler)
flag = brute_force_flag(
binary_path=binary_path,
total_len=args.total_len,
charset=charset,
filler_byte=filler_byte,
)
print("[+] Flag:", flag.decode("ascii"))
if __name__ == "__main__":
main()