Flash CTF – Who’s Counting

Challenge overview

The binary counting reads 32 bytes from standard input and prints either:

  • Correct! if every per-byte check succeeds
  • Wrong! otherwise

Internally it maintains a global 64-bit state in rbx. For each input byte i (0 through 31) it runs a deterministic transformation that mutates rbx. Then it compares the resulting rbx against a built-in table of 32 expected values stored in .data. If the comparison matches for all indices, it prints Correct!.

Quick triage

From the command line:

file counting

It is an x86-64 ELF, statically linked.

To view the important code and the expected table, we can use objdump:

objdump -d -Mintel ./counting
objdump -s -j .data ./counting

What the binary does (start to finish)

1) Program start and input read

The entry point _start seeds the state and reads 32 bytes from stdin into a buffer in .bss:

0000000000401000 <_start>:
  401000:	xor    r12,r12
  401003:	inc    r12
  401006:	mov    ebx,0x12345678
  40100b:	inc    r12
  40100e:	xor    r15,r15
  401011:	inc    r12
  401014:	lea    rsi,[rip+0x10f5]        # 402110 <__bss_start>
  40101b:	inc    r12
  40101e:	lea    r14,[rip+0xfeb]        # 402010 <expected>
  401025:	inc    r12
  401028:	mov    eax,0x0
  40102d:	inc    r12
  401030:	mov    edi,0x0
  401035:	inc    r12
  401038:	mov    edx,0x20
  40103d:	inc    r12
  401040:	syscall

Important registers here:

  • rsi points to the input buffer (at virtual address 402110)
  • r14 points to the expected table (at virtual address 402010)
  • ebx is seeded with 0x12345678
  • r15 is the byte index and starts at 0

The syscall at 401040 is read(0, rsi, 0x20):

  • eax = 0 selects syscall number 0 on Linux (read)
  • edi = 0 selects file descriptor 0 (stdin)
  • rdx = 0x20 is the byte count (32)

After the read, the program clears the “how many indices matched” counter:

  401045:	xor    r11,r11

2) Per-byte outer loop

The main loop runs i = r15 from 0 to 31:

000000000040104b <main_loop>:
  40104b:	cmp    r15,0x20
  40104f:	jge    40113b <done>

Each iteration reads one input byte:

  401058:	movzx  eax,BYTE PTR [rsi+r15*1]

Then it computes a loop counter r9 from the current global state rbx and the input byte:

  401060:	mov    r8,rax
  401066:	xor    r8,rbx
  401072:	and    r8,0xff
  40107c:	add    r8,0x5
  401083:	mov    r9,r8

So r9 = ((rbx XOR byte) AND 0xff) + 5.

Also note this instruction:

  401060:	mov    r13,r12

r13 snapshots r12 at the start of processing each byte.

The abnormal inc r12 counter

One striking property of this binary is that it executes inc r12 extremely frequently. In the disassembly above you can already see it right after the function prologue and repeated after most operations.

This is not dead code. The value in r12 is used as part of the state mixing:

  • In path_a the code does xor rbx, r12
  • In path_b the code does add rbx, r12 and then multiplies by rbx
  • In the check it uses the current r12 value (mov rdx, r12) and subtracts the snapshot (sub rdx, r13)

So the effective transformation for byte index i depends on a rapidly changing counter that advances on almost every executed instruction. That is why the program feels “more procedural” than a normal hand-written checksum: r12 is acting like a time-like instruction counter that gets folded into the arithmetic.

3) Inner loop: mutating rbx based on the guessed byte

The inner loop runs while r9 != 0:

0000000000401089 <byte_loop>:
  401089:	test   r9,r9
  40108c:	je     4010d7 <byte_done>

At each inner-loop iteration:

  1. It computes a temporary value r10:
  401091:	mov    r10,r9
  401097:	xor    r10,rax
  40109d:	and    r10,0x1

Here rax still holds the input byte.

  1. It branches on r10:
  4010a4:	cmp    r10,0x0
  4010a8:	jne    4010bf <path_b>

So:

  • if ((r9 XOR byte) AND 1) == 0 it takes path_a
  • otherwise it takes path_b

path_a:

00000000004010ad <path_a>:
  4010ad:	xor    rbx,r12
  4010b3:	rol    rbx,0x3

path_b:

00000000004010bf <path_b>:
  4010bf:	add    rbx,r12
  4010c5:	imul   rbx,rbx,0x7

After either path, it decrements r9 and repeats:

00000000004010cc <path_end>:
  4010cc:	dec    r9
  4010d2:	jmp    401089 <byte_loop>

4) “Finish mixing” and the actual per-byte check

Once the inner loop ends, it runs the check code at byte_done:

00000000004010d7 <byte_done>:
  4010d7:	mov    rdx,r12
  4010dd:	sub    rdx,r13
  4010e3:	mov    rcx,r15
  4010e9:	and    rcx,0x7
  4010f0:	rol    rdx,cl

  4010f6:	xor    rbx,rdx
  4010fc:	rol    rbx,0x5

  401103:	mov    rax,rbx
  401109:	mov    r8,QWORD PTR [r14+r15*8]
  401110:	xor    rax,r8
  401116:	cmp    rax,0x0
  40111a:	sete   al
  401120:	movzx  r10,al
  401127:	add    r11,r10

What this means:

  • rdx = r12 - r13
  • it rotates rdx by (r15 & 7)
  • it applies rbx = ROL( rbX XOR rdx, 5 )
  • then it compares the resulting rbx to expected[r15]

The comparison is implemented as:

  • rax = rbx
  • r8 = expected[r15]
  • rax ^= r8
  • sete al sets al = 1 if rax == 0, else al = 0
  • r10 = zero_extend(al) gives r10 as 0 or 1
  • finally r11 += r10

So r10 is the per-byte success bit for the current index, and r11 counts how many byte indices matched so far.

After that, it increments the outer-loop index:

  40112d:	inc    r15
  401133:	jmp    40104b <main_loop>

5) Final decision: all 32 bytes must match

When the outer loop ends:

000000000040113b <done>:
  40113b:	cmp    r11,0x20
  40113f:	jne    401170 <fail>

So it prints success only if r11 == 32.

On success, it prints Correct!:

0000000000401144 <success>:
  401144:	mov    eax,0x1
  40114c:	mov    edi,0x1
  401154:	lea    rsi,[rip+0xea5]        # 402000 <success_msg>
  40115e:	mov    edx,0x9
  401166:	syscall

On failure, it prints Wrong!:

0000000000401170 <fail>:
  401180:	lea    rsi,[rip+0xe82]        # 402009 <fail_msg>
  40118a:	mov    edx,0x7
  401192:	syscall

The expected table

The expected values are stored in .data starting at 0x402010 and consist of 32 little-endian 64-bit values (qwords). Extracting the raw table from the binary yields:

expected[0]  = 0xe1a3610df5d93da8
expected[1]  = 0x3e232e23b5dc0997
expected[2]  = 0x81513ba5e927a45
expected[3]  = 0xa1936838fa7261de
expected[4]  = 0x5033cd38fb530b5
expected[5]  = 0xe70ebb4a0fcea677
expected[6]  = 0x6b187da7cdcc6df2
expected[7]  = 0x2e797688f4a8a44d
expected[8]  = 0x1d24cde94345ff4c
expected[9]  = 0xa102f94e8cd3cd4a
expected[10] = 0x6c60ea6e41852aff
expected[11] = 0x3e818bf8f946bb78
expected[12] = 0xd6835df6aa72dc0b
expected[13] = 0xd89e795c51f8cbef
expected[14] = 0x50851bf3e0861d60
expected[15] = 0x4e4360593b6224b7
expected[16] = 0x764fba21d26da7e2
expected[17] = 0x4799e79c99c41ac7
expected[18] = 0x81d0af7c6b2613b4
expected[19] = 0x71274df9ca213b40
expected[20] = 0x6ba349848895752b
expected[21] = 0xacce40e065cadc7a
expected[22] = 0x68df30b65b850658
expected[23] = 0xf08dcaf63a96c712
expected[24] = 0x3015bcbbd307a0aa
expected[25] = 0xb0d564fa30e3b4d5
expected[26] = 0xeb53d0b48a2ac18
expected[27] = 0x290b07cb7da9d12
expected[28] = 0xc41c86b53ce5886c
expected[29] = 0xda740a01a04f1782
expected[30] = 0xaf7b5574fed2d5a0
expected[31] = 0x6e1f58ac94050c54

You do not need to reproduce these values to solve the challenge, but it helps confirm what is being compared.

How to solve it

Key observation

The binary never prints which byte was wrong. But in the check code it computes a 0/1 value and immediately adds it into r11:

  40111a:	sete   al
  401120:	movzx  r10,al
  401127:	add    r11,r10

At the moment execution reaches address 0x401127 (the add r11,r10 instruction), register r10 is already set to:

  • 1 if rbx == expected[r15]
  • 0 otherwise

Also, at that same moment, $r15 is the current outer-loop index i.

So if we run the program under gdb and break exactly at 0x401127 with a condition if $r15 == i, then reading r10 tells us whether our current guess byte at index i is correct.

Why we can brute-force byte-by-byte

Even though rbx depends on all previous bytes, the program is deterministic. That means:

  • if you already fixed bytes 0..i-1 correctly
  • then for byte i you only need to search the remaining byte value so that the check at index i passes

Because the check at each index is immediately verifiable with the breakpoint trick above, you can build the flag sequentially from left to right.

Brute-force strategy

  1. Fix bytes 0..i-1 to the values we already found.
  2. For each candidate character c:
    • construct an input buffer of length 32 where byte i = c and all other bytes are arbitrary filler
    • run the binary under gdb (the breakpoint triggers and the gdb script immediately exits)
    • break at 0x401127 only when $r15 == i
    • read r10
  3. The correct byte is the one where r10 == 1.
  4. Repeat for i = 0..31.

The provided automated solver

The repository includes challenges/reven_WhosCounting/solve.py.

It implements exactly the strategy above, with two practical improvements:

  1. It auto-locates the breakpoint address.
    • In AT&T syntax, the check instruction is printed as add %r10,%r11.
    • The script uses objdump -d and searches for that instruction to get the address.
  2. It uses gdb in batch mode for speed and determinism.

For each candidate byte it writes input.txt and runs gdb with a script like:

set pagination off
set confirm off
file ./counting
break *0x401127 if $r15 == 8
run < input.txt
printf "R10=%lx\n", $r10
quit

It then parses the printed R10=... line and keeps the byte when it equals 1.

By default, the script brute-forces all 32 bytes starting from index 0.

If you already know a prefix of the flag (for example, many CTFs start with MetaCTF{), you can still use that knowledge to speed things up. The brute-force method works left-to-right because once bytes 0..i-1 are correct, byte i becomes independent to determine via the r10 success bit at 0x401127. So you can either:

  • start solving from byte index k instead of 0 (leave bytes 0..k-1 fixed to your known prefix), or
  • modify the script to pre-seed result with the known prefix bytes and then continue brute-forcing from k.

solve.py

#!/usr/bin/env python3
import argparse
import os
import re
import string
import subprocess
import sys
from typing import List, Optional


RE_BREAK = re.compile(r"^\s*([0-9a-fA-F]+):.*add\s+%r10,%r11\b")


def find_add_one_break_addr(binary_path: str) -> int:
    """
    Locate the instruction that does `add %r10, %r11` in the disassembly.
    That is the spot where `r10` is 0/1 depending on whether the current byte check passes.
    """
    out = subprocess.check_output(["objdump", "-d", binary_path], text=True, stderr=subprocess.DEVNULL)
    for line in out.splitlines():
        m = RE_BREAK.match(line)
        if not m:
            continue
        return int(m.group(1), 16)
    raise RuntimeError("Could not locate `add %r10,%r11` in objdump output")


def run_gdb_check(
    binary_path: str,
    break_addr: int,
    byte_index: int,
    candidate_byte: int,
    prefix_bytes: List[int],
    total_len: int,
    filler_byte: int,
) -> Optional[int]:
    """
    Returns 0/1 value from r10 when the conditional breakpoint triggers for $r15 == byte_index.
    If the breakpoint didn't trigger / parsing fails, returns None.
    """
    buf = bytearray([filler_byte] * total_len)
    buf[:byte_index] = bytes(prefix_bytes)
    buf[byte_index] = candidate_byte

    # Binary reads exactly total_len bytes from stdin, so appending a newline is harmless.
    input_payload = bytes(buf) + b"\n"

    with open("input.txt", "wb") as f:
        f.write(input_payload)

    gdb_script = f"""set pagination off
set confirm off
file {binary_path}
break *0x{break_addr:x} if $r15 == {byte_index}
run < input.txt
printf "R10=%lx\\n", $r10
quit
"""
    with open("gdb.txt", "w", encoding="utf-8") as f:
        f.write(gdb_script)

    res = subprocess.run(
        ["gdb", "-q", "--batch", "-x", "gdb.txt"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )

    for line in res.stdout.splitlines():
        if "R10=" in line:
            # Example: `R10=0` or `R10=1`
            try:
                return int(line.split("R10=", 1)[1].strip(), 16)
            except ValueError:
                return None
    return None


def brute_force_flag(
    binary_path: str,
    total_len: int,
    charset: bytes,
    filler_byte: int,
) -> bytes:
    break_addr = find_add_one_break_addr(binary_path)
    print(f"[+] Using breakpoint address: 0x{break_addr:x}")

    result: List[int] = []
    for i in range(total_len):
        print(f"[+] Solving byte {i}...")
        prefix = result[:]  # bytes for indices [0, i)
        for c in charset:
            v = run_gdb_check(
                binary_path=binary_path,
                break_addr=break_addr,
                byte_index=i,
                candidate_byte=c,
                prefix_bytes=prefix,
                total_len=total_len,
                filler_byte=filler_byte,
            )
            if v == 1:
                result.append(c)
                print(f"    -> Byte {i}: {chr(c)!r}")
                break
        else:
            raise RuntimeError(f"Failed to find a matching byte at index {i}")

    return bytes(result)


def main() -> None:
    ap = argparse.ArgumentParser(description="Solve `reven_WhosCounting` using gdb breakpoint brute-force.")
    ap.add_argument("--len", dest="total_len", type=int, default=32, help="Number of bytes to read (default: 32)")
    ap.add_argument(
        "--charset",
        default=None,
        help="Charset to brute force (default: printable ASCII 0x20..0x7e).",
    )
    ap.add_argument("--filler", default="A", help="Fill for unknown bytes beyond the current index (default: 'A').")
    args = ap.parse_args()

    # Run from the challenge directory so that gdb can read input.txt/gdb.txt.
    here = os.path.dirname(os.path.abspath(__file__))
    os.chdir(here)

    binary_path = os.path.join(here, "counting")
    if not os.path.exists(binary_path):
        raise FileNotFoundError(f"Binary not found: {binary_path}")

    if args.charset is None:
        charset = bytes(range(0x20, 0x7F))  # printable ASCII
    else:
        charset = args.charset.encode("latin-1", errors="strict")

    if len(args.filler) != 1:
        raise ValueError("--filler must be exactly one character")
    filler_byte = ord(args.filler)

    flag = brute_force_flag(
        binary_path=binary_path,
        total_len=args.total_len,
        charset=charset,
        filler_byte=filler_byte,
    )

    print("[+] Flag:", flag.decode("ascii"))


if __name__ == "__main__":
    main()