Flash CTF – Carrot

Overview

In this reverse engineering challenge, we are given a malware sample to analyze.

Solution

We are given the file carrot. Running the file command, we see the following:

┌──(kali㉿kali)-[~/Desktop/carrot]
└─$ file carrot 
carrot: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e284377572eb536d9796e663b6216f8e03fa1bcf, for GNU/Linux 4.4.0, stripped

The important things to note here are:

  • The binary is a 64-bit ELF executable
  • PIE is enabled, which means the binary is loaded at a randomized base address, so the addresses we see during static analysis are offsets relative to that base
  • The binary is stripped, meaning symbol names and debugging information have been removed, which makes it more difficult to reverse engineer

Running strings gives us some interesting results, including references to curl, RSA and AES functions, and various strings (URLs, a web header, process names, references to hypervisor software). From what we can see, we can guess that the binary is doing something with web requests and encryption. Perhaps it’s sending out encrypted traffic, which would make sense for a piece of malware. Let’s continue our analysis.

Disassembly and decompilation

For this writeup, we will be using Ghidra to disassemble and decompile carrot, since it’s one of the most common reverse engineering tools and somewhat beginner-friendly.

After importing the binary and opening it up in Ghidra’s CodeBrowser tool, we can analyze the binary using the default settings. After analysis has been completed, we can look at the symbol tree to gain a better understanding of what functions are present in this binary. Unfortunately, this is where stripping the binary makes things a little difficult, as there appears to be no main function present.

To find main(), we can go to the entry function and look at the decompiled code, which should look something like this:

void processEntry entry(undefined8 param_1,undefined8 param_2)  {   
    undefined auStack_8 [8];

    __libc_start_main(FUN_00102450,param_2,&stack0x00000008,0,0,param_1,auStack_8);

    do {                     
        /* WARNING: Do nothing block with infinite loop */   
    } while( true ); 
} 

__libc_start_main() is used to initialize the runtime environment before calling the main function. The function to be treated as main is usually the first argument in a call to __libc_start_main(). In our case, FUN_00102450 is the first argument, so we can double-click on it to view that function’s code. Furthermore, by pressing L we can rename the function to “main”, which will assist in understanding how the binary works as we continue to analyze it.

Let’s dig deeper into main(). We see what appears to be a giant if-else statement that checks the return values for various functions:

iVar1 = FUN_00103531();
if (((((((iVar1 == 0) && (iVar1 = FUN_0010371a(), iVar1 == 0)) &&
         (iVar1 = FUN_00103773(), iVar1 == 0)) &&
        ((iVar1 = FUN_0010356f(), iVar1 == 0 && (iVar1 = FUN_00103622(), iVar1 == 0)))) &&
       ((iVar1 = FUN_001037da(), iVar1 != 0 &&
        ((iVar1 = FUN_001032b6(), iVar1 != 0 && (iVar1 = FUN_00103378(), iVar1 == 0)))))) &&
      (iVar1 = FUN_00103503(), iVar1 != 0)) &&
     ((iVar1 = FUN_00102c87(), iVar1 == 0 && (iVar1 = FUN_00102c5a(), 1 < iVar1)))) {
    __s1 = (char *)FUN_00102793();
    if ((__s1 == (char *)0x0) || (iVar1 = strcmp(__s1,"www"), iVar1 != 0)) {
      FUN_001038fc();
    }
    ...

We also see references to RSA encryption in the block of code that runs if all checks pass, and after clicking through the other functions in that block to skim their code, we see FUN_00103029(), which contains function calls to curl. It seems like our initial guess of encrypting data and sending it over the web might hold some weight, so let’s run with it. We’ll call this function something like send_data and move on for now.

If we’re gonna get the flag, you may have to defeat my 11 evil functionality checks

Moving back to the giant if-else statement we saw earlier, we probably want the function we just saw to run. Perhaps the flag isn’t stored in the malware itself, but rather on the endpoint that the malware is making web requests to. We’ll try to use Ghidra’s patching feature to modify individual instructions, such as changing the condition iVar1 == 0 to iVar1 != 0, to trick the binary into passing all checks.

ptrace anti-debugging

The first function called in the block of code attributed to the if else is FUN_00103531(). Taking a look at the function code, we see the following lines:

  lVar1 = ptrace(PTRACE_TRACEME,0,0,0);
  if (lVar1 != -1) {
    ptrace(PTRACE_DETACH,0,0,0);
  }
  else {
    perror("ptrace");
  }

This technique involves the process calling ptrace() on itself with PTRACE_TRACEME, which asks for the process to be traced by its parent. A process can only have one tracer at a time, so if the call returns -1, the process is already being traced, most likely by a debugger, and the program terminates. To get around this, we could patch the binary to not compare against a value of -1 (see the next section for details), or we could just not use a debugger and hope for the best.
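To make the pattern easier to recognize in other samples, here is a minimal sketch of this kind of self-trace check. It is not the binary’s actual source: the name debugger_present is hypothetical, the PTRACE_DETACH call from the decompiled code is left out, and the return convention (nonzero on detection) is an assumption chosen to line up with the iVar1 == 0 test in main().

```c
#include <stdio.h>
#include <sys/ptrace.h>

/* Hypothetical name. Returns 1 if the process is already being traced,
   0 otherwise. The return convention is an assumption, not recovered code. */
int debugger_present(void)
{
    /* PTRACE_TRACEME asks the kernel to make our parent our tracer.
       The call fails with -1 when a tracer is already attached. */
    if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1) {
        perror("ptrace");
        return 1;
    }
    return 0;
}
```

Started normally, the first call should succeed. Started under gdb or strace, the call fails because the tool is already the tracer. A second call in the same process also fails, since the first one already registered a tracer.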

Sleep patching

The next function, FUN_0010371a(), contains the following snippet of code:

  time_t __time0;
  time_t __time1;
  double dVar1;
  
  __time0 = time((time_t *)0x0);
  if (__time0 != -1) {
    sleep(0x96);
    __time1 = time((time_t *)0x0);
    if (__time1 != -1) {
      dVar1 = difftime(__time1,__time0);
      if (dVar1 < 149.95) {
        return true;
      }
      return 150.25 < dVar1;
    }
  }
  return false;

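This function records the system time using time(), sleeps for 150 seconds (0x96 in hexadecimal), records the system time again after the sleep is complete, and then uses difftime() to calculate how many seconds passed between the two timestamps. If the elapsed time falls between 149.95 and 150.25 seconds, the function returns false (zero), which is what the iVar1 == 0 test in main() expects, and execution continues. If the elapsed time falls outside that window, the function returns true, which causes the giant check in main() to fail. Note that the function also returns false if either call to time() fails.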

This technique aims to prevent the use of a debugger because the system time will still tick while the program may be paused, meaning that the elapsed time could be smaller or larger than the intended value of 150 seconds.
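The window logic can be restated as a small sketch. The bounds 149.95 and 150.25 come straight from the decompiled code; the function names are hypothetical, and splitting the comparison into its own helper is our choice so that it can be tried without waiting for the sleep.

```c
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* True when the measured sleep falls outside the accepted window.
   Bounds taken from the decompiled function. */
bool elapsed_is_suspicious(double elapsed)
{
    return elapsed < 149.95 || elapsed > 150.25;
}

/* Hypothetical reconstruction of the check; true means "fail". */
bool sleep_check(unsigned int seconds)
{
    time_t before = time(NULL);
    if (before == (time_t)-1)
        return false;               /* the original lets this case pass */

    sleep(seconds);

    time_t after = time(NULL);
    if (after == (time_t)-1)
        return false;

    return elapsed_is_suspicious(difftime(after, before));
}
```

Calling elapsed_is_suspicious(150.0) gives false, while elapsed_is_suspicious(1.0) gives true, so shortening the sleep without touching anything else trips the check.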

We’re impatient, so we’ll patch the binary to sleep for one second instead. To do this, we’ll click on the sleep(0x96); line, which will highlight the corresponding line of assembly code. From there, we see the following line of assembly that moves 0x96 into a register:

                             LAB_0010372c                                    XREF[1]:     00103726(j)  
        0010372c bf 96 00        MOV        EDI,0x96
                 00 00

To patch this, all we have to do is press Ctrl + Shift + G while our cursor is on the line, type a new value of 0x1 in place of the original number, then press Enter to save our change.
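Under the hood, this patch only rewrites the four immediate bytes that follow the BF opcode. The sketch below shows the same edit applied to a byte buffer; patch_imm32 is a hypothetical helper and not something Ghidra exposes.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrite the 32-bit immediate that follows a one-byte opcode such as
   0xBF (MOV EDI, imm32). x86 stores immediates in little-endian order,
   so the least significant byte comes first. */
void patch_imm32(uint8_t *insn, uint32_t value)
{
    for (size_t i = 0; i < 4; i++)
        insn[1 + i] = (uint8_t)(value >> (8 * i));
}
```

Applied to the bytes bf 96 00 00 00 with a value of 1, this produces bf 01 00 00 00, which is the encoding of MOV EDI,0x1.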

This change will cause the program to sleep for only 1 second rather than the original 150 seconds, but the check will still fail, because the elapsed time is now well below the 149.95-second lower bound. To deal with that, we can go back to main(), locate the expression that evaluates the return value of the function [(iVar1 = FUN_0010371a(), iVar1 == 0)], and look at the assembly:

        00102485 e8 90 12        CALL       FUN_0010371a                                     undefined FUN_0010371a()
                 00 00
        0010248a 85 c0           TEST       EAX,EAX
        0010248c 75 e9           JNZ        LAB_00102477

We can see that after calling the check function, the program executes a JNZ instruction, which will redirect execution to LAB_00102477 if EAX isn’t zero. Since our shortened sleep makes the function return a nonzero value, that jump would skip the rest of the checks, so instead we’re going to patch the binary to invert the jump.

We can follow the same instructions from earlier when we changed the sleep amount, but instead of changing the data, we can click on the field on the left and modify the instruction from JNZ to JZ and then save. This will cause the program to continue executing if EAX isn’t zero. We can also see the decompiler change the expression to (iVar1 = FUN_0010371a(), iVar1 != 0), which confirms our change.
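At the byte level, this edit changes a single bit. Short conditional jumps use the opcodes 0x70 through 0x7F, and each condition differs from its negation only in the lowest bit of the opcode, so 75 (JNZ) becomes 74 (JZ). The helper below is a hypothetical sketch of that flip on a raw byte buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Invert a short conditional jump (opcodes 0x70-0x7F) in place.
   A condition and its negation differ only in the lowest opcode bit,
   e.g. 0x75 (JNZ) <-> 0x74 (JZ). Returns false for any other opcode
   and leaves the bytes untouched. */
bool invert_short_jcc(uint8_t *insn)
{
    if ((insn[0] & 0xF0) != 0x70)
        return false;
    insn[0] ^= 1;
    return true;
}
```

For the instruction at 0010248c, the bytes 75 e9 become 74 e9. The jump offset e9 stays the same, so the target is still LAB_00102477.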

The next function that gets executed [FUN_00103773()] is similar to the one we just changed, but it uses a do-while loop to simulate sleep(). Nevertheless, we can use the same trick as with the previous function by changing the stored value and patching the corresponding conditional jump in main() to be a JZ instruction.

Here, we’ll look at the while part of the do while loop, which runs the loop while the time difference is less than 150.0 seconds. Viewing the relevant assembly, we see this:

        0010379d f2 0f 10        MOVSD      XMM2,qword ptr [DAT_00104470]                    = 4062C00000000000h
                 15 cb 0c 
                 00 00
        001037a5 66 0f 2f d0     COMISD     XMM2,XMM0
        001037a9 77 dd           JA         LAB_00103788

The program loads the 8-byte value stored at DAT_00104470 into the register XMM2, then uses COMISD to compare XMM2 against XMM0, which holds the time difference. The JA instruction jumps back to the beginning of the loop (LAB_00103788) as long as XMM2 is greater than XMM0, that is, while the time difference is less than 150.0 seconds. To change the value of the data, we can double-click on DAT_00104470 to take us to the data stored in the program:

                             DAT_00104470                                    XREF[1]:     FUN_00103773:0010379d(R)  
        00104470 00 00 00        undefined8 4062C00000000000h
                 00 00 c0 
                 62 40

The “undefined8” string means that Ghidra doesn’t know what type the stored data is. We can fix this by right-clicking on the string, heading to “Data”, and choosing “double”, because the COMISD instruction is used for comparing the value of doubles. Once we do that, we can then press Ctrl + Shift + H which will let us change the value of the data. We’ll change it to 1.0, modify the conditional expression in main() like we did with the previous expression, and move on to the next function.
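If we want to double-check what the raw constant means before retyping it, we can reinterpret the bits ourselves. The sketch below uses two hypothetical helpers and assumes the usual IEEE 754 double representation, which is what x86-64 uses.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as an IEEE 754 double. memcpy is the
   portable way to do this without breaking aliasing rules. */
double bits_to_double(uint64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* The reverse direction, useful for working out what to patch in. */
uint64_t double_to_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Here bits_to_double(0x4062C00000000000) is 150.0, matching the loop condition, and double_to_bits(1.0) is 0x3FF0000000000000. Stored in little-endian order, the patched data would therefore read 00 00 00 00 00 00 f0 3f.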

Anti-VM checks

FUN_0010356f() calls fopen("/proc/cpuinfo","r") to open the /proc/cpuinfo file for reading, then scans the contents for the string “hypervisor”. On Linux, that string shows up among the CPU flags when the system is running as a guest, so this is essentially checking whether the program is running in a VM, and the check fails if it is. Similarly, FUN_00103622() opens various system files and checks for the strings “VMWare”, “VirtualBox”, and “QEMU”.

To bypass these checks, we can use the same patching strategy as before and change the JNZ instructions to JZ ones.
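For reference, a check of this kind can be sketched in a few lines. This is our own approximation rather than the recovered code: both function names are hypothetical, and reading line by line with fgets() is an assumption about how the scan is done.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if any line of the stream contains the given substring. */
bool stream_mentions(FILE *fp, const char *needle)
{
    char line[4096];    /* roomy enough for the long "flags" line */

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL)
            return true;
    }
    return false;
}

/* Hypothetical stand-in for the /proc/cpuinfo check. */
bool cpuinfo_reports_hypervisor(void)
{
    FILE *fp = fopen("/proc/cpuinfo", "r");
    if (fp == NULL)
        return false;

    bool found = stream_mentions(fp, "hypervisor");
    fclose(fp);
    return found;
}
```

On a typical Kali VM we would expect cpuinfo_reports_hypervisor() to return true, which is exactly why the unpatched binary refuses to continue there.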

Internet connectivity + kill switch check

This function [FUN_001037da()] causes the binary to visit the URLs https://metactf.com and https://e205e724dda896b5a70bb03b7aed1dba.metactf.com using curl, and checks the HTTP response codes for each visit using the conditional statement local_30 != 200, where local_30 is the response code returned from requesting the endpoint. For the first URL, the check passes if it’s accessible (response code of 200), while the opposite is true for the second URL. If we want the malware to continue running while connected to the Internet, we have to trick it into thinking the second URL isn’t accessible.

We could set something up with DNS so that the URL can’t be resolved or patch the binary to change the comparison to return true if the URL is accessible, but if we try to curl the website ourselves, we’ll see that it can’t be resolved, meaning that we actually don’t need to do anything.

IP address check

FUN_001032b6() executes the following snippet:

  iVar1 = getifaddrs(&local_430);
  puVar2 = local_430;
  if (iVar1 == -1) {
    puVar2 = (undefined8 *)0x0;
  }
  else {
    for (; puVar2 != (undefined8 *)0x0; puVar2 = (undefined8 *)*puVar2) {
      __sa = (sockaddr *)puVar2[3];
      if ((__sa != (sockaddr *)0x0) && (__sa->sa_family == 2)) {
        iVar1 = getnameinfo(__sa,0x10,local_421,0x401,(char *)0x0,0,1);
        if ((iVar1 == 0) && (iVar1 = strncmp(local_421,"10.13.37.",9), iVar1 == 0)) {
            ...

The function getifaddrs() gets a linked list containing information about all the network interfaces on the system. Assuming that function executes successfully, the program will then loop through each entry in the linked list, skip anything that isn’t an IPv4 address (sa_family == 2, which is AF_INET), and use getnameinfo() to convert each interface’s socket address into a numeric IP string. The function then runs strncmp() to check whether the first nine characters of that string are “10.13.37.”, meaning that our machine would ideally have to have an address using those first three octets.
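Cleaned up, the loop looks roughly like the sketch below. The function names are hypothetical and the early exit on the first match is an assumption, since the decompiled snippet is cut off at that point. The prefix, the prefix length of 9, and the 0x401-byte buffer are taken from the decompiled code.

```c
#define _GNU_SOURCE
#include <ifaddrs.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Prefix test exactly as in the decompiled strncmp() call. */
bool ip_has_expected_prefix(const char *ip)
{
    return strncmp(ip, "10.13.37.", 9) == 0;
}

/* Hypothetical reconstruction: true if any IPv4 interface matches. */
bool any_interface_matches(void)
{
    struct ifaddrs *list = NULL;
    bool found = false;

    if (getifaddrs(&list) == -1)
        return false;

    for (struct ifaddrs *ifa = list; ifa != NULL && !found; ifa = ifa->ifa_next) {
        char host[1025];    /* 0x401 bytes, as in the decompiled code */

        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        /* NI_NUMERICHOST yields the dotted-quad string, not a hostname. */
        if (getnameinfo(ifa->ifa_addr, sizeof(struct sockaddr_in),
                        host, sizeof host, NULL, 0, NI_NUMERICHOST) == 0)
            found = ip_has_expected_prefix(host);
    }

    freeifaddrs(list);
    return found;
}
```

Because the comparison includes the trailing dot, an address such as 10.13.37.5 matches while 10.13.3.75 does not.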

Of course, we know a way around this. We can just patch the comparison instruction, this time changing JZ to JNZ. Surely we won’t have to deal with this later…right?

Process check

FUN_00103378() starts off by opening the /proc directory. It then iterates through each process, reading the contents of its cmdline file. This, combined with the strings “gdb”, “wireshark”, “strace”, and the names of other analysis tools, leads us to infer that the program is checking whether any of those tools are running, and will likely terminate execution if one is. The odd one out is “apache2”, which if found will set the return value to zero before closing the loop, thereby passing the check. Therefore, it is reasonable to assume that we don’t want any debugger programs running but do want apache2 running.
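The /proc walk is a common pattern, so here is a rough sketch of the tool-detection half. Treat it as an illustration under assumptions: the function names are hypothetical, only the three tool names quoted above are included, and the apache2 requirement is left out because its exact interplay with the tool list isn’t visible in what we’ve shown.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Tool names quoted in this writeup; the binary's own list is longer. */
static const char *const analysis_tools[] = { "gdb", "wireshark", "strace" };

bool cmdline_is_analysis_tool(const char *cmdline)
{
    for (size_t i = 0; i < sizeof analysis_tools / sizeof analysis_tools[0]; i++) {
        if (strstr(cmdline, analysis_tools[i]) != NULL)
            return true;
    }
    return false;
}

/* Walk /proc and report whether any process looks like an analysis tool. */
bool analysis_tool_running(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *entry;
    bool found = false;

    if (proc == NULL)
        return false;

    while (!found && (entry = readdir(proc)) != NULL) {
        char path[64];
        char cmdline[256];
        FILE *fp;
        size_t n;

        if (!isdigit((unsigned char)entry->d_name[0]))
            continue;                   /* only PID directories */
        snprintf(path, sizeof path, "/proc/%.32s/cmdline", entry->d_name);
        if ((fp = fopen(path, "r")) == NULL)
            continue;
        n = fread(cmdline, 1, sizeof cmdline - 1, fp);
        cmdline[n] = '\0';              /* arguments are NUL-separated, so
                                           strstr() only sees argv[0] */
        fclose(fp);
        found = cmdline_is_analysis_tool(cmdline);
    }

    closedir(proc);
    return found;
}
```

A substring match like this is crude, since any process whose path merely contains “gdb” would trigger it, but it is enough to catch an analyst who leaves the usual tools running.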

We could start up apache2, but given how we’ve treated the past checks, we’ll just patch it out instead.

Username check

This check [FUN_00103503()] uses getuid() and getpwuid() to get the username of the user that the process is running under. If the username isn’t meatctf, the check fails.
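A sketch of this lookup is shown below. The function names are hypothetical, and returning nonzero on a match is an assumption that fits the iVar1 != 0 test main() applies to this function.

```c
#include <pwd.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Compare a username against the one the binary expects. */
bool username_is_expected(const char *name)
{
    return name != NULL && strcmp(name, "meatctf") == 0;
}

/* Hypothetical reconstruction: look up the current user by UID. */
bool username_check(void)
{
    struct passwd *pw = getpwuid(getuid());
    return pw != NULL && username_is_expected(pw->pw_name);
}
```

Since the name comes from the password database rather than from an environment variable, setting USER=meatctf before running the binary wouldn’t fool a check written this way.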

We know the drill by now, so let’s move on.

LD_PRELOAD check

This function [FUN_00102c87()] checks whether the LD_PRELOAD environment variable is set. LD_PRELOAD tells the dynamic linker to load the listed shared objects before any others, which lets them override functions from the standard libraries. One strategy that could be used to tackle this challenge thus far is writing a custom library that overrides the functions that gather data for the comparisons by returning the data the program needs to execute successfully, especially for the ptrace() check. This check would prove to be a major roadblock in that endeavor.
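In its simplest form, such a check is a single getenv() call. The sketch below is a guess at that simplest form, with a hypothetical name borrowed from the label we give the function later. The real function may inspect the variable’s contents as well.

```c
#include <stdlib.h>

/* Returns 1 if LD_PRELOAD is present in the environment, 0 otherwise.
   main() expects 0 from the real function, i.e. the variable is unset. */
int ld_preload_set(void)
{
    return getenv("LD_PRELOAD") != NULL;
}
```

Running the binary as LD_PRELOAD=./hook.so ./carrot (hook.so being a made-up library name) would make a check like this return 1 and fail the comparison in main().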

Instead, we won’t do anything to modify the behavior of this check, considering we’re not using a shared library or a debugger.

Fan check???

Upon looking at the code of the last check, FUN_00102c5a(), we’re greeted with the following:

  int iVar1;
  int iVar2;
  int iVar3;
  
  iVar1 = FUN_00102b47("/sys/class/hwmon");
  iVar2 = FUN_00102b47("/sys/devices");
  iVar3 = FUN_00102b47("/proc/acpi/fan");
  return iVar3 + iVar1 + iVar2;

Well this is interesting. After doing some research, we can conclude that the strings being referenced here are actually directories that Linux uses to store sensor and device information. Looking into FUN_00102b47(), we can see the string “fan” amongst all the code, leading us to believe that this function is looking for fans. Based on this and the fact that the function returns an integer, we can conclude that the function is checking for the number of fans present on the machine, which in a VM would be zero. The comparison expression is 1 < iVar1, which actually checks whether the number of fans is greater than or equal to 2.

We just have to patch this check like the others, taking care to flip the JLE instruction to JGE instead.

Hostname check

FUN_00102793() grabs the hostname of the machine, with the actual check happening in main() via (iVar1 = strcmp(__s1,"www"), iVar1 != 0). The program is essentially checking whether the hostname of the machine is “www”, in which case the check will pass. We’ll patch it and move on.

Sending data

Finally, we’ve reached the end of the checks. Now we can take a look at the code that runs after we pass everything, starting with FUN_00102c9f():


char * FUN_00102c9f(void)

{
  uint uVar1;
  uint uVar2;
  char *__ptr;
  char *__ptr_00;
  char *__ptr_01;
  char *__ptr_02;
  char *__s;
  
  __ptr = (char *)FUN_00102826();
  __ptr_00 = (char *)FUN_00102793();
  __ptr_01 = (char *)FUN_001027fb();
  __ptr_02 = (char *)FUN_0010298e();
  uVar1 = fan_check();
  uVar2 = ld_preload_check();
  __s = (char *)malloc(0x2000);
  if (__s == (char *)0x0) {
    perror("malloc");
    free(__ptr);
    free(__ptr_00);
    free(__ptr_01);
    free(__ptr_02);
  }
  else {
    if (__ptr_02 == (char *)0x0) {
      __ptr_02 = "unknown";
    }
    if (__ptr_01 == (char *)0x0) {
      __ptr_01 = "unknown";
    }
    if (__ptr_00 == (char *)0x0) {
      __ptr_00 = "unknown";
    }
    if (__ptr == (char *)0x0) {
      __ptr = "unknown";
    }
    snprintf(__s,0x2000,
             "{\"ip_addresses\": \"%s\", \"hostname\": \"%s\", \"username\": \"%s\", \"processes\": \"%s\", \"fan_count\": %d, \"ld_preload_set\": %d}"
             ,__ptr,__ptr_00,__ptr_01,__ptr_02,(ulong)uVar1,(ulong)uVar2,__s);
  }
  return __s;
}

It looks like six functions are being called here: four unknown functions, the fan count check, and the LD_PRELOAD check. Their return values are stored in local variables, which are then written into a JSON string using snprintf().

Inspecting the four unknown functions reveals that they follow similar patterns to functions we previously identified:

  • FUN_00102826() calls getifaddrs() and getnameinfo()
  • FUN_00102793() calls gethostname()
  • FUN_001027fb() calls getuid() and getpwuid()
  • FUN_0010298e() opens /proc and iterates through each process, reading its cmdline file

With that information and the fields in the JSON string, it’s clear that the program is gathering the machine’s IP addresses, hostname, username, running processes, fan count, and LD_PRELOAD status. But what is it doing with this data? Looking at main() yields some answers. The JSON string is passed into FUN_00102dd6(), which calls some encryption functions, and then the result of that function is passed into the function we named send_data().

Just one more patch bro

If data about the system is being sent out, our patches in the main function of the program aren’t going to be enough, as there is likely some server-side validation happening.

In order to defeat this, we have a number of options. We could try to override the functions, we could try to patch them, or…we could take a closer look at the four if-statements. These statements check the return values of the functions, or more specifically, the variables that hold them. If a variable holds a null pointer, then the program substitutes the string “unknown” instead. What if we abused this to store our own strings?

If we invert each null check and change the address of the string that gets loaded to point at a string that already exists in the binary, using the techniques described previously, we can control the contents of the variables! We already know which strings we need, since they were used in the earlier checks.

This is what the decompiled code should look like after our patches:

    if (__ptr_02 != (char *)0x0) {
      __ptr_02 = "apache2";
    }
    if (__ptr_01 != (char *)0x0) {
      __ptr_01 = "meatctf";
    }
    if (__ptr_00 != (char *)0x0) {
      __ptr_00 = "www";
    }
    if (__ptr != (char *)0x0) {
      __ptr = "10.13.37";
    }

Furthermore, we can edit the JSON string and change the "fan_count" field from %d to a number, say 4. This works because our edit there won’t exceed the memory already allocated to the string, as opposed to trying to change the "username" field from %s to "meatctf".
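The format-string edit has one side effect worth checking. Removing a %d means the remaining %d now consumes the fan count argument instead of the LD_PRELOAD one. The sketch below demonstrates this with the format copied from the decompiled snprintf() call. The helper names are hypothetical, and it assumes the argument list itself is left unpatched.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format copied from the decompiled snprintf() call. */
const char ORIGINAL_FMT[] =
    "{\"ip_addresses\": \"%s\", \"hostname\": \"%s\", \"username\": \"%s\", "
    "\"processes\": \"%s\", \"fan_count\": %d, \"ld_preload_set\": %d}";

/* Same string with the fan_count conversion replaced by a literal 4. */
const char PATCHED_FMT[] =
    "{\"ip_addresses\": \"%s\", \"hostname\": \"%s\", \"username\": \"%s\", "
    "\"processes\": \"%s\", \"fan_count\": 4, \"ld_preload_set\": %d}";

/* An in-place string patch only works if the new text is not longer. */
bool fits_in_place(const char *original, const char *replacement)
{
    return strlen(replacement) <= strlen(original);
}

/* Format the report with the same six arguments the binary passes. */
int build_report(char *out, size_t size, const char *fmt,
                 const char *ips, const char *host, const char *user,
                 const char *procs, int fans, int preload)
{
    /* With PATCHED_FMT the single %d receives `fans`; the surplus
       `preload` argument is evaluated and ignored. */
    return snprintf(out, size, fmt, ips, host, user, procs, fans, preload);
}
```

With a fan count of 0, the patched format reports "fan_count": 4 and "ld_preload_set": 0 regardless of the real LD_PRELOAD value. In a VM with no fans and no preloaded library, that happens to be what we want. The length helper also shows why the username field can’t be edited the same way: fits_in_place("%s", "meatctf") is false.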

After making all our changes, we can press O, which brings up an export menu. Choosing Original File will save the modified program in its original format, which we can then run (make sure to run chmod +x first!).

After all our hard work, we can run the program and are greeted with the flag:

MetaCTF{y0u_g0t_7h3_meats}