OLE - Dirty Laundry [Forensics]

Introduction


Challenge Description:
We managed to retrieve a sample of the spyware and suspicious mail that seems to be produced by the spyware. Can you analyze the provided files 'mail.txt' and 'invisible_shields.docm', and find out what happened?

This is from mail.txt:

From: Austin <taustin@whschool.com> 
To: dph@whschool.com 
Subject: Outlook Exfiltration Data from User: taustin


*twGsy*#p7XY8CT4N3RpGq5xDzL7EMHW|MZgInjVQiig/Ce4mInU3xVamChLH3kT4ME1JJ9YEHJuCFLa1Zfg+I5d2h5j1QkGwNj237XLiaBtzkualk2WiJg==

The second file is invisible_shields.docm:

A tip from me, an amateur maldoc creator and analyst: the "m" in the .docm extension means the document is macro-enabled. Macros contain scripts written in VBA (Visual Basic for Applications), a programming language that lets users automate tasks and build personalised solutions within Microsoft Excel or Word. However, modern document viewers have macros disabled by default.

Stage 1 Analysis


When you have a macro-enabled file, it's best to use olevba (part of the oletools suite) to extract and analyse the potentially malicious macros. You can download it through here or, if you are using git, you can use the following command:

When using olevba, it's also best to add the --deobf and --decode switches, which attempt to deobfuscate VBA expressions and decode any encoded strings within the VBA. The extracted VBA is shown here:

As shown in the image above, the main obfuscation used here is the XOR operator, written as Xor in VBA syntax. So I replaced every Xor with ^, which makes the expressions valid Python and therefore simpler to evaluate. You can consider this Stage 1 of deobfuscating the VBA script; it makes the script easier to work with in Stage 2. Yes, I did it manually, for every. single. instance.
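The manual replacement described above could also be scripted. The following is a minimal sketch, assuming the obfuscated expressions are plain decimal integer literals joined by Xor; the function name fold_xor and the sample line are hypothetical and not taken from the real macro.

```python
import re

# Matches two decimal literals joined by VBA's Xor operator, e.g. "72 Xor 33".
XOR_PAIR = re.compile(r"\b(\d+)\s+Xor\s+(\d+)\b", re.IGNORECASE)

def fold_xor(vba_source):
    """Replace every 'a Xor b' between integer literals with its value."""
    def compute(match):
        return str(int(match.group(1)) ^ int(match.group(2)))

    previous = None
    # Repeat until nothing changes, so chains like "5 Xor 6 Xor 7" collapse too.
    while previous != vba_source:
        previous = vba_source
        vba_source = XOR_PAIR.sub(compute, vba_source)
    return vba_source

# Hypothetical sample line, only for illustration.
sample = "x = Chr(72 Xor 33) & Chr(5 Xor 6 Xor 7)"
print(fold_xor(sample))   # x = Chr(105) & Chr(4)
```

Parenthesised operands or hexadecimal literals would need a more permissive pattern.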

Stage 2 Analysis


Once the XORed strings were deobfuscated, Stage 2 was to format the VBA script so that it is readable. The method is to indent the code and separate the functions, so that we can analyse what each function does in Stage 3.

Stage 3 Analysis


Analysing the code, the function nkalPYSrDkoirG is called again and again with an array and an integer as its arguments. Looking at the function shows that it is a simple XOR cipher. The first argument is the encrypted data. The catch is that the second argument is not the XOR key itself: the integer is used as an offset that determines where to start reading in the byte array PjJHmvDBocr, which holds the actual key. This byte array is derived from the function ovLKcDvvuvaxVc, which operates on a variable from the active document called gtrxGyKtbDzUEDng, likely representing the encryption key. Analysing ovLKcDvvuvaxVc with ChatGPT, it was determined that the function decodes Base64 strings.

The function iterates through the encrypted array, performing an XOR operation between each byte of the encrypted data and a corresponding byte from PjJHmvDBocr, starting at the specified offset. This XOR operation decrypts each character, which is then concatenated into the result string fvPLOtDYqRXxu. Essentially, this means that the decryption key is applied with an adjustable starting point, which can vary the decrypted output depending on the offset provided. The function ultimately returns this decrypted string.
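The behaviour described above can be sketched in Python as follows. This is a rough reconstruction, not the macro's code: the name xor_decrypt, the placeholder key, and the wrap-around when the offset runs past the end of the key are all assumptions made for illustration.

```python
import base64

def xor_decrypt(encrypted, offset, key):
    """XOR each encrypted byte with the key bytes starting at `offset`."""
    chars = []
    for i, value in enumerate(encrypted):
        # Assumption: wrap around if offset + i runs past the end of the key.
        key_byte = key[(offset + i) % len(key)]
        chars.append(chr(value ^ key_byte))
    return "".join(chars)

# Placeholder key; the real one is the Base64 value stored in the document.
demo_key = base64.b64decode("c2VjcmV0LWtleQ==")   # b"secret-key"

# Build a demo ciphertext by XORing a known string with the key from offset 3.
demo_encrypted = [b ^ k for b, k in zip(b"Outlook", demo_key[3:])]
print(xor_decrypt(demo_encrypted, 3, demo_key))   # Outlook
```

Because XOR is its own inverse, the same routine both encrypts and decrypts; only the offset has to match.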

However, the document variable is not defined anywhere in the VBA code, and I didn't want to run a malicious document on my host. Fortunately, a .docm file is also a zip archive, so unzipping the document gives us all of its internal files and properties.

The grep command makes it easy to find strings within files. We are looking for the variable gtrxGyKtbDzUEDng, and searching the unzipped files for it with the following command turned up a Base64 string!
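The same search can be done without unzipping to disk. Below is a minimal Python sketch using the standard zipfile module; the helper name find_in_docm is hypothetical, and the demo archive and its XML content are made up so the example runs on its own.

```python
import io
import re
import zipfile

def find_in_docm(docm, needle):
    """Yield (member name, surrounding text) for each hit of `needle`."""
    with zipfile.ZipFile(docm) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode("utf-8", errors="ignore")
            for match in re.finditer(re.escape(needle), text):
                start = max(match.start() - 40, 0)
                yield name, text[start:match.end() + 200]

# Self-contained demo on an in-memory archive; the XML below is invented.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("word/settings.xml",
                     '<w:docVar w:name="gtrxGyKtbDzUEDng" w:val="QUJD"/>')

hits = list(find_in_docm(demo, "gtrxGyKtbDzUEDng"))
print(hits)
```

Against the real sample, the call would be find_in_docm("invisible_shields.docm", "gtrxGyKtbDzUEDng"), run inside an isolated analysis machine.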

This is the complete Base64 string:

With this information, a simple Python script can be built to decrypt every string that is passed through that function. Again, I did this manually, going through each call to nkalPYSrDkoirG. This eventually allows us to read the script easily in Stage 4 of the analysis.
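Going through each call by hand could be replaced by a regular-expression pass over the source. The sketch below assumes the calls look like nkalPYSrDkoirG(Array(1, 2, 3), 4) with decimal literals only, and that the key wraps around; both are guesses about the macro, and the key shown is a stand-in for the Base64-decoded document variable.

```python
import re

# Assumed call shape: nkalPYSrDkoirG(Array(1, 2, 3), 4)
CALL = re.compile(r"nkalPYSrDkoirG\(\s*Array\(([\d\s,]*)\)\s*,\s*(\d+)\s*\)")

def decrypt_calls(vba_source, key):
    """Replace each matching call with the quoted string it would return."""
    def replace(match):
        values = [int(v) for v in match.group(1).split(",") if v.strip()]
        offset = int(match.group(2))
        # Wrap-around on the key is an assumption, as is the call shape.
        text = "".join(chr(v ^ key[(offset + i) % len(key)])
                       for i, v in enumerate(values))
        # Double any embedded quotes so the result stays a valid VBA literal.
        return '"' + text.replace('"', '""') + '"'
    return CALL.sub(replace, vba_source)

key = b"placeholder-key"   # stand-in, not the real key
encrypted = [b ^ k for b, k in zip(b"Outlook", key[2:])]
line = "x = nkalPYSrDkoirG(Array({}), 2)".format(", ".join(map(str, encrypted)))
print(decrypt_calls(line, key))   # x = "Outlook"
```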

Stage 4 Analysis


Stage 4 of the analysis involves renaming the functions and variables. ChatGPT was used for this. However, only some of the function and variable names were changed because, at some point, I already understood what the script did.

So, the attack flow is as follows:

  1. It starts from the ProcessOutlookEmails() function where the script will go through Outlook emails from the last 400 days and look for sensitive words and sensitive files from the emails.

  2. If it finds any sensitive files, it forwards that email to the attacker at "dph@whschool.com". It then deletes that email from the user's records.

  3. If it finds any sensitive words within the email, it creates a new email and calls the EncryptData() function with the email's subject and body joined together as one string. Whatever EncryptData() returns is sent to the attacker at "dph@whschool.com". It then deletes that email from the user's records.

  4. The EncryptData() function calls FVaFfsygaGuUBB() to generate a random 32-byte IV. It then creates an object that implements the Rijndael cipher and encrypts its data with a hard-coded key.

  5. However, the EncryptData() function does not encrypt the input string that is passed to it. Instead, it returns the IV and the supposedly encrypted flag, separated by "|".

Decrypting the Flag


Looking at mail.txt shows what we have previously analysed.

First of all, AES is a United States federal standard, FIPS 197, which is a subset of Rijndael:

AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified with block and key sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits.
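The size rules quoted above can be written down as two small checks. This is only an illustration of the constraints; the function names are hypothetical, and the 256-bit block size in the example assumes that the 32-byte IV has the same length as the cipher block.

```python
def is_valid_rijndael(block_bits, key_bits):
    """Rijndael: block and key sizes are multiples of 32 from 128 to 256."""
    allowed = range(128, 257, 32)   # 128, 160, 192, 224, 256
    return block_bits in allowed and key_bits in allowed

def is_valid_aes(block_bits, key_bits):
    """AES: block size fixed at 128 bits, key size 128, 192, or 256 bits."""
    return block_bits == 128 and key_bits in (128, 192, 256)

# A 32-byte IV suggests a 256-bit block; the 32-character key is 256 bits.
print(is_valid_rijndael(256, 256), is_valid_aes(256, 256))   # True False
```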

At first glance, the VBA script appears to use AES. However, as stated before, it uses Rijndael, which is closely related to AES but not identical: the 32-byte IV points to a 256-bit block size, which AES does not support. CyberChef does not have a recipe for Rijndael, so a third-party online tool was used instead.

The hard-coded key mentioned earlier is 8xppg2oX68Bo6koL7hwSeC8bCEWvk540. Splitting the string from the email at "|" separates the IV from the encrypted data; putting each part into its respective field in the online tool gives us the flag!
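Before pasting values into a tool, the payload from mail.txt can be split and sanity-checked in Python. The sketch below treats the IV as raw ASCII text and the second part as Base64, which are assumptions based on how the two halves look; it does not perform the Rijndael decryption itself, and split_payload is a hypothetical helper name.

```python
import base64

def split_payload(line):
    """Split 'IV|ciphertext' into raw IV bytes and decoded ciphertext bytes."""
    iv_text, _, ciphertext_b64 = line.strip().partition("|")
    # Assumption: the IV is used as plain ASCII, the rest is Base64.
    return iv_text.encode("ascii"), base64.b64decode(ciphertext_b64)

payload = ("*twGsy*#p7XY8CT4N3RpGq5xDzL7EMHW|"
           "MZgInjVQiig/Ce4mInU3xVamChLH3kT4ME1JJ9YEHJuCFLa1Zfg+"
           "I5d2h5j1QkGwNj237XLiaBtzkualk2WiJg==")
key = b"8xppg2oX68Bo6koL7hwSeC8bCEWvk540"

iv, ciphertext = split_payload(payload)
print(len(key), len(iv), len(ciphertext))   # 32 32 64
```

The ciphertext length of 64 bytes is a whole number of 32-byte blocks, which is consistent with a 256-bit block size.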
