MalwareTech's VM1 Reversing Challenge
Get the challenge from here
vm1.exe implements a simple 8-bit virtual machine (VM) to try and stop reverse engineers from retrieving the flag. The VM’s RAM contains the encrypted flag and some bytecode to decrypt it. Can you figure out how the VM works and write your own to decrypt the flag? A copy of the VM’s RAM has been provided in ram.bin (this data is identical to the ram content of the malware’s VM before execution and contains both the custom assembly code and encrypted flag).
Rules & Information: You are not required to run vm1.exe; this challenge is static analysis only. Do not use a debugger or dumper to retrieve the decrypted flag from memory; this is cheating. Analysis can be done using the free version of IDA Pro (you don’t need the debugger).
We are given two files, vm1.exe and ram.bin. According to the problem statement, ram.bin holds both the bytecode for the VM and the encrypted flag.
I used IDA Pro to analyse the binary, starting with the start function.
First, there are a few calls to MD5-related functions; these display the MD5 hash of the flag when the program runs.
Then there are calls to GetProcessHeap and HeapAlloc, which allocate 0x1FB bytes of memory. After that, a call to memcpy copies the data at unk_404040 into the newly allocated memory (which I renamed to bytecode). The bytes at that location are exactly the same as ram.bin, so this is the memory the problem statement refers to.
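Since the bytes match ram.bin, the file can be loaded directly instead of copying the bytes out of IDA by hand. A minimal sketch, assuming ram.bin sits in the current directory; the helper name load_ram is hypothetical:

```python
from pathlib import Path


def load_ram(path="ram.bin"):
    """Return the VM's RAM as a mutable bytearray (hypothetical helper)."""
    # bytearray rather than bytes, because the VM writes back into its RAM.
    return bytearray(Path(path).read_bytes())
```

The length of the result can be compared against the 0x1FB bytes passed to HeapAlloc as a quick sanity check.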
We move straight to the read_bytecode_from_memory function (sub_4022E0 before renaming).
Here eax is first set to 1, and then a loop runs until eax becomes 0. The body of the loop reads 3 bytes of the bytecode sequentially, stores them, and passes them to the function evaluate (sub_402270 before renaming).
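The fetch loop described above might look roughly like this in Python. The names run and CODE_BASE are hypothetical, and the 0xFF start offset is an assumption based on where the bytecode appears to begin in the RAM dump. Here evaluate is any callable that returns a truthy value to continue and a falsy one to halt.

```python
CODE_BASE = 0xFF  # assumed offset of the first instruction inside the RAM


def run(ram, evaluate, code_base=CODE_BASE):
    """Feed 3-byte instructions to `evaluate` until it returns a falsy value.

    Returns the number of instructions executed, including the halting one.
    """
    pc = 0
    steps = 0
    while True:
        # Each instruction is three consecutive bytes: opcode, arg1, arg2.
        opcode, arg1, arg2 = ram[code_base + pc:code_base + pc + 3]
        pc += 3
        steps += 1
        if not evaluate(opcode, arg1, arg2):
            return steps
```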
This is the function where the VM bytecode is interpreted. It takes 3 arguments, which are the 3 bytes of bytecode passed from the read_bytecode_from_memory function. The first parameter is checked first and selects the operation:
- If it is 1, then the memory location at offset param2 is assigned param3 and eax is set to 1.
- If it is 2, then a variable, byte_404240, is set to the value at the memory location at offset param2 and eax is set to 1.
- If it is 3, then the value at offset param2 is XORed with the value of byte_404240 and stored back at offset param2; the loop continues afterwards.
- Otherwise, al is set to 0, i.e. the return value is zero, and the loop in read_bytecode_from_memory stops.
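The four cases above can be sketched as a small class. This is an illustration of the described behaviour, not a decompilation; the class name VM and the attribute reg (standing in for byte_404240) are hypothetical.

```python
class VM:
    """Tiny model of the interpreter described above (names are made up)."""

    def __init__(self, ram):
        self.ram = ram  # mutable RAM, e.g. a bytearray
        self.reg = 0    # stands in for byte_404240

    def evaluate(self, opcode, arg1, arg2):
        if opcode == 1:
            self.ram[arg1] = arg2        # store an immediate value
        elif opcode == 2:
            self.reg = self.ram[arg1]    # load a byte into the register
        elif opcode == 3:
            self.ram[arg1] ^= self.reg   # XOR memory with the register
        else:
            return 0                     # unknown opcode: halt
        return 1
```

For example, storing 0x98 at offset 0, loading 0xDE into the register, and XORing offset 0 leaves 0x46 there, which is the ASCII code for F.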
After the loop in read_bytecode_from_memory ends, the flag is in memory, precisely at the location unk_404040.
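The last instruction before the halt in the RAM dump is 1, 0x19, 0, which writes a zero byte at offset 0x19, so the decrypted flag appears to be a null-terminated string at the start of the RAM. A minimal sketch for reading it back once the VM has run; the helper name read_cstring is hypothetical:

```python
def read_cstring(ram, start=0):
    """Return the ASCII string that starts at `start` and ends at the first zero byte."""
    end = ram.index(0, start)  # works for bytearray and for a list of ints
    return bytes(ram[start:end]).decode("ascii")
```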
So we can emulate the whole VM with a Python script and then read the flag from the transformed data.
I wrote the following script.
data = [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0DE, 0x7E, 0x7D, 0x55, 0x1E, 0x5, 0x0E6, 0x9F, 0x0E4, 0x0A6, 0x47, 0x50, 0x2, 0x1, 0x0C7, 0x0FC, 0x0CB, 0x60, 0x9, 0x0C6, 0x0E, 0x2E, 0x41, 0x65, 0x0A4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x1D, 0x0BD, 0x1, 0x5, 0x53, 0x1, 0x12, 0x48, 0x1, 0x10, 0x0E6, 0x1, 0x13, 0x8A, 0x1, 0x0D, 0x47, 0x1, 0x16, 0x13, 0x1, 0x0A, 0x15, 0x1, 0x0, 0x98, 0x1, 0x2, 0x3C, 0x1, 0x18, 0x0D9, 0x1, 0x1A, 0x57, 0x1, 0x6, 0x0AB, 0x1, 0x1B, 0x0C6, 0x1, 0x1, 0x32, 0x1, 0x17, 0x20, 0x1, 0x15, 0x6F, 0x1, 0x11, 0x2D, 0x1, 0x8, 0x0C9, 0x1, 0x9, 0x0E7, 0x1, 0x3, 0x12, 0x1, 0x0C, 0x2F, 0x1, 0x0E, 0x88, 0x1, 0x19, 0x6C, 0x1, 0x4, 0x65, 0x1, 0x1E, 0x0AE, 0x1, 0x14, 0x59, 0x1, 0x1F, 0x91, 0x1, 0x1C, 0x5D, 0x1, 0x0F, 0x0AE, 0x1, 0x0B, 0x15, 0x1, 0x7, 0x0CC, 0x2, 0x20, 0x0, 0x3, 0x0, 0x0, 0x2, 0x21, 0x0, 0x3, 0x1, 0x0, 0x2, 0x22, 0x0, 0x3, 0x2, 0x0, 0x2, 0x23, 0x0, 0x3, 0x3, 0x0, 0x2, 0x24, 0x0, 
0x3, 0x4, 0x0, 0x2, 0x25, 0x0, 0x3, 0x5, 0x0, 0x2, 0x26, 0x0, 0x3, 0x6, 0x0, 0x2, 0x27, 0x0, 0x3, 0x7, 0x0, 0x2, 0x28, 0x0, 0x3, 0x8, 0x0, 0x2, 0x29, 0x0, 0x3, 0x9, 0x0, 0x2, 0x2A, 0x0, 0x3, 0x0A, 0x0, 0x2, 0x2B, 0x0, 0x3, 0x0B, 0x0, 0x2, 0x2C, 0x0, 0x3, 0x0C, 0x0, 0x2, 0x2D, 0x0, 0x3, 0x0D, 0x0, 0x2, 0x2E, 0x0, 0x3, 0x0E, 0x0, 0x2, 0x2F, 0x0, 0x3, 0x0F, 0x0, 0x2, 0x30, 0x0, 0x3, 0x10, 0x0, 0x2, 0x31, 0x0, 0x3, 0x11, 0x0, 0x2, 0x32, 0x0, 0x3, 0x12, 0x0, 0x2, 0x33, 0x0, 0x3, 0x13, 0x0, 0x2, 0x34, 0x0, 0x3, 0x14, 0x0, 0x2, 0x35, 0x0, 0x3, 0x15, 0x0, 0x2, 0x36, 0x0, 0x3, 0x16, 0x0, 0x2, 0x37, 0x0, 0x3, 0x17, 0x0, 0x2, 0x38, 0x0, 0x3, 0x18, 0x0, 0x1, 0x19, 0x0, 0x4, 0x0, 0x0, 0x0]
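i = 0      # offset of the current instruction, relative to the code start
bval = 0   # the VM's single register (byte_404240 in the binary)
ret = 1    # mirrors eax: the loop stops once this becomes 0
while ret:
    # The bytecode starts at offset 0xFF; each instruction is 3 bytes.
    opcode = data[i+0xFF]
    op1 = data[i+1+0xFF]
    op2 = data[i+2+0xFF]
    # print("{}, {}, {}".format(opcode, op1, op2))
    if opcode == 1:
        data[op1] = op2                  # store an immediate value
    elif opcode == 2:
        bval = data[op1]                 # load a byte into the register
    elif opcode == 3:
        data[op1] = data[op1] ^ bval     # XOR memory with the register
    else:
        ret = 0                          # any other opcode halts the VM
    i += 3
print(data)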
The first 25 numbers look like ASCII codes. Converting them:
data = [70, 76, 65, 71, 123, 86, 77, 83, 45, 65, 82, 69, 45, 70, 79, 82, 45, 77, 65, 76, 87, 65, 82, 69, 125]
data = [chr(x) for x in data]
print(''.join(data))
We get the flag: FLAG{VMS-ARE-FOR-MALWARE}
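As a quick sanity check, the first few characters can be reproduced by hand, because the bytecode boils down to a byte-wise XOR. The values below are read off the RAM dump: the opcode 1 instructions write 0x98, 0x32, 0x3C, 0x12 and 0x65 to offsets 0 to 4, and the encrypted bytes at offsets 0x20 to 0x24 are 0xDE, 0x7E, 0x7D, 0x55 and 0x1E. The variable names are hypothetical.

```python
key = [0x98, 0x32, 0x3C, 0x12, 0x65]        # written to offsets 0..4 by opcode 1
encrypted = [0xDE, 0x7E, 0x7D, 0x55, 0x1E]  # stored at offsets 0x20..0x24

# Opcode 2 loads encrypted[i] into the register, opcode 3 XORs it into key[i].
prefix = "".join(chr(k ^ e) for k, e in zip(key, encrypted))
print(prefix)  # FLAG{
```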