Chapters Eight and Nine focused on dynamic analysis of programs. Once the basics were out of the way in Chapter Eight, we shifted focus to using OllyDbg to fulfil our dynamic analysis objectives. Let’s get to solving the problems from this chapter!
Exercise 1
Hash | Name |
---|---|
b94af4a4d4af6eac81fc135abda1c40c | Lab09-01.exe |
d6356b2c6f8d29f8626062b5aefb13b7fc744d54 | Lab09-01.exe |
6ac06dfa543dca43327d55a61d0aaed25f3c90cce791e0555e3e306d47107859 | Lab09-01.exe |
Preface: Analyze the malware found in the file Lab09-01.exe using OllyDbg and IDA Pro to answer the following questions. This malware was initially analyzed in the Chapter 3 labs using basic static and dynamic analysis techniques.
Analysis: Let’s take this particular sample through our standard malware analysis process. I’m going to statically analyze the binary and see what information can be gathered without interacting with it. Opening up the binary in PE Studio, we can find:
- It’s a 32-bit console application
- It was compiled on Oct 18, 2011 (as per the file-header)
Basic string analysis shows us the following strings:
- GET/DOWNLOAD/UPLOAD: Functions might indicate the functionality embedded within the program (possibly a backdoor?)
- http://www.practicalmalwareanalysis.com: URL might be used to communicate with the TA’s C2 server
- cmd.exe: Launch command prompt on the compromised endpoint
- SOFTWARE\Microsoft\XPS: Registry key might be used by the malware to persist (we’ll explore this relation later)
- k:%s h:%s p:%s per:%s: A format string (needs more context)
Let’s shift our attention to advanced static analysis and fire this binary up in IDA. Main starts at 00402AF0
and this is where we’ll begin our hunt.
It kicks off with a simple argument count check; comparing argc
to 1. If it does not have an argument, it queries the registry key: SOFTWARE\MICROSOFT\XPS\Configuration
. If the value is 0, the handle is closed (actually, the handle is closed regardless of the value of the key). After this check, a call to the function at 00402B13
is made in which an interesting operation takes place. For this, it’d be better if we switch our focus to dynamic analysis via OllyDbg.
Firing up the binary in OllyDbg, let’s quickly get to the function call by stepping into the main function, followed by a few steps in the function. Once you’re at the function call, step into it. First of all, it makes a call to GetModuleFileNameA
, which returns the full path of the currently executing process when no module handle is provided. We can also see a few string offsets, i.e., /c del
and >> NUL
. Using OllyDbg, step into this function and take special notice of the registers. See how the string slowly formulates into the final parameter to be executed by the ShellExecute
call referencing cmd.exe
and the string offsets, which together delete the executing binary. The program won’t be able to delete the binary in my case, though, as it is open in IDA and OllyDbg at my end.
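The string concatenation watched in the registers above can be sketched in a few lines of Python. This is only an illustration of how the pieces `/c del` and `>> NUL` wrap the module path; the function name and the sample path are hypothetical, and nothing here executes a command.

```python
def build_self_delete_args(module_path: str) -> str:
    """Join the string offsets seen in the debugger around the module path.

    The result is the parameter string that would be handed to cmd.exe.
    """
    return "/c del " + module_path + " >> NUL"


# Hypothetical path, for illustration only.
print(build_self_delete_args(r"C:\Samples\Lab09-01.exe"))
```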
Okay, what happens if an argument is supplied, i.e., the argument count is greater than 1? Let’s explore that route now. While opening the binary in OllyDbg, there’s an option to pass an argument to it. Let’s pass a random argument to at least satisfy our argument count conditional.
There’s one other way for you to pass arguments to the process i.e. by heading into the Debug
option in the menu from the navbar and selecting Arguments
. Simply add in your arguments and reset the flow by pressing CTRL+F2.
Now, let’s quickly shift to the main
function at 0x403945
. Our argument check is now fulfilled. Let’s review the branch at 402B1D
. It takes in the values of argc
and argv
and stores them in EAX and ECX respectively. Next, we see a slightly cryptic operation where the value at the address [ECX+EAX*4-4]
is moved into EDX. Since EAX holds the number of arguments (2 in our case) and ECX points to the array of 4-byte argument pointers passed to the program, this pointer arithmetic references the last element of the argv
array.
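To make the pointer arithmetic concrete, here is a small sketch assuming a 32-bit process where every `argv` entry is a 4-byte pointer. The base address is a made-up value used only to show that `[ECX+EAX*4-4]` lands on index `argc - 1`.

```python
POINTER_SIZE = 4  # 32-bit process: each argv slot is a 4-byte pointer


def last_argument_address(argv_base: int, argc: int) -> int:
    """Compute ECX + EAX*4 - 4, the address of the last argv slot."""
    return argv_base + argc * POINTER_SIZE - POINTER_SIZE


argv = ["Lab09-01.exe", "aaaa"]
argc = len(argv)
argv_base = 0x00B20E80  # hypothetical address of argv[0]'s slot

slot_address = last_argument_address(argv_base, argc)
index = (slot_address - argv_base) // POINTER_SIZE
print(hex(slot_address), index, argv[index])
```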
This last element is then fed to a function at 0x00402510
which… is performing some operation on this argument. Alright, so let’s re-execute the binary in our debugger with a sample argument, aaaa
. Going into the function’s disassembly, I’ve renamed the argument to lastArgumentOfArgv
for it to make more sense to me. Onwards, we see ECX is OR’ed with a large number effectively making it a counter. Then, the following code segment is executed:
xor eax, eax
repne scasb
not ecx
add ecx, 0FFFFFFFFh
cmp ecx, 4
SCASB stands for ‘SCAn String Byte’: it compares the byte in AL against the byte at ES:EDI and then advances EDI. When paired with REPNE, the operation is repeated, decrementing ECX each time, until the zero flag is set (a match was found) or ECX reaches 0. Since ECX starts out as a very large number, it will not reach zero first; instead, the zero flag is set once the scan hits the string’s null terminator (EAX was zeroed by the XOR). REPNE stops there, and the NOT and ADD
instructions convert the remaining count in ECX into the string’s length. Finally, CMP
compares ECX to 4 to check whether the parameter is 4 characters long
. Let’s take a look at it in the debugger:
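Outside the debugger, the same idiom can be modeled in Python to see why ECX ends up holding the string length. This is a simplified model of the instruction semantics, not the malware's code; it assumes ECX was set to 0xFFFFFFFF beforehand and that the buffer contains a null terminator.

```python
MASK32 = 0xFFFFFFFF


def strlen_via_scasb(buffer: bytes) -> int:
    """Model `repne scasb` + `not ecx` + `add ecx, -1` on a null-terminated buffer."""
    ecx = MASK32  # counter set to all ones before the scan (assumed)
    al = 0        # xor eax, eax
    edi = 0       # offset into the buffer, standing in for EDI

    # repne scasb: compare AL with [EDI], advance EDI, decrement ECX,
    # and stop once a match sets the zero flag or ECX reaches 0.
    while ecx != 0:
        current = buffer[edi]
        edi += 1
        ecx = (ecx - 1) & MASK32
        if current == al:
            break

    ecx = ~ecx & MASK32             # not ecx
    ecx = (ecx + MASK32) & MASK32   # add ecx, 0FFFFFFFFh (same as subtracting 1)
    return ecx


print(strlen_via_scasb(b"aaaa\x00"))
```

The scan consumes the terminator as well, so NOT yields the length plus one and the final ADD removes that extra count.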
Since our test argument was four characters long, we’ll make the jump to loc_40252D
. Since the last element is essentially a character array, the reference to the element’s characters can be incremented by 1 to move on to the next character. Here, the first character is matched to be a
. Since our argument did stand true, our execution continues.
Here, the array is moved back to EAX, the pointer is incremented to point towards the second element, and the value is moved to VAR_4. The same array is moved back to EDX i.e. pointer to the first element and the two values (first and second character) are subtracted and later compared with 1. Their difference can only be 1 if the next character is b
. Alright this doesn’t hold true for us but let’s take a look at our static disassembly.
You will see that the next matches are for the characters ‘c’ and then ‘d’, after which the comparisons end (ultimately, the password here should be ‘abcd’). If these comparisons are true, EAX is set to 1 and the value is returned. That’s it! It’s more like a password provided to the program as an argument. But… we can completely bypass this check by patching the code to return 1. These instructions can actually do it all:
MOV EAX, 1
RETN
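For reference, the password routine walked through above can be approximated as follows. This is a sketch based on the checks described in this write-up (length of 4, first character `a`, second character one greater than the first, then `c` and `d`); the last two comparisons are simplified to direct character matches, and the function name is my own.

```python
def check_password(argument: str) -> int:
    """Return 1 if the argument passes the checks described above, else 0."""
    if len(argument) != 4:
        return 0
    if argument[0] != "a":
        return 0
    # The second character is validated through its difference from the first.
    if ord(argument[1]) - ord(argument[0]) != 1:
        return 0
    # Simplified: the write-up reports matches against 'c' and 'd'.
    if argument[2] != "c" or argument[3] != "d":
        return 0
    return 1


for candidate in ("aaaa", "abcd"):
    print(candidate, check_password(candidate))
```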
Once patched, we can continue our analysis. Jumping to the branch at 402B3F
, we can see the second element of the argv
array being moved into a variable and later pushed to stack for the function call at 40380F
. After a short debugging session, I’ve come to the realization that this function is in fact simply checking to see if the second argument matches the given list of commands (referenced as offsets).
- IN
- RE
- C
- CC
Let’s re-execute the binary with the second argument being -in
. With this change, we’ve skipped the branch (to continue looking for potential command-line switches). Next, the argument count is compared to 3 and later to 4; here the third argument (after the switch and before the password) would indicate the lpServiceName
variable (indicating the name of the service to be installed).
Let’s analyze the two in detail:
- If three arguments are provided to the binary at run-time, the function call at 4025B0 is executed. I won’t be diving into this function as all it does is strip the file path (which it retrieves using a function call to GetModuleFileName) and return the filename, minus its extension, as the potential name of the service.
- Let’s re-execute the binary in our debugger by providing four arguments (I’ll be using program.exe -in ServiceX abcd as my argument). As expected, the third argument (ServiceX in my case) is sent as a parameter to the function at 04025B0.
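The default service name derivation described in the first case can be sketched like this. It uses Python's `ntpath` module so Windows-style paths are handled on any platform; the function name and sample path are hypothetical.

```python
import ntpath


def default_service_name(module_path: str) -> str:
    """Strip the directory and the extension from the module path."""
    filename = ntpath.basename(module_path)
    name, _extension = ntpath.splitext(filename)
    return name


# Hypothetical path, for illustration only.
print(default_service_name(r"C:\Samples\Lab09-01.exe"))
```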
Eventually, the function at 402600
is called with the lpServiceName
as the name of the service to be installed on the system. Using a combination of REPNE SCASB
instructions, a path is stitched together such that it is: %SYSTEMROOT%\system32\{NAME_OF_THE_EXECUTABLE}.exe
. A series of calls related to the Service Control Manager are made and the branch at 0040277D
contributes to stitching together the DisplayName
of the service which is equal to {NAME_OF_EXECUTABLE} Manager Service
. Finally, the service is created using the CreateServiceA
API call. Since the referenced binary file path for the service was set to %SYSTEMROOT%\system32\{NAME_OF_THE_EXECUTABLE}.exe
; the running binary is copied to the path using CopyFile
and its file timestamps are set to match those of kernel32.dll
so that the copy blends in with legitimate system files.
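The artifacts produced by the install routine follow directly from the service name, which makes them easy to enumerate when hunting. A minimal sketch, assuming the naming patterns quoted above; the helper name and dictionary keys are my own.

```python
def install_artifacts(service_name: str) -> dict:
    """List the names the install path derives from the service name."""
    return {
        "service_name": service_name,
        "display_name": f"{service_name} Manager Service",
        "binary_path": f"%SYSTEMROOT%\\system32\\{service_name}.exe",
    }


for key, value in install_artifacts("ServiceX").items():
    print(f"{key}: {value}")
```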
It eventually calls out to the function at 401070
to create a new registry key at SOFTWARE\Microsoft\XPS
and sets the value of Configuration
to http://www.practicalmalwareanalysis[.]com:80
(I could be wrong here as it was a lot of string manipulation which I seem to have missed and the API call likely never succeeded on my system either). This is likely a beacon of some kind attempting to connect to the mentioned host using the ups
mode (we might uncover other modes later).
Moving on to the second command-line switch, -re
; we can see a similar argument check followed by an argument count check (used to check whether a custom service name is provided), after which the function call at 402900
is made. Digging into the function, we can easily see that the function is simply reversing the effect of -in
i.e., it’s deleting the service, any binary copied to SYSTEMROOT, and the registry key created earlier.
On to the second last switch, let’s check out -cc
. It checks to see if the argument count is equal to 3 and heads into the function which reads the current network configuration of the malware (set in the registry key discussed earlier). It acquires the values and prints them to the console using the format: k:%s h:%s p:%s per:%s\n
. Here’s the output from my own system: k:ups h:http://www.practicalmalwareanalysis.com p:80 per:60
.
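The same format string can be reproduced to parse or generate this line when writing detections. A small sketch using Python's `%` formatting; the parameter names are my own guesses at what each field means.

```python
CONFIG_FORMAT = "k:%s h:%s p:%s per:%s\n"


def format_config(key: str, host: str, port: str, period: str) -> str:
    """Render the configuration line the way the -cc switch prints it."""
    return CONFIG_FORMAT % (key, host, port, period)


print(format_config("ups", "http://www.practicalmalwareanalysis.com", "80", "60"), end="")
```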
Lastly, the -c
switch checks to see if 7 arguments are provided to the program at run time. Presumably these include the mode of execution (e.g. UPS
), host, port, and the time field. I’ll re-execute the program with 7 parameters i.e., -c down http://abc.com 90 60 90
(no clue what is what right now). It attempts to update the configuration in the registry. Since I wasn’t running the binary as an Administrator, the registry changes fail at my end.
That’s the end of the command-line switches. Now, what happens if no command-line argument is passed? This is where one last branch comes in. It starts off with the function at 401000
and checks to see if the registry key has been configured to beacon to the host. If it is set, the function at 402360
is called.
The function begins to read the values of the Configuration
value (of the registry key) and fills in variables which are later used in functions. 00402020
is where we witness the different modes the backdoor can operate in. Now this one’s a bit too extensive. Let’s dig in!
Right off the bat, we see a function call to 401E60
which is followed by checks to see if any of the following modes of operations are selected by the malware:
- SLEEP
- DOWNLOAD
- UPLOAD
- NOTHING
- CMD
We’ll discover their functionality later. Let’s first analyze the function at 401E60
. Its first two function calls are quite similar; 401420
and 401470
. They retrieve two values from the configuration set in the registry - mainly the host and the port.
The third function call at 401D80
is a bit different in the sense that it generates a random alphanumeric string every time it’s run and the resulting string is pushed to the stack. Since the malware did acquire the port and host, this random string might be part of the URL used to acquire a resource from the remote host. The next function call at 401AF0
shows several socket commands being sent to/from the remote host which suggest the randomly generated resource is acquired using a GET request. Follow-up branches show comparisons of the returned resources against the string combination backticks and single quotes.
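To get a feel for what such a beacon looks like on the wire, here is a rough sketch that generates a random resource in the `xxxx/xxxx.xxx` shape and wraps it in an HTTP/1.0 GET request. The alphabet, the exact header layout, and the function names are assumptions; nothing is sent over the network.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def random_part(length: int) -> str:
    """Return a random alphanumeric string of the given length."""
    return "".join(random.choice(ALPHABET) for _ in range(length))


def random_resource() -> str:
    """Build a resource shaped like xxxx/xxxx.xxx."""
    return f"{random_part(4)}/{random_part(4)}.{random_part(3)}"


def build_request(resource: str) -> str:
    """Wrap the resource in a bare HTTP/1.0 GET request line."""
    return f"GET {resource} HTTP/1.0\r\n\r\n"


print(build_request(random_resource()))
```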
Since this function has several sub-nested calls, this is where we’ll be concluding our analysis of it. Going back to our caller function, let’s finally take a look at the commands and what they do on the system. From our previous function, the actual command’s value (for e.g. the number of seconds to sleep for the SLEEP command) is retrieved and sent to the command for which it was received. Here’s a breakdown of the commands and their functionality (sums up loads of sub-nested calls):
- SLEEP: Sleep for the given time
- UPLOAD: Function at 4019E0 writes the received file to disk (contrary to what the UPLOAD command should do)
- DOWNLOAD: Function at 401870 opens up the handle to a file, reads its content, and sends it to the remote server (again, contrary to what DOWNLOAD should do)
- CMD: Executes an arbitrary command on the system using the command prompt
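The command handling summarized above boils down to a dispatch on the first token of the server's response. The sketch below models that with a lookup table that only describes each action; the wire format (command name, a space, then its argument) is an assumption on my part, and none of the handlers perform the real behavior.

```python
def parse_command(response: str):
    """Split a response into (command name, argument string)."""
    parts = response.strip().split(" ", 1)
    name = parts[0].upper()
    argument = parts[1] if len(parts) > 1 else ""
    return name, argument


# Descriptions only; the direction of UPLOAD/DOWNLOAD follows the write-up.
HANDLERS = {
    "SLEEP": lambda arg: f"sleep for {arg} seconds",
    "UPLOAD": lambda arg: "write the received file to disk",
    "DOWNLOAD": lambda arg: "read a local file and send it to the server",
    "CMD": lambda arg: "run a command through the command prompt",
    "NOTHING": lambda arg: "do nothing",
}


def describe(response: str) -> str:
    """Return a description of what the backdoor would do for a response."""
    name, argument = parse_command(response)
    handler = HANDLERS.get(name)
    if handler is None:
        return "unknown command"
    return handler(argument)


print(describe("SLEEP 30"))
```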
Question Number 1: How can you get this malware to install itself?
Pass the -in
command-line switch with or without another string to act as the name of the service which the malware installs to persist on the system.
Question Number 2: What are the command-line options for this program? What is the password requirement?
Potential command-line options for this program are:
- IN (Install service)
- RE (Remove service)
- C (Update configuration)
- CC (Display configuration)
The password of the program is abcd
. Though it can easily be bypassed (answered next).
Question Number 3: How can you use OllyDbg to permanently patch this malware, so that it doesn’t require the special command-line password?
Assemble instructions in the function checking for the password (0x402510
) to return 1 (True) in EAX. These instructions can do it:
MOV EAX, 1
RETN
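If you prefer to patch the file outside OllyDbg, the two instructions assemble to six bytes (`B8 01 00 00 00` for `MOV EAX, 1` and `C3` for `RETN`). The sketch below overwrites those bytes in an in-memory copy of a file. Translating the virtual address 0x402510 into a file offset depends on the section layout and is not done here, so the offset used in the example is a placeholder.

```python
# mov eax, 1 ; retn
PATCH = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3])


def apply_patch(image: bytes, offset: int, patch: bytes = PATCH) -> bytes:
    """Return a copy of `image` with `patch` written at `offset`."""
    if offset < 0 or offset + len(patch) > len(image):
        raise ValueError("patch does not fit inside the image")
    return image[:offset] + patch + image[offset + len(patch):]


# Dummy buffer and placeholder offset, for illustration only.
dummy_image = bytes(16)
patched = apply_patch(dummy_image, 4)
print(patched.hex())
```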
Question Number 4: What are the host-based indicators of this malware?
- Name of Service: “Executable Name”
- Display Name of Service: “Executable Name” Manager Service
- Registry Key: HKLM\SOFTWARE\Microsoft\XPS
- Malware: C:\Windows\system32\{NAME_OF_SERVICE}.exe
Question Number 5: What are the different actions this malware can be instructed to take via the network?
- Download a file to disk (from the remote host)
- Upload a file to remote host
- Sleep for X seconds
- Do nothing
- Execute an arbitrary command on the system and return output to the remote host
Question Number 6: Are there any useful network-based signatures for this malware?
- Host: http://www.practicalmalwareanalysis.com
- Port: 80
- Protocol: HTTP 1.0
- Method: GET
- Resources: xxxx/xxxx.xxx
Exercise 2
Preface: Analyze the malware found in the file Lab09-02.exe using OllyDbg to answer the following questions.
Hash | Name |
---|---|
251f4d0caf6eadae453488f9c9c0ea95 | Lab09-02.exe |
ea8e109eb3fbdb76623cf9522267345b19721e42 | Lab09-02.exe |
f153dfacec09dd69809c3bbf68270a38ee3701f44220c7bf181c14a68c138133 | Lab09-02.exe |
Quick Analysis:
- Starts off the main function at: 401128
- Uses the GetModuleFileName API call to get the name of the executable
  - Function 401550 returns the name of the executable in the EAX register along with a backslash (e.g. \ocl.exe)
  - Function 4014C0 takes in ocl.exe in three registers - EAX, ECX, EDX
    - Compares the previously acquired file name with ocl.exe
    - Program execution continues if both match
- If it matches, WSAStartup and other network configuration API calls are made, including WSASocketA (sets up an IPv4, two-way stream, TCP connection)
- Function at 401089 takes in an address of 19FD40 and a string (initially pushed into stack variables - 1qaz2wsx3edc)
  - Function at 401440 takes in the same string as parameter [returns 0xC]
  - Function continues XOR decoding to de-obfuscate and copy the domain into address 19FB0C (byte by byte copy) - www.practicalmalwareanalysis.com
- Socket connections continue. If successful, a reverse shell is launched. If not, the socket closes.
Question 1. What strings do you see statically in the binary?
Lack of interesting strings. Mostly API imports and junk strings.
Question 2. What happens when you run this binary?
The malware didn’t do anything at first. Changing the name to ocl.exe
will trigger a network connection to practicalmalwareanalysis.com
and fetch a command to execute on the system via cmd.exe
.
Question 3. How can you get this sample to run its malicious payload?
- Changing the name of the binary to
ocl.exe
.
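The guard behind this behavior is just a comparison of the running file's name against a fixed string. A minimal sketch, assuming an exact (case-sensitive) comparison; the function name and sample paths are hypothetical.

```python
import ntpath

EXPECTED_NAME = "ocl.exe"


def should_run(module_path: str) -> bool:
    """Return True only when the executable is named ocl.exe."""
    return ntpath.basename(module_path) == EXPECTED_NAME


for path in (r"C:\Samples\Lab09-02.exe", r"C:\Samples\ocl.exe"):
    print(path, should_run(path))
```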
Question 4. What is happening at 0x00401133?
A string is written byte by byte into a stack buffer, which can be seen in the dump as:
1qaz2wsx3edc µ¶·ocl.exe
. It is later used to de-obfuscate the domain name.
Question 5. What arguments are being passed to subroutine 0x00401089?
- Address:
19FD40
- String:
1qaz2wsx3edc
Question 6. What domain name does this malware use?
practicalmalwareanalysis.com
Question 7. What encoding routine is being used to obfuscate the domain name?
Data at the buffer is XOR’ed with the string 1qaz2wsx3edc
.
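A repeating-key XOR like this is symmetric, so the same routine both encodes and decodes. The sketch below demonstrates it with the key from the sample; the encoded buffer is synthesized here by encoding the known domain, since the raw bytes from the binary are not reproduced in this write-up.

```python
from itertools import cycle

KEY = b"1qaz2wsx3edc"


def xor_with_key(data: bytes, key: bytes = KEY) -> bytes:
    """XOR every byte of `data` with the key, repeating the key as needed."""
    return bytes(value ^ key_byte for value, key_byte in zip(data, cycle(key)))


domain = b"www.practicalmalwareanalysis.com"
encoded = xor_with_key(domain)   # synthesized stand-in for the obfuscated buffer
decoded = xor_with_key(encoded)  # applying the key again restores the original
print(encoded.hex())
print(decoded.decode("ascii"))
```

Note that the key is 12 (0xC) bytes long, which matches the length returned by the function at 401440.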
Question 8. What is the significance of the CreateProcessA call at 0x0040106E?
Launches a shell with the input, output, and error handles configured to connect to the socket (sent as an argument to the function). The reverse shell is going to be connected to the opened socket soon after the CreateProcess
call is made.
Exercise 3
Preface: Analyze the malware found in the file Lab09-03.exe using OllyDbg and IDA Pro. This malware loads three included DLLs (DLL1.dll, DLL2.dll, and DLL3.dll) that are all built to request the same memory load location. Therefore, when viewing these DLLs in OllyDbg versus IDA Pro, code may appear at different memory locations. The purpose of this lab is to make you comfortable with finding the correct location of code within IDA Pro when you are looking at code in OllyDbg.
Quick Analysis: Wasn’t really necessary as this was a rather easy exercise.
Question 1: What DLLs are imported by Lab09-03.exe?
- NETAPI32
- KERNEL32
- DLL1
- DLL2
- USER32 [Dynamically via LoadLibrary]
- DLL3 [Dynamically via LoadLibrary]
Question 2: What is the base address requested by DLL1.dll, DLL2.dll, and DLL3.dll?
All DLLs request the same base address of: 0x10000000
Question 3: When you use OllyDbg to debug Lab09-03.exe, what is the assigned based address for: DLL1.dll, DLL2.dll, and DLL3.dll?
- DLL1: 10001000
- DLL2: 00871000
- DLL3: 00521000
Question 4: When Lab09-03.exe calls an import function from DLL1.dll, what does this import function do?
- Retrieves an integer (the current process ID, per the cross-references discussed under Question 7)
- Prints the integer to console with the string format: DLL1 Mystery Data is: %d
Question 5: When Lab09-03.exe calls WriteFile, what is the filename it writes to?
Temp.txt
in the same directory
Question 6: When Lab09-03.exe creates a job using NetScheduleJobAdd, where does it get the data for the second parameter?
Question 7: While running or debugging the program, you will see that it prints out three pieces of mystery data. What are the following: DLL 1 mystery data 1, DLL 2 mystery data 2, and DLL 3 mystery data 3?
- DLL1 Mystery Data: We can see two arguments being passed to the function sub_10001038 in DLL1Print. One is the format string and the second is the value of EAX, which is a DWORD at a specific memory address. Looking at the write cross-references of the DWORD, we can see it holding the return value of the GetCurrentProcessId API call.
- DLL2 Mystery Data: Similarly, DLL2Print takes the return value of the CreateFile call which creates the temp.txt file in the same directory. It’s the file handle which is printed by the function.
- DLL3 Mystery Data: Here, the MultiByteToWideChar API call is used to convert the ASCII string, ping www.malwareanalysisbook.com, to Unicode. Once converted, the address of the string in memory is printed to console.
Question 8: How can you load DLL2.dll into IDA Pro so that it matches the load address used by OllyDbg?
- Select Manual Load while opening the binary in IDA
- Write the Image base address available in OllyDbg to ensure the two are synced
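If you would rather not reload the DLL, the same mapping can be done by hand: subtract the base the debugger assigned and add the base IDA is using. A small sketch of that arithmetic; the OllyDbg base below is an assumed example value, while 0x10000000 is the preferred base requested by the DLLs.

```python
def rebase(address: int, current_base: int, target_base: int) -> int:
    """Translate an address from one image base to another."""
    return address - current_base + target_base


PREFERRED_BASE = 0x10000000  # base requested by DLL1, DLL2 and DLL3
OLLY_BASE = 0x00870000       # assumed base assigned at run time, for illustration

olly_address = 0x00871000
ida_address = rebase(olly_address, OLLY_BASE, PREFERRED_BASE)
print(hex(ida_address))
```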