Chapter Seven focused on analyzing programs which are designed to run on the Windows operating system and make use of the Windows API exposed to developers to interact with the system, its kernel, and other resources available to the user.
Exercise 1
Hash | Name |
---|---|
c04fd8d9198095192e7d55345966da2e | Lab07-01.exe |
86ee262230cbf6f099b6086089da9eb9075b4521 | Lab07-01.exe |
0c98769e42b364711c478226ef199bfbba90db80175eb1b8cd565aa694c09852 | Lab07-01.exe |
Analyze the malware found in the file Lab07-01.exe
Question Number 1. How does this program ensure that it continues running (achieves persistence) when the computer is restarted?
Heading into the main function at 00401000, we can see the API call to StartServiceCtrlDispatcherA which (in oversimplified terms) is used to connect service programs to the Service Control Manager (SCM). It takes one parameter, lpServiceTable, for which MSDN states:
The lpServiceTable parameter contains an entry for each service that can run in the calling process.
So, it should contain entries for services which are to run under the process. This Dispatcher call is simply used to allow the SCM to send control and service start requests to the main thread of the service process. But to succeed, this function should be called within 30 seconds of this service process starting up; otherwise the call is likely going to fail.
Backtracking into the main function, we can see one entry being added to the SERVICE_TABLE. It’s the MalService for which the following details are added in the SERVICE_TABLE_ENTRY:
- lpServiceName: MalService
- lpServiceProc: Offset to the function sub_401040 [the ServiceMain function used by the calling process to execute the service, which the StartServiceCtrlDispatcher call in turn uses to dispatch control requests]
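Soon after the dispatcher call returns, sub_401040 is called again directly from main. To answer this question in whole, the program ensures persistence by installing itself as a service (MalService) on the compromised system; because the service is set to start automatically, the SCM relaunches it on every boot.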
Question Number 2. Why does this program use a mutex?
Moving into the function at 00401040, we see an OpenMutexA call at the start of the function followed by other WinAPI calls. Note that this isn’t a CreateMutex call, which actually creates the named mutual exclusion (mutex) object. It’s simply checking whether HGL345 already exists as a named mutex on the compromised system. If it does, the malware concludes that another copy is already running and exits instead of re-executing. If it doesn’t find the mutex, it proceeds to create the mutex, open the SC manager, create the service, and so on (basically kickstart the infection process). In short, the mutex guarantees that only one instance of the malware runs at a time.
Question Number 3. What is a good host-based signature to use for detecting this program?
The named mutex, HGL345, and the service, MalService, should both be good host-based signatures of this program being executed on a system.
Question Number 4. What is a good network-based signature for detecting this malware?
Peeking into the function StartAddress (called by the CreateThread API call), we can find a URL being referenced along with a User-agent string.
URL: http://www.malwareanalysisbook.com/
User-agent String: Internet Explorer 8.0
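To show how these two indicators could be combined into a check, here is a minimal Python sketch. It assumes the malware's traffic is a plain HTTP request carrying that User-Agent and Host; the function name and the sample request are made up for illustration, and a real IDS rule would be written in the sensor's own rule language.

```python
# Indicators taken from the StartAddress function described above.
MALWARE_USER_AGENT = "Internet Explorer 8.0"
MALWARE_HOST = "www.malwareanalysisbook.com"


def matches_lab07_01(raw_request: bytes) -> bool:
    """Return True if a raw HTTP request carries both indicators (hypothetical helper)."""
    text = raw_request.decode("latin-1")
    headers = {}
    # Skip the request line, then read "Name: value" pairs until the blank line.
    for line in text.split("\r\n")[1:]:
        if not line:
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return (
        headers.get("user-agent") == MALWARE_USER_AGENT
        and headers.get("host", "").lower() == MALWARE_HOST
    )


# Assumed shape of the request; the real one may carry extra headers.
sample = (
    b"GET / HTTP/1.1\r\n"
    b"User-Agent: Internet Explorer 8.0\r\n"
    b"Host: www.malwareanalysisbook.com\r\n\r\n"
)
print(matches_lab07_01(sample))
```

The User-Agent is the stronger of the two indicators, since a genuine Internet Explorer 8 sends a much longer Mozilla-style string.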
Question Number 5. What is the purpose of this program?
The questions don’t dig very deep into the malware’s objectives, but I’ve analyzed the rest and here’s my summary:
- The malware is a service process which connects to the SCM and installs itself as a service named ‘MalService’, set to run using the AUTO_START mode
- It creates a WaitableTimer object and then waits for the timer’s due time (set as the second parameter), which is derived from a SYSTEMTIME structure wherein the year is set to 2100 and the rest of the fields (hour, day, etc.) are set to zero, which makes it January 1, 2100. The thread stays blocked until that date and only then continues execution
- Once the wait on the object is over, the program attempts to spawn 20 threads (by means of a loop on ESI, initially set to 20 and then decremented till 0), each of which executes the StartAddress function (set as the lpStartAddress parameter to the CreateThread call) within the thread’s context
- The StartAddress function contacts the URL mentioned in Question Number 4 in an infinite loop. This looks like a denial-of-service component: every infected workstation starts connecting to the URL at the same moment, likely to bring it down
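For a feel of what the timer's due time looks like as a number, the sketch below converts January 1, 2100 into a Windows FILETIME value (100-nanosecond intervals since January 1, 1601). The helper names are mine, and treating the zeroed SYSTEMTIME fields as midnight on January 1 follows the summary above rather than anything the code verifies.

```python
from datetime import datetime, timedelta

# FILETIME counts 100-nanosecond ticks from this date.
FILETIME_EPOCH = datetime(1601, 1, 1)
TICKS_PER_SECOND = 10_000_000


def to_filetime(moment: datetime) -> int:
    """Convert a naive datetime into a FILETIME tick count."""
    delta = moment - FILETIME_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def from_filetime(ticks: int) -> datetime:
    """Inverse of to_filetime, truncated to microsecond precision."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


due_time = to_filetime(datetime(2100, 1, 1))
print(hex(due_time), from_filetime(due_time))
```

When testing dynamically, moving the VM clock past this date (or patching the wait) is the usual way to reach the thread-creation loop.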
Question Number 6. When will this program finish executing?
The WaitForSingleObject call returns on January 1st, 2100, which kickstarts the threads. However, since each thread executes a function with an infinite loop, the program never finishes executing on its own.
Exercise 2
Hash | Name |
---|---|
7bbc691f7e87f0986a1030785268f190 | Lab07-02.exe |
8a55adee743d1124105d3acd688db621e3d8802f | Lab07-02.exe |
bdf941defbc52b03de3485a5eb1c97e64f5ac0f54325e8cb668c994d3d8c9c90 | Lab07-02.exe |
Analyze the malware found in the file Lab07-02.exe.
Jumping into the main function at 00401000; right off the bat, we’re able to see an API call to OleInitialize which is used to initialize COM. Right after, we see the API call to CoCreateInstance which is used to instantiate an object from the class (via its GUID/CLSID) referenced in the parameters passed to the call.
Tracing the CLSID, we can see the offset referencing a four-part Data object. This is how Windows structures a typical GUID as we can see in this MSDN article. GUIDs are used to reference objects such as COM interfaces or classes. We can also see how these GUIDs are broken down. Since IDA has done a tremendous job at structuring the rclsid as a GUID, all we need to do now is convert these hex-encoded fields into the canonical GUID string.
Data 1: 2DF01h
Data 2: 0
Data 3: 0
Data 4: 0C0h, 6 dup(0), 46h
Data 1: 0002DF01 (4 bytes, 8 hex digits)
Data 2: 0000 (2 bytes, 4 hex digits)
Data 3: 0000 (2 bytes, 4 hex digits)
Data 4: C000 000000000046 (8 bytes, 16 hex digits)
GUID: 0002DF01-0000-0000-C000-000000000046
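The same conversion can be done mechanically. The following Python sketch rebuilds the GUID string from the four fields as IDA displays them; the function name is hypothetical, and the only library call is the standard `uuid.UUID(bytes=...)` constructor.

```python
import uuid


def guid_from_ida_fields(data1: int, data2: int, data3: int, data4: bytes) -> str:
    """Build the canonical GUID string from IDA's Data1..Data4 fields."""
    if len(data4) != 8:
        raise ValueError("Data4 must be exactly 8 bytes")
    # The printed form is simply each field in big-endian hex, joined with dashes.
    raw = (
        data1.to_bytes(4, "big")
        + data2.to_bytes(2, "big")
        + data3.to_bytes(2, "big")
        + data4
    )
    return str(uuid.UUID(bytes=raw)).upper()


# "0C0h, 6 dup(0), 46h" expands to C0, six zero bytes, then 46.
data4 = bytes([0xC0]) + bytes(6) + bytes([0x46])
clsid = guid_from_ida_fields(0x2DF01, 0, 0, data4)
print(clsid)  # 0002DF01-0000-0000-C000-000000000046
```

The same helper works for the RIID once its four fields are read out of the disassembly.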
Now that we have our GUID, let’s look for this CLSID under HKEY_CLASSES_ROOT\CLSID\.
Excellente! It’s the ‘Internet Explorer’ COM class. Before we move on, let’s have a primer on COM. CoCreateInstance gives us access to a COM class identified by a GUID (the CLSID we just decoded). This COM class implements one or several interfaces. Each interface, also identified by a GUID (the RIID in our case), in turn defines a vtable, i.e. a virtual function table. So basically, when you pass the CLSID, RIID, and PPV (pointer-to-pointer-to-vtable), you get the interface pointer back in PPV.
Alright, back to where we were. CLSID points to the IE COM class. To figure out the Interface the RIID is referring to, you can do two things:
- Google it?
- Acquire the Windows SDK and simply grep for the first 8 hex digits (Data 1 of the GUID, or any other unique part of it) and you should have your answer. These interfaces are declared in the SDK headers, as you can see for yourself (credit goes to Michael Bailey from the FLARE team; take a look at his awesome presentation here)
Now we have the Interface (IWebBrowser2) as well. Luckily, IDA also has several interfaces and the Virtual Tables they define. So, now that we’ve identified the interface, let’s create the virtual table using:
- Open the structures sub-window
- Press the Insert key to insert a new structure
- Keep the name whatever you want; select a Standard Structure from the button
- Select the {Interface}Vtbl structure from the drop-down, e.g. IWebBrowser2Vtbl
- Select the
Alright, so the virtual table is now set up too. Let’s go back to our text view. We can see a function call at 00401074 with the operand referencing the address at [edx+2Ch]. EDX is derived from the pointer to the requested interface (at 0040105C EAX gets the address, which is later moved into EDX). Since EDX points to the base of the virtual table, the offset 2Ch selects the function it’s going to call. If you’ve followed along and set up the IWebBrowser2Vtbl structure, simply right click the operand and change its representation (is that the right terminology?!). You should be able to see that it references the Navigate function and passes the URL http://www.malwareanalysisbook.com/ad.html to it. That’s it!
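If you'd rather sanity-check the offset by hand, the arithmetic is short: in a 32-bit binary every vtable slot is 4 bytes, so offset 2Ch is slot 11. The sketch below lists the first twelve IWebBrowser2 slots in the order I recall them from the SDK's exdisp.h (three IUnknown methods, four IDispatch methods, then the browser methods); treat that ordering as an assumption and confirm it against the header or IDA's IWebBrowser2Vtbl structure.

```python
POINTER_SIZE = 4  # 32-bit binary, so each vtable slot is 4 bytes wide

# First slots of the IWebBrowser2 vtable (assumed order, see the note above).
IWEBBROWSER2_VTABLE_PREFIX = [
    "QueryInterface", "AddRef", "Release",                       # IUnknown
    "GetTypeInfoCount", "GetTypeInfo", "GetIDsOfNames", "Invoke",  # IDispatch
    "GoBack", "GoForward", "GoHome", "GoSearch", "Navigate",     # IWebBrowser
]


def method_at_offset(offset: int) -> str:
    """Map a byte offset from the vtable base to a method name."""
    index, remainder = divmod(offset, POINTER_SIZE)
    if remainder:
        raise ValueError("offset is not aligned to a vtable slot")
    return IWEBBROWSER2_VTABLE_PREFIX[index]


print(method_at_offset(0x2C))  # Navigate
```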
Question Number 1. How does this program achieve persistence?
The program never attempted to persist itself.
Question Number 2. What is the purpose of this program?
Use the Navigate method of the IWebBrowser2 interface to visit http://www.malwareanalysisbook.com/ad.html.
Question Number 3. When will this program finish executing?
Soon after execution of the Navigate call. There’s no delay added to this program.
Exercise 3
Hash | Name |
---|---|
bd62dab79881bc6ec0f6be4eef1075bc | Lab07-03.exe |
c2f24c592d0a8e0e6bcaff8710ac6cde7819d151 | Lab07-03.exe |
3475ce2e4aaa555e5bbd0338641dd411c51615437a854c2cb24b4ca2c048791a | Lab07-03.exe |
290934c61de9176ad682ffdd65f0a669 | Lab07-03.dll |
a4b35de71ca20fe776dc72d12fb2886736f43c22 | Lab07-03.dll |
f50e42c8dfaab649bde0398867e930b86c2a599e8db83b8260393082268f2dba | Lab07-03.dll |
Premise: For this lab, we obtained the malicious executable, Lab07-03.exe, and DLL, Lab07-03.dll, prior to executing. This is important to note because the malware might change once it runs. Both files were found in the same directory on the victim machine. If you run the program, you should ensure that both files are in the same directory on the analysis machine. A visible IP string beginning with 127 (a loopback address) connects to the local machine. (In the real version of this malware, this address connects to a remote machine, but we’ve set it to connect to localhost to protect you.)
WARNING: This lab may cause considerable damage to your computer and may be difficult to remove once installed. Do not run this file without a virtual machine with a snapshot taken prior to execution.
This lab may be a bit more challenging than previous ones. You’ll need to use a combination of static and dynamic methods, and focus on the big picture in order to avoid getting bogged down by the details.
Analysis: Let’s start off with basic static analysis of LAB07-03.exe. Looking at the strings, I see some interesting API calls for file creation, mapping, and copying. Alright, so it might be trying to modify something on the filesystem. Looking further, we can see references to the strings:
- C:\Windows\System32\kerne132.dll which obviously isn’t the legitimate Kernel32.dll in the System32 folder (note the digit 1 in place of the letter l)
- C:\Windows\System32\Kernel32.dll which is the legitimate Windows DLL
- Lab07-03.dll which appears to be the DLL it might try and load. We’ll see this later on.
Let’s disassemble the binary using IDA and kick off our analysis with the main function. It starts off by comparing the argument count to 2 (and exits if it isn’t 2) and continues with a few operations which appear to be comparisons. Upon further analysis, the first argument is moved into EAX and compared, in a series of comparisons, against the string that ESI points to. What’s compared you might ask? It’s the warning string WARNING_THIS_WILL_DESTROY_YOUR_MACHINE. If the argument doesn’t match, the program exits.
If both pre-conditions are true, execution jumps to the CreateFileA API call; it was interesting to see it attempting to create the file C:\Windows\System32\Kernel32.dll (which obviously exists on every system). Well, this was new to me; CreateFile can also be used to acquire a handle to an existing file if the right dwCreationDisposition is passed to the API call. Here, 3 (OPEN_EXISTING) means it is simply attempting to open an existing file.
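For reference, the documented dwCreationDisposition values are small integers, so it helps to keep a lookup handy when reading pushes in the disassembly. A minimal Python sketch (the enum and helper names are mine; the numeric values are the ones defined by the Windows headers):

```python
from enum import IntEnum


class CreationDisposition(IntEnum):
    """dwCreationDisposition values accepted by CreateFile."""
    CREATE_NEW = 1
    CREATE_ALWAYS = 2
    OPEN_EXISTING = 3
    OPEN_ALWAYS = 4
    TRUNCATE_EXISTING = 5


def may_create_file(value: int) -> bool:
    """True if this disposition can bring a new file into existence."""
    return CreationDisposition(value) in {
        CreationDisposition.CREATE_NEW,
        CreationDisposition.CREATE_ALWAYS,
        CreationDisposition.OPEN_ALWAYS,
    }


print(CreationDisposition(3).name, may_create_file(3))
```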
Once the file is opened, it is mapped into memory using the CreateFileMappingA API call. Later, the CreateFile API call is made on LAB07-03.dll (with the same dwCreationDisposition as before, to open a handle to an existing file). Soon after, a CopyFileA call is made to copy LAB07-03.dll to C:\\Windows\\System32\\kerne132.dll.
Next, a call to the function sub_4011E0 is made with a single argument, C:\\*. Looking into the function, we can see the API call to FindFirstFileA which takes two parameters: first, the directory or path to search and second, a pointer to the structure (WIN32_FIND_DATA) which receives the returned data. Its return value is a search handle which can be used (in conjunction with FindNextFile) to find further files/folders (the first file found is stored in the FindFileData structure). Further into the function, we can see it calls itself with the lpFilename parameter, so it’s likely recursive in nature. Two other calls are interesting: strcmp, which compares a string (EBX) with ‘.exe’, and sub_4010A0, which is called if they match.
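The shape of that routine (enumerate a directory, recurse into sub-directories, act on names ending in .exe) can be sketched in a few lines of Python. This is only an analogy using `os.scandir` in place of FindFirstFileA/FindNextFile; the function name is hypothetical, and the case-insensitive extension match is my assumption rather than something read from the disassembly.

```python
import os
from typing import List


def find_executables(directory: str) -> List[str]:
    """Recursively collect paths whose names end in .exe."""
    found = []
    for entry in os.scandir(directory):
        if entry.is_dir(follow_symlinks=False):
            # Same idea as the function calling itself on each sub-directory.
            found.extend(find_executables(entry.path))
        elif entry.name.lower().endswith(".exe"):
            # The malware hands each match to sub_4010A0; here we just record it.
            found.append(entry.path)
    return found
```

Calling `find_executables` on a small test directory is enough to see the traversal order; nothing in the sketch modifies the files it finds.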
[Note: I had to peek at the solutions because I felt lost at the DWORD replacements string and how it was being used as the source for the MOVSX instructions]
Peeking into the sub_4010A0 function, we can see the API calls repeat for CreateFileA, CreateFileMappingA, and MapViewOfFile to read the file and load it into memory. The next instructions use IsBadReadPtr (which according to MSDN is now obsolete) to verify whether the pointer into the file is valid. If it is valid, a few instructions pass and another strcmp call is made to compare EBX (Str1) to kernel32.dll. If they match, a series of memory copy instructions follows (via the movsd/movsb instructions).
So, what does it move? repne scasb is functionally equivalent to the C strlen function, which calculates the length of a string. Next up, we have the movsb/movsd instructions, which copy data from the address in the SI register to the address in the DI register. SI initially points to a dword with hex-encoded data in it. How dumb of me; it was actually an ASCII string, which IDA can quickly convert for us by selecting it and pressing the A key. So, SI points to kerne132.dll. Similarly, DI here is set from EBX, the same string which was compared against kernel32.dll.
Okay, so it’s likely replacing legitimate kernel32.dll references with kerne132.dll in every .exe file it finds under the C:\ directory. Genius! I didn’t test this program dynamically just yet, but the solutions section of this particular exercise gave great insights:
When we open the modified Lab07-03.dll (now named kerne132.dll), we see that it now has an export section. Opening it in PEview, we see that it exports all the functions that kernel32.dll exported, and that these are forwarded exports, so that the actual functionality is still in kernel32.dll. The overall effect of this modification is that whenever an .exe file is run on this computer, it will load the malicious kerne132.dll and run the code in DLLMain. Other than that, all functionality will be unchanged, and the code will execute as if the program were still calling the original kernel32.dll.
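The trick works because both names are exactly twelve characters long, so one can overwrite the other in place without shifting anything else in the file. The Python sketch below demonstrates that property on an in-memory buffer only; the helper name and sample bytes are invented, and scanning the raw bytes is a simplification of what the malware does when it checks the DLL name it finds in the mapped file. Run in the opposite direction, the same swap is the basis for a cleanup script.

```python
def swap_dll_name(image: bytearray, old: bytes, new: bytes) -> int:
    """Overwrite every case-insensitive match of `old` with `new`, in place."""
    if len(old) != len(new):
        raise ValueError("names must be the same length for an in-place swap")
    lowered = bytes(image).lower()
    needle = old.lower()
    count = 0
    start = lowered.find(needle)
    while start != -1:
        image[start:start + len(new)] = new
        count += 1
        start = lowered.find(needle, start + len(needle))
    return count


# Made-up stand-in for a fragment of an import name table.
buffer = bytearray(b"\x00KERNEL32.dll\x00USER32.dll\x00")
original_length = len(buffer)

patched = swap_dll_name(buffer, b"kernel32.dll", b"kerne132.dll")
after_patch = bytes(buffer)

restored = swap_dll_name(buffer, b"kerne132.dll", b"kernel32.dll")
print(patched, after_patch, restored, bytes(buffer))
```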
Now, let’s take a look at the DLLMain function of LAB07-03.dll. Our first function of interest is sub_10001010. First of all, it checks to see if a mutex exists on the system. If it does, the DLL code exits. If not, it creates the mutex via CreateMutexA and calls WSAStartup to initialize the Winsock library.
Soon after, a socket is created for TCP communication with IPv4 addressing. Using inet_addr, the IP, 127.26.152.13, is converted into its binary equivalent. Using htons, the port, 80 (provided as an argument to the htons call), is converted into network byte order. Don’t know what host or network byte orders mean? Take a look here. Finally, the connect API call is made.
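To make the byte-order point concrete, this short Python sketch packs port 80 both ways and converts the dotted IP into its four raw bytes. It uses `struct.pack` and `socket.inet_aton` from the standard library as stand-ins for htons and inet_addr.

```python
import socket
import struct

# Network byte order is big-endian: most significant byte first.
port_network_order = struct.pack("!H", 80)
# x86 hosts are little-endian, so the same value is stored the other way round.
port_little_endian = struct.pack("<H", 80)

# inet_addr-style conversion of the dotted string into 4 raw bytes.
address_bytes = socket.inet_aton("127.26.152.13")

print(port_network_order.hex())  # 0050
print(port_little_endian.hex())  # 5000
print(address_bytes.hex())       # 7f1a980d
```

This is also why port 80 shows up as 0x5000 when the sockaddr structure is viewed as a little-endian word in a debugger.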
Next up, a send call is made and a ‘hello’ is sent to the C2 server. Soon after, the socket is shut down for send operations, as we can see from the ‘how’ argument pushed to the stack for the shutdown call.
Next, the program receives a command via the recv call; the received data is then compared to sleep and exec to decide which execution path to take.
If it is sleep, the thread sleeps for 393216 milliseconds (0x60000), or roughly 6.5 minutes, after which another hello is sent to the server. If the received command is exec, a process is spawned using lpCommandLine, which isn’t set up separately on the stack but is actually part of the receive buffer. The buffer is 4096 bytes long and the CommandLine parameter sits at offset 4091, i.e. 5 bytes into the buffer (4096 - 4091 = 5), so the command line is received as part of the network data from the C2 server.
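The dispatch logic described above can be summarized in a small Python sketch. It only parses a received buffer and reports what the backdoor would do; nothing is executed. The function name and return shape are mine, and reading the command line up to the first NUL byte is an assumption about how the string is terminated.

```python
SLEEP_MILLISECONDS = 0x60000  # 393216 ms, about 6.5 minutes
COMMAND_LINE_OFFSET = 5       # the command line starts 5 bytes into the buffer


def parse_command(buffer: bytes):
    """Classify a received buffer as ('sleep', seconds), ('exec', cmdline) or ('unknown', None)."""
    if buffer.startswith(b"sleep"):
        return ("sleep", SLEEP_MILLISECONDS / 1000)
    if buffer.startswith(b"exec"):
        # Everything from offset 5 up to the first NUL is treated as the command line.
        command_line = buffer[COMMAND_LINE_OFFSET:].split(b"\x00", 1)[0]
        return ("exec", command_line.decode("latin-1"))
    return ("unknown", None)


print(parse_command(b"exec calc.exe\x00"))
print(parse_command(b"sleep\x00"))
```

A fake C2 listener on port 80 that replies with one of these strings is a convenient way to exercise both branches during dynamic analysis.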
Question Number 1. How does this program achieve persistence to ensure that it continues running when the computer is restarted?
By rewriting the references to kernel32.dll in every EXE on the compromised system so that they point to kerne132.dll (a newly written, malicious DLL). Every modified program then loads the malicious DLL whenever it runs.
Question Number 2. What are two good host-based signatures for this malware?
- Mutex: SADFHUHF
- Filename: C:\Windows\System32\kerne132.dll
Question Number 3. What is the purpose of this program?
To act as a custom backdoor: the DLLMain function of kerne132.dll, which every patched EXE on the C drive now loads in place of kernel32.dll, receives commands (sleep, exec) from the C2 server and executes them.
Question Number 4. How could you remove this malware once it is installed?
Hard to remove. Almost every .EXE file on the C drive has been modified. But you could still place a copy of the legitimate kernel32.dll at kerne132.dll, or write a script that reverses the modification the same way it was applied.