It’s relatively easy to understand what’s going on when you are looking at a clean disassembly of a malware sample, but most of the time you have to deal with packers, obfuscation, encryption, junk code and so on; the list is long. There are many ways to handle each case, and everyone follows their own method, mainly because it’s hard to break the habit.
Take a look at this possible scenario; the picture comes from a real malware sample:
This piece of code sits inside a memory region allocated at runtime by the malware. As you can see, OllyDbg can’t directly tell you where the calls will lead. IDA can’t help you either:
The code contains a lot of calls like these, and each of them is used to execute a specific function from a system DLL. The question is: how do you recognize whether a call refers to a function exported by a DLL? How do you turn that raw code into a clear and readable version?
I have to understand what’s behind the dword referenced by the call, and the idea is to use a sort of database containing the addresses of the functions exported by the DLLs on the system. The database is actually a .txt file made of a sequence of lines, where each line stores information about a single exported function. Take a look at a few lines from the file I use as a database on an XP machine:
It’s simple: a number and a string. The string is the name of the function and the number is the memory address where the function starts. The database was created by a little application named DllExportAddressList. The utility retrieves the addresses of all the functions exported by one or more DLLs. It doesn’t write the text file itself because I didn’t want to waste time on file handling; I prefer to use cmd’s redirection operators. Some examples of how to use DllExportAddressList:
DllExportAddressList advapi32 > filename.txt
It creates the file “filename.txt” with the list of the functions exported by advapi32.dll.
DllExportAddressList kernel32 ntdll psapi >> filename.txt
It appends the functions exported by the three specified DLLs to the existing “filename.txt” file.
You don’t have to run the utility every time you analyze a new malware sample; create the database once from your clean, dedicated virtual machine and update it only when you need more hardcoded addresses.
This is the basis of the method. Now I need something able to read the content of the database and then apply the result to the disassembly: an IDC script. I have more than one script on my dedicated machine for this purpose, but the concept is always the same and they are all based on this one, the simplest:
I don’t think you need a detailed walkthrough of this piece of code; it basically searches for a string inside a file.
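To make the lookup concrete outside IDA, here is a minimal sketch in Python rather than IDC, so it is not the script mentioned above. It assumes each database line is a hexadecimal address followed by whitespace and the function name; the real output of DllExportAddressList may differ. The names load_export_db and lookup are hypothetical, and the addresses in SAMPLE are made up for illustration.

```python
import io


def load_export_db(lines):
    """Parse 'ADDRESS NAME' lines into a dict mapping address -> name.

    Assumed format: a hexadecimal address, whitespace, then the function name.
    Lines that do not fit this shape are skipped.
    """
    db = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or incomplete line
        try:
            address = int(parts[0], 16)
        except ValueError:
            continue  # first field is not a hex number
        db[address] = parts[1]
    return db


def lookup(db, address):
    """Return the exported function name at this address, or None."""
    return db.get(address)


# Made-up sample lines; real addresses depend on the OS build and the DLL.
SAMPLE = """\
7C801000 LoadLibraryA
7C802000 GetProcAddress
"""

if __name__ == "__main__":
    db = load_export_db(io.StringIO(SAMPLE))
    print(lookup(db, 0x7C801000))
    print(lookup(db, 0x00401000))
```

With a real database you would pass an open file instead of the in-memory sample, for example load_export_db(open("filename.txt")). Loading everything into a dict once is a design choice: it avoids rescanning the text file for every call you want to resolve.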
OK, this is the starting point. From here you can write another script that resolves a whole range of addresses, or even one that converts the instruction “call dword ptr ds:xxx” into “call ds:_FunctionName”. If you are interested and are having difficulty implementing the script, drop me an e-mail and I’ll help you.
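The conversion idea can be sketched on plain disassembly text. This is only an illustration in Python, not an IDC script: rewrite_call and CALL_RE are hypothetical names, read_dword stands in for whatever your tool offers to read a dword from the analyzed process memory, and all addresses below are made up.

```python
import re

# Matches 'call dword ptr ds:XXXXXXXX', with optional brackets or a trailing 'h'.
CALL_RE = re.compile(r"call\s+dword ptr ds:\[?([0-9A-Fa-f]+)h?\]?")


def rewrite_call(line, read_dword, db):
    """Rewrite an indirect call as 'call ds:_FunctionName' when it resolves.

    read_dword(address) must return the dword stored at that address, or None
    if it cannot be read. db maps function start addresses to export names.
    Lines that cannot be resolved are returned unchanged.
    """
    match = CALL_RE.search(line)
    if match is None:
        return line
    slot = int(match.group(1), 16)   # address of the dword used by the call
    target = read_dword(slot)        # value stored there: the real call target
    name = db.get(target)
    if name is None:
        return line
    return line[:match.start()] + "call ds:_" + name + line[match.end():]


if __name__ == "__main__":
    # Made-up memory contents and database entries, for illustration only.
    memory = {0x00A01000: 0x7C801000}
    db = {0x7C801000: "LoadLibraryA"}
    for text in ("call dword ptr ds:[00A01000]",
                 "call dword ptr ds:[00A02000]",
                 "mov eax, ebx"):
        print(rewrite_call(text, memory.get, db))
```

Inside a disassembler the same logic would read the dword and rename or comment the operand through the tool's own scripting API instead of editing text.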
It’s hard to break the habit
My approach could be worse, easier, inappropriate, insane, faster, more elegant or longer than any other approach, but it’s my favorite and I use it most of the time. I would like to know what you think about this method, but only after I have read what your preferred method is. :)