Recently I had the opportunity to retrieve some information from an iPhone backup; specifically, I was interested in WhatsApp messages. Nothing hard per se: it’s all inside some database files, and with a tool like SQLite Database Browser you can easily access all the needed information. The annoying part is interpreting the results of your queries.
I didn’t have an immediate deadline for the analysis, so I decided to code a little extractor for WhatsApp. Here are some screenshots:

- all the available chats on the top and the selected chat below:
Gui

- it can display images, movies and audio. A click will open the associated Windows program able to show/play the media:
Image
Movie
Audio

- it shows deleted messages:
Deleted

- events:
Events

- and messages that are not sent due to an internal error:
Not sent

- I can get the owner’s info:
Owner

- and the contacts info too:
Contacts

The program has been tested on this configuration:
OS: Win7/XP
iOS: 6.1.3
WhatsApp: 2.8.7
It’s just a personal project, but you can try it if you want. Download WABI from here

Two months ago Halvar Flake announced a new malware challenge for female reversers only. I’m a man, so I couldn’t submit a solution, but I decided to give it a try anyway: challenges are always fun. I shared this reversing session with Kayaker, so credit for this blog post goes to him too.

The solution posted by Marion Marschalek (congratulations!) is pretty nice and explains almost everything in detail. I’m not going to repeat the information she gave, but I would like to add something about the way I automatically named and resolved the imported functions.
The idea is to change instructions like:
.text:0040100B call dword ptr [ecx+220h]
into something like:
.text:0040100B call dword ptr [ecx+_API.malloc]

To perform this switch I wrote an IDC script. It contains two functions, GetAPINames and ResolveAPINames. The first retrieves the names of all the hidden APIs, while the second rewrites the call instructions into a readable form.
The script stores everything in a structure named _API, which has one member per API name. The structure is necessary, and I’ll also use it for some minor manual fixes.

GetAPINames
There’s no trace of clear API names inside the disassembly: everything is constructed at runtime inside the function at 402DB0. Take a look at this piece of code (junk lines removed):

00403E42 C6 44 24 24 43 mov [esp+179CCh+var_179A8], 'C'
...
00403E49 C6 44 24 25 72 mov [esp+179CCh+var_179A7], 'r'
...
00403EC1 C6 44 24 26 65 mov [esp+179CCh+var_179A6], 'e'
...
00403F14 C6 44 24 2F 61 mov [esp+179D4h+var_179A5], 'a'
00403F19 C6 44 24 30 74 mov [esp+179D4h+var_179A4], 't'
00403F1E C6 44 24 31 65 mov [esp+179D4h+var_179A3], 'e'
00403F23 C6 44 24 32 54 mov [esp+179D4h+var_179A2], 'T'
00403F28 C6 44 24 33 68 mov [esp+179D4h+var_179A1], 'h'
00403F2D C6 44 24 34 72 mov [esp+179D4h+var_179A0], 'r'
00403F32 C6 44 24 35 65 mov [esp+179D4h+var_1799F], 'e'
00403F37 C6 44 24 36 61 mov [esp+179D4h+var_1799E], 'a'
00403F3C C6 44 24 37 64 mov [esp+179D4h+var_1799D], 'd'
00403F41 C6 44 24 38 00 mov [esp+179D4h+var_1799C], 0

As you can see, the CreateThread string is built by storing one char at a time. The malware also builds names in a second, similar way:

00406EC9 C6 84 24 BC 0F 00 00 47 mov [esp+179CCh+var_16A10], 'G'
00406ED1 C6 84 24 BD 0F 00 00 65 mov [esp+179CCh+var_16A0F], 'e'
...
00406F4C C6 84 24 BE 0F 00 00 74 mov [esp+179CCh+var_16A0E], 't'
...
00406FA5 C6 84 24 C7 0F 00 00 4D mov [esp+179D4h+var_16A0D], 'M'
00406FAD C6 84 24 C8 0F 00 00 6F mov [esp+179D4h+var_16A0C], 'o'
00406FB5 C6 84 24 C9 0F 00 00 64 mov [esp+179D4h+var_16A0B], 'd'
00406FBD C6 84 24 CA 0F 00 00 75 mov [esp+179D4h+var_16A0A], 'u'
00406FC5 C6 84 24 CB 0F 00 00 6C mov [esp+179D4h+var_16A09], 'l'
00406FCD C6 84 24 CC 0F 00 00 65 mov [esp+179D4h+var_16A08], 'e'
00406FD5 C6 84 24 CD 0F 00 00 46 mov [esp+179D4h+var_16A07], 'F'
00406FDD C6 84 24 CE 0F 00 00 69 mov [esp+179D4h+var_16A06], 'i'
00406FE5 C6 84 24 CF 0F 00 00 6C mov [esp+179D4h+var_16A05], 'l'
00406FED C6 84 24 D0 0F 00 00 65 mov [esp+179D4h+var_16A04], 'e'
00406FF5 C6 84 24 D1 0F 00 00 4E mov [esp+179D4h+var_16A03], 'N'
00406FFD C6 84 24 D2 0F 00 00 61 mov [esp+179D4h+var_16A02], 'a'
00407005 C6 84 24 D3 0F 00 00 6D mov [esp+179D4h+var_16A01], 'm'
0040700D C6 84 24 D4 0F 00 00 65 mov [esp+179D4h+var_16A00], 'e'
00407015 C6 84 24 D5 0F 00 00 45 mov [esp+179D4h+var_169FF], 'E'
0040701D C6 84 24 D6 0F 00 00 78 mov [esp+179D4h+var_169FE], 'x'
00407025 C6 84 24 D7 0F 00 00 41 mov [esp+179D4h+var_169FD], 'A'
0040702D C6 84 24 D8 0F 00 00 00 mov [esp+179D4h+var_169FC], 0

The way GetModuleFileNameExA is created is pretty similar to the previous one, but there’s a little difference: look at the opcodes. The mov instructions are the same, but the ModR/M byte differs (0x44 versus 0x84), selecting an 8-bit displacement in the first case and a 32-bit displacement in the second.
Is it possible to recognize and isolate all the instructions used to create those strings? Well, it’s not so hard because some bytes are fixed! The idea is to parse all the instructions inside 402DB0, trying to recognize those two special mov instructions:

if (Byte(currAddress) == 0xC6) {
  if (Byte(currAddress+1) == 0x44) {
    if (Byte(currAddress+2) == 0x24) {
      if (Byte(currAddress+4) != 0x00) {
        // Get current char and append it to partial name
        szChar = sprintf("%c", Byte(currAddress+4));
        szAPI = sprintf("%s", szAPI + szChar);
      } else {
        if ((strstr(szAPI, ".DLL") == -1) && (strstr(szAPI, ".dll") == -1))
        {
          // Add member to struct (no DLL name)
          AddStrucMember(id, szAPI, -1, FF_DATA, -1, 4);
        }
        szAPI = sprintf("%s", ""); // reset for next string
      }
    }
  }
}

The instructions are checked byte by byte and the strings are built char by char. szChar is the current char to append to the partial string szAPI.
There’s a little problem with this parser: it reconstructs DLL names too. I’m not interested in DLL names, so a check on the completed string is necessary:

if ((strstr(szAPI, ".DLL") == -1) && (strstr(szAPI, ".dll") == -1))

Now that I’m sure I don’t have a DLL name I can insert it into the structure:

AddStrucMember(id, szAPI, -1, FF_DATA, -1, 4);

The structure is fundamental for the script.
Now that you know how to parse the first type of mov instruction, you can change a few of the checks on the fixed bytes to retrieve names like GetModuleFileNameExA too:

if (Byte(currAddress) == 0xC6) {
  if (Byte(currAddress+1) == 0x84) {
    if (Byte(currAddress+2) == 0x24) {
      if (Byte(currAddress+5) == 0x00) {
        if (Byte(currAddress+6) == 0x00)     {
          if (Byte(currAddress+7) != 0x00) {
            ...
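
As a rough illustration of the two checks above, here is a minimal sketch in Python rather than IDC. It works on a plain byte buffer instead of an IDA database, and it resynchronises byte by byte when nothing matches, which is a simplification (a real script walks instruction by instruction). The names extract_stack_strings, mov_disp8 and mov_disp32 are hypothetical helpers made up for this example, and the DLL filter is case-insensitive, which is slightly broader than the two strstr checks.

```python
def mov_disp8(disp, char):
    """Encode 'mov byte ptr [esp+disp8], imm8' (C6 44 24 disp8 imm8)."""
    return bytes([0xC6, 0x44, 0x24, disp & 0xFF, char])


def mov_disp32(disp, char):
    """Encode 'mov byte ptr [esp+disp32], imm8' (C6 84 24 disp32 imm8)."""
    return bytes([0xC6, 0x84, 0x24]) + disp.to_bytes(4, "little") + bytes([char])


def extract_stack_strings(code):
    """Collect the strings stored one char at a time by the two mov forms."""
    names, current, i = [], [], 0
    while i < len(code):
        if code[i:i + 3] == b"\xC6\x44\x24" and i + 5 <= len(code):
            char, size = code[i + 4], 5          # immediate is the 5th byte
        elif (code[i:i + 3] == b"\xC6\x84\x24" and i + 8 <= len(code)
              and code[i + 5] == 0 and code[i + 6] == 0):
            char, size = code[i + 7], 8          # immediate is the 8th byte
        else:
            i += 1                               # not a match: move one byte on
            continue
        if char != 0:
            current.append(chr(char))            # append to the partial name
        else:
            name = "".join(current)              # terminator: the name is complete
            current = []
            if name and ".dll" not in name.lower():
                names.append(name)               # keep API names, drop DLL names
        i += size
    return names
```

Feeding it the encoded bytes of "CreateThread", a DLL name and "GetModuleFileNameExA" (each followed by a zero terminator) should return only the two API names.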

This part of the script works pretty well, but it has a little problem with a few functions. Here is an example with GetQueuedCompletionStatus:

00405C2A mov [esp+179CCh+var_179A8], 'G'
00405C31 mov [esp+179CCh+var_179A7], 'e'
00405CA9 mov [esp+179CCh+var_179A6], 't'
00405CF3 mov bl, 'Q'
00405CFE mov [esp+179D4h+var_179A5], bl
00405D02 mov [esp+179D4h+var_179A4], 'u'
00405D07 mov [esp+179D4h+var_179A3], 'e'
00405D0C mov [esp+179D4h+var_179A2], 'u'
00405D11 mov [esp+179D4h+var_179A1], 'e'
00405D16 mov [esp+179D4h+var_179A0], 'd'
00405D1B mov [esp+179D4h+var_1799F], 'C'
00405D20 mov [esp+179D4h+var_1799E], 'o'
00405D25 mov [esp+179D4h+var_1799D], 'm'
...

As you can see, the letter ‘Q’ is stored by a sequence of two instructions (it is first loaded into bl) and my script is not able to catch it, so it creates the name GetueuedCompletionStatus. The human brain is said to recognize words even with a few missing or scrambled letters, so I think I can pass over this minor problem!
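
For the curious, this is one way the missing letter could be recovered. It is a Python sketch, not part of my IDC script. It assumes the instructions have already been split one per item, and that the two instructions are encoded as B3 imm8 (mov bl, imm8) and 88 5C 24 disp8 (mov [esp+disp8], bl); the second assumption is consistent with the 4-byte gap between 00405CFE and 00405D02 in the listing. The function name chars_from_instructions is hypothetical.

```python
def chars_from_instructions(instructions):
    """Rebuild a stack string, tracking a char that is first loaded into bl."""
    bl = None                                    # last immediate moved into bl
    out = []
    for ins in instructions:
        if len(ins) == 2 and ins[0] == 0xB3:
            bl = ins[1]                          # mov bl, imm8
        elif len(ins) == 5 and ins[:3] == b"\xC6\x44\x24":
            out.append(ins[4])                   # mov byte ptr [esp+disp8], imm8
        elif len(ins) == 4 and ins[:3] == b"\x88\x5C\x24" and bl is not None:
            out.append(bl)                       # mov [esp+disp8], bl
    # Stop at the first terminator, as the runtime string would
    return bytes(out).split(b"\x00")[0].decode("ascii")


# The first instructions of the GetQueuedCompletionStatus sequence
sample = [
    bytes([0xC6, 0x44, 0x24, 0x24, ord("G")]),
    bytes([0xC6, 0x44, 0x24, 0x25, ord("e")]),
    bytes([0xC6, 0x44, 0x24, 0x26, ord("t")]),
    bytes([0xB3, ord("Q")]),
    bytes([0x88, 0x5C, 0x24, 0x2F]),
    bytes([0xC6, 0x44, 0x24, 0x30, ord("u")]),
    bytes([0xC6, 0x44, 0x24, 0x31, ord("e")]),
]
print(chars_from_instructions(sample))
```

Only bl and the 8-bit displacement form are handled here; other registers would need their own opcode checks.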

ResolveAPINames
Ok, now that I have the _API structure I need to use it for the resolution part. The function ResolveAPINames scans the entire disassembled code trying to fix the relevant calls. To identify a call you can use a simple strstr, and to convert it you can use OpStroff (it converts an operand into an offset in a structure):

if(strstr(GetDisasm(ea), "call dword ptr") != -1) {
  OpStroff(ea, 0, GetStrucIdByName("_API"));

The malware uses a nice addressing method and IDA is not able to identify the hidden APIs by itself, but the nature of the addressing method lets us solve the problem with only a few lines of code. Now you can understand why the structure is the core of the entire script.
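
To make the idea concrete outside IDA, here is a small Python sketch of what the structure gives us: a table from byte offset to API name, applied to the text of a call instruction. It is only an illustration. Inside IDA, OpStroff does this job on the database itself. The functions build_offset_table and resolve_call are hypothetical, and the name list is a placeholder in which only the two offsets shown in this post (220h for malloc, 1FCh for srand) are filled in.

```python
import re

# Matches operands such as [ecx+220h] or [eax+8] in IDA-style text
CALL_RE = re.compile(r"call\s+dword ptr \[(\w+)\+([0-9A-Fa-f]+)h?\]")


def build_offset_table(names, member_size=4):
    """Each structure member is a 4-byte pointer, so member i sits at i * 4."""
    return {index * member_size: name for index, name in enumerate(names)}


def resolve_call(line, table, struct_name="_API"):
    """Rewrite 'call dword ptr [reg+offset]' using the offset -> name table."""
    def replace(match):
        offset = int(match.group(2), 16)
        name = table.get(offset)
        if name is None:
            return match.group(0)                # unknown offset: leave untouched
        return "call dword ptr [%s+%s.%s]" % (match.group(1), struct_name, name)
    return CALL_RE.sub(replace, line)


# Placeholder member names; only the two offsets quoted in the post are real
names = ["api_%d" % i for i in range(200)]
names[0x220 // 4] = "malloc"
names[0x1FC // 4] = "srand"
table = build_offset_table(names)

print(resolve_call(".text:0040100B call dword ptr [ecx+220h]", table))
```

The same table also answers the manual case: looking up 0x1FC gives the member to pick by hand.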

Manual fix
The script is able to resolve 678 calls, but it fails to fix some special cases like this:

00414E36 lea esi, [eax+1FCh]
00414E3C call dword ptr [eax+_API.GetTickCount]
00414E42 push eax
00414E43 call dword ptr [esi]

GetTickCount has been resolved, but the next call has not. It’s obvious that esi points to the API at offset 0x1FC. You can fix it manually because the structure contains it: right-click on 1FCh and select the entry “[eax+_API.srand]”. Now you know how to manually fix special cases too.

I enjoy browsing crackmes.de, and I try a crackme from time to time. I didn’t intend to write a tutorial for this crackme, but after reading a comment from andrewl.us I decided to spend a few words on it.
andrewl.us’s comment is pretty simple: “two solutions accepted, and neither uses a deobfuscator that emits a cleaner crackme”. That’s the point: nowadays only a few people spend time writing a detailed, original and complete solution, and most of them prefer to say “I did it”… that’s sad, and it’s not the spirit of crackmes.de!

Intro
The protection routine created by the author is really simple, but the obfuscation applied to it makes the reversing session hard. Look here and you’ll understand how the code works:

406E2A jmp short loc_406E2D
...
406E2D nop
406E2E jmp loc_405B10
...
405B10 nop
405B11 nop
405B12 and al, 0FFh
405B14 jmp loc_4053CC
...

A lot of jumps and a lot of junk code. The real challenge is to identify the valid instructions of the crackme, because most of the code is useless junk. To get the valid instructions you can:
1. step through all the crackme code with a debugger, trying to tell valid instructions from junk
2. deobfuscate the crackme in some way

Point #1 is the easy way, because you will surely find the valid instructions by checking the code instruction by instruction. In this case it’s a good approach because the crackme is only 35 KB, but would you apply this method to a huge obfuscated file? That’s why I opt for point #2!

Due to the nature of the protected code I think it’s not possible to write a fully static program able to deobfuscate an exe protected with this method, but if you think of the exe as a set of procedures combined together you can produce something pretty close to the original untouched program.
To solve this crackme I’ll write an IDC script able to extract the valid instructions from a single procedure. With this in mind you can then put everything together and obtain a working deobfuscated crackme.

The deobfuscator
The idea is pretty simple: starting from a specific address, the script traces through the instructions trying to identify (and skip) junk code. The script prints all the valid instructions in the output area.
First of all I have to create the skeleton of the script.

static main()
{
   do {
      check_current_instruction;
      if (!junk_instruction)
         print_valid_instruction;
      get_address_next_instruction;
   }
   while (valid_address_instruction);
}

The script checks every single instruction to decide whether it’s valid or junk. I’ll show you later how the check is done; for now take a look at the function used to print a valid instruction:

static print_valid_instruction(address)
{
   auto instrLen;
   auto op1, op2;

   Jump(address);
   MakeUnkn(address, 0);
   instrLen = MakeCode(address);
   Message("\n%X: %s", address, GetMnem(address));
   op1 = GetOpnd(address, 0);
   if (op1 != 0x00)
      Message(" %s", op1);
   op2 = GetOpnd(address, 1);
   if (op2 != 0x00)
      Message(", %s", op2);
   return instrLen;
}

The function receives the address of the current valid instruction and prints it in the output area.
I used Jump, MakeUnkn and MakeCode because the listing produced by IDA has a lot of undefined bytes: with an undefined instruction it is not always possible to get the right mnemonic for a specific opcode.
Once the instruction has been decoded you can print it with its operands. The function returns the length of the current instruction, which is used to get the address of the next one to check.

Nothing hard; let’s see how to identify the junk code. The idea is to isolate all the unnecessary instructions. First of all, to get a general idea of the crackme’s code, take a look at these blocks:

.jmp:00405B10 90 nop
.jmp:00405B11 90 nop
.jmp:00405B12 24 FF and al, 0FFh <-- al remains the same
.jmp:00405B14 E9 B3 F8 FF FF jmp loc_4053CC
...
.jmp:00408969 40 inc eax <--
.jmp:0040896A 48 dec eax <-- eax remains the same
.jmp:0040896B 90 nop
.jmp:0040896C 90 nop
.jmp:0040896D E9 14 D4 FF FF jmp near ptr unk_405D86
...
.jmp:00405CC3 90 nop
.jmp:00405CC4 80 EB 00 sub bl, 0 <-- bl remains the same
.jmp:00405CC7 E9 34 0E 00 00 jmp near ptr unk_406B00
...
.jmp:00407163 50 push eax
.jmp:00407164 04 66 add al, 66h <-- add inside push/pop it's nonsense...
.jmp:00407166 58 pop eax
.jmp:00407167 E9 0C EB FF FF jmp near ptr unk_405C78

If you execute the snippets above you’ll find that the state of the registers involved remains the same; no changes are applied.
So, what can we put inside the junk code set? All the “jmp” instructions, “nop”, “and” with 0xFF and so on… If you look at the code I think you’ll quickly find out what to mark as junk.
To give you an idea, here is the part of the script used to skip some specific instructions:

   opcode = Byte(currAddress); // Get the byte at the current address
   ...
   else if ((opcode == 0x24) && (Byte(currAddress+1) == 0xFF)) // Avoid "and op1, 0xFF" instruction
      currAddress = currAddress + 2; // Move to the next instruction
   ...
   else if ((opcode == 0x60) && (Byte(currAddress+2) == 0x61)) // Avoid instructions from pushad to popad
      currAddress = currAddress + 3;
   ...
   else if (opcode == 0x90) // Avoid nop instruction
      currAddress = currAddress + 1;
   ...
   if (opcode == 0xE9) // Avoid jump instruction
      currAddress = (currAddress + Dword(currAddress+1) + 5) & 0x00000000FFFFFFFF;

Instead of printing the instruction I simply skip it and move on to the next one. You don’t need MakeCode or anything else here because the script works directly on the raw bytes.
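
The same skipping logic can be sketched outside IDA. The Python below is only an illustration, not my script: memory is a plain dictionary from address to byte, the helper names load and skip_junk are hypothetical, and only junk patterns that appear in the listings of this post are recognised. Note that the rel32 operand of jmp is read as a signed value, which is what the mask in the IDC version achieves through wrap-around.

```python
import struct


def load(mem, address, data):
    """Copy raw bytes into the address -> byte dictionary."""
    for offset, value in enumerate(data):
        mem[address + offset] = value


def skip_junk(mem, address, max_steps=10000):
    """Follow jumps and skip junk until a non-junk opcode is reached."""
    for _ in range(max_steps):
        b0, b1, b2 = (mem.get(address + k) for k in range(3))
        if b0 == 0x90:                                   # nop
            address += 1
        elif b0 == 0x24 and b1 == 0xFF:                  # and al, 0FFh
            address += 2
        elif b0 == 0x40 and b1 == 0x48:                  # inc eax / dec eax
            address += 2
        elif b0 == 0x80 and b1 == 0xEB and b2 == 0x00:   # sub bl, 0
            address += 3
        elif b0 == 0xE9:                                 # jmp rel32
            raw = bytes(mem[address + 1 + k] for k in range(4))
            rel = struct.unpack("<i", raw)[0]            # signed displacement
            address = (address + 5 + rel) & 0xFFFFFFFF
        else:
            return address                               # first valid instruction
    raise RuntimeError("too many junk steps, possible endless loop")


mem = {}
load(mem, 0x405B10, bytes.fromhex("909024FFE9B3F8FFFF"))  # block at 00405B10
load(mem, 0x4053CC, bytes.fromhex("33D2"))                # placeholder valid bytes
load(mem, 0x408969, bytes.fromhex("40489090E914D4FFFF"))  # block at 00408969
load(mem, 0x405D86, bytes.fromhex("3BD0"))                # placeholder valid bytes
print(hex(skip_junk(mem, 0x405B10)))
```

The two loaded blocks are the ones shown earlier, so the jump targets can be checked against the loc_4053CC and unk_405D86 labels in the listing.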
Ok, is that really all? Hmm, there is something more to add. You have to pay particular attention to one point: how do you handle a conditional jump?
The question deserves an answer because sooner or later the crackme will perform a check and you’ll have to face a conditional jump. I decided to let you choose the path to follow! What does that mean?
I remove all the conditional jumps except the ones that follow a compare instruction. I need to remember that a compare has been executed:

   else if (opcode == 0x3B) { // Compare found, take it in mind!
      cmp = 1;
      currAddress = currAddress + print_valid_instruction(currAddress);
   }

The variable “cmp” records that a compare instruction has been executed; it is initialized to -1. Here is how I handle a conditional jump instruction:

   else if ((opcode == 0x74) || (opcode == 0x75)) { // jz/jnz short
      // rel8 is a signed displacement: sign-extend it so backward jumps work too
      rel = Byte(currAddress+1);
      if (rel > 0x7F)
         rel = rel - 0x100;
      // Check to see if a cmp instruction has been executed
      if (cmp == 1) {
         Jump(currAddress);
         jump = AskYN(0, "Want to jump?"); // Box with a question
         if (jump == 0) // No jump
            currAddress = currAddress + 2;
         else // Jump
            currAddress = (currAddress + rel + 2) & 0x00000000FFFFFFFF;
         cmp = -1; // Restore the original value
      }
      else
         currAddress = (currAddress + rel + 2) & 0x00000000FFFFFFFF;
   }

That’s what I meant when I said “I decided to let you choose the path to follow”. By pressing Yes or No in the box that appears you decide the flow of the crackme code; it’s just like a live trace.
I decided not to print the conditional jump, but if you think it will help you, add it to the set of valid instructions.

That’s how my deobfuscator works (get it from here). I don’t need anything else, because with the output produced by the script I can study the entire protection routine as a dead listing without having to debug the program.
For those who want to study the deobfuscated crackme using a debugger, it’s possible to produce a clean crackme too. Instead of printing the result in the output area you could copy the valid instructions somewhere inside the file. There are a lot of free bytes inside the exe; just choose a starting point and, with some minor adjustments to combine all the deobfuscated procedures as I suggested before, you can have a working clean version of the crackme.

The protection routine
The algorithm used by the crackme is really simple, but it requires a brute-force approach…

4092BA: mov cx, 54h
408B0D: shl ecx, 10h
405507: mov cx, 5854h
405680: push ecx <-- push "TXT"
4085F4: mov cx, 2E59h
407CC1: shl ecx, 10h
405B4C: mov cx, 454Bh
4087F4: push ecx <-- push "KEY."
406705: mov edx, esp <-- edx -> "KEY.TXT"
40881F: mov cx, 40h
407FAF: shl ecx, 10h
408603: mov cx, 812Fh
405435: push ecx <-- push 0x40812F: first instruction to execute after CreateFileA
4078B7: xor eax, eax
408EDE: push eax
407E92: mov al, 50h
408810: push eax <-- Flags & attributes: 0x50: FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_DEVICE
4070EB: mov al, 3
4052CF: push eax <-- OPEN_EXISTING
405B5C: mov al, 0
408441: push eax
406E3B: push eax
40868B: mov al, 50h
40610A: shl eax, 18h
408AF1: push eax <-- Desired access: 0x50000000: GENERIC_WRITE | GENERIC_ALL
4080CC: push edx <-- Filename: "KEY.TXT"
4092CE: call CreateFileA
4092D3: ret

From the readme file we already know that the protection routine is based on a keyfile; now we know the name of the file to produce: KEY.TXT
At the beginning of the tutorial I told you that it’s not possible to write a fully working static deobfuscator, and now you can understand why: look at the push at 405435. The address on the stack (0x40812F) is the point where the procedure will return after the ret instruction at 4092D3. How can you identify that value as a possible return address? Well, maybe it’s possible, but it’s hard…
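
The “mov cx / shl ecx, 10h / mov cx” idiom used in the listing, both for the filename and for the fake return address, is easy to replay by hand. Here is a tiny Python sketch of it; build_dword and stack_string are hypothetical helper names chosen for this example.

```python
import struct


def build_dword(high16, low16):
    """Replay 'mov cx, high ; shl ecx, 10h ; mov cx, low'."""
    return ((high16 & 0xFFFF) << 16) | (low16 & 0xFFFF)


def stack_string(pushed):
    """Read the string left on the stack by a series of dword pushes.

    The last value pushed sits at the lowest address, so it comes first
    in memory; every dword is stored in little-endian order.
    """
    raw = b"".join(struct.pack("<I", value) for value in reversed(pushed))
    return raw.split(b"\x00")[0].decode("ascii")


pushes = [build_dword(0x54, 0x5854), build_dword(0x2E59, 0x454B)]
print(stack_string(pushes))                  # the filename built on the stack
print(hex(build_dword(0x40, 0x812F)))        # the address pushed before the call
```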

Ok, back to our protection routine: after executing CreateFileA, the code continues from instruction 40812F:

4077E5: xor edx, edx
4068B9: dec edx
407FDC: cmp edx, eax <-- check
4092D9: call ExitProcess <-- here if check fails

The snippet above contains a first check, which is done on the keyfile handle: eax is compared with 0xFFFFFFFF (INVALID_HANDLE_VALUE). If the file doesn’t exist the crackme ends. As you’ll see there are some more checks performed by the crackme; all of them end with a call to ExitProcess without a single error message.
To pass this check create the file and put some bytes inside. As you can guess, the next piece of code reads the content of the keyfile:

407FDC: cmp edx, eax
406372: xor edx, edx
406C87: push edx <-- push 0
40909F: push edx <-- push 0
405DD3: push edx <-- push 0
405D95: mov ebp, esp
406FFC: push edx <-- push 0
405B01: mov esi, esp
407426: push eax <-- push file handle
407235: mov cx, 40h
406263: shl ecx, 10h
406F38: mov cx, 5ADDh
40759B: push ecx <-- push 0x405ADD: first instruction to execute after ReadFile
40778A: push edx
4089C3: push esi <-- bytes read
407181: push 18h
4070DC: push ebp <-- buffer that receives the data
408E56: push eax <-- file handle
4092C8: call ReadFile
4092CD: retn

An obvious consequence of CreateFile: ReadFile. It’s time to see how the crackme uses the keyfile’s bytes.

405426: mov esp, ebp <-- ESP points to the contents of the keyfile
40764F: mov cx, 0F45Ah
40782F: shl ecx, 10h
405DA4: mov cx, 675Dh
4066E6: mov esi, ecx <-- ESI = 0xF45A675D
407B95: mov cx, 4DDAh
40892D: shl ecx, 10h
407CDF: mov cx, 0FA31h
406B4C: mov edi, ecx <-- EDI = 0x4DDAFA31
406083: pop eax <-- EAX = first 4 bytes of the file (Key_1)
405999: pop ecx <-- ECX = second 4 bytes of the file (Key_2)
405A04: push ecx
405EC3: push eax
407B59: xor eax, esi <--
4075AA: xor ecx, edi <-- operations based on Key_1 and Key_2
4067C8: add eax, ecx <--
406F47: mov cx, 0BAE3h
406F83: shl ecx, 10h
406BB4: mov cx, 0DC73h <-- ECX = 0xBAE3DC73
405544: cmp eax, ecx <-- first check

This first check is based on the first eight bytes of the keyfile. The scheme is:

(Key_1 ^ 0xF45A675D) + (Key_2 ^ 0x4DDAFA31) = 0xBAE3DC73

Nothing hard per se, but without other information, solving this equation (the addition is a 32-bit add, so it wraps around) means writing a brute-force algorithm over Key_1 and Key_2. Really expensive in terms of time…

408DCE: mov cx, 0AADDh
40846E: shl ecx, 10h
406E48: mov cx, 357Dh
408324: mov esi, ecx <-- ESI = 0xAADD357D
408DEC: mov cx, 44FAh
405FED: shl ecx, 10h
405B2E: mov cx, 0FC3Ch
4059A9: mov edi, ecx <-- EDI = 0x44FAFC3C
4069F3: pop eax <-- EAX = Key_1
4074BB: pop ecx <-- ECX = Key_2
4085A9: push ecx
40919F: push eax
4051FC: xor eax, esi <--
408EB0: xor ecx, edi <-- operations based on Key_1 and Key_2
409143: add eax, ecx <--
407901: mov cx, 0F23Fh
40893C: shl ecx, 10h
406335: mov cx, 2C88h <-- ECX = 0xF23F2C88
406B87: cmp eax, ecx <-- second check

Another check over the keyfile’s content. As you can see it’s pretty similar to the previous one:

(Key_1 ^ 0xAADD357D) + (Key_2 ^ 0x44FAFC3C) = 0xF23F2C88

Two equations and two variables: we can reduce the time of the brute-force algorithm because you only have to test all the possible values of Key_1 (Key_2 is obtained as a consequence from the first equation, and the pair is then checked against the second).
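
Here is a minimal Python sketch of that reduction, assuming 32-bit wrap-around arithmetic as in the add instructions above. The function names key2_for and search are hypothetical. A pure Python loop over all 2^32 values of Key_1 is slow, so take it as a description of the method and run the full search in a compiled language.

```python
MASK = 0xFFFFFFFF

# Constants taken from the two checks
A1, B1, C1 = 0xF45A675D, 0x4DDAFA31, 0xBAE3DC73
A2, B2, C2 = 0xAADD357D, 0x44FAFC3C, 0xF23F2C88


def key2_for(key1):
    """Solve the first equation for Key_2 once Key_1 is fixed."""
    return ((C1 - (key1 ^ A1)) & MASK) ^ B1


def search(key1_candidates):
    """Yield the (Key_1, Key_2) pairs that also satisfy the second equation."""
    for key1 in key1_candidates:
        key2 = key2_for(key1)
        if (((key1 ^ A2) + (key2 ^ B2)) & MASK) == C2:
            yield key1, key2


# A tiny slice of the search space, just to show the call
for key1, key2 in search(range(0x6C6C6540, 0x6C6C6550)):
    print(hex(key1), hex(key2))
```

Every pair returned passes both checks; choosing among them is a separate matter, since the checks alone do not make the solution unique.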

407E66: pop esi <-- ESI = Key_1
407FCD: pop edi <-- EDI = Key_2
407228: pop ebx <-- EBX = third 4 bytes of the file (Key_3)
408144: mov cx, 2164h
406AF1: shl ecx, 10h
406A4C: mov cx, 3172h <-- ECX = 0x21643172
40602A: cmp ecx, ebx <-- third check

Four more bytes join the party. This time there are no math or logic operations, just a plain comparison: Key_3 must be “r1d!” (0x21643172 stored in little-endian order). We now know part of the keyfile, but this information doesn’t reduce our brute-force time…

408C84: mov cx, 6F04h
406290: shl ecx, 10h
4091E8: mov cx, 530Ah <-- ECX = 0x6F04530A
406F65: xor ecx, edi <-- Key_2 ^ 0x6F04530A
406F57: push ecx
407514: mov cx, 0F0Fh
405408: shl ecx, 10h
40597B: mov cx, 101Bh <-- ECX = 0x0F0F101B
40819E: xor ecx, esi <-- Key_1 ^ 0x0F0F101B
406BD3: push ecx
408D0B: mov eax, esp
40638F: xor ebp, ebp
405284: push ebp
407AD2: push ebp <-- Caption: NULL
408EBF: push eax <-- Text:
406E95: push ebp <-- 0
4092D4: call MessageBoxA <-- CONGRATULATION BOX
4092D9: call ExitProcess

This is the final part of the crackme: it shows a congratulation message box. The text of the box is obtained by XORing Key_1 and Key_2 with two constants, so it is readable only with the correct values.

To sum up: there are three checks to pass. The first two are based on Key_1 and Key_2, while the last one is on Key_3 only. The value of Key_3 is fixed, and I have to find the first two dwords (Key_1 and Key_2) using a brute-forcer. The solution is not unique: several different keyfiles pass the checks, but I think the valid solution is the one which reveals the right text inside the message box. For me the valid text is “Success”, which is obtained with keyfile = “Hello wor1d!”, but who knows… :)
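
If you want to check a candidate keyfile without running the crackme, the last two steps are easy to replay. This is a small Python sketch; message_text and key3_ok are hypothetical names, and the constants are the ones from the listings above.

```python
import struct


def key3_ok(keyfile):
    """Third check: bytes 8..11 must equal the dword 0x21643172."""
    return struct.unpack_from("<I", keyfile, 8)[0] == 0x21643172


def message_text(keyfile):
    """Rebuild the MessageBoxA text from the first two dwords of the keyfile."""
    key1, key2 = struct.unpack_from("<II", keyfile, 0)
    # The Key_1 part is pushed last, so it comes first in memory
    raw = struct.pack("<II", key1 ^ 0x0F0F101B, key2 ^ 0x6F04530A)
    return raw.split(b"\x00")[0].decode("latin-1")


print(message_text(b"Hello wor1d!"), key3_ok(b"Hello wor1d!"))
```

With other keyfiles that pass the first two checks the text is usually unreadable, which is why I consider this one the intended solution.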

Final notes
It’s indeed an easy crackme, but it represents a good starting sample for those who have never played with an obfuscated target.

I browse crackmes.de from time to time when I want a reversing challenge to play with. I have to admit that there were many more interesting crackmes in the past, but today (digging deeper) I found an unusual one; it’s not hard, but the idea is really nice! The name of the crackme is “What Is This???”

crackmesde8

No exe file this time, the downloaded archive contains a readme file and the above image, nothing else. You can’t interact with an image so you can only load it into your preferred hex editor hoping to understand what’s going on.

It’s pretty easy to see that something has been added at the end of the file (a JPEG file ends with the byte sequence FF D9):
end_file

As you can see, the author appended some bytes at the end of the image. Now I have to understand what’s behind these unknown bytes, and Google will help me. I decided to start with “**TI83F*”, which is the only reasonable string to search for. In the end I reached a page at http://merthsoft.com/linkguide/ti83+/fformat.html which lights the path to the solution of the crackme: the string “**TI83F*” is a tag used to mark a program for a Texas Instruments calculator.

With the file format in front of you it’s pretty easy to understand each byte:

- 2A 2A 54 49 38 33 46 2A: 8 bytes, signature: it’s always “**TI83F*”
- 1A 0A 00: 3 bytes, further signature: these three bytes always contain {1Ah, 0Ah, 00h}
- 00..00: 2Ah bytes, comment: it’s either zero-terminated or padded on the right with space characters
- 56 00: 2 bytes, length, in bytes, of the data section of the file
- 0D 00 45 … 2A 3F D4: n bytes, data section: consists of a number of variable entries
- 90 16: 2 bytes, file checksum: lower 16 bits of the sum of all bytes in the data section

Now, the data section:

- 0D 00: 2 bytes, always has a value of 0Bh or 0Dh
- 45 00: 2 bytes, length, in bytes, of the variable data
- 05: 1 byte, variable type ID byte (0x05 = Programs)
- 41 00 00 .. 00: 8 bytes, variable name padded with NULL characters
- 00: 1 byte, version: usually set to 00
- 00: 1 byte, flag: set to 80h if variable is archived, 00h else
- 45 00: 2 bytes, length, in bytes, of the variable data
- 43 00 DC .. 2A 3F D4: n bytes, variable data
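
The two layouts above translate almost directly into code. Here is a rough Python sketch that locates the appended block inside the image, splits the first variable entry and verifies the checksum. The name parse_ti83 is hypothetical, only the fields listed above are handled, and all multi-byte values are read as little-endian (as “56 00” and “45 00” suggest).

```python
import struct

SIGNATURE = b"**TI83F*\x1a\x0a\x00"              # 8 + 3 signature bytes


def parse_ti83(blob):
    """Find the TI-83 block inside blob and split its first variable entry."""
    start = blob.find(SIGNATURE)
    if start < 0:
        raise ValueError("TI-83 signature not found")
    comment = blob[start + 11:start + 53]        # 2Ah bytes of comment
    data_len = struct.unpack_from("<H", blob, start + 53)[0]
    data = blob[start + 55:start + 55 + data_len]
    checksum = struct.unpack_from("<H", blob, start + 55 + data_len)[0]

    # Variable entry: 0Dh marker, length, type, 8-byte name, version, flag, length
    marker, var_len, type_id = struct.unpack_from("<HHB", data, 0)
    name = data[5:13].rstrip(b"\x00").decode("ascii")
    version, flag = data[13], data[14]
    var_len_again = struct.unpack_from("<H", data, 15)[0]
    payload = data[17:17 + var_len_again]

    return {
        "comment": comment,
        "marker": marker,
        "type_id": type_id,
        "name": name,
        "version": version,
        "archived": flag == 0x80,
        "payload": payload,
        # lower 16 bits of the sum of all bytes in the data section
        "checksum_ok": (sum(data) & 0xFFFF) == checksum,
    }
```

Called on the whole image file read as bytes, it should return the program name and the token bytes to decode with the token table.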

Nothing interesting so far, just some definitions. The algorithm is all inside the variable data: each byte of that block is a token of the program. I won’t explain every single byte because I think you can work it out yourself using the table at http://merthsoft.com/linkguide/ti82/tokens.html

To sum up, the variable data starting with the byte sequence “43 00 DC” can be decoded into this program:

Input A
A->B
0->C
While C<100
A+10*tan(A)->A
C+1->C
End
If A=19911.236
Then
Disp "OK"
Else
Disp "NO"
End

The algorithm is a TI-83 program. It’s pretty basic and the language is intuitive, but if you need help take a look at chapter 16 of the TI-83 Guidebook, available online. Now that you know what’s going on you only have to solve it, good luck :)
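
If you don’t have a calculator at hand, the program is easy to replay on a PC. The Python sketch below is only an approximation: it assumes the calculator is in radian mode (there is a switch for degrees), it uses binary floating point instead of the calculator’s decimal arithmetic, and it therefore compares with a small tolerance that I picked arbitrarily. The names run_program and check are hypothetical.

```python
import math


def run_program(a, degrees=False, iterations=100):
    """Replay the While loop: A + 10*tan(A) -> A, repeated 100 times."""
    for _ in range(iterations):
        angle = math.radians(a) if degrees else a
        a = a + 10 * math.tan(angle)
    return a


def check(a, target=19911.236, tolerance=5e-4, degrees=False):
    """Replay the final If with an approximate comparison (an assumption)."""
    return abs(run_program(a, degrees) - target) <= tolerance


print(run_program(0.0))   # tan(0) is 0, so A never moves from 0
```

It doesn’t give you the answer, only a quick way to test your candidates.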

I got some requests for this little tool so I decided to put it online for everyone.

As I told you in a recent blog post, the program is still incomplete because I want to add some more features, and I don’t know whether it has any really bad bugs inside. So your question might be “why are you putting it online?”. Simply because I think that bug reports, constructive feedback, criticism and comments are a good way to make decent programs. Feel free to use my e-mail address for any kind of message you want to send me.

DexInspector


This is more like a Twitter post than a blog post… Anyway, if you haven’t tried it yet, I suggest taking a look at the Cyber challenge by the General Intelligence and Security Service of the Netherlands: https://www.aivd.nl/organisatie/eenheden/nationaal-bureau/nieuws/aivd-cyber-challenge/

I had fun decrypting the ENCrypted file. I would like to blog something about the crackme but I think I can’t give out spoilers…

Good luck and enjoy this cyber-challenge! :)

Rewording a quote from a famous film I would say “Malware is everywhere, even now in this very room”, and that could be true if you have an Android-based mobile phone, because the amount of this kind of malware is increasing nowadays. It’s not so hard to study Android malware; there are some nice tools available on the net, and in the foreseeable future I could add my simple DexInspector too.

Why do I need another .dex analyzer? Well, I decided to write this tool because I had some problems dealing with the disassembled output produced by some existing tools. Some of them show output which is hard to follow because its syntax is strictly tied to the .dex opcodes; other tools try to understand what’s behind the code, but that’s really hard with static analysis alone and the result may be ambiguous. I always use more than one tool in my reversing sessions, and I think DexInspector could be a valid help for .dex analysis.

Since I don’t like to fill this blog post with a series of anonymous screenshots without a single word, to introduce the tool I’m going to use a malware named “FakeLookout.A”, guiding you through a possible reversing session.
You can find a nice description of the malware at http://blog.trustgo.com/fakelookout/ and I’ll show you how to get this information:

This malware can receive and execute commands from remote server.
Server address: hxxp://[hidden]press.com/controls.php
Commands:
clearFileList
getDir
clearAlarm
getFile
getSize
getTexts

DexInspector

.dex has been fully loaded

This is the main dialog of DexInspector, right after a .dex file has been fully loaded.
The box with a series of “[INFO]” entries is used to display information, depending on the current task. In this case it tells us everything has been loaded correctly. The treeview on the left contains a list of all the .java files declared in the malware project. Each disassembled class will be shown inside the empty box. The toolbar contains some buttons (used to view information about the structures inside a .dex file), a control that lists all the methods of the currently disassembled class, and 3 search text controls.

Map list dialog

From the map_list dialog you can get an idea of what the file is. It has a lot of methods, classes and strings (items: method_id_item, class_def_item and string_id_item), so the question is: if I want to understand what’s going on inside the malware, where is a good starting point? The names of the classes and of the original files used by the programmer are valid candidates, but as in any reversing session I prefer to start the static analysis directly from the strings window, even if there are a lot of strings.

Selected string with location info

Due to the nature of .dex it’s easy to discard useless entries; e.g. you can skip type strings (something like “IILI”), strings starting with ‘L’ or ‘[’, etc. Among all the strings I see a suspicious address: "http://thelongislandpress.com/controls.php". The server doesn't exist anymore, so I think I don't need to obscure the address.
As you can see from the listview at the bottom of the dialog, the string is located in a single file named "com.updateszxt.HttpFileUploader.zdi".
What kind of file is this? It's the disassembled file, and .zdi is the extension I use; you can find the file on your hard disk too. Anyway, the name of the original file is HttpFileUploader.java:
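
The string filtering described above can be automated with a few lines. This Python sketch shows the idea only and is not code from DexInspector: is_interesting is a hypothetical name, and the heuristic is deliberately coarse, so a normal word starting with ‘L’ or a short word made only of type letters is discarded as well.

```python
import re

# Shorty-like descriptors: a return type followed by parameter types, e.g. "IILI"
SHORTY_RE = re.compile(r"^[VZBSCIJFDL][ZBSCIJFDL]*$")


def is_interesting(text):
    """Keep the strings that are worth reading during a first pass."""
    if not text:
        return False
    if text.startswith(("L", "[")):      # class and array type descriptors
        return False
    if SHORTY_RE.match(text):            # method prototype shorthands
        return False
    return True


strings = [
    "IILI",
    "Ljava/lang/Object;",
    "[B",
    "V",
    "http://thelongislandpress.com/controls.php",
    "getDir",
]
print([s for s in strings if is_interesting(s)])
```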

Disassembled view: class, fields and constructor

The original file contains the definition of the public class com.updateszxt.HttpFileUploader, which extends java.lang.Object and contains some field definitions (both static and instance). The http address is used inside the constructor of the class; here is the constructor’s code:

Code of the constructor

The program analysis is done instruction by instruction and makes use of debug information, which is very useful when you have to understand the code. In this specific case the program uses debug information in different places; the first one is the declaration of the method: the parameters have specific names which are not randomly created.
It’s pretty simple to understand what’s going on. First, the method checks whether a specific folder exists. If the folder, named “dataCache”, doesn’t exist, it is created inside the Android external storage directory. Second, it initializes a URL using the http address above. Nothing special indeed, but before going on to the next part of the class I want to draw your attention to one instruction:
  java.io.File folder = v1;
It’s not a line derived from one of the .dex opcodes; it comes from debug information and represents a local variable definition. Nothing special, but having the original name can be handy.
Another addition is the info inside the small box, which depends on the selected instruction, e.g.:
selected instruction:   v2 = v2->append(v3);
info: java.lang.StringBuilder java.lang.StringBuilder->append(java.lang.String)
The meaning of the info is: the method append of java.lang.StringBuilder takes a java.lang.String parameter and returns a java.lang.StringBuilder.

Ok, back to the code!

Interesting code starts here

Searching through the methods listed in the combo box I select the one named runIt. After a quick glance you can understand what’s going on: the program establishes a connection with the remote server, gets a command and executes it. The command is returned by the getControls method and it’s one of the six I mentioned at the beginning of this post (the command may also be followed by a path).

Check for the right command to execute

Except for a few lines at the beginning of the method, the code flow of the program depends on the invoked command. As you can see from the snippet, the command is compared with “clearFileList”: if it matches, it is executed; otherwise the next command is checked (“getDir”). This kind of check is repeated until the right command is found.

Final words
The malware analysis ends here. There are some more nice methods to explore and I’ll let you discover the rest yourself, because I think I have attached too many pictures for a simple program. It’s time to wrap up this blog post; you should now have an idea of what DexInspector does.
It’s not perfect, but I feel comfortable with it. It’s still under development because I have to test it again and again (bugs are like malware, they are everywhere!) and I would like to add some more things.
If you think it might help your Android reversing sessions I’ll put it online for everyone, on condition that you report bugs, comments and criticism to me, ok? Let me know, and don’t expect too much: it’s only a disassembler!
