Here are the first lines of a piece of malware I was looking at a few days ago (MD5: DA4B7EF93C588AD799F1A1C5AFB6CFAD). The malware is packed, I think with a homemade packer; 40107C is the entry point, the first line of the loader’s code. The code is filled with useless instructions. Nothing hard, but if you want to study the entire loader you have to pay attention to every single line of code. This time I’m not interested in the loader itself; I’ll focus on a strange behaviour, something I had never noticed before. The malware crashes at 4010AC on an XP sp3 machine, but it works fine on XP with service pack 1 or 2.
40107C  ADD ECX,DWORD PTR SS:[ESP]   ; useless
40107F  MOV ESI,-70                  ; useless
401084  ADD EDI,EAX                  ; useless
401086  MOV ECX,2AFFC5C8             ; useless
40108B  ROL ECX,1                    ; useless
40108E  ROR EDX,15                   ; useless
401091  MOV EDI,ESP                  ; edi = 12FFC4
401093  MOV EDX,FE000001             ; edx = 0xFE000001
401098  ROL EDX,7                    ; edx = 0xFF
40109B  SUB EAX,EBX                  ; useless
40109D  AND EDI,EDX                  ; edi = 0x12FFC4 & 0xFF = 0xC4
40109F  MOV EDX,25FE0                ; edx = 0x25FE0
4010A4  ROL EDX,3                    ; edx = 0x12FF00
4010A7  ADD EDX,EDI                  ; edx = 0x12FFC4
4010A9  SAL ECX,11                   ; useless
4010AC  MOV EAX,DWORD PTR DS:[EDX]   ; eax = 0x77E5EB69
The comments are taken from an XP sp1 debugging session. At the end of the snippet eax points to ExitThread’s parameter, the one inside BaseProcessStart. There’s nothing interesting in these few lines of code, but it’s always better to keep your eyes open when there are hardcoded values around. I’m referring to the value 0x12FF00 (“hardcoded” is not strictly accurate, since the value is built with a rotate, but the idea is the same). It’s not obvious, but this piece of code is not guaranteed to work on every machine. It seems the author was sure about the initial stack address. I don’t know when the malware was written, but this piece of code crashes on an XP machine with Service Pack 3. Maybe the malware was written before the final release of the latest service pack, I don’t know. Here is the same code tested on a machine running XP sp3:
401091  MOV EDI,ESP                  ; edi = 13FFC4
401093  MOV EDX,FE000001             ; edx = 0xFE000001
401098  ROL EDX,7                    ; edx = 0xFF
40109D  AND EDI,EDX                  ; edi = 0x13FFC4 & 0xFF = 0xC4
40109F  MOV EDX,25FE0                ; edx = 0x25FE0
4010A4  ROL EDX,3                    ; edx = 0x12FF00
4010A7  ADD EDX,EDI                  ; edx = 0x12FFC4
4010AC  MOV EAX,DWORD PTR DS:[EDX]   ; CRASH!!!
The initial stack address is not the same: this time it’s 0x13FFC4. The loader keeps only the low byte of esp and adds it to 0x12FF00, so it still reads from 0x12FFC4, but the value it is looking for is now stored at 0x13FFC4.
Who decides which value gets assigned to esp: 0x12FFC4 or 0x13FFC4?
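To make the problem concrete, here is a small Python sketch that replays only the meaningful instructions of the loader (401091 to 4010A7) for an arbitrary initial esp. The helper names rol32 and loader_read_address are mine, not something taken from the malware; it is just a model of the arithmetic, not an emulator.

```python
MASK32 = 0xFFFFFFFF

def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits (models the x86 ROL instruction)."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def loader_read_address(esp):
    """Return the address the loader dereferences at 4010AC for a given initial esp."""
    edi = esp                        # 401091 MOV EDI,ESP
    edx = rol32(0xFE000001, 7)       # 401093/401098 -> 0xFF
    edi &= edx                       # 40109D AND EDI,EDX (keeps only the low byte of esp)
    edx = rol32(0x25FE0, 3)          # 40109F/4010A4 -> 0x12FF00
    return (edx + edi) & MASK32      # 4010A7 ADD EDX,EDI

# The upper part of the address never depends on the real esp:
for esp in (0x12FFC4, 0x13FFC4):
    print(hex(esp), "->", hex(loader_read_address(esp)))
```

Whatever the real stack location is, the upper 24 bits of the address come from the constant, so the read only hits the right slot when the stack top happens to sit at 0x130000.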
My investigation started from the kernel32.CreateProcessInternalW function. All the code below comes from an XP sp3 machine, but the sp1 code is almost identical.
7C819DE1  mov   eax, [ebp+MaximumStackSize]
7C819DE7  lea   ecx, [ebp+InitialTEB]
7C819DED  push  ecx                        ; InitialTEB
7C819DEE  push  eax                        ; MaximumStackSize
7C819DEF  push  [ebp+StackSize]            ; StackSize
7C819DF5  push  [ebp+hProcess]             ; hProcess
7C819DFB  call  _BaseCreateStack@16        ; BaseCreateStack(x,x,x,x)
7C819E00  mov   [ebp+var_9EC], eax         ; eax = 0 means SUCCESS
7C819E06  cmp   eax, ebx                   ; ebx = 0
7C819E08  jl    _BaseSetLastNTError        ; Jump to error check
7C819E0E  push  ebx                        ; NULL
7C819E0F  push  [ebp+InitialSP]            ; Stack pointer
7C819E15  push  [ebp+InitialPC]            ; Program counter
7C819E1B  push  [ebp+Parameter]            ; Parameter
7C819E21  lea   eax, [ebp+Context]
7C819E27  push  eax                        ; Context
7C819E28  call  _BaseInitializeContext@20  ; BaseInitializeContext(x,x,x,x,x)
This is where the new process’s context is initialized. It is only an initialization: you won’t see the final value of each register (the values at the EP of the new process), but it’s enough to understand why the esp values differ.
There are two functions in the snippet above. BaseCreateStack creates the stack for the process about to run. BaseInitializeContext, as the name suggests, initializes the context structure using some values obtained from the previous function. Let’s start with the first one: BaseCreateStack.
First, it checks two values: MaximumStackSize and StackSize. Both are loaded from the image of the process to run using NtQuerySection. Among all the information in a PE header there are two fields named SizeOfStackReserve and SizeOfStackCommit, which the system takes and saves as MaximumStackSize and StackSize. MSDN describes the fields like this:
SizeOfStackReserve: the number of bytes to reserve for the stack. Only the memory specified by the SizeOfStackCommit member is committed at load time; the rest is made available one page at a time until this reserve size is reached.
SizeOfStackCommit: the number of bytes to commit for the stack.
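If you want to check these two fields without a PE editor, a few lines of Python are enough. This is only a sketch for 32-bit (PE32) images: the function name read_stack_sizes is mine, and the offsets follow the documented IMAGE_OPTIONAL_HEADER32 layout (SizeOfStackReserve at offset 72 of the optional header, SizeOfStackCommit right after it).

```python
import struct

PE32_MAGIC = 0x10B
STACK_RESERVE_OFFSET = 72   # SizeOfStackReserve inside IMAGE_OPTIONAL_HEADER32

def read_stack_sizes(image):
    """Return (SizeOfStackReserve, SizeOfStackCommit) from a PE32 image given as bytes."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)      # offset of the PE signature
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional_header = e_lfanew + 4 + 20                      # signature + IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", image, optional_header)
    if magic != PE32_MAGIC:
        raise ValueError("only PE32 images are handled in this sketch")
    return struct.unpack_from("<II", image, optional_header + STACK_RESERVE_OFFSET)
```

Feeding it the raw bytes of the sample (read with open(path, "rb").read()) should give back the reserve and commit sizes that end up in MaximumStackSize and StackSize.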
OK, now the system checks whether they are valid:
7C8102B5  mov   eax, large fs:18h               ; eax = TEB
7C8102BB  mov   ecx, [eax+30h]                  ; ecx = PEB
...
7C8102D2  push  dword ptr [ecx+8]               ; PEB->ImageBaseAddress
...
7C8102DB  call  ds:__imp__RtlImageNtHeader@4    ; RtlImageNtHeader(x)
7C8102E1  test  eax, eax
7C8102E3  jz    failure
7C8102E9  mov   ecx, [ebp+MaximumStackSize]
7C8102EC  test  ecx, ecx                        ; is MaximumStackSize zero?
7C8102EE  mov   edx, [eax+IMAGE_NT_HEADERS.OptionalHeader.SizeOfStackCommit]
7C8102F1  jnz   short MaximumStackSize_not_zero
7C8102F3  mov   ecx, [eax+IMAGE_NT_HEADERS.OptionalHeader.SizeOfStackReserve]
7C8102F6  mov   [ebp+MaximumStackSize], ecx
If MaximumStackSize is non-zero the flow goes on; otherwise the variable has to be given a value. Which value? The SizeOfStackReserve taken from the PE header found at PEB->ImageBaseAddress.
OK, now it’s time to check the other variable; the check is very similar to the previous one:
7C8102F9 MaximumStackSize_not_zero:
7C8102F9  mov   eax, [ebp+StackSize]
7C8102FC  test  eax, eax                  ; Is StackSize zero?
7C8102FE  push  edi
7C8102FF  mov   edi, 0FFF00000h
7C810304  jnz   StackSize_not_zero
7C81030A  mov   eax, edx
...
If StackSize is zero, the variable is filled with the value read a few lines above at 7C8102EE: SizeOfStackCommit. It’s almost the same check I described for MaximumStackSize.
If the values are not zero, the system checks them again, just to be sure they are valid:
7C80AFC2  cmp   eax, ecx                      ; compare between StackSize and MaximumStackSize
7C80AFC4  jb    loc_7C81030C
7C80AFCA  lea   ecx, [eax+0FFFFFh]            ;
7C80AFD0  and   ecx, edi                      ; fix MaximumStackSize
7C80AFD2  mov   [ebp+MaximumStackSize], ecx   ;
7C80AFD5  jmp   loc_7C81030C
StackSize must be smaller than MaximumStackSize; if it isn’t, the system raises MaximumStackSize to StackSize rounded up to the next 1 MB boundary (that is what the add of 0xFFFFF followed by the and with 0xFFF00000 does). Now that the initial checks are complete, the function goes on with some alignment work, not so interesting per se. I’ll skip this part and jump to an interesting snippet:
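Put together, the checks seen so far can be summarised with a few lines of Python. This is my reading of the three listings above, not the real source of BaseCreateStack: the function and parameter names are made up, and the alignment steps I skipped are not modelled.

```python
def resolve_stack_sizes(stack_size, maximum_stack_size, header_commit, header_reserve):
    """Model of the initial size checks; returns (StackSize, MaximumStackSize)."""
    # 7C8102EC..7C8102F6: a zero MaximumStackSize falls back to SizeOfStackReserve
    if maximum_stack_size == 0:
        maximum_stack_size = header_reserve

    if stack_size == 0:
        # 7C81030A: a zero StackSize falls back to SizeOfStackCommit
        stack_size = header_commit
    elif stack_size >= maximum_stack_size:
        # 7C80AFCA..7C80AFD2: raise MaximumStackSize to StackSize rounded up to 1 MB
        maximum_stack_size = (stack_size + 0xFFFFF) & 0xFFF00000

    return stack_size, maximum_stack_size

# Both values left at zero, header values used (0x1000 is just an example commit size)
print([hex(v) for v in resolve_stack_sizes(0, 0, 0x1000, 0x100000)])
```

With both arguments left at zero the function simply returns the two header values, which is the path I care about here.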
7C81036F  mov   ebx, ds:__imp__NtAllocateVirtualMemory@24   ; NtAllocateVirtualMemory(x,x,x,x,x,x)
...
7C81037A  push  PAGE_READWRITE                 ; Protect: PAGE_READWRITE
...
7C810380  push  MEM_RESERVE                    ; AllocationType: MEM_RESERVE
7C810385  lea   eax, [ebp+MaximumStackSize]
7C810388  push  eax                            ; RegionSize = MaximumStackSize
7C810389  push  0                              ; ZeroBits = 0
7C81038B  lea   eax, [ebp+_BaseAddress]
7C81038E  push  eax                            ; BaseAddress = 0;
7C81038F  push  [ebp+hProcess]                 ; ProcessHandle
7C810392  mov   [ebp+MaximumStackSize], ecx
7C810395  call  ebx                            ; NtAllocateVirtualMemory
The system reserves the address space for the stack. It reserves MaximumStackSize bytes starting from an address chosen by the system: the first available address in the virtual address space. The chosen address is stored in BaseAddress and is used to update the InitialTeb->StackAllocationBase field:
7C81039F  mov   edi, [ebp+InitialTEB]
7C8103A2  mov   ecx, [ebp+_BaseAddress]
7C8103A5  mov   eax, [ebp+MaximumStackSize]
7C8103A8  and   [edi+INITIAL_TEB.PreviousStackBase], 0
7C8103AB  and   [edi+INITIAL_TEB.PreviousStackLimit], 0
7C8103AF  mov   [edi+INITIAL_TEB.AllocateStackBase], ecx
The stack is created. There are three fields to set, and for now the system updates only the bottom of the stack.
7C8103B2  add   ecx, eax
7C8103B4  mov   [edi+INITIAL_TEB.StackBase], ecx
InitialTeb->StackBase = BaseAddress + MaximumStackSize
The system sets up the stack area by giving it an upper and a lower bound. The initial stack pointer value is StackBase, and it decreases every time a push/call/... occurs.
The procedure goes on to commit the initial area of the stack, and after that BaseInitializeContext sets the right values for the registers (including esp). No need to keep stepping through the code: I have a lot of information now, and I can come to a conclusion.
Under XP sp3:
AllocateStackBase = 0x40000
MaximumStackSize = 0x100000
StackBase = 0x140000
Under XP sp1/sp2:
AllocateStackBase = 0x30000
MaximumStackSize = 0x100000
StackBase = 0x130000
It’s impossible for sp1/sp2 to have an esp value like 0x13FFC4 because the upper bound (StackBase) is 0x130000. StackBase was obtained from the operation “AllocateStackBase + MaximumStackSize” (AllocateStackBase holds the same value as BaseAddress). MaximumStackSize was taken from the malware’s header, and AllocateStackBase was initialized by the NtAllocateVirtualMemory call.
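The numbers are easy to double check. In the sketch below the 0x3C distance between StackBase and the esp seen at the entry point is not taken from the system code: I derived it from the values observed in the debugger (0x130000 - 0x12FFC4), so treat it as an assumption. The function names are mine too.

```python
# Distance between StackBase and esp at the entry point, derived from the
# debugging sessions (0x130000 - 0x12FFC4), not from the disassembly.
ENTRY_OFFSET = 0x130000 - 0x12FFC4   # 0x3C

def stack_base(allocation_base, maximum_stack_size):
    """InitialTeb->StackBase = BaseAddress + MaximumStackSize (7C8103B2/7C8103B4)."""
    return allocation_base + maximum_stack_size

def entry_esp(allocation_base, maximum_stack_size):
    """esp expected at the entry point, assuming the observed 0x3C offset."""
    return stack_base(allocation_base, maximum_stack_size) - ENTRY_OFFSET

for name, base in (("XP sp1/sp2", 0x30000), ("XP sp3", 0x40000)):
    print(name, hex(stack_base(base, 0x100000)), hex(entry_esp(base, 0x100000)))
```

The same offset reproduces both observed esp values, which fits the idea that the only thing that changed between service packs is the address returned by the reservation.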
It seems the solution to the puzzle comes from NtAllocateVirtualMemory. The function is called with zero as the BaseAddress parameter; as I said before, this means the system assigns the first free virtual location, which is evidently 0x40000 under sp3 and 0x30000 under sp1/sp2. Browsing the memory on my sp3 machine I noticed 0x1000 bytes allocated starting from 0x3000; there’s no trace of this memory area in the older XP service packs… What did they change in XP sp3? Well, I’m ready for a vacation in Holland for now. I’ll try to reply when I’m back in two weeks. If the answer is obvious and/or you know why… feel free to comment with your idea :)
Is it possible to solve the problem?
Well, it’s insane to fix a piece of malware just to be able to run it on an XP sp3 machine. Anyway, it’s not hard to make it runnable: you can simply change SizeOfStackReserve and/or SizeOfStackCommit directly in the PE header. I tried changing SizeOfStackReserve from 0x100000 to 0xF0000 and I got a runnable file. I don’t know how safe it is to change such parameters…
All the tests were done on my personal machines. I would like to know whether your sp3 machine (or any other OS) has the same initial stack value.
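Why does 0xF0000 work? If the reservation still starts at 0x40000 on sp3 (an assumption, I only checked it on my machine), StackBase becomes 0x40000 + 0xF0000 = 0x130000, exactly the value the loader expects. The patch itself can be sketched in Python; patch_stack_reserve is a name I made up, it only handles PE32 files, and it works on a bytearray holding the whole file.

```python
import struct

def patch_stack_reserve(image, new_reserve):
    """Overwrite SizeOfStackReserve in a PE32 image (bytearray); return the old value."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)      # offset of the PE signature
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional_header = e_lfanew + 4 + 20                      # signature + IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", image, optional_header)
    if magic != 0x10B:
        raise ValueError("only PE32 images are handled in this sketch")
    field = optional_header + 72                             # SizeOfStackReserve
    (old_reserve,) = struct.unpack_from("<I", image, field)
    struct.pack_into("<I", image, field, new_reserve)
    return old_reserve

# Expected StackBase on sp3 after the patch, assuming the stack is still reserved at 0x40000
print(hex(0x40000 + 0xF0000))
```

To use it, read the file into a bytearray, call patch_stack_reserve(data, 0xF0000) and write the bytes back to a new file, keeping the original sample untouched.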