And we just run it, and click here and ... What the?

Posted by: Brent York

Hello, and welcome back to the third blog post in the reverse engineering series. In the last two I discussed why someone would want to reverse engineer software. I also gave an example of reverse engineering a small Win32 application. In this one I'm going to show you how to use reverse engineering to find ABEND situations in release software. If you missed the first two, they can be found here and here respectively.

For most intents and purposes you wouldn't analyze a whole program. Well, maybe someone would, but I don't dislike myself that much :). Instead, you would use techniques like we used in the previous blog post, with a smattering of lateral thinking to find the parts of the code you're interested in and you'd analyze those.

However, how do you find a function or the reason for a crash when you're looking at a whole mess of assembly language code you're not at all familiar with? This could be a pretty daunting task. I know no one ever said reverse engineering code would be simple, but that's just a touch ridiculous, isn't it?

This is where things like crash dumps, crash messages, and a debugger come into play.

 

I say Holmes, what the heck does that error message mean?

We've all been there. We've tested and tested and tested and we think our code is safe and stable. So we deploy it to a production environment and *BLAM*, it explodes. It's the most unbelievably annoying thing.

In most cases developers spend an awful lot of their time trying to find out why it exploded, and in many cases it's quite a shot in the dark. Sure, there are methods like logging, and deploying test code to try and break it, but what if for some reason you can't recreate the error in a test environment? It's very unlikely you can deploy test code to production, and in that case you're stuck.

However, all is not lost. When a program crashes on a modern operating system, the system generally produces a crash dump. On UNIX it's a core file, and on Windows it's that dialog that comes up stating that the program has performed an illegal operation.

You know which one I'm talking about... It's that dialog that has users calling Microsoft worried that they're going to be arrested because their program crashed :).

Have you ever wondered what those weird numbers and names mean in that dialog? How Microsoft might use that information to figure out what's exploding in their own software when you submit it to them?

Well, let me enlighten you... It's information that tells developers why the application crashed. It includes information like the address of the instruction that crashed the application, the loaded module information, and a plethora of other things, all of which are useful to someone who needs to debug a release-mode application.

Even better still, in some cases you can save that information to a crash dump file, which contains the binary image of the application when it crashed! Armed with where and possibly what caused the immediate crash sequence, and a nice binary image of our crashed application state, we can easily debug the application. In the case of a Windows application, if the PDB is available to the person doing the debugging, reverse engineering even becomes an unnecessary task, as they can take the option to debug it in Visual Studio if Visual Studio is installed.

However, in the case of a missing PDB file or one that does not match versions, we can use WinDBG and IDA to determine where and why the application crashed in detail. Now, if you have the PDB for the program, then Visual Studio's debugger will be the better choice as it will display your original source code, and even show you where you crashed in your code, and thus problem solved. But let's imagine for the rest of this exercise that you're missing the PDB either due to versioning issues, or you just don't have it any more.

So, Dr. Watson (you didn't think I used the Sherlock Holmes reference for nothing, did you?), let's get to it, shall we? I've provided the executable and a crash dump in the example. We'll do our analysis without the crash dump, but if one is available you should generally use it, especially since the ABEND may not be reproducible in your environment. In this case, however, running the application gives us the ABEND, so we'll go one step further and run the executable in the debugger, creating the ABEND situation. Furthermore, we're going to use the pop-up dialog information from XP to debug this thing :)... hey, you've got to learn to do this in the wild... right?

First our reconnaissance.

  1. It's a Windows Portable Executable (dumpbin tells us this, but so does the existence of a PE header)
  2. It is compiled code, not p-code; it was compiled with Visual C++
  3. It contains several strings of interest:
    1. "This is a test"
    2. "strncpy"
    3. "GlobalAlloc"
    4. "MessageBoxA"
  4. We have an ABEND, now to find out where:

When we run our executable, we get:

Gee, that seems familiar... let's click on the "click here" link...

This shows us a little information, for instance telling us it crashed in msvcr80.dll, but it is very general. Click on the "click here" beside "To view technical information about the error report" and you will get:

Well would you look at that... the exception information says we crashed at address 0x00000000781807f5. Hmmm let's chop that to 32 bits, because it's 32 bit windows we're running, and note it down somewhere because it looks like it might be handy :)...
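The chop itself is nothing more than keeping the low 32 bits of the reported value. Here's a minimal sketch in C; the helper name low32 is my own invention for illustration, not something from the example program:

```c
#include <stdint.h>

/* Keep only the low 32 bits of an address that was reported in a
 * 64-bit-wide field. The name low32 is hypothetical. */
uint32_t low32(uint64_t reported)
{
    return (uint32_t)(reported & 0xFFFFFFFFu);
}
```

Feeding it 0x00000000781807f5 gives back 0x781807f5, which is the number to write down.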
So now we have some info about the binary that's useful to us, so let's get to debugging the app. In our case we're going to use the EXE, but the exact same steps work with the dump file too. The only difference is that the dump file already knows about the exception before anything runs, and the executable doesn't.
So how do we generate the ABEND in the EXE? Well, we run it in the debugger... I load up the executable with WinDBG, and I get this:

I hit go (F5), because the program hasn't run yet, and it tells us about an exception at 0x781807f5. Looks familiar, doesn't it? It also shows us the instruction it crashes at, and some register state:
(Errors have been removed here)

(3d0.3dc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=7efef0f8 ebx=00000400 ecx=00000100 edx=73696854 esi=004020f8 edi=00000000
eip=781807f5 esp=0012ff18 ebp=0012ffc0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010246
MSVCR80!strncpy+0xa5:
781807f5 8917            mov     dword ptr [edi],edx  ds:0023:00000000=????????

Ooooh... Ouch, looks like a null pointer exception. EDI holds 0x00000000, and with EDI being the destination index and the call being strncpy, I'd imagine our destination parameter is NULL.
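As an aside, the MSVCR80!strncpy+0xa5 label is just arithmetic: the crash address is 0xa5 bytes past the start of strncpy. A small sketch of that subtraction in C, with symbol_start being a name I made up for the illustration:

```c
#include <stdint.h>

/* Given an address inside a function and the "+offset" part of the
 * symbol label WinDBG prints, recover where the function starts.
 * The name symbol_start is hypothetical. */
uint32_t symbol_start(uint32_t address, uint32_t offset)
{
    return address - offset;
}
```

For our crash, 0x781807f5 minus 0xa5 puts the start of strncpy at 0x78180750, which is a handy place to begin disassembling if you'd rather not guess how far to back up.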

For the sake of brevity, I won't include a full disassembly here as it includes a great deal of code for the standard libraries. However, in particular a section of the code from the CRT is interesting so I'll include it here. You can follow along by opening a disassembly window in WinDBG and typing in the crash address obtained above... I suggest subtracting a few hundred off it to ensure that you get some relevant code above it too:

 

 

MSVCR80!strncpy+0x2c (7818077c)
78180797 8bd9            mov     ebx,ecx
78180799 c1e902          shr     ecx,2
7818079c 7561            jne     msvcr80!strncpy+0xaf (781807ff)
7818079e 83e303          and     ebx,3
781807a1 7413            je      msvcr80!strncpy+0x66 (781807b6)
781807a3 8a06            mov     al,byte ptr [esi]
781807a5 83c601          add     esi,1
781807a8 8807            mov     byte ptr [edi],al
781807aa 83c701          add     edi,1
781807ad 84c0            test    al,al
781807af 7437            je      msvcr80!strncpy+0x98 (781807e8)
781807b1 83eb01          sub     ebx,1
781807b4 75ed            jne     msvcr80!strncpy+0x53 (781807a3)
781807b6 8b442410        mov     eax,dword ptr [esp+10h]
781807ba 5b              pop     ebx
781807bb 5e              pop     esi
781807bc 5f              pop     edi
781807bd c3              ret
781807be f7c703000000    test    edi,3
781807c4 7416            je      msvcr80!strncpy+0x8c (781807dc)
781807c6 8807            mov     byte ptr [edi],al
781807c8 83c701          add     edi,1
781807cb 83e901          sub     ecx,1
781807ce 0f8498000000    je      msvcr80!strncpy+0x11c (7818086c)
781807d4 f7c703000000    test    edi,3
781807da 75ea            jne     msvcr80!strncpy+0x76 (781807c6)
781807dc 8bd9            mov     ebx,ecx
781807de c1e902          shr     ecx,2
781807e1 7574            jne     msvcr80!strncpy+0x107 (78180857)
781807e3 8807            mov     byte ptr [edi],al
781807e5 83c701          add     edi,1
781807e8 83eb01          sub     ebx,1
781807eb 75f6            jne     msvcr80!strncpy+0x93 (781807e3)
781807ed 5b              pop     ebx
781807ee 5e              pop     esi
781807ef 8b442408        mov     eax,dword ptr [esp+8]
781807f3 5f              pop     edi
781807f4 c3              ret
781807f5 8917            mov     dword ptr [edi],edx  ds:0023:00000000=????????
781807f7 83c704          add     edi,4
781807fa 83e901          sub     ecx,1
781807fd 749f            je      msvcr80!strncpy+0x4e (7818079e)
781807ff bafffefe7e      mov     edx,7EFEFEFFh

 

 

The bolded line (the mov at 781807f5) is the bit of strncpy which explodes on us. Note the ds:0023:00000000. That's put there by WinDBG and tells us that the offset into the data segment (selector 23) is 0, which is reserved for the NULL pointer in C/C++. That's a pretty big clue. So we scroll up in the debugger. Notice that we're in strncpy... Interesting... So we already suspected it's a strncpy call that's failing here, and now we've confirmed it. Let's go back to the command window from earlier and look at the section of output that covers the executable module load map. We're looking for the mapping address of our EXE.

 

All your base addresses are belong to us

Ok, I hear you say... that's great... but where's the call to strncpy? I mean the CRT runs pretty flawlessly, right? So this is almost definitely an error in our code...
Well, in order to find our code first we need to find out where our executable is loaded. Because of that we need to understand some addressing concepts, specifically relocation, because almost all executable formats support relocatable code.
In the Windows operating system each application gets its own address space, as in most 32-bit protected-mode operating systems. The address spaces of processes in Windows never shall meet. This is because Windows provides virtual memory management. Thus a process in Windows can have a full 4 gigabyte address space to roam in... The caveat here is that this is with some slight exceptions, which we'll get into in a later blog post, I promise :).


But the most notable part of what this means is that an application has an address at which it is loaded at run time. This address is called the base address of the application, and the PE header contains a preferred address that the loader should use when loading the application. It is the loader's responsibility to attempt to load a process image at that address, and if it can't it is the loader's responsibility to relocate the application to a new address. This process is known as rebasing the application at runtime.

It's important to recognize that DLLs also have base addresses. So what happens if I were to, say, take a DLL with a long name and call LoadLibrary on it, and then also call LoadLibrary on its short name? You guessed it, it's loaded twice. But... don't they have the same base address? Shouldn't that fail? Yep, and it would if the code wasn't relocatable :)... Let me explain...

When loading a DLL or an executable, the loader reads the relocation table. It checks to see if the OS has already loaded something at the base address the DLL or application usually resides at. If it has, then the OS is free to load the DLL or application at some other base address. It chooses a base address and fixes up the portions of the executable image that need to be changed for that addressing, using the list of fixup addresses contained in the relocation table. Several different situations exist where addresses may need to be fixed during a rebasing operation, for example effective addresses, jump and call locations, and direct offset references to statically allocated data.

In the case of a DLL double loaded as we've described, the rebasing is forced because the DLL has the same preferred base address, and on the second loading of the DLL the loader is forced to rebase the second image. In particular this happens because the DLLs themselves are mapped into the address space of the process that loads them.
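If it helps to see the fixup idea in code, here's a deliberately simplified sketch in C. It assumes a flat list of byte offsets where 32-bit absolute addresses live; the real PE relocation table is organised differently, and the function name and parameters are hypothetical, purely for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified rebasing: every location named in fixup_offsets holds a
 * 32-bit absolute address computed for preferred_base. Adding the
 * difference between the two bases makes it valid for actual_base.
 * This is a sketch of the idea, not the actual Windows loader logic. */
void apply_fixups(uint8_t *image, const size_t *fixup_offsets, size_t count,
                  uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base;

    for (size_t i = 0; i < count; i++) {
        uint32_t value;

        /* memcpy avoids unaligned access problems on the raw image bytes */
        memcpy(&value, image + fixup_offsets[i], sizeof value);
        value += delta;
        memcpy(image + fixup_offsets[i], &value, sizeof value);
    }
}
```

So an address that was 0x00402000 in an image built for 0x00400000 becomes 0x00A02000 if the loader ends up placing the image at 0x00A00000.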

Clear as mud? Good. So what does this all mean to us? Well, it means several things in reverse engineering in particular, and we'll see some of them in next week's example, but for now our base address is important because it gives us somewhere to look for our application's code. So... we look back at the command window we got when we loaded WinDBG and opened the example2.exe file.
In the output we see a line that says ModLoad: 00400000 00405000   example2.exe... This means our base address for our code is at 0x00400000, so we go back to the disassembly window and type in 00400000 in the "Offset" box. Then we scroll down for a bit looking for a call to strncpy...

And lo!

We found our strncpy call (it's at the address 00401019 in the code shown in our image)... but ugh, the code surrounding it is ugly... so let's take note of a couple of useful lines and move over to something pretty... IDA.
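The reasoning about which module an address belongs to can be written down as a range check. This is a rough sketch in C; the struct and function names are mine, and I'm assuming the second number on the ModLoad line is the first address past the end of the image:

```c
#include <stdbool.h>
#include <stdint.h>

/* One "ModLoad: start end name" line, reduced to its two addresses.
 * The end is treated as exclusive, which is an assumption here. */
typedef struct {
    uint32_t start;
    uint32_t end;
} module_range;

/* True when the address falls inside the module's mapped range. */
bool in_module(module_range m, uint32_t address)
{
    return address >= m.start && address < m.end;
}

/* Offset of the address from the module's base (its RVA). */
uint32_t to_rva(module_range m, uint32_t address)
{
    return address - m.start;
}
```

With example2.exe at 00400000 to 00405000, the crash address 0x781807f5 is outside our module (it's in the CRT), while the call at 0x00401019 is inside it, 0x1019 bytes from the base.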

We'll take note of some of the lines of the output so that we'll have some code or procedure calls to search on when we load it up in IDA. In our case, we'll use the following sequence, because it's unlikely to show up in several places:

push 400h
push 40h
call dword ptr [example2!_imp__GlobalAlloc (00402000)]

Now load the executable up in IDA and open up the names window ...

Find GlobalAlloc in the list (because it's called from the same code as the strncpy according to WinDBG), and double click it... you'll be taken to the imports list in the disassembly. It should look like this:

If you right click on GlobalAlloc you'll be shown a menu which allows you to select an option called Chart of xrefs to. Selecting that will bring up a graph of the cross references that call GlobalAlloc. Note the proc in the middle. It's the only procedure that calls GlobalAlloc, which according to what we know from WinDBG also calls our strncpy... So let's go there. Close the chart and hit Ctrl-X to bring up a list of cross-references for GlobalAlloc. It should look similar to this:

Now, if you double click the cross reference shown... Oh hey, look. This looks really familiar, doesn't it :)?

That's the same code we saw in WinDBG, but with some more useful information thrown in by IDA. Note the same push sequence up on top of the GlobalAlloc call? Now we can start to analyze the code. Looking at it we notice a few things.

Firstly, after the call to strncpy, there's an add to the ESP register. It's balancing the stack. This should come as very little surprise since strncpy is a C standard library function. Remember we said sometimes programs mix more than one calling convention? The strncpy call is made using the cdecl calling convention. Knowing that, and armed with the function prototype for strncpy:

char *strncpy(char *dst, const char *src, size_t size)

We can safely decode the call to strncpy. The size is 400h (or 1024 decimal), which we can clearly see in the first push after the GlobalAlloc call. The second push is the address of the source string, an immutable text string from the .rdata section of the binary that contains "This is a test"... But wait a sec... the third push, the destination string pointer, is 0... Oh hey, there's our null pointer exception :).
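The decoding above leans on one rule: cdecl pushes arguments right to left, so the last push before the call is the first parameter. Here's a small sketch of that mapping in C. Both function names are made up for illustration, and I'm assuming every argument takes one 4-byte stack slot, as they do in this call:

```c
#include <stddef.h>
#include <stdint.h>

/* pushes[] holds the pushed values in the order the code pushed them.
 * Parameter 0 is the last value pushed, parameter 1 the one before it,
 * and so on. Hypothetical helper, for illustration only. */
uint32_t argument_from_pushes(const uint32_t *pushes, size_t count, size_t index)
{
    return pushes[count - 1 - index];
}

/* Under cdecl the caller removes the arguments after the call returns,
 * which is the "add esp" that follows it. Assumes 4-byte slots. */
uint32_t cdecl_cleanup_bytes(size_t argument_count)
{
    return (uint32_t)(argument_count * 4u);
}
```

For pushes of 400h, the string's address, and 0, parameter 0 (dst) comes back as 0 and parameter 2 (size) as 400h, and three arguments mean 12 bytes of stack to balance.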

Better yet, notice directly below the stack balance, we're looking at another function call. This one is stdcall (just as the GlobalAlloc call is) and what it's doing is calling MessageBoxA. We know MessageBoxA's prototype too:

int MessageBoxA(HWND hWnd, LPCSTR message, LPCSTR caption, UINT uType)

Looking at our assembly code, we see it push 40h, which is the bit-flag value of MB_OK | MB_ICONINFORMATION, our uType. It then pushes the offset of the caption, and pushes 0 again for the text... Uh oh, another null pointer exception. Looks like the probability that we used the same variable name for the strncpy destination and the text of the message box is high...
Finally checking the references of the proc (by putting our cursor on the first line of it and hitting Ctrl-X), it appears to be called from the CRT startup code. Thus, this code is almost guaranteed to be WinMain.
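If you want to convince yourself about that 40h, the flag arithmetic is easy to check. In this sketch the constants are local copies of the Win32 values, renamed with an EX_ prefix (my own naming) so the snippet stands alone and can't clash with windows.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the Win32 flag values, renamed to avoid clashing
 * with the real definitions in <windows.h>. */
enum {
    EX_MB_OK              = 0x00,
    EX_MB_ICONINFORMATION = 0x40,
    EX_GMEM_FIXED         = 0x0000,
    EX_GMEM_ZEROINIT      = 0x0040
};

/* True when every bit of a non-zero flag is set in value. Zero-valued
 * flags such as EX_MB_OK are defaults, not bits, so this reports false
 * for them rather than pretending to detect them. */
bool has_flag(uint32_t value, uint32_t flag)
{
    return flag != 0 && (value & flag) == flag;
}
```

Both EX_MB_OK | EX_MB_ICONINFORMATION and EX_GMEM_FIXED | EX_GMEM_ZEROINIT come out as 0x40, which is why a push 40h shows up in front of both the MessageBoxA call and the GlobalAlloc call.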

So armed with what we know we can clean up the procedure given the stuff we learned in the last blog post:

Given that we're the developers of this wonderful piece of software we've been debugging, we take a look at the CPP code... looking for a sequence that looks like GlobalAlloc, strncpy, MessageBoxA:

#include


int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCommandLine, int nCmdShow)
{
    char *tmp1 = NULL;
    char *tmp2 = NULL;

    tmp2 = (char *)GlobalAlloc(GMEM_FIXED | GMEM_ZEROINIT, 1024);
    strncpy(tmp1, "This is a test", 1024);
    MessageBox(NULL, tmp1, "Message", MB_OK | MB_ICONINFORMATION);
    return 0;
}


Seems we stored the GlobalAlloc result in tmp2. The variable tmp1 is NULL in both the strncpy call and the MessageBox call, just like IDA and WinDBG told us it was. Ooops.
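For completeness, here's roughly what a fix might look like. This is a portable sketch rather than the real program: calloc stands in for GlobalAlloc with GMEM_ZEROINIT, the MessageBox call is left out, and make_message is a name I made up. The two things that matter are that the copy goes into the pointer we actually allocated, and that we check the allocation before using it:

```c
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 1024

/* Allocates a zeroed buffer and copies the message into it.
 * Returns NULL if the allocation fails; the caller frees the result.
 * calloc is a stand-in for GlobalAlloc(GMEM_FIXED | GMEM_ZEROINIT, 1024). */
char *make_message(void)
{
    char *text = calloc(BUFFER_SIZE, 1);

    if (text == NULL)
        return NULL;    /* never hand a null destination to strncpy */

    /* Copy at most BUFFER_SIZE - 1 bytes so the final byte of the
     * zeroed buffer always remains a terminator. */
    strncpy(text, "This is a test", BUFFER_SIZE - 1);
    return text;
}
```

In the real WinMain the same shape applies: one pointer, assigned from GlobalAlloc, checked, then passed to both strncpy and MessageBox.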


Final thoughts

Now I'll admit, this example is contrived, but it's entirely possible. Worse yet, pointers can be assigned null through many different code paths. Functions may return null pointers, code paths may set them. These happenstances are rather common. Null pointer exceptions are extremely common, and hunting them down may take a few more steps, and may involve looking through structures defined in the code or Win32 API, but that's all very simple-to-understand assembly. For the sake of brevity, if you can call this post brief in any way, I picked a shorter example.

Please join us next time for our fourth and final post on reverse engineering, which gives an example of how to perform surgery on an executable.
P.S. Yes, I know I said I'd provide IDA files for the disassembly of the console application in part 2 with this post. I haven't gotten around to performing the analysis yet, but keep checking back here. I'll link them to the end when I get some time to perform the analysis. I figured you'd rather I did the work for this post than the example :).
