Debugging heap corruption with PageHeap

Heap corruption bugs in C++ code are notoriously hard to debug. Often the actual corruption goes unnoticed until some apparently random point in the future, when the corrupt memory is accessed by unrelated code. In such cases, it’s almost impossible to infer the location of the bug using typical debugging tools.

Common causes of heap corruption are modifying objects after they have been destroyed and deleting the same pointer twice. In such cases, you often find out about the corruption well after it occurs, through unhelpful errors like this:

Free Heap block NNNNNNNN modified at NNNNNNNN after it was freed
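Both failure modes named above can be sketched in a few lines of C++. This is a minimal illustration, not code from any real add-on: the Entity type and all three function names are hypothetical. The two buggy functions are undefined behavior and are shown only to make the pattern recognizable; they are never called here.

```cpp
#include <cassert>

// Hypothetical stand-in for any heap-allocated object.
struct Entity {
    int layer = 0;
};

// BUG (intentional): writes through a pointer after the object was deleted.
// The write lands in a freed heap block, which may already belong to
// unrelated code by the time anyone notices.
void write_after_delete() {
    Entity* e = new Entity;
    delete e;
    e->layer = 7;  // undefined behavior: modifies freed memory
}

// BUG (intentional): two pointers to the same object, both deleted.
void double_delete() {
    Entity* e = new Entity;
    Entity* alias = e;
    delete e;
    delete alias;  // undefined behavior: block is freed twice
}

// Correct version for comparison: every access happens before the delete.
int correct_usage() {
    Entity* e = new Entity;
    e->layer = 7;
    int layer = e->layer;
    delete e;
    return layer;
}
```

In a normal run, neither buggy function is guaranteed to fail at the marked line, which is exactly why the error tends to surface somewhere else entirely.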

Aside from defensive programming techniques, a common strategy for combating these types of bugs is to log and track all memory allocations so that it’s possible to backtrack and determine which code last used the memory address or range of memory that got corrupted. Unfortunately, sometimes it isn’t possible or practical to add such diagnostic capabilities. For example, ObjectARX add-ons use AutoCAD’s heap and memory allocation functions, and unless you happen to have access to the AutoCAD source code, there is no way to change them.
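When you do control the allocation path, the tracking idea can be sketched roughly as follows. This is an assumption-laden illustration rather than a production tracker: the AllocationLog class, its method names, and the owner strings are all hypothetical, and a real implementation would also need thread safety and hooks into the actual allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: remembers which code owns each tracked address range.
class AllocationLog {
public:
    // Record a block of `size` bytes starting at `p`, tagged with its owner.
    void record(const void* p, std::size_t size, std::string owner) {
        blocks_[addr(p)] = Block{size, std::move(owner), true};
    }

    // Mark a block as freed but keep its history for later lookups.
    void release(const void* p) {
        auto it = blocks_.find(addr(p));
        if (it != blocks_.end()) it->second.live = false;
    }

    // Describe the block that contains address `p`, or "unknown".
    std::string describe(const void* p) const {
        auto it = blocks_.upper_bound(addr(p));
        if (it == blocks_.begin()) return "unknown";
        --it;  // last block that starts at or before p
        if (addr(p) - it->first >= it->second.size) return "unknown";
        return it->second.owner + (it->second.live ? " (live)" : " (freed)");
    }

private:
    struct Block {
        std::size_t size;
        std::string owner;
        bool live;
    };

    static std::uintptr_t addr(const void* p) {
        return reinterpret_cast<std::uintptr_t>(p);
    }

    std::map<std::uintptr_t, Block> blocks_;  // keyed by block start address
};
```

Given the address from a corruption report, describe() answers the question "who last owned this memory?", including for blocks that have already been freed.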

Luckily, there is a debugging tool designed for this precise scenario: the PageHeap heap verifier. It has been built into Windows since Windows 2000, but it is off by default. I use the Global Flags Utility (gflags.exe from Debugging Tools for Windows) to turn on the PageHeap verifier when needed. PageHeap must be used in conjunction with a debugger such as WinDbg or Visual Studio.

A typical debugging use case for an ObjectARX add-on is to enable PageHeap monitoring for the entire AutoCAD process:

gflags /p /enable acad.exe /full

For monitoring only a specific ObjectARX module, use syntax like this:

gflags /p /enable acad.exe /full /dlls myarx.arx

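Once PageHeap monitoring is enabled, simply reproduce the heap corruption under a debugger as usual. With full PageHeap, each allocation is placed next to an inaccessible guard page and freed blocks are made inaccessible, so the offending read or write typically raises an access violation the moment it happens rather than much later. Examining the call stack at that point shows exactly which code is causing the corruption.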

Once you’ve fixed the problem, don’t forget to disable the PageHeap utility again:

gflags /p /disable acad.exe

I have not needed to use the PageHeap utility very often, but it has been a godsend when I needed it.
