Skip to content
Archive of posts filed under the Technical category.

Simplifying the problem: is it the drawing?

One type of support request that I occasionally receive from CADVault users goes something like “when I create a vault, AutoCAD crashes”. I’ve learned that problems like this are usually drawing-specific, so before going any further in simplifying the problem I ask the customer to check whether the problem occurs in an empty drawing. But there’s a trick to this.

The problem is that “empty drawing” is too nebulous. No drawing is completely empty, and in any case, even drawings with no visible geometry can be far from empty. What I really want to test is whether the problem occurs with a drawing that is as nearly empty as possible. To achieve that, I ask the customer to perform the test on “a new drawing created from scratch, with no template”.

New file with no template

The extra “no template” instruction is necessary because just creating a new drawing will bring in any flotsam and jetsam saved in the default template drawing, which could affect the results.

[Side note: it is my experience that many template drawings contain a lot of invisible junk that does nothing but slow things down. If you're curious about your own templates, open them and use my shareware SuperPurge tool to see what all they contain. You may be surprised.]

If the steps for reproducing the problem are such that they require some visible geometry, I ask the customer to draw a unit circle at the origin in a new “no template” drawing.

If the problem still occurs in a minimal drawing, then it is not drawing-specific; otherwise the next step is to divide and conquer the drawing to narrow things down further.

The art of simplifying the problem

One of the most important skills in resolving technical problems is not problem solving, but problem definition. Stripping a problem down to its essence often makes the solution obvious. I think this is generally true, but especially true in my experience with software tech support and tracking down software bugs.

The first stage in problem simplification is to document a set of steps that consistently reproduces the problem. These steps must be detailed enough so that someone else can use them to reproduce the problem, including a description of exactly what the problem is. Often this is the most difficult step, either because the problem doesn’t happen consistently, or because the problem description lacks detail.

The second stage is to try eliminating unnecessary steps with the goal of determining the bare minimum steps needed to reproduce the problem. If the problem is drawing-specific, this stage includes stripping everything out of the drawing except the bare minimum needed. This is often very time consuming, but almost always a worthwhile investment because it can eliminate a lot of potential dead ends in tracking down the ultimate cause. Often, this stage requires some trial and error.

The third stage is determining the exact cause and source of the problem. In my experience, even problems that at first appear very complicated can almost always be boiled down to just a few steps with a minimal amount of data.

Finally, in stage four the problem has to be solved, of course!

In future posts, I’ll share some tips and techniques that I use to simplify software problems.

All your base are belong to us

Excited news stories this morning about how your computer is vulnerable to attack from a phone plugged into your USB port made me chuckle. This “novel hack” involves making an Android phone mimic a USB keyboard in order to send keystrokes to the computer. Cool, but why wouldn’t a hacker just use a USB keyboard in the first place?

This is a good example of the over-hyped threats that often generate headlines. Sure, it’s true that allowing potentially compromised USB devices to be plugged into your computer could be harmful, but opening your web browser is much more likely to result in damage. The only thing remotely novel about this “attack” is the notion that your own phone could be the source. That’s interesting, but hardly worthy of the breathless coverage it’s getting.

The bottom line is that most malicious attacks are the result of you doing something you know you shouldn’t do, such as opening an email attachment or blithely running downloaded programs without regard to their trustworthiness. Very few malicious attacks occur without your explicit permission.

This post reminds me of the one time my computer was infected with a virus. It arrived on a floppy disk that I received from, of all places, my local health department. I guess smart phones are the new floppy disk.

What’s the diff?

One of the required tools in a programmer’s toolbox is a tool for comparing files. File comparison utilities have come a long way since the original Unix diff command, but this genre of utilities is still referred to as diff tools. For Windows, I use an open source diff utility called WinMerge. WinMerge includes a shell extension that makes it easy to use from within Windows Explorer.

The thing about diff tools is that they can be re-purposed and used for many tasks that have nothing to do with programming. For example, I use WinMerge to easily and quickly compare AutoCAD EULAs and other legal document revisions. I also use WinMerge to preview all changes before applying an upgrade to software like WordPress or Joomla on my web server. In both cases, I can quickly see exactly what changed.

The power of a diff tool is impressive in its own right, but it’s true power borders on amazing when using it as a merge tool to reconcile multiple simultaneous changes to the same files. In programming, reconciling and merging simultaneous revisions allows multiple programmers to work in parallel, thereby reducing management overhead and eliminating the inherent inefficiency of a serial workflow. I think that the principles and even the tools could just as well be applied to any collaborative project such as designing an airplane or building a skyscraper.

Most modern diff tools provide an easy to use GUI, and they do a lot more than just compare files. Even if you’re not a programmer, you should have a diff tool in your toolbox.

Debugging heap corruption with PageHeap

Heap corruption bugs in C++ code can present some difficult debugging challenges. Often the actual corruption goes unnoticed until some apparently random point in the future when the corrupt memory is accessed by unrelated code. In such cases, it’s almost impossible to infer the location of the bug using typical debugging tools.

Common causes of heap corruption memory errors are modifying objects after they have been destroyed, or double deleting a pointer. In such cases, you often find out about the corruption well after it occurs with unhelpful errors like this:

Free Heap block NNNNNNNN modified at NNNNNNNN after it was freed

Aside from defensive programming techniques, a common strategy for combating these types of bugs is to log and track all memory allocations so that it’s possible to backtrack and determine which code last used the memory address or range of memory that got corrupted. Unfortunately, sometimes it isn’t possible or practical to add such diagnostic capabilities. For example, ObjectARX add-ons use AutoCAD’s heap and memory allocation functions, and unless you happen to have access to the AutoCAD source code, there is no way to change them.

Luckily there is a debugging tool designed for this precise scenario: the PageHeap heap verifier tool. This tool has been built into Windows since Windows 2000, but you have to turn it on in order to use it. I use the Global Flags Utility (gflags.exe from Debugging Tools for Windows) to turn on the PageHeap verifier when needed. The PageHeap utility must be used in conjunction with a debugger such as WinDbg or Visual Studio.

A typical debugging use case for an ObjectARX add-on is to enable PageHeap monitoring for the entire AutoCAD process:

gflags /p /enable acad.exe /full

For monitoring only a specific ObjectARX module, use syntax like this:

gflags /p /enable acad.exe /full /dlls myarx.arx

Once PageHeap monitoring is enabled, you simply reproduce the heap corruption under a debugger like normal.  The PageHeap utility will break execution as soon as the heap corruption occurs. Examining the call stack at that point will show exactly which code is causing the corruption.

Once you’ve fixed the problem, don’t forget to disable the PageHeap utility again:

gflags /p /disable acad.exe

I have not needed to use the PageHeap utility very often, but it has been a godsend when I needed it.