HOW TO FIND MEMORY CORRUPTION BUGS
by Greg Hudson <ghudson@mit.edu>
Written 7/22/95
Last updated 7/22/95

Some of the most frustrating errors you will come across in Unix C and
C++ programs are errors which corrupt memory.  This document describes
various approaches to testing and finding memory corruption errors in
Unix programs.  The document is organized as follows:

1. Analysis of the problem
2. Specific approaches
   1. Purify
   2. Checker
   3. Electric Fence
   4. Codecenter
   5. gcc with bounds checking
   6. Debugging malloc libraries
A. Software available on Athena


1. ANALYSIS OF THE PROBLEM

Bugs in any kind of program can be divided into two categories: bugs
which cause visibly incorrect behavior as soon as the incorrect code
executes, and bugs which corrupt state (variable values, data
structures, files, etc.) such that correct code behaves incorrectly
later on.  Bugs in the former category are usually easy to find and
fix, since you can simply trace the execution of the code up to the
point of the incorrect behavior and see which piece of code failed.
Bugs in the second category are often much harder to find, since there
is no simple way of determining where the state of the program was
corrupted.

Memory corruption bugs can fall into either category.  If your Unix
program tries to dereference an uninitialized pointer, it will usually
fail immediately (with a "segmentation fault" or "bus error").  On the
other hand, if your Unix program writes beyond the end of an array or
allocated block of memory, your program may crash much later on, and
you may have a very difficult time finding out where your memory
storage became corrupt.

It is important to note that not all state corruption bugs are memory
corruption bugs.  Your program's memory storage, as managed by the
compiler's stack allocation and malloc(), is a part of your program's
state, which has well-defined, standardized invariants as well as a
set of operations (such as reading or writing freed memory) which are
never valid.  Your program almost certainly has other invariants
(which may not be documented or well-defined), and corruption of these
invariants may lead to memory corruption indirectly.  Thus, finding
memory corruption bugs may only reveal another layer of state
corruption in your own data structures.  The best way to find
corruption in your own data is to (a) carefully define and document
the invariants of your program's state, and (b) periodically check
your program's state, either piecemeal using assert() calls sprinkled
around your code, or by writing a procedure to check some large amount
of your program's state and calling it wherever you suspect state
might have been corrupted.
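
As a minimal sketch of approach (b), assume a hypothetical doubly
linked list (struct list, struct node, and both function names are
illustrative and not taken from any real library).  Its documented
invariants are that each node's prev pointer matches the node before
it, that tail is the last node, and that count matches the number of
nodes.  One checking procedure verifies all three, and each mutating
operation asserts it on entry and exit:

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *prev, *next;
    int value;
};

struct list {
    struct node *head, *tail;
    size_t count;               /* invariant: number of nodes in the list */
};

/* Return 1 if every documented invariant holds, 0 otherwise. */
int list_is_valid(const struct list *l)
{
    const struct node *n, *prev = NULL;
    size_t seen = 0;

    if (l == NULL)
        return 0;
    for (n = l->head; n != NULL; prev = n, n = n->next) {
        if (n->prev != prev)
            return 0;           /* back pointer disagrees with the chain */
        if (++seen > l->count)
            return 0;           /* too many nodes; also bounds the loop */
    }
    return prev == l->tail && seen == l->count;
}

/* Append a node, checking the invariants before and after the change. */
void list_append(struct list *l, struct node *n)
{
    assert(list_is_valid(l));
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;
    l->tail = n;
    l->count++;
    assert(list_is_valid(l));
}
```

Calling list_is_valid() at the points where you suspect corruption
narrows the search to the code which ran between the last check that
passed and the first one that failed.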

That said, the remainder of this document will focus on finding
operations which are not valid according to the standard C and C++
memory management discipline:

	* Reading or writing unallocated memory
	* Reading uninitialized memory
	* Freeing memory which was never allocated
	* Freeing memory which was already freed
	* Memory leaks (allocated memory which cannot be referenced
	  directly or indirectly through pointers on the stack or in
	  static storage)
	* Array references beyond the bounds of an array (which may
	  happen to reference allocated, initialized memory, but are
	  still invalid)

There is no single, perfect method for finding all six categories of
problems, but there are approaches which will find problems in most
cases.  This document will discuss approaches which try to find memory
access violations by checking every memory reference, and other
approaches which try to detect past violations.
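
To illustrate the second kind of approach, here is a minimal sketch
of detecting a past violation with guard bytes.  The names
guard_malloc(), guard_check(), and guard_free() are hypothetical, and
the layout (a 32-byte header, 8 guard bytes on each side, fill value
0xA5) is an assumption made for the example, not the design of any
particular library.  The wrapper surrounds each block with a known
pattern and checks later whether the pattern is still intact:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Layout of one block: [size ... front guard][user data][rear guard].
   HEADER_SIZE is assumed to be a multiple of the strictest alignment. */
enum { HEADER_SIZE = 32, GUARD_SIZE = 8, GUARD_BYTE = 0xA5 };

/* Allocate size bytes with guard bytes before and after the user data. */
void *guard_malloc(size_t size)
{
    unsigned char *raw;

    if (size > (size_t)-1 - HEADER_SIZE - GUARD_SIZE)
        return NULL;                            /* request would overflow */
    raw = malloc(HEADER_SIZE + size + GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);            /* remember the user size */
    memset(raw + HEADER_SIZE - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return raw + HEADER_SIZE;
}

/* Return 0 if both guards are intact, -1 if the front guard was
   overwritten, 1 if the rear guard was overwritten. */
int guard_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_SIZE;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GUARD_SIZE; i++)
        if (raw[HEADER_SIZE - GUARD_SIZE + i] != GUARD_BYTE)
            return -1;
    for (i = 0; i < GUARD_SIZE; i++)
        if (user[size + i] != GUARD_BYTE)
            return 1;
    return 0;
}

/* Free a block from guard_malloc(), aborting if a guard was damaged. */
void guard_free(void *ptr)
{
    if (ptr == NULL)
        return;
    assert(guard_check(ptr) == 0);
    free((unsigned char *)ptr - HEADER_SIZE);
}
```

A check like this only reports that a guard was damaged at some point
before the check ran; it does not say which statement did the damage.
It also misses reads, and writes which skip over the guard bytes.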


2. SPECIFIC APPROACHES

2.1. PURIFY

Purify is a commercial product produced by Pure Software, Inc.
Purify works by modifying the executable code produced by the C or C++
compiler to check every memory access.  As such, Purify is heavily
dependent on the instruction set of the target platform; at the time
of this writing, Purify only runs on Sun Sparc machines under SunOS or
Solaris.

Of the approaches we will discuss, Purify is probably the easiest to
use.  You simply prepend "purify" to the link line you use to build
your program, and Purify will instrument all of the object files and
libraries with memory access checks.  When you run the program, Purify
will display a window on your X display reporting reads or writes to
unallocated memory, reads of uninitialized memory, and frees of memory
which was never allocated or which was previously freed.  (If you do not
have an X display, or if you specifically disable the window option,
Purify will write its problem reports to standard error [XXX or is
it /dev/tty?]).  Purify has a flexible mechanism for ignoring expected
access violations depending on the call chain (for instance, the
Solaris C library often reads uninitialized memory).  After your
program has finished running, Purify will look for potential memory
leaks using a conservative garbage collection strategy (that is,
Purify may fail to find memory leaks because it treats numeric values
as pointers, but such confusions are rare since the vast majority of
numeric values in most programs are not valid pointer values).
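
The conservative strategy described above can be sketched as follows.
This is an illustration of the general idea only, not Purify's actual
implementation; struct block, scan_region(), and count_leaks() are
hypothetical names, and the sketch assumes that a pointer has the same
size and representation as uintptr_t.  Every aligned word in the root
area, and in every block already found reachable, is treated as a
possible pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct block {
    const unsigned char *start;     /* first byte of the allocated block */
    size_t size;                    /* size of the block in bytes */
    int reachable;                  /* set by count_leaks() */
};

/* Treat every word in the region as a possible pointer and mark each
   block it appears to point into.  Returns the number newly marked. */
static size_t scan_region(const unsigned char *region, size_t bytes,
                          struct block *blocks, size_t nblocks)
{
    size_t off, i, marked = 0;

    for (off = 0; off + sizeof(uintptr_t) <= bytes; off += sizeof(uintptr_t)) {
        uintptr_t word;

        memcpy(&word, region + off, sizeof word);
        for (i = 0; i < nblocks; i++) {
            uintptr_t lo = (uintptr_t)blocks[i].start;

            if (!blocks[i].reachable && word >= lo
                && word - lo < blocks[i].size) {
                blocks[i].reachable = 1;
                marked++;
            }
        }
    }
    return marked;
}

/* Mark everything reachable from the roots, directly or through other
   reachable blocks, and return the number of blocks left unmarked. */
size_t count_leaks(const void *roots, size_t root_bytes,
                   struct block *blocks, size_t nblocks)
{
    size_t i, marked, leaks = 0;

    for (i = 0; i < nblocks; i++)
        blocks[i].reachable = 0;
    marked = scan_region(roots, root_bytes, blocks, nblocks);
    while (marked > 0) {            /* repeat until nothing new is found */
        marked = 0;
        for (i = 0; i < nblocks; i++)
            if (blocks[i].reachable)
                marked += scan_region(blocks[i].start, blocks[i].size,
                                      blocks, nblocks);
    }
    for (i = 0; i < nblocks; i++)
        if (!blocks[i].reachable)
            leaks++;
    return leaks;
}
```

Because the scan cannot tell an integer from a pointer, an integer
which happens to equal the address of a leaked block makes that block
look reachable, so the leak goes unreported.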

Purify has the following limitations:

	* It is commercial, and expensive.
	* It only runs on Sun Sparcstations.
	* It will not work correctly on multithreaded code (because of
	  multiple different thread stacks).
	* It will not find out-of-bounds array references if they
	  happen to refer to allocated memory.
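
The last limitation can be illustrated with a small sketch (struct
record and the function name are hypothetical).  Rather than
performing the invalid access r.name[8], the code writes the same
byte legally through the object's byte representation.  On common
ABIs there is no padding between name and count, so the byte just
past name is the first byte of count, and a tool which only tracks
which memory is allocated has nothing to report:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct record {
    char name[8];
    int count;
};

/* Return 1 if writing the byte just past name[] changes count. */
int simulated_overrun_changes_count(void)
{
    struct record r;
    unsigned char *bytes = (unsigned char *)&r;
    size_t past_name = offsetof(struct record, name) + sizeof r.name;

    memset(&r, 0, sizeof r);
    assert(past_name < sizeof r);   /* still inside the same object */
    bytes[past_name] = 0xFF;        /* the byte r.name[8] would hit */
    return r.count != 0;
}
```

The write stays inside the storage of r, which is allocated and
initialized, yet it silently changes a neighboring field.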

2.2. CHECKER

Checker is similar to Purify, but is free [XXX developed by who?] and
runs on Intel machines under the Linux operating system.  Checker
instruments the assembly output of gcc, and is therefore as
platform-dependent as Purify.  Checker does not have all the user
interface features of Purify, and it will also [XXX check] not
instrument libraries which have already been compiled.

I have never used Checker myself, but apart from not instrumenting
previously compiled libraries it should find the same kinds of errors
as Purify does and have the same limitations.  I don't know if Checker
will help find memory leaks.

2.3. ELECTRIC FENCE

The Electric Fence is another free Linux tool, developed by Bruce
Perens, to find memory access violations.  The Electric Fence uses
mmap() to put different pieces of allocated memory in different
virtual memory pages.  I have not used the Electric Fence and do not
know the details of its strategy, but its approach may give it some
level of platform-independence and will also help find some
out-of-bounds array reference errors which wouldn't be found by Purify
or Checker (specifically, if the out-of-bounds access would ordinarily
refer to another block of allocated memory, but does not because the
mmap() strategy separates different blocks of memory).
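
As a rough sketch of how a page-based strategy can catch an overrun
at the moment it happens, the hypothetical guarded_alloc() below puts
each block at the end of its own pages and makes the following page
inaccessible with mprotect().  This is one plausible reading of the
approach, written for a POSIX system which provides MAP_ANONYMOUS; it
is not taken from the Electric Fence sources.  The helper
overrun_is_trapped() is also hypothetical, and forks a child so that
the expected fault does not kill the calling program:

```c
#define _DEFAULT_SOURCE             /* for MAP_ANONYMOUS with glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return size bytes whose last byte sits just before a guard page.
   The result may not be aligned for every type, and no matching free
   routine is shown in this sketch. */
void *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages, total;
    char *base;

    if (size == 0)
        return NULL;
    data_pages = (size + page - 1) / page;
    total = (data_pages + 1) * page;            /* one extra guard page */
    base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base + data_pages * page, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base + data_pages * page - size;     /* block ends at the guard */
}

/* Write one byte past the end of a guarded block in a child process.
   Return 1 if the child was killed by SIGSEGV or SIGBUS, 0 if it
   survived, -1 on error. */
int overrun_is_trapped(size_t size)
{
    char *p = guarded_alloc(size);
    pid_t pid;
    int status;

    assert(p != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile char *q = p;

        q[size] = 'x';                          /* lands in the guard page */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status)
        && (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

Placing the block at the end of its pages catches overruns past the
end but not underruns before the start, and giving every allocation
its own pages costs a great deal of memory for small blocks.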

2.4. CODECENTER

Codecenter (previously called Saber) is a commercial product of
CenterLine.  Codecenter is a full-fledged C interpreter, and can
therefore check all memory access violations including out-of-bounds
array accesses which can't be found by Purify or Checker.  As an
interpreter, Codecenter is not highly platform-dependent (although,
because it is a commercial product, you must rely on CenterLine to
port it to new platforms), but it is heavily language-dependent [XXX does it
do C++ right now?].  [XXX multithreaded code?]

2.5. GCC WITH BOUNDS CHECKING

2.6. DEBUGGING MALLOC LIBRARIES


A. SOFTWARE AVAILABLE ON ATHENA