Docs: Code Management
In this file we list some of the techniques that may be used
to increase one's efficiency when developing PETSc application codes. We have learned to
use these techniques ourselves and they have improved our efficiency tremendously. Editing and Compiling
The biggest time sink in code development is generally the cycle of
EDIT-COMPILE-LINK-RUN. We often see users working in a single window with a cycle such as:
- Edit a file with VI.
- Exit VI.
- Run make and see some error messages.
- Start VI and try to fix the errors; often starting VI hides the error messages so that
users cannot remember all of them and thus do not fix all the compiler errors.
- Run make generating a bunch of object (.o) files.
- Link the executable (which also removes the .o files). Users may delete the .o files
because they anticipate compiling the next time on a different machine that uses a
different compiler.
- Run the executable.
- Detect some error condition and restart the cycle.
In addition, during this process the user often moves manually among different
directories for editing, compiling, and running.
Several easy ways to improve the cycle:
- Work in two windows next to each other.
- Keep the editor running all the time, rather than continually exiting and entering it.
- Organize directories so code is not scattered all over the place.
- Never use scripts to run makefiles. (This is a common trick by people who have trouble
getting make to do exactly what they want.) Instead, learn a bit more about how PETSc uses
make to compile complicated application codes portably, quickly, and easily.
- If an application generates lots of .o files, store them in libraries (.a files), with
names that indicate the type of machine and optimization levels. For example, PETSc's
libraries are stored in the directory ${PETSC_DIR}/lib/lib${BOPT}/${PETSC_ARCH}/. Thus,
when switching to another machine that shares the same file system, one does not have to
remember to delete a bunch of .o files and edit makefiles accordingly.
- To alleviate the need for jumping between directories every time a test run is made,
store the executable in the directory where it has been compiled when possible.
Several PETSc authors worked previously for years using exclusively VI for editing. We
have found that EMACS improves one's programming efficiency enormously. For example,
- EMACS has a feature to allow the user to compile using make and have the editor
automatically jump to the line of source code where the compiler detects an error, even
when not currently editing the erroneous file.
- The etags feature of EMACS enables one to search quickly through a group of user-defined
source files (and/or PETSc source files) regardless of the directory in which they are
located. Etags usage tips (where M- denotes the "meta" key in EMACS):
- To access the predefined PETSc tags in EMACS, type M-x visit-tags-table and specify the
file ${PETSC_DIR}/TAGS
- To move to where a PETSc function is defined, enter M-. This is useful for viewing
manpages and locating source code.
- To search for a string and move to the first occurrence, use M-x tags-search Then, to
locate later occurrences, type M-, In particular, this is useful for quickly locating
examples of usage of certain PETSc routines.
- Another quite helpful command is M-x tags-query-replace which replaces occurrences of a
string hroughout all tagged files (can retain any entries if desired); this enables quick
and painless revisions of source code, particularly when routines are spread throughout
many files and directories.
- Also, EMACS easily enables:
- editing files that reside in any directory and retaining one's place within each of them
- searching for files in any directory as one attempts to load them into the editor
Admittedly, it takes a while to get the hang of using EMACS. But we've found that the
increase in our programming efficiency has been well worth the time spent learning to use
this editor. We seriously believe that we could not have developed PETSc as it is
today if we had not switched to EMACS.
Debugging
Most code development for PETSc codes should be done on one processor; hence,using a
standard debugger such as dbx, gdb, xdbx, etc. is fine. For debugging parallel runs we
suggest Totalview if it is available on your machine. Otherwise, you can
run each process in a separate debugger; this is not the same as using a parallel
debugger, but in most cases it is not so bad. The PETSc, the run-time options
-start_in_debugger [-display xdisplay:0] will open separate windows and debuggers for each
process. You should debug using the BOPT=g versions of the libraries.
It really pays to learn how to use a debugger; you will end up writing more interesting
and far more ambitious codes once it is easy for you to track down problems in the
codes.
Other suggestions
- Choose consistent and obvious names for variables and functions. (Short variable names
may be faster to type, but by using longer names you don't have to remember what they
represent since it is clear from the name.)
- Use informative comments.
- Leave space in the code to make it readable.
- Line things up in the code for readability. Remember that any code written for an
application will be visited over and over again, so spending an extra 20
percent of effort on it the first time will make each of those visits faster and more
efficient.
- Realize that you will have to debug your code. No one
writes perfect code, so always write code that may be debugged and learn how to use a
debugger. In most cases using the debugger to track down problems is much fast than using
print statements.
- Never hardwire a large problem size into your code. Instead, allow a
command line option to run a small problem. We've seen people developing codes who have to
wait 15 minutes after starting a run to reach the crashing point; this is an inefficient
way of developing code.
- Develop your code on the simplest machine to which you have access. We have accounts on
IBM SPs, Cray T3D, Intel Paragon, SGIs etc, but we write and initially test all our code
on Sun Sparcstations because the user interface is friendlier and the turn-around time for
compiling and running is much faster than for the parallel machines. We only use the
parallel machines for large jobs. Since PETSc code is completely portable, switching to
the parallel machine from our Sun development environment simply means logging into
another machine -- there are no code or makefile changes.
- Never develop code directly on an MPP; your productivity will be much lower than if you
developed on a well-organized workstation.
- Keep your machines' operating systems and compilers up-to-date (or force your systems
people to do this :-). You should always work with whatever tools are currently the best
available. It may seem that you are saving time by not spending the time upgrading your
system, but, in fact, your loss in efficiency by sticking with an out-dated system is
probably larger than then the time required to keep it up-to-date.
Fortran notes
Since Fortran77 does not provide type checking of routine input/output parameters, we
find that many errors encountered within PETSc Fortran77 programs result from accidentally
using incorrect calling sequences. Thus, if your Fortran program is crashing, one of the
first things to check is that all subroutines are being called with the correct arguments.
Such mistakes are immediately detected during compilation when using C/C++. Thus, using a
mixture of C/C++ and Fortran often works well for programmers who wish to employ Fortran
for the core numerical routines within their applications. In particular, one can
effectively write PETSc driver routines in C/C++, thereby preserving flexibility within
the program, and still use Fortran when desired for underlying numerical computations.
When passing floating point numbers into Fortran subroutines always make sure you have
them marked as double precision (e.g., pass in 10.d0 instead of 10.0). Otherwise, the
compiler interprets the input as a single precision number, which can cause crashes or
other mysterious problems. Make sure to declare all variables (do not use the implicit
feature of Fortran). In fact, we highly recommend using the implicit
none option at the begining of each Fortran subroutine you write.