Assignment 2

Assignment 2: Grid of Resistors

Life is too short to spend writing do loops.
-- Cleve Moler, 1993

The main goal of this assignment is to convince you of the ease of use of the data parallel programming model. For those problems that lend themselves to this model, programming a parallel machine can be easier than programming serial machines. Copy all of the files in my directory on scout for Homework #2 by executing the command (in your own appropriate directory).

cp ~edelman/1995/HW2/*.fcm

You should obtain four cmf files. Each one of these files is performing the same operation. It is computing the effective resistance and voltages of a grid of resistors. Each program is only one page long. Also please note the assignment checklist at the end of this document. We will also use some timing routines. To understand what these routines are measuring as well as the syntax of the commands see pages 33-35 of the CM-5 CMF Performance Guide. You will notice some write statements that are commented out towards the end of the program. You may choose to remove the comments to "see" the resistance.

The problem is to compute the voltages and the effective resistance of a 2n+1 by 2n+2 grid of 1 ohm resistors if a battery is connected to the two center points. This is a discrete version of finding the lines of force using iron filings for a magnet.

A contour plot of the answer (for a larger value of n) looks like

The method of solution that we will use here is successive overrelaxation (SOR) with red-black ordering. This is certainly not the fastest way to solve the problem, but it does illustrate many important programming ideas.

It is not so important that you know the details of SOR. (Some of the basic ideas may be found on pages 407-409 of Gil Strang's Introduction to Applied Mathematics, somewhat more in depth discussion may be found any serious numerical analysis text such as Stoer and Bulirsch's Introduction to Numerical Analysis.) What is important is that you see that the nodes are divided in half into red nodes and black nodes. During the first pass, the red nodes obtain the voltages as a weighted average of their original voltage, the input (if any) and the four surrounding black nodes. During the second pass, the black nodes obtain voltages from the four surrounding red nodes. In the infinite limit, the process converges to the correct answer.

Let us have a look at the first two programs sor_cshift and sor_eoshift. These programs define a 2n+1 by 2n+2 matrix v for voltages, and two logical arrays, red and black for the red/black nodes. The first and last rows and columns (the boundary) are never modified, they are "grounded", have voltage 0. The difference in the two programs is only that while in the first one the voltage of the neighbors is added by cshift, the second uses eoshift for the same thing. Last year, we assigned this problem to 18.337 and found a big difference in the performance in the two codes. It is a nuisance that such small choices can radically affect the performance. If you try it now you will probably see that the cshift version is only slightly faster than the eoshift version.

The third program sor_arraysection uses a quite different approach: both reds and blacks are determined as two array section ("odds and evens"). Information from neighbors is obtained directly through shifting the array sections.

We would like you to time say 200 (or whatever you like) iterations by the CM_timers. The time for array sections is very bad; array sections as they are implemented on the CM-5 incur a huge amount of overhead.

Your first programing assignment is the following: implement the analogous problem on a 2n+1 by 2n+1 by 2n+2 grid in three dimensions. Pick whichever code or code you prefer. This may still be done with two colors. The value of the optimal mu you have to use is 2*cos(pi/(2*n))+cos(pi/(2*n+1)))/3. Next solve the problem in four dimensions, with mu = (3*cos(pi/(2*n))+cos(pi/(2*n+1)))/4. Can you guess the effective resistances of the various meshes as n --> infinity ?

Now this gets harder. In our cshift and eoshift programs we use the where(reds) statement. Though you might have imagined that the compiler is smart enough to work in time proportional to the active part (i.e. the "true" part) of the statement, unfortunately it is not. In fact, the time spent is proportional to the entire size and even more time for conditioning.

We would like you to improve the program by putting all the red nodes into one array called red and all the black nodes in the same size array black. The data arrangement is given in Figure 1 for the general case, and in Figure 2 for n=4. For simplicity we require n to be even.

Figure 1. The data arrangement with 2 arrays

nFigure 2. The case of Figure 1, for n = 4

This arrangement is in our program sor_2arrays. Unfortunately, timing is not improved; use the timers to discover which is the really slow part of the program. Try to see if you can improve the program, using two arrays and CSHIFT only. Or try using four arrays, two for the even rows of red and black, two for the odd rows.

Assignment Checklist

Copy the four files
Understand and time sor_cshift and sor_eoshift
Repeat for sor_arraysection
Extend sor to three and four dimensions
What is the resistance for an infinite grid in d dimensions?
Understand and time sor_2arrays

For the ambitious:
Improve on sor_2arrays