6.826 Project Suggestions, Spring 2002

Ad-Hoc Mobile Networking

Creating a wireless network infrastructure is costly and requires the involvement of large organizations to look after the installed base. And all of this seems so unnecessary, since there are always lots of people wandering around with wireless devices that are perfectly capable of routing messages from a sender to a desired recipient. Why not just build a network out of whatever devices happen to be available at the time?

Potential issues you might want to consider:
a) How do you deal with the fact that devices are always moving around?
b) How does the system route a message to a device if the sender has no idea where the receiver is located or what route it might be able to use?
c) How much power are you going to consume from the devices of unsuspecting passers-by?

Relaxed Memory Consistency

It turns out that in a multiprocessor with caching, computer architects believe they can get better performance by building machines that are not guaranteed to preserve the observed order of writes. More specifically, if one thread performs a sequence of writes such as x = 1; y = 2; z = 3; starting from a state in which x, y, and z are all 0, it may be possible for other threads to observe states in which x = 0, y = 0, and z = 3, and so on. To make it possible to write correct programs anyway, these machines typically guarantee that a program that uses synchronization correctly will not observe any of these reorderings. Many papers have been published on this topic, but there is still a feeling in the community that we don't have a complete handle on the phenomenon. It would be nice to develop enough of a formal specification of what is going on to understand the problem better.

Striped File Systems

Traditional file systems have been implemented on top of a single disk. But this has drawbacks: large files must be read sequentially, and if the disk fails, you lose everything. Another way to implement a file system is to distribute the data across many disks. This way a system can read the data in parallel. And if you put some redundancy into the writes, the system can even tolerate the failure of one or more disks before it loses data.

Some questions to think about:
a) How would you build such a system using a collection of machines, each with a single disk?
b) How would you make the data secure against eavesdroppers attempting to determine the contents of a file?
c) What kinds of system issues would affect whether a machine observes any performance increase from reading or writing data in parallel?
d) What interface to the file system would you provide?

Package tracking system

Write a spec and a high-level implementation for the Fedex package tracking system. You might start by going to the Fedex web site and examining how the system works. The tricky part of this project is separating tracking from other functions, notably package routing. You should treat these independently: tracking takes as input the information that the bar code scanners pick up about the package identity at various points in the Fedex system, as well as the package source and intended destination. Routing is an entirely separate function, driven by information about current locations and destinations from the tracking system, as well as knowledge of available transport. Work on the tracking system first, and then go on to routing if you have time.
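To make the separation concrete, here is a minimal sketch, in Python and purely for illustration, of what a tracking spec might look like. The names (ScanEvent, TrackingSpec, register, scan, current_location) are invented for this sketch and are not part of any real Fedex interface. The point is only that the tracking state is an append-only history of scan events per package, plus the declared source and destination; routing would be a separate function that consults current_location and the destination.

# Hypothetical tracking spec sketch; all names are made up for illustration.
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ScanEvent:
    package_id: str   # identity read from the bar code
    location: str     # where the scanner is
    timestamp: float  # when the scan happened


class TrackingSpec:
    """Tracking state: an append-only history of scans per package,
    plus the declared source and intended destination."""

    def __init__(self) -> None:
        self._routes: Dict[str, tuple] = {}            # package_id -> (source, destination)
        self._history: Dict[str, List[ScanEvent]] = {}

    def register(self, package_id: str, source: str, destination: str) -> None:
        # A package enters the system with a source and intended destination.
        self._routes[package_id] = (source, destination)
        self._history[package_id] = []

    def scan(self, event: ScanEvent) -> None:
        # Each bar-code scan simply appends to the package's history.
        self._history.setdefault(event.package_id, []).append(event)

    def current_location(self, package_id: str) -> Optional[str]:
        # The externally visible query: where was the package last seen?
        events = self._history.get(package_id, [])
        return events[-1].location if events else None

    def history(self, package_id: str) -> List[ScanEvent]:
        return list(self._history.get(package_id, []))


if __name__ == "__main__":
    spec = TrackingSpec()
    spec.register("PKG-1", source="Boston", destination="Seattle")
    spec.scan(ScanEvent("PKG-1", "Boston hub", 1.0))
    spec.scan(ScanEvent("PKG-1", "Memphis hub", 2.0))
    print(spec.current_location("PKG-1"))   # -> "Memphis hub"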
Garbage collection

Write one or more specs for garbage collectors, and one or more high-level implementations. The specs should first deal with reachability, assuming an infinite pool of memory to draw from. They should then take a finite pool into account and say something about when a request won't be satisfied. You might look at mark-and-sweep, copying, and generational collectors. Note that collectors are used in log-structured file systems and object-oriented databases as well as in main-memory allocators.

Server farms

Write specs and high-level implementations for a collection of machines used to implement a web server farm. Such farms are usually organized into
* Stateful machines that store persistent data
* Stateless machines that do front-end processing
* Switches that send requests to the right places
Things to think about are scaling, initialization, partitioning of data, and replication both for availability and for load balancing. A rough sketch of this organization appears below.
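As a rough illustration of that three-layer organization, here is a small Python sketch. The class names (Switch, FrontEnd, StatefulServer) and the hash-based partitioning with a fixed number of replicas are assumptions chosen only for concreteness; a real farm would also have to deal with failures, reconfiguration, and keeping replicas consistent, which this sketch ignores.

# Hypothetical server-farm sketch; names and partitioning scheme are assumptions.
import hashlib
from typing import Dict, List


class StatefulServer:
    """Stores one partition of the persistent data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self.store[key] = value

    def get(self, key: str) -> str:
        return self.store.get(key, "<missing>")


class FrontEnd:
    """Stateless: processes a request, then talks to the right back ends."""

    def __init__(self, backends: List[StatefulServer], replicas: int = 2) -> None:
        self.backends = backends
        self.replicas = min(replicas, len(backends))

    def _owners(self, key: str) -> List[StatefulServer]:
        # Partition by hash of the key; keep a few replicas for availability.
        start = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(self.backends)
        return [self.backends[(start + i) % len(self.backends)]
                for i in range(self.replicas)]

    def handle_put(self, key: str, value: str) -> None:
        for server in self._owners(key):
            server.put(key, value)

    def handle_get(self, key: str) -> str:
        # Any replica could serve the read; pick the first for simplicity.
        return self._owners(key)[0].get(key)


class Switch:
    """Sends each request to some front end, round-robin for load balancing."""

    def __init__(self, front_ends: List[FrontEnd]) -> None:
        self.front_ends = front_ends
        self._next = 0

    def dispatch(self) -> FrontEnd:
        fe = self.front_ends[self._next]
        self._next = (self._next + 1) % len(self.front_ends)
        return fe


if __name__ == "__main__":
    backends = [StatefulServer(f"db{i}") for i in range(3)]
    front_ends = [FrontEnd(backends) for _ in range(2)]
    switch = Switch(front_ends)

    switch.dispatch().handle_put("user:42", "alice")
    print(switch.dispatch().handle_get("user:42"))   # -> "alice"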