6.033 Weekly Paper 2 on Distributed Virtual Memory

With network performance increasing faster than disk and physical memory performance, it is becoming increasingly practical to use resources on other machines on a local area network (LAN). For example, rather than swapping memory to a local disk, it might make sense to swap to the physical memory of a "swap server" on the LAN. This might increase average performance and save money by reducing the total amount of RAM needed. However, it introduces a number of problems with security, interactive performance, and reliability. The solutions to these problems both increase the complexity of the system and remove many of the benefits.

Swapping pages to a remote machine rather than to a local disk introduces a number of security problems. When swapping to a local disk, most modern operating systems provide no way for unprivileged users to observe either the contents of the swap space or transactions into and out of it. Access to modify the swap space itself is limited to either the kernel or a privileged memory server. However, neither of these protections holds on a LAN. Some types of LANs (particularly bus-topology networks such as Ethernet) are vulnerable to packet sniffing and source address spoofing. To secure against these attacks, it becomes necessary to employ both authentication and data encryption. The overhead for this alone may outweigh the performance gain. Because information such as passwords is often stored in memory, failing to encrypt may lead to security holes rather than just breaches of privacy.
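As a rough illustration of the authentication half of this requirement, the sketch below tags each swapped-out page with an HMAC so that a client can reject pages that were forged or altered in transit. The function names, the 4096-byte page size, and the pre-shared key are assumptions made for the example, not part of any particular system. The sketch does not encrypt the page, so it does nothing against packet sniffing, and it does not stop an attacker from replaying an older, validly tagged copy of the same page.

```python
import hashlib
import hmac

PAGE_SIZE = 4096   # assumed page size in bytes
TAG_SIZE = 32      # length of an HMAC-SHA256 tag in bytes


def seal_page(key: bytes, page_number: int, page: bytes) -> bytes:
    """Append an HMAC-SHA256 tag covering the page number and contents."""
    header = page_number.to_bytes(8, "big")
    tag = hmac.new(key, header + page, hashlib.sha256).digest()
    return page + tag


def open_page(key: bytes, page_number: int, sealed: bytes) -> bytes:
    """Verify the tag on a page returned by the swap server, then return the page."""
    page, tag = sealed[:-TAG_SIZE], sealed[-TAG_SIZE:]
    header = page_number.to_bytes(8, "big")
    expected = hmac.new(key, header + page, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("page failed authentication")
    return page
```

Every page-out and page-in now pays for one hash over the whole page in addition to the network transfer, which is the kind of overhead the paragraph above refers to.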

Reading and writing swapped pages over a LAN raises a number of performance issues that are less relevant when swapping to disk. When a page fault occurs on a machine with swap space on the local disk, reading the page in from disk generally takes an amount of time that depends almost entirely on the local machine. When the page is accessed over a network, factors outside the local machine, such as network contention and server load, also affect interactive performance.
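The dependence on other machines can be made concrete with a deliberately crude model. The function and its parameters below are hypothetical, and the model assumes the client receives a fixed share of the link and that the server handles queued requests one at a time; real networks and servers behave less simply.

```python
def remote_fault_seconds(page_bytes: int,
                         link_bits_per_s: float,
                         share_of_link: float,
                         server_service_s: float,
                         requests_ahead: int) -> float:
    """Estimate the time to fetch one page from a swap server.

    share_of_link is the fraction (0 < share <= 1) of the link this client
    gets under contention; requests_ahead is how many requests the server
    must finish before it starts on ours.
    """
    if not 0 < share_of_link <= 1:
        raise ValueError("share_of_link must be in (0, 1]")
    transfer = page_bytes * 8 / (link_bits_per_s * share_of_link)
    waiting = server_service_s * requests_ahead
    return waiting + server_service_s + transfer


# Illustrative inputs only: a 4096-byte page on a 10 Mbit/s link.
idle = remote_fault_seconds(4096, 10e6, 1.0, 0.001, 0)
busy = remote_fault_seconds(4096, 10e6, 0.25, 0.001, 5)
```

In this model, the `busy` case is slower than the `idle` case purely because of what other machines are doing, which is the effect described above.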

Whereas the preceding paragraph raises only performance issues, distributing the memory system has reliability implications which are far more significant. Unless redundant swap servers are used, this whole approach becomes risky and fault-prone. If the network were to fail, any processes which had swapped out pages would be unable to access those pages until the network came back up. If the server itself were to fail, all pages stored on it would most likely be lost (unless they were backed up to disk) due to the volatile nature of RAM. This lack of fault-tolerance would be unacceptable in almost any real-world environment. Leslie Lamport said something to the effect of, "A distributed system is where you are unable to get work done because a machine you've never heard of is down." Any system will eventually fail, and having the loss of data and processes be the result of a single point of failure would be very undesirable.
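The effect of redundant swap servers can be estimated with a simple calculation. The sketch below assumes that server failures are independent of one another, that every replica holds every swapped page, and that all replicas sit behind one shared network; the function name and parameters are invented for the example.

```python
def swap_availability(network_up: float, server_up: float, replicas: int) -> float:
    """Probability that a swapped-out page can be fetched right now.

    network_up and server_up are the fractions of time the shared network
    and a single swap server are working. The page is reachable when the
    network is up and at least one replica is up.
    """
    if replicas < 1:
        raise ValueError("need at least one swap server")
    all_servers_down = (1.0 - server_up) ** replicas
    return network_up * (1.0 - all_servers_down)
```

Under these assumptions, adding replicas drives the server term toward 1, but the result can never exceed `network_up`: the shared network remains a single point of failure however many servers are added.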

Erik Nygren (nygren@mit.edu), 1995.02.21