Austin Clements

Decentralized Deduplication

Eliminating duplicate data from a file system can vastly reduce storage requirements, especially in large, shared storage environments. Unlike existing deduplication systems, which all require either a centralized component or modification of the storage system itself, our system supports symmetrically decentralized, cluster-scale deduplication in a shared-disk file system.
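
As a rough illustration of the basic idea only (not the decentralized design described in the paper), duplicate blocks can be detected by indexing them under a content hash. The sketch below is a single-process toy; `BlockStore`, `write_file`, `read_file`, and the 4 KB block size are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes
        self.refcount = {}  # digest -> number of references from files

    def put(self, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = block  # first copy: store the data
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest


def write_file(store: BlockStore, data: bytes) -> list:
    """Split data into blocks and return the list of block digests."""
    return [store.put(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)]


def read_file(store: BlockStore, recipe: list) -> bytes:
    """Reassemble a file from its list of block digests."""
    return b"".join(store.blocks[d] for d in recipe)
```

Note that this toy index is exactly the kind of centralized component the system above avoids; it only shows why identical blocks need to be stored once.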

Decentralized Deduplication in SAN Cluster File Systems
Austin T. Clements, Irfan Ahmad, Murali Vilayannur, Jinyuan Li (June 17, 2009)
USENIX '09
Decentralized Data Deduplication in VMware's SAN File System
Austin T. Clements, Irfan Ahmad, Murali Vilayannur, Jinyuan Li (September 15, 2008)
VMworld '08 Poster

Xoc

Xoc is an extension-oriented compiler for C based on a custom meta-language, concrete generic syntax manipulation, and pass-free lazy compilation. To the best of our knowledge, it is the first compiler to support an extension-oriented paradigm, meaning that the user can select which independently-written language extensions to compose at compiler run-time, as opposed to the traditional monolithic extensible paradigm, in which each extension represents a whole new compiler. Extensions can extend both the syntax and the semantics of C, and are intended to be both terse and easy to write without knowledge of the internals of Xoc.

See also the Xoc homepage.

Xoc, an Extension-Oriented Compiler for Systems Programming
Russ Cox, Tom Bergan, Austin T. Clements, Frans Kaashoek, Eddie Kohler (March 1, 2008)
ASPLOS '08
A Comparison of Designs for Extensible and Extension-Oriented Compilers
Austin T. Clements (February 4, 2008)
Master's thesis

Optimizing Distributed Read-Only Transactions

This algorithm and its implementation explore the viability of an alternative approach to distributed transactions that optimizes for read-only transactions, the common case in file system workloads and many modern database applications. While typical distributed transactional systems weaken consistency guarantees in order to provide acceptable performance, our approach instead weakens causality guarantees while still ensuring true serializability.

Optimizing Distributed Read-Only Transactions Using Multiversion Concurrency
Dan R. K. Ports, Austin T. Clements, Irene Y. Zhang (December 15, 2007)
Final paper for 6.830 (Database Systems)

Plaid

Plaid is a pattern matching system for Scheme in which the pattern language is a declarative, reversible subset of Scheme itself. This allows pattern matching for abstract datatypes where a single definition serves as both the constructor and deconstructor for the data type. The pattern matching control-flow construct provided by Plaid is itself reversible, and is therefore available within the pattern language. This makes it possible to express canonicalization, alternate views of data, and non-determinism in a highly declarative style.

Plaid: Pattern Language for Abstract Datatypes
Dan R. K. Ports, Austin T. Clements, Irene Y. Zhang (May 14, 2007)
Final paper for 6.891 (Adventures in Advanced Symbolic Programming)

Guarded Atomic Actions for Haskell

GAAH is a Haskell library that implements a parallelism paradigm known as guarded atomic actions, a form of transactional memory. A module consists of a set of actions, each of which has a guard. At any given time, all actions whose guards are satisfied can fire in parallel. Actions running in parallel are always guaranteed an atomic view of the module's state and their modifications of the state are transactional.
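
To make the model concrete, here is a minimal sketch of guarded actions with transactional commits. It is not GAAH's API: the names `Rule`, `step`, and `run` are hypothetical, and the sketch fires enabled actions one at a time rather than in parallel, which is an assumed simplification. Each action runs on a private copy of the state and is committed only if it completes.

```python
import copy


class Rule:
    """An action paired with the guard that enables it."""

    def __init__(self, name, guard, action):
        self.name = name
        self.guard = guard    # state -> bool
        self.action = action  # mutates the state it is given


def step(state, rules):
    """Fire each enabled rule once, serially; return the names that fired."""
    fired = []
    for rule in rules:
        if not rule.guard(state):
            continue
        scratch = copy.deepcopy(state)  # private view for this action
        try:
            rule.action(scratch)
        except Exception:
            continue                    # abort: discard the scratch copy
        state.clear()
        state.update(scratch)           # commit all changes at once
        fired.append(rule.name)
    return fired


def run(state, rules, max_steps=1000):
    """Step until no guard is satisfied (or max_steps is reached)."""
    for _ in range(max_steps):
        if not step(state, rules):
            break
    return state


def subtract(s):
    s["a"] -= s["b"]


def swap(s):
    s["a"], s["b"] = s["b"], s["a"]


# Example module: greatest common divisor by repeated subtraction.
gcd_rules = [
    Rule("subtract", lambda s: s["a"] >= s["b"] and s["b"] != 0, subtract),
    Rule("swap", lambda s: s["a"] < s["b"], swap),
]
```

Calling `run({"a": 12, "b": 18}, gcd_rules)` leaves the greatest common divisor in `a` once no guard holds.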

Guarded Atomic Actions for Haskell
Austin Clements and Yang Zhang (December 13, 2006)
Final paper for 6.827 (Multithreaded Parallelism)

Canopy

Canopy is an environment for experimentation and debugging of network systems. It provides total control over the system by running every node and the network in virtual machines. Time is carefully controlled and synchronized between all virtual machines, allowing complete control over network latencies, as well as behavior such as packet dropping. Furthermore, the entire system can be rolled back to any point in the past to deterministically experiment with the effects of differing network behavior. Canopy is implemented as a distributed system to help alleviate the performance impacts of full emulation.

Canopy: A Controlled Emulation Environment for Network System Experimentation
Dan Ports, Austin Clements, Jeff Arnold (December 15, 2005)
Final paper for 6.829 (Computer Networks)

PersiFS

PersiFS is a versioned file system that retains a history of its contents. The entire contents of the file system as it appeared at any point in the past can be accessed from a special automount-like directory. The current version of the file system can be accessed and modified almost as efficiently as in regular non-persistent file systems. Unlike other versioned file systems, PersiFS provides the same efficiency when accessing past versions. Furthermore, PersiFS optimizes storage space by efficiently recognizing and coalescing common substrings across different versions of files and across different files.
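
The notion of reading the file system "as it appeared at any point in the past" can be sketched with a per-path version history and a binary search. This is only an illustration of the interface, not PersiFS's data structures: `VersionedFS` and its methods are hypothetical names, and the sketch handles neither deletion nor the substring coalescing described above.

```python
import bisect


class VersionedFS:
    """Toy versioned store: every write creates a new global version."""

    def __init__(self):
        self.version = 0
        self.history = {}  # path -> (sorted version numbers, contents)

    def write(self, path, data):
        """Record new contents for path and return the new version number."""
        self.version += 1
        versions, contents = self.history.setdefault(path, ([], []))
        versions.append(self.version)
        contents.append(data)
        return self.version

    def read(self, path, at=None):
        """Return the contents of path as of version `at` (default: latest)."""
        if at is None:
            at = self.version
        versions, contents = self.history.get(path, ([], []))
        i = bisect.bisect_right(versions, at)  # writes with version <= at
        if i == 0:
            raise FileNotFoundError(path)
        return contents[i - 1]
```

In this sketch a read of a past version costs the same binary search as a read of the current one.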

PersiFS: A Versioned File System with an Efficient Representation
Dan R. K. Ports, Austin T. Clements, Erik D. Demaine (October 24, 2005)
SOSP '05 Poster
Structures for Efficient File System-Scale Partial Persistence
Dan R. K. Ports, Austin T. Clements (May 12, 2005)
Final paper for 6.897 (Advanced Data Structures)
PersiFS: A Continuously Versioned Network File System
Austin T. Clements, Dan R. K. Ports, Ben A. Schmeckpeper, Hector Yuen (May 12, 2005)
Final paper for 6.824 (Distributed Systems Engineering)

√X

√X is a graphical window system developed in a few days as an answer to the "do something cool" challenge problem at the end of MIT's operating systems engineering course. Dan Ports and I figured it was the only obvious thing to do, given that for the last lab we had added a basic UNIX-like shell to the operating system we had written in the class. In exokernel style, √X consisted largely of isolated user-space programs and libraries. Over the few days we had to implement our "something cool" for the class, we wrote a full VESA graphics driver (including a Virtual-8086 mode driver), a mouse driver, various IPC mechanisms, a graphics library (including translucency and anti-aliasing support), the window system itself, and a number of graphical applications (including a terminal emulator). And on the due date we rested.

√X: A Window System for the JOS Operating System
Austin T. Clements, Dan R. K. Ports (December 8, 2004)
Project technical notes for 6.828 (Operating Systems Engineering)

Arpeggio

Arpeggio started as a summer research project to design and develop a completely decentralized peer-to-peer file sharing system based on the Chord distributed hash table. The goal was to provide a provable level of completeness in search results, using carefully designed distributed algorithms to maintain both time and space efficiency. The design has been completed and published, and my co-conspirator, Dan Ports, is working on the implementation for his Master's thesis.

Arpeggio: Metadata Searching and Content Sharing with Chord
Austin T. Clements, Dan R. K. Ports, David R. Karger (January 21, 2005)
IPTPS '05
Arpeggio: Efficient Metadata-based Searching and File Transfer with DHTs
Austin T. Clements, Dan R. K. Ports, David R. Karger (November 8, 2004)
ISW '04 Poster

Gizmoball

My 6.170 (Software Engineering) team implemented a pinball-like game called Gizmoball for our final project. Our implementation includes numerous innovations under the hood as well as in the user interface. The engine made extensive use of Java reflection to produce a highly decoupled system that essentially figured out what it was capable of at runtime, and our generalized "interaction engine" provided an abstract platform on which to implement the physics system. Our user interface made use of "frobbers" (i.e., pie menus) to make the game board editor intuitive and efficient to use, and won the 6.170 Usability Award.

Gizmoball Design Document
Austin Clements, Albert Leung, Dan Ports (May 11, 2004)
Final project report for 6.170 (Software Engineering)

Emulab

I spent two summers working with the Flux group at the University of Utah on a controlled network simulation and emulation testbed called Emulab. I spent one of these summers integrating our local-area system with a wide-area distributed experimentation platform called PlanetLab, allowing experiments to combine Emulab's nodes and highly controlled network environment with PlanetLab's worldwide network of nodes.

Implementing the Emulab-PlanetLab Portal: Experiences and Lessons Learned
Kirk Webb, Mike Hibler, Robert Ricci, Austin Clements, Jay Lepreau (December 5, 2004)
WORLDS '04

Active Doom

I spent the summer after my junior year of high school developing test and demonstration applications for the Java-based active network operating system JANOS, being developed at the University of Utah. The fruit of my work was an experiment in redesigning a real-world non-active protocol as an active protocol to see what power and efficiency could be gained. And what better protocol to start with than the multi-player protocol of Id Software's DOOM?

Active Doom: Applying Active Networks to Traditional Protocols
Austin T. Clements, Patrick Tullmann, Jay Lepreau (October 22, 2001)
SOSP '01 WIP Presentation