6.897 Cloud Computing Reading List

Reading List for 6.897 Cloud Computing (Spring 2011)

This schedule is tentative and subject to change.

Feb 4 2011: Overview & kick-off

So what is a "cloud", anyway? Here are some attempts at definitions; the NIST definition appears to be gaining considerable traction. Also listed is a Netflix blog essay on what they have learned from using AWS (that post links to an earlier one on why they moved to AWS).

Agenda: Introduction and overview of seminar topics (Hari). Slides (needs MIT certificate).

[Berkeley:AboveTheClouds] Above the Clouds: A Berkeley View of Cloud Computing
M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz, A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica, M. Zaharia
Tech. Rep. UCB/EECS-2009-28, Feb 10, 2009.
[NIST:Definition] NIST definition of cloud computing (v15)
P. Mell and T. Grance. May 2010. (The web link above has other information too. The document with the definition (v15) is in MS Word -- ironic for a standards body to use a not-quite-standard document format! I'll convert it to PDF and make a local copy at MIT.)
5 Lessons We've Learned Using AWS
John Ciancutti, Netflix blog, December 16, 2010.

Feb 11 2011: Example IaaS/PaaS systems

Agenda (links to essays): CloudCmp (Katrina LaCurts), AWS/Azure/AppEngine overview (Daniel Firestone), Eucalyptus (Jeff Bezanson)

[CloudCmp] CloudCmp: Comparing Public Cloud Providers
Ang Li, Xiaowei Yang, Srikanth Kandula, Ming Zhang
Internet Measurement Conf., November 2010.
[Eucalyptus] The Eucalyptus Open-source Cloud-computing System
D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff and D. Zagorodnov
CCGrid: The 9th IEEE International Symposium on Cluster Computing and the Grid, May 2009.

Feb 18 2011: Scalable data storage -- "NOSQL" antecedents

The papers don't quite say so, but the Dynamo and BigTable papers below led to a number of NOSQL systems (HBase, Voldemort, Riak, Scalaris, Cassandra, Tokyo Cabinet, MongoDB, CouchDB, and others), all of which started from the premise that "Transactional SQL systems can't scale-out".

Agenda (links to essays): FBPhotos (Jonathan Ledlie -- Nokia), Dynamo (Alexandre Milouchev), BigTable (Garthee Ganeshapillai)

[FBPhotos] Finding a Needle in Haystack: Facebook's Photo Storage
D. Beaver, S. Kumar, H.C. Li, J. Sobel, and P. Vajgel
OSDI, October 2010.
Also read: Needle in a haystack: efficient storage of billions of photos
Peter Vajgel, Facebook note, April 30, 2009.
[Dynamo] Dynamo: Amazon's Highly Available Key-value Store
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels
SOSP, October 2007.
Also read: Eventually Consistent - Revisited, W. Vogels, Amazon Inc.
[BigTable] BigTable: A Distributed Storage System for Structured Data (ACM DL version using MIT's libproxy; requires MIT cert. or retrieval from an MIT IP address)
F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wallach, M. Burrows, T. Chandra, A. Fikes, R. Gruber
ACM Trans. on Computer Systems (TOCS), June 2008 (journal version of OSDI 2006 paper).
Background: GFS (Google File System, SOSP 2003)

Feb 25 2011: Scalable transactional systems

Never ones to pass up a challenge, researchers have developed (and continue to develop) techniques to scale SQL. Examples include Megastore (which is built on top of BigTable), Hyder, and Schism, as well as H-Store (which we'll study next week).

Agenda: NOSQL (Mark Yen), Megastore (Adam Mustafa), Hyder & Schism (Carlo Curino)

[NOSQL] NoSQL Ecosystem
Jonathan Ellis, Rackspace, November 9, 2009
[Hyder] Hyder - A Transactional Record Manager for Shared Flash
Philip Bernstein, Colin Reid, Sudipto Das
CIDR 2011
[Schism] Schism: a Workload-Driven Approach to Database Replication and Partitioning
Carlo Curino, Evan Jones, Yang Zhang, and Sam Madden
VLDB 2010
[Megastore] Megastore: Providing Scalable, Highly Available Storage for Interactive Services
J. Baker, C. Bond, J. Corbett, J. Furman, A. Khorlin, J. Larson, J-M. Leon, Y. Li, A. Lloyd, V. Yushprakh
CIDR 2011

March 4 2011: Harnessing RAM and Flash Storage (for Better Latency, Throughput, and Power)

Agenda: RAMCloud (Tuan Huynh), HStore (Chenxia Liu), FAWN (Eric Lau)

[RAMCloud] The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM
J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum, S. Rumble, E. Stratmann, and R. Stutsman
SIGOPS OSR, 43(4), December 2009, pp. 92-105.
[HStore] The end of an architectural era: (it's time for a complete rewrite)
M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland
VLDB 2007.
[FAWN] FAWN: A Fast Array of Wimpy Nodes
D. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, V. Vasudevan
SOSP 2009.

March 11 2011: Programming frameworks

Agenda: MapReduce recap & MRvDB (Haitao Mao), Dryad/DryadLINQ (Lenin Ravindranath), Pig latin (Arvind Thiagarajan)

[MapReduce] MapReduce: Simplified Data Processing on Large Clusters
J. Dean and S. Ghemawat
OSDI 2004.
[MRvDB] MapReduce v. parallel DBMSs (two short papers):
- MapReduce and Parallel DBMSs: Friends or Foes?, M. Stonebraker, et al. Communications of the ACM, January 2010.
- MapReduce: A Flexible Data Processing Tool, J. Dean and S. Ghemawat, Communications of the ACM, January 2010.
[Dryad] Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks
M. Isard, M. Budiu, Y. Yu, A. Birrell, D. Fetterly
Also skim: [Dryadlinq] DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, Y. Yu, et al.
[Piglatin] Pig latin: a not-so-foreign language for data processing
C. Olston, B. Reed, U. Srivastava, R. Kumar, A. Tomkins
SIGMOD 2008.

March 18 2011: Securing cloud storage

Agenda: HAIL (Alina Oprea and Kevin Bowers -- RSA Labs), Depot (Wissam Jarjoui)

[HAIL] HAIL: A High-Availability and Integrity Layer for Cloud Storage
Kevin D. Bowers, Ari Juels, Alina Oprea
CCS 2009.
Slides.
[Depot] Depot: Cloud Storage with Minimal Trust
P. Mahajan, S. Setty, S. Lee, A. Clement, L. Alvisi, M. Dahlin, and M. Walfish
OSDI 2010.

March 25 2011: Spring break!

April 1 2011: Datacenter OS

Agenda: Mesos (Jeffrey Bezanson), DRF (Da Wang), fos (Subha Gollakota)

[Mesos] Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker, I. Stoica
NSDI, March 2011.
[DRF] Dominant Resource Fairness: Fair Allocation of Heterogeneous Resources in Datacenters
A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, I. Stoica
NSDI, March 2011.
[fos] An Operating System for Multicore and Clouds: Mechanisms and Implementation
D. Wentzlaff, C. Gruenwald III, N. Beckmann, K. Modzelewski, A. Belay, L. Youseff, J. Miller, and A. Agarwal
ACM Symp. Cloud Computing (SoCC), June 2010.

April 8 2011: Virtualized datacenters

Agenda: Orran Krieger (VMware) [to be confirmed]

[Marketplace] Enabling a Marketplace of Clouds: VMware's vCloud Director
O. Krieger, P. McGachey, A. Kanevsky
ACM SIGOPS OSR, 44(4), December 2010.
The link to the paper above will allow you to get the paper via MIT's libproxy; it needs MIT certificates. You can also get it directly from the ACM Digital Library.

April 15 2011: Cloud security

Agenda: WhatsNew (Charles Wright / Sophia Yuditskaya -- Lincoln Labs), HeyYou (Yod Watanaprakornkul), CryptDB (Raluca Ada Popa)

[WhatsNew] What's New About Cloud Computing Security?
Yanpei Chen, Vern Paxson, Randy H. Katz
[HeyYou] Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds
T. Ristenpart, E. Tromer, H. Shacham, and S. Savage
ACM CCS, 2009.
[CryptDB] CryptDB: A Practical Encrypted Relational DBMS
R.A. Popa, N. Zeldovich, H. Balakrishnan
MIT-CSAIL-TR-2011-005, January 2011.

April 22 2011: Datacenter Networking

Agenda: Incast (Sari Canelake), DCTCP (Shuo Deng), Onix (Jonathan Perry), CloudCost (David Lam)

[Incast] Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
V. Vasudevan, A. Phanishayee, H. Shah, E. Krevat, D. Andersen, G. Ganger, G. Gibson, B. Mueller
SIGCOMM 2009.
[DCTCP] DCTCP: Efficient Packet Transport for the Commoditized Data Center
M. Alizadeh, A. Greenberg, D. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, M. Sridharan
SIGCOMM 2010.
[Onix] Onix: A Distributed Control Platform for Large-scale Production Networks
T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, S. Shenker
[CloudCost] Cost of a cloud: Research Problems in Data Center Networks
A. Greenberg, J. Hamilton, D. Maltz, P. Patel
ACM SIGCOMM CCR, Jan. 2009.

April 29 2011: Datacenter routing and transport

Agenda: Portland & VL2 (Hari Balakrishnan), EnergyPropDC (Youngjune Gwon)

[VL2] VL2: A Scalable and Flexible Data Center Network
A. Greenberg, J. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. Maltz, P. Patel, S. Sengupta
SIGCOMM 2009.
[Portland] Portland: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
R. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, A. Vahdat
SIGCOMM 2009.
[EnergyPropDC] Energy Proportional Datacenter Networks
D. Abts, M. Marty, P. Wells, P. Klausler, H. Liu
ISCA 2010.

May 6 2011: Energy considerations

Agenda: EProp (David Lam), PowerProv (Ben Wheeler), ElectricBill (Ethan Xu)

[EProp] The Case for Energy-Proportional Computing
L. Barroso, U. Hölzle
IEEE Computer, vol. 40, 2007.
[PowerProv] Power Provisioning for a Warehouse-sized Computer
X. Fan, W-D. Weber, L. Barroso
ISCA 2007.
[ElectricBill] Cutting the Electric Bill for Internet-Scale Systems
A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, B. Maggs
SIGCOMM 2009.