Reading List for 6.897 Cloud Computing (Spring 2011)
This schedule is tentative and subject to change.
Feb 4 2011: Overview & kick-off
So what is a "cloud", anyway? Here are some attempts at
definitions. The NIST definition appears to be gaining considerable
traction. Also listed is a blog essay on what Netflix has learned
(that posting links to an earlier one on why they moved to
AWS).
Agenda: Introduction and overview of seminar topics
(Hari). Slides (needs MIT
certificate).
- [Berkeley:AboveTheClouds] Above
the Clouds: A Berkeley View of Cloud Computing
M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz,
A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica,
M. Zaharia
Tech. Rep. UCB/EECS-2009-28, Feb 10, 2009.
- [NIST:Definition] NIST
definition of cloud computing (v15)
P. Mell and T. Grance.
May 2010. (The web link above has other information too. The
document with the definition (v15) is in MS Word -- ironic for a
standards body to use a not-quite-standard document format! I'll convert it
to PDF and make a local copy at MIT.)
- 5 Lessons We've Learned Using AWS
John Ciancutti, Netflix
blog, December 16, 2010.
Feb 11 2011: Example IaaS/PaaS systems
Agenda (links to essays):
CloudCmp (Katrina LaCurts),
AWS/Azure/AppEngine
overview (Daniel Firestone),
Eucalyptus (Jeff Bezanson)
- [CloudCmp] CloudCmp:
Comparing Public Cloud Providers
Ang Li, Xiaowei Yang,
Srikanth Kandula, Ming Zhang
Internet Measurement Conf.,
November 2010.
- [Eucalyptus] The
Eucalyptus Open-source Cloud-computing System
D. Nurmi,
R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff and
D. Zagorodnov
CCGrid: The 9th IEEE International Symposium on
Cluster Computing and the Grid, May 2009.
Feb 18 2011: Scalable data storage -- "NOSQL" antecedents
The papers don't quite say it, but the Dynamo and BigTable papers below led to a number of NOSQL systems (HBase, Voldemort, Riak, Scalaris, Cassandra, Tokyo Cabinet, MongoDB, CouchDB, and others), all of which started from the premise that "transactional SQL systems can't scale out".
Agenda (links to essays):
FBPhotos (Jonathan Ledlie --
Nokia), Dynamo (Alexandre
Milouchev), BigTable
(Garthee Ganeshapillai)
- [FBPhotos] Finding
a Needle in Haystack: Facebook's Photo Storage
D. Beaver, S. Kumar, H.C. Li, J. Sobel, and P. Vajgel
OSDI, October 2010.
Also read: Needle in
a haystack: efficient storage of billions of photos
Peter Vajgel, Facebook note, April 30, 2009.
- [Dynamo] Dynamo:
Amazon's Highly Available Key-value Store
G. DeCandia,
D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin,
S. Sivasubramanian, P. Vosshall, and W. Vogels
SOSP, October 2007.
Also read: Eventually
Consistent - Revisited, W. Vogels, Amazon Inc.
- [BigTable]
BigTable:
A Distributed Storage System for Structured Data (ACM DL
version using MIT's libproxy; requires MIT cert. or retrieval
from an MIT IP address)
F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wallach, M. Burrows,
T. Chandra, A. Fikes, R. Gruber
ACM Trans. on Computer Systems (TOCS), June 2008 (journal version
of OSDI 2006 paper).
Background:
GFS
(Google File System, SOSP 2003)
Feb 25 2011: Scalable transactional systems
Never ones to pass up a challenge, researchers have in fact developed (and
continue to develop) techniques to scale SQL. Some examples are
Megastore (which is actually built on top of BigTable), Hyder, and
Schism, as well as H-Store (which we'll study next week).
Agenda: NOSQL (Mark Yen), Megastore (Adam Mustafa),
Hyder & Schism (Carlo Curino)
- [NOSQL]
NoSQL
Ecosystem
Jonathan Ellis, Rackspace, November 9, 2009
- [Hyder] Hyder - A Transactional Record Manager for Shared Flash
Philip Bernstein, Colin Reid, Sudipto Das
CIDR 2011
- [Schism]
Schism: a
Workload-Driven Approach to Database Replication and
Partitioning
Carlo Curino, Evan Jones, Yang Zhang, and Sam Madden
VLDB 2010
- [Megastore] Megastore: Providing Scalable, Highly Available
Storage for Interactive Services
J. Baker, C. Bond, J. Corbett, J. Furman, A. Khorlin, J. Larson,
J-M. Leon, Y. Li, A. Lloyd, V. Yushprakh
CIDR 2011
March 4 2011: Harnessing RAM and Flash Storage (for Better Latency,
Throughput, and Power)
Agenda: RAMCloud (Tuan Huynh), HStore (Chenxia
Liu), FAWN (Eric Lau)
- [RAMCloud] The
Case for RAMClouds: Scalable High-Performance Storage Entirely in
DRAM
J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich,
D. Mazières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum,
S. Rumble, E. Stratmann, and R. Stutsman
SIGOPS OSR, 43(4), December 2009, pp. 92-105.
- [HStore] The end
of an architectural era (it's time for a complete
rewrite)
M. Stonebraker, S. Madden, D. J. Abadi,
S. Harizopoulos, N. Hachem, and P. Helland
VLDB 2007.
- [FAWN] FAWN:
A Fast Array of Wimpy Nodes
D. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan,
V. Vasudevan
SOSP 2009.
March 11 2011: Programming frameworks
Agenda: MapReduce recap & MRvDB (Haitao Mao),
Dryad/DryadLINQ (Lenin Ravindranath), Pig latin (Arvind
Thiagarajan)
- [MapReduce]
MapReduce:
Simplified Data Processing on Large Clusters
J. Dean and
S. Ghemawat
OSDI 2004.
- [MRvDB] MapReduce v. parallel DBMSs (two short papers):
- [Dryad]
Dryad:
Distributed Data-Parallel Programs from Sequential Building
Blocks
M. Isard, M. Budiu, Y. Yu, A. Birrell, D. Fetterly
Also skim: [Dryadlinq] DryadLINQ: A System for
General-Purpose Distributed Data-Parallel Computing Using a
High-Level Language, Y. Yu, et al.
- [Piglatin]
Pig
latin: a not-so-foreign language for data processing
C. Olston, B. Reed, U. Srivastava, R. Kumar, A. Tomkins
SIGMOD 2008.
March 18 2011: Securing cloud storage
Agenda: HAIL (Alina Oprea and Kevin Bowers -- RSA
Labs), Depot (Wissam Jarjoui)
March 25 2011: Spring break!
April 1 2011: Datacenter OS
Agenda: Mesos (Jeffrey Bezanson), DRF (Da Wang),
fos (Subha Gollakota)
- [Mesos] Mesos: A
Platform for Fine-Grained Resource Sharing in the Data Center
B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A.D. Joseph,
R. Katz, S. Shenker, I. Stoica
NSDI, March 2011.
- [DRF]
Dominant
Resource Fairness: Fair Allocation of Heterogeneous Resources in
Datacenters
A. Ghodsi, M. Zaharia, B. Hindman,
A. Konwinski, S. Shenker, I. Stoica
NSDI, March 2011.
- [fos]
An
Operating System for Multicore and Clouds: Mechanisms and
Implementation
D. Wentzlaff, C. Gruenwald III, N. Beckmann, K. Modzelewski, A. Belay,
L. Youseff, J. Miller, and A. Agarwal
ACM Symp. Cloud Computing (SoCC), June 2010.
April 8 2011: Virtualized datacenters
Agenda: Orran Krieger (VMware) [to be confirmed]
April 15 2011: Cloud security
Agenda: WhatsNew (Charles Wright / Sophia
Yuditskaya -- Lincoln Labs), HeyYou (Yod Watanaprakornkul), CryptDB (Raluca Ada
Popa)
- [WhatsNew] What's New About Cloud Computing Security?
Yanpei Chen, Vern Paxson, Randy H. Katz
- [HeyYou]
Hey,
You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds
T. Ristenpart, E. Tromer, H. Shacham, and S. Savage
ACM CCS, 2009.
- [CryptDB] CryptDB: A
Practical Encrypted Relational DBMS
R.A. Popa, N. Zeldovich, H. Balakrishnan
MIT-CSAIL-TR-2011-005, January 2011.
April 22 2011: Datacenter Networking
Agenda: Incast (Sari Canelake), DCTCP (Shuo Deng), Onix
(Jonathan Perry), CloudCost (David Lam)
- [Incast]
Safe
and Effective Fine-grained TCP Retransmissions for Datacenter
Communication
V. Vasudevan, A. Phanishayee, H. Shah, E. Krevat, D. Andersen,
G. Ganger, G. Gibson, B. Mueller
SIGCOMM 2009.
- [DCTCP]
DCTCP:
Efficient Packet Transport for the Commoditized Data
Center
M. Alizadeh, A. Greenberg, D. Maltz, J. Padhye, P. Patel,
B. Prabhakar, S. Sengupta, M. Sridharan
SIGCOMM 2010.
- [Onix] Onix:
A Distributed Control Platform for Large-scale Production
Networks
T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu,
R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, S. Shenker
- [CloudCost] Cost
of a cloud: Research Problems in Data Center Networks
A. Greenberg, J. Hamilton, D. Maltz, P. Patel
ACM SIGCOMM CCR, Jan. 2009.
April 29 2011: Datacenter routing and transport
Agenda: Portland & VL2 (Hari Balakrishnan),
EnergyPropDC (Youngjune Gwon)
- [VL2]
VL2:
A Scalable and Flexible Data Center Network
A. Greenberg, J. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri,
D. Maltz, P. Patel, S. Sengupta
SIGCOMM 2009.
- [Portland]
Portland:
A Scalable Fault-Tolerant Layer 2 Data Center Network
Fabric
R. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri,
S. Radhakrishnan, V. Subramanya, A. Vahdat
SIGCOMM 2009.
- [EnergyPropDC]
Energy
Proportional Datacenter Networks
D. Abts, M. Marty,
P. Wells, P. Klausler, H. Liu
ISCA 2010.
May 6 2011: Energy considerations
Agenda: EProp (David Lam), PowerProv (Ben
Wheeler), ElectricBill (Ethan Xu)
Other readings (that we unfortunately may not have time for...)
- Benchmarking Cloud
Serving Systems with YCSB
B.F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, R. Sears
SoCC, June 2010.
- FATE and DESTINI: A Framework for Cloud Recovery Testing
Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Joseph M. Hellerstein, Andrea
C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen, Dhruba
Borthakur
NSDI 2011.
- Availability
in Globally Distributed Storage Systems
D. Ford, F. Labelle, F.I. Popovici, M. Stokely, V. Truong,
L. Barroso, C. Grimes, and S. Quinlan
OSDI 2010.
- [HadoopDB]