6.033 Design Project 2: YouCast, A Peer-to-Peer Streaming Service

I. Assignment

You will do design project 2 in teams of three students who are in your recitation or share the same instructor. There are four deliverables for Design Project 2:

A list of team members emailed to your recitation instructor by April 12, 2007.
One copy of a design proposal not exceeding 1200 words, due on Tuesday, April 24, 2007.
One copy of a design report not exceeding 5,000 words, due on Thursday, May 10, 2007.
An in-class presentation on May 15, 2007. Details about your presentation. While this presentation falls during the last week of classes, in consultation with the chair of the faculty we have determined that the assignment falls in the spirit of the end-of-term rules.

6.033 design reports are different from quizzes and problem sets. These projects, like those in real life, are under-specified, and it is your job to complete the specification sensibly, given the project requirements. As with real-world designs, those requirements often need some adjustment as you flesh out your design. We strongly recommend that you start early so that you can iterate your design. A good design will likely take more than a few days.

II. Introduction

You and your friends have decided to start a new company, YouCast. In contrast to YouTube, where people can share their videos with others, YouCast provides live video streams. Anyone who has a camera connected to the Internet can contribute a video stream. Internet users browse a list of available YouCast streams and watch whichever stream they like.

To launch a prototype, you deploy a few cameras around MIT. For example, you may position a camera at LaVerde's to show the waiting lines at the register. The camera allows people to estimate the waiting time if they go buy stuff now. Other cameras show the lines at the Stata cafeteria and Lobdell. Your cameras at the Z-center allow students to check for congestion and avoid going to the gym when it is full. You have also deployed cameras in class rooms so that students can follow the lectures from home.

YouCast also allows any Internet user to contribute live streams. You have told your friends at MIT and other universities about your prototype. Many of them got excited and decided to contribute video streams. Now, whenever they have a party, they stream it live on the Internet via YouCast.

II.1 The YouCast System

The YouCast system has three types of machines, as shown in Figure 1:

Tracker: the tracker is a server that maintains a list of video servers that stream live video. For example, the tracker maintains the IP address of the PCs connected to the camera in LaVerde's and a short description of the video produced by that camera, e.g., "camera shows the register line at LaVerde's, MIT." Users who are interested in viewing a YouCast stream contact the tracker and search the list of available streams to find something they want to watch.
Video Servers: A YouCast video server streams the output of a particular camera to interested users. For example, the PC connected to the camera at LaVerde's runs a YouCast video server. The server first contacts the tracker and informs it of the availability of the video stream. Users who learn the IP address of the video server from the tracker may contact the video server to watch the stream.
YouCast Peers: A user interested in watching YouCast streams runs a YouCast peer on her/his local machine. The peer acts as a client and a server. The client part performs the following functionalities:

It allows a user to search the list of streams available at the tracker
It allows the user to pick a stream and start watching it
It allows the user to switch from one stream to another

The server part allows a YouCast peer to download a video stream from another peer that is watching it instead of downloading it from the video server. This reduces the load on the video server and improves scalability.

Diagram of system
Figure 1: YouCast includes three types of machines: a tracker, video servers, and peers. Each peer runs both a client and a server.

III. Your Task

In this project, you will design a few protocols that control the interaction among peers and between peers and a video server. Your design should achieve the following four tasks: high video quality, scalability, resilience to failures, and minimal security. Though we discuss these tasks in separate sections, they do depend on each other. You need to come up with a single design that addresses all of them.

Your design should be concise enough that other people can implement video servers and peers based on your description, and your protocols should not dictate particular implementations, in the same way that HTTP and HTML do not dictate the implementation of a Web browser.

III.1 High Quality Video Streams

A video stream is a sequence of frames. Each frame has a sequence number to identify its position in the stream (e.g., frame 5 is played after frame 4 and before frame 6). You do not need to worry about how to produce video frames or how to show them on the screen. You, however, need to design a protocol for timely dissemination of these frames from a video server to a peer machine interested in watching the stream.

Goal: The objective of your dissemination protocol is to ensure high-quality timely streaming of video. To see how you can achieve this goal you need to learn a few things about video. For proper video viewing, a peer needs to play the frames in order and at a constant rate. For this project, you can assume that a peer needs to play one frame every 50ms. This means that once a peer starts playing the frames, it needs to provide the next frame every 50ms. Say that the peer plays frame 6 at time t. Then at t+50ms, the peer needs to play frame 7. If the peer does not receive frame 7 by t+50ms, then it does not have the right frame to play. The peer can compensate for the missing frame but at the expense of a degraded video quality. Specifically, if the peer has received a frame with a higher sequence number (say frame 8), it skips the missing frame and plays that frame. If the peer has not received any frame with a sequence number larger than the missing frame, it plays the most recent frame (i.e., frame 6). In both cases, the video quality degrades. If the peer skips the missing frame, the user sees a jump in the video stream. If the peer replays the last frame, the user sees that the video freezes. Note that there is no point receiving frame 7 after a peer has played a frame with a larger sequence number (say frame 8) because frames have to be played in increasing order. Thus, to ensure high video quality, your design should maximize the likelihood that a peer has a frame by the time it needs to play it.

A naive design that retransmits every frame until it is received, regardless of whether it can still be played, is inefficient. In fact, such an approach may degrade the video quality because it spends the system's bandwidth on retransmitting the wrong frames.

Note also that a peer cannot download the whole video before starting to play it. For example, it is not acceptable to say that a peer who wants to watch the lines at LaVerde's reliably downloads frames for long interval, and starts playing the video once it has received all frames in a sequence of 10 minutes. By then the lines may have changed significantly. Your design should provide the user with the feeling that the video is streamed in realtime. This requires that the delay between generating a frame and playing it stays low, e.g., less than a minute.
Task and assumptions: Your job is to design a dissemination protocol that ensures good video quality. Your design may make use of the following:
- Producing Frames: You do not need to worry about frame generation. You can assume that someone has written a module that takes the output of a video camera and generates a video frame every 50ms. You can provide a call back function to this module Frame_Handler(frame_ptr pf). The module will call this function every 50ms and passes to it a pointer to most recent frame. Your job is to design the Frame_Handler and to disseminate the frames according to your designed protocol.
- Playing Frames: You do not need to worry about how to show frames on the screen. You can assume that someone else has written the function Play_Frame(frame_ptr fp), which takes a frame and shows it on the screen. Your protocol should arrange to call this function once every 50ms and give it a pointer to the frame you want to send to the screen.
- Network Delay: You can assume that the maximum delay in the network is bounded by 1 second. The network however may delay, drop, or reorder packets.
- Sending and Receiving Frames: To send a frame to a specific peer, you can call a pre-written function Send_Frame(frame_ptr fp, IP_address Destination), which sends the frame at fp to the peer whose IP address is Destination. Your video server should call Send_Frame once every 50ms, and pass to it the most recent frame. The call needs to be repeated for every destination (i.e., every peer that is streaming directly from this server). Whenever a peer receives a frame it calls Receive_Frame(frame_ptr fp, IP_address Source) and passes to it pointers to the new frame and its source IP address. This is a callback function, i.e., by providing code for this function you obtain every frame that the peer receives and can decide what to do with it.
- Frame Retransmission: Send_Frame runs on top of UDP. Thus, your protocol must detect lost frames and perform retransmissions when needed.
- Frame Header: A header is attached to each frame to indicate the frame sequence number. Given a frame you can obtain its sequence number by calling the function Get_Frame_Seqn(frame_ptr fp), which takes a pointer to a frame and returns its sequence number.
- Packets: You can assume that each packet contains a single video frame and its header.
- Extending API: You are free to extend the above API and the frame header, as you see fit. You need, however, to clearly describe your extensions.
- Timeliness: To maintain the realtime effect, the time between receiving a frame and playing it should not exceed 60 seconds.

III.2 Scalable Streaming

Your design should scale to a large number of peers, while maintaining high video quality. A YouCast video stream has an average rate of 0.5 Mb/s. This is a significant amount of bandwidth. Many video servers may be behind DSL or T1 lines, which are limited to 1-2 Mb/s. In this case, only a few peers can directly download the stream from the video server without causing severe congestion. This problem is not limited to servers behind DSL or T1 lines. Consider a scenario in which tens of thousands of international students start streaming a 6.033 lecture in realtime. This number of peers would overwhelm any MIT machine, despite its high-speed connectivity.

Goal: Your design needs to support scenarios in which thousands of peers watch the same stream. To address this issue, you may want to make some peers download the stream from each other, as shown in Figure 1. This means that each peer runs a peer server, which takes the received frames as input and transmits them to interested receivers. We say that the downloading peer is a child of the serving peer.

A naive approach would make the first peer stream the video from the video server. Other peers stream the video from the peer that joined immediately before them. As a result, the video will be delivered along a string of peers. Assume 1000 peers are interested in watching a particular stream. First, the stream will incur a very large delay before it reaches the 800th or 900th peer. Second, if Peer 1 misses a particular frame all other peers fail to receive that frame. Third, if Peer 1 crashes, the other 999 peers will experience a disturbance in video quality. You should propose a dissemination infrastructure that works much better than string dissemination infrastructure.
Your Task: You should augment your dissemination protocol with a join protocol. A peer runs the join protocol before the dissemination protocol to discover from which machine to stream video, i.e., to discover a parent machine. Then the child peer runs the dissemination protocol to stream video from its parent machine.

Your design of the join protocol should address the following issues:
- How does the system maintain information about which peers are currently watching a particular stream? How does it keep information about the various parent-child relationships? Which machine (or machines) maintains that information? What data structure do you use to store that information and how do you update it?
- How does the system use the above information to direct a new peer to stream the video from a particular machine?
- Ensure that your dissemination protocol (from Section III.1) continues to work properly when it runs between two peers, a parent and its child. Whether you use a server-push or client-pull model, you should make sure that the parent appropriately provides frames to its children in the tree. Remember to consider the case where the parent lacks the appropriate frame to send to the child.
- Does the system ensure scalability to a large number of peers? Do a quick estimation of the maximum number of peers per stream, in your system. Use the assumptions below in your evaluation.
Assumptions and constraints: In addition to all previous assumptions, you can assume the following:
- A video stream requires an average rate of 0.5 Mb/s.
- Each machine (a peer or a video server) has a fixed limit on the total streaming rate that it can handle. This limit bounds the sum of incoming and outgoing streaming traffic.
- The limit varies from one machine to another, but each machine knows its own limit.
- For your analysis only, you can assume that on average a machine can handle 5 streams.
- The maximum latency in the network between any pair of machines is 1second.
- Each peer watches only one stream at any point in time, and each video server is attached to one camera.
- For your analysis, assume that a frame that is delayed by more than one minute while being disseminated is not useful.
- Finally, focus on the scalability of the dissemination protocol. You do not need to worry about whether the tracker can handle queries about available streams.

III.3 Resilience to Peer Failures

A machine may crash or reboot at any time. Further, a peer may stop watching a stream at any time. Your design should minimize the impact of such events on performance. In particular, you may want to make peers serve video to each other to reduce the load on the video server. But if a peer crashes or leaves the system, your design should minimize the likelihood that other peers that download the video from the failed peer will miss frames. Note that you cannot assume that a peer will inform its children before rebooting or killing the application.

Goal & Task: Design your protocols to be robust to any peer failure, i.e., as long as multiple peers do not fail at the same time, no peer will see performance degradation that is caused by peer failures. Specifically:

Modify your design of the join protocol and the dissemination protocol (Sections III.1 and III.2) to maximize the likelihood that a peer has a frame by the time it needs to play it, despite the possibility of a parent failure.
Add a repair protocol that allows a peer to recover after a parent failure. A peer runs the repair protocol after it detects a parent failure to recover the dissemination structure and its robustness to future failures.
Try when possible to reduce the overhead of your design. Recall that the average machine in your system can handle a maximum of 5 streams. If your failure design introduces a large overhead, it will affect the scalability of your protocol.
You may need to revisit your scalability estimation from section III.2 to take into account the modification you introduced to deal with failures. Submit only one calculation that estimates the maximum number of peers per stream supported in your final design supports.

Additional Assumptions:

You can assume that all peer failures are fail-stop failures (i.e., no malicious peers).
You do not need to worry about failures of the tracker or the video server. You also need not worry about network partitions. Your design should focus only on peer failures.

III.4 Minimal Security

Your design should support private streams. A camera owner can make his stream accessible only to specific peers. Users who are allowed to watch the stream exchange shared-secret keys with the camera owner using an out of band mechanism (e.g., in email or over the phone). Peers that are not allowed to watch a stream should not be able to watch it, even though they may download the video.

Goal: Extend your design to support private streams. Your design should specify how to use the keys to ensure that only peers who have the right keys can watch the stream. Your design should also ensure that private streams stay scalable.
Additional Assumption:

Assume a camera owner distributes shared-secret keys to his friends in email messages.

Task: You need to tell us how you address the following issues:

Does the camera owner distribute the same key to all potential peers or different keys?
How does your system use the keys to ensure that a peer that does not have the right key cannot watch the stream?
Do you need to change the dissemination, join, or repair protocols? If yes, how? In particular, describe how parent peers handle frames from private streams.

IV. Your Design Proposal

The design proposal should be a concise summary (1200 words) of your overall system design. It's a good idea to start out with a system diagram to show how you plan to make the overall system work. Your proposal should lay out the basic mechanisms in the dissemination protocol and the join protocol.

You do not have to present a detailed rationale or analysis. However, if any of your design decisions is unusual (particularly creative, experimental, risky, or specifically warned against in the assignment), or you want to change any of the assumptions substantially, it would be wise to describe such changes in your proposal.

You will receive feedback from your TA in time to adjust your final report.

V. Your Design Report

Here are some items we are interested in seeing in your final report.

Distribution mechanism diagram: A diagram (or a series of diagrams) that show the overall organization of peers to support peer to peer streaming and the dissemination infrastructure.
Overview: An overview explanation that describes the design from a high level, referring to and introducing your diagrams.
Dissemination protocol: A description of your proposed dissemination protocol. Describe how both how the server and peers participate in the protocol. Detail how the protocol handles network problems as well as peer failure.
Join protocol: A description of how a peer joins the dissemination infrastructure. If you use any complicated metadata structures, be sure to describe how they work. Provide a reasonable example of a peer joining a stream.
Repair protocol: A description of how a peer detects a failure and repairs the dissemination infrastructure to recover robustness and ensure it can recover from future failures
Security extension: A description of how you handle private streams.
Scalability and Fault Tolerance analysis: Describe how your system performs as the number of peers grows. Furthermore, argue that your system handles a single crashed peer with no loss of performance (be sure to look for tricky corner cases).

Note that your report should describe only the final design of your protocols. For example, if you have designed a dissemination protocol to address the goal in Section III.1 but had to modify it later to address the goals in Sections III.2, III.3, and III.4, describe only the final design. We expect one design for each protocol, and that the combination of the three protocols satisfies all the goals in this document. Similarly, you should report the analysis of only the final design.

VI. Your written work

The following are some tips on writing your design report. While the Writing Program will not be grading DP2, you can still use them as a valuable writing resource.

Audience: You may assume that the reader has read the DP2 assignment but has not thought carefully about this particular design problem. Give enough detail that your project can be turned over successfully to an implementation team.

Report Outline (use this organization for your report):

Title Page: Give your design report a title that reflects the subject and scope of your project. Include your names, recitation instructor, and the date on the title page.
No Table of Contents.
Introduction: Summarize the salient aspects of your design, the trade-offs you made, a brief rationale for the design you have chosen, and the results from your analysis.
Design Description and Analysis: Explain and elaborate your solution. Show how your solution satisfies the design constraints and solves the design problem (or how it fails to do so). Be sure to identify trade-offs you made, and justify them. Clearly describe your analysis and the conclusions you have drawn from your analysis. You will want to sub-divide this section into subsections and use figures or code, where needed, to support your design choices.
Conclusion: Provide a short conclusion that provides recommendations for further actions and a list of issues that must be resolved before the design can be implemented.
Acknowledgments and References: Give credit to individuals whom you consulted in developing your design. Reference any sources cited in your report. The style of your references should be in the format described in the Mayfield Handbook's explanation of the IEEE style.
Word count. Please indicate the word count of your document at the end of the project.