Multimedia Storage and Communications:
Online Documentation and Training
 
David C. Carver
dcc@mit.edu
Branko J. Gerovac
bjg@mit.edu
 
Project report for the
Multimedia Server Deployment project
Fall 1997
Research Program on Communications Policy
Center for Technology, Policy and Industrial Development
Massachusetts Institute of Technology

Abstract

Two years ago we set out to purchase a 300GB multimedia capable storage server for use in online documentation and training.

What started out as a straightforward task of performing due diligence in the acquisition - requirements assessment, solicitation, and purchase for a piece of equipment that everyone believed was readily available - turned out to be on the edge of what was possible.

This is the story of what we discovered along the way and what it says about the state of storage server development, of multimedia systems development, and the direction these developments are taking.
 
 

    The Problem, The Opportunity

The US Army is undertaking a major initiative to replace its paper-based documentation and training systems with an online computer mediated system. To put this into perspective, the Army Training Support Center (ATSC) in Ft. Eustis, Virginia maintains approximately 5000 training publications. The Army Correspondence Course Program alone offers 2700 titles to over 200,000 subscribers per year. The most popular course? Combat Lifesaver. Every day, a tractor-trailer full of publications - several tons of paper - leaves the base and is distributed via US Mail to personnel across the country. Interestingly, ATSC has its own dedicated zip code.

The goal is not just to eliminate paper, but to convert conventional publications into interactive courses that are multimedia and graphically rich - a truly massive undertaking. Even at this early stage of development, interactive multimedia courses may contain hours of video, cost hundreds of thousands of dollars, and take a team of people a year to produce. Putting aside, for the moment, the potential documentation and training demands of high tech weapons systems and warfare practices, converting the existing courses translates to terabytes of content, contributed and maintained by dozens of military colleges, accessed by hundreds of thousands of military and civilian personnel who are geographically dispersed around the world.

Immediate deployment of a complete solution is neither possible nor desirable. Aside from the practical issues and logistics involved in deploying a multimedia capable system of this magnitude, the technology and products themselves are in transition. Therefore, thought must be given to tracking technology improvements, accommodating new functionality in the system design, and preparing for long term smooth and effective system evolution. Furthermore, the existing storage and communications infrastructures are both operating at a small fraction of the capacity and performance needed, creating a chicken and egg problem of sorts.

The question is where to start. What would be the high leverage thing to do?

The answer that eventually emerged involved deploying a large multimedia capable network accessible storage server and provisioning it with exemplar programs and services. This would provide a platform for use in developing and evaluating methods of program production, distribution, and use.

We recognized that if we were successful the proposed storage system would vastly overpower the communications infrastructure. Normally this would be a sign of poor engineering, but in this context it would be desirable: not only would it be a useful first step, but it would also, in part, define the next steps: addressing the communications and distribution issues.

    Project Beginnings

The efforts described here are rooted in the Network Multimedia Information Services (NMIS) project. NMIS was a collaborative research effort that began in 1993 at MIT, Dartmouth College, and CMU. Paul Bosco, then an IBM research fellow and MIT doctoral candidate, was the primary force behind NMIS. The objective was to "develop new multimedia programming and services for access and delivery nationwide via the Internet". Programs would be developed in the key applications areas of health care, science and engineering, and K-12 education.

Health care programs were provided by Dr. Joe Henderson and his team at Dartmouth's Interactive Media Laboratory (IML). IML has produced award-winning programs like "HIV and AIDS: an Interactive Curriculum for Health Sciences" (1994 Gold Medal in Professional Education, New Media/Invision Awards) and "Respiratory Emergencies in Children" (1991 Silver Medal in Professional Education, New York International Film and Video Festival). Science and engineering programs were developed by the MIT Center for Advanced Engineering Studies. K-12 programs were provided by Turner Educational Services - daily broadcasts of CNN Newsroom were compressed and made accessible worldwide over the Internet. The project broke new ground in a number of areas, including the first demonstration of Web accessible high quality video streaming.

Early in 1995, we were looking for ways to leverage our success with NMIS. This eventually led to discussions with Col. Steve Funk, then program manager for DARPA's SIMITAR program and long time promoter of advanced technologies for military training and for K-12 education in DoD administered schools. Kirsty Bellman, program manager for DARPA's CAETI program, and Tice DeYoung, DARPA's program manager for NMIS, participated in these discussions, and fostered funding for a follow-on project.

The project was formulated as a funding extension to the NMIS projects at MIT and Dartmouth College. MIT would establish requirements for and procure a high capacity multimedia storage system. Dartmouth would develop the exemplar content and services.

In conjunction with the server acquisition, we would also conduct an analysis of local access requirements. Though content delivery was not an emphasis of this project, in the end, given the geographically dispersed user population, the ultimate challenge will be enabling individual active and reserve military personnel and civilians in communities across the nation and worldwide to interact with multimedia course material.

The grant was awarded in the Fall of 1995.

    Request for Proposals

Even though this was an equipment purchase for a research project, given the variety of objectives that we were trying to meet, we knew we would have to go through an RFP process. Our added purpose, however, wasn't just to procure a single system - which we thought would be straightforward to accomplish - but to ascertain the current state and future direction of commercial storage systems development. We needed a way to sort out fact from hype and an RFP provided a useful vehicle for gathering this kind of authoritative information from the industry.

After several rounds of meetings with Dartmouth and ATSC, we arrived at a set of requirements for the server. The RFP was written and issued through the MIT Subcontracts office in April 1996.

    Requirements Analysis

Server technology is evolving rapidly and seemingly minor variations in system architecture, costs, and timeframes can have a large effect. The intent of the RFP was to place minimum technical constraints and requirements. Respondents were asked to address each requirement explicitly and invited to propose diverse approaches with the best and latest technology. The following is a synopsis of the RFP.

    Architectural Requirements

Theory of Operation - the proposed general system architecture. How storage is added, whether or not tertiary storage is employed, etc.

Distributed Architecture - Though not a constraint at the time, a distributed server architecture will be most appropriate in future phases of deployment. How the proposed system fits into a distributed server strategy.

Partitioned Multiple Uses - Though intended primarily for use in military training and documentation, the server should also support other educational and information applications, including current and developing Internet and World Wide Web services.

Scalability and Expandability - The server and server architecture should be able to scale over time to support increased use.

Interoperability and Open Systems - Network and programming interfaces and protocols should conform to broadly recognized standards, e.g., TCP/UDP, IP, HTML/HTTP, JAVA, POSIX, ANSI C++, MPEG-1/2, CORBA, etc.

    Technical Requirements

Capacity - The system should provide an initial storage capacity of 300GB. The storage devices and interconnections used should permit future expansion by at least a factor of ten.

Content - The server must support network access to high quality interactive multimedia content and services, including multiple access to streaming audio/video and associated data.

Latency - To achieve an interactive feel, the system should respond in under 3 seconds for initial program access and under 200 ms for incremental accesses.

Bandwidth - At a minimum the server should support tens of simultaneous video streams and up to a thousand simultaneous low speed connections, and be able to grow to support a factor of ten more users. In this regard, delivery guarantees are required; best effort service is not sufficient.

Network Interfaces - At a minimum Ethernet and Fast Ethernet should be supported with an upgrade path to ATM.

Availability and Robustness - Service should continue to be available during automatic recovery from single points of failure (hot-swap capability, etc.).
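A back-of-envelope calculation shows how demanding these targets were for 1996-era hardware. The per-stream rates below are our assumptions for illustration, not figures from the RFP: MPEG-1 video at roughly 1.5 Mb/s and modem-class connections at 28.8 kb/s.

```python
# Sanity check of the RFP's bandwidth and capacity targets.
# Assumed figures (not from the RFP): MPEG-1 at ~1.5 Mb/s per stream,
# low speed connections at ~28.8 kb/s.

MPEG1_STREAM_MBPS = 1.5   # assumed MPEG-1 constrained-parameters rate
LOW_SPEED_KBPS = 28.8     # assumed modem-class connection rate

def aggregate_mbps(video_streams, low_speed_conns):
    """Total server output bandwidth in megabits per second."""
    return video_streams * MPEG1_STREAM_MBPS + low_speed_conns * LOW_SPEED_KBPS / 1000.0

# "Tens of simultaneous video streams" plus "up to a thousand" low speed
# connections already approaches the capacity of a Fast Ethernet link:
base = aggregate_mbps(50, 1000)       # ~104 Mb/s
# The required factor-of-ten growth implies gigabit-class networking:
tenfold = aggregate_mbps(500, 10000)  # ~1038 Mb/s

# Hours of MPEG-1 video that fit in the initial 300 GB of storage:
hours = 300e9 * 8 / (MPEG1_STREAM_MBPS * 1e6) / 3600  # ~444 hours
```

Under these assumptions, even the initial configuration saturates a single 100Mb Ethernet interface, which is consistent with our observation below that the server would overpower the communications infrastructure.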

    Results

The results of our efforts were somewhat surprising. We were looking for a high performance large capacity server that supported network video streaming in conjunction with standard Internet and World Wide Web services. Our expectations and those of our research sponsors and partners were that such a system would be readily available. We discovered that this was not the case and that price wasn't the issue. The issue was (and is) that the server industry is evolving rapidly and we were seeking to catch the early consolidation of previously distinct server functions (video on demand, World Wide Web, transaction processing, etc.). A consolidation that is still in progress.

Responses fell into roughly three categories:

(1) Old technology: server design and media format reflecting "last year's" technology, a severe price penalty (almost a factor of 2 in both the server and media), and a failure to meet one or more of our basic requirements.

(2) Incomplete PC solutions: missing capabilities, poor performance, inadequate software or lacking software altogether, and an offer to supply unbounded systems integration on a time and materials basis.

(3) Partitioned approaches: though presented as "distributed servers", the distributedness was simply to overcome limitations in the underlying products, resulting in a significant price penalty. For example, if the underlying system maxed out at 100GB of storage, the proposed "server" would be three servers networked together; or if the underlying system only supported video on demand, a second system would be added to make up for missing (Internet and Web) capabilities.

After analyzing all RFP responses, we selected a proposal from Edgemark Systems Inc. (SGI's Government Systems distributor) for an SGI server and RAID storage subsystem. While other proposals presented interesting ideas and approaches, the SGI server was the only proposal that promised a complete solution. It would meet and likely exceed our initial performance and capacity requirements as well as our expandability requirements. It would have Web accessible video streaming, asset management, and a full set of standard interfaces and network services. Everyone we talked to endorsed this selection and we were optimistic that the server purchase would soon be done.

As we were soon to learn, the particular SGI configuration as proposed didn't yet exist and wouldn't for several months. When the hardware was delivered, the existing product software didn't support the new hardware, nor did the next beta release, and it was unclear why. Eventually, SGI product engineers were brought in to address the problem. They discovered that our disk configuration simply wasn't in the appropriate system software table. Fortunately, SGI had a program that characterized actual disk performance, which they used to set workable defaults for our system.

Though the process took much longer than we had anticipated, in the end we were pleased with the result. Yet it was interesting to discover that something we thought readily available turned out to be on the edge of what is possible.

    Issues and Observations

    Functional Integration?

Even functionally complete proposed systems were partitioned to some extent, requiring video streaming resources to be managed differently than the rest of the system. For example, SGI went the furthest in integrating media authoring, hosting, and streaming with the Internet, the Web, and Java; nevertheless, video streaming was handled by a functionally distinct realtime file system. Though this was not unexpected, in the future we are looking for such partitioning to become as soft and transparent as possible. Eventually video should be fully integrated as just another datatype.

    UNIX Workstation or NT PC?

A major decision point in our analysis involved choosing between UNIX workstation servers and NT PC servers. Going into the RFP process, we knew we would get proposals for both kinds of systems and, if anything, we were inclined towards a PC solution if one could be found to meet the requirements. A workstation proposal offered the only complete solution, but the relative technical merits of UNIX versus NT weren't really a factor - the project staff could deal easily with either operating system.

Further, the project would be better served by an off-the-shelf server that did not require extensive systems integration. Off-the-shelf workstation servers generally had the data paths and capabilities necessary to meet performance and expandability requirements. Off-the-shelf PC servers did not. Custom PC platforms designed to provide the necessary performance were functionally incomplete and would need significant systems integration to meet project needs. Incidentally, the PC versus NC debate wasn't an issue at the time, but clearly is an issue for future consideration.

    ATM or Fast Ethernet?

While the communications environment was not the focus of this project, a decision had to be made about what network interface to buy. We didn't want the network interface to be an immediate bottleneck on the performance of the server. Thus, conventional Ethernet was clearly inadequate, as was Token Ring, the technology used in the existing site networks. This decision was remarkably easy. The SGI server came with 100Mb Ethernet built-in and it was relatively inexpensive to upgrade the site's existing router accordingly.

ATM was considered, but would have been more expensive on both ends and would have required addressing the IP over ATM problem. Though ATM is being promoted for media intensive applications, the use of ATM in this situation would have required further analysis taking into account long term objectives for the whole site. Instead, we opted to keep that camel's nose out of the tent. Recent advances in Ethernet technology - readily available and inexpensive 100Mb Ethernet adapters, new Ethernet switches with built-in streaming support, and gigabit Ethernet under development - continue to complicate ATM's entry into the LAN market.

    Streaming Media

Streaming media over IP networks can be done only under special circumstances, but the unrestricted case of streaming media over IP requires protocol enhancements to provide service classes and realtime delivery guarantees. RSVP offers the possibility of a long term solution, but it will take time before RSVP can be fully implemented and deployed and during that time something else may emerge. In the near term, some organizations are moving to streaming media at the link level over switched LANs modified to support bandwidth guarantees.

All media streaming systems (applications, services, and protocols) are proprietary and format (i.e., MPEG) specific. While all promised support for additional formats sometime in the future, none promised support for realtime datatypes in arbitrary formats. In reference to our earlier work, some sort of universal header/descriptor system would alleviate this problem and promote much greater interoperability across media streaming systems.

    Timeline and Transitions

Five years ago (c 1992), as the NMIS project was formulated, servers were proprietary assemblies of high end components, and storage was priced ~$2.50 per megabyte. A system of the kind we sought for this project would have cost over a million dollars and would have lacked many key features. The Web was just a research experiment and video streaming was an experimental capability being touted for video-on-demand applications.

Two years ago (c 1995), just as we were formulating this project, servers were making a significant transition to mid range systems built from standard workstation components. 4 GB drives were becoming available, priced ~30¢ per megabyte. The Web was exploding on the scene and Web accessible media streaming was demonstrated in R&D environments.

One year ago (c 1996), during the RFP process, 9 GB drives were becoming available at a price ~15¢ per megabyte and PC based servers were a possibility. Though encouraging, the proposed PC based systems failed to meet basic requirements. Such systems were undergoing drastic change, media hosting and streaming standards were nonexistent, and Microsoft and others had not yet weighed in with their approaches. To further complicate matters, Microsoft and Netscape were engaging in their browser/server battles.

Today, companies in the PC industry are aggressively developing their media streaming products and technology and negotiating to establish their standards (e.g., Microsoft with NetShow and Progressive Networks). Workstation vendors are also coming out with a new generation of servers that look very attractive. If we ran the RFP process again today, the outcome would likely be similar; however, the outcome next year or the year after is unclear. Incidentally, 20GB drives are now available for ~7¢ per megabyte and predicted to drop to ~3¢ in 1998.

So, in a period of roughly 5 years, storage servers went from very expensive proprietary systems, to mid range workstation systems, to the verge of becoming commodity PC systems. Storage prices during the same period dropped over 50% per year.

    Future Directions

In this project, we removed a major barrier in developing interactive programs that are multimedia and graphically rich. However, in our requirements analysis, we recognized that a distributed server architecture would be most appropriate to the Army's needs given the amount of content to be stored, the multiple sources of training material, and the geographically dispersed nature of training material development and use.

    Acquiring and managing content

Army training and documentation materials are developed by dozens of military schools across the country. A system of contribution needs to be established for incorporating those materials into active online training services. This is not simply a technology deployment issue. Decisions about what systems and services to deploy are interdependent with organizational structures. Is the digital library metaphor an optimal approach? The models of contribution used for terrestrial, satellite, and cable television could also be considered.

    Distributed server deployment

The Army is a national (and international) organization. Broad deployment of terabytes of online interactive training and documentation will require a nationwide (and worldwide) system of media distribution. Building an effective infrastructure for training and documentation requires investigating several questions, including: how servers are provisioned and where they are located, how masters and copies of documents are managed, how the communications network is provisioned, how interaction is provided, and how training is conducted and mapped into personnel records.

    High bandwidth access

The hundreds of thousands of military and civilian personnel who need to access the online materials are widely dispersed across the country and overseas. Existing and emerging access technologies will not offer sufficient bandwidth for interactive program access. Hybrid approaches using packaged media are being proposed for the near term; however, packaged media perpetuates some of the limitations of paper media in having to be physically produced, distributed, updated, and managed.

Providing truly high bandwidth online access offers greater functionality, immediacy and flexibility in usage, and linkage into online conferencing and consultation. Though the Army can't directly solve the access issue, by defining and promoting their aggressive requirements, they could drive commercial development in a direction that supports their application needs.

    Acknowledgments

This work is sponsored in part by the Defense Advanced Research Projects Agency and the National Science Foundation under contract NCR 9423889.

    About the Authors

David C. Carver and Branko J. Gerovac are with the Research Laboratory of Electronics at the Massachusetts Institute of Technology. They recently directed the projects on Multimedia Server Deployment and Local Access Communications as Associate Directors of MIT's Research Program on Communications Policy.