DSpace

Frequently Asked Questions

The Answers

Benefits

What do I get out of this?

If your alternative to DSpace is to place your work onto your own web site, then DSpace will provide a number of advantages, including: stable long-term storage, support for formats beyond HTML, access control, rights management, versioning, community feedback, and flexible publishing capabilities. You won't have to worry about changing URLs or making backups. DSpace will provide long-term care and availability, and a persistent URL, even if you leave MIT. If your alternative is to place your work in another archive, the advantages of DSpace become more subtle. The commitment of the MIT Libraries to the accessibility of your work and our further commitment to participating in open solutions to cross-archival cooperation may encourage you to submit your work to DSpace, but you will have to judge for yourself.

What do I give up by taking part in this?

Not much, though you do give up some control over the distribution of your work. While we plan to allow authors quite a bit of latitude in specifying the access restrictions they wish to place on their material, we expect generally to prohibit the outright removal of material from the archive and to require that access be allowed to at least the MIT community.

Why would anyone submit their work to this repository?

We hope that people submit their work to DSpace because they will want to provide their colleagues around the globe with reliable access to their papers, images, and data.

What will this add to my workload?

While we don't know at this early stage what faculty and researchers will have to do to deposit work into DSpace, we are aware that we need to keep the submission process as streamlined as possible. For example, while documents in some formats can be submitted directly to DSpace, others may need to be converted to a supported format (such as PDF) prior to deposit.

Contacts

My school/institution wants what you are creating, how do I get it?

Please do let us know of your interest by sending a message to MacKenzie Smith at kenzie@mit.edu. At this early stage we really have nothing ready to share, but we are very interested in distributing our work as broadly as possible as it matures.

Content

Who decides what does or doesn't go into the archive?

The DSpace goal is to embrace all of the intellectual output of MIT, so the limitations will come from the authors rather than from the archive. We plan to build a system that allows communities of interest around MIT to manage their own contributions to DSpace.

Is this only for content created digitally or can older material be "scanned" in?

The initial intent is to capture new material, but that restriction may reflect nothing more than limited resources. We may be able to expand to include older material if the submitters provide the material in digital form. We are also interested in "priming the pump" with some older material to demonstrate the benefits of digital distribution.

Can I get my lab's older material into this archive?

If that material is in electronic form already, then there is a very good chance that we can work out some way of transferring it into DSpace. If the material is currently on paper, then some retrospective conversion project would have to be funded to scan the material. This sort of scanning project is likely to incur costs that can't be covered by DSpace, but the resulting documents would likely be welcome in the collection.

Can I submit courseware to the archive? video? images? sound files?

Although it is not clear when each of these formats will be supported in a big way, nor exactly how course-based communities will be established, our plan is that the answer to this question be, "yes".

Can I pull my material off the archive after it has been submitted?

It is our intention that DSpace be looked upon as the repository of record for submitted materials. It makes sense that some preprints would become inaccessible to the general public after publication, but such items should be retained for posterity if at all possible. As with present libraries, sometimes items are removed. We don't want to make it impossible to remove items, but we hope to encourage people to leave their items in the repository, even if access to them must be heavily restricted under certain conditions. It may be that we make a document "invisible" to users without actually removing it from the site, but only under special circumstances.

Can my collaborators at other institutions submit their work to this archive also?

Initially our focus will be within MIT. But it is our hope and expectation that follow-on work would enable other institutions to create communities within DSpace and to cooperate, perhaps even by hosting cross-replicated collections.

Can I update my work after submitting it to the archive?

Yes, but we don't know how this "versioning" will work yet.

Will student work be housed in this repository?

Though our primary interest is in gaining faculty participation, student contributions will also form the basis of some collections within DSpace.

How can I post conference proceedings of an MIT-hosted conference?

How do I upload my department's electronic collection of reports, theses, or papers?

Can I start an electronic journal in DSpace?

These are excellent ideas for collections. Let's talk offline about how we might proceed, and what the proper timeframe for that sort of support should be. If you have other ideas for collections or want to participate as a collaborator in creating one, contact Margret Branschofsky, margretb@mit.edu.

Finances

Will this become a product that HP sells?

We don't expect so. In fact, HP is quite supportive of the goal that whatever we come up with be made freely available to other academic research institutions. Still, this is a research project and as such is likely to take a few surprising turns. If a commercial product suggests itself as a result of this work, we would expect HP to grab the opportunity.

How much will this cost and who is paying the bills?

The initial research project will cost in the neighborhood of two million dollars, with HP Labs paying the bills. We plan to spend some of our effort investigating business model choices for sustaining DSpace over the long haul.

Is this going to cost me anything?

Over the next two years the costs of the DSpace research are being covered by the project. We are, as part of this project, exploring the "business models" that can make DSpace a sustainable part of MIT's infrastructure. We don't expect any working model would include charging individual authors for their contributions.

Do you have funds to digitize material?

The DSpace project is not currently funded to digitize material.

Originality

What is new about this effort?

At least two things are new about this effort: the institutional commitment and the industrial partnership. While other archive efforts are underway, indeed a few are well established, we are not aware of any that make such a broad commitment to capturing an institution's intellectual output. Archives, even when based within an institution, have been devoted to a particular field of scholarship. DSpace will be devoted to preserving access to MIT's contribution to the scholarly community, in all the fields MIT touches. Of course, we recognize that DSpace cannot stand alone, and we plan to implement protocols, such as those defined at OpenArchives.org, for linking our archive to others around the world. The MIT Libraries' partnership with Hewlett-Packard Laboratories also sets this effort apart from others. This project will not only benefit from the participation of talented MIT and HP staff, it will also teach both organizations much about the readiness of faculty and researchers for this kind of tool.

How does this differ from what is already being done at [arXiv, NCSTRL/CoRR, NDLTD]?

We intend to stand on the shoulders of the giants who came before us. We aim to provide an implementation that others can replicate and that will handle larger collections, more varied content types, and more diverse sources of input than our predecessors' systems. We also believe we have something new to offer in the way of an ambitious rights management architecture.

Overlap

How is this related to the Institute Archives?

DSpace is not an archive in the traditional sense of the word. Though we often refer to DSpace as an "archive", it does not house what the MIT Institute Archives collects — official "Archives" or records relevant to the history and operation of MIT or the collections of personal papers of MIT faculty documenting a life's work at the Institute. DSpace is more of a place to record the intellectual work of the Institute intended for ultimate publication or broad dissemination and comment. At times the two missions will overlap, and we expect the Institute Archives will deposit some material in DSpace.

How does this relate to e-theses?

The document submission and management infrastructure of DSpace may eventually be used to implement e-theses.

How does this relate to my department's web site?

You'll be able to refer to specific documents by stable, lifetime URLs. We should also have some provision for stable expressions of dynamic URLs that search for or present a collection representing a department's holdings (or an author's holdings). We also hope to leverage XML in such a way that skilled webmasters will be able to weave DSpace into their own web sites, making it look like an integral part of the department's web site.

How does this relate to Barton?

Barton is the catalog of resources acquired by the MIT Libraries from publishers and authors worldwide. Barton also includes references to digital material licensed by MIT Libraries and to some electronic material on the net that our librarians feel is of particular importance to our community. DSpace contains actual documents created by MIT faculty and researchers, as well as the "metadata" which describes those documents. Barton may well acquire links to documents which reside in DSpace, but the mission of Barton as a "catalog" is quite distinct from that of DSpace as a "repository".

Will we "catalog" all the submissions and put them into Barton?

While some documents in DSpace will certainly be cataloged in Barton, we don't expect that such inclusion in Barton will ever be universal. However, the MIT Libraries is working on a separate track to build a more unified environment for searching which would help researchers discover material from a number of sources, from Barton to DSpace to many of our licensed databases.

Publication and Copyright

Are you trying to cut scholarly publishers out of the loop?

No. The business of scholarly publishing is vital to the dissemination of knowledge around the world. We recognize the value publishers bring to polishing the scholarly product and distributing it in trusted forms. Our aim is to supplement the work that scholarly publishers do and to preserve MIT's own intellectual heritage.

Can I publish my paper commercially after I submit it to this archive?

It is not our intention to limit commercial publication nor to compete with publishers.

Doesn't this violate copyright laws?

Material will be placed in the archive only with the consent of the copyright holder. Dissemination from the archive will have to conform to copyright laws. How all of this will be achieved is one of the project's challenges.

Who owns the content in this repository?

We intend to let authors retain as much ownership as possible of DSpace content. Authors may already have negotiated other rights for their contributions to DSpace, so this may present an interesting problem for DSpace. We look forward to exploring some of these issues.

Who owns the metadata you house in this repository?

The general problem of rights management, including metadata ownership, is very complex and will be the subject of intense study during the project. If metadata is created by DSpace, then presumably it is "owned" by the MIT Libraries. Otherwise it is our intent to let authors retain as much ownership as is practical of the DSpace content.

Searchability, Look and Feel

What will this thing look and feel like in August 2001?

We don't know yet. Part of the project is to consult with the user community and try to incorporate their suggestions and requirements into the design.

Will I have to describe/index my own material?

Maybe. Metadata is a tricky thing. Different communities have different vocabularies and standards for describing and indexing their materials. The best descriptions and indices often take a lot of effort, even for the authors. Our plan is to make submission as effortless as possible. We hope that by creating different communities within DSpace, rich and accurate descriptions and indices will be made without requiring authors to do all the work themselves.

Will there be any automated indexing?

We expect so. But we do not yet know what form it will take.

How truly searchable will this archive be?

We expect DSpace to incorporate its own searching facility and technology. But DSpace will also be designed to share metadata with other services, which may provide their own paths to DSpace content. The experience of our predecessors has shown that letting web crawling robots indiscriminately walk entire collections does not work. Perhaps, through more controlled collaborations, novel third-party indexing schemes might be allowed into some large subset of the collection.

Security and Durability

What happens to my documents if your business plan fails and you have to shut down the service shortly after it goes live?

This is an extremely scary scenario for everyone involved. We are beginning very early to work on putting this infrastructure on a stable footing so that the trust of our collaborators will be validated. Perhaps the right way to frame the worst case is this: it would be no worse than what would happen today if the computer you now use to store your archival data and preprints were shut down. You currently run a big risk of data loss, with no one else to pitch in and help if there is a crisis. In your current situation, you would move your data to prevent loss. In DSpace, we seek to reduce the risk by building a sensible business model that reflects the trust invested by many stakeholders like yourself.

How do you intend to preserve data and documents in the face of changing technologies and standards?

This is currently a research problem. We don't presume to know how to solve it. What we intend to do is set policies to encourage people to use standards and formats that have the best longevity, and widest acceptance. As DSpace gains in stature, we hope to be able to influence the direction of formats and standards towards greater longevity. Some communities need functionality more than longevity for their materials. We intend to be very up front with those communities: Some formats will not be able to be migrated. Such items will be kept for historical record, and may require emulation to fully utilize.

Can I control who gets free access to my work and who has to pay me for access?

It is not too early to make this kind of request of the project. We certainly want to create a system with this kind of flexibility, but it is too early to guarantee this facility.

Standards

Do you intend to make this a Digital Library Federation project?

We plan to have conversations with many different Digital Library organizations. There are many opportunities for cooperation.

Technology

Will the code which makes this archive work be available to the rest of the world?

We expect so, but we can make no absolute guarantees at this time since we may have to incorporate components developed by other parties and subject to their own restrictions.

Is this an "Open Source" project?

We intend to produce interesting code and make it freely available. But in the interests of getting something useful running quickly, we may end up incorporating modules from others that we license without the ability to freely redistribute. Since one of our goals is to make it easy for others to replicate our work, we will always try to maximize what can be redistributed. Code we write ourselves will be open source. Our protocols will be well documented. For modules we cannot redistribute, we intend to make sensible public interfaces so that outside collaborators, in the spirit of Open Source software development, can help us eventually produce a complete and open system. One hope is that this project could form the kernel of an "Open Source" effort down the road. But one of the defining characteristics of Open Source is the participation of people from all over the place in a given project. In order to attract that sort of interest and participation, we must first build something that works, that demonstrates the concept. That is the stage we are in for now. Someday this project may be Open Source, but today it is not.

Is this archive a web product?

The web is an interface to DSpace, but the architecture will be independent of web technology so different clients can easily be added as other browsing technology emerges.

Will you make any attempts to authenticate documents submitted to the archive?

We believe it is very important that DSpace be viewed as a credible source, and so proper provenance of documents is important to us. Different communities will be able to establish different standards regarding authentication. We expect to implement submission and retrieval capabilities that make very clear what level of authentication has been done for each item in DSpace.

How can I be sure I'm getting out of the archive the same material someone else put into the archive?

We expect the rights management architecture we implement will satisfy this requirement.
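As an illustration only: one common way to check that retrieved material matches what was deposited is to record a cryptographic digest at submission time and compare it again at retrieval. The sketch below assumes SHA-256 and uses hypothetical function names and sample data. It is not the DSpace rights management architecture, which has not yet been designed.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(retrieved: bytes, recorded_digest: str) -> bool:
    """Report whether the retrieved bytes match the digest recorded at deposit."""
    return fingerprint(retrieved) == recorded_digest


# Hypothetical deposit: the digest is computed once and stored with the item.
deposited = b"Preprint: results of the survey"
recorded = fingerprint(deposited)

# At retrieval time, the same bytes match and altered bytes do not.
print(is_unchanged(deposited, recorded))
print(is_unchanged(deposited + b" (edited)", recorded))
```

A digest of this kind only shows that the bytes are unchanged. It says nothing about who submitted the item, which is the separate authentication question discussed above.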

Is the content submitted going to be "XML'd"?

Perhaps, but not any time soon. If XML realizes its potential as a means of preserving semantics and allowing rich display, and if tools come along that enable it to become a widely accepted format, it is possible that submitted material might be converted to XML. The first step would be agreement on policies for acceptance of XML documents, followed by an evolving understanding of what it would mean to convert submissions to XML. The metadata, the description of the items in DSpace, on the other hand, is likely to be expressed in XML. But this will be an internal representation not evident to most users.
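To give a rough idea of what metadata expressed in XML could look like, here is a minimal sketch using Python's standard library. The element names (record, title, creator, date) and the sample values are assumptions made for illustration. No DSpace metadata schema has been chosen.

```python
import xml.etree.ElementTree as ET


def describe(title: str, creator: str, date: str) -> str:
    """Build a small XML metadata record (hypothetical element names)."""
    record = ET.Element("record")
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(record, name).text = value
    return ET.tostring(record, encoding="unicode")


# Hypothetical item description, serialized to XML and read back.
xml_text = describe("A Sample Technical Report", "A. Author", "2000")
print(xml_text)

parsed = ET.fromstring(xml_text)
print(parsed.find("title").text)
```

Because the record is only a description of the item, the deposited document itself can stay in whatever format it was submitted in.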


Copyright © 2000 MIT Libraries & Hewlett-Packard Company. DSpace is a trademark of the Massachusetts Institute of Technology.