
MIT Data Warehouse



Data Warehouse at MIT: Strategy Document

Vision

The basic vision of the Data Warehouse is to make information at MIT more accessible and easier to use. If people can get data easily, they will spend less time gathering information and more time analyzing it. With information from several different sources in one place, people can easily create reports that once would have taken days to construct. This frees time for information to be used in new and creative ways.

In the past, even if individuals invested the time and energy to put together information for themselves, it was not easily shared. With the technologies available today, once a user figures out a good way to look at the information (by generating a report definition in BrioQuery, for example), they can easily share it with others. Also, by making information available in this way, users should be able to combine Data Warehouse information with local information. This is a powerful concept, because it frees users to keep track of the information unique to them without recreating information already available in the Warehouse. In this way the Warehouse and local systems complement each other.

Making data more accessible also serves several other goals. First, it will improve data quality over time: as people use the data, errors can be corrected as they are found. Additionally, using the information for more purposes will help us improve the design of our systems in the future, so that we will have the information we need on hand.


Definition and Functions

The Data Warehouse is a database that draws information from many areas and is structured to make that information easy to use. The Data Warehouse has three defining characteristics:

  • Many Sources: It integrates data from various administrative systems and stores it in one location.
  • Read Only: It is a read-only database. Information represented in the Warehouse is maintained in other systems, called "systems of record".
  • Restructured Information: It presents data in a simple form so that reports are easy to construct and the Data Warehouse is easy to use.

The main purpose of the Data Warehouse is to serve as a reporting and data distribution environment for Departments, Labs and Centers at MIT. In addition, the Data Warehouse will support some of the reporting needs of Central departments. The Data Warehouse also acts as a hub to facilitate the exchange of information between systems. Taken together, these functions make the Data Warehouse the enterprise information infrastructure.

The Data Warehouse is not the place to answer questions like "Did this invoice get paid?" This type of question should still be directed to the transactional systems, such as SAP. A warehouse system is not meant to take the place of the transactional system, but to complement it. It will take some training for users to understand when they should use the Data Warehouse and when it would be better to query the transactional system.

The goal is to put all administrative information in the Data Warehouse. The two basic requirements we have for determining which information goes into the Data Warehouse are:

  • That the information is of use to more than one group of people.
  • That the information has a source system of record.
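The two requirements above can be read as a simple check. The following is a minimal sketch, not an actual MIT tool: the `CandidateDataset` class, its field names, and the `belongs_in_warehouse` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateDataset:
    """A data set being considered for the Warehouse (hypothetical structure)."""
    name: str
    interested_groups: int            # how many groups of people would use it
    system_of_record: Optional[str]   # source system that maintains it, if any


def belongs_in_warehouse(candidate: CandidateDataset) -> bool:
    """Apply both requirements: more than one group uses the information,
    and the information has a source system of record."""
    used_by_several_groups = candidate.interested_groups > 1
    has_system_of_record = candidate.system_of_record is not None
    return used_by_several_groups and has_system_of_record
```

Under these assumptions, a data set used by several groups but maintained only in a personal spreadsheet would fail the second requirement and stay in the local system.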

Since the community has diverse needs, the Warehouse is being designed to support three separate access mechanisms:

  • End user query and reporting tools: MIT has obtained a site license for BrioQuery, one of a large number of commercial products in this area. Ad hoc reports are easy to create. Standard reports can be defined and shared easily through the Web or an email attachment.

  • Creation of data extracts: Data extracts can be created and transferred from the Data Warehouse to local systems via database links, snapshots, FTP, etc.

  • Custom programs: Programs can be written to access the Warehouse using SQL. These can range from simple Perl scripts which just extract data, to full applications written in PowerBuilder, C, Java, etc.
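As an illustration of the second and third mechanisms, here is a minimal sketch of a custom program that issues a read-only SQL query and writes the result as a data extract. It is written in Python rather than the languages named above, and it uses an in-memory SQLite database as a stand-in for the Warehouse. The table `cost_summary`, its columns, and the sample rows are invented for the example and do not describe the real Warehouse schema.

```python
import csv
import io
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory stand-in for the Warehouse (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cost_summary (department TEXT, fiscal_year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO cost_summary VALUES (?, ?, ?)",
        [
            ("Physics", 2001, 1200.0),
            ("Physics", 2001, 300.0),
            ("Biology", 2001, 450.0),
        ],
    )
    conn.commit()
    return conn


def department_totals(conn: sqlite3.Connection, fiscal_year: int) -> list:
    """Run a SELECT only; the program never writes to the Warehouse tables."""
    cursor = conn.execute(
        "SELECT department, SUM(amount) FROM cost_summary "
        "WHERE fiscal_year = ? GROUP BY department ORDER BY department",
        (fiscal_year,),
    )
    return cursor.fetchall()


def to_csv(rows: list) -> str:
    """Format query results as CSV text, suitable for loading into a local system."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["department", "total"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(to_csv(department_totals(warehouse, 2001)), end="")
```

A real program would connect to the Warehouse database with the appropriate driver and credentials instead of calling `build_demo_warehouse`; the query and extract steps would otherwise follow the same pattern.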



User Engagement

We have a diverse set of users at MIT. Some people need to be able to just push a button to get the standard report they need. Others need to analyze information in a much more dynamic way, asking a series of questions (ad hoc queries) to investigate something.

We don't expect all users to learn how to build reports themselves. What we hope to do is get users comfortable with running pre-built queries. Then, when the need arises, they can learn to modify the reports to suit themselves. As users work with and come to understand the information, they will start creating reports themselves and, we hope, sharing this work with others.



Design Points

The design of the Data Warehouse has many aspects. As in most designs, the considerations need to be balanced. For example, simplicity must be weighed against flexibility and functionality: every time we allow another way to do things, we make the system a bit more complicated.

See the Data Warehouse Strategy Document Part 2 for more details on our design points.
