Volume 23

No. 3  January/February 2008

With Exhibit, You Too Can Publish Interactive Web Pages

Robyn Fizz

Remember how simple web publishing used to be? You grabbed the code of a web page you liked and replaced the text and images with your own. True, the pages were basic, but if you knew hypertext markup language (HTML) and did some tweaking, you could make information available to Internet users around the world.

Then came Web 2.0 and a huge leap in sophistication. Many web sites today rely on back-end databases and a slew of programs with acronyms – SQL, ASP, PHP, CGI – to publish data that can be searched and sorted in multiple ways. Just recall the last e-commerce site you visited and its flexible features for comparing products and honing your search.

What if publishing interactive, data-rich pages were made simple, so that all you needed was HTML savvy, a few tutorials, and sample pages from which you could copy and paste – just like in the pioneer days of web publishing? Even better, what if you could mash data from other pages with your own or make your data available for others to reuse?

These are the premises behind Exhibit, a lightweight web framework for publishing structured data – from tables and timelines to maps studded with information icons. To quote David Karger, the Computer Science and Artificial Intelligence Laboratory (CSAIL) professor who has overseen Exhibit’s development, the software “reduces the friction of working with data.”

Exhibit is the brainchild of researcher David Huynh, who was a doctoral student of Karger and Rob Miller when he began working on the prototype for this application programming interface (API). Exhibit sprang from research in CSAIL’s Haystack and User Interface Design groups, and now falls under the umbrella of Simile, a joint project of the MIT Libraries, World Wide Web Consortium, and CSAIL. Simile focuses on developing robust, open-source tools that empower users to access, manage, visualize, and reuse digital assets.

Roll Up Those Sleeves
To publish with Exhibit, you provide a simple data file and an HTML file in which you specify how the data should be shown. Exhibit relies on the JSON (JavaScript Object Notation) format for its data files, but if you have a file in another format – an Excel spreadsheet, for example – you can convert it using a web service called Babel. The data model that Exhibit builds on top of JSON is a simplified form of the Resource Description Framework (RDF), a metadata model from the World Wide Web Consortium.
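
As a rough illustration of what such a data file might contain, the Python sketch below wraps a few records in a top-level "items" list, with each record carrying a "label" and a "type". That layout reflects our understanding of Exhibit's JSON conventions. The records, the remaining field names, and the helper to_exhibit_json are hypothetical, so check the Exhibit documentation before relying on the details.

```python
import json

# Hypothetical records, e.g. rows exported from a spreadsheet.
# Only "label" and "type" follow Exhibit's conventions as we understand them;
# the remaining field names are made up for illustration.
rows = [
    {"label": "Subject A", "type": "Subject", "semester": "Fall", "units": 12},
    {"label": "Subject B", "type": "Subject", "semester": "Spring", "units": 9},
]


def to_exhibit_json(records):
    """Return a JSON string holding the records in a top-level "items" list."""
    for record in records:
        if "label" not in record:
            raise ValueError("each item needs a 'label' to identify it")
    return json.dumps({"items": records}, indent=2)


if __name__ == "__main__":
    # Print the text that would be saved as the exhibit's data file.
    print(to_exhibit_json(rows))
```

The HTML file would then point to the saved data file and describe how its items should be displayed.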

While Exhibit gives ordinary mortals the tools to publish interactive web pages (“exhibits”) – and also saves programmers a lot of time – it promises to deliver even more. By default, data in Exhibit-authored pages is publicly available and can be copied and customized by other Exhibit users. This “open data” concept is similar to that of open-source software, in which source code is made available for others to copy and modify. Exhibit itself is open-source software.

While it may sound scary to release your data into the wild, the benefits are clear. The availability of open, structured data makes it easy to create mashups – web sites with multiple sources of data and flexible options for searching and sorting. Recombining data in innovative ways benefits everyone.

Exhibit needs data that’s machine readable, but a lot of data that’s pulled from the web is not. This data needs to be “scraped,” that is, extracted from web pages and coded with metadata. Huynh has developed a set of tools – with names like Solvent and Crowbar – to help automate the scraping process. It’s still no fun. But scraping isn’t required if you’re supplying your own data or pulling it from other exhibits.

Huynh offers one other caution for now. Because web browsers, rather than backroom servers, tabulate the data in exhibits, Exhibit works best with data sets of a few hundred items. The larger the data set, the longer an exhibit may take to load.
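
To see why size matters, consider a minimal sketch of filtering done entirely in memory, the way a browser-side tool must work. This is not Exhibit's code (Exhibit runs as JavaScript in the browser); the items, field names, and the functions filter_items and count_values are hypothetical. The point is that each change of filter scans every item, so the work grows with the size of the data set.

```python
from collections import Counter

# Hypothetical items; in an exhibit, the browser would load these from the data file.
items = [
    {"label": "Subject A", "semester": "Fall", "level": "Undergraduate"},
    {"label": "Subject B", "semester": "Spring", "level": "Undergraduate"},
    {"label": "Subject C", "semester": "Fall", "level": "Graduate"},
]


def filter_items(all_items, selections):
    """Keep items whose value for every chosen field is among the allowed values."""
    return [
        item
        for item in all_items
        if all(item.get(field) in allowed for field, allowed in selections.items())
    ]


def count_values(some_items, field):
    """Tally how many items carry each value of a field, as a filter panel might show."""
    return Counter(item[field] for item in some_items if field in item)


if __name__ == "__main__":
    fall = filter_items(items, {"semester": {"Fall"}})
    print([item["label"] for item in fall])
    print(dict(count_values(fall, "level")))
```

With a few hundred items, a full scan like this is effectively instant. With many thousands, both the initial download and each scan can become noticeable.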

A Prime-Time Example
Exhibit is no ivory-tower project. It has already been used to build several pages, including an innovative web service that benefits MIT students. That exhibit, the MIT Course Picker, helps students make informed decisions during preregistration. CSAIL and Information Services and Technology (IS&T) launched an improved beta version of this service on January 29.

Course Picker provides a compelling, interactive format that lets students sort and filter subjects based on their own preferences. For example, students can

  • Browse and sort information by subject, semester, and date/time
  • Add core subjects first, then see what elective subjects fit their schedule
  • Filter by criteria, such as subject or instructor ratings from the Underground Guide to Course VI, or subjects that fulfill the HASS-D requirement
  • View official catalog data sourced from IS&T’s Data Warehouse
  • See when recitation and lab sections have been scheduled and add choices to a planning schedule
  • Check official information posted on the Online Subject Listing and Schedule from within Course Picker
  • See and print a calendar view of their proposed weekly schedule before signing up for courses through WebSIS, MIT’s student information system

Note: The calendar students create with Course Picker serves as an aid only. Students’ official schedules are created by WebSIS.

Behind the Scenes
Course Picker began in January 2007 as a proof of concept by Huynh. He built a prototype in about a week – mostly spent scraping the HTML of MIT’s online subject listings. In summer 2007, Huynh and four UROP students – Margaret Leibovic, Gabriel Durazo, Nina Guo, and Mason Tang – spent another week updating that prototype. That fall, after IS&T provided Huynh’s group with the official feed for catalog data, Huynh and Leibovic took another week to complete the integration.

Building this sophisticated exhibit in three weeks’ time attests to the power of Exhibit and open, structured data.

Exhibit-ionist Tendencies?
To learn more, visit the Exhibit 2.0 page. It offers examples, tutorials, and a wiki with documentation. For more examples, see Surf Sites.

