6.S097 Urban Data Analytics and Machine Learning

This project-based class will teach the fundamentals of urban data analytics.

The class focuses on a machine learning approach to building predictive models and effectively handling the large datasets common to urban data applications.


On the machine learning side, the class will introduce basic classifiers, collaborative filtering, neural networks, and other common machine learning algorithms. However, unlike other machine learning courses, 6.S097 is application-heavy. A large part of the class will also involve understanding the tools in the space. Some of the libraries you may find yourself working with in this class (such as Caffe, TensorFlow, and Theano) are often used outside of academia.

Class Structure

Lectures will run from 12:30pm to 2:00pm on Tuesdays and Thursdays, from January 17th to February 2nd, and will be held in Building 34, Room 301. The class is 3 units and will be graded P/D/F based on performance on the final project. A more detailed schedule can be found in the notes section. The last couple of lectures will focus on project work and presentations.

The Website

The site will have notes posted regularly in the notes section. These will include lecture notes as well as supplementary notes we feel will be useful to your learning experience. Our goal is to make the learning as intuitive and hands-on as possible.

Final Project

The final project is the capstone of this class. It will be an exciting opportunity to apply machine learning algorithms to never-before-seen datasets, courtesy of Nodal API. Nodal API is an urban data analytics API with features ranging from real-time bus data to collision and crime data, and from advanced route recommendation to a robust notification system. The final project will involve augmenting standard machine learning algorithms to create an urban data application of your choice.

Some examples of projects are as follows:

1) A randomized route recommendation algorithm that provides a variety of routes to everyday runners while still retaining the quality of the recommended routes.

2) A popularity score that uses Nodal's user feedback system to rank routes and regions.

3) A collision readiness system that uses collision data to assist with fast ambulance response.
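
To give a feel for the first idea, here is a minimal sketch of one possible approach: drop routes below a quality floor, then sample from the rest with weights that favor higher scores. Everything here is an assumption made for illustration. The route names, the (id, quality) format, the quality scale, and the parameters min_quality and temperature are hypothetical and do not reflect Nodal API's actual data or interface.

```python
import math
import random


def sample_routes(routes, k, min_quality=0.6, temperature=0.2, seed=None):
    """Pick up to k distinct routes at random, favoring higher quality.

    routes: list of (route_id, quality) pairs, quality assumed in [0, 1].
    This input format is hypothetical, not the Nodal API schema.
    """
    rng = random.Random(seed)
    # Keep only routes that meet the quality floor.
    pool = [(rid, q) for rid, q in routes if q >= min_quality]
    chosen = []
    while pool and len(chosen) < k:
        # Softmax-style weights: better routes are more likely, not guaranteed.
        weights = [math.exp(q / temperature) for _, q in pool]
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        # Remove the picked route so it cannot be chosen twice.
        chosen.append(pool.pop(idx)[0])
    return chosen


# Made-up example data.
routes = [
    ("river_loop", 0.9),
    ("campus_loop", 0.8),
    ("bridge_run", 0.7),
    ("highway_side", 0.3),
]
picks = sample_routes(routes, k=2, seed=0)
print(picks)
```

A lower temperature makes the picks concentrate on the best routes, while a higher one spreads them more evenly across everything above the floor. A real project would likely learn the quality scores from data rather than hard-code them.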


This class is taught by Parth Shah and Riju Pahwa. If you are interested, email shahp@mit.edu and pre-register on WebSIS.