11.522: UIS Research Seminar (Fall 2009) – Project Update - Shan Jiang

Estimating Activity Destinations at Disaggregated Level: Towards an Activity-Based Approach

Purpose of Study

This research proposes to develop and analyze data fusion and data mining methods that use the data sources described below to calibrate models of urban activity. These models are intended to substitute, at a more disaggregated level, for the traditional Trip Generation and Trip Distribution steps of the four-step model (FSM). The methods will be developed and illustrated using the Boston Metropolitan Area, MA, as an example and, once their reliability and robustness have been tested, applied to a Portuguese counterpart, the Lisbon metropolitan area.

Data Sources

Data sources include land use data, derived 'point of interest' (POI) information, road networks, high-resolution orthophotos, and proprietary databases of destinations for model validation (such as Info USA data for the Boston Metro Area).

Proposed Methods

Machine Learning Approach

A machine learning method will be developed with our research collaborators: algorithms will be trained to capture useful information from the Internet and match it with the aggregated data sources, generating disaggregated destination measures at the parcel level or the high-resolution grid cell level (see Figure 1).

The challenging part of this approach is developing a method to ground-truth the derived points of interest (POIs) against the observed population of destinations, so that the algorithms can be tested and calibrated. The following steps will be implemented and then tuned according to the results:

1)      Develop a structure/framework for activity classifications which can be useful for travel demand modeling, based on the North American Industry Classification System (NAICS) standards.

2)      Prepare disaggregated and aggregated destination information (including categories, and number of activities within each category) based on the previously defined activity classification framework.

3)      Develop an algorithm that captures the differences between the derived POIs and the observed destinations and guides new searches for POIs so as to narrow the gaps. The process iterates until the gaps converge to a set threshold (say, a 1% difference from the population of destinations).

4)      Feed the derived disaggregated destination layers into other transportation models.
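The iterative loop in step 3 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the project's algorithm: the grouping of NAICS sectors into activity categories, the function names, and the `search_more` callback (a stand-in for a new POI search) are all hypothetical, and the sketch only handles categories where the derived POIs fall short of the observed destinations.

```python
from collections import Counter

# Hypothetical grouping of 2-digit NAICS sectors into activity categories.
# The sector codes are NAICS sectors; the category names are assumptions.
NAICS_TO_ACTIVITY = {
    "44": "shopping",    # Retail Trade
    "45": "shopping",    # Retail Trade
    "61": "education",   # Educational Services
    "62": "health",      # Health Care and Social Assistance
    "72": "eating",      # Accommodation and Food Services
}


def activity_counts(naics_codes):
    """Count destinations per activity category from their NAICS codes."""
    return Counter(NAICS_TO_ACTIVITY.get(code[:2], "other") for code in naics_codes)


def relative_gaps(derived, observed):
    """Per-category |derived - observed| / observed."""
    return {cat: abs(derived.get(cat, 0) - n) / n for cat, n in observed.items()}


def refine(derived_codes, observed_codes, search_more, tol=0.01, max_rounds=20):
    """Search for more POIs until every category gap is within tol."""
    observed = activity_counts(observed_codes)
    derived_codes = list(derived_codes)
    for _ in range(max_rounds):
        gaps = relative_gaps(activity_counts(derived_codes), observed)
        lagging = [cat for cat, gap in gaps.items() if gap > tol]
        if not lagging:
            return derived_codes, gaps, True
        found = search_more(lagging)
        if not found:  # the search is exhausted before convergence
            return derived_codes, gaps, False
        derived_codes.extend(found)
    gaps = relative_gaps(activity_counts(derived_codes), observed)
    return derived_codes, gaps, all(g <= tol for g in gaps.values())


# Toy data: illustrative 6-digit codes, not real establishments.
observed = ["445110", "448140", "611110", "722511"]
pool = {"shopping": ["448140"], "education": ["611110"], "eating": ["722511"]}


def toy_search(categories):
    """Stand-in for a web search: return one unseen code per lagging category."""
    return [pool[cat].pop() for cat in categories if pool.get(cat)]


codes, gaps, converged = refine(["445110"], observed, toy_search)
print(converged, gaps)
```

The default tolerance of 0.01 mirrors the 1% threshold suggested in step 3. In practice the gap measure would probably also need to penalize over-coverage and spatial mismatch, which this sketch ignores.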

Figure 1: Illustration of the Data Mining Structure

Non-parametric Estimation Approach

One disadvantage of using the machine learning approach to derive disaggregated destination information is that it is limited in deriving other related information, such as the number of jobs associated with each destination, which is an important parameter for transportation demand modeling. This limitation usually stems from the availability of online data.

A non-parametric estimation approach may have some advantages in this respect. Using sample disaggregated data together with aggregated data, we plan to derive an intermediate level of data that will be useful for activity-based transportation modeling and for other policy analysis at the micro scale.
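As one possible illustration of this idea, the sketch below spreads an aggregated zone total (for example, jobs in a zone) across grid cells in proportion to a kernel density estimate built from a sample of disaggregated establishment locations. The proposal does not specify the estimator; the Gaussian kernel, the fixed bandwidth, the function names, and the toy coordinates are all assumptions made for illustration.

```python
import math


def gaussian_kernel_weight(dx, dy, bandwidth):
    """Unnormalized 2-D Gaussian kernel value for an offset (dx, dy)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * bandwidth ** 2))


def allocate_zone_total(zone_total, cell_centers, sample_points, bandwidth):
    """Split an aggregated zone total across grid cells in proportion to a
    kernel density estimate built from sampled (x, y, weight) locations."""
    densities = []
    for cx, cy in cell_centers:
        density = sum(
            w * gaussian_kernel_weight(cx - x, cy - y, bandwidth)
            for x, y, w in sample_points
        )
        densities.append(density)
    total_density = sum(densities)
    if total_density == 0:
        # No sample information: fall back to a uniform split.
        return [zone_total / len(cell_centers)] * len(cell_centers)
    return [zone_total * d / total_density for d in densities]


# Toy example: four cell centers and two sampled establishments (x, y, jobs).
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
samples = [(0.1, 0.1, 10.0), (0.9, 0.2, 5.0)]
jobs = allocate_zone_total(1000.0, cells, samples, bandwidth=0.5)
print([round(j, 1) for j in jobs])
```

The allocation preserves the zone total by construction, so the intermediate-level estimates stay consistent with the aggregated source. Bandwidth selection, which Scott (1992) treats in depth, would need to be calibrated against validation data.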

Potential Applications

1.      By using disaggregated measures of urban activity destinations, urban transportation demand models can represent urban activity more realistically than with traditional aggregated measures (such as those using Transportation Analysis Zones as analysis units), since aggregated geographic analysis units lose the heterogeneity of urban activities.

 

Independent variables:

·         Disaggregated measures of urban activity destinations

·         Other variables in the traditional four-step model of transportation demand modeling

o    Land Use

o    Population

o    Road Network

o    Travel Survey 

Dependent variable

·         Peak-hour travel time (for home-based work trips)

For comparison, an aggregated measure of activity destinations will be applied to the same model, so that the results from the two measures can be contrasted.

Model

The traditional four-step model of transportation demand will be applied.

 

2.      Using the disaggregated measures of activity destinations will enable us to improve urban growth management policy making. (An example is given below to illustrate the idea, though the applications are not limited to such cases.) Using the models developed in the previous example, we will develop different policy scenarios to test the sensitivity of the different measures of urban activity destinations in response to policy changes.

 

One example is to test an “Urban Renewal Policy” in Lisbon. In this scenario, we imagine that small businesses are brought to downtown Lisbon by certain policy incentives. We will test the accessibility changes in this scenario using the different measures of activity destinations.

 

Independent variables:

·      Disaggregated measures of urban activity destinations

·      Other variables in the traditional four-step model of transportation demand modeling

o  Land Use

o  Population

o  Road Network

o  Travel Survey 

Dependent variable

·      Accessibility measurement (a function of the activity destinations and of transportation infrastructure and services)
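To make the dependent variable concrete, the sketch below computes a gravity-type accessibility index, in which each destination's opportunities are discounted by an exponential function of travel time, and compares a baseline with a scenario that adds small businesses downtown. The proposal does not fix a functional form; the exponential decay, the `beta` value, the straight-line travel-time function, and the toy coordinates and counts are assumptions for illustration only.

```python
import math


def accessibility(origin, destinations, travel_time, beta=0.1):
    """Gravity-type accessibility: opportunities discounted by travel time."""
    return sum(
        opportunities * math.exp(-beta * travel_time(origin, location))
        for location, opportunities in destinations
    )


def straight_line_minutes(a, b, speed_km_per_min=0.5):
    """Hypothetical impedance: straight-line distance (km) at a constant speed."""
    return math.dist(a, b) / speed_km_per_min


# Toy destinations as ((x_km, y_km), number_of_activities).
downtown, suburb = (0.0, 0.0), (4.0, 3.0)
baseline = [(downtown, 20), (suburb, 50)]
renewal = [(downtown, 35), (suburb, 50)]  # scenario: more small businesses downtown

home = (1.0, 0.0)
before = accessibility(home, baseline, straight_line_minutes)
after = accessibility(home, renewal, straight_line_minutes)
print(round(before, 2), round(after, 2))
```

Running the same function on parcel-level destinations and on zone-level totals placed at zone centroids would give the two measures that the comparison calls for. A network-based travel time would replace the straight-line placeholder in any real application.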

 

For comparison, an aggregated measure of activity destinations will be applied to the same model, so that the results from the two measures can be contrasted.

 

References:

The North American Industry Classification System (NAICS): http://www.census.gov/eos/www/naics/

Scott, D. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. New York: Wiley.

Duda, R., Hart, P., and Stork, D. (2001). Pattern Classification (2nd edition). New York: John Wiley and Sons.