11.522: UIS Research Seminar (Fall 2009) – Project Update – Shan Jiang
Estimating Activity Destinations at a Disaggregated Level: Towards an
Activity-Based Approach
Purpose of Study
This research proposes to develop and analyze data fusion and data mining
methods that use the data sources described below to calibrate models of urban
activity. These models can substitute for the traditional Trip Generation and
Trip Distribution steps of the Four-Step Model (FSM) at a more disaggregated
level. The methods will be developed and illustrated using the Boston
Metropolitan Area, MA, as an example and, once their reliability and
robustness have been tested, applied to a Portuguese counterpart, the Lisbon
metropolitan area.
Data Sources
Data sources include land use data, derived 'point of interest' (POI)
information, road networks, high-resolution orthophotos, and proprietary
databases of destinations for model validation (such as InfoUSA data for the
Boston Metro Area).
Proposed Methods
Machine Learning Approach
Working with our research collaborators, we will apply machine learning
methods to train algorithms that capture useful information from the Internet
and match it with the aggregated data sources, so as to generate disaggregated
destination measures at the parcel level or the high-resolution grid cell
level (see Figure 1).
The challenging part of this approach is developing a method to ground-truth
the derived POIs against the observed population of destinations, so that the
algorithms can be tested and calibrated. The following steps will be
implemented and tuned based on the results of the implementation:
1) Develop a framework for activity classification that is useful for travel
demand modeling, based on the North American Industry Classification System
(NAICS) standards.
2) Prepare disaggregated and aggregated destination information (including
the categories and the number of activities within each category) based on the
activity classification framework defined in step 1.
3) Develop an algorithm that captures the differences between the derived
POIs and the observed destinations and guides new searches for POIs so as to
narrow the gaps. The process iterates until the gaps fall below a chosen
threshold (say, a 1% difference from the population destinations).
4) Feed the derived disaggregated destination layers into other
transportation models.
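The iterative loop in step 3 can be sketched as follows. This is a minimal illustration in Python, not the project's algorithm: the `search` callback (standing in for a new round of POI retrieval), the category names, and the counts are all hypothetical, and only the 1% stopping threshold is taken from the step description.

```python
from collections import Counter


def category_gaps(derived, observed):
    """Relative gap per category: |derived - observed| / observed."""
    return {
        cat: abs(derived.get(cat, 0) - n) / n
        for cat, n in observed.items()
        if n > 0
    }


def close_gaps(observed, search, tolerance=0.01, max_rounds=50):
    """Search for more POIs until every category is within `tolerance`.

    `search(category, shortfall)` is a hypothetical callback that returns
    how many new POIs of that category the latest search round found.
    Categories that overshoot are not corrected in this sketch.
    """
    derived = Counter()
    for round_no in range(1, max_rounds + 1):
        gaps = category_gaps(derived, observed)
        open_cats = [cat for cat, gap in gaps.items() if gap > tolerance]
        if not open_cats:
            return derived, round_no - 1  # converged before this round
        for cat in open_cats:
            shortfall = observed[cat] - derived[cat]
            if shortfall > 0:
                derived[cat] += search(cat, shortfall)
    return derived, max_rounds


# Toy data: observed destination counts per (hypothetical) activity category.
observed = {"retail": 200, "food": 80}


def toy_search(category, shortfall):
    # Assumption for illustration: each round finds half of what is missing.
    return -(-shortfall // 2)  # ceiling division


derived, rounds = close_gaps(observed, toy_search)
final_gaps = category_gaps(derived, observed)
```

In practice the stopping rule might also need a cap on overshoot or a per-category tolerance; the sketch only shows the compare-search-repeat structure.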

Figure 1: Illustration of the Data Mining Structure
Non-parametric Estimation Approach
One disadvantage of the machine learning approach to deriving disaggregated
destination information is that it is limited in deriving other related
information, such as the number of jobs associated with each destination,
which is an important parameter for transportation demand modeling. Such
information is usually constrained by the availability of online data.
A non-parametric estimation approach may have some advantages in this
respect. Using sample disaggregated data together with aggregated data, we
plan to derive an intermediate level of data that will be useful for
activity-based transportation modeling and for other policy analysis at the
micro-scale.
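One way to combine sample disaggregated data with an aggregated total is sketched below. The proposal does not name a specific estimator, so the choice of a Gaussian kernel density estimate is an assumption, as are the function name `kde_allocate`, the bandwidth, the coordinates, and the job total. The idea is to estimate a density surface from the sampled destination locations and use it to spread a zone-level total (for example, jobs) over grid cells.

```python
import numpy as np


def kde_allocate(sample_xy, cell_xy, zone_total, bandwidth):
    """Allocate an aggregate total to grid cells in proportion to a
    Gaussian kernel density estimated from sample point locations.

    sample_xy: (n_samples, 2) coordinates of sampled destinations
    cell_xy:   (n_cells, 2) coordinates of grid cell centers
    """
    # Pairwise squared distances between cell centers and sample points.
    diff = cell_xy[:, None, :] - sample_xy[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    # Unnormalized Gaussian kernel; the constant cancels in the ratio below.
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1)
    return zone_total * density / density.sum()


# Hypothetical inputs: three sampled destinations and a 2x2 grid in one zone.
sample_xy = np.array([[0.2, 0.2], [0.3, 0.1], [0.8, 0.9]])
cell_xy = np.array([[0.25, 0.25], [0.75, 0.25], [0.25, 0.75], [0.75, 0.75]])
jobs_by_cell = kde_allocate(sample_xy, cell_xy, zone_total=1000.0, bandwidth=0.2)
```

The allocation preserves the zone total by construction, so the disaggregated layer stays consistent with the aggregated source; the bandwidth controls how far each sample point's influence spreads.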
Potential Applications
1. By utilizing disaggregated measures of urban activity destinations, urban
transportation demand models can represent urban activity more realistically
than with traditional aggregated measures (such as those using Transportation
Analysis Zones as analysis units), since aggregated geographic analysis units
lose the heterogeneity of urban activities.
Independent variables:
· Disaggregated measures of urban activity destinations
· Other variables in the traditional four-step model of transportation demand
modeling
o Land Use
o Population
o Road Network
o Travel Survey
Dependent variable:
· Travel time for peak hours (for home-based work trips)
For comparison, an aggregated measure of activity destinations will be applied
to the same model, so that the two measures can be compared.
Model:
The traditional Four-Step Model of transportation demand modeling will be
applied.
2. Utilizing the disaggregated measures of activity destinations will enable
us to improve urban growth management policy making. (An example is given to
illustrate the idea, though the applications are not limited to such cases.)
Using the models developed in the previous example, we will develop different
policy scenarios to test the sensitivity of the different measures of urban
activity destinations in response to policy changes.
One example is to test an “Urban Renewal Policy” in Lisbon. In this scenario,
we imagine that small businesses are brought to downtown Lisbon by certain
policy incentives. We will test the accessibility changes in this scenario
using the different measures of activity destinations.
Independent variables:
· Disaggregated measures of urban activity destinations
· Other variables in the traditional four-step model of transportation demand
modeling
o Land Use
o Population
o Road Network
o Travel Survey
Dependent variable:
· Accessibility measurement (which is a function of the activity destinations
and transportation infrastructure and services)
For comparison, an aggregated measure of activity destinations will be applied
to the same model, so that the two measures can be compared.
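To make the dependent variable concrete, the sketch below computes accessibility before and after a policy scenario. The proposal only states that accessibility is a function of activity destinations and transportation infrastructure and services; the gravity-type form used here (opportunities discounted by an exponential function of travel time), the decay parameter `beta`, and all numbers are assumptions for illustration.

```python
import numpy as np


def accessibility(opportunities, travel_time, beta=0.1):
    """Gravity-type accessibility: A_i = sum_j O_j * exp(-beta * t_ij).

    opportunities: (n_destinations,) activity counts at each destination
    travel_time:   (n_origins, n_destinations) travel times in minutes
    beta:          hypothetical distance-decay parameter (per minute)
    """
    return (np.exp(-beta * travel_time) * opportunities).sum(axis=1)


# Hypothetical two-zone example; destination 0 plays the role of downtown.
travel_time = np.array([[5.0, 20.0],
                        [20.0, 5.0]])
baseline = np.array([100.0, 50.0])   # activity destinations before the policy
scenario = np.array([130.0, 50.0])   # small businesses added downtown

before = accessibility(baseline, travel_time)
after = accessibility(scenario, travel_time)
gain = after - before                # accessibility change per origin
```

Running the same calculation with destinations counted per parcel or grid cell and then per aggregated zone would give the two sets of results that the comparison calls for.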
References:
The North American Industry Classification System (NAICS):
http://www.census.gov/eos/www/naics/
Scott, D. (1992). Multivariate Density Estimation: Theory, Practice, and
Visualization. New York: Wiley.
Duda, R., Hart, P., and Stork, D. (2001). Pattern Classification (2nd
edition). John Wiley and Sons.