11.522: UIS Research Seminar (Fall 2014) - Discussion notes

Monday, October 6, 2014, 7-9 PM

Residential Choice Sub-models as Components of Large Scale Urban Models

Discussion Leader: Roberto Ponce-Lopez

Background:

Large scale urban models rely on a number of sub-models to represent the complex interactions among land use, transportation, carbon footprint, and housing values. Each of those sub-models is itself a sophisticated function that requires rich datasets to estimate parameters and make predictions. For example, sub-models may depend on both aggregate and individual data from a variety of sources, such as the census, travel surveys, real estate transactions, and firm surveys. The individual is a desirable unit of analysis for estimating behavioral models of residential choice, firm location, and home values, because it avoids ecological fallacies whereby aggregate market indicators may not reflect individual preferences and tradeoffs. Those individual level estimates are in turn an input for micro-simulating interactions among agents, modeling housing market behavior under various circumstances and policies, and simulating urban dynamics over time. The limitation of an individual level analysis is that this kind of data is expensive and adds a layer of complexity to the models. In contrast, aggregate data is widely available, but it entails two risks if used to model individual dynamics: 1) ecological fallacy (inferring individual behavior from aggregate data); 2) missing an important effect that is only discoverable at the individual level. Model builders therefore face several tradeoffs in deciding on a modeling approach, and must select the model that best fits the question they are trying to answer given the available data.


Residential choice is one such sub-model component in large scale models. The goal of the residential choice sub-model is to estimate the probability of a household occupying a certain type of house, conditional on the characteristics of the house, its price, the demographic traits of the household, and the other housing alternatives considered. Classic hedonic models estimate an expected price conditional on certain characteristics of the house and on measures of accessibility, schools, and amenities, among others. However, those models do not necessarily reflect the residential location choices made by individual households. In response to that limitation, other models, grounded in microeconomic theory, estimate an expected utility or willingness to pay by each household for a set of residential choices, conditional on a number of independent variables. These 'willingness to pay' estimates can then be used in a bidding model that tries to simulate the weekly dynamics of urban housing markets. The benefit of the latter behavioral approach is that its results might be more realistic in simulating the alternative growth paths that result from different policy interventions.

Research Questions & Methodology

I am interested in exploring models of residential choice that use individual level data from surveys and housing transactions for Singapore. The aim is to identify variables and processes that are subject to planning decisions and that affect the residential choice of families, controlling for housing alternatives, house characteristics, and demographic traits. The methodology will consist of a Multinomial Logit estimated by Maximum Likelihood. The intuition is that households have a range of location alternatives and are willing to bid more for the option that yields the highest utility to them. The problem is that we do not observe the set of alternatives each household considered; we only see the selected alternative, which is where the interviewed household actually lives. Moreover, the household surveys include demographic information about each household but not the price of the home, while the database of real estate transactions includes the price but does not keep track of the demographic characteristics of the buyer. The estimation challenge is to merge these two datasets appropriately with indicators of accessibility and amenities from aggregate data, without falling into the ecological fallacy when fitting the model.
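
To make the estimation step concrete, here is a minimal sketch of a Multinomial Logit fitted by Maximum Likelihood, using plain gradient ascent on the log-likelihood. Everything in it is an assumption made for illustration: the data are synthetic, the two attributes (labeled accessibility and price) and their coefficients are hypothetical, and each household's choice set is taken as known, which is precisely what the Singapore data do not provide.

```python
import math
import random

def choice_probs(beta, alts):
    """Logit probabilities over one household's alternatives."""
    v = [sum(b * x for b, x in zip(beta, a)) for a in alts]
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(vi - m) for vi in v]
    total = sum(e)
    return [ei / total for ei in e]

def log_likelihood(beta, data):
    """Sum of log-probabilities of the alternatives actually chosen."""
    return sum(math.log(choice_probs(beta, alts)[chosen])
               for alts, chosen in data)

def fit_mnl(data, n_params, step=0.2, iters=300):
    """Maximize the log-likelihood by gradient ascent (a simple stand-in
    for the optimizers used in statistical packages)."""
    beta = [0.0] * n_params
    n = len(data)
    for _ in range(iters):
        grad = [0.0] * n_params
        for alts, chosen in data:
            p = choice_probs(beta, alts)
            for k in range(n_params):
                expected = sum(pj * a[k] for pj, a in zip(p, alts))
                # Score: chosen attribute minus its expected value.
                grad[k] += alts[chosen][k] - expected
        beta = [b + step * g / n for b, g in zip(beta, grad)]
    return beta

# Synthetic data: 500 households, 3 alternatives, 2 attributes each.
random.seed(0)
TRUE_BETA = [1.0, -0.5]  # hypothetical: accessibility (+), price (-)
data = []
for _ in range(500):
    alts = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(3)]
    weights = choice_probs(TRUE_BETA, alts)
    chosen = random.choices(range(3), weights=weights)[0]
    data.append((alts, chosen))

beta_hat = fit_mnl(data, n_params=2)
print("estimated coefficients:", beta_hat)
print("log-likelihood at estimate:", log_likelihood(beta_hat, data))
```

In this sketch the fitted coefficients should land near the hypothetical values used to generate the choices. With real data the harder step comes first: constructing a plausible set of alternatives for each household, since only the chosen dwelling is observed.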

Key Readings:

  1. Waddell, P. (2009). Parcel-Level Microsimulation of Land Use and Transportation: The Walking Scale of Urban Sustainability. In 12th International Conference on Travel Behavior Research.
  2. Ellickson, B. (1981). An Alternative Test of the Hedonic Theory of Housing Markets. Journal of Urban Economics, 9, 56-79.

The paper by Waddell gives a good overview of the difficulties of integrating land use and transportation models; it illustrates how hard it is to implement micro-simulation, merge data from different sources and levels of aggregation, and deal with incomplete data. It also shows where residential location models fit in the broader effort of large scale urban modeling. Read the whole paper, but pay special attention to pages 11-12, where he summarizes the most important contributions to the literature on location choice and hedonic prices over the last decades.

The paper by Ellickson is a landmark work on hedonic price models. His bidding function approach employs a multinomial logit to estimate hedonic prices. My goal is to replicate a similar model using Singapore’s data. The intuition behind his theory is that a given house will be occupied by the household offering the highest bid price, and that the household that derives the highest utility from a dwelling unit will bid the highest price for it. Bid prices and utilities can be derived from characteristics of both the household and the flat. The first 4-5 pages of the paper lay out the theoretical foundations of his work.
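
The highest-bidder intuition can be sketched in a few lines. In the toy example below, each household type has a linear bid function over flat attributes, and a logit over the bids gives the probability that each type is the one occupying a given flat. The household types, attribute names, and coefficients are all hypothetical, and the logit form is my reading of the approach summarized above rather than a reproduction of the paper's specification.

```python
import math

def bid(coefs, flat):
    """Linear bid of one household type for a flat (illustrative form)."""
    return sum(coefs[k] * flat[k] for k in coefs)

def occupancy_probs(bids):
    """Logit probability that each household type is the highest bidder."""
    m = max(bids.values())  # subtract the max for numerical stability
    expd = {t: math.exp(b - m) for t, b in bids.items()}
    total = sum(expd.values())
    return {t: e / total for t, e in expd.items()}

# Hypothetical bid coefficients per household type.
BID_COEFS = {
    "small_household": {"rooms": 0.2, "access": 1.5},
    "large_household": {"rooms": 0.8, "access": 0.5},
}

# A hypothetical flat: 4 rooms, moderate accessibility score.
flat = {"rooms": 4, "access": 0.5}

bids = {t: bid(c, flat) for t, c in BID_COEFS.items()}
probs = occupancy_probs(bids)
for household_type, p in probs.items():
    print(household_type, round(p, 3))
```

Note that the choice here runs in the opposite direction from a standard location choice model: the alternatives are household types competing for one flat, not flats competing for one household.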

Additional References

Discussion questions:

  1. What are the pros and cons of classic hedonic models versus a micro-simulation approach, in terms of complexity and interpretation of the data and results?
  2. What are the main theoretical differences between the model proposed by Ellickson and the hedonic models described by Waddell?
  3. What are the desirable characteristics in a model that seeks to answer a question about housing policy?
  4. What type of data would such a model require?
  5. What are the practical differences between a residential choice model that micro-simulates behavior versus one employing aggregate data?
  6. How are those practical differences dependent on a given problem context?
