Massachusetts Institute of Technology - Department of Urban Studies and Planning


11.188: Urban Planning and Social Science Laboratory

11.205: Intro to Spatial Analysis (1st half-semester)

Homework 2

Extracting & Querying Census Data plus Site Suitability Analysis


Distributed:

Wednesday, March 10, 2021

Due: (at start of class/lab)

Friday, March 19, 2021 - Question #1
Friday, March 26, 2021 - Question #2

Extra Notes-for Part II

Site Suitability Analyses; Dataset for suitability example

INTRODUCTION

The main focus of this homework is Question #2 where we undertake a classic Ian McHarg-style 'site suitability analysis'. See this note on spatial analysis "classics" by John Corbett, Ian McHarg: Overlay Maps and the Evaluation of Social and Environmental Costs of Land Use Change. For our hypothetical 'site suitability analysis' example (in question #2), we will find suitable locations for a Cambridge senior center by identifying the possible locations that meet each of several criteria: close to major roads, near the target population, far from hazards, etc.

However, this homework also serves to improve your facility with data manipulation tools, including use of the 'group by' operation in SQL queries in QGIS. Question #1 focuses on database manipulation and analysis using some of the 2010 census data involved in Question #2. There are only two questions in this homework set, but each is more complicated than you might think because of the difficulty of decomposing the problems into doable steps.

BEFORE YOU START WITH QGIS - AND LONG BEFORE THE DUE DATES FOR EACH PART OF THE HOMEWORK - PLEASE BACK AWAY FROM THE COMPUTER AND READ THROUGH THE WHOLE ASSIGNMENT. Getting the 'big picture' first will help you develop a better GIS strategy and will reduce time and energy wasted.

DATA for this homework:

All the data needed for this homework set are available in the class 'data' locker on AFS as indicated in the text.  In addition, we have copied the data into a zipped data package called hwk2-package.zip available in the 'Materials' section for the class on Stellar: https://stellar.mit.edu/S/course/11/sp21/11.188/index.html The following files are included in this package:

PART I: QUESTION 1 [40 points]

The following table of Senior Citizen poverty statistics was developed from the 2009-2013, 5-year, ACS estimates.

DATA

Question 1a: Fill in the empty cells in the following partially completed table and answer the questions below. Instead of using QGIS, you are free to use other database management software (such as MS-Access, postgres, or even Excel or ArcMap). Note that you will have to make careful use of the 'group by' operation in your SQL queries (or equivalent 'total' or 'summarize' commands in MS-Access or ArcMap) in order to compute the correct values for many of the columns. Note, also, that all five towns are in the same County, namely Middlesex County in Massachusetts.

Summary of Senior Citizen Poverty Statistics for Cambridge and Abutting Towns, 2009-2013

| Town | Number of tracts: Total | Number of tracts: Tracts with some population for whom poverty status is determined | Number of tracts: Tracts with seniors for whom poverty status is determined | Number of seniors with determined poverty status: # of seniors below the Federal Poverty Level | Number of seniors with determined poverty status: Percent of population with a determined poverty status that are poor seniors | Average of percentages calculated for each tract |
|---|---|---|---|---|---|---|
| Arlington | 8 | | | 631 | | |
| Belmont | 8 | 8 | | | | 1.13 |
| Cambridge | 32 | | | | 2.03 | |
| Somerville | 18 | | | 1282 | | |
| Watertown | 6 | 6 | 6 | 651 | 2.05 | 1.93 |
| Overall | 72 | | | | 1.77 | |

Now, take a look at the last two columns and note that the overall percentage of impoverished seniors within a town is generally different from the town average of the percentages that one can calculate separately for each tract within that town. Answer the following three questions (in a short paragraph or two each).

Question 1b: Explain briefly why the overall average (the second to last column) and the average of the tract percentages (the last column) could be different.

Question 1c: In this instance which method (that is which of the last two columns) tends to be larger? Why?

Question 1d: Which set of numbers is most appropriate for use in our analysis for Question #2? Why?

Turn in the table with all your computed values added and write at most a paragraph or two for each question - 1b, 1c, and 1d.

Tips: Filling in the table looks simple enough -- just run some of those 'Field calculator' commands for the requisite tracts. But it will take some thought to determine what each column is measuring, which census rows to include, and how to do the summarizing. Another tricky part of this question, which we've taken care of by providing TractToTown, is figuring out which tract is within which town. The tracts can span a town boundary, or the boundaries of towns and block groups might not line up exactly. For this exercise, we define a tract to be "in" a town if the centroid of the tract is inside the town boundary. (See Appendix I below for information about various ways to examine the spatial relationship between tracts and towns.)
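As a rough illustration of the kind of 'group by' summarizing described in these tips, the sketch below uses Python's standard-library sqlite3 module on a tiny made-up dataset. Only the names TractToTown, GEOID10, and TOWN come from the assignment; the table acs_tracts, its columns pop_determined and poor_seniors, the town names, and every number are hypothetical placeholders rather than the real ACS fields or values.

```python
import sqlite3

# Hypothetical toy data: NOT the real ACS values, tract IDs, or column names.
tract_to_town = [("T1", "TownA"), ("T2", "TownA"), ("T3", "TownB")]
acs_tracts = [
    # (GEOID10, pop_determined, poor_seniors)
    ("T1", 1000, 10),
    ("T2", 3000, 90),
    ("T3", 2000, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TractToTown (GEOID10 TEXT, TOWN TEXT)")
conn.execute(
    "CREATE TABLE acs_tracts "
    "(GEOID10 TEXT, pop_determined INTEGER, poor_seniors INTEGER)"
)
conn.executemany("INSERT INTO TractToTown VALUES (?, ?)", tract_to_town)
conn.executemany("INSERT INTO acs_tracts VALUES (?, ?, ?)", acs_tracts)

# One output row per town: a tract count, a sum, a percentage computed from
# the town totals, and the average of the per-tract percentages.
query = """
SELECT t.TOWN,
       COUNT(*)                                            AS n_tracts,
       SUM(a.poor_seniors)                                 AS poor_seniors,
       100.0 * SUM(a.poor_seniors) / SUM(a.pop_determined) AS pct_overall,
       AVG(100.0 * a.poor_seniors / a.pop_determined)      AS avg_tract_pct
FROM acs_tracts AS a
JOIN TractToTown AS t ON a.GEOID10 = t.GEOID10
WHERE a.pop_determined > 0      -- skip tracts with a zero denominator
GROUP BY t.TOWN
ORDER BY t.TOWN
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same query text should carry over, with the real table and column names substituted, to most SQL tools; the WHERE clause is one possible way to handle tracts with no population for whom poverty status is determined.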

PART II: QUESTION 2 [60 points]: The site suitability analysis

A local non-profit group is interested in locating a site for building a senior center in Cambridge. Given your expertise in GIS, you are hired as a GIS analyst by this organization to help them locate the best site. After a long meeting with the organization and the community, you agree to run some numbers in order to get a handle on the locations and characteristics of potentially suitable Cambridge sites. You settle on the following criteria to get rolling with your site selection process:

  1. The minimum area of land needed for the project is 1 contiguous hectare (1 hectare = 10,000 square meters = 2.471 acres, and 1 acre = 43,560 square feet).
  2. Ideally, the site should be located near, but not in, a residential neighborhood.
  3. Accessibility to the project is a major concern for the organization, especially given the often limited mobility of seniors. You determine that the project should be located within 250 meters of a major road.
  4. The organization is also worried about health risks. They decide that they want the site to be far (more than 300 meters away) from Toxic Release Inventory (TRI) sites as identified in the Environmental Protection Agency (EPA) toxic release inventory databases. Note that they want only the area within 300 meters of a TRI site excluded.
  5. Accessibility by seniors with limited financial means for joining private clubs is deemed especially important. Therefore, you decide to narrow the criteria to focus on census tracts where:

Use QGIS and the various layers that we have used in class exercises to undertake a basic site suitability analysis. (Feel free to augment your maps with some other map layers stored in the class data directory in order to improve the visual quality of the presentation - but stick with the site selection criteria listed above).

Question 2a: First, prepare 3 maps showing the locations that are acceptable based on the criteria identified above. Map these criteria separately:

  1. proximity to major roads and distance from the TRI Facilities.
  2. appropriate land use characteristics
  3. tracts having a high percentage of seniors below the poverty level.

Question 2b: Next, prepare a fourth map that shows the Cambridge locations that meet all these criteria as well as the 1-contiguous-hectare constraint.
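To make the logic of combining criteria concrete, here is a minimal raster-style sketch in Python. It is not the QGIS workflow for this homework: it assumes each criterion has already been reduced to a small 0/1 grid, that every cell is 50 meters on a side, and that 'contiguous' means cells sharing an edge. The grids, the cell size, and the function names overlay and contiguous_patches are all hypothetical; only the 10,000 square meter threshold comes from the criteria above.

```python
from collections import deque

CELL_SIZE_M = 50      # assumed cell edge length in meters (hypothetical)
MIN_AREA_M2 = 10_000  # 1 hectare = 10,000 square meters

# Hypothetical 4 x 6 layers: 1 = the cell passes that criterion.
roads_and_tri = [[1, 1, 1, 0, 1, 1],
                 [1, 1, 1, 0, 1, 1],
                 [1, 1, 0, 0, 0, 1],
                 [0, 0, 0, 0, 0, 1]]
land_use = [[1, 1, 0, 0, 1, 1],
            [1, 1, 1, 0, 0, 1],
            [1, 1, 0, 0, 0, 1],
            [1, 0, 0, 0, 0, 0]]
poverty_tracts = [[1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 0, 0],
                  [1, 1, 1, 1, 0, 0]]


def overlay(*layers):
    """Cell-by-cell logical AND of equally sized 0/1 grids."""
    return [[int(all(cells)) for cells in zip(*rows)] for rows in zip(*layers)]


def contiguous_patches(grid):
    """Group cells with value 1 into patches of edge-sharing neighbors."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen, patches = set(), []
    for r in range(n_rows):
        for c in range(n_cols):
            if not grid[r][c] or (r, c) in seen:
                continue
            patch, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill from (r, c)
                y, x = queue.popleft()
                patch.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < n_rows and 0 <= nx < n_cols
                    if inside and grid[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            patches.append(patch)
    return patches


combined = overlay(roads_and_tri, land_use, poverty_tracts)
cell_area = CELL_SIZE_M ** 2
suitable = [p for p in contiguous_patches(combined)
            if len(p) * cell_area >= MIN_AREA_M2]
for patch in suitable:
    print(len(patch) * cell_area, "square meters:", sorted(patch))
```

In this toy grid, two separate patches survive the overlay, but only one of them is large enough to meet the one-hectare minimum. Whether diagonal neighbors should also count as contiguous is one of the interpretation choices worth discussing in Question 2c.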

Question 2c: Along with the maps, provide a page or two (not a treatise) of discussion concerning:

  1. any choices or interpretations that you made in generating the suitable locations and which you feel bear some explanation, and
  2. your conclusions from this initial analysis regarding suitable sites. Regarding your conclusions, don't just pick one site--there isn't a definitive 'best site' given the criteria we've suggested. Be sure to include some interpretation regarding the extent to which the analysis helps you focus on one part of town, on one or another criterion, on a set of proximity issues that might forecast special interest concerns/complaints, etc. Would you suggest tightening or relaxing some of the criteria? Can you suggest important considerations that are not well captured in this suitability analysis?

Hand in a discussion of the answers to question 2c along with the four maps. That is, what you hand in should be a short report on your facility siting analysis that explains what you did, interprets your results, and discusses your conclusions - with the four maps referenced in your text and included in-line after each is mentioned (or all together on separate pages at the end of the text).

 

FAQ and other Suggestions

Need Help?

If you need help, and you think that your question might be of interest to the whole class, please post your question on Piazza. We strongly encourage you to use this option since this will be the quickest way to get help. If you would prefer to ask just the class staff, send e-mail to 11.188@mit.edu. Please don't be shy about asking for help if you are struggling with understanding the assignment or having trouble with a particular aspect of QGIS. In the past, we've heard stories of students struggling for many hours over minor issues that had easy solutions. We'd like to avoid these misadventures, so please contact us sooner rather than later if you get stuck. (When you finally get unstuck, spend a moment reflecting on what you could have done to get unstuck earlier if you had the vocabulary, GIS understanding, or roadmap of QGIS to use the various help files and references.) Also, we recognize the value of group work and encourage study group discussion of the homework as well as lab exercises - but we require that you turn in individual work that reflects your own learning and hands-on discovery.

 


Appendix I: Determining which tracts fall within which town

As indicated in the homework text, the Massachusetts town boundaries do not line up precisely with the US Census tract boundaries even though, in almost all instances, each tract falls within a single town. These mismatches are known as a 'sliver problem' and, in this case, the reason for the problem is that the tract boundaries are much less detailed and precise than the Massachusetts town boundary layer. The following graphics illustrate the problem. [They use a shapefile, 5Towns.shp, containing the borders of the five municipalities in and around Cambridge and north of the Charles River. This shapefile was exported from the MassGIS matown00.shp shapefile.]

MA town and tract overlap

The image above shows the five towns in and around Cambridge (and north of the Charles River) together with all the tracts that are selected using the "Select by Location" function with "intersect" specified as the topological relation between the two layers. Many tracts outside of the five-town boundary are also included! If you zoom into a small area of a town boundary, as shown below, you will see why so many extra tracts are selected. The two themes have different levels of detail, and there is a 'sliver' problem in trying to reconcile the common boundaries.

Zoom-in of 'sliver problem'

 

For your convenience, we have provided in the MS-Access database, hw2_ACS_lite.mdb, a cross-reference table, TractToTown, that was created by doing a "Select by Location" operation for each of the five towns, with "have their center in" as the topological relation. The table has two critical columns for this assignment: GEOID10 has the state+county+tract identifier and TOWN has the corresponding town name. For example, the image below shows the tracts that "have their center in" Cambridge. A new field "Town" is created and its value is set to "Cambridge" for those tracts.

Cambridge selection
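The state+county+tract structure of GEOID10 mentioned above can be sketched as a simple string split, assuming the standard 11-digit Census tract identifier (2 digits for state, 3 for county, 6 for tract). The function name split_tract_geoid and the sample identifier are hypothetical: 25 and 017 are the FIPS codes for Massachusetts and Middlesex County, but the tract digits are made up for illustration.

```python
def split_tract_geoid(geoid10: str) -> dict:
    """Split an 11-digit tract GEOID into state, county, and tract codes."""
    if len(geoid10) != 11 or not geoid10.isdigit():
        raise ValueError(f"expected 11 digits, got {geoid10!r}")
    return {
        "state": geoid10[:2],     # 2-digit state FIPS code
        "county": geoid10[2:5],   # 3-digit county FIPS code
        "tract": geoid10[5:],     # 6-digit tract code
    }


# Illustrative identifier only: the tract portion is not a real tract.
parts = split_tract_geoid("25017000100")
print(parts)
```

Treating the identifier as text rather than as a number is a deliberate choice here, since it keeps any leading zeros intact when tables are joined on GEOID10.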

This method works when only a handful of towns are considered. If you want to analyze hundreds of towns, you don't want to undertake this town-by-town selection process by hand. There are more advanced tools to solve this problem, such as creating a new shapefile that contains points representing the centroid for each tract in Middlesex County and then doing a spatial join between this new centroid layer and the town layer. However, creating the centroid layer from the tract shapefile is too much of a distraction for this homework set.
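For the curious, the centroid-based join described above can be sketched in plain Python. This is not required for the homework and is not how QGIS implements it; it assumes simple polygons without holes, uses the shoelace formula for the centroid and a ray-casting test for containment, and works on made-up rectangular 'towns' and 'tracts'. All names and coordinates below are hypothetical.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross                 # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(point, ring):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):       # edge crosses the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical geometry: two adjacent square towns and two tracts.
towns = {
    "TownA": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "TownB": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
tracts = {
    "T1": [(1, 1), (9, 1), (9, 9), (1, 9)],
    "T2": [(8, 2), (18, 2), (18, 8), (8, 8)],   # straddles the town line
}

# Assign each tract to the town that contains the tract's centroid.
tract_to_town = {}
for tract_id, ring in tracts.items():
    center = polygon_centroid(ring)
    tract_to_town[tract_id] = next(
        (name for name, town_ring in towns.items()
         if point_in_polygon(center, town_ring)),
        None,
    )
print(tract_to_town)
```

Note that tract T2 overlaps both towns, so an "intersect" selection would pick it up for each of them, while the centroid rule assigns it to exactly one.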


Written in 1996-2001 by Kamal Azar, Joseph Ferreira, and Tom Grayson
Modified by Myounggu Kang in October 2002 to incorporate Census 2000 data
Modified by Eric Schultheis in February 2015 to incorporate 2009-2013 ACS data
Modified 2003-2015 by Jeeseong Chung, Jinhua Zhao, Shan Jiang, Eric Schultheis and Joe Ferreira.
Modified by Juan Camilo Osorio and Hongmou Zhang on 06 March, 2017.

Last modified by Joe Ferreira on March 9, 2021.
