Massachusetts Institute of Technology
Department of Urban Studies and Planning


11.520: A Workshop on Geographic Information Systems
11.188: Urban Planning and Social Science Laboratory

In-Lab TEST - November 17, 2008


Test Instructions

Good luck!

Datasets for the Test

For this test, we will use US county-level data from the recent November 4, 2008, presidential election. In the usual manner, mount the class locker as drive M, and copy the entire (readonly) folder from M:\test08data to writeable space on your local drive (e.g., C:\USERTEMP in Room 37-312 and C:\tmp or C:\workspace in Room 9-251). (In Room 9-251, the class locker may not mount as Drive M:\ and you may have to navigate down Drive Z to find Z:\afs\athena.mit.edu\course\11\11.520\test08data )

Look in your copy of the folder for an ArcMap document called 11.520test08_start.mxd that already includes many of the shapefiles and tables that you will need. This ArcMap document uses the following shapefiles, which were obtained from the MIT Geodata Repository and then projected to the North_America_Albers_Equal_Area_Conic (NAD 1983) projection (in meters). (The standard EPSG code for this coordinate system is 102008 so I have included it in the shapefile names, but you will not need to use this fact.) Only the contiguous 48 United States were saved (the "lower 48"). Four shapefiles and one dBase file are referenced:

Filename Description
us_a1city_2000_lower48_epsg102008.shp Point feature shapefile showing the location of major US cities (within the lower 48 states) having 2000 US Census population of at least 50,000. [For your information only, not needed for any questions.]
us_f7states_2006_lower48_epsg102008.shp Polygon feature shapefile showing state boundaries for the lower 48 states.
us_e25msa_2000_lower48_epsg102008.shp Polygon feature shapefile of the US 2000 Census boundaries of metropolitan statistical areas (MSAs) within the lower 48 states.
us_f7counties_1996_lower48_epsg102008.shp Polygon feature shapefile from US Census Bureau of the 1996 County boundaries within the lower 48 states together with the state, county name, Federal Information Processing Standard code (FIPS), and estimated 1990 and 1996 population.
ELECT08C.DBF A dBase-formatted table containing two-digit state abbreviation (st_code), state name (state), county name (county), and county FIPS code plus the county-level vote count for John McCain (McCain) and Barack Obama (Obama). The vote counts come from MSNBC reporting of the 2008 Presidential Election data for each US County as published online at "ManyEyes" on 11-11-08: http://manyeyes.alphaworks.ibm.com/manyeyes/datasets/2008-vs-2004-presidential-election-r/versions/1 [This is an interesting IBM-sponsored site for sharing data and visualizations.]

Within the ArcMap document, 11.520test08_start.mxd, the election data in elect08c.dbf has already been joined to the county shapefile using the FIPS code. Data are shown for 3107 of the 3112 counties in the lower 48 states. Votes in the District of Columbia and Yellowstone National Park are not included. In addition, coding differences regarding the handling of county vs. center-city FIPS codes prevent the election data from matching up with the county shapefile for Miami-Dade County and two center-cities in Virginia: Clifton Forge and South Boston. The omission of these 5 counties will not matter for the purposes of this test.

In addition to the shapefiles, dBase table, and ArcMap document, the test08data folder also contains an MS-Access database, 11.520_election08.mdb, with two tables: election08_county is the same as ELECT08C.DBF and election04_county is a similar table with 2004 presidential election results for George W. Bush and John Kerry together with some additional demographic data about the US counties. The data dictionary for the columns in both tables is visible in the 'design view' of the tables within MS-Access. We have also included a raster grid layer, nc_grid, in the test08data folder that divides North Carolina into 10 km grid cells. This grid layer will be used in the last question.

When you open the ArcMap document, you will see, in the Data Frame, a thematic map of vote counts for McCain that uses the US county shapefile after it has been joined to the ELECT08C table. The thematic map shades the number of McCain votes received in each county using a red-to-blue color ramp with 5 categories of 'natural break' classification.

For some of the maps you create, we request a color ramp varying from red to blue (as in the ArcMap document that we provide). Note that ArcMap provides tools that allow you to 'flip' the color scale in the event that your red-to-blue scale has red at the wrong (Obama) end rather than at the Republican (McCain) end. For example, in the 'symbology' window for 'graduated symbols', you can click the 'Symbol' column heading and it will provide an option to 'flip symbols'. You may find these features helpful to get the shading that you want.

Part I: Short Answers (37 Points)

Question I-1 (18 points total, 2 points each for first four, 3 points each for last 3)

Question I-2 (9 points total, 3 points for 'a' and 6 for 'b')

This election map is displayed using a North_America_Albers_Equal_Area_Conic projection (NAD 1983). Local planning agencies typically use projected coordinate systems. For example, the MassGIS mapfiles that we have used are generally saved in the Massachusetts (mainland) State Plane coordinate system. Explain briefly (a) why local agencies prefer to display maps in projected coordinates rather than lat/lon, and (b) two noticeable changes in the visual appearance of the US election map if it were instead displayed in latitude/longitude coordinates (such as the geographic coordinate system World, WGS 1984).

Question I-3 (10 points total)

Part I-3A (5 points): In the ArcMap document, the 'Bad Election Map A' map looks strange. It shades votes for McCain using a natural-break classification. We all know that the Republicans are stronger in the rural areas and the news reports tend to use red colors where Republicans won and blue colors where Democrats won. This map does use red for the counties where McCain received more votes and blue where he received fewer votes. Yet the map is mostly blue in the rural parts of the country and red along some of the coasts. Explain briefly why this is the case.

Part I-3B (5 points): Explain briefly your choice of attribute field, symbology, and classification method in order to display a thematic map using the data in ELECT08C.DBF that presents a better indication of the geographic pattern of the voting outcomes for the contest between John McCain and Barack Obama. (You do not need to turn in a map at this point - just explain what you would do and why.)

Part II: Spatial Analysis and Mapping Using ArcGIS (63 Points)

For this portion of the test, you will want write access to some of the test data so be sure you have copied the test08data folder to a local drive before you use ArcMap to open your copy of the 11.520test08_start.mxd document on your local drive. Our questions will focus on North Carolina, one of the large 'battleground' states that was heavily contested in the recent election.

Question II-1 (12 points)

Create a 'Definition Query' for the US County map layer so that only those counties in North Carolina are included. (For the rest of the test we will focus only on North Carolina and this restriction will speed up the processing. For even faster processing, you can export the counties of North Carolina into a new shapefile for use in the remainder of the test.) Also change the coordinate system of the Data Frame so that, instead of the North_America_Albers_Equal_Area_Conic projection, you use 'NAD_1983_StatePlane_North_Carolina_FIPS_3200'. (Hint: Set the properties of the Data Frame to be the appropriate pre-defined coordinate system.) Zoom in on North Carolina and shade the counties based on the percentage of the votes that Obama received. (Note that ELECT08C.DBF only has the votes for McCain and Obama and not for third-party candidates that were on the ballot in various states. For the purposes of this test, we will ignore all third-party candidates, so you should compute the percentage of votes for Obama as (100 * [Obama votes]) / ([Obama votes] + [McCain votes]).) Use quantiles with 5 categories and a red-to-blue color range with blue indicating more support for Obama. Turn on the MSA layer with 50% transparency [see the Display tab in the Layer Properties dialog window] and an appropriate color or shading pattern so you can both read the thematic map and visually distinguish which counties fall within the MSAs.
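As a rough illustration only (you would normally do this with the ArcMap field calculator, not a script), the two-party percentage described above can be sketched in Python. The function name and the zero-vote guard are assumptions made for this sketch; the sample inputs are the statewide North Carolina totals printed later in Part II-2B.

```python
def obama_percent(obama_votes, mccain_votes):
    """Two-party Obama share in percent, ignoring third-party candidates.

    Returns None when a county has no recorded votes for either candidate
    (an assumption of this sketch, to avoid dividing by zero).
    """
    total = obama_votes + mccain_votes
    if total == 0:
        return None
    return 100.0 * obama_votes / total


# Statewide North Carolina totals as printed in the Part II-2B table.
nc_share = obama_percent(2116338, 2101837)
print(round(nc_share, 1))
```

The same expression, applied county by county, gives the attribute you would then classify into 5 quantiles.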

Turn in a PDF file showing a layout view of the North Carolina thematic map that you create. Be sure to have your name and Athena userid on the map. Also be sure to project the map to North Carolina State Plane coordinates (NAD83) and have the MSA layer on top with symbology that makes them clearly visible. Include a North Arrow and legend as well. The data sources are 2008 US Presidential Election (MSNBC data reported in ManyEyes) and the MIT GeoData Repository.

Question II-2 (23 points total)

The newspapers have made a lot of the urban/rural dichotomy - Obama was strong in the cities and McCain was strong outside the cities. Let's look at the vote in counties that are within vs. outside the MSAs:

Part II-2A (8 points): Highlight those North Carolina counties that have their centroids outside the MSAs. How many North Carolina counties have their centroids outside MSAs? ___________ Turn in a PDF file showing a layout view of North Carolina counties and MSAs with those counties clearly highlighted that have their centroids outside the MSAs.
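For those curious about what a 'centroid outside the MSA' test involves geometrically, here is a minimal Python sketch. It is not how ArcMap implements its selection tools; it assumes simple (non-self-intersecting) polygons given as lists of (x, y) vertices, and the square 'MSA' and rectangular 'counties' are made-up shapes for illustration.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross            # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(point, ring):
    """Ray-casting test: True when the point lies inside the ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up shapes: one square "MSA" and two rectangular "counties".
msa = [(0, 0), (10, 0), (10, 10), (0, 10)]
county_a = [(2, 2), (4, 2), (4, 4), (2, 4)]    # entirely inside the MSA
county_b = [(8, 2), (16, 2), (16, 6), (8, 6)]  # overlaps, centroid outside

for name, ring in [("county_a", county_a), ("county_b", county_b)]:
    centroid = polygon_centroid(ring)
    print(name, centroid, point_in_polygon(centroid, msa))
```

Note that county_b overlaps the MSA but its centroid falls outside, which is why a centroid-based selection can differ from an 'intersects' selection.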

Part II-2B (11 points): Examine the attribute table of election results in order to fill in the eleven blanks in the following table:

North Carolina   Number of   McCain      Obama       Difference       Total       McCain    Obama
                 Counties    Votes       Votes       (Obama-McCain)   Votes       Percent   Percent
In MSAs          ________    ________    ________    137197           ________    ______    ______
Outside MSAs     ________    ________    ________    -122696          ________    54.9%     45.1%
All counties     ________    2,101,837   2,116,338   14,501           4,218,175   49.8%     50.2%
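The rows of this table are tied together by simple arithmetic: the 'In MSAs' and 'Outside MSAs' rows must add up to the 'All counties' row, and each percentage is a vote count divided by the row's total. A short Python check using only the figures printed in the table (the variable names are ours) shows the idea, and the same checks are a useful sanity test for the blanks you fill in.

```python
# Figures printed in the table above.
all_mccain, all_obama = 2101837, 2116338
diff_in_msa, diff_outside_msa = 137197, -122696

all_total = all_mccain + all_obama
all_diff = all_obama - all_mccain

print(all_total)                                    # total two-party votes
print(all_diff)                                     # Obama minus McCain
print(diff_in_msa + diff_outside_msa == all_diff)   # sub-rows sum to the total row

mccain_pct = round(100.0 * all_mccain / all_total, 1)
obama_pct = round(100.0 * all_obama / all_total, 1)
print(mccain_pct, obama_pct)
```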

Part II-2C (4 points): Briefly interpret your results. Is McCain a lot stronger in those counties outside the MSAs?

Question II-3 (16 points total)

Next, consider the election04_county table (in the MS-Access 11.520_election08.mdb database). This table shows the number of votes that George W. Bush and John Kerry received in each US county in the 2004 presidential election. Compute the total number of votes for Bush + Kerry in each county and then compute the difference in the total number of votes (for Republican and Democratic candidates) between the 2008 and 2004 elections. Call this difference delta_votes = (2008 total) minus (2004 total). Use the FIPS code to join this table to your county map. Now, let's examine whether the changes in turnout between the 2004 and 2008 elections across the North Carolina counties favored McCain or Obama.
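You will most likely do this computation with an MS-Access query, but the logic of the join and the delta_votes calculation can be sketched in a few lines of Python. Everything in the sample data is hypothetical: the vote counts are made-up placeholders, the dictionary field names are our own, and the FIPS keys are used only as example keys. FIPS codes are kept as text here on the assumption that this avoids losing leading zeros for states such as Alabama.

```python
# Hypothetical rows keyed by county FIPS code; all vote counts are placeholders.
votes_2008 = {
    "37001": {"mccain": 120, "obama": 100},
    "37003": {"mccain": 80, "obama": 95},
}
votes_2004 = {
    "37001": {"bush": 110, "kerry": 90},
    "37005": {"bush": 50, "kerry": 40},
}


def delta_votes_by_fips(v08, v04):
    """Return {fips: (2008 total) - (2004 total)} for counties in both tables."""
    result = {}
    for fips, row08 in v08.items():
        row04 = v04.get(fips)
        if row04 is None:
            continue  # no 2004 match for this FIPS code, so skip it (inner join)
        total08 = row08["mccain"] + row08["obama"]
        total04 = row04["bush"] + row04["kerry"]
        result[fips] = total08 - total04
    return result


deltas = delta_votes_by_fips(votes_2008, votes_2004)
print(deltas)  # {'37001': 20}
```

Counties that appear in only one table drop out of the result, which is worth keeping in mind when your totals do not match the number of counties you expect.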

Part II-3A (4 points): What is the total number of 2004 votes for either Bush or Kerry across all North Carolina counties? ___________

Part II-3B (2 points each): What county in North Carolina had the largest increase in votes cast (for Republicans or Democrats) in the 2008 election compared with the 2004 election? That is, what North Carolina county had the largest delta_votes? County FIPS = _________? County Name = _____________? delta_votes = ___________?

Part II-3C (3 points each): What is the sum of delta_votes for those North Carolina counties outside the MSAs? _____________? What is the sum of delta_votes for those counties in the MSAs? ________________?

Question II-4 (12 points total)

Instead of simply determining which counties are inside or outside of MSAs, we decide to measure distance from MSAs. Let's use ArcMap's spatial analyst for this purpose. To save you time, we have already rasterized North Carolina into 10 kilometer grid cells (using the same North Carolina State Plane, NAD-83, projection mentioned above). We have saved this coverage for your use under the name: nc_grid. (The cell values in the grid are the FIPS codes of the County containing the center of the grid cell. This encoding makes it easier to see County boundaries when shading the grid cells but you will not need to utilize these grid cell values for the test.)

Use the Spatial-Analyst/Distance/Straight-line function to compute a new raster layer whose cell value is the straight-line distance (of the center of the grid cell) to the nearest MSA boundary within North Carolina. (Note: The grid cell distance computation will measure distance to all MSAs that are not 'masked off' by whatever mask you have set. So, if you set the mask to be nc_grid, then the distance operation will only consider MSAs within the North Carolina grid.) BEWARE of the usual spatial analyst cautions - you will not be able to do this raster-cell distance computation unless the 'spatial analyst' extension is turned on in Tools/Extensions and the 'spatial analyst' toolbar is turned on in View/Toolbars, and the Data Frame is set to a projected coordinate system, and you have set all the usual properties in the Spatial-Analyst/Options dialogue box. (So much for simple spatial analysis!) [The nc_grid layer is already in the desired projected coordinates with the appropriate cell size and the like - just be sure your Data Frame is using the same projection.]
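To make the role of the mask concrete, here is a small brute-force Python sketch of a straight-line distance grid. It is not the Spatial Analyst algorithm, only an illustration of the idea: the tiny 3 x 4 grid, the function name, and the boolean-list representation are all assumptions of this sketch, and the 10,000 cell size assumes map units of meters.

```python
import math


def distance_grid(source, mask, cell_size):
    """Distance from each unmasked cell center to the nearest source cell center.

    source and mask are equally sized lists of lists of booleans.
    Cells outside the mask get None, and masked-off source cells are ignored.
    """
    rows, cols = len(source), len(source[0])
    sources = [(r, c) for r in range(rows) for c in range(cols)
               if source[r][c] and mask[r][c]]
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or not sources:
                continue
            out[r][c] = cell_size * min(
                math.hypot(r - sr, c - sc) for sr, sc in sources)
    return out


# Toy example: two "MSA" cells, one of which lies outside the mask.
source = [[True, False, False, False],
          [False, False, False, False],
          [False, False, False, True]]
mask = [[True, True, True, True],
        [True, True, True, True],
        [True, True, True, False]]

grid = distance_grid(source, mask, 10000.0)

# Farthest unmasked cell from any unmasked source cell.
far_dist, far_cell = max(
    (d, (r, c))
    for r, row in enumerate(grid)
    for c, d in enumerate(row)
    if d is not None)
print(far_cell, round(far_dist))
```

Because the source cell in the lower-right corner is masked off, every distance is measured to the upper-left source cell only, which is the behavior the note about masks describes.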

We don't have time on this test to do much raster exploration of the voting data. It will be enough to do the following:

Part II-4A (6 points): Determine that grid cell within North Carolina that is farthest from any MSA (within North Carolina). What is the distance from that grid cell to the nearest MSA? ____________ What county is underneath the center of that grid cell? ______________ What is delta_votes for that county? _______________

Part II-4B (6 points): Shade the grid cells using a red-to-blue color scale with red for those cells farthest from MSAs. Turn in a PDF file of your North Carolina map, taking care to set the transparency so that the shaded grid cells are visible underneath the MSAs. Be sure to highlight the grid cell from Part II-4A that is farthest from the MSAs. (Hint: In case you aren't familiar with setting the transparency level for a layer, you can set it from the Display tab of the 'layer properties' window.)

That's all for the test, but feel free to keep the election data and explore some of the election patterns further when you have time. Try using 'zonal statistics' to get the average distance of each rural county from MSAs, then plot a scattergram of distance versus percent of vote for McCain. Did Obama increase the 2008 turnout (compared to 2004) by a greater percentage in those counties that he won? That favored Kerry in 2004? That are furthest from MSAs? What happened in other swing states? What about counties with high/low percentages of minorities? Enjoy!



Last modified 15 November 2008. (Joe Ferreira)
