11.188/11.205 Lab #2

Massachusetts Institute of Technology
Department of Urban Studies and Planning

11.188: Urban Planning and Social Science Laboratory

Lab Exercise 2: Database Operations in ArcGIS

Out: Monday, Feb. 19, 2019, & Due: Monday, Feb. 25

[Click here for today's in-class notes. ]

Overview

In this exercise, you will use the Cambridge housing sales and census block group data to continue exploring socioeconomic patterns across Cambridge. In order to do this, you will use ArcMap's database capabilities involving query selection, tabular joins, and spatial joins. The purpose of this part of the lab exercise is to introduce ArcMap's ability to perform analyses that depend upon data queries and the juxtaposition and manipulation of spatial features. Much of the analytic power of GIS comes from its capacity to compute and then manipulate and visualize the geospatial relationships among selected events and locations. We'll illustrate a few basic spatial analysis techniques using data sets from the previous ArcMap lab exercise - namely, the 1989 housing sales data for Cambridge and 1990 Census data for Cambridge block groups. (Subsequent labs will cover additional database capabilities by linking ArcMap to other database management tools, such as MS-Access.) Also, this exercise provides opportunities to set an appropriate data coordinate system. In this particular lab you will turn in your answers by filling out a form as well as generating maps in PDF format. As you work on the exercise, edit your own copy of the following form (http://web.mit.edu/11.188/www/labs/lab2/lab2ex.html ) and submit the copy with your answers to the appropriate Stellar 'turn in' site.

(1) Merging, editing, and saving tables: To do this lab, we'll have to learn a few more ArcMap techniques for merging and saving tables. There are hundreds of census variables and any Cambridge block group map is likely to include only a small subset of them. To use other census data, we'll have to load these data into ArcMap and "join" them with the attribute table of the Cambridge block group theme using a common geographic reference variable (such as the state-county-tract-block group identifier, Stcntrbg). We may also need to compute new fields that normalize or otherwise combine multiple columns of the original data. But the official class datasets are "read-only" and ArcMap requires that you have write access to attribute tables in order to create and calculate new fields. So we'll also need to learn how to save local "writeable" copies of (extracts of) our class datasets.

(2) Spatial Joins: Suppose we'd like to compare the sales prices of Cambridge homes with the socioeconomic characteristics of their neighborhoods. For example, we might want to know whether the high-priced sales tended to be in neighborhoods with highly educated adults. By drawing a pin-map of high-priced 1989 sales locations on top of a Cambridge block group map, we may be able to see a pattern. The block group map can be thematically shaded by educational attainment levels. With this map, we can 'see' whether high/low-priced sales cluster in neighborhoods with, for example, more (or less) educational attainment. But the pattern may be misleading or hard to interpret and we may want to quantify the relationship to measure the degree of association and to tag the sales data with some of the characteristics of their neighborhoods. Much of the benefit of using GIS software depends on its tools for cross-referencing and reinterpreting data based on spatial location. We'll defer until later labs the more general and advanced tools for tagging data based on spatial proximity and we'll focus in this lab on simpler spatial joins and buffer creation.

(3) Setting coordinate system: So far, the coordinate system of the shapefiles we have used has already been preset for your use. However, in many cases, you may need to specify or change the coordinate systems associated with GIS datasets that you acquire. In this exercise, you will learn how to apply a suitable coordinate system to your datasets.

I. Setting up your Work Environment

Follow the usual routine to set up your work environment:

Log onto a WinAthena PC
Launch a web browser, access the class webpage, and find the link for the current lab: http://mit.edu/11.188/www/labs/lab2/lab2.html
Ensure that you have access to the class data locker (Z:\afs\athena.mit.edu\course\11\11.188\data) as explained in earlier lab exercises. (If drive Z:\ is not listed in Windows Explorer as an available drive under 'Computer', you can usually fix the problem by logging out and logging right back in again.) In this exercise, we will assume that the class data locker has been attached as drive Q:, but you may also navigate to the class locker directly on drive Z:. (Since some machines have already mapped drive M: to another network locker, we suggest mapping the class locker to drive Q:.)
Most of the shapefiles that you will need for this exercise are the ones from the previous lab exercise that were zipped into 'cambridge_shapefiles.zip' in the class data locker (Q:\data\cambridge_shapefiles.zip). Once again, copy and extract this zipped file into C:\temp so the shapefiles are available on a writeable, local disk drive. You will need one more shapefile, middle_county.shp, for the lab exercise. This shapefile can be found in the folder Q:\data\lab2_county. Try copying this shapefile into C:\temp\cambridge_shapefiles using ArcCatalog (either launched separately or via Windows/ArcCatalog within ArcMap). Windows Explorer can also be used to copy the shapefile, but you must be sure to copy all six files whose names start with 'middle_county' in order to copy the entire shapefile.
Start ArcMap.
- Start ArcMap by double-clicking on C:\temp\cambridge_shapefiles\11.188_lab2_web.mxd, or by starting a blank ArcMap session and adding the needed shapefiles. However, if you start with a blank ArcMap, the MassGIS and ESRI web services will not be available to provide useful background images.
- (Optional) Check the location of your 'scratch drive': When ArcMap needs temporary disk space for various calculations, it writes data into a 'scratch' workspace. The default location is under your portion of C:\user\<yourid>\.... On personal machines, this is part of the local drive C:. However, on MIT lab machines, the default space is relocated to your network locker (on drive H:). This setting will work but can lead to slow performance. In Lab #1, we reset our default scratch space (including the default file geodatabase, scratch.gdb) to be in C:\temp. On WinAthena machines, we suggest that you reset your 'scratch' space to be C:\temp (or whatever other local portion of drive C: or drive D: is writeable). You can reset the scratch drive by choosing the GeoProcessing menu option, and then the Environments settings. Finally, click the Workspace entry and enter C:\temp into the Scratch Workspace box.
- (Optional) Select File > Map-document-properties from the main ArcMap menu and check the location of the Default Geodatabase. It should be C:\temp\scratch.gdb if, in Lab #1, you managed to successfully reset your default geodatabase to be C:\temp\scratch.gdb. If not, you can redo the part of the instructions in Lab #1 that used ArcCatalog to create a new file geodatabase called scratch.gdb in C:\temp. Changing the location of the default file geodatabase will only affect the performance of ArcMap, not the results. For this lab, the files are small enough that performance will not suffer if the default geodatabase remains on your H: drive. But you will want to get used to checking its location and remembering how to reset it so that you know how to change the defaults to local drives when we start using larger datasets.
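The warning above about copying every file that belongs to a shapefile can be illustrated with a short Python sketch. It is not part of the lab workflow (ArcCatalog does this for you), and the function name `copy_shapefile` is a hypothetical helper introduced only for illustration. It assumes that all parts of a shapefile share the same base name and differ only in their suffix.

```python
import shutil
from pathlib import Path


def copy_shapefile(src_dir, base_name, dst_dir):
    """Copy every file named base_name.* (the .shp, .shx, .dbf, ...) to dst_dir.

    Returns the sorted list of file names that were copied.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for part in sorted(src_dir.glob(base_name + ".*")):
        if part.is_file():
            # copy2 also preserves timestamps where the file system allows it
            shutil.copy2(part, dst_dir / part.name)
            copied.append(part.name)
    return copied
```

On a lab machine, a call might look like `copy_shapefile(r"Q:\data\lab2_county", "middle_county", r"C:\temp\cambridge_shapefiles")`, assuming drive Q: is mapped as described above.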

II. Examining Attribute Data

As in previous labs, we will use Cambridge data for this exercise. In ArcMap, you should by now have added the Cambbgrp.shp shapefile. Be sure to get the shapefile version of Cambridge block groups. (The shapefile is recognizable by the *.shp suffix, if you have the visibility of suffixes turned on, and by the shapefile icon.) For clarity, rename your Data Frame to "Lab Exercise 2." You can do this by double-clicking the data frame name (Layers by default) and then editing the name on the General tab. While on the General tab, you should also set the "Display Units" to miles. Remember: you need to set the Map and Display units in every new view you create in order for ArcMap to interpret the coordinates properly and generate correct scale bars and distance measurements. The Map units tell the software the measurement units associated with the X,Y values actually saved in the shapefile, whereas the Display units choose the measurement units to be used "on-screen" when the shapefile is displayed in the ArcMap window. Make a habit of checking the 'scratch' space and display units immediately after adding the first layer to a new map document.

A. Simple Queries

In an earlier lab, we learned how to use the Info button to provide attribute information about particular spatial features. We also learned how to open the attribute table of a geographical layer.

Now we will experiment with querying the data. Select "Select by Attributes" under the "Selection" menu. In the "Select by Attribute" window, make sure to choose Cambbgrp for the "layer" window. You may have to pick the correct layer from the drop-down list. In the method box, keep the default choice "create a new selection".

Now you can type your query manually into the box in the bottom left corner of the window, or you can use the tools at the top of the window.

Let's find the block groups with median household income in excess of $35,000 per year. In the "Fields" list, double-click on "Med_hh_inc". Then single-click on the ">" button. Next, you can type in 35000 or click on "Get Unique Values" and scroll down the "Values" list until you can double-click on 35000 (or the next lowest value if no block group had a value of exactly $35,000). The query entry window should now show "Med_hh_inc" > 35000. (NOTE: the illustrative graphics use $40,000 as the cutoff but your graphics should use $35,000.)

Fig. 1. Query Window

The window should finally look like Fig. 1. Note that attribute names are enclosed in double quotes and numerical values do not use quotes (text values would use single quotes). Click the "Apply" button to run the query and 'select' these block groups. Notice that all block groups where the median household income is greater than $35,000 are now highlighted in the data display area. If you open the attribute table, you will find that the associated records are also highlighted. You may need to scroll up or down in the attributes window to see the highlighted records. If you click the Selected button, which is at the bottom of the attributes window, only the selected records will be visible. The number of selected records is reported next to the Selected button.
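Conceptually, the attribute query above is a filter over the rows of the attribute table. The following Python sketch illustrates the idea outside of ArcMap. The records and identifiers are made up for illustration, and `select_by_attribute` is a hypothetical helper, not an ArcMap function.

```python
# Hypothetical records standing in for rows of the Cambbgrp attribute table.
block_groups = [
    {"STCNTRBG": "A", "MED_HH_INC": 0},
    {"STCNTRBG": "B", "MED_HH_INC": 28500},
    {"STCNTRBG": "C", "MED_HH_INC": 35000},
    {"STCNTRBG": "D", "MED_HH_INC": 41250},
]


def select_by_attribute(rows, field, predicate):
    """Return the rows whose value in `field` satisfies `predicate`."""
    return [row for row in rows if predicate(row[field])]


# Equivalent of the query "MED_HH_INC" > 35000 (strictly greater than).
selected = select_by_attribute(block_groups, "MED_HH_INC", lambda v: v > 35000)
```

Note that the block group with an income of exactly 35000 is not selected, because the query uses ">" rather than ">=".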

B. Statistics and Selected Sets

Now close the query window. Make sure the selected records are still highlighted. Select Statistics from the Selection menu. You should see a selection statistics window. Choose Med_hh_inc in the field drop-down list (Fig. 2). You should see a window listing some common statistical measures describing the 28 selected records of the Med_hh_inc field. It also generates a frequency distribution diagram (although this diagram may not be visible if you work with an older version of ArcGIS, because of font compatibility issues on some machines).

Now close the selection statistics window and return to the ArcMap window. Swap the selected and unselected block groups in the following way: 1. Open the attribute table; 2. Click the Table Options icon (in the upper left of the window); 3. On the pop-up menu, select Switch Selection. There is also a separate icon to switch selections; see the image from the attribute window (Fig. 3). (NOTE: the illustrative graphic used a $40,000 cutoff rather than the $35,000 cutoff that you are asked to use.)


Fig. 2. Statistics Window	Fig. 3. Switch Selection

Select Statistics from the Selection menu again to examine those block groups with Med_hh_inc <= 35000. Write the count, mean, and standard deviation for these block groups on your lab assignment. (There is more than one way to access the selection statistics. With the attribute table open, try right-clicking on the column heading of the attribute field whose statistics you wish to compute.)

Note from the statistics for these block groups that at least one block group had a median household income of zero. Let's look at these block groups using the Select by Attributes function. We find that three block groups have zero income. Two of these are near Tech Square and one is near Harvard Square (where students in dorms count as people but not households). Let's exclude these block groups from consideration and recompute the statistics for those block groups with a median income of $35,000 or less. Specify the selection criterion as "MED_HH_INC" > 0 AND "MED_HH_INC" <= 35000. Then use the Selection Statistics menu to compute the mean and standard deviation for those block groups with median household income greater than zero and less than or equal to $35,000. Record these two numbers on your lab assignment.

Since we consider the census data to be unreliable for the three block groups with '0' income, we may wish to exclude them from consideration entirely. One way to do this is to set a definition query in the layer properties for Cambbgrp. Before doing this, however, be sure to clear all selected features (via Selection > Clear-Selected-Features from the main menu). Double-click on the layer's name to bring up the layer properties window. Under the "Definition Query" tab, type in "MED_HH_INC" > 0. Click OK. Notice that the Cambbgrp layer no longer shades the three block groups with zero income. They have been excluded from the layer and now appear as 'donut holes' in the Cambridge map. The data are still there on disk, but ArcMap is ignoring the three block groups that don't meet the criterion specified in the layer's definition query. For the remainder of the exercise we will exclude these zero-income block groups from our analysis.
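To see why the zero-income block groups matter, here is a small Python sketch that computes selection statistics with and without them. The income values are invented for illustration and are not the Cambridge data. The sketch uses the sample standard deviation from Python's `statistics` module, which is an assumption; ArcMap's statistics window may follow a different convention.

```python
import statistics

# Hypothetical median household incomes (not the actual Cambridge values).
incomes = [0, 0, 22000, 30000, 34000, 52000]


def selection_stats(values):
    """Count, mean, and sample standard deviation of a selected set."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        # stdev needs at least two values; report 0.0 otherwise
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# "MED_HH_INC" <= 35000
with_zeros = selection_stats([v for v in incomes if v <= 35000])
# "MED_HH_INC" > 0 AND "MED_HH_INC" <= 35000
without_zeros = selection_stats([v for v in incomes if 0 < v <= 35000])
```

Comparing `with_zeros["mean"]` to `without_zeros["mean"]` shows how a few zero records pull the mean down, which is the reason for excluding them with a definition query.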

C. Saving and Editing Data Tables

Table 1. EDUCATIONAL ATTAINMENT (Universe: Persons 25 years and over)
Column	Description
EDUTOTAL	Total Persons 25 years and over
EDU1	Less than 9th grade
EDU2	9th to 12th grade, no diploma
EDU3	High school grad (includes equivalence)
EDU4	Some college, no degree
EDU5	Associate's degree
EDU6	Bachelor's degree
EDU7	Graduate or professional degree

Suppose we wanted to generate a thematic map of the percentage of people (aged 25 and older) who have less than a high school education. The EDU variables in the Cambbgrp attribute table provide the required information. They give the 1990 census counts of persons (aged 25 years and over) at each level of educational attainment, as listed in Table 1.

From the listing, we see that the desired percentage equals

100 * ( [EDU1] + [EDU2]) / [EDUTOTAL]

If we loaded Cambbgrp directly from the class data locker, then the attribute table is read-only so we would not be able to edit the table in order to add a new column that computes this percentage. To overcome this problem, we could create (and edit) a local copy of the table. [NOTE: We will go through these extra steps of creating a new table even though we are using a local, writeable copy of the original Cambbgrp shapefile that we copied to C:\temp\cambridge_shapefiles. We do this so you learn how to extract and save portions of the attribute table from a large shapefile that might be an official dataset that you should not change.]

The Cambbgrp attribute table contains several dozen columns and we are concerned only with the geographic identifiers and the EDU fields. So, when we construct our new table, education.dbf, we need only include the geographic identifiers and the EDU fields. Open the layer properties window, go to the Fields Tab. Turn the 'visible' box off for all but the three needed EDU fields (EDU1, EDU2, EDUTOTAL) plus the Stcntrbg blockgroup identifier (so we can join this table back to the appropriate block group later on). To turn off a field, we can uncheck the visible box. Better yet, hold the control key down when you click on a box and the checkmarks will be turned off for all boxes. Now click the four fields that we want and, when you click 'OK' the attribute table will show only those four fields. Open the attribute table (if it is not already open) to confirm this behavior.

Now you need to export the table. Open your attribute table. Click the Table Options Button, which is on the top left of the attribute table window. On the pop-up menu, choose Export. Save the table to a local working directory (such as C:\temp). If you have not registered your working directory with ArcCatalog, you may need to use the connect to folder button to register it first. Save the table as a dBase-formatted table. Call the table education.dbf. The *.dbf suffix will be a handy reminder of the data format for the table and the dBase format is quite portable and easily read by Excel and many other packages.

Fig. 4. Table Options

When asked whether you want to add the table to the current map, click Yes. Now this newly saved-to-disk education.dbf table will be included in your map document file. However, you may not be able to see it in the data frame if the bottom tab is set to Display instead of Source. (When the Display tab is active, only the mappable layers are shown.) Change the view of the data frame from Display to Source, and the education.dbf table will be listed. If you clicked the wrong button when asked to add the table to the map file, it won't be listed, but you can add it in the same way as you add any other map layer or attribute table.

Since you now have write access to your own education.dbf table, you can add a new column for computing your low-education percentages. Right-click the table name in the data frame. On the pop-up menu, select Open. In the new window, click on Table Options and, on the pop-up menu, select Add Field. Call the new field p_lowed and change the data type to float. Click OK, and you will see a new field appear in the window. Next, right-click on the new column name and, from the pop-up menu, select Field Calculator. Click "Yes" when a warning message comes up (about not being able to undo the editing). Then set:

p_lowed = (this top line should appear by default above the box with the cursor and you should not retype it)
100 * ([EDU1] + [EDU2]) / [EDUTOTAL]

Fig. 5. Calculate Values

Sort the table (in descending order) by p_lowed. (Right click on the column name and select Sort Descending.) The highest p_lowed value is 55.5%. What is the fifth highest p_lowed percentage? How many block groups have more than 40% of their adults lacking a high school diploma? Write the values on your lab assignment.
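The Field Calculator expression and the descending sort can be mimicked in a few lines of Python. This is only an illustrative sketch: the rows below are invented, and the guard against a zero EDUTOTAL is an added assumption (the lab itself does not discuss what happens when a block group has no persons aged 25 and over).

```python
# Hypothetical rows standing in for education.dbf (values are made up).
education = [
    {"STCNTRBG": "A", "EDU1": 10, "EDU2": 30, "EDUTOTAL": 200},
    {"STCNTRBG": "B", "EDU1": 5, "EDU2": 5, "EDUTOTAL": 100},
    {"STCNTRBG": "C", "EDU1": 40, "EDU2": 50, "EDUTOTAL": 200},
    {"STCNTRBG": "D", "EDU1": 0, "EDU2": 0, "EDUTOTAL": 0},
]


def percent_low_education(row):
    """100 * (EDU1 + EDU2) / EDUTOTAL, or None when EDUTOTAL is zero."""
    if row["EDUTOTAL"] == 0:
        return None
    return 100 * (row["EDU1"] + row["EDU2"]) / row["EDUTOTAL"]


for row in education:
    row["P_LOWED"] = percent_low_education(row)

# Sort descending by P_LOWED, leaving out rows where it is undefined.
ranked = sorted(
    (r for r in education if r["P_LOWED"] is not None),
    key=lambda r: r["P_LOWED"],
    reverse=True,
)
above_40 = [r["STCNTRBG"] for r in ranked if r["P_LOWED"] > 40]
```

Here `ranked` plays the role of the sorted table, and `above_40` corresponds to counting the block groups with more than 40% of adults lacking a high school diploma.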

D. Merging Data Tables (via Joins and Relates)

Next, we'd like to generate a thematic map of the newly computed p_lowed. In order to do this, we need to 'join' the new table with our percentages to the Attributes of Cambbgrp table that is linked to the map. The STCNTRBG field is a unique identifier that appears in both tables and can help us cross-reference the two tables. First, however, open the layer properties window of the Cambbgrp layer (double-click the name of the layer), go to the Fields tab, and turn the 'visible' box back on for all fields. Then, to link our two tables together, go to the Joins & Relates tab of the same window. You will be joining attributes from a table. In the Join frame, click the Add button. The "Join Data" window appears. It allows you to specify the join criterion. First, select Stcntrbg as the attribute field of the GIS layer that the join will be based on. Second, select your newly created education table (education.dbf) as the table that will be joined to the GIS layer. Since we have already added the table to the map document, we can pick it from the drop-down list. Otherwise, we would need to add it from disk. Third, select Stcntrbg as the field in the table on which to base the join. This tells ArcMap to join the education table to the GIS layer Cambbgrp.

Fig. 6. Join

Apply the join and go back to the ArcMap window. If you are asked to index the layer, select Yes. Open the attribute table of Cambbgrp. We now find the columns of the education table appended to the right side of the table, following the attributes from the Cambbgrp shapefile. Now we are ready to shade a thematic map of the newly computed p_lowed field. Open the layer properties window. Go to the Symbology tab. Classify your p_lowed field using 5 quantiles and eyeball the results. To save time, we won't ask you to print out this map. Just pick a readable graduated color scheme and take a little time to examine the thematic map: Do the patterns you see match your impression of the socio-economic patterns in Cambridge?
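A quantile classification places roughly the same number of features in each class. The Python sketch below shows one simple way to do this by rank. It is an approximation for illustration only: ArcMap's own handling of ties and class breaks may differ, and the function name `quantile_classes` is hypothetical.

```python
def quantile_classes(values, n_classes=5):
    """Assign each value a class index 0..n_classes-1 with near-equal counts.

    Values are ranked from smallest to largest and the ranks are cut into
    n_classes groups of (nearly) equal size.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes


# Hypothetical p_lowed percentages for ten block groups.
p_lowed = [12.0, 3.5, 40.1, 8.0, 25.0, 0.0, 18.2, 31.7, 5.5, 55.0]
classes = quantile_classes(p_lowed, 5)
```

With ten values and five classes, each class receives two block groups, with class 0 holding the lowest percentages and class 4 the highest.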

Please note that the table 'joins' are one-directional. If you join the tables in the reverse order, the Attributes of Cambbgrp table will disappear and you won't be able to map the selections from the expanded education table. Finally, note that these Joins are temporary and do not affect the tables that are stored on disk.
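The one-directional, temporary nature of a tabular join can be sketched in Python as a lookup keyed on the common identifier. The rows and values are invented, and `join_table` is a hypothetical helper rather than anything ArcMap exposes. The sketch keeps every layer row, whether or not it has a match, and leaves both input tables untouched.

```python
def join_table(layer_rows, table_rows, key):
    """Append the table's fields to each layer row that shares the same key.

    Returns new row dictionaries; the inputs are not modified.
    """
    lookup = {row[key]: row for row in table_rows}
    joined = []
    for row in layer_rows:
        merged = dict(row)  # copy so the original layer row is untouched
        for name, value in lookup.get(row[key], {}).items():
            if name != key:
                merged[name] = value
        joined.append(merged)
    return joined


# Hypothetical rows (identifiers and values are made up).
cambbgrp = [
    {"STCNTRBG": "A", "MED_HH_INC": 28500},
    {"STCNTRBG": "B", "MED_HH_INC": 41250},
    {"STCNTRBG": "C", "MED_HH_INC": 33000},
]
education = [
    {"STCNTRBG": "A", "P_LOWED": 20.0},
    {"STCNTRBG": "B", "P_LOWED": 10.0},
]
joined = join_table(cambbgrp, education, "STCNTRBG")
```

Because the layer rows drive the loop, the result always has one row per block group. Swapping the two arguments would instead produce one row per education record, which is the sense in which the join is one-directional.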

III. Spatial Joins and Buffering

Thus far all our query examples have focused on calculations and selection criteria that are done directly on attribute tables. We would also like to be able to perform queries that depend upon the spatial relationships of the spatial features that are related to the rows in the table.

A. Simple Graphical Selection

We are already familiar with the simplest graphical selection tools. Suppose we wanted to use the map to examine a few Cambridge block groups near MIT. This is easily done using the graphical selection tools and the Statistics command from the Selection menu, which we learned in previous labs. In the data display area, select the two block groups along the southernmost edge of Cambridge that contain most of the MIT campus. Now use the Selection Statistics menu item to determine the mean population for those two block groups. It should be 2535.5.

B. Spatial Joins Using a Quick Form of Point-in-Polygon Analysis

Next we'll exercise the "spatial join" capabilities of ArcMap to see whether lower-priced housing tends to be in neighborhoods with relatively low income and low levels of education. We've already mapped the low education percentages, p_lowed, for Cambridge block groups, and the sales89 table (in Q:\data) contains the location of all (1-4 family) residential homes in Cambridge that sold during 1989. These sales data come from a Banker and Tradesman Real Estate Transfer Database, for 1987-1989 (data that Anne Kinsella Thompson acquired for use in her MCP thesis). We can address our question about housing value, income, and education, if we can observe and summarize the extent to which the low-priced housing falls within those block groups with high p_lowed values (or low med_hh_inc values, if we map the median income instead).

Add the sales89 coverage as a layer in your ArcMap window (if you have not already added this shapefile). Open the layer properties window and go to the Definition Query tab. Create a definition query to include in your layer only those sales with a realprice less than $150,000. Use the Query Builder to help you define your selection. You should find that 29 of the 222 sales meet this criterion. Now select the areas of the city (block groups) where these sales occurred. Go to the menu Selection > Select by Location. Specify the criterion as in Figure 7.

Fig. 7. Select by Location

Click the Apply button. Every block group that contains one or more of the low-priced sales will be highlighted. You have just done a basic "point-in-polygon" query to find the set of polygon features (block groups) which contain a set of point features (the low-priced sales). [Note: The 29 low-priced sales were located in 21 of the 90+ Cambridge block groups.]
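For readers curious about what a point-in-polygon query does underneath, here is a minimal Python sketch using the standard ray-casting test. It assumes simple polygons given as lists of (x, y) vertices in planar coordinates, uses made-up geometry, and ignores special cases such as points exactly on a boundary. It is not a description of how ArcMap implements Select by Location.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_containing_points(polygons, points):
    """Return the ids of polygons that contain at least one of the points."""
    return [
        pid
        for pid, ring in polygons.items()
        if any(point_in_polygon(px, py, ring) for px, py in points)
    ]


# Hypothetical block groups (two adjacent squares) and sale locations.
block_groups = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
low_priced_sales = [(5, 5), (3, 2), (25, 5)]
selected = polygons_containing_points(block_groups, low_priced_sales)
```

Two of the three hypothetical sales fall in block group A and the third falls outside both squares, so only A is selected. This mirrors how 29 sales can end up in fewer than 29 block groups.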

Analyzing the results:

While viewing the cambbgrp attribute table, right-click the label at the top of the med_hh_inc column and select Statistics
Check the Mean value for med_hh_inc across these 21 block groups. It should be $30,070.
Go to Selection-Switch Selection.
Check statistics again. The statistics change.
Check the mean value for med_hh_inc across the other 70 block groups (the ones with no sales under $150,000). It should be $38,454.

The lower-priced sales do appear to occur in block groups with somewhat lower income. Next, do the same queries for the p_lowed values that you computed earlier. What are the (unweighted) mean and standard deviation of p_lowed for the 21 block groups that contained all the low-priced homes (realprice < $150,000)? What about the other 70 block groups? Write the values on your lab assignment.

These point-in-polygon queries are useful for quick exploration of the data but the summary statistics are only that -- a summary of the patterns that result. As we might suspect, the general trend suggests that low-priced housing tends to be in lower-income, lower-education neighborhoods. But there is quite a bit of variability, and the tools we've used so far don't let us move the census data from the Attributes of Cambbgrp table over to the appropriate sales89 rows. Doing that would permit us to examine the patterns more closely. In later labs, we'll look at other tools (in the ArcGIS Toolbox) that let us tag the sales89 table with the census data for the block group that contains each sale.

C. Simple Buffering

The "spatial join" in the previous section involved asking which set of spatial features (sales) were completely contained within another set (block groups). Another simple "spatial join" operation is to determine which spatial features are close to other spatial features. Buffering tools are one way to do this.

Suppose we want to check whether the lower-priced housing tends to be closer to the major roads. Let's use ArcMap's buffering tools to create a buffer around Mass Ave and see whether the sales in the buffer are relatively high or low priced.

If it is not already in your ArcMap table of contents, add the cambtigr.shp shapefile to your ArcMap project. This shapefile is the Cambridge dataset that we used in an earlier lab. Use the 'Select by Attribute' capability utilized earlier in this lab to select only those road segments with FNAME = 'Massachusetts'. (There is one street segment on the Mass Ave bridge that has FNAME = 'Masssachusetts Ave'! Don't bother including that link since neighboring Mass Ave links will still generate the buffer we need.) Beware of two issues when making this selection: (1) The field name shows up as 'fname' in the 'definition query' window, but is listed as 'fename' if you use the 'identify' cursor to click on the street links displayed in the map window. That's because column names are allowed to have aliases. Take a look at the 'fields' tab in the layer properties window and you'll see both names! (2) Also, note that some Mass Ave street segments are not selected. That's because the FNAME for those segments is listed as 'State Hwy 2A' rather than 'Massachusetts'. This multiple-name issue could be a problem. In this case, the 55 selected Mass Ave street segments are sufficient to create a buffer that will enclose the others, so we won't need to do extra work to identify those Mass Ave segments that list the route number instead of the street name. ArcGIS has some more elaborate database table schemas to handle such naming and route numbering issues but we won't get into that level of complexity in this exercise.

Let's extract these Mass Ave road segments into a new shapefile before we create our buffer. By doing this, we can save the smaller Mass Ave file on a local drive for better performance. Make sure that the Mass Ave road segments are still highlighted and then right-click the cambtigr layer and choose Data/Export-Data. Keep the default options to export only the selected features and to use the coordinate system of the layer's source data (i.e., the way it is saved on disk). Set the location for the saved shapefile to be in c:\temp\lab2 (or some other local directory that is writeable) and name the shapefile massave.shp. Now you can click 'OK' to save a new shapefile on a local drive with only the Massachusetts Avenue road segments. Also click 'OK' to add this new shapefile to your ArcMap data frame. Finally, open the layer properties window of the new layer. Under the General tab, change the layer name to Mass Ave.

We are now almost ready to use the 'Buffer' tool in the ArcToolbox to create a 750-meter buffer around our 55 Mass Ave road segments. As a final step before creating the buffer, make sure the Data Frame Properties have the map units set to meters and the display units set to miles (right click on data frame > properties > general). (Note, the Data Frame coordinate system was set to a Mass State Plane projection based on the coordinate system information in the first shapefile that we added to our Data Frame. Hence, the map units should already be set and should not be editable within the Data Frame Properties window.)

Open the ArcToolbox window from the geoprocessing menu. Then choose the Buffer option from the 'Proximity' listing under 'Analysis Tools'. Choose Mass Ave for the input feature and enter a path and shapefile name for the output feature that will store it in your writeable sub-directory on the local C: drive (e.g., C:\temp\[Your_athena_name]\Buffer_of_MassAve.shp). In the 'Distance' portion of the buffering window, set the linear unit to be 750 and be sure the units are set to meters. Accept the defaults for the 'side type' and 'end type' (regarding how the shape of the buffer is computed) and change the 'dissolve' setting to 'all' so that overlapping buffers around each street segment are merged. When you've adjusted all these settings, click "Finish" and wait for ArcMap to do the computations. You should end up with a curved sausage-shaped area something like the one in Fig. 9 (except that the 'linear unit' will be 750 and the units will be 'meters').


Fig. 8. Create Buffer	Fig. 9. Massachusetts Avenue Buffer

Since this newly created buffer is a shapefile, we can use it just like any other theme. In particular, we can do an exercise similar to the "point-in-polygon" example we did earlier in this exercise in order to determine which 1989 housing sales are located within the new buffer. Highlight the sales89 theme (with all the sales, not just the low-priced ones) and choose Selection > Select by Location. Select all the sales89 cases that intersect the Buffer of Mass Ave layer. (We used "completely contain" before and "intersect" this time. Do you understand why?) How many of the 222 sales are located within the 750-meter buffer? What are the mean and standard deviation of sale prices for the sales within the buffer? Write the values on your lab assignment.
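Asking whether a sale lies inside a dissolved buffer around a set of street segments is equivalent to asking whether the sale is within the buffer distance of at least one segment. The Python sketch below illustrates this with planar coordinates in meters. The street geometry and sale locations are invented, and the functions are hypothetical helpers rather than ArcGIS tools.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest planar distance from point P to the segment from A to B."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: A and B are the same point.
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(point, segments, distance):
    """True if the point lies within `distance` of any of the segments."""
    px, py = point
    return any(
        point_segment_distance(px, py, *a, *b) <= distance for a, b in segments
    )


# Hypothetical street made of two segments, coordinates in meters.
street = [((0, 0), (1000, 0)), ((1000, 0), (2000, 500))]
sales = [(500, 700), (500, 800), (-600, 0), (-800, 0)]
in_buffer = [within_buffer(p, street, 750) for p in sales]
```

The third hypothetical sale lies beyond the end of the street but still within 750 meters of its endpoint, which corresponds to the rounded end of the buffer shape.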

The sales prices in and out of the buffer aren't all that different (compared with their standard deviation). Also, some parts of the buffer fall outside Cambridge and we don't know about home sales prices in those areas. Before reaching any conclusions, we would want to do further analysis. But this quick tour of spatial selection is enough for today. In subsequent labs, we'll examine many more of the spatial analysis capabilities of ArcMap.

IV. Coordinate systems, Data Frames, and symbolizing lines

In this part, you will get familiar with additional ArcMap tools by adding a map of Middlesex County in a second 'data frame', applying a suitable coordinate system to the Middlesex county layer so the system can use it properly, and replacing the MassGIS street layer with your own symbolization of street segments in the 'magmhda1' major roads layer. At the end of this section, you will prepare a map in PDF format that you will turn in with the lab exercise.

Suppose you would like to create a map layout something like what is shown below in Figure 10. As you can see, there is a small map in the bottom-left corner. It shows cities in Middlesex County, Massachusetts, and it highlights in red the area that is in Cambridge.

**Fig. 10. Map layout**

To create this map, you have to create a new data frame (via Insert/Data_Frame). Note that you can then determine which data frame is visualized in the 'data view' of the map window by right-clicking a data frame in the table-of-contents window and choosing 'Activate'. In 'layout view' maps from both data frames can be shown.

Rename the new data frame to legend map, then copy and paste (or drag-and-drop) the Cambbgrp layer from the original Data Frame into the new one. You will also need to add into the new Data Frame the middle_county shapefile from your copy in C:\temp\cambridge_shapefiles. (In the first part of this exercise, you copied the middle_county shapefile from the data locker (in .\data\lab2_county) into C:\temp\cambridge_shapefiles.) This new map layer shows all the towns within Middlesex County in Massachusetts. When you add the shapefile, you will get a warning message something like this:

**Fig. 11. Warning message**

This message notifies you that the spatial reference information is missing from the county layer and may cause problems for further use. In some cases, you may not get an error message but that does *not* mean that all is O.K. Let's check it out. Open the middle_county layer's layer properties window and select the source tab. It shows that the coordinate system of this layer is "Undefined". Let's fix this. Close the layer properties window and remove the middle_county layer from the legend map data frame (so we can get 'write' permission for ArcCatalog to set the spatial reference information - you cannot change the coordinate system of a layer while it is open in ArcMap). Now launch the ArcCatalog program (Start > All Programs > ArcGIS > ArcCatalog). Beginning with ArcGIS version 10, the ArcCatalog functions are also accessible from within ArcMap by opening the Catalog tab along the right border of the ArcMap window. [NOTE: When you copy and paste the layer, please use ArcCatalog so that all the files that are associated with the layer are properly copied. A GIS layer consists of several files, such as .shp, .shx, .dbf, .sbn, .sbx, and you may miss necessary files when you use Windows Explorer to copy the layer.]

1) In ArcCatalog, click your right mouse button while putting the cursor on the name of the layer, middle_county. The drop-down menu will show up. Click properties from the list. Then the Shapefile Properties window will show up. (Note: the image comes from an earlier ArcMap 10.0 version with different folders and files available. For example, we have asked you to copy shapefiles to C:\temp\cambridge_shapefiles rather than c:\usertemp\lab3.)

**Fig. 12. Open Shapefile Properties Window**

2) Choose the "XY Coordinate System" tab. It will bring up the spatial reference properties window.

**Fig. 13. Spatial Reference Properties Window**

3) Select the appropriate spatial reference system from the list. In this case NAD1983 (meters) Stateplane Massachusetts Mainland FIPS 2001.prj is the right one. Select the spatial reference system and click ADD.

**Fig. 14. Select spatial reference system**

4) Now the spatial reference for the middle_county layer has been set to be Mass State Plane coordinates based on NAD 1983. Click OK. Now go to ArcMap and add the middle_county layer to the legend map data frame. This time no warning message will show up. [NOTE: A Data Frame will set the spatial reference system - i.e., coordinate system - for viewing purposes to match whatever dataset is first added to the Data Frame. In our case, we first added the CAMBBGRP shapefile which uses Mass State Plane NAD83 coordinates. Hence the middle_county shapefile will display properly even if we have not specified the properties of its on-disk XY coordinate system. Do you understand why this is the case?]

**Fig. 15. Set Spatial Reference System**

Knowing how to examine and reset the projection associated with data frames and layers is often handy when integrating data layers that come from different agencies and sources.
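A shapefile stores its coordinate system in a separate .prj file alongside the .shp, so an "Undefined" coordinate system usually means that file is absent. The following is a small sketch, not an ArcGIS tool, that reports which component files sit next to a given .shp path. The function names are hypothetical and the path in the last line is only an example.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")  # mandatory parts of a shapefile
PROJECTION = ".prj"                  # optional; holds the coordinate system

def component(shp_path, ext):
    """Path of the sibling file that shares the shapefile's base name."""
    p = Path(shp_path)
    return p.with_name(p.stem + ext)

def shapefile_report(shp_path):
    """Report missing mandatory parts and whether a .prj file is present."""
    missing = [ext for ext in REQUIRED if not component(shp_path, ext).exists()]
    return {
        "missing_required": missing,
        "has_projection": component(shp_path, PROJECTION).exists(),
    }

# Example call; the result depends on what is actually on your disk.
print(shapefile_report(r"C:\temp\cambridge_shapefiles\middle_county.shp"))
```

Running the report before and after the ArcCatalog steps above should show `has_projection` change from `False` to `True`, assuming ArcCatalog writes the definition to a .prj file next to the shapefile.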

Symbolizing roads in the major roads layer:

By turning on the 'EOT Major Roads' layer within the 'Massachusetts Data from MassGIS (GeoServer)' geoservices, we can overlay symbolized major roads on top of our map. But the MassGIS web services are not always available, and we have limited control over their use and symbology. The majmhda1.shp shapefile included with today's datasets has the major roads in Massachusetts. If the shapefile were not already included, we could click on the Add layer button, and add majmhda1.shp from M:\data . This is a MassGIS (Massachusetts Geographic Information Systems) Major Roads Datalayer created in December 2000. However, it will be faster to use the local (C:\temp) copy already on our local drive.

Turn off the MassGIS geoservices, and let's symbolize our major roads shapefile, majmhda1.shp, to display and symbolize an equivalent representation of major roads. First, change the name of the majmhda1.shp layer to Major Roads to make the entry in the table of contents more descriptive. Next, open the Layer Properties window, click the symbology tab, and set the properties as follows:

Categories: Unique Value
Values Field: Class
Color Schemes: (don't change)
Then click Add All Values button in the bottom of the symbology tab

This is the first time we have used the "Unique Value" legend type. This type of legend is often useful when a field takes on only a limited number of discrete values. In this case, the field Class assumes only four values: 1, 2, 3, and 4. Each value represents a different type of road. These numbers are not very descriptive, so we will want to include text labels for each value. But what do the numbers mean? For that, we can consult the MassGIS metadata for State roads. The current MassGIS web pages describe a more recent version of the State Road layers. However, the definition of the Class attribute is still applicable. Use the descriptions for the values of Class to label your legend.

By default, ArcMap does not choose a very attractive symbolization scheme for the roads. You will need to adjust the symbols manually. Set the symbols as described in Table 1.

**Table 1. Road Symbolization**
Value	Color	Size	Style
1	Red	3	Solid
2	Blue	2	Solid
3	Dark Gray	0.1	Solid
4	Dark Gray	0.1	Dotted

When you're done, your layer properties-symbology window should resemble Fig. 16. Note that you can edit the text in the 'labels' column if you need to change or expand the description of a category. If you make any edits, be sure to 'apply' your changes! Finally, turning on every class in this symbolized map of major roads will unduly clutter up your map. Exclude all but classes 1 and 2 (the really major roads) and display only these two classes on your map.
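Conceptually, a "Unique Value" legend is a lookup table from each value of the Class field to a symbol, plus a rule for which values are drawn. A minimal sketch of that idea, using the settings from Table 1, is below. The road segments and the function name are hypothetical, and this is not how ArcMap stores its symbology.

```python
# Symbol settings taken from Table 1, keyed by the value of the Class field.
ROAD_SYMBOLS = {
    1: {"color": "Red", "size": 3, "style": "Solid"},
    2: {"color": "Blue", "size": 2, "style": "Solid"},
    3: {"color": "Dark Gray", "size": 0.1, "style": "Solid"},
    4: {"color": "Dark Gray", "size": 0.1, "style": "Dotted"},
}
DISPLAYED_CLASSES = {1, 2}  # only the most major roads are drawn

def symbol_for(road_class):
    """Return the symbol for a road class, or None if the class is not drawn."""
    if road_class not in DISPLAYED_CLASSES:
        return None
    return ROAD_SYMBOLS[road_class]

# Hypothetical road segments, each with a Class attribute.
segments = [
    {"name": "Segment A", "Class": 1},
    {"name": "Segment B", "Class": 3},
    {"name": "Segment C", "Class": 2},
]

for seg in segments:
    print(seg["name"], symbol_for(seg["Class"]))
```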

**Fig. 16. UniqueValue Legend Window**

Include in your lab exercise a PDF of your version of the Fig. 10 map above (but with your overlay of the most major roads or your classification of the education variable).

V. Lab Assignment

Today's lab assignment has 6 questions and the one map from Part IV. Please upload your answers to the appropriate Stellar site (as you did for Lab #1). The lab is due Monday, February 25, 2019 before class. NOTE: Before logging out from the workstation, be sure that you have copied to your private network locker or a thumb drive any new or changed ArcMap documents, shapefiles, coverages, etc. that you have created on the local disk (e.g., C:\temp\..., which is visible not only to you and may be deleted by other users). You may want to use ArcCatalog rather than Windows Explorer to copy the files to be sure that all the required files that make up a shapefile or coverage have been copied appropriately to your network locker.
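If you prefer to script the backup, the idea is to copy every file that shares the shapefile's base name, not just the .shp. Here is a minimal sketch under that assumption. The function name is hypothetical, it handles shapefiles only (not coverages or map documents), and ArcCatalog remains the safer choice.

```python
import shutil
from pathlib import Path

def copy_shapefile(shp_path, dest_dir):
    """Copy every file that shares the shapefile's base name into dest_dir."""
    src = Path(shp_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    prefix = src.stem + "."  # matches name.shp, name.dbf, name.shp.xml, ...
    copied = []
    for part in sorted(src.parent.iterdir()):
        if part.is_file() and part.name.startswith(prefix):
            shutil.copy2(part, dest / part.name)
            copied.append(part.name)
    return copied
```

Calling `copy_shapefile` with the path to your buffer shapefile and a folder in your network locker would return the list of file names it copied, which you can compare against the components you expect (.shp, .shx, .dbf, .prj, and so on).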

Created by Thomas H. Grayson and Joe Ferreira.
Modified Sept. 16 2003-17 by Jeeseong Chung, Jinhua Zhao, Xiongjiu Liao, Mi Diao, Yang Chen, Yi Zhu, Eric Schultheis, and Hongmou Zhang.
Last modified 13 Feb. 2019 by [rqadri].
