Massachusetts Institute of Technology - Department of Urban Studies and Planning

11.188: Urban Planning and Social Science Laboratory

11.520: A Workshop on Geographic Information Systems

Georeferencing, Data Creation & Advanced Raster Operations

 

April 3, 2018, Joseph Ferreira & Eric Huntley
(based in part on '06 Lecture by Michael Flaxman)

 


Administrative

  • Lab #7 (raster analysis): due Monday, April 9
  • Homework #3: Raster Analysis and ModelBuilder (now online)
    • Part 1 (raster) due Wed., April 26
    • Part 2 (discussion and model builder) due Wed., May 3
  • Lab #8 (web services): mapping field-collected data, due Mon., May 1
  • In-class test: solutions are posted and graded tests have been uploaded to Stellar

Today

  • Creating GIS Data from non-GIS sources
    • Previously: create point shapefile from X,Y points in a table (e.g., centroids of blockgroups)
    • Discuss other ways today:
      • From imagery, CAD files, digitizing
      • From mailing addresses (geocoding, georeferencing, or address matching)
      • Link to additional notes on spatial feature creation - Lulu Xue's powerpoint slides (for ArcGIS v. 9.3)
  • Introduce advanced raster analysis methods
  • Get ready for field work in next Monday's lab: tagging neighborhood locations with GPS location, timestamp, and user classification

 

Creating GIS Data from Non-GIS Sources

Common sources

    Raw Imagery

    CAD Files

    Digitizing

    Addresses

Different sources, different methods

Raw Imagery

  • Georeference (if necessary)
  • Classify (by Color/Spectral Characteristics) or
  • Digitize (aka Trace)

  • GeoReferencing Imagery


    Note: JPG and TIF images can be read directly into ArcGIS. But by default, they won't have an appropriate coordinate system and won't overlay correctly on anything else. (JPEG 2000 and GeoTIFF are standard formats that can save coordinate system metadata along with the image, although they are not always supported.)

    So we need some data with a coordinate system we trust. (Warning: Google Earth, etc. can be *very* imprecise internationally - see error in this image below).

    Georef Step 2


    So, now we have a valid coordinate system, but our image is clearly pretty far from being correctly registered.

    The solution? The "Georeferencing Toolbar" (View->Toolbars->Georeferencing). This tool lets you link corresponding locations in the image and in the reference layer by clicking on the screen:
    • If you identify one pair of corresponding locations, the software computes a simple shift.
    • If you identify more than one pair, the software fits a more complex transform (linear, or fancier).
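To make the difference between the two cases concrete, here is a minimal Python sketch. It is an illustration only, not the computation ArcGIS performs: the function names and coordinates are hypothetical, and the two-link case is assumed to be a similarity transform (shift, rotation, uniform scale), which is one simple kind of linear transform.

```python
def shift_from_one_link(src, dst):
    """One link: the image is simply translated by (dx, dy)."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    return lambda pt: (pt[0] + dx, pt[1] + dy)


def similarity_from_two_links(src1, dst1, src2, dst2):
    """Two links: solve z' = a*z + b exactly (shift, rotation, uniform scale).

    Points are treated as complex numbers x + iy, so multiplying by `a`
    rotates and scales, and adding `b` shifts.
    """
    s1, s2 = complex(*src1), complex(*src2)
    d1, d2 = complex(*dst1), complex(*dst2)
    a = (d2 - d1) / (s2 - s1)
    b = d1 - a * s1

    def transform(pt):
        z = a * complex(*pt) + b
        return (z.real, z.imag)

    return transform


# Hypothetical links: image coordinates -> map coordinates.
shift = shift_from_one_link((0, 0), (100, 200))
print(shift((5, 5)))

warp = similarity_from_two_links((0, 0), (100, 200), (10, 0), (100, 220))
print(warp((0, 10)))
```

With three or more links, software typically fits the transform by least squares rather than solving it exactly, so individual links can show a residual error.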

    CAD Files

    Can simply "open" most common CAD files directly in GIS (DXF, DWG, DGN)

    For example, from a real world project, here are CAD data for a regional plan as created by Fonatur, the Mexican national tourism/development agency.

    CAD in GIS

    Important limits:

    "attributes" don't come along, only layer names. Therefore you are well advised to know the layer naming/numbering convention (if there is none, expect a big, messy problem).

    objects must be "exploded" in CAD before export

    solids must be converted into boundary representations

    Common problems / solutions

    Drawn "to scale", but often without explicit projection information and not North aligned

    Solution 1: layer properties dialog allows specification of transformations

    CAD Manual Transform

    Solution 2: "world files" (*.wld) are simple text files documenting transforms

    CAD World Files

    Why bother with "world files"? Scalability: one world file can be replicated and applied to many CAD documents drawn against the same base.
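The scalability argument can be sketched in a few lines of Python. This is a hedged illustration: it assumes a two-line world file in which each line holds a "from" CAD coordinate pair and a "to" map coordinate pair separated by a space (check the ArcGIS documentation for the exact layout), and the coordinates and drawing names are made up.

```python
# Hypothetical world file contents: "fromX,fromY toX,toY" on each line.
WLD_TEXT = """\
0,0 236000,900000
100,0 236000,900200
"""


def parse_world_file(text):
    """Return a list of (from_xy, to_xy) pairs, one per non-blank line."""
    links = []
    for line in text.splitlines():
        if not line.strip():
            continue
        from_part, to_part = line.split()
        fx, fy = (float(v) for v in from_part.split(","))
        tx, ty = (float(v) for v in to_part.split(","))
        links.append(((fx, fy), (tx, ty)))
    return links


def make_transform(links):
    """Two links -> shift + rotation + uniform scale, via z' = a*z + b."""
    (s1, d1), (s2, d2) = links
    s1, d1, s2, d2 = (complex(*p) for p in (s1, d1, s2, d2))
    a = (d2 - d1) / (s2 - s1)
    b = d1 - a * s1

    def transform(pt):
        z = a * complex(*pt) + b
        return (z.real, z.imag)

    return transform


# One world file, many drawings that share the same base.
to_map = make_transform(parse_world_file(WLD_TEXT))
drawings = {"site_plan.dwg": [(50.0, 0.0)], "utilities.dwg": [(0.0, 50.0)]}
georeferenced = {name: [to_map(v) for v in verts]
                 for name, verts in drawings.items()}
print(georeferenced)
```

The transform is parsed once and reused for every drawing, which is the point of keeping it in a separate file.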


Digitizing - Creating new (georeferenced) Geometry

Vector Data Model - Requires boundaries with X,Y coordinates

  • Old method - large digitizing table with 'puck' on top of paper map
  • New method - 'heads up' digitizing on-screen on top of image
  • ArcEdit provides a rich assortment of geographic feature creation tools
    • But it is complex and often counter-intuitive
    • Includes limited 'ArcSketch' capabilities to create geographic features using templates

 

  • We will demonstrate use of ArcEdit to create a new shapefile of polygons via 'heads up' digitizing on top of one of the orthophotos that we accessed last week from the MassGIS WMS server. (You could just as easily digitize new polygons on top of any other shapefiles that we have used.)

In ArcCatalog

  • You must first create an empty shapefile in ArcCatalog
    • Navigate to a writeable directory and choose File/New/Shapefile
    • Specify polygon features
    • Specify Mass State Plane (NAD83) mainland coordinates
    • Save this empty shapefile with a name such as newpolys.shp

In ArcMap

  • In an empty Data Frame in ArcMap
    • Add whichever basemap you want to use for digitizing,
    • Then add the empty newpolys.shp shapefile.
      • Just be sure that the map layers in the Data Frame use Mass State Plane (NAD83, meters) coordinates.
      • For example, use the saved ArcMap document, '11.188_web_service_examples.mxd', which contains MassGIS and ESRI web services
      • Or, add the MassGIS WMS server "Massachusetts Data from MassGIS (GeoServer)" at this web address:
      http://giswebservices.massgis.state.ma.us/geoserver/wms?
      • and select one of these layers:
        • Black and White Orthos (1990s)
        • 2008 Color Orthos 30cm
        • 2005 Color Orthos
    • Zoom in to the MIT campus so you can digitize the building footprint of a few buildings:

     

  • Use ArcEdit to create polygons in your new shapefile
    • Turn on the 'Editor' toolbar from Customize/Toolbars/Editor
    • Choose 'Start Editing' from the 'Editor' pull-down menu and then click the 'create features' icon. Select the newpolys layer as the file to be edited
    • Choose 'polygons' from the 'construction' menu and create a few polygons
      • Click a sequence of points along the boundary of a polygon
      • Double-click to jump to the first point and close the polygon
    • Examine the attribute table, edit 'id', and think about how to add additional attributes
    • Try out various editing tools to move, snap, and adjust features
    • Recognize complexities: overlapped vs. shared boundaries, moving shared points, choosing an appropriate level of detail, handling curves, labeling points, lines, polygons, ...
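What the click-and-close step produces is a closed ring of X,Y vertices. The following Python sketch shows that structure and one quantity that can be derived from it; the function names and the clicked coordinates are hypothetical, and this is not how ArcMap stores features internally.

```python
def close_ring(vertices):
    """Return the vertices as a closed ring (first point repeated at the end)."""
    ring = list(vertices)
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring


def ring_area(ring):
    """Signed area by the shoelace formula; positive when counter-clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


# Hypothetical clicks around a rectangular building footprint (meters).
clicks = [(0, 0), (40, 0), (40, 25), (0, 25)]
footprint = close_ring(clicks)
print(len(footprint), abs(ring_area(footprint)))
```

Because the layer is in a projected coordinate system with meter units, the area comes out in square meters without any further conversion.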


Addresses - Requires 'Geocoding'

What is Geocoding

    • Geocoding is a process of creating map features from addresses, place names, or similar textual information based on attributes associated with a referenced geographic database, typically a street network that has address ranges associated with each street segment or 'link' running from one intersection to the next.
    • Geocoding typically uses interpolation to estimate the location of an address.
      • (If the addresses along one side of a block range from 1 to 199, then street number 66 is about one-third of the way along that side of the block.)
    • Data required:
      • Reasonably clean, consistent list of legal addresses (i.e. not too many typos, addresses really exist, etc.)
      • Address range attributes on a linear street network
        • Most commonly from Census
        • More current/cleaner data available from private vendors
    • Geoprocessing as a "Service"
      • Basic setup even in desktop ArcGIS
        • First, in ArcCatalog, create an "Address Locator", feeding it a street network with addresses + parameters
        • Second, in ArcMap,
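The interpolation arithmetic in the 1-to-199 example above can be worked through in Python. This is a minimal sketch with hypothetical function names and a made-up 300-meter block; a real address locator applies additional rules.

```python
def fraction_along_block(number, low, high):
    """Fraction of the way from the low-numbered end to the high-numbered end."""
    if not low <= number <= high:
        raise ValueError("address number is outside the block's range")
    if high == low:
        return 0.0
    return (number - low) / (high - low)


def interpolate_point(start_xy, end_xy, fraction):
    """Point at `fraction` of the way along a straight segment."""
    (x1, y1), (x2, y2) = start_xy, end_xy
    return (x1 + fraction * (x2 - x1), y1 + fraction * (y2 - y1))


# Street number 66 on a block whose addresses run from 1 to 199.
f = fraction_along_block(66, 1, 199)
print(round(f, 3))  # (66 - 1) / (199 - 1), about one-third

# Hypothetical block: a straight 300 m segment running east.
print(interpolate_point((0.0, 0.0), (300.0, 0.0), f))
```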

Geocoding Process

  • Converting textual addresses and names to X,Y locations
  • Address matching - develop point map from mailing list
  • Look up place names in a 'gazetteer' to find lat/lon, zip, place boundary, voting district, etc.
  • General 'service' to translate among geographic identifiers
  • Examine ArcGIS geocoding services
    • Focus on address matching using TIGER-style street centerline data (with address ranges)

 

Example: using US Census Bureau, TIGER Line Files (as source info for geocoding)

  • Geocoding Strategy using TIGER
    • Encode road network as street centerlines
    • Attach address information to each street segment
    • Use 'in reverse' to match street address to street segment to get approximate X,Y location
  • TIGER: Topologically Integrated Geographic Encoding and Referencing system
    • http://www.census.gov/geo/www/tiger/
    • US Census Bureau TIGER line file 2000, technical documentation
      • at Census: http://www.census.gov/geo/www/tiger/rd_2ktiger/tgrrd2k.pdf
      • in class locker: http://mit.edu/www/data/census2k/tiger_tgrrd2k.pdf
  • Illustrative Example
    Street centerline road segments
    Attaching address ranges to road segments
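The 'in reverse' strategy can be sketched as a lookup over street segments that carry separate left-side and right-side address ranges. The records, field names, and function below are hypothetical stand-ins for illustration, not the actual TIGER record layout or the ArcGIS matching rules.

```python
SEGMENTS = [
    # Hypothetical records: name, endpoints, left and right address ranges.
    {"name": "MAIN ST", "from_xy": (0.0, 0.0), "to_xy": (200.0, 0.0),
     "left": (1, 99), "right": (2, 100)},
    {"name": "MAIN ST", "from_xy": (200.0, 0.0), "to_xy": (400.0, 0.0),
     "left": (101, 199), "right": (102, 200)},
]


def geocode(number, street, segments=SEGMENTS):
    """Return (x, y, side) for a house number, or None if nothing matches."""
    for seg in segments:
        if seg["name"] != street.upper():
            continue
        for side in ("left", "right"):
            low, high = seg[side]
            # Odd and even numbers sit on opposite sides of the street.
            same_parity = number % 2 == low % 2
            if same_parity and low <= number <= high:
                frac = (number - low) / (high - low) if high > low else 0.0
                (x1, y1), (x2, y2) = seg["from_xy"], seg["to_xy"]
                return (x1 + frac * (x2 - x1), y1 + frac * (y2 - y1), side)
    return None


print(geocode(150, "Main St"))
print(geocode(7, "Elm St"))
```

The returned point lies on the centerline itself; geocoding tools commonly offset it toward the matched side of the street, which this sketch leaves out.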


Data needed for geocoding

1. A list of addresses stored as a database table or a text file.

2. Georeferenced features linked to the address database (such as a street centerline shapefile with street names and address ranges stored as attributes of the shapefile)

Geocoding

3. A geocoding service, which is a configuration file that specifies the georeferenced feature layer and its relevant attributes, and various rules and tolerances for use in the matching.

GeoCoding Setup

Address Locator

The output of the geocoding is a point layer stored in ArcGIS as either a shapefile or a geodatabase feature class.


GeoCoding Results

 

A machine at the GIS Lab in Rotch Library has a seamless street map of the US that does a good job of geocoding any US address.

 


Geocoding Example: Chapter 17, exercises a, b, and c in Ormsby
  • Convert a mailing list of neighborhood addresses to a point shapefile
    • Exercise 17a: create a 'geocoding service' using
      • Street centerline shapefile with address ranges (TIGER-format)
      • Attribute information consistent with address information in mailing list data table
    • Exercise 17b: use the geocoding service to 'address match' all the easy cases
    • Exercise 17c: use the interactive tools of the geocoding service to handle the tough cases
  • 'Crib Sheet' of steps for Exercise 17a: create geocoding service
    • Add street shapefile and mailing list table to your ArcMap session
      • Examine layers to see what they have
      • Don't need to open this shapefile and table in order to create geocoding service
    • Open ArcCatalog and create a new geocoding service
      • Select US Streets with Zone style
      • Select street shapefile that contains address ranges
      • Match attribute columns of street file and mailing list in order to cross-reference
      • Note that the name of the geocoding service includes your user name
  • 'Crib Sheet' of steps for Exercise 17b: use the automatic geocoding service
    • Add street shapefile and mailing list table to your ArcMap session
    • Tools/Geocoding/Geocoding-Services-Manager/Add-geocoding-service
    • Add the service you created in 17a
    • Use Find button on toolbar to locate individual addresses
      • Enter address by hand
      • Create graphic annotation showing location (but don't save as shapefile)
    • Use Select Elements button on toolbar to
      • Automate address matching of easy-to-match addresses (Note use of quality-of-match index)
      • Save results as a shapefile of point features
  • 'Crib Sheet' of steps for Exercise 17c: tweak the automatic results
    • Add street shapefile and 17b output to your ArcMap session
    • Tools/Geocoding/Review&Rematch-Addresses
    • Use Geocoding-Options and Interactive-Review to
      • tweak settings/rules for automated matching
      • edit mailing list addresses and match tough cases as best you can
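The split between the automatic pass (17b) and the interactive pass (17c) can be pictured as a score threshold. The Python sketch below uses the standard library's difflib string similarity as a stand-in for a quality-of-match index; the real ArcGIS scoring rules are different, and the reference addresses, threshold, and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical reference addresses known to the geocoding service.
REFERENCE = ["77 MASSACHUSETTS AVE", "1 BROADWAY", "50 VASSAR ST"]


def best_match(address, reference=REFERENCE):
    """Return (score 0-100, reference address) for the closest reference entry."""
    cleaned = " ".join(address.upper().split())
    scored = [(round(100 * SequenceMatcher(None, cleaned, ref).ratio()), ref)
              for ref in reference]
    return max(scored)


def split_by_score(addresses, min_score=80):
    """Auto-accept matches at or above min_score; queue the rest for review."""
    matched, review = [], []
    for addr in addresses:
        score, ref = best_match(addr)
        (matched if score >= min_score else review).append((addr, ref, score))
    return matched, review


mailing_list = ["77 Massachusetts Ave", "77 Massachusetts Av", "77 Mass Av"]
matched, review = split_by_score(mailing_list)
print("automatic:", matched)
print("needs review:", review)
```

Lowering `min_score` moves more addresses into the automatic pile at the price of more wrong matches, which is the trade-off behind tweaking the matching settings.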


Advanced Raster Analysis

Summarize grid cell values by fixed geometries using zonal statistics

  • Vector case example: summarize suitability across each census blockgroup
  • Use 'zonal statistics' to average grid cell values within each blockgroup polygon
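As a rough picture of what a zonal mean computes, here is a small Python sketch on nested lists. The zone IDs stand in for rasterized blockgroup polygons and the values for a suitability grid; both grids and the function name are hypothetical, and this is not the ArcGIS tool itself.

```python
from collections import defaultdict


def zonal_mean(zones, values, nodata=None):
    """Average `values` over the cells of each zone ID in `zones`."""
    sums, counts = defaultdict(float), defaultdict(int)
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            if zone == nodata or value == nodata:
                continue  # skip cells with no zone or no value
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical grids: zone IDs (blockgroups) and suitability scores.
zones = [[1, 1, 2],
         [1, 2, 2],
         [3, 3, 2]]
suitability = [[4, 2, 5],
               [6, 1, 3],
               [1, 2, None]]
print(zonal_mean(zones, suitability))
```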

Finding patches using RegionClass command

  • What if we wanted to summarize by real city blocks and only had road centerlines?
  • Create block polygons from road centerlines using 'line coverage to region' RegionClass tool (requires ArcInfo license)
  • Use block IDs to distinguish each 'patch' so that zonal statistics will compute block averages
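A raster analogue of the same idea is to treat road cells as barriers and give each enclosed group of cells its own ID. The Python sketch below does this with a flood fill; it is not the RegionClass tool (which builds polygon regions from a line coverage), and the grid and function name are hypothetical.

```python
from collections import deque


def label_patches(is_road):
    """Give each group of 4-connected non-road cells its own patch ID (1, 2, ...).

    Road cells keep the label 0.
    """
    rows, cols = len(is_road), len(is_road[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if is_road[r][c] or labels[r][c]:
                continue
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not is_road[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
    return labels


# Hypothetical grid: 1 marks a road cell, 0 marks land inside a block.
roads = [[0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0]]
for row in label_patches(roads):
    print(row)
```

The resulting patch IDs can serve as the zone grid for a zonal summary, one value per block.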

Moving window analyses
    Sometimes we don't have a fixed geography
    Want to summarize clusters of occurrences
    Example: land use mix (within classic pedestrian 1/4 mile)
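A moving-window (focal) measure of land use mix can be sketched as counting the distinct land uses in a window around each cell. The grid, the class codes, and the function name below are hypothetical; for a quarter-mile window the radius in cells would be roughly 400 m divided by the cell size.

```python
def focal_variety(grid, radius=1):
    """For each cell, count the distinct values in the square window around it.

    The window is (2*radius + 1) cells on a side and is clipped at the grid edge.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            seen = set()
            for nr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for nc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    seen.add(grid[nr][nc])
            out[r][c] = len(seen)
    return out


# Hypothetical land use codes: R = residential, C = commercial, I = industrial.
land_use = [["R", "R", "C", "C"],
            ["R", "R", "C", "I"],
            ["R", "R", "R", "I"]]
for row in focal_variety(land_use, radius=1):
    print(row)
```

Unlike a zonal summary, every cell gets its own answer, because the window moves with the cell instead of following fixed boundaries.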

CostDistance
    Simplest form: mask out excluded areas, assume cost per cell is equal elsewhere
    Example: cost distance from bookstores
    In general, urban grid makes accessibility relatively even
    But note case of Charles River
    Online Example: RedFin opportunity score to find places within 30-min walk plus public transit:

https://labs.redfin.com/opportunity-score?south=42.22947350329647&west=-71.20178445835722&north=42.48874784471439&east=-70.98449334164275&zoom=12
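The simplest form of cost distance described above (mask out excluded cells, equal cost everywhere else) can be sketched with Dijkstra's algorithm on a grid. This is a simplified illustration with 4-direction moves and a hypothetical grid, not the ArcGIS CostDistance implementation.

```python
import heapq


def cost_distance(passable, sources, cell_cost=1.0):
    """Least accumulated cost from any source cell, moving in 4 directions.

    `passable[r][c]` is False for masked-out cells (for example, water).
    Unreachable and masked cells are returned as None.
    """
    rows, cols = len(passable), len(passable[0])
    dist = [[None] * cols for _ in range(rows)]
    heap = [(0.0, r, c) for r, c in sources]
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if dist[r][c] is not None:
            continue  # already settled with a cost no larger than d
        dist[r][c] = d
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and passable[nr][nc] and dist[nr][nc] is None):
                heapq.heappush(heap, (d + cell_cost, nr, nc))
    return dist


# Hypothetical grid: the middle row is a river with one bridge at the east end.
T, F = True, False
passable = [[T, T, T, T, T],
            [F, F, F, F, T],
            [T, T, T, T, T]]
for row in cost_distance(passable, sources=[(0, 0)]):
    print(row)
```

The cell directly across the river from the source is only two cells away in a straight line, yet its cost is 10 because every path must detour over the bridge. This is the kind of effect the Charles River has on accessibility.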



Introduction to Monday's Lab on GPS Data Collection Field Work



Created by Joseph Ferreira and Michael Flaxman, 2005-2006
Last modified 3 April 2018, Joe Ferreira