Massachusetts Institute of Technology
Department of Urban Studies and Planning


11.520: A Workshop on Geographic Information Systems

11.188: Urban Planning and Social Science Laboratory

Lecture 3: GIS Data Manipulation and Querying

September 17, 2008, Joe Ferreira

(Based mainly on notes by Mike Flaxman)

Administrative notes:

  • Lab Exercise #2 due next Wednesday, Sept. 24 (at the start of the class)
    • hand in two printouts and email two URLs for online PDFs
      • black/white printout okay if PDFs are in color
      • myname_labxx.pdf becomes URL = /http://mit.edu/myemail/www/myname_labxx.pd
  • Lab Exercise #3 due Monday Sept. 29 (at the start of the lab)
  • Homework set #1 due in 2 weeks (Weds. Oct. 1 via Stellar)
    • Examine relationships among eastern Mass Shopping Centers, major roads, and residential locations
    • Read it over, check out the datasets, try the methods, spread out the work
    • Waiting until the end will be frustrating and stressful!
  • Lecture room - we are checking out 14E-310 as an alternative to 1-379

Today's Outline:

  • projections, datums, and coordinate systems
    • Continue discussion and illustrate
  • database management basics
    • ArcGIS queries and joins

Projections, Datums, and Coordinate Systems

  • Illustrate basic ideas using ArcGIS with Cambridge, Mass, and US-states shapefiles
  • Finish discussion using notes from last Wednesday's lecture (Lecture #2)

Basic Data Manipulation and Query Tools

  • Today's focus
    • manipulating and querying textual tables in ArcGIS
    • associating textual data tables with mappable features
  • Next week:
    • broader focus on relational database management
    • Use of MS-Access

     

  • Methods of Selecting Features (within basic "vector" GIS model)
    • Exploit link between map and tabular views
    • Simple Attribute Queries
    • Spatial Selection Queries
  • Manipulating and Extending Tabular Data
    • Adding Attribute Columns
    • Calculating new attribute values
    • Basic statistics on selected features
    • Summary statistics
  • Using Selected Sets
    • Exporting Data Subsets (replicating data subset)
    • Using “Definition Queries” as filters

Specific ArcGIS Notes:

Simple Attribute Queries in ArcGIS

            Basic format

                        <Attribute> <Operator> <Value>

                        Attributes delimited with square brackets, e.g.: [Landuse]

 

                        String (text) values typically single quote delimited

                                    'Commercial'

 

                        Exact syntax depends on back-end database (aargh!)

                                    Some require double quoted strings

 

            Example

                        [Landuse] = 'Commercial'

                        ArcGIS Interface Dialog peculiarity:

                                    double click to load attribute names or values

                                    single click for operators
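
The <Attribute> <Operator> <Value> pattern can be sketched outside ArcGIS in plain Python. This is only an illustration of the logic, not ArcGIS code: the parcel records, the operator table, and the select_by_attribute helper are all hypothetical.

```python
import operator

# Hypothetical attribute table: one dict per feature (not real course data).
parcels = [
    {"id": 1, "Landuse": "Commercial"},
    {"id": 2, "Landuse": "Residential"},
    {"id": 3, "Landuse": "Industrial"},
    {"id": 4, "Landuse": "Commercial"},
]

# Map the operator symbols used in a query to Python comparison functions.
OPERATORS = {"=": operator.eq, "<>": operator.ne, "<": operator.lt, ">": operator.gt}


def select_by_attribute(rows, attribute, op, value):
    """Return the rows for which <attribute> <op> <value> is true."""
    compare = OPERATORS[op]
    return [row for row in rows if compare(row[attribute], value)]


commercial = select_by_attribute(parcels, "Landuse", "=", "Commercial")
print([row["id"] for row in commercial])  # [1, 4]
```

Note that the text value is quoted in the call, just as 'Commercial' must be quoted in the ArcGIS expression.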

 

Compound Attribute Queries (the confusing syntax of ArcGIS)

                        Remember: must repeat the attribute name

                                    [Landuse] = 'Commercial' or [Landuse] = 'Industrial'

*not* [Landuse] = 'Commercial' or 'Industrial' (missing required repetition of attribute name)
*not* [Landuse] = Commercial (missing required single quotes around text)

 

                        Can build up based on current selection set

                                    Two pass query:  [Landuse] = 'Commercial'

                                    then, add to selection: [Landuse] = 'Industrial'
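
The same pitfall exists in many languages, so a small Python sketch may help make it concrete. The land use values below are hypothetical. Python happens to accept the malformed form without complaint, which makes the mistake easy to see: the bare 'Industrial' is evaluated on its own and always counts as true.

```python
# Hypothetical land use values, one per feature (illustrative only).
landuse = ["Commercial", "Residential", "Industrial", "Commercial", "Open Space"]

# Correct: the attribute is repeated on both sides of "or".
correct = {i for i, lu in enumerate(landuse)
           if lu == "Commercial" or lu == "Industrial"}

# Wrong: "Industrial" stands alone; a non-empty string counts as true
# in Python, so every feature ends up selected.
wrong = {i for i, lu in enumerate(landuse)
         if lu == "Commercial" or "Industrial"}

# Two-pass version: select commercial, then add industrial to the selection.
selection = {i for i, lu in enumerate(landuse) if lu == "Commercial"}
selection |= {i for i, lu in enumerate(landuse) if lu == "Industrial"}

print(sorted(correct))       # [0, 2, 3]
print(sorted(wrong))         # [0, 1, 2, 3, 4]
print(selection == correct)  # True
```

The two-pass selection is a set union, so it gives the same result as the single compound query.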

 

Fun with Selections

            By default, processing operations occur based on selected features only

                        For example: to buffer commercial land uses, first select commercial, then buffer

            Subsetting based on selection

                        Simple, important, poorly documented workflow

                        Create a selection, then "export data" to new file

                        (Sorry, no cut and paste!)
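
The select-then-export workflow can be sketched in plain Python, with a CSV file standing in for the exported data set. The table contents, field names, and output file name are hypothetical, and this is not how ArcGIS stores features. It only shows that the export is an independent copy of the selected rows.

```python
import csv
import os
import tempfile

# Hypothetical source table (illustrative values only).
zoning = [
    {"id": "1", "Landuse": "Commercial"},
    {"id": "2", "Landuse": "Residential"},
    {"id": "3", "Landuse": "Commercial"},
]

# Step 1: create a selection.
selected = [row for row in zoning if row["Landuse"] == "Commercial"]

# Step 2: "export data" - write only the selected rows to a new file.
out_path = os.path.join(tempfile.mkdtemp(), "commercial_only.csv")
with open(out_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "Landuse"])
    writer.writeheader()
    writer.writerows(selected)

# The new file is an independent copy; the original table is unchanged.
with open(out_path, newline="") as f:
    exported = list(csv.DictReader(f))
print(len(zoning), len(exported))  # 3 2
```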

                       

            In the attribute table, calculations done only on selected features

                        Useful for calculating new attributes, often for reclassification/aggregation

                        Example: ranking store location suitability based on zoning layer

                                    Logic:  best = commercial or mixed use

                                                moderate = industrial

                                                worst = residential

                                    Strategy:

                                                Create new rating attribute [rating] in zoning table

                                                Select best features, calculate rating attribute = 'best'

                                                Select moderate features, calculate rating attribute = 'moderate'

                                                etc.

                        Advantages/Disadvantages:

                                    Permanent change to database, ranking result obvious

                                    Method *not* obvious after the fact (requires external documentation)

                                    Single, transferable data set

                                    Requires "write" permission on the database
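
The select-then-calculate strategy can be sketched in plain Python. The zoning records and the calculate_on_selection helper are hypothetical. The point is that each pass writes the new attribute only for the rows currently selected.

```python
# Hypothetical zoning table (illustrative values only).
zoning = [
    {"id": 1, "zone": "commercial"},
    {"id": 2, "zone": "residential"},
    {"id": 3, "zone": "industrial"},
    {"id": 4, "zone": "mixed use"},
]

# Step 1: create the new rating attribute, initially empty.
for row in zoning:
    row["rating"] = None


def calculate_on_selection(rows, zones, rating):
    """Select rows whose zone is in `zones`, then set rating on those rows only."""
    for row in rows:
        if row["zone"] in zones:
            row["rating"] = rating


# Remaining steps: one select-then-calculate pass per class.
calculate_on_selection(zoning, {"commercial", "mixed use"}, "best")
calculate_on_selection(zoning, {"industrial"}, "moderate")
calculate_on_selection(zoning, {"residential"}, "worst")

print([(row["id"], row["rating"]) for row in zoning])
# [(1, 'best'), (2, 'worst'), (3, 'moderate'), (4, 'best')]
```

As in the ArcGIS workflow, the finished table shows the ratings but not the rules that produced them, which is why the method needs to be documented separately.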

 

            Spatial Selections

                        (This part differs from ordinary textual databases)

                        Can select features based on their spatial relationship with other selected features

                        You will need to do this for your homework
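
A minimal sketch of one spatial relationship, "within a distance of", using plain Python and point features only. The feature names, coordinates, and the select_within_distance helper are hypothetical. ArcGIS supports many more relationships and geometry types than this shows.

```python
import math

# Hypothetical point features with planar coordinates in arbitrary units.
shopping_centers = {"A": (1.0, 1.0), "B": (5.0, 5.0), "C": (9.0, 1.0)}
highway_exits = {"exit1": (0.0, 0.0), "exit2": (9.0, 2.0), "exit3": (5.0, 9.0)}

# First selection: a subset of exits chosen in some earlier step.
selected_exits = ["exit1", "exit2"]


def select_within_distance(targets, sources, max_dist):
    """Return names of targets within max_dist of at least one source point."""
    return sorted(
        name for name, (x, y) in targets.items()
        if any(math.hypot(x - sx, y - sy) <= max_dist for sx, sy in sources)
    )


# Second selection: shopping centers near the selected exits only.
near = select_within_distance(
    shopping_centers, [highway_exits[e] for e in selected_exits], 2.0
)
print(near)  # ['A', 'C']
```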

 

Intro to Geoprocessing

  • Relationships between Data Models & Spatial Questions
    • Data Models Vary in Degree of
      • Geometric refinement
        • What’s the MMU (minimum mapping unit)?
        • How are contiguous features segmented?
      • Attribute Refinement
        • How many classes of land use are recorded? (urban/suburban or 27 types?)
        • Are the aspects you need directly coded at all? (traffic congestion, historic building quality?)
      • Temporal Refinement
        • How up to date are your data?
        • Are all layers in temporal synch?
        • Is your question about current conditions, or really about future conditions?
    • Common cases when your data model doesn’t match your question
      • Disaggregate using attributes
        • Example: Classify shopping centers into five classes based on square feet
      • Aggregate using attributes
        • Example: Reclassify 27 land use types into built/unbuilt
      • Disaggregate spatially
        • Example: shopping centers near major highways or not
      • Aggregate spatially
        • Example: treat roads as unified linear object based on road type (regardless of digitization segments or name)
    • Spatial aggregation and disaggregation require more than simple selection – require creating new geometries based on combinations of existing geometries
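
The two attribute-based cases can be sketched in plain Python. The class breaks, the set of "built" land use types, and both helper functions are hypothetical choices made only for illustration. A real analysis would pick them to fit the data.

```python
from bisect import bisect_right

# Hypothetical class breaks (square feet) giving five size classes.
BREAKS = [50_000, 100_000, 250_000, 500_000]


def size_class(square_feet):
    """Disaggregate: return a class from 1 (smallest) to 5 (largest)."""
    return bisect_right(BREAKS, square_feet) + 1


# Hypothetical lookup collapsing detailed land use types into two groups.
BUILT = {"Commercial", "Industrial", "Residential"}


def built_or_unbuilt(landuse):
    """Aggregate: reclassify a detailed land use type as built or unbuilt."""
    return "built" if landuse in BUILT else "unbuilt"


print([size_class(s) for s in (20_000, 100_000, 300_000, 900_000)])  # [1, 3, 4, 5]
print([built_or_unbuilt(lu) for lu in ("Commercial", "Forest")])  # ['built', 'unbuilt']
```

Both operations change only attribute values. The spatial cases are different because they need new geometries.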


  • Some Useful and Common Geoprocessing Operations
    • Spatial data subsetting using “Clip”
      • Selects those features within a polygonal geometry, breaking partially included features as needed
    • Buffering
      • Creates new geometry representing an area within a given distance from selected features
      • By default creates one buffered object per feature. Often useful to “join” output geometries
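
The idea behind buffering can be sketched in plain Python as a distance test. This sketch does not construct buffer polygons as ArcGIS does. It only asks whether a point would fall inside the buffer of each line segment. The road segments, the site, and the buffer distance are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


# Two hypothetical road segments that meet at (10, 0).
roads = [((0, 0), (10, 0)), ((10, 0), (10, 10))]
site = (9, 1)
buffer_dist = 2.0

# One buffer per feature: the site can fall inside several overlapping buffers.
hits = [distance_to_segment(site, a, b) <= buffer_dist for a, b in roads]
# "Joined" buffer: a single yes/no answer for the combined area.
in_joined_buffer = any(hits)

print(hits, in_joined_buffer)  # [True, True] True
```

The site lies within both per-feature buffers, so counting buffer hits would count it twice. Treating the buffers as one joined area avoids that.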

 

Example: Site Selection for Low Cost Grocery Store Chain

  • Conceptual Model
    • Brainstorm Criteria for "good" locations
  • Case Study Example
    • Factors used in actual Commercial Shopping Center Site Selection
    • PowerPoint slides used by commercial firm to market site selection tools, by Edens & Avant and RPM consulting
  • Think about these marketing slides:
    • Is the methodology or analytic scope overstated?
    • What considerations are omitted, shortchanged, badly measured?

 


Created by Zhong-Ren Peng, Mike Flaxman, and Joe Ferreira 2003-2008

Last modified 17 September 2008 by Joe Ferreira
