Massachusetts Institute of Technology
Department of Urban Studies and Planning


11.520: A Workshop on Geographic Information Systems
11.188: Urban Planning and Social Science Laboratory

Lecture 2: GIS Data Models, Cartography, Data Management, and Queries

September 15, 2010, Joseph Ferreira, Jr.

(including notes by Prof. Mike Flaxman and former Visiting Prof. Zhong-Rhen Peng)


Administrative notes regarding lab exercises and schedule

Outline of Today


'Vector' data models for geospatial location

Raster vs. vector data models

Orthophoto of MIT area with major roads overlaid   
Boston/Cambridge Streets superimposed on orthophoto.  Zoomed-in view shows raster nature of the ortho.


Map elements and thematic mapping:
symbology, classification, and normalization

 

Elements of the Map

Scale

Ratio Scales

1:10,000, or 1:100,000 or 1/100,000

Verbal Scales:

     One inch represents 2,000 feet (1:24,000).

     One centimeter represents 20 kilometers (1:2,000,000)

Printout vs. onscreen:

10-foot pixels at 72 pixels per inch onscreen

==> One inch represents 720 feet (about 1:8,640)

But some screens have higher or lower pixel densities, and not all screens have square pixels. A screen dump sent to a printer will also change the scale, since the printer has a different dot density than the screen.

Beware! Good GIS software will try to match screen and printer properties to software settings so screen and printouts will show appropriate scale and show correct scalebars. For the scale to be meaningful, display units and hardware choices must be properly identified.
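
The arithmetic behind these scale statements can be checked with a short script. This is a minimal sketch, not part of any GIS package: the function names `verbal_to_ratio` and `screen_scale` are made up for illustration, and square pixels are assumed.

```python
INCHES_PER_FOOT = 12

def verbal_to_ratio(map_inches, ground_feet):
    """Return N for a 1:N scale, given 'map_inches represent ground_feet'."""
    return ground_feet * INCHES_PER_FOOT / map_inches

def screen_scale(pixel_ground_feet, pixels_per_inch):
    """Return N for a 1:N on-screen scale (assumes square pixels)."""
    ground_feet_per_screen_inch = pixel_ground_feet * pixels_per_inch
    return verbal_to_ratio(1, ground_feet_per_screen_inch)

print(verbal_to_ratio(1, 2000))  # 24000.0 -> 1:24,000
print(screen_scale(10, 72))      # 8640.0  -> about 1:8,640
print(screen_scale(10, 96))      # 11520.0 -> same data, denser screen
```

The 96 pixels-per-inch case shows why the same data can display at a different scale on a different monitor.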

Large Scale or Small Scale

In general,

         Large scale: >= 1:24,000 (good for *small* area representation - city block)

         Small scale: <= 1:500,000 (good for *large* area representation - metro area)

But scale is relative; it depends on the application

          Large-scale maps are more detailed than small-scale maps

Feature representations at a given scale are the product of practical
limits (what can be seen/drawn) and explicit choices about detail

          Important concept: “Minimum Mapping Unit” (MMU)

 

Typical Scales Used

In Metric System:

1:10,000 (example: German national basemaps – individual houses shown)

1:25,000 ... 1:100,000

Try changing the scale in ArcMap (if the coordinate system is known)

In American System:

1:9,600 (one inch represents 800 feet)

1:24,000 (one inch represents 2000 feet) – typical USGS “quad sheet”

1:62,500 (one inch represents slightly less than one mile)
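
Going the other way, from a ratio scale to the ground distance covered by one map inch, is a single division. The sketch below is illustrative only; `feet_per_map_inch` is a made-up helper name.

```python
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280

def feet_per_map_inch(denominator):
    """Ground feet represented by one map inch at a scale of 1:denominator."""
    return denominator / INCHES_PER_FOOT

for denom in (9600, 24000, 62500):
    feet = feet_per_map_inch(denom)
    miles = feet / FEET_PER_MILE
    print(f"1:{denom:,} -> {feet:,.0f} ft per inch ({miles:.3f} mi)")
```

At 1:62,500 the result is a little under 5,280 feet, which is why that scale is described as slightly less than one mile to the inch.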

 

Key Cartographic Principles

Maps are a medium of communication

              So…know your audience

              Or, if you don’t or can’t, use the “10 foot rule”

                     Can an average citizen read and comprehend an E-sized plot from 10 feet away?

                     Or, for a page print at 8.5 x 11 inches, is the font size at least 12 point?

A good map should never be a “GIS Data Dump”

Selective emphasis is key

The hard part is removing data; when in doubt, delete

Divide mapped elements into “figure / ground”

Figure = Foreground elements – the essential story of the map

Ground = Background elements – provide context, but don’t overwhelm

==> Use visual symbols, colors, and shades to emphasize foreground, & minimize background clutter


Colors and Categories

Human cognitive limit / rule of thumb:

Try to limit the number of thematic colors

  • If not possible, try creating logical visual subgroups
    • example: housing in 3 shades of orange, commercial 3 shades of red, etc.
  • Hard to distinguish more than 3-4 tones within same hue
    • Even harder when your map is reproduced in grayscale…

ArcGIS color defaults are *random* saturated colors

  • Tip #1: desaturate colors, especially background colors or those covering large areas of the map
  • Tip #2: reserve bright, saturated colors either for foreground elements, or small polygons
  • Tip #3: turn *off* the outline of polygons, particularly for background polygons or small polygons
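
Tip #1 can be illustrated outside ArcGIS with Python's standard `colorsys` module. This is a sketch only: the `desaturate` helper and the 0.5 default factor are assumptions made for the example, not ArcGIS settings.

```python
import colorsys

def desaturate(rgb, factor=0.5):
    """Scale the HLS saturation of an (r, g, b) color with 0-255 channels.

    factor=1.0 leaves the color unchanged; factor=0.0 yields a gray.
    """
    r, g, b = (channel / 255 for channel in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    muted = colorsys.hls_to_rgb(h, l, s * factor)
    return tuple(round(channel * 255) for channel in muted)

bright_red = (255, 0, 0)
print(desaturate(bright_red))        # a softer red for large background areas
print(desaturate(bright_red, 0.0))   # fully desaturated: a neutral gray
```

Hue and lightness are kept, so a muted background color still belongs visually to the same category as its saturated counterpart.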

Six Principal Visual Variables

Use contrasting symbols to portray geographic differences

For qualitative differences

Use shape, texture and hue (e.g., land use types).

For quantitative differences

Use size to show variation in amount or count
(e.g., population, number of crimes),

Use graytone or hue to show differences in ratio or intensity
(e.g., proportion of households in poverty, population density).
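
The difference between mapping a count and mapping a ratio can be seen in a small sketch that normalizes population by area and then assigns equal-interval classes for shading. All of the numbers and area names below are invented for illustration, and equal-interval is only one of several classification methods.

```python
def equal_interval_class(value, low, high, n_classes):
    """Return a class index 0..n_classes-1 using equal-width bins over [low, high]."""
    if high == low:
        return 0
    width = (high - low) / n_classes
    return min(int((value - low) / width), n_classes - 1)

# Hypothetical areas: (name, population, land area in square miles)
areas = [
    ("A", 50_000, 10.0),
    ("B", 50_000, 100.0),
    ("C", 200_000, 100.0),
    ("D", 12_000, 2.0),
]

# Normalization: convert raw counts to people per square mile.
densities = {name: pop / sq_mi for name, pop, sq_mi in areas}

low, high = min(densities.values()), max(densities.values())
shade_class = {
    name: equal_interval_class(d, low, high, 4) for name, d in densities.items()
}
print(densities)
print(shade_class)
```

Areas A and B have the same population but land in different density classes, and C has the largest count yet a low density. That is why counts suit sized symbols while ratios suit graded shading.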

 

Southern New England Counties



Basic ArcGIS Data Manipulation and Query Tools

  • Methods of Selecting Features (within basic "vector" GIS model)
    • Exploit link between map and tabular views
    • Simple Attribute Queries
    • Spatial Selection Queries
  • Manipulating and Extending Tabular Data
    • Basic statistics on selected features
    • Summary statistics
    • Adding Attribute Columns
    • Calculating new attribute values
  • Using Selected Sets
    • Most operations only change selected sets
    • Exporting Data Subsets (replicating data subset)
    • Using “Definition Queries” as filters
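
The idea of statistics on a selected set versus summary statistics by group can be mimicked with plain Python records. This is only an analogy to the ArcGIS table tools: the tract records, field names, and the `summarize` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tract records; field names are illustrative only.
tracts = [
    {"tract": "T1", "town": "Cambridge", "pop": 4200},
    {"tract": "T2", "town": "Cambridge", "pop": 3100},
    {"tract": "T3", "town": "Somerville", "pop": 5000},
    {"tract": "T4", "town": "Boston", "pop": 6100},
]

def summarize(values):
    """Basic statistics for a list of numbers."""
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

# Statistics on a selected set: only the Cambridge tracts are included.
selected = [t for t in tracts if t["town"] == "Cambridge"]
selected_stats = summarize([t["pop"] for t in selected])

# Summary statistics: one row of statistics per town.
pops_by_town = defaultdict(list)
for t in tracts:
    pops_by_town[t["town"]].append(t["pop"])
town_summary = {town: summarize(pops) for town, pops in pops_by_town.items()}

print(selected_stats)
print(town_summary)
```

The list comprehension plays the role of a selection or a definition-query filter: everything downstream sees only the records that pass it.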

Specific ArcGIS Notes:

Demo using HW1 data (1990 census tract data for Eastern Mass)

 

Simple Attribute Queries in ArcGIS (using examples from Cambridge landuse shapefile)

            Basic format

                        <Attribute> <Operator> <Value>

                        Attributes delimited with square brackets, i.e.: [Landuse]

 

                        String (text) values typically single quote delimited

                                    'Commercial' not "Commercial"

 

                        Exact syntax depends on back-end database (aargh!)

                                    Some require double quoted strings

 

            Example

                        [Landuse] = 'Commercial'

                        ArcGIS Interface Dialog peculiarity:

                                    double click to load attribute names or values

                                    single click for operators

 

Compound Attribute Queries (the confusing syntax of ArcGIS)

                        Remember: must repeat the attribute name

                                    [Landuse] = 'Commercial (UC)' or [Landuse] = 'Industrial (UI)'

*not* [Landuse] = 'Commercial (UC)' or 'Industrial (UI)' (missing required repetition of attribute name)
*not* [Landuse] = Commercial (UC) (missing required single quotes around text)

 

                        Can build up based on current selection set

                                    Two pass query:  [Landuse] = 'Commercial (UC)'

                                    then, add to selection: [Landuse] = 'Industrial (UI)'
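
The same logic can be sketched in Python, where the analogous mistake is easy to demonstrate. The parcel records below are hypothetical (the 'Residential (UR)' value in particular is invented), and this is an analogy to the ArcGIS query dialog, not its actual syntax.

```python
# Hypothetical parcel records standing in for the land use attribute table.
parcels = [
    {"id": 1, "Landuse": "Commercial (UC)"},
    {"id": 2, "Landuse": "Industrial (UI)"},
    {"id": 3, "Landuse": "Residential (UR)"},
]

# One pass, with the attribute name repeated on both sides of "or".
one_pass = [
    p["id"] for p in parcels
    if p["Landuse"] == "Commercial (UC)" or p["Landuse"] == "Industrial (UI)"
]

# Two passes: create a selection, then add to it.
selection = {p["id"] for p in parcels if p["Landuse"] == "Commercial (UC)"}
selection |= {p["id"] for p in parcels if p["Landuse"] == "Industrial (UI)"}

# The mistake: the attribute name is not repeated. Python accepts this,
# but a non-empty string is always true, so every record is selected.
wrong = [
    p["id"] for p in parcels
    if p["Landuse"] == "Commercial (UC)" or "Industrial (UI)"
]

print(one_pass)           # [1, 2]
print(sorted(selection))  # [1, 2]
print(wrong)              # [1, 2, 3]
```

A query language that rejects the unrepeated form outright is arguably doing the user a favor, since the silent alternative returns the wrong features.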

 

Fun with Selections

            By default, processing operations occur based on selected features only

                        For example: to buffer commercial land uses, first select commercial, then buffer

            Subsetting based on selection

                        Simple, important, poorly documented workflow

                        Create a selection, then "export data" to new file

                        (Sorry no cut and paste!)

                       

            In the attribute table, calculations done only on selected features

                        Useful for calculating new attributes, often for reclassification/aggregation

                        Example: ranking store location suitability based on zoning layer

                                    Logic:  best = commercial or mixed use

                                                moderate = industrial

                                                worst = residential

                                    Strategy:

                                                Create new rating attribute [rating] in zoning table

                                                Select best features, calculate rating attribute = 'best'

                                                Select moderate features, calculate rating attribute = 'moderate'

                                                etc.

                        Advantages/Disadvantages:

                                    Permanent change to database, ranking result obvious

                                    Method *not* obvious after the fact (requires external documentation)

                                    Single, transferable data set

                                    Requires "write" permission on the database
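
The select-then-calculate strategy above can be sketched with plain Python records. The zoning features and the `RATING_RULES` table are hypothetical; keeping the rules in one place is a way to address the "method not obvious after the fact" drawback.

```python
# Hypothetical zoning features; in ArcGIS these would be rows in the zoning table.
zoning = [
    {"zone_id": 1, "landuse": "commercial"},
    {"zone_id": 2, "landuse": "mixed use"},
    {"zone_id": 3, "landuse": "industrial"},
    {"zone_id": 4, "landuse": "residential"},
]

# The ranking logic from the notes, written down as data.
RATING_RULES = [
    ("best", {"commercial", "mixed use"}),
    ("moderate", {"industrial"}),
    ("worst", {"residential"}),
]

# Step 1: add the new attribute column, initially empty.
for feature in zoning:
    feature["rating"] = None

# Steps 2..n: select the matching features, then calculate on the selection only.
for rating, landuses in RATING_RULES:
    selected = [f for f in zoning if f["landuse"] in landuses]
    for feature in selected:
        feature["rating"] = rating

for feature in zoning:
    print(feature["zone_id"], feature["landuse"], feature["rating"])
```

Any feature whose land use matches no rule keeps a rating of `None`, which makes unclassified records easy to spot.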

 

            Spatial Selections

                        (This capability is what distinguishes a GIS from purely textual databases)

                        Can select features based on their spatial relationship with other selected features (e.g., 'inside of' or 'contains part of')

                        You will need to do this for your homework
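
An 'inside of' test for points and a polygon can be sketched with the classic ray-casting rule: count how many polygon edges a horizontal ray from the point crosses. This is a textbook technique shown for intuition, not a description of how ArcGIS implements spatial selection; the district and store coordinates are invented.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order.

    Points exactly on the boundary are not handled specially.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical selected polygon and candidate point features.
district = [(0, 0), (4, 0), (4, 4), (0, 4)]
stores = {"store_a": (2, 2), "store_b": (5, 2), "store_c": (1, 3.5)}

selected = [name for name, (x, y) in stores.items()
            if point_in_polygon(x, y, district)]
print(selected)  # ['store_a', 'store_c']
```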

 


 

Intro to Geoprocessing

  • Relationships between Data Models & Spatial Questions
    • Data Models Vary in Degree of
      • Geometric refinement
        • What’s the MMU (minimum mapping unit)?
        • How are contiguous features segmented?
      • Attribute Refinement
        • How many classes of land use are recorded? (urban/suburban or 27 types?)
        • Are the aspects you need directly coded at all? (traffic congestion, historic building quality?)
      • Temporal Refinement
        • How up to date are your data?
        • Are all layers in temporal synch?
        • Is your question about current conditions, or really about future conditions?
    • Common cases when your data model doesn’t match your question
      • Example: Classify shopping centers into five classes based on square feet
    • Spatial aggregation and disaggregation require more than simple selection – require creating new geometries based on combinations of existing geometries
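
The shopping center example can be sketched as a two-step process, assuming the mismatch is that the data are stored per parcel while the question is about whole centers. The parcel records and class breaks below are hypothetical, and only the attribute side of aggregation is shown; merging the parcel geometries into one shape per center is not.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical parcel records: one shopping center may span several parcels.
parcels = [
    {"center": "North Plaza", "sqft": 20_000},
    {"center": "North Plaza", "sqft": 45_000},
    {"center": "River Mall", "sqft": 900_000},
    {"center": "Corner Shops", "sqft": 12_000},
]

# Hypothetical break values (square feet) separating classes 1 through 5.
BREAKS = [30_000, 100_000, 400_000, 800_000]

def size_class(square_feet):
    """Return a class number 1..5; a value equal to a break goes to the higher class."""
    return bisect_right(BREAKS, square_feet) + 1

# Step 1: aggregate parcel floor area up to the shopping center.
total_sqft = defaultdict(int)
for p in parcels:
    total_sqft[p["center"]] += p["sqft"]

# Step 2: classify each center by its total square feet.
center_class = {name: size_class(sqft) for name, sqft in total_sqft.items()}
print(center_class)
```

Classifying the parcels one at a time would put both North Plaza parcels in smaller classes than the center as a whole, which is the kind of mismatch the notes warn about.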


  • Some Useful and Common Geoprocessing Operations
    • Spatial data subsetting using “Clip”
      • Selects those features within a polygonal geometry, breaking partially included features as needed
    • Buffering
      • Creates new geometry representing an area within a given distance from selected features
      • By default creates one buffered object per feature.  Often useful to “join” output geometries
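
The meaning of a buffer, "within a given distance of a feature", can be sketched as a distance test against a line feature. Note the simplification: a GIS buffer tool constructs a new polygon, while this sketch only answers whether a point would fall inside that polygon. The street and site coordinates are invented and assumed to be in a planar coordinate system.

```python
import math

def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest planar distance from point (px, py) to segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: both endpoints coincide
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(point, line, distance):
    """True if point lies within `distance` of any segment of the polyline."""
    px, py = point
    return any(
        distance_to_segment(px, py, ax, ay, bx, by) <= distance
        for (ax, ay), (bx, by) in zip(line, line[1:])
    )

# Hypothetical street centerline and candidate sites.
street = [(0, 0), (10, 0), (10, 10)]
print(in_buffer((5, 3), street, 5))   # True: 3 units from the first segment
print(in_buffer((4, 8), street, 5))   # False: more than 5 units from both segments
```

Testing against every segment and accepting any hit corresponds to merging the per-segment buffers into a single area.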

 


GIS Example: Site Selection for Low Cost Grocery Store Chain

  • Conceptual Model
    • Brainstorm Criteria for "good" locations
  • Case Study Example
    • Factors used in actual Commercial Shopping Center Site Selection
    • PowerPoint slides used by a commercial firm to market site selection tools, by Edens & Avant and RPM consulting
  • Think about these marketing slides:
    • Is the methodology or analytic scope overstated?
    • What considerations are omitted, shortchanged, badly measured?
    • From whose point of view is the siting service helpful or hurtful?


For next week: MS-Access basics - view online help or sign up for Element-K online training if unfamiliar with MS-Access


Last modified 15 September 2010 [jf]
