11.521/11.523: Lab #5: Raster Modeling with SQL and ArcGIS Spatial Analyst
Screen Sharing: http://mit.webex.com/join/jf
(0) Administrative
-- Lab #4 due today, March 6
-- Problem Set A: due Thursday, March 8
-- Problem Set B: due next Thursday, March 15
---- In Problem Set B, categorize all EAST BOSTON PARCELS by owner/landuse type
---- note you are *not* categorizing the zoning variances
-- Today's Lab #5, Part I due next Tuesday (March 13) ; Part II (model builder) due Tuesday, April 3 (after Sping Break)
-- In-lab test in two weeks (March 20), Practice tests from last two years are online
-- Extra lab session Friday, March 10, from 12:15 - 2:15 pm in 9-554
--- Useful to discuss lab exercises and SQL questions
--- Solutions to Lab #2 have been uploaded to Stellar
(1) Purpose of Lab #5
-- Use raster methods to examine spatial pattern of housing value for
East Boston Triple-deckers
-- Create housing value 'surface' for visualization and interpretation
-- Use 'spatial analyst' to interpolate East Boston housing value surface
based on 1985-1995 residential housing sales
-- Contrast a 'raster' approach with the parcel and block level estimates
we have done using assessed value data
-- Gain further experience with GIS+DBMS data manipulations and SQL queries
-- Introduce use of Model Builder to package repeatable geospatial data
processing steps
-- SKIM THROUGH THE ENTIRE EXERCISE BEFORE STARTING
--- Part I-1: thematic map of 1990 census block group median housing value
--- Part I-2: cut East Boston into 50 meter raster grid cells
--- Part I-3: interpolate grid cell housing values using real estate transaction data in Postgres
--- Part II: build 'Model Builder' model of part of the grid cell interpolation steps
-- If you have limited experience with raster-based GIS analysis, or you have never used Model Builder, review this lab #7 exercise from the 2015 Spring 11.188 GIS class:
http://web.mit.edu/11.188/www16/labs/lab7/lab7.html
(2) Details: in-lab examples
-- Creating 50 meter grid with East Boston 'mask'
--- Pay attention to environment settings:
---- coordinate system, cell size, cookie cutter location and extents, and mask
-- Watch my demo of creating and using eboston_pd from bos_planning_districts
---- Managing coordinate system issues for the data layers
---- Set environment via Geoprocessing/Environments (for general defaults) or 'environments' button within each ArcToolbox tool
---- Easiest to convert bos_planning_districts to Mass State Plane NAD83-meters at outset
-- Discuss BOSTON96.SALES
-- Generating parcel centroids to facilitate nearest neighborhood averaging
-- Parcel IDs for 1985-1995 sales are DIFFERENT from 2005 parcel IDs (i.e., you need the older 1996 parcel layer)
-- SQL queries of 65K housing sales to get right East Boston subset for estimation
-- Identifying 'arms length' sales
-- Complex SQL queries using 'IN' and sub-queries
-- Illustrate 'view' to get 'holdsales3'
-- Note value of saved SQL queries and Model Builder models for multi-step procedures that need exploration and tweaking
New Postgres tables for 1996 Boston data that are used in this lab:
boston96.SALES - transaction records of boston residential property sales (1986-1995)
boston96.PARCEL96 - Assessing data table for Boston 1996 properties (one row per assessing record)
boston96.PARCEL96SHP - attribute table for parcel96 shapefile (one row per ground parcel)
boston96.STCLASS - state landuse class code (same codes as in public.mass_landuse but fewer columns)
boston96.SALECODE - codes for sales transactions (arms length, via foreclosure,...)
These SQL queries can help us understand which columns are plausible primary keys:
select count(*), count(distinct parcel_id), count(distinct base_id)
from boston96.parcel96;
-- 138001;138001;92161
---- parcel_id is assessing record, base_id is ground parcel ID
select count(*), count(distinct parcel96_), count(distinct parcel96_i), count(distinct parcel_id),
count(distinct parcel_i_1), count(distinct base_id)
from boston96.parcel96shp;
-- 101213;101213;6572;6551;99859;91982
---- parcel_i_1 should be a unique parcel/unit id and base_id a unique
ground parcel for joining to the assessing data. However, there are some duplicates and we have used
parcel96_ as a primary key for the table.
(3) Performance issues
-- Be aware of the speed difference between on-disk vs. on-net shapefile, grid, or worksace use in ArcGIS
-- Make sure big files and heavy processing are done locally
-- Your Desktop and C:\user are *not* local drives. Use C:\TEMP or a local USB drive
-- using a local geodatabase for workspace and scratch space will help, especially if they are all
within a C:\temp folder that you can copy to a flash drive or the cloud before you leave.
(4) Test in two weeks - Tuesday, March 20, in W31-301 at 4 pm
-- Practice tests (from last two years) are available online
-- We will review test coverage during Thursday's class
-- test is designed for 1.5 hours but we will give you longer (all 3 hours)