Massachusetts Institute of Technology
Department of Urban Studies and Planning


Normalizing Census Data


What does it mean to "Normalize Data"?

Most census data are counts - the number of people (or households, families, housing units, etc.) who reside within a particular census geography (e.g., a census block group) and meet some criterion (e.g., have at least a high school education). To judge whether that count is 'high' or 'low' we need an appropriate basis for comparison (e.g., the count of all people residing in that geography who are adults with a known education level). Normalizing census data allows you to interpret a variable relative to the universe in which it exists, by dividing each count by the total count for the appropriate universe.

Consider this Example:

Suppose we wanted to visualize the fraction of rents less than $500 in each block group. To compute this, we would need to add together several columns from the "Gross Rent" table (H43). In addition, we would need to normalize the data by dividing it by the appropriate universe, here "Specified renter occupied housing units." The total renter-occupied housing units are located in the table "Tenure" (H8) in item H0080002. The sum of the rental units in the rent categories below $500 would need to be divided by this value. By taking another look at "Using the File", we can see that tables H8 and H43 are stored in different DBF files. We can use the field "Logrecnu," the logical record number, to link the extracts from the two files together.
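
The join-and-divide steps described above can be sketched in Python. This is only an illustration under stated assumptions: the sample records are invented, and the rent column names (RENT_LT_250, RENT_250_499, RENT_500_UP) are hypothetical stand-ins for the actual H43 item names, which are not listed here. Only LOGRECNU and H0080002 come from the text.

```python
# Invented sample extracts. In practice these rows would be read from the
# two DBF files; the rent field names below are hypothetical placeholders.
tenure_rows = [
    {"LOGRECNU": "0001", "H0080002": 200},  # renter-occupied units (table H8)
    {"LOGRECNU": "0002", "H0080002": 50},
    {"LOGRECNU": "0003", "H0080002": 0},
]
rent_rows = [
    {"LOGRECNU": "0001", "RENT_LT_250": 20, "RENT_250_499": 30, "RENT_500_UP": 150},
    {"LOGRECNU": "0002", "RENT_LT_250": 5, "RENT_250_499": 10, "RENT_500_UP": 35},
    {"LOGRECNU": "0003", "RENT_LT_250": 0, "RENT_250_499": 0, "RENT_500_UP": 0},
]

# Hypothetical list of the rent categories that fall below $500.
UNDER_500_FIELDS = ("RENT_LT_250", "RENT_250_499")


def fraction_under_500(tenure_rows, rent_rows, under_500_fields=UNDER_500_FIELDS):
    """Link the two extracts on LOGRECNU and normalize by the renter universe."""
    # Index the tenure extract by logical record number.
    renters = {row["LOGRECNU"]: row["H0080002"] for row in tenure_rows}
    result = {}
    for row in rent_rows:
        key = row["LOGRECNU"]
        universe = renters.get(key)
        if not universe:
            # No matching record, or no renter-occupied units: the
            # fraction is undefined, so record None instead of dividing.
            result[key] = None
            continue
        low_rent_units = sum(row[field] for field in under_500_fields)
        result[key] = low_rent_units / universe
    return result


fractions = fraction_under_500(tenure_rows, rent_rows)
```

Block groups with no renter-occupied units are given None rather than 0, so that they can be shown as "no data" on a map instead of being mistaken for areas with no low-rent housing.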

Why normalize the data? Comparing the raw numbers of housing units per block group may be deceiving, as the total number of renter-occupied housing units will vary from one block group to the next. By dividing the number of housing units with rent less than $500 by the total number of renter-occupied housing units, we obtain the fraction of renter-occupied housing units with rents under $500. This fraction may be compared fairly among block groups.

ArcView and Normalization: ArcView can normalize data in two ways, as a percent of total or by another attribute.

Example of Normalization:

As a percent of total, consider the following

Attribute value for feature x
-------------------------------------------- = Proportion (%) of total contained in feature x
Sum of attribute values in all features

15 persons of Hispanic origin (in x)
---------------------------------------------- = 0.0333 = 3.33% of total
450 persons of Hispanic origin (in total)

As one attribute normalized by another, consider this

Attribute value for feature x
----------------------------------- = Proportion (%) of the universe in feature x that has the attribute
Universe value for feature x

15 persons of Hispanic origin (in x)
------------------------------------------ = 0.05 = 5.0% of the population in x is Hispanic
300 total persons (in x)
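
As a rough sketch, the two kinds of normalization shown above can be written as two small Python functions. The function names and the features "y" and "z" are hypothetical and exist only so that the totals match the worked examples (450 persons of Hispanic origin in total, 300 persons in x). This is not ArcView's own implementation.

```python
def percent_of_total(values):
    """Divide each feature's value by the sum over all features."""
    total = sum(values.values())
    if total == 0:
        # Every share is undefined when the overall total is zero.
        return {name: None for name in values}
    return {name: value / total for name, value in values.items()}


def normalize_by(values, universe):
    """Divide each feature's value by that same feature's universe value."""
    return {
        name: (values[name] / universe[name] if universe[name] else None)
        for name in values
    }


# Invented counts: only the figures for feature "x" come from the examples.
hispanic = {"x": 15, "y": 135, "z": 300}      # sums to 450
population = {"x": 300, "y": 900, "z": 1200}  # total persons per feature

share_of_total = percent_of_total(hispanic)        # x: 15 / 450
share_of_area = normalize_by(hispanic, population)  # x: 15 / 300
```

The first function answers "what share of the whole study area's Hispanic population lives in x?", while the second answers "what share of the population of x is Hispanic?". The two are easy to confuse, but they produce quite different maps.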

You can also normalize one attribute by a related attribute, as in the demographic concept of the sex ratio (number of males per 100 females):

135 males (in x)
----------------------- = 0.818, or about 82 males for every 100 females
165 females (in x)
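
The same calculation, as a minimal Python sketch. The function name sex_ratio is hypothetical, and the zero-females case returns None because the ratio is undefined there.

```python
def sex_ratio(males, females):
    """Return the number of males per 100 females, or None if undefined."""
    if females == 0:
        return None
    return 100 * males / females


ratio = sex_ratio(135, 165)  # 135 / 165 scaled to "per 100 females"
```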

References:

Normalizing Census Data in ArcView: Concepts and Roadmap, ESRI Schools & Libraries Program, 2000 ESRI, Inc.



Last Modified by Sarah Williams (MIT Libraries) 01 October 2002.
