1.017/1.010 

Fall 2003 Syllabus

Instructor: 

Prof. Dennis McLaughlin

email: dennism

NE20-290K 

253-7176 

TA:

Ms. Holly Michael

email: hollym

NE20-254

253-5483


This subject is a computer-oriented introduction to probability and data analysis.  It is designed to give students the knowledge and practical experience they need to interpret lab and field data. 

Basic probability concepts are introduced at the outset because they provide a systematic way to describe uncertainty. They form the basis for the analysis of quantitative data in science and engineering. The MATLAB programming language is used to perform virtual experiments and to analyze real-world data sets, many downloaded from the web. Programming applications include display and assessment of data sets, investigation of hypotheses, and identification of possible causal relationships between variables.

Class periods will generally be divided into 40 minutes of lecture and 40 minutes of related hands-on computer work using laptops available in the classroom. At the beginning of the semester, recitations will provide computer programming background for students without previous programming experience. Later, these recitations will be used for more in-depth virtual experiments and data analysis exercises and for discussion of homework problems.

The class includes several homework sets and three quizzes held during the two-hour recitation periods. The grade will be based 40% on homework and in-class exercises and 20% on each of the three quizzes. The lowest homework grade will not be counted. There will be no final exam.

The primary text is "Probability and Statistics for Engineering and the Sciences, Jay L. Devore, Duxbury Press, 2000" (denoted by D below). Students not familiar with MATLAB should also consider purchasing one of the many introductory texts on this programming package. A reasonable choice that is easy to read and moderately priced is "Introduction to MATLAB for Engineers and Scientists, D. Etter, Prentice-Hall, 1996" (denoted by E below). Both of these texts have been ordered by the MIT COOP.


No.

Date     

Topics (PDF)

Problem  Sets (PDF)

Readings 

1      

Sept. 4

Course Introduction
Real and virtual experiments, uncertainty, probability, risk, statistics

 

D1.1

 

Sept. 8

Recitation 1 -- Programming in MATLAB
Downloading data, accessing MATLAB, MATLAB environment, variables, arrays, scripts. Plotting data

 

E2.1-2.6

2

Sept. 9

Descriptive Statistics
Histograms, percentiles, mean, median, variance, etc.
Characterizing streamflow data

   

3

Sept. 11

Probability
Experiments, outcomes, sample spaces, events, probability, axioms of probability. Methods for assigning probabilities.

PS1 out

D1.2-1.4

 

Sept. 15

Recitation 2 -- MATLAB Operations
Internal MATLAB functions.  Common MATLAB operations, element-wise computations, loops
Translating equations to programs

 

E3.1-3.4

4

Sept. 16

Joint Probability, Independence, Repeated Trials
Joint probability, independent events, repeated trials, Bernoulli trials

 

D2.1-2.2

5

Sept. 18

Combinatorial Methods
Counting rules, combinatorial techniques for evaluating probabilities. Examples.

PS1 in
PS2 out

D2.3

 

Sept. 22

Recitation 3 -- MATLAB Tests and Loops
Relational and logical operations, user-defined functions, if tests.

 

E3.5-3.6

6

Sept. 23

Conditional Probability and Bayes' Theorem
Joint probability, conditional probability, prior and posterior probabilities, Bayes' theorem.

 

D2.4-2.5 

7

Sept. 25

Random Variables and Probability Distributions
Definition of a random variable.  Cumulative distribution functions, mass and density functions.  Using distributions to assign probabilities.

PS2 in
PS3 out

 
 

Sept. 29

Recitation 4 -- Virtual Experiments

   

8

Sept. 30

Expectation, Functions of a Random Variable
Expectation, mean and variance of a random variable.  Functions of a single random variable. Solving derived distribution problems with stochastic simulation.

 

D3.1-3.2, E4.1-4.2

9

Oct. 2

Risk
Defining and evaluating risk. Engineering applications.

PS3 in 

D3.3

 

Oct. 6

Quiz Review

   
 

Oct. 7

Quiz 1

   

10

Oct. 9

Some Common Probability Distributions
Binomial, Poisson, uniform, exponential, normal, and lognormal distributions.
Fitting distributions to data

PS4 out

D3.4-3.6, D4.3-4.6

 

Oct. 13

No Class

   

11

Oct. 14

Multivariate Probability
Multiple random variables, joint and conditional distributions, independence, covariance and correlation.
Computing conditional probabilities in MATLAB

 

D5.1-5.2

12

Oct. 16

Functions of Many Random Variables
Derived distributions for multivariate problems, moments of linear functions of several random variables.
Central Limit Theorem.

Oct. 20

Recitation 5 -- Time Series

13

Oct. 21

Populations and Samples
Populations, random samples. Sample statistics, moments of the sample mean and variance.

PS4 in
PS5 out

D5.3-5.5

14

Oct. 23

Estimation
Estimating distributional properties, method of moments, assessing estimation error.
Comparing alternative estimators.

 

D6.1-6.2

15

Oct. 28

Confidence Intervals
Basic concepts, large-sample confidence intervals for the population mean.
Computing large-sample confidence intervals

PS5 in
PS6 out

D7.1-7.3

 

Oct. 27

Review

   

16

Oct. 28

Testing Hypotheses about a Single Population
Formulating hypothesis testing problems, definitions.  Large sample tests of hypotheses about a single population
Applications using MATLAB

 

D8.1-8.5

17

Oct. 30

Testing Hypotheses about Two Populations
Large sample tests of hypotheses about two populations.  Controlled experiments.
Applications using MATLAB.

PS6 in

D9.1-9.3

 

Nov. 3

Quiz Review

   

 

Nov. 4

Quiz 2

   

18

Nov. 6

Small Samples
t, chi-squared, and F statistics. Small-sample confidence intervals and hypothesis tests.
Applications using MATLAB.

PS7 out

 
 

Nov. 10

No Class

   
 

Nov. 11

No Class

   

19

Nov. 13

Analysis of Variance
Formulating linear models, definitions, posing hypotheses about model parameters

PS7 in
PS8 out

 
 

Nov. 17

Review of Quiz 2, ANOVA examples

   

20

Nov. 18

Analysis of Variance (continued)
Testing the significance of a single factor, the F test

   

21

Nov. 20

Multifactor Analysis of Variance
Extension of the single-factor model, significance testing
Applications using MATLAB

PS8 in
PS9 out

D11.1-11.4

 

Nov. 24

Examples

   

22

Nov. 25

Linear Regression
Objectives and assumptions of linear regression, estimating regression coefficients, normal equations
Some typical environmental applications

 

D12.1-12.2

 

Nov. 27

No Class

   
 

Dec. 1

Quiz Review

   

23

Dec. 2

Analyzing Regression Results
Accuracy of regression estimates and predictions, prediction confidence intervals, testing significance.
Continuation of environmental examples

PS9 in
 

D12.3-12.4 

 

Dec. 4

Quiz 3

   
 

Dec. 8

No Class

   

24

Dec. 9

Some Practical Applications

   

 Copyright 2003 Massachusetts Institute of Technology
 Last modified Sept. 3, 2003