6.863J/9.611J Natural Language Processing
 
 
 

Staff
Prof. Robert C. Berwick
berwick@csail.mit.edu
32-D728, x3-8918
Office hours: W 4:30-5:30

Course Support
Lisa Gaumond
lisag@mit.edu
32-D724, 617-324-1543
TA: Rob Speer, 32-226
rspeer@mit.edu; office hrs Tues, 2-5

Course Time & Place
Lectures: M, W 3-4:30 PM
Room: 32-144

Level & Prerequisites
Undergrad/Graduate; 6.034 or permission of instructor

Policies
Textbooks & readings
Grading marks guide
Style guide

Course Description

A laboratory-oriented course in the theory and practice of building computer systems for human language processing, with an emphasis on how human knowledge of language can be integrated into natural language processing.

This subject qualifies as an Artificial Intelligence and Applications concentration subject.

Announcements

[Calendar, February through May 2008: class days shown in blue, holidays in green, and registration add/drop and final project dates in orange.]
Course schedule at a glance
Date; Topic; Slides & Reference Readings; Laboratory/Assignments

2/6
Weds

Introduction: walking the walk, talking the talk
• Lecture 1 pdf slides; pdf bw, 4-up
• Jurafsky & Martin (JM) ch. 4, pp. 1-8; review ch. 2 on finite-state automata/regular expressions if necessary.
• NLTK docs, ch. 1-3; if you already know Python, just ch. 3 on words.
• Background Reading (for RR 1): Jurafsky & Martin on ngrams.
• Background Reading (for RR 1): Abney on statistics and language.
• Background Reading (for RR 1): Chomsky, Extract on grammaticality, 1955.
• Background chapters on NLP from Russell & Norvig, ch. 22.
Reading & response 1 out
(Ngrams; NLTK Python warmup)
NLTK installation here
2/11
Mon
Ngrams; smoothing; Word parsing & transducers
RR1 discussion

Lecture 2 pdf slides; pdf bw 4-up
JM ch. 3; ch. 10, pp. 1–7
Notes on finite-state automata and learning: Notes 1
• Angluin, Induction of k reversible automata
• Berwick & Pilato, Learning syntax by automata induction
Background Reading: Karttunen, History of two-level morphology, 1996.
Reading & response 1 due MON
2/13
Weds
Word parsing II; complexity issues
Lecture 3 pdf slides; pdf 4-up
• Background Reading (RR 2): Harris, From phoneme to morpheme, 1955.
2/19
Tues
Word parsing complexity; What do children do? Part of speech tagging

• Lecture 4 pdf slides; pdf 4-up
• Background Reading (RR 2): Saffran, Statistical learning in 8-month-old infants, 1996.
• Background Reading: Yang, Universal grammar, statistics, or both?, 2004.

2/20
Weds
Part of speech tagging; Finding words by MDL
• Lecture 5 pdf slides; pdf 4-up
2/25
Mon
Parsing & syntax I
• Airline delay: Lecture 6 pdf slides; pdf 4-up
• JM, ch. 6
• NLTK docs, words & tagging, ch. 4
Reading & response 2 due MON
2/27
Weds
Airline parsing

• Lecture 6 pdf slides; pdf 4-up
• Russell & Norvig, ch. 23.
• JM, ch. 11 & ch. 12

Lab 1 parts 1 and 2 due WEDS
3/3
Mon

RR2 discussion; Parsing II: basic dynamic programming

• Lecture 7 pdf slides; pdf 4-up
• JM ch. 12 new (cf parsing) pdf.

Lab 1 part 3 due MON
Lab 2 out
3/5
Weds
Earley parsing; Probabilistic parsing & Treebanks
• Lecture 8 pdf slides; pdf 4-up
• Lecture 8a ('animation' of probabilistic CKY) here.
• Billot & Lang on 'packed parse forests' here. (Warning: advanced automata theory required to understand this paper.)



3/10
Mon

Learning syntax I: basic results

• Lecture 9 pdf slides; pdf 4-up
NLTK docs, ch. 8
• Background Reading: Levelt, Grammatical inference (brief); Pinker, Formal models of language learning.
• Background Reading: Gold, Language identification in the limit, 1967.

3/12
Weds
Learning syntax II; More on basic results
• Lecture 10 pdf slides; pdf 4-up
NLTK docs, ch. 9