
Measure Performance of a Learning Algorithm

Procedure

Collect large set of examples (as large and diverse as possible)

Divide into 2 disjoint sets (training set and test set)

Learn concept based on training set, generating hypothesis H

Classify test set examples using H, measure percentage correctly classified

Repeat for training sets of different sizes; the percentage correct on the test set should improve as training set size increases (learning curve)
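
The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used to produce the figure in these notes: the data set (points in [0, 1) labeled by a hidden cutoff of 0.6), the threshold learner, and all function names are hypothetical and chosen only to keep the example small.

```python
import random

def make_examples(n, rng):
    """Hypothetical data: x in [0, 1), labeled True when x >= 0.6."""
    return [(x, x >= 0.6) for x in (rng.random() for _ in range(n))]

def learn_threshold(training_set):
    """Learn hypothesis H: a cutoff halfway between the largest
    negative example and the smallest positive example."""
    neg = [x for x, label in training_set if not label]
    pos = [x for x, label in training_set if label]
    lo = max(neg) if neg else 0.0
    hi = min(pos) if pos else 1.0
    threshold = (lo + hi) / 2
    return lambda x: x >= threshold

def accuracy(h, test_set):
    """Fraction of test examples that H classifies correctly."""
    correct = sum(1 for x, label in test_set if h(x) == label)
    return correct / len(test_set)

def learning_curve(examples, test_fraction, sizes):
    """Split into disjoint test and training pools, then train on
    growing prefixes of the training pool and score each H on the
    same held-out test set."""
    n_test = int(len(examples) * test_fraction)
    test_set, pool = examples[:n_test], examples[n_test:]
    curve = []
    for m in sizes:
        h = learn_threshold(pool[:m])
        curve.append((m, accuracy(h, test_set)))
    return curve

rng = random.Random(0)                # fixed seed for repeatability
examples = make_examples(1000, rng)   # already in random order
curve = learning_curve(examples, 0.3, [1, 5, 20, 100, 700])
for m, acc in curve:
    print(f"{m:4d} training examples -> {100 * acc:.1f}% correct")
```

Plotting the training set size against the percentage correct gives the learning curve. Individual points may dip from one size to the next; averaging over several random splits gives a smoother curve.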

How quickly does it learn?

[Figure: learning curve (figures/LC.PS)]