Building 5-122
77 Massachusetts Avenue
Cambridge MA 02139-4307

tel. 617.253.2850
fax. 617.258.8792

Assessment and Evaluation


Educational assessment:  The collecting, synthesizing, and interpreting of data whose findings aid pedagogical decision making.  Areas of assessment include student performance, instructional strategies, educational technologies, and learning environment.1

Experiment:  A study undertaken in which the researcher has control over some of the conditions in which the study takes place and control over (some aspects of) the independent variables being studied.  Random assignment of subjects to control and experimental groups is usually thought of as a necessary criterion of a true experiment.3
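
The random assignment described above can be sketched in a few lines. This is only an illustration, not a prescribed procedure: the function name `randomly_assign`, the even split into two groups, and the subject IDs 1 through 20 are assumptions made for the example.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups.

    Hypothetical helper for illustration: the first half of the shuffled
    list becomes the control group, the remainder the experimental group.
    """
    rng = random.Random(seed)      # a seeded generator makes the split repeatable
    shuffled = list(subjects)      # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


# Hypothetical subject IDs 1..20
groups = randomly_assign(range(1, 21), seed=42)
```

Because chance alone decides group membership, pre-existing differences among subjects tend to be spread evenly across the two groups, which is why random assignment is usually treated as the mark of a true experiment.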

External validity:  The extent to which the findings of a study are relevant to subjects and settings beyond those in the study.  Another term for generalizability.3

Formative assessment:  A type of assessment conducted during the course of program implementation whose primary purpose is to provide information to improve the program under study.4

Internal validity:  The extent to which the results of a study (usually an experiment) can be attributed to the treatment rather than to flaws in the research design; in other words, the degree to which one can draw valid conclusions about the causal effects of one variable on another.3

Quasi-experiment:  A type of research design for conducting studies in field or real-life situations where the researcher may be able to manipulate some independent variables but cannot randomly assign subjects to control and experimental groups.3

Program evaluation:  The systematic investigation of the process and outcomes of an educational program or policy.4

Qualitative research:  Research that examines phenomena primarily through words and tends to focus on dynamics, meaning, and context.  Qualitative research usually uses observation, interviewing, and document reviews to collect data.4

Quantitative research:  Research that examines phenomena that can be expressed numerically and analyzed statistically.4

Reliability:  The extent to which scores obtained on a measure are reproducible in repeated administrations.2
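
One common way to quantify reproducibility across repeated administrations is the correlation between scores from two administrations of the same measure (often called test-retest reliability). The sketch below assumes that approach; the function name `pearson_r` and the score lists are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five students on two administrations
first_administration = [72, 85, 90, 64, 78]
second_administration = [70, 88, 91, 60, 80]

test_retest_r = pearson_r(first_administration, second_administration)
```

A coefficient near 1 indicates that students kept roughly the same relative standing across the two administrations; a coefficient near 0 indicates that scores were not reproducible.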

Summative evaluation:  A study conducted at the end of a program (or a phase of a program) to determine the extent to which anticipated outcomes were produced.  Summative evaluation is intended to provide information about the worth of the program.4

Threats to validity:  Conditions other than the program [treatment] that could be responsible for observed net outcomes; conditions that typically occur in quasi-experiments and, unless controlled, limit confidence that findings are due solely to the program.  Threats to validity include selection, attrition, outside events or history, instrumentation, maturation, statistical regression, and testing.4

Triangulation:  Using multiple methods and/or data sources to study the same phenomenon; qualitative researchers frequently use triangulation to verify their data.4

Validity:  In measurement, validity refers to the extent to which a measure captures the dimension of interest.4


1 Airasian P. Classroom Assessment. 3rd ed. New York: McGraw-Hill, 1997.

2 Rossi P.H., Freeman H.E., Lipsey M.W. Evaluation: A Systematic Approach. 6th ed. Thousand Oaks, CA: Sage, 1999.

3 Vogt W.P. Dictionary of Statistics and Methodology. Newbury Park, CA: Sage, 1993.

4 Weiss C.H. Evaluation. 2nd ed. Upper Saddle River, NJ: Prentice Hall, 1998.
