
AI Safety Fundamentals

Spring 2026 Curriculum

An 8-week introductory reading group covering the current trajectory of AI, evidence for misalignment, threat models, technical safety approaches, and the AI policy landscape. Participants meet weekly in small sections facilitated by experienced TAs. No work is assigned outside of weekly meetings.

Week 0

Introduction to Machine Learning

Self-paced ML fundamentals: neural networks, backpropagation, and transformers.

Week 1

Transformative AI and Current Trajectory

Scaling drivers, capability trends, and time-horizon forecasting toward AGI.

Week 2

Outer Alignment

Reward misspecification, specification gaming, RLHF, and the gap between intended and operationalized goals.

Week 3

Inner Alignment

Deception, reward tampering, alignment faking, and goal misgeneralization.

Week 4

Threat Models

Instrumental convergence, power-seeking, bioterrorism, cyberwarfare, and gradual disempowerment.

Week 5 (Coming Soon)

Control and Scalable Oversight

AI control techniques, weak-to-strong generalization, debate, and iterated amplification.

Week 6 (Coming Soon)

Mechanistic Interpretability and Evals

Circuits, sparse autoencoders, feature visualization, and evaluation methodologies.

Week 7 (Coming Soon)

International Policy and Liability

Export controls, compute governance, model weight security, and AI liability frameworks.

Week 8 (Coming Soon)

Policy and Careers in Alignment

AI regulation toolbox, career paths in alignment research, and next steps.