The Structure and Interpretation of Deep Networks

A collaborative class handbook on current research methods in mechanistic interpretability.

Table of Contents

Introduction and History

Understanding Representation

Understanding Computation

Understanding Learning

Understanding the World

Class Projects