SLT & Alignment Summit 2023
Announcement
Read the announcement on LessWrong. For videos, see here.
For slides, see here.
Schedule
The summit comprises two parts: a virtual primer (available to everyone) and an in-person conference (with limited accommodation available).
All times are displayed in PDT.
Open to everyone. The Primer gives a general introduction to Singular Learning Theory (SLT) and related areas of mathematics and physics, as a foundation for theoretical and experimental work on AI alignment. More concretely, we aim to explain the Free Energy Formula derived by Watanabe, what its terms mean, how to apply it to understand the phase structure of a learning machine, and how to derive intuition for the resulting picture from physics.
This will build on existing lectures at metauni as well as introduce novel content.
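For orientation, the Free Energy Formula is usually stated in the asymptotic form sketched below. The notation is an assumption borrowed from common usage in the SLT literature, not taken from this page: $n$ is the number of samples, $L_n$ the empirical negative log-likelihood, $w_0$ an optimal parameter, $\lambda$ the RLCT, and $m$ its multiplicity.

```latex
% Asymptotic expansion of the Bayesian free energy F_n as n -> infinity
F_n = n L_n(w_0) + \lambda \log n - (m - 1) \log\log n + O_p(1)
```

For a regular model with $d$ parameters, $\lambda = d/2$ and $m = 1$, so the leading terms reduce to the familiar BIC penalty $\tfrac{d}{2}\log n$.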
| Time | Monday | Tuesday | Wednesday | Thursday | Friday |
| --- | --- | --- | --- | --- | --- |
| 9:00-10:30 | Welcome | SLT High 1 | Physics 3 | | |
| 11:00-12:30 | SLT Low 1 | SLT Low 2 | SLT Low 3 | SLT Low 4 | |
| 1:30-3:00 | Physics 1 | Physics 2 | SLT High 2 | Physics 4 | |
| 3:30-5:00 | Alignment 1 | Alignment 2 | Mech interp 1 | Mech interp 2 | SLT High 4 |
| 18:00-18:30 | Watanabe | The Plan |
SLT High Road
The SLT “high road” explains the conceptual toolkit and how to use it to reason about learning machines, leaving the proofs and details for later (“just tell me why it’s useful to know this”).
- SLT High 1 (Dan Murfet): Logic of Phase Transitions
- SLT High 2 (Liam Carroll): Phase transitions in toy ReLU networks
- SLT High 3 (Dan Murfet): Meaning of the RLCT
- SLT High 4 (Dan Murfet): Thermodynamics of Superposition
SLT Low Road
The SLT “low road” looks at detailed examples and calculations and sketches of how the mathematical theory fits together (“show me how it works in an example”).
- SLT Low 1 (Edmund Lau): Introduction to Bayesian probability, Bayesian posterior and model selection
- SLT Low 2 (Edmund Lau): Bridging regular and singular
- SLT Low 3 (Zhongtian Chen): Introduction to Algebraic Geometry, blowups, resolution of singularities and computing RLCTs
- SLT Low 4 (Edmund Lau): Sketch of derivation of Free Energy Formula
Physics
- Physics 1 (Jesse Hoogland): The Physics of Intelligence: from Classical to Singular Learning Theory
- Physics 2 (Jesse Hoogland): Statistical mechanics, Boltzmann distribution, free energy, phases and phase transitions
- Physics 3 (Jesse Hoogland): Singularities and nonlinear dynamics (following e.g. Strogatz)
- Physics 4 (Dan Murfet): Catastrophe theory
Alignment and Mechanistic Interpretability
- Alignment 1 (Jesse Hoogland): The case for AI X-risk
- Alignment 2 (Jesse Hoogland): The state of AI safety
- Mech interp 1 (Ben Gerraty): Introduction and Superposition
- Mech interp 2 (Rohan Hitchcock): Induction Heads and Phase Transitions
- The Plan (Jesse Hoogland, Dan Murfet): An outline of the plan for applying SLT to alignment, via developmental interpretability.
The second week of the workshop is for discussing open problems, collaboration, and more mathematical details beyond the introductions in the first week. Add the schedule to your Google calendar.