Loss Landscape Degeneracy and Stagewise Development of Transformers
Jesse Hoogland*
Timaeus
George Wang*
Timaeus
Matthew Farrugia-Roberts
Timaeus
Liam Carroll
Timaeus
Susan Wei
University of Melbourne
Daniel Murfet
University of Melbourne
* Equal contribution
February 4, 2024 · TMLR · Best Paper at 2024 ICML HiLD Workshop
Abstract
We show that in-context learning emerges in transformers in discrete developmental stages, when they are trained on either language modeling or linear regression tasks. We introduce two methods for detecting the milestones that separate these stages, by probing the geometry of the population loss in both parameter space and function space. We study the stages revealed by these new methods using a range of behavioral and structural metrics to establish their validity.
Main contributions:
- The local learning coefficient (LLC) automatically detects hidden developmental stages. The LLC reveals stagewise structure in the formation of in-context learning, much of which is invisible in the loss curve (a minimal sketch of an LLC estimator follows this list).
- Essential dynamics discovers emergent behaviors. The paper introduces essential dynamics, a function-space analysis of the training trajectory, and shows that it can surface emergent behaviors. We expect follow-ups to this technique to yield new kinds of evals that discover emergent capabilities automatically, rather than relying on hand-picked, manually evaluated capabilities (see the second sketch below).
- Developmental interpretability works. On closer inspection, these hidden stages can be interpreted both behaviorally and structurally, showing that the developmental approach supports mechanistic as well as behavioral analyses.
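To make the first contribution concrete, here is a minimal sketch of an SGLD-based estimator of the local learning coefficient, which compares the average loss under a posterior localized around the trained parameters w* with the loss at w* itself. This is an illustrative reconstruction, not the paper's implementation: the function name `estimate_llc`, the argument structure, and the hyperparameter defaults (`eps`, `gamma`, the step counts) are assumptions.

```python
# Minimal sketch of an SGLD-based estimator of the local learning coefficient
# (LLC). Hyperparameter defaults (eps, gamma) and the function/argument names
# are illustrative assumptions, not the paper's settings.
import copy
import math

import torch


def estimate_llc(model, loss_fn, data_loader, n_steps=500, n_burnin=100,
                 eps=1e-4, gamma=100.0, n_data=None):
    """Estimate lambda-hat(w*) = n * beta * (E_w[L_n(w)] - L_n(w*)),
    where w is drawn by SGLD from a posterior localized around the trained
    parameters w*, and beta = 1 / log(n) is the inverse temperature."""
    device = next(model.parameters()).device
    n_data = n_data if n_data is not None else len(data_loader.dataset)
    beta = 1.0 / math.log(n_data)

    def mean_loss(m):
        # Average loss over one pass of the loader (estimate of L_n).
        with torch.no_grad():
            losses = [loss_fn(m(x.to(device)), y.to(device)).item()
                      for x, y in data_loader]
        return sum(losses) / len(losses)

    loss_star = mean_loss(model)  # L_n(w*)

    # SGLD chain started at w*, with an L2 pull (strength gamma) back to w*.
    sampler = copy.deepcopy(model)
    w_star = [p.detach().clone() for p in model.parameters()]
    chain_losses, step = [], 0
    while step < n_burnin + n_steps:
        for x, y in data_loader:
            loss = loss_fn(sampler(x.to(device)), y.to(device))
            sampler.zero_grad()
            loss.backward()
            with torch.no_grad():
                for p, p0 in zip(sampler.parameters(), w_star):
                    drift = n_data * beta * p.grad + gamma * (p - p0)
                    noise = math.sqrt(eps) * torch.randn_like(p)
                    p.add_(-0.5 * eps * drift + noise)
            if step >= n_burnin:
                chain_losses.append(loss.item())
            step += 1
            if step >= n_burnin + n_steps:
                break

    expected_loss = sum(chain_losses) / len(chain_losses)
    return n_data * beta * (expected_loss - loss_star)
```

Tracking this estimate across training checkpoints is what reveals the plateaus and transitions that delimit the developmental stages.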
See the accompanying tweet thread (distillation coming soon).
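The second contribution, essential dynamics, amounts to a principal component analysis of the model's outputs over training: evaluate every checkpoint on a fixed probe batch, treat the flattened outputs as points in function space, and project the resulting trajectory onto its leading principal components. The sketch below is an illustrative reconstruction under those assumptions; the names `essential_dynamics`, `checkpoints`, and `probe_inputs` are ours, not the authors'.

```python
# Minimal sketch of "essential dynamics": PCA on the trajectory of model
# outputs over training. The names `essential_dynamics`, `checkpoints`, and
# `probe_inputs` are illustrative; this is not the authors' implementation.
import numpy as np
import torch
from sklearn.decomposition import PCA


def essential_dynamics(checkpoints, probe_inputs, n_components=3):
    """Project the function-space training trajectory onto its top PCs.

    checkpoints:  sequence of models, one per saved training step.
    probe_inputs: a fixed batch of inputs evaluated by every checkpoint.
    Returns an array of shape (n_checkpoints, n_components).
    """
    outputs = []
    with torch.no_grad():
        for model in checkpoints:
            model.eval()
            # The flattened outputs on the probe batch serve as the model's
            # coordinates in (a finite-dimensional sample of) function space.
            outputs.append(model(probe_inputs).reshape(-1).cpu().numpy())
    trajectory = np.stack(outputs)        # (n_checkpoints, n_probe_dims)
    pca = PCA(n_components=n_components)
    return pca.fit_transform(trajectory)  # low-dimensional trajectory
```

In the paper's analysis, developmental milestones appear as distinctive geometric features of this low-dimensional trajectory, such as cusps and sharp turns, which is how the method can flag emergent behaviors without hand-picked evals.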
Cite as
@article{hoogland2024developmental,
  title    = {Loss Landscape Degeneracy and Stagewise Development of Transformers},
  author   = {Jesse Hoogland and George Wang and Matthew Farrugia-Roberts and Liam Carroll and Susan Wei and Daniel Murfet},
  journal  = {Transactions on Machine Learning Research},
  year     = {2025},
  url      = {https://openreview.net/forum?id=2JabyZjM5H},
  abstract = {We show that in-context learning emerges in transformers in discrete developmental stages, when they are trained on either language modeling or linear regression tasks. We introduce two methods for detecting the milestones that separate these stages, by probing the geometry of the population loss in both parameter space and function space. We study the stages revealed by these new methods using a range of behavioral and structural metrics to establish their validity.}
}