The Loss Kernel: A Geometric Probe for Deep Learning Interpretability
Maxwell Adam
Timaeus & University of Melbourne
Zach Furman
University of Melbourne
Jesse Hoogland
Timaeus
September 30, 2025
Abstract
We introduce the loss kernel, an interpretability method for measuring similarity between data points as seen by a trained neural network. The kernel is the covariance matrix of per-sample losses computed under a distribution of parameter perturbations that approximately preserve the trained model's low loss. We first validate our method on a synthetic multitask problem, showing that it separates inputs by task as predicted by theory. We then apply the kernel to Inception-v1 to visualize the structure of ImageNet, and we show that the kernel's structure aligns with the WordNet semantic hierarchy. This establishes the loss kernel as a practical tool for interpretability and data attribution.
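The construction described in the abstract can be sketched in a few lines: draw parameter perturbations around the trained weights, record each data point's loss under every perturbed model, and take the covariance across draws. The sketch below, a minimal illustration rather than the paper's implementation, uses a toy linear model with squared-error loss and isotropic Gaussian noise as a crude stand-in for the paper's distribution of loss-preserving perturbations; all names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained network: a linear model y = X @ w with
# squared-error loss, whose "trained" weights w_star fit the data exactly.
d, n = 5, 8                          # parameter dim, number of data points
X = rng.normal(size=(n, d))          # inputs
w_star = rng.normal(size=d)          # trained parameters
y = X @ w_star                       # targets (zero loss at w_star)

def per_sample_losses(w):
    """Squared-error loss of each of the n data points under parameters w."""
    return (X @ w - y) ** 2

# Sample small isotropic Gaussian perturbations around the trained
# parameters. (The paper samples from a distribution of perturbations that
# approximately preserves the low loss; small isotropic noise is a proxy.)
num_draws, sigma = 2000, 0.05
L = np.stack([per_sample_losses(w_star + sigma * rng.normal(size=d))
              for _ in range(num_draws)])    # shape (num_draws, n)

# Loss kernel: covariance of per-sample losses across the perturbations.
K = np.cov(L, rowvar=False)          # shape (n, n), symmetric PSD
```

Entry `K[i, j]` is large when points `i` and `j` have losses that rise and fall together across perturbations, i.e. when the network treats them similarly; this is the similarity structure the paper visualizes for ImageNet.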