Decomposing prediction mechanisms for in-context recall: A toy problem using dynamical systems

Speaker: Gireeja Ranade

Affiliation: Associate Teaching Professor

Abstract: We introduce a new family of dynamical-system-based toy problems to study in-context learning (ICL) and associative recall in transformer models. These problems combine features of linear-regression-style continuous ICL with discrete associative recall. Specifically, we consider symbolically labeled, interleaved observation segments drawn from randomly sampled linear deterministic dynamical systems. We pretrain transformer models on sample traces from this toy problem and study whether they can recall the state of a sequence previously seen in context when prompted with its in-context label. Performing the task requires the model to carry out several subtasks, e.g., identifying the dynamical system a particular segment belongs to, and continuing a prediction once it has been initiated. We find that the ability to predict the first token of a segment (which amounts to identifying the underlying system) emerges well into training. Surprisingly, predicting the second token of a segment (which amounts to continuing a prediction) can be learned earlier, before the ability to predict the first token emerges. Through out-of-distribution experiments and a mechanistic analysis of model weights via edge pruning, we find that next-token prediction in this toy problem involves at least two separate mechanisms. One mechanism uses the discrete labels to perform the associative recall required to predict the first token when a previously seen sequence resumes. The second mechanism, which is largely agnostic to the discrete labels, performs a Bayesian-style prediction based on the previous token and the context. These two mechanisms have different learning dynamics. To confirm that this multi-mechanism phenomenon (manifesting as separate phase transitions) is not merely an artifact of our toy setting, we evaluate OLMo training checkpoints on an ICL translation task and observe a similar phenomenon: a decisive gap between the emergence of first-task-token performance and second-task-token performance.
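To make the toy construction concrete, below is a minimal sketch of one plausible data-generation process. It assumes random matrices rescaled for stability, fixed-length segments, and labels interleaved as single discrete tokens; the names (sample_system, make_trace) and all parameter choices are illustrative assumptions, not the authors' exact construction.

```python
# Minimal sketch of the toy data-generation process (assumptions noted above).
import numpy as np

def sample_system(dim, rng):
    """Draw a random linear deterministic system x_{t+1} = A x_t,
    rescaled to spectral radius 1 so trajectories stay roughly bounded
    (a stability assumption, not necessarily the authors' choice)."""
    A = rng.standard_normal((dim, dim))
    A /= np.max(np.abs(np.linalg.eigvals(A)))
    return A

def make_trace(n_systems=3, dim=4, seg_len=5, n_segments=6, rng=None):
    """Build one interleaved trace: each segment is a discrete label
    followed by seg_len consecutive states of the labeled system.
    Each system keeps its own running state, so a later segment with
    the same label resumes where the earlier one left off."""
    if rng is None:
        rng = np.random.default_rng()
    systems = [sample_system(dim, rng) for _ in range(n_systems)]
    states = [rng.standard_normal(dim) for _ in range(n_systems)]
    trace = []
    for _ in range(n_segments):
        k = int(rng.integers(n_systems))   # which system this segment shows
        trace.append(("label", k))         # discrete in-context label token
        for _ in range(seg_len):
            trace.append(("obs", states[k].copy()))
            states[k] = systems[k] @ states[k]  # advance the labeled system
    return trace

trace = make_trace()
# Tokenization of (label, obs) pairs into model inputs is omitted here.
```

In a trace like this, the first observation after a repeated label tests associative recall (retrieving the stored state of that system), while the following observations test continuation of an already-initiated prediction, mirroring the first-token vs. second-token distinction in the abstract.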
Bio: