Decomposing prediction mechanisms for in-context recall: A toy problem using dynamical systems

Speaker: Gireeja Ranade

Affiliation: Associate Teaching Professor

Abstract: We introduce a new family of dynamical-system-based toy problems to study in-context learning (ICL) and associative recall in transformer models. These problems combine features of linear-regression-style continuous ICL with discrete associative recall. Specifically, we consider symbolically labeled, interleaved observation segments drawn from randomly sampled linear deterministic dynamical systems. We pretrain transformer models on sample traces from this toy problem and study whether they can recall the state of a sequence previously seen in context when prompted with its in-context label. Performing the task requires the model to carry out several subtasks, e.g., identifying the dynamical system a particular segment belongs to, and continuing a prediction once it has been initiated. We find that the ability to predict the first token of a segment (which amounts to identifying the underlying system) emerges well into training. Surprisingly, predicting the second token of a segment (which amounts to continuing a prediction) can be learned earlier, before the ability to predict the first token emerges. Through out-of-distribution experiments and a mechanistic analysis of model weights via edge pruning, we find that next-token prediction in this toy problem involves at least two separate mechanisms. One mechanism uses the discrete labels to perform the associative recall required to predict the first token when a previously seen sequence resumes. The second mechanism, which is largely agnostic to the discrete labels, performs a Bayesian-style prediction based on the previous token and the context. These two mechanisms have different learning dynamics. To confirm that this multi-mechanism phenomenon (manifesting as separate phase transitions) is not merely an artifact of our toy setting, we evaluate OLMo training checkpoints on an ICL translation task and observe a similar phenomenon: a decisive gap between the emergence of first-task-token performance and second-task-token performance.
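To make the toy construction concrete, below is a minimal sketch of one plausible data-generation process. It assumes random matrices rescaled for stability, fixed-length segments, and labels interleaved as single discrete tokens; the names (sample_system, make_trace) and all parameter choices are illustrative assumptions, not the authors' exact construction.

```python
# Minimal sketch of the toy data-generation process (assumptions noted above).
import numpy as np

def sample_system(dim, rng):
    """Draw a random linear deterministic system x_{t+1} = A x_t,
    rescaled to spectral radius 1 so trajectories stay roughly bounded
    (a stability assumption, not necessarily the authors' choice)."""
    A = rng.standard_normal((dim, dim))
    A /= np.max(np.abs(np.linalg.eigvals(A)))
    return A

def make_trace(n_systems=3, dim=4, seg_len=5, n_segments=6, rng=None):
    """Build one interleaved trace: each segment is a discrete label
    followed by seg_len consecutive states of the labeled system.
    Each system keeps its own running state, so a later segment with
    the same label resumes where the earlier one left off."""
    if rng is None:
        rng = np.random.default_rng()
    systems = [sample_system(dim, rng) for _ in range(n_systems)]
    states = [rng.standard_normal(dim) for _ in range(n_systems)]
    trace = []
    for _ in range(n_segments):
        k = int(rng.integers(n_systems))   # which system this segment shows
        trace.append(("label", k))         # discrete in-context label token
        for _ in range(seg_len):
            trace.append(("obs", states[k].copy()))
            states[k] = systems[k] @ states[k]  # advance the labeled system
    return trace

trace = make_trace()
# Tokenization of (label, obs) pairs into model inputs is omitted here.
```

In a trace like this, the first observation after a repeated label tests associative recall (retrieving the stored state of that system), while the following observations test continuation of an already-initiated prediction, mirroring the first-token vs. second-token distinction in the abstract.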
Bio: