Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok

Date:

ICML