arXiv:2603.29262v1 Announce Type: new Abstract: Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives this phenomenon. We propose that grokking originates from a
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Artificial Intelligence
[Submitted on 31 Mar 2026]
Grokking From Abstraction to Intelligence
Junjie Zhang, Zhen Shen, Gang Xiong, Xisong Dong
Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives this phenomenon. We propose that grokking originates from a spontaneous simplification of internal model structures governed by the principle of parsimony. We integrate causal, spectral, and algorithmic complexity measures alongside Singular Learning Theory to reveal that the transition from memorization to generalization corresponds to the physical collapse of redundant manifolds and deep information compression, offering a novel perspective for understanding the mechanisms of model overfitting and generalization.
Comments: 22page and 5 figures,In this paper, we analyze the grokking phenomenon from the perspective of Singular Learning Theory (SLT). This work is currently under review for ICML 2026
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.29262 [cs.AI]
(or arXiv:2603.29262v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.29262
Focus to learn more
Submission history
From: JunJie Zhang [view email]
[v1] Tue, 31 Mar 2026 04:57:12 UTC (4,511 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.AI
< prev | next >
new | recent | 2026-03
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)