CyberIntel ⬡ News
★ Saved ◆ Cyber Reads
← Back ◬ AI & Machine Learning Apr 01, 2026

Grokking From Abstraction to Intelligence

arXiv AI Archived Apr 01, 2026 ✓ Full text saved

arXiv:2603.29262v1 Announce Type: new Abstract: Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives this phenomenon. We propose that grokking originates from a

Full text archived locally
✦ AI Summary · Claude Sonnet


    Computer Science > Artificial Intelligence [Submitted on 31 Mar 2026] Grokking From Abstraction to Intelligence Junjie Zhang, Zhen Shen, Gang Xiong, Xisong Dong Grokking in modular arithmetic has established itself as the quintessential fruit fly experiment, serving as a critical domain for investigating the mechanistic origins of model generalization. Despite its significance, existing research remains narrowly focused on specific local circuits or optimization tuning, largely overlooking the global structural evolution that fundamentally drives this phenomenon. We propose that grokking originates from a spontaneous simplification of internal model structures governed by the principle of parsimony. We integrate causal, spectral, and algorithmic complexity measures alongside Singular Learning Theory to reveal that the transition from memorization to generalization corresponds to the physical collapse of redundant manifolds and deep information compression, offering a novel perspective for understanding the mechanisms of model overfitting and generalization. Comments: 22page and 5 figures,In this paper, we analyze the grokking phenomenon from the perspective of Singular Learning Theory (SLT). This work is currently under review for ICML 2026 Subjects: Artificial Intelligence (cs.AI) Cite as: arXiv:2603.29262 [cs.AI]   (or arXiv:2603.29262v1 [cs.AI] for this version)   https://doi.org/10.48550/arXiv.2603.29262 Focus to learn more Submission history From: JunJie Zhang [view email] [v1] Tue, 31 Mar 2026 04:57:12 UTC (4,511 KB) Access Paper: HTML (experimental) view license Current browse context: cs.AI < prev   |   next > new | recent | 2026-03 Change to browse by: cs References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
    💬 Team Notes
    Article Info
    Source
    arXiv AI
    Category
    ◬ AI & Machine Learning
    Published
    Apr 01, 2026
    Archived
    Apr 01, 2026
    Full Text
    ✓ Saved locally
    Open Original ↗