
Form Follows Function: Recursive Stem Model

arXiv AI Archived Mar 18, 2026 ✓ Full text saved

arXiv:2603.15641v1 Announce Type: new


✦ AI Summary · Claude Sonnet

Computer Science > Artificial Intelligence

[Submitted on 3 Mar 2026]

Form Follows Function: Recursive Stem Model

Navid Hakimi

Recursive reasoning models such as Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM) show that small, weight-shared networks can solve compute-heavy and NP puzzles by iteratively refining latent states, but their training typically relies on deep supervision and/or long unrolls that increase wall-clock cost and can bias the model toward greedy intermediate behavior. We introduce Recursive Stem Model (RSM), a recursive reasoning approach that keeps the TRM-style backbone while changing the training contract so the network learns a stable, depth-agnostic transition operator. RSM fully detaches the hidden-state history during training, treats early iterations as detached "warm-up" steps, and applies loss only at the final step. We further grow the outer recursion depth $H$ and inner compute depth $L$ independently and use a stochastic outer-transition scheme (stochastic depth over $H$) to mitigate instability when increasing depth. This yields two key capabilities: (i) $>20\times$ faster training than TRM while improving accuracy ($\approx 5\times$ reduction in error rate), and (ii) test-time scaling where inference can run for arbitrarily many refinement steps ($\sim 20{,}000\ H_{\text{test}} \gg 20\ H_{\text{train}}$), enabling additional "thinking" without retraining. On Sudoku-Extreme, RSM reaches 97.5% exact accuracy with test-time compute (within ~1 hour of training on a single A100), and on Maze-Hard ($30 \times 30$) it reaches ~80% exact accuracy in ~40 minutes using attention-based instantiation.
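The training contract described above (detached warm-up iterations, loss only at the final step, stochastic depth over H) can be illustrated with a toy sketch. This is not the author's implementation: the scalar operator `step`, the function `train_step`, the skip probability `p_skip`, and the hand-written gradient are all hypothetical stand-ins chosen so the example runs with the standard library alone. In particular, the abstract does not say how the stochastic outer-transition scheme is defined; skipping each warm-up transition with a fixed probability is an assumption. The inner compute depth L is not modeled.

```python
import math
import random


def step(w, z, x):
    """One refinement step of a toy scalar transition operator.

    Hypothetical stand-in for the weight-shared network: z <- tanh(w*z + x).
    """
    return math.tanh(w * z + x)


def train_step(w, x, y, h_train=20, p_skip=0.1, lr=0.5, rng=random):
    """One parameter update with a detached history and a final-step loss."""
    z = 0.0
    # Warm-up: h_train - 1 outer steps computed as plain numbers, so no
    # gradient history is kept for them (the "detached" iterations).
    for _ in range(h_train - 1):
        if rng.random() < p_skip:
            # ASSUMED form of stochastic depth over H: skip this transition.
            continue
        z = step(w, z, x)

    z_prev = z                       # treated as a constant input
    z_final = step(w, z_prev, x)     # the only step that receives a loss
    loss = (z_final - y) ** 2

    # Gradient through the final step only:
    # dL/dw = 2 (z_H - y) * (1 - z_H^2) * z_{H-1}
    grad = 2.0 * (z_final - y) * (1.0 - z_final ** 2) * z_prev
    return w - lr * grad, loss
```

In an autograd framework the same idea would usually be expressed by running the warm-up steps without gradient tracking and backpropagating through the last step only; the manual derivative here plays that role for a single scalar weight.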
Finally, because RSM implements an iterative settling process, convergence behavior provides a simple, architecture-native reliability signal: non-settling trajectories warn that the model has not reached a viable solution and can be a guard against hallucination, while stable fixed points can be paired with domain verifiers for practical correctness checks.

Comments: 11 pages, 9 figures
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2603.15641 [cs.AI] (or arXiv:2603.15641v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.15641

Submission history
From: Navid Hakimi
[v1] Tue, 3 Mar 2026 00:55:00 UTC (1,266 KB)
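The settling-based reliability signal mentioned in the abstract can be sketched as a loop that keeps refining until successive states stop changing, and reports whether that ever happened. The abstract does not give a concrete settling criterion, so the names `refine_until_settled` and `guarded_answer`, the tolerance `tol`, and the `patience` count are assumptions made for illustration, and the scalar state stands in for a latent vector.

```python
import math


def refine_until_settled(step_fn, z0, max_steps=20000, tol=1e-9, patience=3):
    """Iterate z <- step_fn(z) and report whether the trajectory settles.

    Returns (final_state, steps_used, settled). "Settled" means the change
    between successive states stayed below `tol` for `patience` steps in a row.
    """
    z = z0
    calm = 0
    for t in range(1, max_steps + 1):
        z_next = step_fn(z)
        delta = abs(z_next - z)
        z = z_next
        calm = calm + 1 if delta < tol else 0
        if calm >= patience:
            return z, t, True
    return z, max_steps, False


def guarded_answer(step_fn, z0, verifier, **kwargs):
    """Return the settled state only if it also passes a domain verifier."""
    z, _, settled = refine_until_settled(step_fn, z0, **kwargs)
    return z if settled and verifier(z) else None


# A contracting map settles; a map that flips sign forever does not.
z, steps, ok = refine_until_settled(lambda z: math.tanh(0.8 * z + 0.5), 0.0)
_, _, ok_flip = refine_until_settled(lambda z: -z, 1.0, max_steps=1000)
```

Here `ok` is true and `ok_flip` is false: the second trajectory never settles, which is the kind of warning sign the abstract describes. For a puzzle such as Sudoku, `verifier` would be a rule checker applied to the decoded grid.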

💬 Team Notes