Improving Multimodal Reasoning via Worst Dimension Optimization
[Submitted on 5 Jun 2026]
Haocheng Lv, Huaping Zhang, Qiuchi Li, Lei Li, Chunxiao Gao
Multimodal reasoning requires a reasoning path that stays sound under a wide range of constraints, from visual grounding to logical consistency. However, current Process Reward Models (PRMs) rely on heuristically defined rewards that weigh these factors equally. As a result, dominant factors can mask failures in individual dimensions, and the validity of the overall reasoning process is not guaranteed.
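The masking effect described in the abstract can be illustrated with a small sketch. It is not the authors' method, which the abstract does not specify; it only contrasts an equal-weight average with a worst-dimension (minimum) aggregate. The dimension names, the `fluency` dimension, the scores, and all function names are hypothetical and chosen for illustration.

```python
from statistics import mean

# Hypothetical per-dimension scores in [0, 1] for a single reasoning step.
# The values are made up; "fluency" is an assumed extra dimension.
step_scores = {
    "visual_grounding": 0.10,     # the step misreads the image
    "logical_consistency": 0.95,
    "fluency": 0.95,
}


def equal_weight_reward(scores):
    """Average all dimensions with equal weight (the masking-prone aggregate)."""
    return mean(scores.values())


def worst_dimension_reward(scores):
    """Score the step by its weakest dimension (assumed min aggregate)."""
    return min(scores.values())


def worst_dimension(scores):
    """Return the name of the lowest-scoring dimension."""
    return min(scores, key=scores.get)


if __name__ == "__main__":
    print("equal-weight reward:", round(equal_weight_reward(step_scores), 3))
    print("worst-dimension reward:", worst_dimension_reward(step_scores))
    print("failing dimension:", worst_dimension(step_scores))
```

Under these assumed scores the average stays near 0.67 even though visual grounding has almost entirely failed, whereas the minimum exposes the failure directly.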
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2606.07801 [cs.AI]
(or arXiv:2606.07801v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2606.07801
Submission history
From: Haocheng Lv
[v1] Fri, 5 Jun 2026 19:32:23 UTC (1,088 KB)