AGWM: Affordance-Grounded World Models for Environments with Compositional Prerequisites
Computer Science > Artificial Intelligence
[Submitted on 7 May 2026]
Qinshi Zhang (1), Weipeng Deng (2), Zhihan Jiang (3), Jiaming Qu (4), Qianren Li (5), Weitao Xu (5), Ray LC (5) ((1) University of California, San Diego, (2) University of Hong Kong, (3) Columbia University, (4) Amazon, (5) City University of Hong Kong)
In model-based learning, the agent learns behaviors by simulating trajectories based on world model predictions. Standard world models typically learn a stationary transition function that maps states and actions to next states. When an action and an outcome frequently co-occur in training data, the model tends to internalize this correlation as a general causal rule while ignoring action preconditions. In interactive environments, however, agent actions can reshape the future affordance space. At each timestep, an action may become executable only after its prerequisites are met, or non-executable when they are destroyed. We term such events structure-changing events (SC events). As a result, a conventional world model often fails to determine whether a given action is executable in the current state, especially in multi-step predictions. Each imagined step is conditioned on an incorrect affordance state, and therefore the prediction error compounds over the rollout horizon. In this paper, we propose AGWM (Affordance-Grounded World Model), which learns an abstract affordance structure represented as a DAG of prerequisite dependencies to explicitly track the dynamic executability of actions. Experiments on game-based simulated environments demonstrate the effectiveness of our method by achieving lower multi-step prediction error, better generalization to novel configurations, and improved interpretability.
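To make the idea of tracking executability over a prerequisite DAG concrete, here is a minimal illustrative sketch. It is not the authors' implementation: AGWM learns its affordance structure, whereas this toy version hard-codes it. The class `AffordanceGraph`, its methods, and the action names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AffordanceGraph:
    """Toy prerequisite DAG (hypothetical, hand-written rather than learned)."""

    # Hypothetical mapping: action -> set of prerequisite nodes it depends on.
    prerequisites: dict
    # Nodes whose effects currently hold in the abstract state.
    active: set = field(default_factory=set)

    def executable(self, action: str) -> bool:
        """An action is executable only if every prerequisite is active."""
        return self.prerequisites[action] <= self.active

    def apply(self, gained=(), lost=()) -> None:
        """Record a structure-changing event: prerequisites met or destroyed."""
        self.active |= set(gained)
        self.active -= set(lost)


# Assumed example dependencies, loosely in the spirit of a crafting game.
graph = AffordanceGraph(
    prerequisites={
        "collect_wood": set(),
        "place_table": {"collect_wood"},
        "craft_pickaxe": {"collect_wood", "place_table"},
    }
)

before = graph.executable("craft_pickaxe")      # prerequisites not yet met
graph.apply(gained={"collect_wood", "place_table"})
after = graph.executable("craft_pickaxe")       # prerequisites now met
graph.apply(lost={"place_table"})               # a prerequisite is destroyed
after_loss = graph.executable("craft_pickaxe")  # non-executable again
```

In this sketch, a multi-step rollout would consult `executable` before each imagined action, so a step is never conditioned on an affordance state that the preceding structure-changing events have already invalidated.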
Comments: 16 pages, 3 figures, 4 tables. Appendix on pages 11-16 (main text is self-contained)
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
ACM classes: I.2.6; I.2.8
Cite as: arXiv:2605.06841 [cs.AI]
(or arXiv:2605.06841v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2605.06841
Submission history
From: Qinshi Zhang
[v1] Thu, 7 May 2026 18:46:44 UTC (5,119 KB)