DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation
arXiv AIArchived Mar 24, 2026✓ Full text saved
arXiv:2603.20470v1 Announce Type: new Abstract: The rapid growth of the text-to-image (T2I) community has fostered a thriving online ecosystem of expert models, which are variants of pretrained diffusion models specialized for diverse generative abilities. Yet, existing model merging methods remain limited in fully leveraging abundant online expert resources and still struggle to meet diverse in-the-wild user needs. We present DiffGraph, a novel agent-driven graph-based model merging framework,
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Artificial Intelligence
[Submitted on 20 Mar 2026]
DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation
Zhuoling Li, Hossein Rahmani, Jiarui Zhang, Yu Xue, Majid Mirmehdi, Jason Kuen, Jiuxiang Gu, Jun Liu
The rapid growth of the text-to-image (T2I) community has fostered a thriving online ecosystem of expert models, which are variants of pretrained diffusion models specialized for diverse generative abilities. Yet, existing model merging methods remain limited in fully leveraging abundant online expert resources and still struggle to meet diverse in-the-wild user needs. We present DiffGraph, a novel agent-driven graph-based model merging framework, which automatically harnesses online experts and flexibly merges them for diverse user needs. Our DiffGraph constructs a scalable graph and organizes ever-expanding online experts within it through node registration and calibration. Then, DiffGraph dynamically activates specific subgraphs based on user needs, enabling flexible combinations of different experts to achieve user-desired generation. Extensive experiments show the efficacy of our method.
Comments: CVPR
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.20470 [cs.AI]
(or arXiv:2603.20470v1 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2603.20470
Focus to learn more
Submission history
From: Zhuoling Li [view email]
[v1] Fri, 20 Mar 2026 19:54:41 UTC (12,100 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.AI
< prev | next >
new | recent | 2026-03
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)