← Back ◬ AI & Machine Learning Jun 24, 2026

Maestro Order: A Model-Agnostic Orchestration Harness

arXiv Security Archived Jun 24, 2026 ✓ Full text saved

arXiv:2606.23983v1 Announce Type: new Abstract: A single forward pass of a capable model is a fast, fluent, and unreliable problem-solver: it is right often enough to be useful and wrong often enough to be dangerous; in language models, such confident errors are known as hallucinations. We present Maestro Order, a model-agnostic orchestration harness that turns unreliable solvers into reliable problem-solving systems by composing them according to four structural primitives (decompose, ensemble,

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 22 Jun 2026] Maestro Order: A Model-Agnostic Orchestration Harness Hidayet Aksu A single forward pass of a capable model is a fast, fluent, and unreliable problem-solver: it is right often enough to be useful and wrong often enough to be dangerous; in language models, such confident errors are known as hallucinations. We present Maestro Order, a model-agnostic orchestration harness that turns unreliable solvers into reliable problem-solving systems by composing them according to four structural primitives (decompose, ensemble, verify, and recurse) and a budget-aware controller that decides where to spend compute. The harness treats any model as a black-box base solver behind a uniform interface, layers a verifier ensemble whose discrimination is measured online, and allocates verification and voting to the stages with the highest marginal reliability per unit cost. We give the architecture, the message and state schema, the controller algorithm, and the engineering that makes it deterministic, observable, and fault-tolerant. We then specify an evaluation methodology (reliability at fixed cost, coverage, calibration, and ablations) and report results from a faithful Monte Carlo simulation of the harness over a parameterized solver/verifier model. The simulation reproduces the predicted laws quantitatively: verification amplifies reliability geometrically (e.g. 0.55\to0.98 with two gates, \to0.999 with four), voting helps only above chance and is limited by shared errors, and a budget-aware controller reaches a target reliability at a small fraction of the cost of voting alone by selecting the cheapest mechanism for each regime. We close with failure modes (verifier gaming, correlated errors, and decomposition error compounding) and concrete guidance: build robust checkers, diversify solvers, and let the controller put compute where the information is. Comments: 10 pages, 4 figures Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO); Multiagent Systems (cs.MA) ACM classes: I.2.11; I.2.8 Cite as: arXiv:2606.23983 [cs.CR] (or arXiv:2606.23983v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2606.23983 Focus to learn more Submission history From: Hidayet Aksu [view email] [v1] Mon, 22 Jun 2026 22:21:59 UTC (168 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-06 Change to browse by: cs cs.AI cs.LO cs.MA References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes