Cross-Vendor Sola ISPM Benchmark: Evaluating Agentic AI for Federated Identity Security Reasoning
arXiv SecurityArchived Jun 03, 2026✓ Full text saved
arXiv:2606.02674v1 Announce Type: new Abstract: The rapid proliferation of multi-cloud and SaaS platforms has transformed Identity Security Posture Management (ISPM) into a fundamentally cross-vendor challenge: critical misconfigurations and privilege escalation paths increasingly span multiple identity providers, infrastructure layers, and authentication systems never designed to interoperate. Existing evaluations focus on isolated single-platform environments and provide no means to assess whe
Full text archived locally
✦ AI Summary· Claude Sonnet
Computer Science > Cryptography and Security
[Submitted on 1 Jun 2026]
Cross-Vendor Sola ISPM Benchmark: Evaluating Agentic AI for Federated Identity Security Reasoning
Eden Yavin, Gal Engelberg, Konstantin Koutsyi, Leon Goldberg, Gal Baron
The rapid proliferation of multi-cloud and SaaS platforms has transformed Identity Security Posture Management (ISPM) into a fundamentally cross-vendor challenge: critical misconfigurations and privilege escalation paths increasingly span multiple identity providers, infrastructure layers, and authentication systems never designed to interoperate. Existing evaluations focus on isolated single-platform environments and provide no means to assess whether an AI agent can reason across these fragmented boundaries. To address this gap, we introduce the Cross-Vendor Sola ISPM Benchmark, a production-grade benchmark of 50 data-grounded tasks requiring multi-hop entity resolution and cross-system correlation across eight integrated enterprise platforms including AWS, Okta, Azure AD, and Google Workspace. We also contribute an evaluation framework measuring not only final answer correctness but also evidentiary grounding, structural join fidelity, retrieval quality, and SQL equivalence. We evaluate the Sola AI Agent across five context configurations - from no injected metadata to full schema, graph, and retrieval context - using three frontier LLMs. Results show that structured relational context improves answer correctness by approximately 34% relatively and reduces exploration queries by approximately 70% across all tested models, with the largest gains driven by cross-vendor graph topology. Our findings indicate that frontier LLMs possess substantial latent security reasoning capability, but reliable cross-vendor identity analysis is fundamentally constrained by the availability of explicit relational context for entity resolution and evidentiary grounding. Under full context, the best configuration achieves 78% answer correctness while reducing complete failure to 4%.
Comments: 22 pages, 4 figures, 8 tables, 2 appendices
Subjects: Cryptography and Security (cs.CR)
Cite as: arXiv:2606.02674 [cs.CR]
(or arXiv:2606.02674v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2606.02674
Focus to learn more
Submission history
From: Gal Engelberg [view email]
[v1] Mon, 1 Jun 2026 12:41:00 UTC (3,769 KB)
Access Paper:
HTML (experimental)
view license
Current browse context:
cs.CR
< prev | next >
new | recent | 2026-06
Change to browse by:
cs
References & Citations
NASA ADS
Google Scholar
Semantic Scholar
Export BibTeX Citation
Bookmark
Bibliographic Tools
Bibliographic and Citation Tools
Bibliographic Explorer Toggle
Bibliographic Explorer (What is the Explorer?)
Connected Papers Toggle
Connected Papers (What is Connected Papers?)
Litmaps Toggle
Litmaps (What is Litmaps?)
scite.ai Toggle
scite Smart Citations (What are Smart Citations?)
Code, Data, Media
Demos
Related Papers
About arXivLabs
Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)