
Polars inside Intel SGX2 Enclaves: An Empirical Study of Confidential Analytical Query Processing

arXiv Security. Archived May 22, 2026.




Computer Science > Cryptography and Security
[Submitted on 20 May 2026]

Authors: Wei Wang, Burns Smith, Kenny Leftin

Abstract: Trusted Execution Environments (TEEs) have renewed interest in confidential analytics, but most prior evaluations focus on SQL database engines or earlier SGX generations. This paper studies an Arrow-native DataFrame engine, Polars, running inside Intel SGX2 enclaves via Gramine on TPC-H SF30 with Azure Blob Storage. We report both the standard TPC-H power score and a query-only variant that removes table-loading time in order to separate compute overhead from data-ingestion overhead. Across four dataset-width configurations (approximately 22-73 GB), end-to-end overhead remains nearly constant at 1.49-1.56×, but this composite metric obscures two distinct behaviors: query-only overhead declines from 1.51-1.52× to 1.43-1.44×, whereas table-loading overhead rises from 2.27× to 4.07×. We further show that overhead is not uniform across queries: for the len130 configuration, the median per-query SGX slowdown is 1.45× with a maximum of 2.57×, and a small set of queries exhibits pronounced run-to-run spikes consistent with stateful EPC pressure. Finally, we compare Polars' lazy and eager APIs under the same TEE setting. Lazy execution is 2.25-2.27× faster overall, while eager execution fails with out-of-memory errors at 41 GB and above. Relative to the recent DuckDB-SGX2 study, our results suggest that SGX2 can support Arrow-native analytical processing with a similar order of security overhead, but that load-path amplification and API-level optimization are first-order determinants of end-to-end performance.
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
Cite as: arXiv:2605.21797 [cs.CR] (or arXiv:2605.21797v1 [cs.CR] for this version)
https://doi.org/10.48550/arXiv.2605.21797
Submission history: From: Wei Wang. [v1] Wed, 20 May 2026 22:47:28 UTC (12 KB)
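
The abstract separates end-to-end, query-only, and table-loading overhead, and reports a median and maximum per-query slowdown. The following is a minimal sketch of that kind of arithmetic, not the paper's method: it assumes a simple sum-of-times composite rather than the TPC-H power score, and every timing and name in it (`slowdown`, `native_query_s`, and so on) is hypothetical and chosen only for illustration.

```python
from statistics import median


def slowdown(native_s: float, sgx_s: float) -> float:
    """Overhead ratio: enclave time divided by native time (1.0 = no overhead)."""
    if native_s <= 0:
        raise ValueError("native time must be positive")
    return sgx_s / native_s


# Hypothetical timings in seconds. These are NOT measurements from the paper.
native_load_s, sgx_load_s = 100.0, 300.0
native_query_s = {"Q1": 10.0, "Q2": 20.0, "Q3": 40.0}
sgx_query_s = {"Q1": 15.0, "Q2": 28.0, "Q3": 80.0}

# Per-query slowdowns, summarized by median and maximum.
per_query = {q: slowdown(native_query_s[q], sgx_query_s[q]) for q in native_query_s}
median_slowdown = median(per_query.values())
max_slowdown = max(per_query.values())

# Component overheads and a sum-of-times composite (an assumption of this sketch).
load_overhead = slowdown(native_load_s, sgx_load_s)
query_overhead = slowdown(sum(native_query_s.values()), sum(sgx_query_s.values()))
end_to_end_overhead = slowdown(
    native_load_s + sum(native_query_s.values()),
    sgx_load_s + sum(sgx_query_s.values()),
)

print(f"per-query median {median_slowdown:.2f}x, max {max_slowdown:.2f}x")
print(f"load {load_overhead:.2f}x, query-only {query_overhead:.2f}x, "
      f"end-to-end {end_to_end_overhead:.2f}x")
```

Under this sum-of-times assumption, the end-to-end ratio is a weighted average of the load and query-only ratios, weighted by each phase's share of native time. That is one way a composite figure can stay nearly flat while its components move in opposite directions; the paper's own composite is based on the TPC-H power score and may behave differently.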
