← Back ◬ AI & Machine Learning Mar 31, 2026

On the Optimal Number of Grids for Differentially Private Non-Interactive $K$-Means Clustering

arXiv Security Archived Mar 31, 2026 ✓ Full text saved

arXiv:2603.26963v1 Announce Type: new Abstract: Differentially private $K$-means clustering enables releasing cluster centers derived from a dataset while protecting the privacy of the individuals. Non-interactive clustering techniques based on privatized histograms are attractive because the released data synopsis can be reused for other downstream tasks without additional privacy loss. The choice of the number of grids for discretizing the data points is crucial, as it directly controls the qu

Full text archived locally

✦ AI Summary · Claude Sonnet

Computer Science > Cryptography and Security [Submitted on 27 Mar 2026] On the Optimal Number of Grids for Differentially Private Non-Interactive K-Means Clustering Gokularam Muthukrishnan, Anshoo Tandon Differentially private K-means clustering enables releasing cluster centers derived from a dataset while protecting the privacy of the individuals. Non-interactive clustering techniques based on privatized histograms are attractive because the released data synopsis can be reused for other downstream tasks without additional privacy loss. The choice of the number of grids for discretizing the data points is crucial, as it directly controls the quantization bias and the amount of noise injected to preserve privacy. The widely adopted strategy selects a grid size that is independent of the number of clusters and also relies on empirical tuning. In this work, we revisit this choice and propose a refined grid-size selection rule derived by minimizing an upper bound on the expected deviation in the K-means objective function, leading to a more principled discretization strategy for non-interactive private clustering. Compared to prior work, our grid resolution differs both in its dependence on the number of clusters and in the scaling with dataset size and privacy budget. Extensive numerical results elucidate that the proposed strategy results in accurate clustering compared to the state-of-the-art techniques, even under tight privacy budgets. Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML) Cite as: arXiv:2603.26963 [cs.CR] (or arXiv:2603.26963v1 [cs.CR] for this version) https://doi.org/10.48550/arXiv.2603.26963 Focus to learn more Submission history From: Gokularam Muthukrishnan [view email] [v1] Fri, 27 Mar 2026 20:10:59 UTC (86 KB) Access Paper: HTML (experimental) view license Current browse context: cs.CR < prev | next > new | recent | 2026-03 Change to browse by: cs cs.LG eess eess.SP stat stat.ML References & Citations NASA ADS Google Scholar Semantic Scholar Export BibTeX Citation Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Demos Related Papers About arXivLabs Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

💬 Team Notes