Differentially Private Topological Data Analysis

Kang, Taegyu; Kim, Sehwan; Sohn, Jinwon; Awan, Jordan

arXiv:2305.03609 (stat)

[Submitted on 5 May 2023 (v1), last revised 3 Nov 2023 (this version, v2)]

Title: Differentially Private Topological Data Analysis

Authors: Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan

Abstract: This paper is the first to attempt differentially private (DP) topological data analysis (TDA), producing near-optimal private persistence diagrams. We analyze the sensitivity of persistence diagrams in terms of the bottleneck distance, and we show that the commonly used Čech complex has sensitivity that does not decrease as the sample size $n$ increases. This makes it challenging to privatize the persistence diagrams of Čech complexes. As an alternative, we show that the persistence diagram obtained by the $L^1$-distance to measure (DTM) has sensitivity $O(1/n)$. Based on the sensitivity analysis, we propose using the exponential mechanism whose utility function is defined in terms of the bottleneck distance of the $L^1$-DTM persistence diagrams. We also derive upper and lower bounds on the accuracy of our privacy mechanism; the obtained bounds indicate that the privacy error of our mechanism is near-optimal. We demonstrate the performance of our privatized persistence diagrams through simulations as well as on a real dataset tracking human movement.
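
The exponential mechanism named in the abstract can be illustrated with a minimal sketch. The version below assumes a finite candidate set and a generic utility function, and samples a candidate with probability proportional to $\exp(\varepsilon u / (2\Delta))$, where $\Delta$ is the sensitivity of the utility. The function name `exponential_mechanism`, the scalar toy candidates, the stand-in utility, and the constant in the sensitivity are all hypothetical; this is not the authors' implementation, whose utility is defined in terms of the bottleneck distance between persistence diagrams.

```python
import math
import random


def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng=None):
    """Sample one candidate with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity)).

    Generic textbook form over a finite candidate list (an assumption made
    for illustration; the paper's output space is persistence diagrams).
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scores = [epsilon * utility(c) / (2.0 * sensitivity) for c in candidates]
    # Subtract the maximum score before exponentiating for numerical stability;
    # this rescales all weights by the same factor and leaves the distribution unchanged.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy usage: scalar "summaries" stand in for persistence diagrams, and the
# negative absolute difference stands in for a bottleneck-distance utility.
n = 100                      # hypothetical sample size
toy_sensitivity = 1.0 / n    # mimics an O(1/n) sensitivity; the constant 1.0 is assumed
non_private_value = 0.42     # hypothetical non-private summary
grid = [i / 20 for i in range(21)]
private_value = exponential_mechanism(
    grid,
    lambda c: -abs(c - non_private_value),
    epsilon=1.0,
    sensitivity=toy_sensitivity,
    rng=random.Random(0),
)
```

In this sketch, a sensitivity that shrinks like $1/n$ makes the exponent grow with $n$ at a fixed $\varepsilon$, so the sampled output concentrates on high-utility candidates as the sample size increases. A sensitivity that does not shrink with $n$, as the abstract reports for the Čech complex, gives no such improvement.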

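Comments:	23 pages before references and appendices, 42 pages total, 8 figures
Subjects:	Machine Learning (stat.ML); Computational Geometry (cs.CG); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Algebraic Topology (math.AT)
Cite as:	arXiv:2305.03609 [stat.ML]
	(or arXiv:2305.03609v2 [stat.ML] for this version)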
	https://doi.org/10.48550/arXiv.2305.03609

Submission history

From: Jordan Awan
[v1] Fri, 5 May 2023 15:15:04 UTC (5,320 KB)
[v2] Fri, 3 Nov 2023 16:55:55 UTC (5,734 KB)
