Forget-MI: Machine Unlearning for Forgetting Multimodal Information in Healthcare Settings

Hardan, Shahad; Taratynova, Darya; Essofi, Abdelmajid; Nandakumar, Karthik; Yaqub, Mohammad


arXiv:2506.23145v1 (cs)

[Submitted on 29 June 2025]


Authors: Shahad Hardan, Darya Taratynova, Abdelmajid Essofi, Karthik Nandakumar, Mohammad Yaqub

Abstract: Privacy preservation in AI is crucial, especially in healthcare, where models rely on sensitive patient data. In the emerging field of machine unlearning, existing methods struggle to remove patient data from trained multimodal architectures, which are widely used in healthcare. We propose Forget-MI, a novel machine unlearning method for multimodal medical data that is built on dedicated loss functions and perturbation techniques. Our approach unlearns both the unimodal and the joint representations of the data requested to be forgotten, while preserving knowledge from the remaining data and maintaining performance comparable to the original model. We evaluate our results using performance on the forget dataset, performance on the test dataset, and Membership Inference Attack (MIA), which measures an attacker's ability to distinguish the forget dataset from the training dataset. Our model outperforms existing approaches at reducing MIA and performance on the forget dataset, while keeping equivalent performance on the test set. Specifically, our approach reduces MIA by 0.202 and decreases AUC and F1 scores on the forget set by 0.221 and 0.305, respectively. Additionally, our performance on the test set matches that of the retrained model, while allowing forgetting. Code is available at https://github.com/BioMedIA-MBZUAI/Forget-MI.git
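
The MIA criterion mentioned in the abstract, an attacker's ability to distinguish the forget dataset from the training dataset, can be illustrated with a minimal sketch. This is not the authors' implementation: the loss-based attack, the function names `separation_auc` and `attack_advantage`, and the sample loss values are all assumptions chosen for illustration. The sketch scores how well a simple loss threshold separates the two sets; a value near chance suggests the attacker cannot tell them apart by loss alone.

```python
from typing import Sequence


def separation_auc(forget_losses: Sequence[float],
                   train_losses: Sequence[float]) -> float:
    """Rank-based AUC of a hypothetical loss-threshold attacker.

    Returns the probability that a randomly drawn forget sample has a
    higher loss than a randomly drawn training sample (ties count half).
    0.5 means the two sets are indistinguishable by loss.
    """
    if not forget_losses or not train_losses:
        raise ValueError("both loss lists must be non-empty")
    wins = 0.0
    for f in forget_losses:
        for t in train_losses:
            if f > t:
                wins += 1.0
            elif f == t:
                wins += 0.5
    return wins / (len(forget_losses) * len(train_losses))


def attack_advantage(forget_losses: Sequence[float],
                     train_losses: Sequence[float]) -> float:
    """Distance of the attacker's AUC from chance, in [0, 0.5]."""
    return abs(separation_auc(forget_losses, train_losses) - 0.5)


# Illustrative per-sample losses (made-up numbers, not from the paper).
identical = attack_advantage([1.0, 2.0], [1.0, 2.0])
separated = attack_advantage([3.0, 4.0], [1.0, 2.0])
```

How the paper's reported MIA score is actually computed is not stated in the abstract, so this sketch should be read only as one common way to quantify distinguishability between two sample sets.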

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2506.23145 [cs.LG]
	(or arXiv:2506.23145v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2506.23145

Submission history

From: Shahad Hardan
[v1] Sun, 29 Jun 2025 08:53:23 UTC (901 KB)
