Safe-LLaVA: A Privacy-Preserving Vision-Language Dataset and Benchmark for Biometric Safety

Kim, Younggun; Swetha, Sirnam; Kagdi, Fazil; Shah, Mubarak

arXiv:2509.00192 (cs)

[Submitted on 29 Aug 2025]

Title: Safe-LLaVA: A Privacy-Preserving Vision-Language Dataset and Benchmark for Biometric Safety

Authors: Younggun Kim, Sirnam Swetha, Fazil Kagdi, Mubarak Shah

Abstract: Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in vision-language tasks. However, these models often infer and reveal sensitive biometric attributes (such as race, gender, age, body weight, and eye color) even when such information is not explicitly requested. This raises critical concerns, particularly in real-world applications and socially sensitive domains. Despite increasing awareness, no publicly available dataset or benchmark exists to comprehensively evaluate or mitigate biometric leakage in MLLMs. To address this gap, we introduce PRISM (Privacy-aware Evaluation of Responses in Sensitive Modalities), a new benchmark designed to assess MLLMs on two fronts: (1) refusal of biometric-related queries and (2) implicit biometric leakage in general responses while maintaining semantic faithfulness. Further, we conduct a detailed audit of the widely used LLaVA datasets and uncover extensive biometric leakage across pretraining and instruction data. To address this, we present the Safe-LLaVA dataset, the first privacy-preserving MLLM training dataset, constructed by systematically removing explicit and implicit biometric information from the LLaVA dataset. Our evaluations on PRISM reveal biometric leakage across MLLMs for different attributes, highlighting detailed privacy violations. We also fine-tune a model on the Safe-LLaVA dataset and show that it substantially reduces biometric leakage. Together, Safe-LLaVA and PRISM set a new standard for privacy-aligned development and evaluation of MLLMs. The Safe-LLaVA dataset and PRISM benchmark are publicly available at https://huggingface.co/datasets/kyh9191/Safe-LLaVA, and the source code is available at https://github.com/Kimyounggun99/Safe-LLaVA.git.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2509.00192 [cs.CV]
	(or arXiv:2509.00192v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2509.00192

Submission history

From: Sirnam Swetha
[v1] Fri, 29 Aug 2025 18:54:57 UTC (35,580 KB)
