Computer Science > Computation and Language

arXiv:2510.20721 (cs)
[Submitted on 23 Oct 2025]

Title: User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios


Authors: Xiaoyuan Wu, Roshni Kaushik, Wenkai Li, Lujo Bauer, Koichi Onoue
Abstract: Large language models (LLMs) have seen rapid adoption for tasks such as drafting emails, summarizing meetings, and answering health questions. In such uses, users may need to share private information (e.g., health records, contact details). To evaluate LLMs' ability to identify and redact such private information, prior work developed benchmarks (e.g., ConfAIde, PrivacyLens) with real-life scenarios. Using these benchmarks, researchers have found that LLMs sometimes fail to keep secrets private when responding to complex tasks (e.g., leaking employee salaries in meeting summaries). However, these evaluations rely on LLMs (proxy LLMs) to gauge compliance with privacy norms, overlooking real users' perceptions. Moreover, prior work primarily focused on the privacy-preservation quality of responses, without investigating nuanced differences in helpfulness. To understand how users perceive the privacy-preservation quality and helpfulness of LLM responses to privacy-sensitive scenarios, we conducted a user study with 94 participants using 90 scenarios from PrivacyLens. We found that, when evaluating identical responses to the same scenario, users showed low agreement with each other on the privacy-preservation quality and helpfulness of the LLM response. Further, we found high agreement among five proxy LLMs, while each individual LLM had low correlation with users' evaluations. These results indicate that the privacy and helpfulness of LLM responses are often specific to individuals, and proxy LLMs are poor estimates of how real users would perceive these responses in privacy-sensitive scenarios. Our results suggest the need to conduct user-centered studies on measuring LLMs' ability to help users while preserving privacy. Additionally, future research could investigate ways to improve the alignment between proxy LLMs and users for better estimation of users' perceived privacy and utility.
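The abstract does not specify how inter-rater agreement or proxy-LLM correlation was computed, so the following is only a rough, hypothetical sketch (in Python, with invented ratings) of the kind of comparison it refers to: a simple pairwise agreement rate among users, and a Spearman rank correlation between one proxy LLM's scores and users' average scores. None of the variable names, rating scales, or statistics below come from the paper.

# Hypothetical sketch of comparing user ratings with a proxy LLM's ratings.
# All data here is invented for illustration; the paper's actual scales,
# aggregation, and statistical measures may differ.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical 5-point ratings: rows = 90 scenarios, columns = 5 users.
user_ratings = rng.integers(1, 6, size=(90, 5))

# Hypothetical per-scenario scores from one proxy LLM on the same scale.
proxy_scores = rng.integers(1, 6, size=90)

# Simple pairwise exact-agreement rate among users
# (one of many possible agreement measures).
n_users = user_ratings.shape[1]
pairs = [(i, j) for i in range(n_users) for j in range(i + 1, n_users)]
agreement = np.mean(
    [np.mean(user_ratings[:, i] == user_ratings[:, j]) for i, j in pairs]
)

# Rank correlation between the proxy LLM and the users' mean rating per scenario.
rho, p_value = spearmanr(proxy_scores, user_ratings.mean(axis=1))

print(f"pairwise user agreement: {agreement:.2f}")
print(f"proxy-vs-users Spearman rho: {rho:.2f} (p={p_value:.3f})")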
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as: arXiv:2510.20721 [cs.CL]
  (or arXiv:2510.20721v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2510.20721
arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Xiaoyuan Wu
[v1] Thu, 23 Oct 2025 16:38:26 UTC (1,077 KB)