PSYCHE: A Multi-faceted Patient Simulation Framework for Evaluation of Psychiatric Assessment Conversational Agents

Lee, Jingoo; Lim, Kyungho; Jung, Young-Chul; Kim, Byung-Hoon

Computer Science > Computation and Language

arXiv:2501.01594 (cs)

[Submitted on 3 Jan 2025]

Title: PSYCHE: A Multi-faceted Patient Simulation Framework for Evaluation of Psychiatric Assessment Conversational Agents

Authors: Jingoo Lee, Kyungho Lim, Young-Chul Jung, Byung-Hoon Kim

Abstract: Recent advances in large language models (LLMs) have accelerated the development of conversational agents capable of generating human-like responses. Since psychiatric assessments typically involve complex conversational interactions between psychiatrists and patients, there is growing interest in developing LLM-based psychiatric assessment conversational agents (PACAs) that aim to simulate the role of psychiatrists in clinical evaluations. However, standardized methods for benchmarking the clinical appropriateness of PACAs' interactions with patients remain underexplored. Here, we propose PSYCHE, a novel framework designed to enable the 1) clinically relevant, 2) ethically safe, 3) cost-efficient, and 4) quantitative evaluation of PACAs. This is achieved by simulating psychiatric patients based on a multi-faceted psychiatric construct that defines the simulated patients' profiles, histories, and behaviors, which PACAs are expected to assess. We validate the effectiveness of PSYCHE through a study with 10 board-certified psychiatrists, supported by an in-depth analysis of the simulated patient utterances.

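Comments:	The first two authors contributed equally
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2501.01594 [cs.CL]
	(or arXiv:2501.01594v1 [cs.CL] for this version)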
	https://doi.org/10.48550/arXiv.2501.01594

Submission history

From: Byung-Hoon Kim, M.D., Ph.D.
[v1] Fri, 3 Jan 2025 01:38:46 UTC (9,103 KB)
