Development and Comparative Evaluation of Three Artificial Intelligence Models (NLP, LLM, JEPA) for Predicting Triage in Emergency Departments: A 7-Month Retrospective Proof-of-Concept

Lansiaux, Edouard; Azzouz, Ramy; Chazard, Emmanuel; Vromant, Amélie; Wiel, Eric

Computer Science > Machine Learning

arXiv:2507.01080 (cs)

[Submitted on 1 July 2025]

Abstract: Triage errors, including undertriage and overtriage, are persistent challenges in emergency departments (EDs). With increasing patient influx and staff shortages, the integration of artificial intelligence (AI) into triage protocols has gained attention. This study compares the performance of three AI models [Natural Language Processing (NLP), Large Language Models (LLM), and Joint Embedding Predictive Architecture (JEPA)] in predicting triage outcomes against the FRENCH scale and clinical practice. We conducted a retrospective analysis of a prospectively recruited cohort, gathering adult patient triage data over a 7-month period at the Roger Salengro Hospital ED (Lille, France). Three AI models were trained and validated: (1) TRIAGEMASTER (NLP), (2) URGENTIAPARSE (LLM), and (3) EMERGINET (JEPA). Data included demographic details, verbatim chief complaints, vital signs, and triage outcomes based on the FRENCH scale and GEMSA coding. The primary outcome was the concordance of the AI-predicted triage level with the FRENCH gold standard, assessed using several metrics: F1-score, weighted kappa, Spearman correlation, MAE, and RMSE. The LLM model (URGENTIAPARSE) showed higher accuracy (composite score: 2.514) than JEPA (EMERGINET, 0.438) and NLP (TRIAGEMASTER, -3.511), and all three outperformed nurse triage (-4.343). Secondary analyses highlighted the effectiveness of URGENTIAPARSE in predicting hospitalization needs (GEMSA) and its robustness with structured data versus raw transcripts, for both GEMSA and FRENCH prediction. The LLM architecture, through abstraction of patient representations, offers the most accurate triage predictions among the tested models. Integrating AI into ED workflows could enhance patient safety and operational efficiency, though doing so requires addressing model limitations and ensuring ethical transparency.
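As an illustration of the agreement metrics named in the abstract, the following is a minimal sketch in plain Python of how MAE, RMSE, macro-averaged F1, a weighted kappa, and Spearman correlation could be computed for ordinal triage levels. Everything here is an assumption made for illustration: the function names are hypothetical, the quadratic weighting of the kappa and the macro averaging of F1 are choices the abstract does not specify, and the toy levels 1 to 5 are not taken from the study. The abstract does not define the composite score, so it is not reproduced.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted levels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted levels."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def quadratic_weighted_kappa(y_true, y_pred, labels):
    """Cohen's kappa with quadratic disagreement weights (assumed weighting).

    Undefined (division by zero) if all ratings fall in a single category.
    """
    k = len(labels)
    idx = {c: i for i, c in enumerate(labels)}
    n = len(y_true)
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[idx[t]][idx[p]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]          # observed weighted disagreement
            den += w * row[i] * col[j] / n     # disagreement expected by chance
    return 1.0 - num / den


def _average_ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(y_true, y_pred):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    rt, rp = _average_ranks(y_true), _average_ranks(y_pred)
    mt, mp = sum(rt) / len(rt), sum(rp) / len(rp)
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / math.sqrt(var_t * var_p)


# Toy data only: reference levels versus one model's predictions.
LEVELS = [1, 2, 3, 4, 5]
reference = [1, 2, 3, 4, 5, 3]
predicted = [1, 2, 3, 4, 5, 4]

if __name__ == "__main__":
    print("MAE:", mae(reference, predicted))
    print("RMSE:", rmse(reference, predicted))
    print("Macro F1:", macro_f1(reference, predicted, LEVELS))
    print("Weighted kappa:", quadratic_weighted_kappa(reference, predicted, LEVELS))
    print("Spearman:", spearman(reference, predicted))
```

Error metrics such as MAE and RMSE treat the triage levels as numbers, whereas F1 treats them as unordered classes; reporting both kinds, together with an ordinal agreement measure like weighted kappa, gives a fuller picture than any single score.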

Comments:	15 pages, 6 figures
Subjects:	Machine Learning (cs.LG); Performance (cs.PF)
Cite as:	arXiv:2507.01080 [cs.LG]
	(or arXiv:2507.01080v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.01080

Submission history

From: Edouard Lansiaux
[v1] Tue, 1 Jul 2025 16:37:55 UTC (1,021 KB)
