Machine Learning and statistical classification of CRISPR-Cas12a diagnostic assays

Khosla, Nathan; Lesinski, Jake M.; Haywood-Alexander, Marcus; deMello, Andrew J.; Richards, Daniel A.


arXiv:2501.04413 (q-bio)

[Submitted on 8 Jan 2025]

Authors: Nathan Khosla, Jake M. Lesinski, Marcus Haywood-Alexander, Andrew J. deMello, Daniel A. Richards

Abstract: CRISPR-based diagnostics have gained increasing attention as biosensing tools able to address limitations in contemporary molecular diagnostic tests. To maximise the performance of CRISPR-based assays, much effort has focused on optimising the chemistry and biology of the biosensing reaction. However, less attention has been paid to improving the techniques used to analyse CRISPR-based diagnostic data. To date, diagnostic decisions typically involve various forms of slope-based classification. Such methods are superior to traditional methods based on assessing absolute signals, but still have limitations. Herein, we establish performance benchmarks (total accuracy, sensitivity, and specificity) using common slope-based methods. We compare the performance of these benchmark methods with three different quadratic empirical distribution function statistical tests, finding significant improvements in diagnostic speed and accuracy when applied to a clinical data set. Two of the three statistical techniques, the Kolmogorov-Smirnov and Anderson-Darling tests, report the lowest time-to-result and highest total test accuracy. Furthermore, we developed a long short-term memory recurrent neural network to classify CRISPR-biosensing data, achieving 100% specificity on our model data set. Finally, we provide guidelines on choosing the classification method and classification method parameters that best suit a diagnostic assay's needs.
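
To make the contrast between slope-based and distribution-based classification concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a sample's fluorescence readings are compared against a negative control, either by the least-squares slope of the trace or by a two-sample Kolmogorov-Smirnov statistic. The readings, the two thresholds, and the function names (`slope`, `ks_statistic`, `classify`) are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def slope(times, signal):
    """Ordinary least-squares slope of signal against time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical fluorescence readings (arbitrary units), one per minute.
times = list(range(10))
negative_control = [100, 101, 99, 100, 102, 101, 100, 99, 101, 100]
positive_sample = [100, 104, 109, 115, 122, 130, 139, 149, 160, 172]
negative_sample = [101, 100, 100, 99, 101, 102, 100, 101, 99, 103]

SLOPE_THRESHOLD = 1.0  # assumed cut-off, signal units per minute
KS_THRESHOLD = 0.7     # assumed cut-off on the KS statistic


def classify(signal):
    """Return (positive_by_slope, positive_by_ks) for one trace."""
    by_slope = slope(times, signal) > SLOPE_THRESHOLD
    by_ks = ks_statistic(signal, negative_control) > KS_THRESHOLD
    return by_slope, by_ks


print(classify(positive_sample))  # (True, True)
print(classify(negative_sample))  # (False, False)
```

In a real assay the thresholds would have to be calibrated on labelled data, and a library routine such as SciPy's `ks_2samp` would normally be used to obtain a p-value rather than a raw statistic.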

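Comments:	25 pages, 5 figures, research paper. Nathan Khosla and Jake M. Lesinski contributed equally. Electronic supporting information is included as an appendix
Subjects:	Quantitative Methods (q-bio.QM); Machine Learning (cs.LG)
Cite as:	arXiv:2501.04413 [q-bio.QM]
 	(or arXiv:2501.04413v1 [q-bio.QM] for this version)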
	https://doi.org/10.48550/arXiv.2501.04413

Submission history

From: Daniel Richards Dr
[v1] Wed, 8 Jan 2025 10:59:36 UTC (19,075 KB)
