Deep Learning for GWP Prediction: A Framework Using PCA, Quantile Transformation, and Ensemble Modeling

Rajapriya, Navin; Kawajiri, Kotaro

Computer Science > Machine Learning

arXiv:2411.19124 (cs)

[Submitted on 28 Nov 2024]

Authors: Navin Rajapriya, Kotaro Kawajiri

Abstract: Developing environmentally sustainable refrigerants is critical for mitigating the impact of anthropogenic greenhouse gases on global warming. This study presents a predictive modeling framework to estimate the 100-year global warming potential (GWP 100) of single-component refrigerants using a fully connected neural network implemented on the Multi-Sigma platform. Molecular descriptors from RDKit, Mordred, and alvaDesc were used to capture a range of chemical features. The RDKit-based model achieved the best performance, with a root mean square error (RMSE) of 481.9 and an R² score of 0.918, demonstrating superior predictive accuracy and generalizability. Dimensionality reduction through Principal Component Analysis (PCA), together with quantile transformation, was applied to address the high-dimensional and skewed nature of the dataset, enhancing model stability and performance. Factor analysis identified molecular weight, lipophilicity, and functional groups such as nitriles and allylic oxides as significant contributors to GWP values. These insights provide actionable guidance for designing environmentally sustainable refrigerants. Integrating RDKit descriptors with Multi-Sigma's framework, which includes PCA, quantile transformation, and neural networks, provides a scalable solution for the rapid virtual screening of low-GWP refrigerants. This approach can potentially accelerate the identification of eco-friendly alternatives, contributing directly to climate mitigation by enabling the design of next-generation refrigerants aligned with global sustainability objectives.
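
The preprocessing and modeling chain named in the abstract (quantile transformation, PCA, a fully connected network, then RMSE and R² as metrics) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the Multi-Sigma implementation, which the abstract does not detail. The synthetic data, the order of the steps, the number of components, the layer sizes, and the use of a quantile transform on the target are all assumptions made only for the example.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.decomposition import PCA
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)

# Synthetic stand-in for a descriptor table (NOT real refrigerant data):
# 300 "molecules" x 40 skewed, correlated features driven by 5 latent factors.
latent = rng.normal(size=(300, 5))
mixing = rng.normal(size=(5, 40))
X = np.exp(0.5 * (latent @ mixing) + 0.1 * rng.normal(size=(300, 40)))

# Synthetic right-skewed target standing in for GWP100 (hypothetical values).
y = 100.0 * np.exp(latent @ np.array([0.8, -0.5, 0.3, 0.0, 0.2]))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumed order: quantile-transform skewed features, reduce with PCA,
# then fit a small fully connected network.
feature_pipeline = make_pipeline(
    QuantileTransformer(n_quantiles=100, output_distribution="normal", random_state=0),
    PCA(n_components=10),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)

# Assumption: the skewed target is also quantile-transformed for training,
# and predictions are mapped back to the original scale automatically.
model = TransformedTargetRegressor(
    regressor=feature_pipeline,
    transformer=QuantileTransformer(
        n_quantiles=100, output_distribution="normal", random_state=0
    ),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Metrics of the same kind as those reported in the abstract.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = float(r2_score(y_test, y_pred))
print(f"RMSE = {rmse:.1f}, R^2 = {r2:.3f}")
```

With real data, the feature matrix would instead come from a descriptor package such as RDKit, and the numbers printed here have no relation to the RMSE of 481.9 and R² of 0.918 reported for the authors' model.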

Comments:	10 pages, 5 figures, 2 tables
Subjects:	Machine Learning (cs.LG); Materials Science (cond-mat.mtrl-sci); Chemical Physics (physics.chem-ph)
Cite as:	arXiv:2411.19124 [cs.LG]
	(or arXiv:2411.19124v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2411.19124

Submission history

From: Navin Rajapriya
[v1] Thu, 28 Nov 2024 13:16:12 UTC (2,005 KB)
