CarbonChat: Large Language Model-Based Corporate Carbon Emission Analysis and Climate Knowledge Q&A System

Cao, Zhixuan; Han, Ming; Wang, Jingtao; Jia, Meng


arXiv:2501.02031 (cs)

[Submitted on 3 Jan 2025]

Title: CarbonChat: Large Language Model-Based Corporate Carbon Emission Analysis and Climate Knowledge Q&A System

Authors: Zhixuan Cao, Ming Han, Jingtao Wang, Meng Jia

Abstract: As the impact of global climate change intensifies, corporate carbon emissions have become a focus of global attention. Three problems motivate this work: large language models lag in updating their climate change knowledge, traditional augmented generation architectures lack specialization and accuracy on complex problems, and sustainability report analysis is costly and time-consuming. To address them, this paper proposes CarbonChat, a large language model-based corporate carbon emission analysis and climate knowledge Q&A system aimed at precise carbon emission analysis and policy understanding. First, a diversified index module construction method is proposed to handle the segmentation of rule-based and long-text documents as well as the extraction of structured data, thereby optimizing the parsing of key information. Second, an enhanced self-prompt retrieval-augmented generation architecture is designed that integrates intent recognition, structured reasoning chains, hybrid retrieval, and Text2SQL, improving the efficiency of semantic understanding and query conversion. Next, based on the greenhouse gas accounting framework, 14 dimensions are established for carbon emission analysis, enabling report summarization, relevance evaluation, and customized responses. Finally, a multi-layer chunking mechanism, timestamps, and hallucination detection features ensure the accuracy and verifiability of the analysis results, reducing hallucination rates and improving the precision of responses.
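The abstract names hybrid retrieval as one component but does not say how the lexical and semantic results are combined. The following is a minimal sketch of one common approach, reciprocal rank fusion, and is not taken from the paper. The function names, the toy scoring functions, and the constant `k=60` are all assumptions made for illustration; a real system would likely use BM25 and a learned embedding model in place of the two toy scorers.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a placeholder for a real tokenizer.
    return text.lower().split()


def keyword_score(query, doc):
    # Toy lexical score: fraction of query tokens that appear in the document.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


def cosine_score(query, doc):
    # Cosine similarity of bag-of-words counts; stands in for an
    # embedding-based semantic score (assumption, not the paper's method).
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def hybrid_rank(query, docs, k=60):
    # Reciprocal rank fusion: each scorer ranks the documents, and every
    # document accumulates 1 / (k + rank) from each ranking.
    fused = [0.0] * len(docs)
    for scorer in (keyword_score, cosine_score):
        order = sorted(
            range(len(docs)), key=lambda i: scorer(query, docs[i]), reverse=True
        )
        for rank, i in enumerate(order, start=1):
            fused[i] += 1.0 / (k + rank)
    # Return document indices, best fused score first.
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
```

Calling `hybrid_rank("scope 1 emissions", docs)` on a list of report passages returns the passage indices ordered by fused rank. Fusing ranks rather than raw scores avoids having to normalize two score scales that are not directly comparable.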

Comments:	26 pages
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	68T07, 91B06
ACM classes:	I.2.1
Cite as:	arXiv:2501.02031 [cs.CL]
	(or arXiv:2501.02031v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.02031

Submission history

From: Zhixuan Cao
[v1] Fri, 3 Jan 2025 08:45:38 UTC (5,040 KB)
