Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery

Yang, Robert

arXiv:2508.17681 (cs)

[Submitted on 25 Aug 2025 (v1), last revised 26 Aug 2025 (this version, v2)]

Title: Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery

Authors: Robert Yang

Abstract: Bold claims about AI's role in science, from "AGI will cure all diseases" to promises of radically accelerated discovery, raise a central epistemic question: do large language models (LLMs) truly generate new knowledge, or do they merely remix memorized fragments? We propose unlearning-as-ablation as a falsifiable probe of constructive scientific discovery. The idea is to systematically remove a target result together with its forget-closure (supporting lemmas, paraphrases, and multi-hop entailments) and then evaluate whether the model can re-derive the result from only permitted axioms and tools. Success would indicate generative capability beyond recall; failure would expose current limits. Unlike prevailing motivations for unlearning (privacy, copyright, or safety), our framing repositions it as an epistemic probe for AI-for-Science. We outline a minimal pilot in mathematics and algorithms to illustrate feasibility, and sketch how the same approach could later be extended to domains such as physics or chemistry. This is a position paper: our contribution is conceptual and methodological, not empirical. We aim to stimulate discussion on how principled ablation tests could help distinguish models that reconstruct knowledge from those that merely retrieve it, and how such probes might guide the next generation of AI-for-Science benchmarks.
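One way to picture the forget-closure mentioned in the abstract is as a reachability set over a graph of related statements. The sketch below is illustrative only and is not the paper's procedure: the function name `forget_closure`, the `related` mapping, the `max_hops` parameter, and all item names are hypothetical, and it assumes that links between a result and its supporting lemmas, paraphrases, and entailments have already been extracted somehow.

```python
from collections import deque


def forget_closure(target, related, permitted, max_hops=None):
    """Return the target plus every item reachable from it via `related`.

    `related` maps an item to items that could leak it (assumed to cover
    supporting lemmas, paraphrases, and entailments). Items in `permitted`
    (the allowed axioms) are never added and never expanded.
    """
    closure = {target}
    queue = deque([(target, 0)])
    while queue:
        item, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in related.get(item, ()):
            if neighbor in permitted or neighbor in closure:
                continue
            closure.add(neighbor)
            queue.append((neighbor, hops + 1))
    return closure


# Toy, made-up dependency data; not taken from the paper.
related = {
    "theorem_T": {"lemma_A", "paraphrase_T", "corollary_C"},
    "lemma_A": {"axiom_1", "lemma_B"},
    "lemma_B": {"axiom_2"},
    "corollary_C": {"corollary_D"},
}
permitted = {"axiom_1", "axiom_2"}

closure = forget_closure("theorem_T", related, permitted)
print(sorted(closure))
```

In this toy setup, everything in `closure` would be removed before the re-derivation test, while the two permitted axioms stay available to the model. How the links are actually identified, and how removal is performed, is left open here.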

Comments:	6 pages. NeurIPS 2025 AI4Science Workshop submission
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2508.17681 [cs.LG]
	(or arXiv:2508.17681v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2508.17681

Submission history

From: Robert Yang
[v1] Mon, 25 Aug 2025 05:24:15 UTC (20 KB)
[v2] Tue, 26 Aug 2025 05:04:10 UTC (27 KB)
