Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

Zhu, Wang; Singh, Ishika; Jia, Robin; Thomason, Jesse

arXiv:2406.02791v1 (cs)

[Submitted on 4 Jun 2024 (this version), latest version 8 Nov 2024 (v2)]

Authors: Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason

Abstract: Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such plans often fail on execution. We bring together the strengths of classical planning and LLM commonsense inference to perform domain induction, learning and validating action pre- and post-conditions based on closed-loop interactions with the environment itself. We propose PSALM, which leverages LLM inference to heuristically complete partial plans emitted by a classical planner given partial domain knowledge, as well as to infer the semantic rules of the domain in a logical language based on environment feedback after execution. Our analysis on 7 environments shows that with just one expert-curated example plan, using LLMs as heuristic planners and rule predictors achieves lower environment execution steps and environment resets than random exploration while simultaneously recovering the underlying ground truth action semantics of the domain.
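
For readers unfamiliar with the term, the "action pre- and post-conditions" mentioned in the abstract can be pictured with a minimal STRIPS-style sketch. This is only an illustration of the general idea, not PSALM's implementation: the `Action` class, the `applicable` and `apply_action` helpers, and the blocks-world-style facts are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical STRIPS-style action: facts required and facts changed."""
    name: str
    preconditions: frozenset  # facts that must hold before execution
    add_effects: frozenset    # facts made true by execution
    del_effects: frozenset    # facts made false by execution


def applicable(state: frozenset, action: Action) -> bool:
    # An action can run only if every precondition is present in the state.
    return action.preconditions <= state


def apply_action(state: frozenset, action: Action) -> frozenset:
    # Post-conditions: remove the deleted facts, then add the new ones.
    if not applicable(state, action):
        raise ValueError(f"preconditions of {action.name!r} not satisfied")
    return (state - action.del_effects) | action.add_effects


# Illustrative facts only; not taken from the paper's environments.
pick_up = Action(
    name="pick-up a",
    preconditions=frozenset({"clear a", "on-table a", "hand-empty"}),
    add_effects=frozenset({"holding a"}),
    del_effects=frozenset({"clear a", "on-table a", "hand-empty"}),
)

state = frozenset({"clear a", "on-table a", "hand-empty"})
next_state = apply_action(state, pick_up)
```

In this sketch, a failed `applicable` check plays the role of the environment feedback that tells a learner its current guess about an action's preconditions is incomplete.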

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Robotics (cs.RO)
Cite as:	arXiv:2406.02791 [cs.AI]
	(or arXiv:2406.02791v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2406.02791

Submission history

From: Wang Zhu
[v1] Tue, 4 Jun 2024 21:29:56 UTC (321 KB)
[v2] Fri, 8 Nov 2024 16:50:24 UTC (2,141 KB)
