
Computer Science > Robotics

arXiv:2506.00285v1 (cs)
[Submitted on 30 May 2025]

Title: Lazy Heuristic Search for Solving POMDPs with Expensive-to-Compute Belief Transitions


Authors: Muhammad Suhail Saleem, Rishi Veerapaneni, Maxim Likhachev
Abstract: Heuristic search solvers like RTDP-Bel and LAO* have proven effective for computing optimal and bounded sub-optimal solutions for Partially Observable Markov Decision Processes (POMDPs), which are typically formulated as belief MDPs. A belief represents a probability distribution over possible system states. Given a parent belief and an action, computing belief state transitions involves Bayesian updates that combine the transition and observation models of the POMDP to determine successor beliefs and their transition probabilities. However, there is a class of problems, specifically in robotics, where computing these transitions can be prohibitively expensive due to costly physics simulations, raycasting, or expensive collision checks required by the underlying transition and observation models, leading to long planning times. To address this challenge, we propose Lazy RTDP-Bel and Lazy LAO*, which defer computing expensive belief state transitions by leveraging Q-value estimation, significantly reducing planning time. We demonstrate the superior performance of the proposed lazy planners in domains such as contact-rich manipulation for pose estimation, outdoor navigation in rough terrain, and indoor navigation with a 1-D LiDAR sensor. Additionally, we discuss practical Q-value estimation techniques for commonly encountered problem classes that our lazy planners can leverage. Our results show that lazy heuristic search methods dramatically improve planning speed by postponing expensive belief transition evaluations while maintaining solution quality.
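To make the abstract's core idea concrete, below is a minimal, hypothetical Python sketch of (i) the exact Bayesian belief update that makes node expansion expensive and (ii) a lazy backup that postpones it by ranking actions with a cheap Q-value estimate. The function names (belief_update, lazy_backup, estimate_q, compute_q_exact) and the early-stopping rule are illustrative assumptions, not the paper's actual Lazy RTDP-Bel or Lazy LAO* algorithms; the early stop is only sound if the estimate is an optimistic bound on the true Q-value.

from collections import defaultdict

def belief_update(belief, action, observation, T, Z):
    """Exact Bayesian successor belief:
        b'(s') ∝ Z(o | s', a) * sum_s T(s' | s, a) * b(s).
    This is the expensive step the lazy planners postpone: evaluating T and Z
    may hide physics simulation, raycasting, or collision checking."""
    new_belief = defaultdict(float)
    for s, p in belief.items():
        if p == 0.0:
            continue
        for s_next, p_trans in T(s, action).items():
            new_belief[s_next] += p * p_trans * Z(s_next, action, observation)
    norm = sum(new_belief.values())
    if norm == 0.0:
        return dict(belief)  # observation impossible under the model; keep prior
    return {s: p / norm for s, p in new_belief.items()}

def lazy_backup(belief, actions, estimate_q, compute_q_exact):
    """One lazy Bellman backup over a belief node (illustrative sketch).
    Actions are ordered by a cheap Q-value estimate; the exact Q-value, which
    requires generating successor beliefs via belief_update, is computed only
    until the incumbent beats every remaining (assumed optimistic) estimate."""
    est = {a: estimate_q(belief, a) for a in actions}  # cheap pass: no belief transitions
    best_action, best_value = None, float("inf")
    for a in sorted(actions, key=est.get):
        if best_value <= est[a]:
            break  # incumbent beats all remaining estimates; skip their expensive transitions
        q = compute_q_exact(belief, a)  # expensive pass: triggers Bayesian belief updates
        if q < best_value:
            best_action, best_value = a, q
    return best_action, best_value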
Comments: Accepted for publication at The 18th International Symposium on Combinatorial Search (SOCS 2025)
Subjects: Robotics (cs.RO)
Cite as: arXiv:2506.00285 [cs.RO]
  (or arXiv:2506.00285v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2506.00285
arXiv-issued DOI via DataCite

Submission history

From: Muhammad Suhail Saleem
[v1] Fri, 30 May 2025 22:26:26 UTC (269 KB)