
Computer Science > Robotics

arXiv:2506.00285v1 (cs)
[Submitted on 30 May 2025]

Title: Lazy Heuristic Search for Solving POMDPs with Expensive-to-Compute Belief Transitions


Authors: Muhammad Suhail Saleem, Rishi Veerapaneni, Maxim Likhachev
Abstract: Heuristic search solvers like RTDP-Bel and LAO* have proven effective for computing optimal and bounded sub-optimal solutions for Partially Observable Markov Decision Processes (POMDPs), which are typically formulated as belief MDPs. A belief represents a probability distribution over possible system states. Given a parent belief and an action, computing belief state transitions involves Bayesian updates that combine the transition and observation models of the POMDP to determine successor beliefs and their transition probabilities. However, there is a class of problems, specifically in robotics, where computing these transitions can be prohibitively expensive due to costly physics simulations, raycasting, or expensive collision checks required by the underlying transition and observation models, leading to long planning times. To address this challenge, we propose Lazy RTDP-Bel and Lazy LAO*, which defer computing expensive belief state transitions by leveraging Q-value estimation, significantly reducing planning time. We demonstrate the superior performance of the proposed lazy planners in domains such as contact-rich manipulation for pose estimation, outdoor navigation in rough terrain, and indoor navigation with a 1-D LiDAR sensor. Additionally, we discuss practical Q-value estimation techniques for commonly encountered problem classes that our lazy planners can leverage. Our results show that lazy heuristic search methods dramatically improve planning speed by postponing expensive belief transition evaluations while maintaining solution quality.
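To make the abstract's core idea concrete, below is a minimal, hypothetical Python sketch of (i) the exact Bayesian belief update that makes node expansion expensive and (ii) a lazy backup that postpones it by ranking actions with a cheap Q-value estimate. The function names (belief_update, lazy_backup, estimate_q, compute_q_exact) and the early-stopping rule are illustrative assumptions, not the paper's actual Lazy RTDP-Bel or Lazy LAO* algorithms; the early stop is only sound if the estimate is an optimistic bound on the true Q-value.

from collections import defaultdict

def belief_update(belief, action, observation, T, Z):
    """Exact Bayesian successor belief:
        b'(s') ∝ Z(o | s', a) * sum_s T(s' | s, a) * b(s).
    This is the expensive step the lazy planners postpone: evaluating T and Z
    may hide physics simulation, raycasting, or collision checking."""
    new_belief = defaultdict(float)
    for s, p in belief.items():
        if p == 0.0:
            continue
        for s_next, p_trans in T(s, action).items():
            new_belief[s_next] += p * p_trans * Z(s_next, action, observation)
    norm = sum(new_belief.values())
    if norm == 0.0:
        return dict(belief)  # observation impossible under the model; keep prior
    return {s: p / norm for s, p in new_belief.items()}

def lazy_backup(belief, actions, estimate_q, compute_q_exact):
    """One lazy Bellman backup over a belief node (illustrative sketch).
    Actions are ordered by a cheap Q-value estimate; the exact Q-value, which
    requires generating successor beliefs via belief_update, is computed only
    until the incumbent beats every remaining (assumed optimistic) estimate."""
    est = {a: estimate_q(belief, a) for a in actions}  # cheap pass: no belief transitions
    best_action, best_value = None, float("inf")
    for a in sorted(actions, key=est.get):
        if best_value <= est[a]:
            break  # incumbent beats all remaining estimates; skip their expensive transitions
        q = compute_q_exact(belief, a)  # expensive pass: triggers Bayesian belief updates
        if q < best_value:
            best_action, best_value = a, q
    return best_action, best_value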
Comments: Accepted for publication at The 18th International Symposium on Combinatorial Search (SOCS 2025)
Subjects: Robotics (cs.RO)
Cite as: arXiv:2506.00285 [cs.RO]
  (or arXiv:2506.00285v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2506.00285
arXiv-issued DOI via DataCite

Submission history

From: Muhammad Suhail Saleem
[v1] Fri, 30 May 2025 22:26:26 UTC (269 KB)