Computer Science > Robotics

arXiv:2501.19259 (cs)
[Submitted on 31 Jan 2025 (v1), last revised 26 Apr 2025 (this version, v2)]

Title: Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge


Authors: Amogh Joshi, Sourav Sanyal, Kaushik Roy
Abstract: The integration of human-intuitive interactions into autonomous systems has been limited. Traditional Natural Language Processing (NLP) systems struggle with context and intent understanding, severely restricting human-robot interaction. Recent advancements in Large Language Models (LLMs) have transformed this dynamic, allowing for intuitive and high-level communication through speech and text, and bridging the gap between human commands and robotic actions. Additionally, autonomous navigation has emerged as a central focus in robotics research, with artificial intelligence (AI) increasingly being leveraged to enhance these systems. However, existing AI-based navigation algorithms face significant challenges in latency-critical tasks that demand rapid decision-making. Traditional frame-based vision systems, while effective for high-level decision-making, suffer from high energy consumption and latency, limiting their applicability in real-time scenarios. Neuromorphic vision systems, combining event-based cameras and spiking neural networks (SNNs), offer a promising alternative by enabling energy-efficient, low-latency navigation. Despite their potential, real-world implementations of these systems, particularly on physical platforms such as drones, remain scarce. In this work, we present Neuro-LIFT, a real-time neuromorphic navigation framework implemented on a Parrot Bebop2 quadrotor. Leveraging an LLM for natural language processing, Neuro-LIFT translates human speech into high-level planning commands, which are then autonomously executed using event-based neuromorphic vision and physics-driven planning. Our framework demonstrates its capabilities by navigating a dynamic environment, avoiding obstacles, and adapting to human instructions in real time.
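The abstract outlines a pipeline in which spoken instructions are parsed by an LLM into high-level planning commands that are then executed using event-based neuromorphic perception and physics-driven planning. Below is a minimal, hypothetical Python sketch of that control flow; the names PlanCommand, parse_speech_with_llm, and plan_and_execute are illustrative placeholders rather than the authors' implementation, and both the LLM call and the event-camera/SNN loop are stubbed out, since the actual code is not available on this page.

# Hypothetical sketch of the Neuro-LIFT pipeline described above: speech is
# parsed by an LLM into a structured high-level command, which is then handed
# to the neuromorphic perception/planning loop. All names are illustrative
# placeholders, not the authors' implementation.

from dataclasses import dataclass

@dataclass
class PlanCommand:
    # High-level planning command extracted from natural language.
    action: str   # e.g. "navigate" or "land"
    target: str   # free-form description of the goal
    speed: str    # e.g. "slow" or "fast"

def parse_speech_with_llm(utterance: str) -> PlanCommand:
    # Stand-in for the LLM step: in the real system an LLM would map free-form
    # speech to a structured command; here a trivial keyword heuristic keeps
    # the sketch runnable.
    action = "land" if "land" in utterance.lower() else "navigate"
    speed = "slow" if "slow" in utterance.lower() else "fast"
    return PlanCommand(action=action, target=utterance, speed=speed)

def plan_and_execute(cmd: PlanCommand) -> None:
    # Stand-in for the event-based perception + physics-driven planning loop:
    # the real system would read event-camera data, run the SNN to estimate
    # obstacle motion, and compute a dynamically feasible trajectory.
    print(f"Executing '{cmd.action}' at {cmd.speed} speed for: {cmd.target}")

if __name__ == "__main__":
    plan_and_execute(parse_speech_with_llm("Fly slowly through the moving gap"))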
Comments: Accepted for publication at the International Joint Conference on Neural Networks (IJCNN) 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Systems and Control (eess.SY)
Cite as: arXiv:2501.19259 [cs.RO]
  (or arXiv:2501.19259v2 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2501.19259
arXiv-issued DOI via DataCite

Submission history

From: Amogh Joshi
[v1] Fri, 31 Jan 2025 16:17:03 UTC (35,900 KB)
[v2] Sat, 26 Apr 2025 18:37:29 UTC (35,900 KB)