
Computer Science > Robotics

arXiv:2509.11481 (cs)
[Submitted on 15 Sep 2025]

Title: RAPTOR: A Foundation Policy for Quadrotor Control


Authors: Jonas Eschmann, Dario Albani, Giuseppe Loianno
Abstract: Humans are remarkably data-efficient when adapting to new unseen conditions, like driving a new car. In contrast, modern robotic control systems, like neural network policies trained using Reinforcement Learning (RL), are highly specialized for single environments. Because of this overfitting, they are known to break down even under small differences like the Simulation-to-Reality (Sim2Real) gap and require system identification and retraining for even minimal changes to the system. In this work, we present RAPTOR, a method for training a highly adaptive foundation policy for quadrotor control. Our method enables training a single, end-to-end neural-network policy to control a wide variety of quadrotors. We test 10 different real quadrotors from 32 g to 2.4 kg that also differ in motor type (brushed vs. brushless), frame type (soft vs. rigid), propeller type (2/3/4-blade), and flight controller (PX4/Betaflight/Crazyflie/M5StampFly). We find that a tiny, three-layer policy with only 2084 parameters is sufficient for zero-shot adaptation to a wide variety of platforms. The adaptation through In-Context Learning is made possible by using a recurrence in the hidden layer. The policy is trained through a novel Meta-Imitation Learning algorithm, where we sample 1000 quadrotors and train a teacher policy for each of them using Reinforcement Learning. Subsequently, the 1000 teachers are distilled into a single, adaptive student policy. We find that within milliseconds, the resulting foundation policy adapts zero-shot to unseen quadrotors. We extensively test the capabilities of the foundation policy under numerous conditions (trajectory tracking, indoor/outdoor, wind disturbance, poking, different propellers).
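The abstract describes two core ideas: a tiny recurrent policy whose hidden state enables in-context adaptation, and a Meta-Imitation Learning scheme that distills many per-quadrotor RL teachers into one student. The sketch below illustrates both at a toy scale. All dimensions, the simple-RNN hidden layer, and the random linear "teachers" are illustrative assumptions, not the paper's exact 2084-parameter architecture or training algorithm; no gradient update is shown, only the imitation objective.

```python
import numpy as np

class RecurrentPolicy:
    """Tiny three-layer policy with a recurrent hidden layer.

    The hidden state carries the interaction history, which is what
    lets a single set of weights adapt online (in-context) to an
    unseen quadrotor. Layer sizes here are illustrative guesses.
    """

    def __init__(self, obs_dim=18, hidden_dim=16, act_dim=4, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.W_in = rng.normal(0.0, s, (hidden_dim, obs_dim))    # input layer
        self.W_h = rng.normal(0.0, s, (hidden_dim, hidden_dim))  # recurrence
        self.W_out = rng.normal(0.0, s, (act_dim, hidden_dim))   # output layer
        self.h = np.zeros(hidden_dim)

    def reset(self):
        # Fresh hidden state when deployed on a new platform.
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        return np.tanh(self.W_out @ self.h)  # motor commands in [-1, 1]

def imitation_loss(student, teachers, episodes):
    """Meta-imitation sketch: one student is regressed onto many
    per-quadrotor teachers, each on its own observation stream."""
    total, count = 0.0, 0
    for teacher, obs_seq in zip(teachers, episodes):
        student.reset()
        for obs in obs_seq:
            err = student.act(obs) - teacher(obs)
            total += float(err @ err)
            count += 1
    return total / count

# Stand-in teachers: a fixed random linear controller per sampled
# quadrotor (the paper trains each teacher with RL instead).
rng = np.random.default_rng(1)
teachers = [
    (lambda W: (lambda o: np.tanh(W @ o)))(rng.normal(0, 0.1, (4, 18)))
    for _ in range(5)
]
episodes = [rng.normal(0, 1, (20, 18)) for _ in teachers]

student = RecurrentPolicy()
loss = imitation_loss(student, teachers, episodes)
```

Minimizing this loss over the student's weights (by gradient descent, in the paper's case over 1000 teachers) is what produces the single adaptive foundation policy.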
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2509.11481 [cs.RO]
  (or arXiv:2509.11481v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2509.11481
arXiv-issued DOI via DataCite

Submission history

From: Jonas Eschmann [view email]
[v1] Mon, 15 Sep 2025 00:05:40 UTC (16,286 KB)
Full-text links:

Access Paper:

  • View PDF
  • HTML (experimental)
  • TeX Source