MS-RAFT-3D: A Multi-Scale Architecture for Recurrent Image-Based Scene Flow

Schmid, Jakob; Jahedi, Azin; Senn, Noah Berenguel; Bruhn, Andrés

Computer Science > Computer Vision and Pattern Recognition

arXiv:2506.01443 (cs)

[Submitted on 2 Jun 2025]

Title: MS-RAFT-3D: A Multi-Scale Architecture for Recurrent Image-Based Scene Flow

Authors: Jakob Schmid, Azin Jahedi, Noah Berenguel Senn, Andrés Bruhn

Abstract: Although multi-scale concepts have recently proven useful for recurrent network architectures in the fields of optical flow and stereo, they have not yet been considered for image-based scene flow. Hence, based on a single-scale recurrent scene flow backbone, we develop a multi-scale approach that generalizes successful hierarchical ideas from optical flow to image-based scene flow. By considering suitable concepts for the feature and the context encoder, the overall coarse-to-fine framework, and the training loss, we succeed in designing a scene flow approach that outperforms the current state of the art on KITTI and Spring by 8.7% (3.89 vs. 4.26) and 65.8% (9.13 vs. 26.71), respectively. Our code is available at https://github.com/cv-stuttgart/MS-RAFT-3D.
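The percentages quoted in the abstract are consistent with a relative error reduction, (baseline - ours) / baseline. The following is a minimal sketch of that arithmetic, assuming that lower values are better on both benchmarks; the function name `relative_improvement` is hypothetical, and the abstract does not name the underlying error metrics.

```python
def relative_improvement(ours: float, baseline: float) -> float:
    """Relative error reduction in percent, assuming lower is better."""
    return (baseline - ours) / baseline * 100.0


# Values taken from the abstract: (ours vs. previous state of the art).
kitti = relative_improvement(ours=3.89, baseline=4.26)
spring = relative_improvement(ours=9.13, baseline=26.71)

print(f"KITTI:  {kitti:.1f}%")   # rounds to the 8.7% stated in the abstract
print(f"Spring: {spring:.1f}%")  # rounds to the 65.8% stated in the abstract
```

Both results round to the figures given in the abstract, which suggests the improvements are reported relative to the previous best result rather than as absolute differences.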

Comments:	ICIP 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2506.01443 [cs.CV]
	(or arXiv:2506.01443v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2506.01443

Submission history

From: Azin Jahedi
[v1] Mon, 2 Jun 2025 08:59:05 UTC (47,824 KB)
