Computer Science > Computer Vision and Pattern Recognition

arXiv:2507.00392v2 (cs)
[Submitted on 1 Jul 2025 (v1), last revised 5 Jul 2025 (this version, v2)]

Title: Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space


Authors:Yingping Liang, Yutao Hu, Wenqi Shao, Ying Fu
Abstract: Feature matching plays a fundamental role in many computer vision tasks, yet existing methods rely heavily on scarce, clean multi-view image collections, which constrains their generalization to diverse and challenging scenarios. Moreover, conventional feature encoders are typically trained on single-view 2D images, limiting their capacity to capture 3D-aware correspondences. In this paper, we propose a novel two-stage framework, \textbf{Lift to Match (L2M)}, that lifts 2D images into 3D space, taking full advantage of large-scale and diverse single-view images. Specifically, in the first stage, we learn a 3D-aware feature encoder using a combination of multi-view image synthesis and a 3D feature Gaussian representation, which injects 3D geometric knowledge into the encoder. In the second stage, a novel-view rendering strategy, combined with large-scale synthetic data generated from single-view images, is employed to learn a feature decoder for robust feature matching, achieving generalization across diverse domains. Extensive experiments show that our method achieves superior generalization on zero-shot evaluation benchmarks, highlighting the effectiveness of the proposed framework for robust feature matching.
Comments: Official Code: https://github.com/Sharpiless/L2M
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2507.00392 [cs.CV]
  (or arXiv:2507.00392v2 [cs.CV] for this version)
  https://doi.org/10.48550/arXiv.2507.00392
arXiv-issued DOI via DataCite
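
The dense feature matching the abstract targets is commonly posed as finding correspondences between per-pixel descriptors extracted from two images. As a minimal, generic illustration (not the paper's L2M pipeline — the function and variable names below are hypothetical stand-ins), the sketch matches two sets of L2-normalised descriptors by mutual nearest neighbours:

```python
import numpy as np

def mutual_nearest_matches(feat_a, feat_b):
    """Match two dense feature maps by mutual nearest neighbours.

    feat_a: (N, D) array of L2-normalised descriptors (flattened H*W grid).
    feat_b: (M, D) array of L2-normalised descriptors.
    Returns a (K, 2) array of index pairs (i in A, j in B).
    """
    # Cosine similarity between every descriptor pair.
    sim = feat_a @ feat_b.T                # (N, M)
    nn_ab = sim.argmax(axis=1)             # best B match for each A descriptor
    nn_ba = sim.argmax(axis=0)             # best A match for each B descriptor
    # Keep only pairs that agree in both directions.
    idx_a = np.arange(feat_a.shape[0])
    mutual = nn_ba[nn_ab] == idx_a
    return np.stack([idx_a[mutual], nn_ab[mutual]], axis=1)

# Toy example: B is a permuted copy of A, so every match should be recovered.
rng = np.random.default_rng(0)
a = rng.normal(size=(6, 8))
a /= np.linalg.norm(a, axis=1, keepdims=True)
perm = rng.permutation(6)
b = a[perm]
matches = mutual_nearest_matches(a, b)
```

Mutual-nearest-neighbour filtering is a standard way to suppress one-sided, ambiguous matches; learned matchers typically replace the raw cosine similarity with scores produced by a trained encoder/decoder.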

Submission history

From: Yingping Liang
[v1] Tue, 1 Jul 2025 03:07:21 UTC (7,912 KB)
[v2] Sat, 5 Jul 2025 23:13:08 UTC (7,912 KB)