Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video

He, Jixuan; Lin, Chieh Hubert; Qi, Lu; Yang, Ming-Hsuan

Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.06715 (cs)

[Submitted on 8 Aug 2025]

Title: Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video

Authors: Jixuan He, Chieh Hubert Lin, Lu Qi, Ming-Hsuan Yang

Abstract: Creating deformable 3D content has gained increasing attention with the rise of text-to-image and image-to-video generative models. While these models provide rich semantic priors for appearance, they struggle to capture the physical realism and motion dynamics needed for authentic 4D scene synthesis. In contrast, real-world videos can provide physically grounded geometry and articulation cues that are difficult to hallucinate. This raises a question: can we generate physically consistent 4D content by leveraging the motion priors of real-world video? In this work, we explore the task of reanimating deformable 3D scenes from a single video, using the original sequence as a supervisory signal to correct artifacts from synthetic motion. We introduce Restage4D, a geometry-preserving pipeline for video-conditioned 4D restaging. Our approach uses a video-rewinding training strategy to temporally bridge a real base video and a synthetic driving video via a shared motion representation. We further incorporate an occlusion-aware rigidity loss and a disocclusion backtracing mechanism to improve structural and geometric consistency under challenging motion. We validate Restage4D on DAVIS and PointOdyssey, demonstrating improved geometric consistency, motion quality, and 3D tracking performance. Our method not only preserves deformable structure under novel motion but also automatically corrects errors introduced by generative models, revealing the potential of video priors for 4D restaging. Source code and trained models will be released.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2508.06715 [cs.CV]
	(or arXiv:2508.06715v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.06715

Submission history

From: Jixuan He
[v1] Fri, 8 Aug 2025 21:31:51 UTC (21,846 KB)
