Generative Blocks World: Moving Things Around in Pictures

Vavilala, Vaibhav; Jain, Seemandhar; Vasanth, Rahul; Forsyth, D. A.; Bhattad, Anand


arXiv:2506.20703 (cs)

[Submitted on 25 June 2025]

Title: Generative Blocks World: Moving Things Around in Pictures

Authors: Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, D.A. Forsyth, Anand Bhattad

Abstract: We describe Generative Blocks World, a method for interacting with the scene of a generated image by manipulating simple geometric abstractions. Our method represents scenes as assemblies of convex 3D primitives, and the same scene can be represented by different numbers of primitives, allowing an editor to move either whole structures or small details. Once the scene geometry has been edited, the image is generated by a flow-based method that is conditioned on depth and a texture hint. Our texture hint takes into account the modified 3D primitives, exceeding the texture consistency provided by existing key-value caching techniques. These texture hints (a) allow accurate object and camera moves and (b) largely preserve the identity of the objects depicted. Quantitative and qualitative experiments demonstrate that our approach outperforms prior work in visual fidelity, editability, and compositional generalization.

Comments:	23 pages, 16 figures, 2 tables
Subjects:	Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2506.20703 [cs.GR]
	(or arXiv:2506.20703v1 [cs.GR] for this version)
	https://doi.org/10.48550/arXiv.2506.20703

Submission history

From: Vaibhav Vavilala
[v1] Wed, 25 Jun 2025 17:59:55 UTC (7,124 KB)
