Beyond Blur: A Fluid Perspective on Generative Diffusion Models

Gruszczynski, Grzegorz; Wlodarczyk, Michal Jan; Meixner, Jakub J; Musialski, Przemyslaw

Computer Science > Graphics

arXiv:2506.16827v1 (cs)

[Submitted on 20 Jun 2025]

Title: Beyond Blur: A Fluid Perspective on Generative Diffusion Models

Authors: Grzegorz Gruszczynski, Michal Jan Wlodarczyk, Jakub J Meixner, Przemyslaw Musialski

Abstract: We propose a novel PDE-driven corruption process for generative image synthesis, based on advection-diffusion processes, that generalizes existing PDE-based approaches. Our forward pass formulates image corruption via a physically motivated PDE that couples directional advection with isotropic diffusion and Gaussian noise, controlled by dimensionless numbers (Peclet, Fourier). We implement this PDE numerically through a GPU-accelerated custom Lattice Boltzmann solver for fast evaluation. To induce realistic turbulence, we generate stochastic velocity fields that introduce coherent motion and capture multi-scale mixing. In the generative process, a neural network learns to reverse the advection-diffusion operator, thus constituting a novel generative model. We discuss how previous methods emerge as specific cases of our operator, demonstrating that our framework generalizes prior PDE-based corruption techniques. We illustrate how advection improves the diversity and quality of the generated images while keeping the overall color palette unaffected. This work bridges fluid dynamics, dimensionless PDE theory, and deep generative modeling, offering a fresh perspective on physically informed image corruption processes for diffusion-based synthesis.
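
To make the forward corruption described in the abstract concrete, here is a minimal sketch of an advection-diffusion step on an image. It is not the authors' method: it swaps the Lattice Boltzmann solver for a plain explicit finite-difference scheme, uses a uniform velocity instead of a stochastic turbulent field, and adds Gaussian noise once at the end. The function name `corrupt`, its parameters, and the nondimensional form d(phi)/d(tau) + Pe * u . grad(phi) = laplacian(phi), with Pe = U L / D and tau running from 0 to Fo = D t / L^2, are illustrative assumptions based on the standard definitions of the Peclet and Fourier numbers.

```python
import numpy as np


def corrupt(image, velocity, peclet, fourier, n_steps=100, noise_std=0.0, seed=0):
    """Hypothetical forward corruption: advect and diffuse a square 2D image.

    Solves d(phi)/d(tau) + Pe * u . grad(phi) = laplacian(phi) on a periodic
    unit domain, integrating tau from 0 to `fourier` with explicit Euler steps.
    `velocity` is a pair (ux, uy) of scalars or arrays matching the image shape.
    """
    phi = np.asarray(image, dtype=np.float64).copy()
    n = phi.shape[0]
    dx = 1.0 / n                 # grid spacing for domain length L = 1
    dtau = fourier / n_steps     # nondimensional time step
    if dtau > dx**2 / 4.0:
        raise ValueError("time step too large for the explicit scheme; raise n_steps")
    ux, uy = velocity

    for _ in range(n_steps):
        # Central differences with periodic wrap-around (axis 1 is x, axis 0 is y).
        dphi_dx = (np.roll(phi, -1, axis=1) - np.roll(phi, 1, axis=1)) / (2.0 * dx)
        dphi_dy = (np.roll(phi, -1, axis=0) - np.roll(phi, 1, axis=0)) / (2.0 * dx)
        laplacian = (
            np.roll(phi, -1, axis=0) + np.roll(phi, 1, axis=0)
            + np.roll(phi, -1, axis=1) + np.roll(phi, 1, axis=1)
            - 4.0 * phi
        ) / dx**2
        phi = phi + dtau * (laplacian - peclet * (ux * dphi_dx + uy * dphi_dy))

    if noise_std > 0.0:
        # Assumption: noise is added once, after the PDE evolution.
        rng = np.random.default_rng(seed)
        phi = phi + noise_std * rng.standard_normal(phi.shape)
    return phi


# Example: corrupt a random 32x32 single-channel "image".
image = np.random.default_rng(1).random((32, 32))
blurred = corrupt(image, velocity=(1.0, 0.5), peclet=10.0, fourier=0.01)
```

Setting `peclet=0` in this sketch leaves pure diffusion, which is one way to read the abstract's statement that earlier blur-based corruptions arise as special cases. With a uniform velocity and periodic boundaries, the scheme also preserves the mean intensity of the image.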

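Comments:	11 pages, 8 figures, pre-print, supplementary pseudocode in appendix
Subjects:	Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
ACM classes:	I.2.6; I.4.10; I.4.8
Cite as:	arXiv:2506.16827 [cs.GR]
	(or arXiv:2506.16827v1 [cs.GR] for this version)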
	https://doi.org/10.48550/arXiv.2506.16827

Submission history

From: Przemyslaw Musialski
[v1] Fri, 20 Jun 2025 08:31:30 UTC (47,804 KB)
