
Computer Science > Computer Vision and Pattern Recognition

arXiv:2507.00377 (cs)
[Submitted on 1 Jul 2025]

Title: MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis


Authors: Jianhao Xie, Ziang Zhang, Zhenyu Weng, Yuesheng Zhu, Guibo Luo
Abstract: Recent advancements in deep learning for medical image segmentation are often limited by the scarcity of high-quality training data. While diffusion models offer a potential solution by generating synthetic images, their effectiveness in medical imaging remains constrained by their reliance on large-scale medical datasets and their need for high image quality. To address these challenges, we present MedDiff-FT, a controllable medical image generation method that fine-tunes a diffusion foundation model to produce medical images with structural dependency and domain specificity in a data-efficient manner. During inference, a dynamic adaptive guiding mask enforces spatial constraints to ensure anatomically coherent synthesis, while a lightweight stochastic mask generator enhances diversity through hierarchical randomness injection. Additionally, an automated quality assessment protocol filters suboptimal outputs using feature-space metrics, followed by mask corrosion to refine fidelity. Evaluated on five medical segmentation datasets, MedDiff-FT's synthetic image-mask pairs improve the segmentation performance of SOTA methods by an average of 1% in Dice score. The framework effectively balances generation quality, diversity, and computational efficiency, offering a practical solution for medical data augmentation. The code is available at https://github.com/JianhaoXie1/MedDiff-FT.
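To make the mask-guided generation step concrete, here is a minimal illustrative sketch, not the authors' released code (which lives at the GitHub link above). It assumes an off-the-shelf Stable Diffusion inpainting pipeline from the `diffusers` library stands in for the fine-tuned foundation model; the function name `make_stochastic_mask` is hypothetical, and mixing smooth noise fields at several scales is one plausible reading of "hierarchical randomness injection".

```python
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

def make_stochastic_mask(size=512, scales=(8, 32, 128), thresh=0.6, seed=None):
    """Hierarchical randomness: draw uniform noise at several coarse
    resolutions, upsample each field to full size, average them, and
    threshold the mixture into a binary lesion-like mask."""
    rng = np.random.default_rng(seed)
    field = np.zeros((size, size), dtype=np.float32)
    for s in scales:
        coarse = rng.random((s, s)).astype(np.float32)
        up = np.array(
            Image.fromarray(coarse, mode="F").resize((size, size), Image.BILINEAR)
        )
        field += up / len(scales)
    return ((field > thresh) * 255).astype(np.uint8)

# The mask spatially constrains where the diffusion model may synthesize
# new (lesion) content; everything outside the mask is preserved.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",  # stand-in checkpoint
    torch_dtype=torch.float16,
).to("cuda")

base = Image.open("healthy_scan.png").convert("RGB").resize((512, 512))
mask = Image.fromarray(make_stochastic_mask(seed=0))  # white = synthesize here
image = pipe(prompt="a dermoscopic image of a skin lesion",
             image=base, mask_image=mask).images[0]
image.save("synthetic_image.png")  # the mask itself is the paired label
```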
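The quality-assessment and mask-corrosion steps could likewise be sketched as below. The ResNet-18 embedding and the cosine-similarity threshold are assumptions standing in for the paper's unspecified feature-space metric, and "mask corrosion" is read here as standard morphological erosion.

```python
import numpy as np
import torch
from PIL import Image
from scipy.ndimage import binary_erosion
from torchvision import models, transforms

# Assumed feature extractor for the feature-space quality metric:
# an ImageNet-pretrained ResNet-18 with the classifier head removed.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(img: Image.Image) -> torch.Tensor:
    return backbone(prep(img).unsqueeze(0)).squeeze(0)

def passes_quality_gate(synthetic, references, min_cos=0.5):
    """Filter suboptimal outputs: reject a synthetic image whose embedding
    is too far (in cosine similarity) from the mean embedding of a small
    set of real reference images."""
    ref = torch.stack([embed(r) for r in references]).mean(0)
    cos = torch.nn.functional.cosine_similarity(embed(synthetic), ref, dim=0)
    return cos.item() >= min_cos

def corrode_mask(mask: np.ndarray, iterations: int = 2) -> np.ndarray:
    """Mask corrosion as morphological erosion: shrink the mask boundary
    so the label hugs the synthesized region more tightly."""
    eroded = binary_erosion(mask > 0, iterations=iterations)
    return (eroded * 255).astype(np.uint8)
```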
Comments: 11 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2507.00377 [cs.CV]
  (or arXiv:2507.00377v1 [cs.CV] for this version)
  https://doi.org/10.48550/arXiv.2507.00377
arXiv-issued DOI via DataCite

Submission history

From: Guibo Luo
[v1] Tue, 1 Jul 2025 02:22:32 UTC (2,873 KB)
Access Paper:

  • View PDF
  • HTML (experimental)
  • TeX Source
  • Other Formats