Fairness-Aware Data Augmentation for Cardiac MRI using Text-Conditioned Diffusion Models

Skorupko, Grzegorz; Osuala, Richard; Szafranowska, Zuzanna; Kushibar, Kaisar; Dang, Vien Ngoc; Aung, Nay; Petersen, Steffen E; Lekadir, Karim; Gkontra, Polyxeni

arXiv:2403.19508v2 (eess)

[Submitted on 28 Mar 2024 (v1), last revised 8 Sep 2025 (this version, v2)]

Abstract: While deep learning holds great promise for disease diagnosis and prognosis in cardiac magnetic resonance imaging, its progress is often constrained by highly imbalanced and biased training datasets. To address this issue, we propose a method to alleviate imbalances inherent in datasets through the generation of synthetic data based on sensitive attributes such as sex, age, body mass index (BMI), and health condition. We adopt ControlNet based on a denoising diffusion probabilistic model to condition on text assembled from patient metadata and cardiac geometry derived from segmentation masks. We assess our method using a large-cohort study from the UK Biobank by evaluating the realism of the generated images using established quantitative metrics. Furthermore, we conduct a downstream classification task aimed at debiasing a classifier by rectifying imbalances within underrepresented groups through synthetically generated samples. Our experiments demonstrate the effectiveness of the proposed approach in mitigating dataset imbalances, such as the scarcity of diagnosed female patients or individuals with normal BMI level suffering from heart failure. This work represents a major step towards the adoption of synthetic data for the development of fair and generalizable models for medical classification tasks. Notably, we conduct all our experiments using a single, consumer-level GPU to highlight the feasibility of our approach within resource-constrained environments. Our code is available at https://github.com/faildeny/debiasing-cardiac-mri.
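The two ideas in the abstract that lend themselves to a small illustration are (1) assembling a conditioning text from patient metadata and (2) deciding how many synthetic samples an underrepresented group needs. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Patient` fields, the prompt template, the BMI cut-offs (the commonly used WHO thresholds), and the "match the largest group" balancing rule are all hypothetical choices made for illustration. The actual pipeline is in the linked repository.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical metadata record; field names are assumptions.
    sex: str              # "female" or "male"
    age: int              # years
    bmi: float            # kg/m^2
    heart_failure: bool   # diagnosis label


def bmi_category(bmi: float) -> str:
    """Map a BMI value to a coarse category (common WHO cut-offs)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def build_prompt(p: Patient) -> str:
    """Assemble a conditioning text from metadata (hypothetical template)."""
    condition = "heart failure" if p.heart_failure else "healthy"
    return (
        f"cardiac MRI, {p.sex}, age {p.age}, "
        f"{bmi_category(p.bmi)} BMI, {condition}"
    )


def synthetic_counts(patients):
    """Synthetic samples each (sex, diagnosis) group needs to reach
    the size of the largest group (an assumed balancing rule)."""
    counts = Counter((p.sex, p.heart_failure) for p in patients)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}


if __name__ == "__main__":
    # Toy cohort: diagnosed female patients are underrepresented.
    cohort = [Patient("male", 60, 31.0, True)] * 3
    cohort.append(Patient("female", 55, 22.0, True))
    print(build_prompt(cohort[-1]))
    print(synthetic_counts(cohort))
```

In a setup like the one described, each prompt would be paired with a segmentation mask as the spatial condition for the generator; that step is omitted here because it depends on the specific model and data loaders.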

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2403.19508 [eess.IV] (or arXiv:2403.19508v2 [eess.IV] for this version)
DOI: https://doi.org/10.48550/arXiv.2403.19508

Submission history

From: Grzegorz Skorupko
[v1] Thu, 28 Mar 2024 15:41:43 UTC (2,948 KB)
[v2] Mon, 8 Sep 2025 09:37:31 UTC (2,592 KB)
