DinoAtten3D: Slice-Level Attention Aggregation of DinoV2 for 3D Brain MRI Anomaly Classification

Rafsani, Fazle; Shah, Jay; Chong, Catherine D.; Schwedt, Todd J.; Wu, Teresa

arXiv:2509.12512v1 (eess)

[Submitted on 15 Sep 2025]

Authors: Fazle Rafsani, Jay Shah, Catherine D. Chong, Todd J. Schwedt, Teresa Wu

Abstract: Anomaly detection and classification in medical imaging are critical for early diagnosis but remain challenging due to limited annotated data, class imbalance, and the high cost of expert labeling. Emerging vision foundation models such as DINOv2, pretrained on extensive, unlabeled datasets, offer generalized representations that can potentially alleviate these limitations. In this study, we propose an attention-based global aggregation framework tailored specifically for 3D medical image anomaly classification. Leveraging the self-supervised DINOv2 model as a pretrained feature extractor, our method processes individual 2D axial slices of brain MRIs, assigning adaptive slice-level importance weights through a soft attention mechanism. To further address data scarcity, we employ a composite loss function combining supervised contrastive learning with class-variance regularization, enhancing inter-class separability and intra-class consistency. We validate our framework on the ADNI dataset and an institutional multi-class headache cohort, demonstrating strong anomaly classification performance despite limited data availability and significant class imbalance. Our results highlight the efficacy of utilizing pretrained 2D foundation models combined with attention-based slice aggregation for robust volumetric anomaly detection in medical imaging. Our implementation is publicly available at https://github.com/Rafsani/DinoAtten3D.git.
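
The slice-level aggregation described in the abstract can be illustrated with a minimal sketch. The abstract only states that a soft attention mechanism assigns adaptive slice-level importance weights; the tanh scoring form, the function name `attention_pool`, and the parameters `proj` and `query` below are assumptions made for illustration, not the authors' implementation (see their repository for that). Random vectors stand in for per-slice DINOv2 embeddings, and the parameters, which would be learned in practice, are fixed at random values here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(slice_embeddings, proj, query):
    """Pool S slice embeddings (each of dim D) into one volume-level vector.

    proj  : H x D matrix (hypothetical parameter, learned in practice)
    query : H-vector     (hypothetical parameter, learned in practice)
    Assumed score per slice: query . tanh(proj @ h).
    """
    scores = []
    for h in slice_embeddings:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in proj]
        scores.append(sum(q * z for q, z in zip(query, hidden)))
    weights = softmax(scores)  # one importance weight per slice, summing to 1
    dim = len(slice_embeddings[0])
    pooled = [
        sum(a * h[d] for a, h in zip(weights, slice_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


# Toy usage: 6 axial slices, 4-dim embeddings standing in for DINOv2 features.
rng = random.Random(0)
S, D, H = 6, 4, 3
slices = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(S)]
proj = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
query = [rng.gauss(0, 1) for _ in range(H)]

pooled, weights = attention_pool(slices, proj, query)
print("slice weights:", [round(w, 3) for w in weights])
```

The pooled vector is a convex combination of the slice embeddings, so slices with higher scores contribute more to the volume-level representation that a classifier head would consume.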

Comments:	Accepted at the ICCV 2025 Workshop on Anomaly Detection with Foundation Models
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2509.12512 [eess.IV]
	(or arXiv:2509.12512v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2509.12512

Submission history

From: Fazle Rafsani
[v1] Mon, 15 Sep 2025 23:31:40 UTC (1,741 KB)
