
Electrical Engineering and Systems Science > Image and Video Processing

arXiv:2509.12512v1 (eess)
[Submitted on 15 Sep 2025]

Title: DinoAtten3D: Slice-Level Attention Aggregation of DinoV2 for 3D Brain MRI Anomaly Classification

Authors: Fazle Rafsani, Jay Shah, Catherine D. Chong, Todd J. Schwedt, Teresa Wu
Abstract: Anomaly detection and classification in medical imaging are critical for early diagnosis but remain challenging due to limited annotated data, class imbalance, and the high cost of expert labeling. Emerging vision foundation models such as DINOv2, pretrained on extensive, unlabeled datasets, offer generalized representations that can potentially alleviate these limitations. In this study, we propose an attention-based global aggregation framework tailored specifically for 3D medical image anomaly classification. Leveraging the self-supervised DINOv2 model as a pretrained feature extractor, our method processes individual 2D axial slices of brain MRIs, assigning adaptive slice-level importance weights through a soft attention mechanism. To further address data scarcity, we employ a composite loss function combining supervised contrastive learning with class-variance regularization, enhancing inter-class separability and intra-class consistency. We validate our framework on the ADNI dataset and an institutional multi-class headache cohort, demonstrating strong anomaly classification performance despite limited data availability and significant class imbalance. Our results highlight the efficacy of utilizing pretrained 2D foundation models combined with attention-based slice aggregation for robust volumetric anomaly detection in medical imaging. Our implementation is publicly available at https://github.com/Rafsani/DinoAtten3D.git.
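The two mechanisms named in the abstract, soft attention over slice embeddings and a composite loss, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (see their repository for that). It assumes each axial slice has already been mapped to a feature vector by a frozen extractor such as DINOv2, that attention scores come from a single hypothetical learned vector `score_vector`, and that the composite loss is a plain weighted sum. The names `attention_pool`, `class_variance`, `composite_loss`, `variance_weight`, and `temperature` are illustrative assumptions, not taken from the paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(slice_embeddings, score_vector):
    """Collapse S slice embeddings (each of length D) into one volume embedding.

    score_vector is a hypothetical learned parameter; a real model would
    likely use a small network here instead of a single dot product.
    """
    scores = [dot(score_vector, emb) for emb in slice_embeddings]
    weights = softmax(scores)  # one importance weight per slice, summing to 1
    dim = len(slice_embeddings[0])
    pooled = [sum(w * emb[d] for w, emb in zip(weights, slice_embeddings))
              for d in range(dim)]
    return pooled, weights


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive(embeddings, labels, temperature=0.1):
    """Standard supervised contrastive loss, averaged over anchors with positives."""
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        losses.append(-sum(logits[p] - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


def class_variance(embeddings, labels):
    """Mean squared distance of each embedding to its own class centroid."""
    groups = {}
    for emb, label in zip(embeddings, labels):
        groups.setdefault(label, []).append(emb)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        for m in members:
            diff = [a - c for a, c in zip(m, centroid)]
            total += dot(diff, diff)
    return total / len(embeddings)


def composite_loss(embeddings, labels, variance_weight=0.5, temperature=0.1):
    # Assumed form: contrastive term plus a weighted intra-class variance penalty.
    return (supervised_contrastive(embeddings, labels, temperature)
            + variance_weight * class_variance(embeddings, labels))


# Toy volume: three slices with two-dimensional stand-in features.
volume = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(volume, score_vector=[10.0, 0.0])
```

In this toy call the scoring vector favours the first feature, so the first and third slices receive nearly all of the attention weight and the second slice is almost ignored. The contrastive term pulls same-class volume embeddings together relative to other classes, while the variance term penalises spread around each class centroid.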
Comments: Accepted at the ICCV 2025 Workshop on Anomaly Detection with Foundation Models
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2509.12512 [eess.IV]
  (or arXiv:2509.12512v1 [eess.IV] for this version)
  https://doi.org/10.48550/arXiv.2509.12512
arXiv-issued DOI via DataCite

Submission history

From: Fazle Rafsani
[v1] Mon, 15 Sep 2025 23:31:40 UTC (1,741 KB)