Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection

Qian, Long; Zhu, Bingke; Chen, Yingying; Tang, Ming; Wang, Jinqiao


arXiv:2508.03539 (cs)

[Submitted on 5 Aug 2025]


Abstract: Despite substantial progress in anomaly synthesis, existing diffusion-based and coarse inpainting pipelines commonly suffer from structural deficiencies such as micro-structural discontinuities, limited semantic controllability, and inefficient generation. To overcome these limitations, we introduce ARAS, a language-conditioned, auto-regressive anomaly synthesis approach that precisely injects local, text-specified defects into normal images via token-anchored latent editing. Leveraging a hard-gated auto-regressive operator and a training-free, context-preserving masked sampling kernel, ARAS significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies. Within our Quality-Aware Re-weighted Anomaly Detection (QARAD) framework, we further propose a dynamic weighting strategy that emphasizes high-quality synthetic samples by computing an image-text similarity score with a dual-encoder model. Extensive experiments on three benchmark datasets (MVTec AD, VisA, and BTAD) demonstrate that QARAD outperforms state-of-the-art (SOTA) methods in both image- and pixel-level anomaly detection, achieving improved accuracy and robustness along with a 5x synthesis speedup over diffusion-based alternatives. Our complete code and synthesized dataset will be made publicly available.
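
The quality-aware weighting idea in the abstract can be illustrated with a minimal sketch. The abstract says only that an image-text similarity score from a dual-encoder model is used to emphasize high-quality synthetic samples; it does not give the formula. Everything below is therefore an assumption made for illustration: the function names (`cosine_similarity`, `quality_weights`, `weighted_loss`), the softmax-with-temperature mapping from similarity to weight, and the toy embeddings standing in for real encoder outputs. It is not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def quality_weights(image_embs, text_embs, temperature=0.1):
    """Map per-sample image-text similarity to weights with mean 1.

    Assumption: a temperature-scaled softmax over the batch. The paper's
    actual weighting rule may differ.
    """
    sims = [cosine_similarity(i, t) for i, t in zip(image_embs, text_embs)]
    top = max(sims)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    n = len(sims)
    return [n * e / total for e in exps]


def weighted_loss(losses, weights):
    """Batch mean of per-sample losses, re-weighted by sample quality."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy embeddings (hypothetical stand-ins for dual-encoder outputs).
# Each synthetic image is compared with the embedding of its defect prompt.
image_embs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
text_embs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
weights = quality_weights(image_embs, text_embs)
print(weights)
print(weighted_loss([0.2, 0.5, 0.9], weights))
```

In this sketch, samples whose image embedding agrees with the prompt embedding receive larger weights, so they contribute more to the training loss. Rescaling the weights to mean 1 keeps the loss on the same scale as an unweighted batch mean.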


Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2508.03539 [cs.CV]
	(or arXiv:2508.03539v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2508.03539

Submission history

From: Qian Long
[v1] Tue, 5 Aug 2025 15:07:32 UTC (22,224 KB)
