Abstract
Anomaly detection is typically formulated as an unsupervised learning task that identifies images deviating from normal images. Two main challenges arise in anomaly detection tasks: class imbalance and the unexpectedness of anomalies. In this paper, we propose GTrans, a Transformer-based multiresolution feature guidance method for unsupervised anomaly detection and localization. In GTrans, an Anomaly Guided Network (AGN) pre-trained on ImageNet provides surrogate labels for features and tokens. Under the tacit knowledge guidance of the AGN, the anomaly detection network, named Trans, uses a Transformer to relate features across multiple resolutions, which improves its ability to fit the normal data manifold. Because the AGN generalizes strongly, GTrans locates anomalies by comparing the differences in spatial distance and direction between the multi-scale features extracted by the AGN and by the Trans. Our experiments demonstrate that GTrans achieves state-of-the-art performance in both detection and localization on the MVTec AD dataset, with image-level and pixel-level AUROC scores of 99.0% and 97.9%, respectively.
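The idea of scoring anomalies by comparing the distance and direction of multi-scale features from two networks can be illustrated with a minimal sketch. It is not the GTrans scoring function: the squared Euclidean distance, the cosine-based direction term, the weight `lam`, the nearest-neighbour upsampling, and all function names are assumptions chosen for illustration. Feature maps are plain nested lists of shape [H][W][C].

```python
import math


def location_score(guide_vec, student_vec, lam=0.5):
    """Discrepancy between two feature vectors at one spatial location.

    Distance term: squared Euclidean distance.
    Direction term: 1 - cosine similarity.
    `lam` is a hypothetical weight balancing the two terms.
    """
    dist = sum((g - s) ** 2 for g, s in zip(guide_vec, student_vec))
    dot = sum(g * s for g, s in zip(guide_vec, student_vec))
    norm_g = math.sqrt(sum(g * g for g in guide_vec))
    norm_s = math.sqrt(sum(s * s for s in student_vec))
    cos = dot / (norm_g * norm_s + 1e-12)
    return lam * dist + (1.0 - lam) * (1.0 - cos)


def scale_map(guide_feat, student_feat, lam=0.5):
    """Per-location discrepancy map for one feature scale."""
    return [[location_score(g, s, lam) for g, s in zip(g_row, s_row)]
            for g_row, s_row in zip(guide_feat, student_feat)]


def upsample_nearest(score_map, out_h, out_w):
    """Nearest-neighbour resize of a 2D score map to out_h x out_w."""
    in_h, in_w = len(score_map), len(score_map[0])
    return [[score_map[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)] for i in range(out_h)]


def anomaly_map(guide_feats, student_feats, out_h, out_w, lam=0.5):
    """Sum the upsampled discrepancy maps over all scales."""
    total = [[0.0] * out_w for _ in range(out_h)]
    for g_feat, s_feat in zip(guide_feats, student_feats):
        up = upsample_nearest(scale_map(g_feat, s_feat, lam), out_h, out_w)
        for i in range(out_h):
            for j in range(out_w):
                total[i][j] += up[i][j]
    return total


def image_score(amap):
    """Image-level score: the largest pixel-level score."""
    return max(max(row) for row in amap)


# Toy data: a 2x2 scale and a 1x1 scale, two channels each.
# The student disagrees with the guide only at location (1, 1) of the first scale.
guide_feats = [
    [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]]],
    [[[1.0, 1.0]]],
]
student_feats = [
    [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]],
    [[[1.0, 1.0]]],
]
amap = anomaly_map(guide_feats, student_feats, out_h=4, out_w=4)
print(image_score(amap))
```

In this toy case the disagreement at one coarse location shows up as a high-scoring block in the bottom-right corner of the 4x4 map, while locations where the two networks agree stay near zero. Thresholding the map would give a localization mask, and the maximum gives an image-level score.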
Notes
Since the training set in GTrans contains only normal images without any labels, this data setup can generally be considered unsupervised [24].
However, the AGN is knowledgeable about anomalous images because of its strong generalization, while Trans is unfamiliar with such images. This is because the AGN is pre-trained on ImageNet, which generalizes well across datasets [49], while Trans is trained from scratch.
The datasets analysed during the current study are available at https://www.mvtec.com/company/research/datasets/mvtec-ad.
References
Liu J, Guo F, Zhang Y, Hou B, Zhou H (2021) Defect classification on limited labeled samples with multiscale feature fusion and semi-supervised learning. Appl Intell pp 1–16
Wu J, Le J, Xiao Z, Zhang F, Geng L, Liu Y, Wang W (2021) Automatic fabric defect detection using a wide-and-light network. Appl Intell 51(7):4945–4961
He L, Shi N, Malik K, Li F (2021) Unsupervised defect inspection algorithm based on cascaded gan with edge repair feature fusion. Appl Intell pp 1–19
Zhou K, Xiao Y, Yang J, Cheng J, Liu W, Luo W, Gu Z, Liu J, Gao S (2020) Encoding structure-texture relation with p-net for anomaly detection in retinal images. In: Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX, Springer, pp 360–377
Zhao X, Su J, Cai J, Yang H, Xi T (2021) Vehicle anomalous trajectory detection algorithm based on road network partition. Appl Intell pp 1–19
Zhou W, Liang Y (2021) An immune optimization based deterministic dendritic cell algorithm. Appl Intell pp 1–16
Zheng X, Chen J, Wang H, Zheng S, Kong Y (2021) A deep learning-based approach for the automated surface inspection of copper clad laminate images. Appl Intell 51(3):1262–1279
Qian Q, Chen L, Li H, Jin R (2020) Dr loss: Improving object detection by distributional ranking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12164–12172
Perera P, Nallapati R, Xiang B (2019) Ocgan: One-class novelty detection using gans with constrained latent representations. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2898–2906
Gong D, Liu L, Le V, Saha B, Mansour MR, Venkatesh S, Hengel Avd (2019) Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 1705–1714
An J, Cho S (2015) Variational autoencoder based anomaly detection using reconstruction probability. Special Lecture on IE 2(1):1–18
Ruff L, Vandermeulen R, Goernitz N, Deecke L, Siddiqui SA, Binder A, Müller E, Kloft M (2018) Deep one-class classification. In: International conference on machine learning, PMLR, pp 4393–4402
Bergmann P, Fauser M, Sattlegger D, Steger C (2019) Mvtec ad–a comprehensive real-world dataset for unsupervised anomaly detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9592–9600
Pidhorskyi S, Almohsen R, Adjeroh DA, Doretto G (2018) Generative probabilistic novelty detection with adversarial autoencoders. arXiv:1807.02588
Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G (2017) Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: International conference on information processing in medical imaging, Springer, pp 146–157
Akcay S, Atapour-Abarghouei A, Breckon TP (2018) Ganomaly: Semi-supervised anomaly detection via adversarial training. In: Asian conference on computer vision, Springer, pp 622–637
Schlegl T, Seeböck P, Waldstein SM, Langs G, Schmidt-Erfurth U (2019) f-anogan: Fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal 54:30–44
Fei Y, Huang C, Jinkun C, Li M, Zhang Y, Lu C (2020) Attribute restoration framework for anomaly detection. IEEE Trans Multimed
Abati D, Porrello A, Calderara S, Cucchiara R (2019) Latent space autoregression for novelty detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 481–490
Cohen N, Hoshen Y (2020) Sub-image anomaly detection with deep pyramid correspondences. arXiv:2005.02357
Bergmann P, Fauser M, Sattlegger D, Steger C (2020) Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4183–4192
Salehi M, Sadjadi N, Baselizadeh S, Rohban MH, Rabiee HR (2021) Multiresolution knowledge distillation for anomaly detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14902–14912
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
Gudovskiy D, Ishizaka S, Kozuka K (2022) Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 98–107
Bergmann P, Löwe S, Fauser M, Sattlegger D, Steger C (2018) Improving unsupervised defect segmentation by applying structural similarity to autoencoders. arXiv:1807.02011
Sato K, Hama K, Matsubara T, Uehara K (2019) Predictable uncertainty-aware unsupervised deep anomaly segmentation. In: 2019 International joint conference on neural networks (IJCNN), IEEE, pp 1–7
Liu W, Li R, Zheng M, Karanam S, Wu Z, Bhanu B, Radke RJ, Camps O (2020) Towards visually explaining variational autoencoders. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8642–8651
Lu Y, Xu P (2018) Anomaly detection for skin disease images using variational autoencoder. arXiv:1807.01349
Sabokrou M, Khalooei M, Fathy M, Adeli E (2018) Adversarially learned one-class classifier for novelty detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3379–3388
Eskin E, Arnold A, Prerau M, Portnoy L, Stolfo S (2002) A geometric framework for unsupervised anomaly detection. In: Applications of data mining in computer security, Springer, pp 77–101
Hartigan JA, Wong MA (1979) Algorithm as 136: A k-means clustering algorithm. J R Stat Soc Ser C (Appl Stat) 28(1):100–108
Teh HY, Kevin I, Wang K, Kempa-Liehr AW (2021) Expect the unexpected: Unsupervised feature selection for automated sensor anomaly detection. IEEE Sensors J
Yi J, Yoon S (2020) Patch svdd: Patch-level svdd for anomaly detection and segmentation. In: Proceedings of the asian conference on computer vision
Li C-L, Sohn K, Yoon J, Pfister T (2021) Cutpaste: Self-supervised learning for anomaly detection and localization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9664–9674
Defard T, Setkov A, Loesch A, Audigier R (2021) Padim: A patch distribution modeling framework for anomaly detection and localization. In: International conference on pattern recognition, Springer, pp 475–489
Wang G, Han S, Ding E, Huang D (2021) Student-teacher feature pyramid matching for unsupervised anomaly detection. arXiv:2103.04257
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv:2010.11929
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: European conference on computer vision, Springer, pp 213–229
Zhu X, Su W, Lu L, Li B, Wang X, Dai J (2020) Deformable detr: Deformable transformers for end-to-end object detection. arXiv:2010.04159
Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou Y (2021) Transunet: Transformers make strong encoders for medical image segmentation. arXiv:2102.04306
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 234–241
Pirnay J, Chai K (2021) Inpainting transformer for anomaly detection. arXiv:2104.13897
Zhang S, He X, Yan S (2019) Latentgnn: Learning efficient non-local relations for visual recognition. In: International conference on machine learning, PMLR, pp 7374–7383
Liang X, Hu Z, Zhang H, Lin L, Xing EP (2018) Symbolic graph reasoning meets convolutions. Adv Neural Inf Process Syst 31:1853–1863
Chen Y, Rohrbach M, Yan Z, Shuicheng Y, Feng J, Kalantidis Y (2019) Graph-based global reasoning networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 433–442
Islam MA, Jia S, Bruce ND (2020) How much position information do convolutional neural networks encode? arXiv:2001.08248
Yim J, Joo D, Bae J, Kim J (2017) A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4133–4141
Kornblith S, Shlens J, Le QV (2019) Do better imagenet models transfer better? In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2661–2671
Wu B, Xu C, Dai X, Wan A, Zhang P, Yan Z, Tomizuka M, Gonzalez J, Keutzer K, Vajda P (2020) Visual transformers: Token-based image representation and processing for computer vision. arXiv:2006.03677
Acknowledgements
This work was supported by the NSF of China under Grant 62171135, the Fujian distinguished talent project under Grant 2022J06010, the Fujian Key research Project under Grant 2023XQ004, the Fujian Industry Software Project of Industry Department 2023, and the Education key project under Grant 500190.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, S., Chen, P., Chen, H. et al. Multiresolution feature guidance based transformer for anomaly detection. Appl Intell 54, 1831–1846 (2024). https://doi.org/10.1007/s10489-024-05283-7
DOI: https://doi.org/10.1007/s10489-024-05283-7