Abstract
Autoencoders are widely used in machine learning applications, in particular for anomaly detection. Hence, they have been introduced in high energy physics as a promising tool for model-independent new physics searches. We scrutinize the usage of autoencoders for unsupervised anomaly detection based on reconstruction loss to show their capabilities, but also their limitations. As a particle physics benchmark scenario, we study the tagging of top jet images in a background of QCD jet images. Although we reproduce the positive results from the literature, we show by inverting the task that the standard autoencoder setup cannot be considered a model-independent anomaly tagger: due to the sparsity and the specific structure of the jet images, the autoencoder fails to tag QCD jets if it is trained on top jets, even in a semi-supervised setup. Since the same autoencoder architecture can be a good tagger for one example of an anomaly and a bad tagger for another, we suggest improved performance measures for the task of model-independent anomaly detection. We also improve the capability of the autoencoder to learn non-trivial features of the jet images, such that it achieves both top jet tagging and the inverse task of QCD jet tagging with the same setup. However, we want to stress that a truly model-independent and powerful autoencoder-based unsupervised jet tagger still needs to be developed.
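The logic of reconstruction-loss tagging and of the inverted task can be illustrated with a minimal toy sketch. It is not the setup studied in the paper: the jet images are replaced by synthetic Gaussian vectors, and the autoencoder is a linear one obtained from PCA (the optimal linear autoencoder under a mean squared error loss spans the leading principal subspace). The "simple" and "complex" classes, the function names, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, n_active, dim=20, noise=0.1):
    """Hypothetical toy data: unit variance in the first n_active features, small noise elsewhere."""
    x = noise * rng.standard_normal((n, dim))
    x[:, :n_active] = rng.standard_normal((n, n_active))
    return x

def fit_linear_autoencoder(x_train, latent_dim=2):
    """Fit a linear autoencoder via PCA; returns the training mean and the top principal directions."""
    mean = x_train.mean(axis=0)
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_loss(x, model):
    """Per-sample mean squared error between input and its encode/decode reconstruction."""
    mean, components = model
    centered = x - mean
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)

def auc(score_background, score_anomaly):
    """Rank-based (Mann-Whitney) AUC: probability that an anomaly scores above a background sample."""
    scores = np.concatenate([score_background, score_anomaly])
    ranks = scores.argsort().argsort() + 1
    n_bkg, n_sig = len(score_background), len(score_anomaly)
    rank_sum = ranks[n_bkg:].sum()
    return (rank_sum - n_sig * (n_sig + 1) / 2) / (n_bkg * n_sig)

# "Simple" class: 2 active features; "complex" class: 6 active features.
simple_train, simple_test = sample(5000, 2), sample(2000, 2)
complex_train, complex_test = sample(5000, 6), sample(2000, 6)

# Direct task: train on the simple class, tag the complex class as anomalous.
ae_simple = fit_linear_autoencoder(simple_train)
auc_direct = auc(reconstruction_loss(simple_test, ae_simple),
                 reconstruction_loss(complex_test, ae_simple))

# Inverse task: train on the complex class, try to tag the simple class.
ae_complex = fit_linear_autoencoder(complex_train)
auc_inverse = auc(reconstruction_loss(complex_test, ae_complex),
                  reconstruction_loss(simple_test, ae_complex))

print(f"AUC direct task:  {auc_direct:.3f}")
print(f"AUC inverse task: {auc_inverse:.3f}")
```

In this toy, the direct task should give an AUC close to 1, while the inverse task should give an AUC below 0.5: the low-complexity samples are reconstructed well even by a model that never saw them, so the reconstruction loss does not flag them. This mirrors, in a heavily simplified way, why a single AUC on one benchmark anomaly says little about model independence.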
ArXiv ePrint: 2104.09051
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Finke, T., Krämer, M., Morandini, A. et al. Autoencoders for unsupervised anomaly detection in high energy physics. J. High Energ. Phys. 2021, 161 (2021). https://doi.org/10.1007/JHEP06(2021)161
DOI: https://doi.org/10.1007/JHEP06(2021)161