Topological obstructions to autoencoding

A preprint version of the article is available at arXiv.

Abstract

Autoencoders have been proposed as a powerful tool for model-independent anomaly detection in high-energy physics. The operating principle is that events which do not belong to the space of training data will be reconstructed poorly, thus flagging them as anomalies. We point out that in a variety of examples of interest, the connection between large reconstruction error and anomalies is not so clear. In particular, for data sets with nontrivial topology, there will always be points that erroneously seem anomalous due to global issues. Conversely, neural networks typically have an inductive bias or prior to locally interpolate such that undersampled or rare events may be reconstructed with small error, despite actually being the desired anomalies. Taken together, these facts are in tension with the simple picture of the autoencoder as an anomaly detector. Using a series of illustrative low-dimensional examples, we show explicitly how the intrinsic and extrinsic topology of the dataset affects the behavior of an autoencoder and how this topology is manifested in the latent space representation during training. We ground this analysis in the discussion of a mock “bump hunt” in which the autoencoder fails to identify an anomalous “signal” for reasons tied to the intrinsic topology of n-particle phase space.
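
The topological claim can be made concrete in the simplest case. The sketch below is an illustration under stated assumptions, not the authors' construction: the data lie on the unit circle, the latent space is one-dimensional, and the encoder is continuous. The function `encoder` is a hypothetical stand-in for a trained network. For any such encoder f, the difference g(θ) = f(θ) - f(θ + π) satisfies g(θ + π) = -g(θ), so it vanishes somewhere on [0, π] and two antipodal points share a latent code. Whatever single point the decoder returns for that code, the triangle inequality forces a reconstruction error of at least 1 for one of the two, even though both are ordinary training points.

```python
import math


def encoder(theta):
    # Hypothetical continuous encoder from the circle to a 1D latent.
    # Any continuous, 2*pi-periodic function could be substituted here.
    return math.sin(theta) + 0.3 * math.cos(2.0 * theta) + 0.2 * math.cos(theta)


def antipodal_gap(theta):
    # g(theta) = f(theta) - f(theta + pi); by periodicity g(theta + pi) = -g(theta).
    return encoder(theta) - encoder(theta + math.pi)


def find_antipodal_collision(tol=1e-12):
    # Bisection on [0, pi], where g takes opposite signs at the endpoints.
    lo, hi = 0.0, math.pi
    g_lo = antipodal_gap(lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = antipodal_gap(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_star = find_antipodal_collision()
p = (math.cos(theta_star), math.sin(theta_star))  # point on the unit circle
q = (-p[0], -p[1])                                # its antipode, same latent code


def worst_error(r):
    # Larger of the two reconstruction errors if the decoder outputs r
    # for the latent code shared by p and q.
    return max(math.dist(p, r), math.dist(q, r))


# A few arbitrary candidate decoder outputs; none can push the worst error below 1.
candidates = [(0.0, 0.0), p, q, (0.5, -0.5), (2.0, 1.0)]
errors = [worst_error(r) for r in candidates]
print("colliding angle:", theta_star)
print("worst-case errors:", errors)
```

Substituting a different continuous periodic `encoder` moves the colliding pair but cannot remove it, which is the sense in which the obstruction is global rather than a matter of training or capacity.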

Author information

Corresponding author

Correspondence to Yonatan Kahn.

Additional information

ArXiv ePrint: 2102.08380

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Batson, J., Haaf, C.G., Kahn, Y. et al. Topological obstructions to autoencoding. J. High Energ. Phys. 2021, 280 (2021). https://doi.org/10.1007/JHEP04(2021)280

  • DOI: https://doi.org/10.1007/JHEP04(2021)280

Keywords

  • Phenomenological Models