Abstract
Experiments at particle colliders are the primary source of insight into physics at microscopic scales. Searches at these facilities often rely on optimizing analyses to target specific models of new physics. Increasingly, however, data-driven, model-agnostic approaches based on machine learning are also being explored. A major challenge is that such methods can be highly sensitive to the presence of many irrelevant features in the data. This paper presents Boosted Decision Tree (BDT)-based techniques that improve anomaly detection in the presence of many irrelevant features. First, a BDT classifier is shown to be more robust than neural networks in the Classification Without Labels approach to finding resonant excesses, under the assumption that resonant and non-resonant observables are independent. Next, a tree-based probability density estimator using copula transformations demonstrates significant stability, and improved performance over normalizing flows, as irrelevant features are added. These results make a compelling case for further development of tree-based algorithms for more robust resonant anomaly detection in high energy physics.
Acknowledgments
We would like to thank Ben Nachman and David Shih for useful discussions. This research is supported by NSF grant PHY-2014071. YCS is partially supported by the Boochever Fellowship at Cornell University.
ArXiv ePrint: 2310.13057
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
Cite this article
Freytsis, M., Perelstein, M. & San, Y.C. Anomaly detection in the presence of irrelevant features. J. High Energ. Phys. 2024, 220 (2024). https://doi.org/10.1007/JHEP02(2024)220