Abstract
Transformers have become the primary architecture for natural language processing. In this study, we explore their use for autoregressive density estimation in high-energy jet physics, a task that requires modelling a high-dimensional probability density. We draw an analogy between sentences and words in natural language and jets and their constituents in high-energy physics. Specifically, we investigate density estimation for light QCD jets and hadronically decaying boosted top jets. Since transformers allow easy sampling from learned densities, we exploit their generative capability to assess the quality of the density estimate. Our results indicate that the generated data samples closely resemble the original data, as evidenced by the excellent agreement of distributions such as particle multiplicity or jet mass. Furthermore, the generated samples are difficult to distinguish from the original data, even by a powerful supervised classifier. Given their exceptional data processing capabilities, transformers could potentially be trained directly on the massive LHC data sets to learn the probability densities in high-energy jet physics.
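The autoregressive setup described in the abstract factorizes the jet density via the chain rule, p(jet) = ∏ᵢ p(tᵢ | t₁…tᵢ₋₁), with a transformer supplying each conditional distribution over discretized constituent tokens. The following is a minimal sketch of that factorization and of ancestral sampling from it; the tiny vocabulary and the hand-written `conditional_probs` rule are purely illustrative stand-ins for the trained transformer and the paper's actual constituent binning.

```python
import math
import random

# Hypothetical toy vocabulary: in the paper's setup each constituent's
# (pT, eta, phi) is binned into discrete tokens; "END" terminates a jet.
VOCAB = ["soft", "hard", "wide", "END"]

def conditional_probs(prefix):
    """Stand-in for the transformer's next-token distribution.

    A real model would attend over the whole prefix; this toy rule just
    makes longer jets increasingly likely to end, which suffices to
    illustrate the autoregressive factorization.
    """
    p_end = min(0.9, 0.1 * (len(prefix) + 1))
    rest = (1.0 - p_end) / 3.0
    return {"soft": rest, "hard": rest, "wide": rest, "END": p_end}

def log_likelihood(jet):
    """Chain-rule density estimate: sum of conditional log-probabilities."""
    total = 0.0
    for i, tok in enumerate(jet):
        total += math.log(conditional_probs(jet[:i])[tok])
    return total

def sample_jet(rng, max_len=30):
    """Ancestral sampling: draw tokens one at a time until END."""
    jet = []
    while len(jet) < max_len:
        probs = conditional_probs(jet)
        tok = rng.choices(list(probs), weights=list(probs.values()))[0]
        if tok == "END":
            break
        jet.append(tok)
    return jet

if __name__ == "__main__":
    rng = random.Random(0)
    jet = sample_jet(rng)
    print(jet, log_likelihood(jet))
```

Because sampling and likelihood evaluation use the same conditionals, the generated jets can be compared against data (e.g. via multiplicity or jet-mass distributions) to assess the density estimate, which is the strategy the paper pursues.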
Acknowledgments
We would like to thank Martin Grohe for discussions and Erik Buhmann, Gregor Kasieczka and David Shih for valuable comments and suggestions on the draft paper. TF is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grant 400140256 - GRK 2497: the physics of the heaviest particles at the Large Hadron Collider. The research of MK and AM is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grant 396021762 - TRR 257 “Particle Physics Phenomenology after the Higgs Discovery”. JT is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grants GR 1492/16-1 and KI 2348/1-1 “Quantitative Reasoning About Database Queries”. The authors gratefully acknowledge the computing time granted by the NHR4CES Resource Allocation Board and provided on the supercomputer CLAIX at RWTH Aachen University as part of the NHR4CES infrastructure. The calculations for this research were conducted with computing resources under the project rwth0934.
ArXiv ePrint: 2303.07364
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Finke, T., Krämer, M., Mück, A. et al. Learning the language of QCD jets with transformers. J. High Energ. Phys. 2023, 184 (2023). https://doi.org/10.1007/JHEP06(2023)184