QCD-aware recursive neural networks for jet physics

  • Gilles Louppe
  • Kyunghyun Cho
  • Cyril Becot
  • Kyle Cranmer (email author)
Open Access
Regular Article - Theoretical Physics


Recent progress in applying machine learning to jet physics has been built upon an analogy between calorimeters and images. In this work, we present a novel class of recursive neural networks built instead upon an analogy between QCD and natural languages. In the analogy, four-momenta are like words and the clustering history of sequential recombination jet algorithms is like the parsing of a sentence. Our approach works directly with the four-momenta of a variable-length set of particles, and the jet-based tree structure varies on an event-by-event basis. Our experiments highlight the flexibility of our method for building task-specific jet embeddings and show that recursive architectures are significantly more accurate and data-efficient than previous image-based networks. We extend the analogy from individual jets (sentences) to full events (paragraphs), and show for the first time an event-level classifier operating on all the stable particles produced in an LHC event.
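To make the analogy concrete, the following is a minimal sketch of a recursive embedding computed over a binary clustering tree whose leaves are four-momenta. It is an illustration under stated assumptions, not the implementation used in this work: the names `Node`, `embed`, `W_u` and `W_h` are hypothetical, the weights are random rather than trained, the raw components (E, px, py, pz) stand in for whatever input features one would actually use, a plain ReLU combination stands in for any gating, and the tree is written by hand instead of being produced by a sequential recombination algorithm.

```python
import math
import random

DIM = 4  # embedding size (illustrative choice)


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def random_matrix(rows, cols, rng):
    # Untrained placeholder weights; a real model would learn these.
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class Node:
    """Binary tree node: a leaf holds a four-momentum (E, px, py, pz);
    an inner node holds the sum of its children's four-momenta."""

    def __init__(self, p=None, left=None, right=None):
        self.left, self.right = left, right
        if left is not None and right is not None:
            self.p = tuple(a + b for a, b in zip(left.p, right.p))
        else:
            self.p = tuple(p)

    def is_leaf(self):
        return self.left is None


def embed(node, W_u, W_h):
    # u: embedding of the node's own four-momentum.
    u = relu(matvec(W_u, list(node.p)))
    if node.is_leaf():
        return u
    # Inner node: combine the two child embeddings with u.
    h_left = embed(node.left, W_u, W_h)
    h_right = embed(node.right, W_u, W_h)
    return relu(matvec(W_h, h_left + h_right + u))


rng = random.Random(0)
W_u = random_matrix(DIM, 4, rng)        # four-momentum -> embedding
W_h = random_matrix(DIM, 3 * DIM, rng)  # (left, right, own) -> embedding

# Hand-built three-particle "jet": (a, b) are merged first, then c.
a = Node(p=(50.0, 30.0, 20.0, 30.0))
b = Node(p=(30.0, 10.0, 20.0, 15.0))
c = Node(p=(20.0, 5.0, 5.0, 18.0))
jet = Node(left=Node(left=a, right=b), right=c)

embedding = embed(jet, W_u, W_h)
print(len(embedding), jet.p)
```

The same `embed` function handles any number of constituents and any tree shape, which is the sense in which the structure can vary from one jet to the next; a classifier would then take the fixed-size root embedding as input.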


Keywords: Jets, QCD Phenomenology


Open Access

This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.



Copyright information

© The Author(s) 2019

Authors and Affiliations

  • Gilles Louppe (1, 2, 3)
  • Kyunghyun Cho (2)
  • Cyril Becot (1, 4)
  • Kyle Cranmer (1, 2), email author

  1. Center for Cosmology & Particle Physics, New York University, New York, U.S.A.
  2. Center for Data Science, New York University, New York, U.S.A.
  3. University of Liège, Liège, Belgium
  4. DESY, Hamburg, Germany
