Weakly supervised classification in high energy physics
  • Regular Article - Theoretical Physics
  • Open Access
  • Published: 29 May 2017

  • Lucio Mwinmaarong Dery (1),
  • Benjamin Nachman (2),
  • Francesco Rubbo (3), ORCID: orcid.org/0000-0001-5170-3652 &
  • Ariel Schwartzman (3)

Journal of High Energy Physics volume 2017, Article number: 145 (2017) Cite this article

A preprint version of the article is available at arXiv.

Abstract

As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics — quark versus gluon tagging — we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.
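To make the idea of training on class proportions concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a single one-dimensional feature drawn from two Gaussians, a logistic model, and a loss that compares each batch's average output with that batch's known class fraction. The function names (make_batch, train_on_proportions), the Gaussian parameters, the batch fractions, and the learning rate are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_batch(n, frac_class1):
    """Toy batch: a fraction frac_class1 of instances is drawn from class 1."""
    labels = (rng.random(n) < frac_class1).astype(float)
    # Assumed toy feature: class 0 ~ N(-1.5, 1), class 1 ~ N(+1.5, 1).
    x = rng.normal(loc=3.0 * labels - 1.5, scale=1.0)
    return x, labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_on_proportions(batches, fractions, steps=2000, lr=0.5):
    """Fit sigmoid(w*x + b) so each batch's mean output matches its known fraction."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, f in zip(batches, fractions):
            p = sigmoid(w * x + b)
            residual = p.mean() - f        # only the batch proportion enters the loss
            slope = p * (1.0 - p)          # derivative of the sigmoid
            grad_w += 2.0 * residual * np.mean(slope * x)
            grad_b += 2.0 * residual * np.mean(slope)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

fractions = [0.2, 0.8]
batches = [make_batch(5000, f)[0] for f in fractions]  # per-instance labels are discarded
w, b = train_on_proportions(batches, fractions)

# Per-instance labels are used only here, to evaluate the trained classifier.
x_test, y_test = make_batch(5000, 0.5)
accuracy = np.mean((sigmoid(w * x_test + b) > 0.5) == (y_test == 1.0))
print(f"w={w:.2f}, b={b:.2f}, accuracy={accuracy:.3f}")
```

In this sketch the two batches have different class fractions, which is what lets the batch-level loss pin down the direction of the decision boundary. The model output is not expected to be a calibrated probability; only its ordering of instances is evaluated.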

Open Access

This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.

Author information

Authors and Affiliations

  1. Physics Department, Stanford University, Stanford, CA, 94305, U.S.A.

    Lucio Mwinmaarong Dery

  2. Physics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA, 94720, U.S.A.

    Benjamin Nachman

  3. SLAC National Accelerator Laboratory, Stanford University, 2575 Sand Hill Rd, Menlo Park, CA, 94025, U.S.A.

    Francesco Rubbo & Ariel Schwartzman

Corresponding author

Correspondence to Francesco Rubbo.

Additional information

ArXiv ePrint: 1702.00414

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Dery, L.M., Nachman, B., Rubbo, F. et al. Weakly supervised classification in high energy physics. J. High Energ. Phys. 2017, 145 (2017). https://doi.org/10.1007/JHEP05(2017)145

  • Received: 17 February 2017

  • Revised: 08 April 2017

  • Accepted: 22 May 2017

  • Published: 29 May 2017

  • DOI: https://doi.org/10.1007/JHEP05(2017)145

Keywords

  • Jets