
Riemannian data preprocessing in machine learning to focus on QCD color structure

  • Original Paper - Particles and Nuclei
Journal of the Korean Physical Society

Abstract

Identifying the quantum chromodynamics (QCD) color structure of processes provides additional information that extends the reach of new physics searches at the Large Hadron Collider (LHC). Analyses of the QCD color structure in the decay of a boosted particle have drawn attention because the information becomes well localized in a limited phase space. While such boosted-jet analyses provide an efficient way to identify the color structure, the constrained phase space reduces the amount of available data, resulting in low significance. In this letter, we provide a simple but novel data preprocessing method that uses a Riemann sphere to exploit the full phase space by decorrelating the QCD structure from the kinematics. We achieve statistical stability by enlarging the testable data set while focusing effectively on the QCD structure. We demonstrate the power of our method with the finite statistics of LHC Run 2. Our method is complementary to conventional boosted-jet analyses in utilizing QCD information over a wide range of phase space.




Notes

  1. We refer the reader to [10] for a review from the early stage of the LHC, and to [11] for a good summary of ML applications.

  2. Statistical analyses using ML are studied in [45,46,47].

References

  1. T. Brown, B. Mann, N. Ryder, M. Subbiah, J.D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33, 1877 (2020)


  2. M. Feickert, B. Nachman, A living review of machine learning for particle physics, (2021), arXiv:2102.02770 [hep-ph]

  3. A. Radovic, M. Williams, D. Rousseau, M. Kagan, D. Bonacorsi, A. Himmel, A. Aurisano, K. Terao, T. Wongjirad, Machine learning at the energy and intensity frontiers of particle physics. Nature 560, 41 (2018)


  4. G. Karagiorgi, G. Kasieczka, S. Kravitz, B. Nachman, D. Shih, Machine learning in the search for new fundamental physics. Nat. Rev. Phys. 4, 399 (2022)


  5. P.T. Komiske, E.M. Metodiev, J. Thaler, Energy flow polynomials: a complete linear basis for jet substructure. JHEP 04, 013 (2018). arXiv:1712.07124 [hep-ph]


  6. L. Bradshaw, S. Chang, B. Ostdiek, Creating simple, interpretable anomaly detectors for new physics in jet substructure. Phys. Rev. D 106, 035014 (2022). arXiv:2203.01343 [hep-ph]


  7. S. Chang, T. Cohen, B. Ostdiek, What is the machine learning? Phys. Rev. D 97, 056009 (2018). arXiv:1709.10106 [hep-ph]


  8. S. Jung, D. Lee, K.-P. Xie, Beyond \(M_{t{\bar{t}}}\): learning to search for a broad \(t{{\bar{t}}}\) resonance at the LHC. Eur. Phys. J. C 80, 105 (2020). arXiv:1906.02810 [hep-ph]


  9. T. Faucett, J. Thaler, D. Whiteson, Mapping machine-learned physics into a human-readable space. Phys. Rev. D 103, 036020 (2021). arXiv:2010.11998 [hep-ph]


  10. T. Plehn, M. Spannowsky, Top tagging. J. Phys. G 39, 083001 (2012). arXiv:1112.4441 [hep-ph]


  11. A. Butter et al., The machine learning landscape of top taggers. SciPost Phys. 7, 014 (2019). arXiv:1902.09914 [hep-ph]


  12. L.M. Jones, Tests for determining the parton ancestor of a hadron jet. Phys. Rev. D 39, 2550 (1989)

  13. L. Lonnblad, C. Peterson, T. Rognvaldsson, Finding gluon jets with a neural trigger. Phys. Rev. Lett. 65, 1321 (1990)


  14. S.D. Ellis, Z. Kunszt, D.E. Soper, Jets at hadron colliders at order \(\alpha _s^{3}\): a look inside. Phys. Rev. Lett. 69, 3615 (1992). arXiv:hep-ph/9208249

  15. J. Gallicchio, M.D. Schwartz, Pure samples of quark and gluon jets at the LHC. JHEP 10, 103 (2011). arXiv:1104.1175 [hep-ph]


  16. J. Gallicchio, M.D. Schwartz, Quark and gluon tagging at the LHC. Phys. Rev. Lett. 107, 172001 (2011). arXiv:1106.3076 [hep-ph]


  17. J. Gallicchio, M.D. Schwartz, Quark and gluon jet substructure. JHEP 04, 090 (2013). arXiv:1211.7038 [hep-ph]


  18. A.J. Larkoski, J. Thaler, W.J. Waalewijn, Gaining (Mutual) information about quark/gluon discrimination. JHEP 11, 129 (2014). arXiv:1408.3122 [hep-ph]


  19. D. Ferreira de Lima, P. Petrov, D. Soper, M. Spannowsky, Quark-gluon tagging with shower deconstruction: unearthing dark matter and Higgs couplings. Phys. Rev. D 95, 034001 (2017). arXiv:1607.06031 [hep-ph]


  20. C. Frye, A.J. Larkoski, J. Thaler, K. Zhou, Casimir meets Poisson: improved quark/gluon discrimination with counting observables. JHEP 09, 083 (2017). arXiv:1704.06266 [hep-ph]


  21. J. Davighi, P. Harris, Fractal based observables to probe jet substructure of quarks and gluons. Eur. Phys. J. C 78, 334 (2018). arXiv:1703.00914 [hep-ph]


  22. E.M. Metodiev, J. Thaler, Jet topics: disentangling quarks and gluons at colliders. Phys. Rev. Lett. 120, 241602 (2018). arXiv:1802.00008 [hep-ph]


  23. P.T. Komiske, E.M. Metodiev, J. Thaler, An operational definition of quark and gluon jets. JHEP 11, 059 (2018). arXiv:1809.01140 [hep-ph]


  24. A.J. Larkoski, E.M. Metodiev, A theory of quark vs. gluon discrimination. JHEP 10, 014 (2019). arXiv:1906.01639 [hep-ph]


  25. F.A. Dreyer, G. Soyez, A. Takacs, Quarks and gluons in the Lund plane. JHEP 08, 177 (2022). arXiv:2112.09140 [hep-ph]


  26. J. Gallicchio, M.D. Schwartz, Seeing in color: jet superstructure. Phys. Rev. Lett. 105, 022001 (2010). arXiv:1001.5027 [hep-ph]


  27. A. Hook, M. Jankowiak, J.G. Wacker, Jet dipolarity: top tagging with color flow. JHEP 04, 007 (2012). arXiv:1102.1012 [hep-ph]


  28. D.E. Soper, M. Spannowsky, Finding physics signals with shower deconstruction. Phys. Rev. D 84, 074002 (2011). arXiv:1102.3480 [hep-ph]


  29. D. Curtin, R. Essig, B. Shuve, Boosted multijet resonances and new color-flow variables. Phys. Rev. D 88, 034019 (2013). arXiv:1210.5523 [hep-ph]


  30. S.H. Lim, M.M. Nojiri, Spectral analysis of jet substructure with neural networks: boosted Higgs case. JHEP 10, 181 (2018). arXiv:1807.03312 [hep-ph]


  31. J. Lin, M. Freytsis, I. Moult, B. Nachman, Boosting \(H\rightarrow b{{\bar{b}}}\) with machine learning. JHEP 10, 101 (2018). arXiv:1807.10768 [hep-ph]


  32. A. Chakraborty, S.H. Lim, M.M. Nojiri, Interpretable deep learning for two-prong jet classification with jet spectra. JHEP 07, 135 (2019). arXiv:1904.02092 [hep-ph]


  33. J.H. Kim, M. Kim, K. Kong, K.T. Matchev, M. Park, Portraying double Higgs at the Large Hadron Collider. JHEP 09, 047 (2019). arXiv:1904.08549 [hep-ph]

  34. A. Buckley, G. Callea, A.J. Larkoski, S. Marzani, An optimal observable for color singlet identification. SciPost Phys. 9, 026 (2020). arXiv:2006.10480 [hep-ph]


  35. J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H.S. Shao, T. Stelzer, P. Torrielli, M. Zaro, The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. JHEP 07, 079 (2014). arXiv:1405.0301 [hep-ph]

  36. T. Sjöstrand, S. Ask, J.R. Christiansen, R. Corke, N. Desai, P. Ilten, S. Mrenna, S. Prestel, C.O. Rasmussen, P.Z. Skands, An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191, 159 (2015). arXiv:1410.3012 [hep-ph]


  37. J. de Favereau, C. Delaere, P. Demin, A. Giammanco, V. Lemaître, A. Mertens, M. Selvaggi (DELPHES 3 Collaboration), DELPHES 3: a modular framework for fast simulation of a generic collider experiment. JHEP 02, 057 (2014). arXiv:1307.6346 [hep-ex]

  38. M. Cacciari, G.P. Salam, G. Soyez, FastJet user manual. Eur. Phys. J. C 72, 1896 (2012). arXiv:1111.6097 [hep-ph]


  39. M.L. Mangano, M. Moretti, F. Piccinini, M. Treccani, Matching matrix elements and shower evolution for top-quark production in hadronic collisions. JHEP 01, 013 (2007). arXiv:hep-ph/0611129


  40. G. Aad et al. (ATLAS Collaboration), Measurements of \(WH\) and \(ZH\) production in the \(H \rightarrow b{\bar{b}}\) decay channel in \(pp\) collisions at 13 TeV with the ATLAS detector. Eur. Phys. J. C 81, 178 (2021). https://doi.org/10.1140/epjc/s10052-020-08677-2. arXiv:2007.02873 [hep-ex]

  41. A.M. Sirunyan et al. (CMS Collaboration), Particle-flow reconstruction and global event description with the CMS detector. JINST 12, P10003 (2017). arXiv:1706.04965 [physics.ins-det]

  42. C.K. Khosa, S. Marzani, Higgs boson tagging with the Lund jet plane. Phys. Rev. D 104, 055043 (2021). arXiv:2105.03989 [hep-ph]


  43. F.A. Dreyer, G.P. Salam, G. Soyez, The Lund jet plane. JHEP 12, 064 (2018). arXiv:1807.04758 [hep-ph]


  44. A. De Rujula, J. Lykken, M. Pierini, C. Rogan, M. Spiropulu, Higgs look-alikes at the LHC. Phys. Rev. D 82, 013003 (2010). arXiv:1001.5300 [hep-ph]

  45. A. Coccaro, M. Pierini, L. Silvestrini, R. Torre, The DNNLikelihood: enhancing likelihood distribution with deep learning. Eur. Phys. J. C 80, 664 (2020). arXiv:1911.03305 [hep-ph]


  46. C. K. Khosa, V. Sanz, M. Soughton, A simple guide from machine learning outputs to statistical criteria, (2022), arXiv:2203.03669 [hep-ph]

  47. E. Arganda, X. Marcano, V. M. Lozano, A. D. Medina, A. D. Perez, M. Szewc, A. Szynkman, A method for approximating optimal statistical significances with machine-learned likelihoods, (2022), arXiv:2205.05952 [hep-ph]

  48. M. Chen, T. Cheng, J.S. Gainer, A. Korytov, K.T. Matchev, P. Milenovic, G. Mitselmakher, M. Park, A. Rinkevicius, M. Snowball, The role of interference in unraveling the ZZ-couplings of the newly discovered boson at the LHC. Phys. Rev. D 89, 034002 (2014). arXiv:1310.1397 [hep-ph]


  49. C. Shimmin, P. Sadowski, P. Baldi, E. Weik, D. Whiteson, E. Goul, A. Søgaard, Decorrelated jet substructure tagging using adversarial neural networks. Phys. Rev. D 96, 074034 (2017). arXiv:1703.03507 [hep-ex]


  50. L. Bradshaw, R.K. Mishra, A. Mitridate, B. Ostdiek, Mass agnostic jet taggers. SciPost Phys. 8, 011 (2020). arXiv:1908.08959 [hep-ph]


  51. G. Kasieczka, D. Shih, Robust jet classifiers through distance correlation. Phys. Rev. Lett. 125, 122001 (2020). arXiv:2001.05310 [hep-ph]


  52. J. Shlomi, P. Battaglia, J.-R. Vlimant, Graph neural networks in particle physics (2020). https://doi.org/10.1088/2632-2153/abbf9a. arXiv:2007.13681 [hep-ex]

  53. T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks (2017), arXiv:1609.02907 [cs.LG]

  54. N. Perraudin, M. Defferrard, T. Kacprzak, R. Sgier, DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications. Astron. Comput. 27, 130 (2019). https://doi.org/10.1016/j.ascom.2019.03.004. arXiv:1810.12186 [astro-ph.CO]


  55. T. Buss, B.M. Dillon, T. Finke, M. Krämer, A. Morandini, A. Mück, I. Oleksiyuk, T. Plehn, What’s anomalous in LHC jets? (2022), arXiv:2202.00686 [hep-ph]


Acknowledgements

MP thanks Chul Kim for introducing the Lund analysis. This study was supported by the Research Program funded by the Seoul National University of Science and Technology.

Author information

Corresponding author

Correspondence to Myeonghun Park.

Appendices

Appendix 1. The structure of CNN

For each event, we construct an image as a square array in the (\(\eta \text {-}\phi\)) plane, with each pixel intensity given by the total hadronic \(p_T\) deposited in the associated calorimeter region. The rectangular region \(-2.5\le \eta \le 2.5\), \(-\pi \le \phi \le \pi\) is discretized into a \(50\times 50\) pixel grid. As discussed earlier, to optimize the CNN performance and reduce the error, we apply the following preprocessing steps:

  1. Image cleansing: remove all leptons and photons from the image.

  2. Centering: shift the center of the image from (0, 0) to \((\frac{\eta _{b^-}+\eta _{b^+}}{2},\frac{\phi _{b^-}+\phi _{b^+}}{2})\).

  3. Normalization: normalize the pixel intensities by dividing each pixel in the image by the maximum pixel intensity.

  4. Inverse stereographic projection: project the image pixels in the \((\eta \text {-}\phi )\) plane onto a Riemann sphere by applying the inverse stereographic transformation. We fix the positions of the hot cores in the \(\phi _R\) dimension at (\(-\frac{\pi }{2},\frac{\pi }{2}\)). The projected images lie on a sphere in three dimensions, so an ordinary CNN cannot classify them directly. One option is a spherical CNN, as presented in [54]. Another is the Mollweide projection, as shown in Fig. 7 (left). In this work, we use Mollweide-projected images as input to the CNN.

  5. Momentum smearing: smear the momentum according to a Gaussian distribution, which correlates neighboring pixels and decreases the sparsity of the images [55]. Figure 7 illustrates the effect of a Gaussian filter with standard deviation \(\sigma =3\) on the reconstructed images for singlet and octet scalars and for backgrounds.
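Steps 2 to 4 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the shifted \((\eta ,\phi )\) values are used directly as plane coordinates for the standard inverse stereographic map (projection from the north pole of a unit sphere), and the rotation that places the hot cores at \(\phi _R=\pm \frac{\pi }{2}\) is not implemented. The scaling and pole convention used in the paper are not specified in this appendix.

```python
import numpy as np

def build_image(eta, phi, pt, eta_b, phi_b, n_pix=50):
    """Steps 2-3: centre on the b-pair midpoint, pixelate pT, normalise."""
    eta_c = eta - 0.5 * (eta_b[0] + eta_b[1])
    # The plain average of phi follows the text; the shifted angle is wrapped
    # back into [-pi, pi) so that no hadron leaves the image in phi.
    dphi = phi - 0.5 * (phi_b[0] + phi_b[1])
    phi_c = (dphi + np.pi) % (2.0 * np.pi) - np.pi
    image, _, _ = np.histogram2d(
        eta_c, phi_c, bins=n_pix,
        range=[[-2.5, 2.5], [-np.pi, np.pi]], weights=pt,
    )
    peak = image.max()
    return image / peak if peak > 0 else image  # guard against empty images

def inverse_stereographic(x, y):
    """Step 4 (assumed convention): plane point -> unit sphere, north-pole projection."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r2 = x**2 + y**2
    return np.stack([2 * x / (1 + r2), 2 * y / (1 + r2), (r2 - 1) / (1 + r2)], axis=-1)

def sphere_to_lonlat(points):
    """Longitude and latitude (radians) of points on the unit sphere."""
    lon = np.arctan2(points[..., 1], points[..., 0])
    lat = np.arcsin(np.clip(points[..., 2], -1.0, 1.0))
    return lon, lat

def mollweide(lon, lat, n_iter=50):
    """Mollweide map coordinates; the auxiliary angle is found by Newton iteration."""
    lon = np.atleast_1d(np.asarray(lon, dtype=float))
    lat = np.atleast_1d(np.asarray(lat, dtype=float))
    theta = lat.copy()
    for _ in range(n_iter):
        # Solve 2*theta + sin(2*theta) = pi*sin(lat) for theta.
        f = 2.0 * theta + np.sin(2.0 * theta) - np.pi * np.sin(lat)
        df = 2.0 + 2.0 * np.cos(2.0 * theta)
        # At the poles df vanishes and theta is already exact, so skip the step.
        step = np.divide(f, df, out=np.zeros_like(f), where=np.abs(df) > 1e-12)
        theta = theta - step
    x = 2.0 * np.sqrt(2.0) / np.pi * lon * np.cos(theta)
    y = np.sqrt(2.0) * np.sin(theta)
    return x, y

# Example with three toy hadrons and two b-jets at (eta, phi) = (+-1.0, +-0.5).
eta = np.array([1.0, -1.0, 0.2])
phi = np.array([0.5, -0.5, 3.0])
pt = np.array([40.0, 30.0, 5.0])
image = build_image(eta, phi, pt, eta_b=(1.0, -1.0), phi_b=(0.5, -0.5))
lon, lat = sphere_to_lonlat(inverse_stereographic(eta, phi))
map_x, map_y = mollweide(lon, lat)
```

With this convention the origin of the plane maps to the south pole and the unit circle maps to the equator, so activity far from the image center is compressed toward the north pole rather than discarded.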

Fig. 7 Mollweide projection for a single event before (left) and after (right) applying a Gaussian kernel with standard deviation \(\sigma =3\)
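The Gaussian smearing of step 5 can be illustrated with a separable filter written in plain NumPy. This is a sketch under assumptions: the helper names are hypothetical, the kernel is truncated at four standard deviations, and the image is zero-padded at its edges. The appendix specifies only \(\sigma =3\), not the truncation or the boundary treatment.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Normalised 1D Gaussian kernel, cut off at truncate * sigma pixels."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_smear(image, sigma=3.0):
    """Smooth a 2D image with a separable Gaussian filter (zero padding at edges)."""
    kernel = gaussian_kernel(sigma)
    # A 2D Gaussian factorises, so filter every row and then every column.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)

# Example: a single hot pixel spreads over its neighbours.
sparse = np.zeros((50, 50))
sparse[25, 25] = 1.0
smeared = gaussian_smear(sparse, sigma=3.0)
```

Because the kernel is normalised, the total intensity is preserved for deposits far enough from the image boundary, while the number of non-zero pixels grows, which is the reduction in sparsity described above.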

Comparing the CNN classification performance on the two data sets, with Riemannian preprocessing and with no preprocessing, requires optimizing the CNN structure for each data set individually. We found that a CNN with six convolution layers with kernel size 3, two dense layers, and one output layer with two neurons is the best choice in both cases. Each pair of convolution layers is followed by a max-pooling layer of size 2 and a dropout layer with a dropout rate of \(20\%\). The dense layers are followed by a dropout layer with a dropout rate of \(30\%\). The number of kernels in the first convolution layer is fixed to 16, and the activation function is ReLU everywhere except in the last dense layer (the output layer), where we use the SoftMax function. To maintain network stability and to avoid the covariate shift problem, we add batch normalization layers. In the Riemannian-preprocessed images the edges are the most important, so we apply padding in the first convolution layer to keep the image dimensions intact. The loss function, categorical cross entropy, is minimized using the Adam optimizer with a learning rate of 0.001. The number of kernels in the convolution layers and the number of neurons in the dense layers have been optimized using the RandomizedSearchCV function in Scikit-learn. We use a balanced data set of 100K events for both processes; each data set is divided into a \(60\%\) training set, a \(20\%\) validation set, and a \(20\%\) test set.
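The layer arithmetic implied by this architecture can be checked with a short sketch. It assumes a \(50\times 50\) input (the size of the Mollweide-projected image is not stated here), size-preserving padding in the first convolution only, no padding in the remaining five, and pooling that rounds down. The function names are hypothetical.

```python
def conv_out(size, kernel=3, padding="valid"):
    """Output width of one convolution with stride 1."""
    return size if padding == "same" else size - kernel + 1

def trace_shapes(size=50, n_conv=6, kernel=3, pool=2):
    """Follow the image width through conv-conv-pool blocks."""
    shapes = []
    for i in range(n_conv):
        # Only the first convolution is padded, to preserve the image edges.
        padding = "same" if i == 0 else "valid"
        size = conv_out(size, kernel, padding)
        shapes.append((f"conv{i + 1}", size))
        if i % 2 == 1:
            # Max pooling of size 2 follows each pair of convolutions.
            size //= pool
            shapes.append((f"pool{i // 2 + 1}", size))
    return shapes

for name, width in trace_shapes():
    print(f"{name}: {width} x {width}")
```

Under these assumptions the width evolves as 50, 48, 24, 22, 20, 10, 8, 6, 3, so six \(3\times 3\) convolutions with three poolings still leave a \(3\times 3\) feature map to flatten into the dense layers.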

Appendix 2. Background rejection

Fig. 8 ROC curve in separating backgrounds from signal events by utilizing color-flow information in a resolved phase space

The background in the \(HZ\rightarrow b \bar{b} \ell \bar{\ell }\) channel comes from vector boson production with three or more jets, di-top production, single-top production, and di-boson production [40]; we drop the \(W+\text{jets}\) and \(Wt\) processes because their contribution is negligible after all selection cuts. The ATLAS analysis in [40] does not take advantage of the difference in color-flow information between signal and backgrounds. Here, we add a CNN study as in the main text, with and without Riemannian preprocessing. Figure 8 presents the ROC performance of the CNN analyses. Taking a false-positive rate of 0.2 as an example, we achieve an extra factor of 2 enhancement in the discovery significance with Riemannian preprocessing, whereas a normal CNN analysis gives only a \(25\%\) enhancement.
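One common way to turn a ROC working point into a significance gain is sketched below. It assumes the simple approximation \(Z\approx S/\sqrt{B}\), under which a classifier cut with signal efficiency \(\epsilon _S\) (true-positive rate) and background efficiency \(\epsilon _B\) (false-positive rate) rescales the significance by \(\epsilon _S/\sqrt{\epsilon _B}\). Whether the paper uses this estimator is not stated in this appendix, and the efficiencies in the example are illustrative values, not numbers read from Fig. 8.

```python
import math

def significance_gain(tpr, fpr):
    """Factor by which S/sqrt(B) changes after a cut with the given efficiencies."""
    if not 0.0 < fpr <= 1.0:
        raise ValueError("fpr must lie in (0, 1]")
    if not 0.0 <= tpr <= 1.0:
        raise ValueError("tpr must lie in [0, 1]")
    return tpr / math.sqrt(fpr)

# Illustrative working points at a false-positive rate of 0.2.
for tpr in (0.56, 0.90):
    print(f"tpr = {tpr:.2f}: gain = {significance_gain(tpr, 0.2):.2f}")
```

In this approximation no cut at all corresponds to a gain of 1, so a classifier improves the significance only when its true-positive rate exceeds \(\sqrt{\epsilon _B}\) at the chosen working point.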

About this article


Cite this article

Hammad, A., Park, M. Riemannian data preprocessing in machine learning to focus on QCD color structure. J. Korean Phys. Soc. 83, 235–242 (2023). https://doi.org/10.1007/s40042-023-00877-9

  • DOI: https://doi.org/10.1007/s40042-023-00877-9
