Riemannian data preprocessing in machine learning to focus on QCD color structure

Hammad, Ahmed; Park, Myeonghun

doi:10.1007/s40042-023-00877-9

Original Paper - Particles and Nuclei
Published: 18 July 2023

Volume 83, pages 235–242 (2023)
Journal of the Korean Physical Society

Abstract

Identifying the quantum chromodynamics (QCD) color structure of processes provides additional information that extends the reach of new physics searches at the Large Hadron Collider (LHC). Analyses of the QCD color structure in the decay of a boosted particle have attracted attention because the information becomes well localized in a limited phase space. While such boosted-jet analyses provide an efficient way to identify the color structure, the constrained phase space reduces the amount of available data, resulting in low significance. In this letter, we present a simple but novel data preprocessing method that uses a Riemann sphere to exploit the full phase space by decorrelating the QCD structure from the kinematics. By focusing on the QCD structure, the method enlarges the testable data set and thereby achieves statistical stability. We demonstrate the power of our method with the finite statistics of LHC Run 2. Our method is complementary to conventional boosted-jet analyses in that it uses QCD information over a wide range of phase space.

Notes

We refer the reader to [10] for a review from the early stage of the LHC, and to [11] for a good summary of ML applications.
There are studies on statistical analysis using ML [45,46,47].

References

T. Brown, B. Mann, N. Ryder, M. Subbiah, J.D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33, 1877 (2020)
M. Feickert, B. Nachman, A living review of machine learning for particle physics (2021). arXiv:2102.02770 [hep-ph]
A. Radovic, M. Williams, D. Rousseau, M. Kagan, D. Bonacorsi, A. Himmel, A. Aurisano, K. Terao, T. Wongjirad, Machine learning at the energy and intensity frontiers of particle physics. Nature 560, 41 (2018)
G. Karagiorgi, G. Kasieczka, S. Kravitz, B. Nachman, D. Shih, Machine learning in the search for new fundamental physics. Nat. Rev. Phys. 4, 399 (2022)
P.T. Komiske, E.M. Metodiev, J. Thaler, Energy flow polynomials: a complete linear basis for jet substructure. JHEP 04, 013 (2018). arXiv:1712.07124 [hep-ph]
L. Bradshaw, S. Chang, B. Ostdiek, Creating simple, interpretable anomaly detectors for new physics in jet substructure. Phys. Rev. D 106, 035014 (2022). arXiv:2203.01343 [hep-ph]
S. Chang, T. Cohen, B. Ostdiek, What is the machine learning? Phys. Rev. D 97, 056009 (2018). arXiv:1709.10106 [hep-ph]
S. Jung, D. Lee, K.-P. Xie, Beyond $M_{t{\bar{t}}}$: learning to search for a broad $t{{\bar{t}}}$ resonance at the LHC. Eur. Phys. J. C 80, 105 (2020). arXiv:1906.02810 [hep-ph]
T. Faucett, J. Thaler, D. Whiteson, Mapping machine-learned physics into a human-readable space. Phys. Rev. D 103, 036020 (2021). arXiv:2010.11998 [hep-ph]
T. Plehn, M. Spannowsky, Top tagging. J. Phys. G 39, 083001 (2012). arXiv:1112.4441 [hep-ph]
A. Butter et al., The machine learning landscape of top taggers. SciPost Phys. 7, 014 (2019). arXiv:1902.09914 [hep-ph]
L.M. Jones, Tests for determining the parton ancestor of a hadron jet. Phys. Rev. D 39, 2550 (1989)
L. Lonnblad, C. Peterson, T. Rognvaldsson, Finding gluon jets with a neural trigger. Phys. Rev. Lett. 65, 1321 (1990)
S.D. Ellis, Z. Kunszt, D.E. Soper, Jets at hadron colliders at order $\alpha_s^3$: a look inside. Phys. Rev. Lett. 69, 3615 (1992). arXiv:hep-ph/9208249
J. Gallicchio, M.D. Schwartz, Pure samples of quark and gluon jets at the LHC. JHEP 10, 103 (2011). arXiv:1104.1175 [hep-ph]
J. Gallicchio, M.D. Schwartz, Quark and gluon tagging at the LHC. Phys. Rev. Lett. 107, 172001 (2011). arXiv:1106.3076 [hep-ph]
J. Gallicchio, M.D. Schwartz, Quark and gluon jet substructure. JHEP 04, 090 (2013). arXiv:1211.7038 [hep-ph]
A.J. Larkoski, J. Thaler, W.J. Waalewijn, Gaining (Mutual) information about quark/gluon discrimination. JHEP 11, 129 (2014). arXiv:1408.3122 [hep-ph]
D. Ferreira de Lima, P. Petrov, D. Soper, M. Spannowsky, Quark-gluon tagging with shower deconstruction: unearthing dark matter and Higgs couplings. Phys. Rev. D 95, 034001 (2017). arXiv:1607.06031 [hep-ph]
C. Frye, A.J. Larkoski, J. Thaler, K. Zhou, Casimir meets Poisson: improved quark/gluon discrimination with counting observables. JHEP 09, 083 (2017). arXiv:1704.06266 [hep-ph]
J. Davighi, P. Harris, Fractal based observables to probe jet substructure of quarks and gluons. Eur. Phys. J. C 78, 334 (2018). arXiv:1703.00914 [hep-ph]
E.M. Metodiev, J. Thaler, Jet topics: disentangling quarks and gluons at colliders. Phys. Rev. Lett. 120, 241602 (2018). arXiv:1802.00008 [hep-ph]
P.T. Komiske, E.M. Metodiev, J. Thaler, An operational definition of quark and gluon jets. JHEP 11, 059 (2018). arXiv:1809.01140 [hep-ph]
A.J. Larkoski, E.M. Metodiev, A theory of quark vs. gluon discrimination. JHEP 10, 014 (2019). arXiv:1906.01639 [hep-ph]
F.A. Dreyer, G. Soyez, A. Takacs, Quarks and gluons in the Lund plane. JHEP 08, 177 (2022). arXiv:2112.09140 [hep-ph]
J. Gallicchio, M.D. Schwartz, Seeing in color: jet superstructure. Phys. Rev. Lett. 105, 022001 (2010). arXiv:1001.5027 [hep-ph]
A. Hook, M. Jankowiak, J.G. Wacker, Jet dipolarity: top tagging with color flow. JHEP 04, 007 (2012). arXiv:1102.1012 [hep-ph]
D.E. Soper, M. Spannowsky, Finding physics signals with shower deconstruction. Phys. Rev. D 84, 074002 (2011). arXiv:1102.3480 [hep-ph]
D. Curtin, R. Essig, B. Shuve, Boosted multijet resonances and new color-flow variables. Phys. Rev. D 88, 034019 (2013). arXiv:1210.5523 [hep-ph]
S.H. Lim, M.M. Nojiri, Spectral analysis of jet substructure with neural networks: boosted Higgs case. JHEP 10, 181 (2018). arXiv:1807.03312 [hep-ph]
J. Lin, M. Freytsis, I. Moult, B. Nachman, Boosting $H\rightarrow b{{\bar{b}}}$ with machine learning. JHEP 10, 101 (2018). arXiv:1807.10768 [hep-ph]
A. Chakraborty, S.H. Lim, M.M. Nojiri, Interpretable deep learning for two-prong jet classification with jet spectra. JHEP 07, 135 (2019). arXiv:1904.02092 [hep-ph]
J.H. Kim, M. Kim, K. Kong, K.T. Matchev, M. Park, Portraying double Higgs at the large hadron collider. JHEP 09, 047 (2019). arXiv:1904.08549 [hep-ph]
A. Buckley, G. Callea, A.J. Larkoski, S. Marzani, An optimal observable for color singlet identification. SciPost Phys. 9, 026 (2020). arXiv:2006.10480 [hep-ph]
J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H.S. Shao, T. Stelzer, P. Torrielli, M. Zaro, The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. JHEP 07, 079 (2014). arXiv:1405.0301 [hep-ph]
T. Sjöstrand, S. Ask, J.R. Christiansen, R. Corke, N. Desai, P. Ilten, S. Mrenna, S. Prestel, C.O. Rasmussen, P.Z. Skands, An introduction to PYTHIA 8.2. Comput. Phys. Commun. 191, 159 (2015). arXiv:1410.3012 [hep-ph]
J. de Favereau, C. Delaere, P. Demin, A. Giammanco, V. Lemaître, A. Mertens, M. Selvaggi (DELPHES 3 Collaboration), DELPHES 3: a modular framework for fast simulation of a generic collider experiment. JHEP 02, 057 (2014). arXiv:1307.6346 [hep-ex]
M. Cacciari, G.P. Salam, G. Soyez, FastJet user manual. Eur. Phys. J. C 72, 1896 (2012). arXiv:1111.6097 [hep-ph]
M.L. Mangano, M. Moretti, F. Piccinini, M. Treccani, Matching matrix elements and shower evolution for top-quark production in hadronic collisions. JHEP 01, 013 (2007). arXiv:hep-ph/0611129
G. Aad et al. (ATLAS Collaboration), Measurements of $WH$ and $ZH$ production in the $H \rightarrow b{\bar{b}}$ decay channel in $pp$ collisions at 13 TeV with the ATLAS detector. Eur. Phys. J. C 81, 178 (2021). https://doi.org/10.1140/epjc/s10052-020-08677-2. arXiv:2007.02873 [hep-ex]
A.M. Sirunyan et al. (CMS Collaboration), Particle-flow reconstruction and global event description with the CMS detector. JINST 12, P10003 (2017). arXiv:1706.04965 [physics.ins-det]
C.K. Khosa, S. Marzani, Higgs boson tagging with the Lund jet plane. Phys. Rev. D 104, 055043 (2021). arXiv:2105.03989 [hep-ph]
F.A. Dreyer, G.P. Salam, G. Soyez, The Lund jet plane. JHEP 12, 064 (2018). arXiv:1807.04758 [hep-ph]
A. De Rujula, J. Lykken, M. Pierini, C. Rogan, M. Spiropulu, Higgs look-alikes at the LHC. Phys. Rev. D 82, 013003 (2010). arXiv:1001.5300 [hep-ph]
A. Coccaro, M. Pierini, L. Silvestrini, R. Torre, The DNNLikelihood: enhancing likelihood distribution with deep learning. Eur. Phys. J. C 80, 664 (2020). arXiv:1911.03305 [hep-ph]
C.K. Khosa, V. Sanz, M. Soughton, A simple guide from machine learning outputs to statistical criteria (2022). arXiv:2203.03669 [hep-ph]
E. Arganda, X. Marcano, V.M. Lozano, A.D. Medina, A.D. Perez, M. Szewc, A. Szynkman, A method for approximating optimal statistical significances with machine-learned likelihoods (2022). arXiv:2205.05952 [hep-ph]
M. Chen, T. Cheng, J.S. Gainer, A. Korytov, K.T. Matchev, P. Milenovic, G. Mitselmakher, M. Park, A. Rinkevicius, M. Snowball, The role of interference in unraveling the ZZ-couplings of the newly discovered boson at the LHC. Phys. Rev. D 89, 034002 (2014). arXiv:1310.1397 [hep-ph]
C. Shimmin, P. Sadowski, P. Baldi, E. Weik, D. Whiteson, E. Goul, A. Søgaard, Decorrelated jet substructure tagging using adversarial neural networks. Phys. Rev. D 96, 074034 (2017). arXiv:1703.03507 [hep-ex]
L. Bradshaw, R.K. Mishra, A. Mitridate, B. Ostdiek, Mass agnostic jet taggers. SciPost Phys. 8, 011 (2020). arXiv:1908.08959 [hep-ph]
G. Kasieczka, D. Shih, Robust jet classifiers through distance correlation. Phys. Rev. Lett. 125, 122001 (2020). arXiv:2001.05310 [hep-ph]
J. Shlomi, P. Battaglia, J.-R. Vlimant, Graph neural networks in particle physics (2020). https://doi.org/10.1088/2632-2153/abbf9a. arXiv:2007.13681 [hep-ex]
T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks (2017). arXiv:1609.02907 [cs.LG]
N. Perraudin, M. Defferrard, T. Kacprzak, R. Sgier, DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications. Astron. Comput. 27, 130 (2019). https://doi.org/10.1016/j.ascom.2019.03.004. arXiv:1810.12186 [astro-ph.CO]
T. Buss, B.M. Dillon, T. Finke, M. Krämer, A. Morandini, A. Mück, I. Oleksiyuk, T. Plehn, What's anomalous in LHC jets? (2022). arXiv:2202.00686 [hep-ph]

Acknowledgements

MP thanks Chul Kim for introducing the Lund analysis. This study was supported by the Research Program funded by the Seoul National University of Science and Technology.

Author information

Authors and Affiliations

Institute of Convergence Fundamental Studies, Seoul National University of Science and Technology, 232 Gongneung-ro, Nowon-gu, Seoul, 01811, Republic of Korea
Ahmed Hammad & Myeonghun Park
School of Natural Sciences, Seoul National University of Science and Technology, 232 Gongneung-ro, Nowon-gu, Seoul, 01811, Republic of Korea
Myeonghun Park

Corresponding author

Correspondence to Myeonghun Park.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. The structure of CNN

For each event, we construct an image as a square array in the ($\eta \text {-}\phi$) plane, with each pixel intensity given by the total hadronic $p_T$ deposited in the associated region of the calorimeter. The rectangular region $-2.5\le \eta \le 2.5$, $-\pi \le \phi \le \pi$ is discretized into a $50\times 50$ pixel grid. As discussed earlier, to optimize the CNN performance and reduce the error, we apply the following preprocessing steps:
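The image construction described above can be sketched with a weighted two-dimensional histogram. This is a minimal illustration, not the authors' code: the function name `event_image` and the toy hadron values are hypothetical, and hadrons outside the stated $\eta$ range are simply dropped.

```python
import numpy as np

def event_image(eta, phi, pt, n_pixels=50, eta_max=2.5):
    """Bin hadron pT into an (eta, phi) grid; each pixel holds the summed pT."""
    image, _, _ = np.histogram2d(
        eta, phi,
        bins=n_pixels,
        range=[[-eta_max, eta_max], [-np.pi, np.pi]],
        weights=pt,
    )
    return image

# Toy hadrons (hypothetical values): the first two fall in the same pixel,
# the last one lies outside |eta| <= 2.5 and is ignored by the histogram.
eta = np.array([0.02, 0.03, -1.25, 3.00])
phi = np.array([0.10, 0.11, 2.00, 0.00])
pt = np.array([10.0, 5.0, 7.0, 99.0])
image = event_image(eta, phi, pt)
```

The first array axis indexes $\eta$ and the second indexes $\phi$; transpose the array if an image library expects the opposite orientation.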

1. Image cleansing: remove all leptons and photons from the image.
2. Center: shift the center of the image from (0, 0) to $(\frac{\eta _{b^-}+\eta _{b^+}}{2},\frac{\phi _{b^-}+\phi _{b^+}}{2})$.
3. Normalization: normalize the pixel intensities by dividing each pixel in the image by the maximum pixel intensity value.
4. Inverse stereographic projection: project the image pixels in the $(\eta \text {-}\phi )$ plane onto a Riemann sphere by applying the inverse stereographic transformation. We fix the positions of the hot cores in the $\phi _R$ dimension at ($-\frac{\pi }{2},\frac{\pi }{2}$). The projected images live on a sphere in three dimensions, so a standard CNN cannot classify them directly. One option is a spherical CNN, as presented in [54]. Another is the Mollweide projection, as shown in Fig. 7 (left). In this work, we use Mollweide-projected images as input to the CNN.
5. Momentum smearing: smear the momentum according to a Gaussian distribution, which correlates neighboring pixels and decreases the sparsity of the images [55]. Figure 7 illustrates the effect of a Gaussian filter with standard deviation $\sigma =3$ on the reconstructed images for singlet and octet scalars and for backgrounds.
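The centering, normalization, projection, and smearing steps above can be sketched as follows. This is a hedged illustration rather than the authors' implementation. All function names are hypothetical. The projection is the textbook inverse stereographic map onto the unit sphere (projecting from the north pole); any rescaling or rotation the paper applies to place the hot cores at $\phi_R=\pm\pi/2$ is not reproduced. The $\phi$ wrap-around after centering and the zero-padded separable filter are assumptions of this sketch, and the Mollweide step is omitted.

```python
import numpy as np

def center_on_b_pair(eta, phi, b1, b2):
    """Shift coordinates so the midpoint of the two b-jets sits at the origin."""
    eta_c = 0.5 * (b1[0] + b2[0])
    phi_c = 0.5 * (b1[1] + b2[1])
    # Assumption of this sketch: wrap phi back into [-pi, pi) after the shift.
    dphi = (phi - phi_c + np.pi) % (2 * np.pi) - np.pi
    return eta - eta_c, dphi

def normalize(image):
    """Divide every pixel by the maximum pixel intensity."""
    peak = image.max()
    return image / peak if peak > 0 else image

def inverse_stereographic(x, y):
    """Map plane points (x, y) to the unit sphere (textbook convention)."""
    r2 = x**2 + y**2
    return 2 * x / (1 + r2), 2 * y / (1 + r2), (r2 - 1) / (1 + r2)

def sphere_angles(X, Y, Z):
    """Latitude and longitude of points on the unit sphere."""
    return np.arcsin(np.clip(Z, -1.0, 1.0)), np.arctan2(Y, X)

def gaussian_smooth(image, sigma=3.0):
    """Separable Gaussian filter with zero padding at the image edges."""
    radius = int(4 * sigma)  # kernel must stay shorter than the image side
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    out = np.apply_along_axis(np.convolve, 0, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, kernel, mode="same")

# Toy checks with hypothetical inputs.
d_eta, d_phi = center_on_b_pair(np.array([1.0, 2.0]), np.array([0.2, 0.4]),
                                b1=(1.0, 0.2), b2=(2.0, 0.4))
image = np.zeros((50, 50))
image[25, 25] = 4.0
smooth = gaussian_smooth(normalize(image), sigma=3.0)
X, Y, Z = inverse_stereographic(np.array([0.0, 1.0]), np.array([0.0, 0.0]))
lat, lon = sphere_angles(X, Y, Z)
```

In this convention the plane origin maps to the south pole and the unit circle maps to the equator, so the scale chosen for the plane coordinates controls how much of the sphere the image covers.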

Testing the CNN classification performance on two different data sets, with Riemannian preprocessing and with no preprocessing, requires optimizing the CNN structure for each data set individually. We found that a CNN with six convolution layers of kernel size 3, two dense layers, and one output layer with two neurons is the best choice in both cases. Each pair of convolution layers is followed by a max-pooling layer of size 2 and a dropout layer with a dropout rate of $20\%$. The dense layers are followed by a dropout layer with a dropout rate of $30\%$. The number of kernels in the first convolution layer is fixed to 16, and the activation function is ReLU everywhere except in the last dense layer (the output layer), where we use the SoftMax function. To maintain network stability and avoid the covariate shift problem, we add batch normalization layers. In Riemannian-preprocessed images the edges are the most important regions, so we apply padding in the first convolution layer to keep the image dimensions intact. The loss function, categorical cross-entropy, is minimized using the Adam optimizer with a learning rate of 0.001. The number of kernels in the convolution layers and the number of neurons in the dense layers have been optimized using the RandomizedSearchCV function in Scikit-learn. We use a balanced data set of 100K events for both processes; each data set is divided into a $60\%$ training set, a $20\%$ validation set, and a $20\%$ test set.
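The 60/20/20 division described above can be sketched as a shuffled index split. This is an illustration under assumptions: the helper name `split_indices` and the fixed seed are hypothetical, and the sketch reads "100K events" as the size of each process sample, which the text does not state unambiguously.

```python
import numpy as np

def split_indices(n_events, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return shuffled, disjoint train/validation/test index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_events)
    n_train = int(round(fractions[0] * n_events))
    n_val = int(round(fractions[1] * n_events))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Assumed sample size: 100K events for one process.
train_idx, val_idx, test_idx = split_indices(100_000)
```

Applying the same split separately to the signal and background samples keeps the two classes balanced within each of the three subsets.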

Appendix 2. Background rejection

The background contribution in the $HZ\rightarrow b \bar{b} \ell \bar{\ell }$ channel comes from vector boson plus three or more jets, di-top production, single-top production, and di-boson production [40]; we dropped the $W+\text{jets}$ and $Wt$ processes, as their contribution is negligible after all selection cuts. The ATLAS analysis in [40] does not take advantage of the different color-flow information in the signal and the backgrounds. Here, we add a CNN study as in the main text, with and without Riemannian preprocessing. In Fig. 8, we present the ROC performance of the CNN analyses. Taking a false-positive rate of 0.2 as an example, we achieve an extra factor of 2 enhancement in discovery significance using Riemannian preprocessing, while a normal CNN analysis yields only a $25\%$ enhancement.
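To see how a ROC working point relates to a significance enhancement, the arithmetic can be sketched under a simple assumption: the significance is approximated as $S/\sqrt{B}$, so a classifier cut rescales it by $\epsilon_S/\sqrt{\epsilon_B}$ (true-positive rate over the square root of the false-positive rate). The text does not confirm that this is the definition used for Fig. 8, so the true-positive rates below are illustrative implications of that assumption, not values read from the paper. The function names are hypothetical.

```python
import math

def significance_gain(tpr, fpr):
    """Factor by which S/sqrt(B) changes after a classifier cut."""
    return tpr / math.sqrt(fpr)

def tpr_for_gain(gain, fpr):
    """True-positive rate needed to reach a given gain at a fixed FPR."""
    return gain * math.sqrt(fpr)

fpr = 0.2
tpr_riemann = tpr_for_gain(2.0, fpr)   # factor-2 gain needs TPR of about 0.89
tpr_plain = tpr_for_gain(1.25, fpr)    # 25% gain needs TPR of about 0.56
```

Under this approximation a random classifier (TPR equal to FPR) at the same working point would reduce the significance by a factor of about 0.45, which is why both quoted enhancements require a true-positive rate well above 0.2.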

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hammad, A., Park, M. Riemannian data preprocessing in machine learning to focus on QCD color structure. J. Korean Phys. Soc. 83, 235–242 (2023). https://doi.org/10.1007/s40042-023-00877-9

Received: 14 June 2023
Accepted: 26 June 2023
Published: 18 July 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s40042-023-00877-9
