Deep learning assisted jet tomography for the study of Mach cones in QGP

Mach cones are expected to form in the expanding quark-gluon plasma (QGP) when energetic quarks and gluons traverse the hot medium at a velocity faster than the speed of sound in high-energy heavy-ion collisions. The shape of the Mach cone and the associated diffusion wake are sensitive to the initial jet production location and the propagation direction of the parton shower relative to the radial flow, because both are distorted by the collective expansion of the QGP and the large density gradient. The shape of jet-induced Mach cones and their distortions in heavy-ion collisions provide a unique and direct probe of the dynamical evolution and the equation of state of the QGP. However, it is difficult to identify the Mach cone and the diffusion wake in current experimental measurements of final hadron distributions because they are averaged over all possible initial jet production locations and parton-shower propagation directions. To overcome this difficulty, we develop a deep learning assisted jet tomography that uses the full information of the final hadrons from jets to localize the initial jet production positions. This method can help to constrain the initial regions of jet production in heavy-ion collisions and enable a differential study of Mach cones with different path lengths and orientations relative to the radial flow of the QGP.


I. INTRODUCTION
All projectiles moving through a medium feel resistance and eventually slow down, at a rate that depends on the density of the medium and the frictional force. A typical bullet, for example, loses all its energy to friction and comes to a stop within only a few feet in water, which is about 800 times denser than air. The lost energy generates sound waves in the medium that travel with the projectile and leaves behind a diffusion wake. When the speed of a fast-moving projectile exceeds the velocity of sound in the medium, the fronts of the sound waves are squeezed to form a Mach cone, or sonic boom. The same phenomenon is also expected in a small droplet of subatomic liquid [1][2][3][4][5][6][7][8][9][10][11][12][13], called quark-gluon plasma (QGP), which is the size of an atomic nucleus and is created in high-energy nuclear collisions. In contrast to normal nuclear matter, where quarks and gluons are confined to the domain of each nucleon (proton or neutron), in the QGP they can roam freely over the whole volume. In a process called jet quenching [14][15][16][17], energetic quarks and gluons (also called partons), or jets, are produced through hard scattering of beam partons from the colliding nuclei before the formation of the QGP. These fast jets have to traverse the QGP at the speed of light ($c$), which is faster than the upper bound of the velocity of sound ($c_s < c/\sqrt{3}$) in the QGP [18]. As the fast jets lose energy-momentum along their trajectories through jet-medium interaction, the lost energy will induce supersonic sound waves in the form of Mach cones on a femtometer scale inside the QGP.
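As a rough worked example of the supersonic condition above, the textbook relation for a uniform, static medium, sin θ = c_s / v, gives the half-opening angle of the cone measured from the direction of motion. The function name below is illustrative, and the static-medium assumption is exactly what the expanding QGP violates, so this is only an upper-bound estimate and not the distorted cone studied in this paper.

```python
import math

def mach_half_angle(v_over_c: float, cs_over_c: float) -> float:
    """Half-opening angle (radians) of a Mach cone in a static uniform medium.

    Uses sin(theta) = c_s / v, with theta measured from the direction of motion.
    """
    if v_over_c <= cs_over_c:
        raise ValueError("no Mach cone: the projectile is not supersonic")
    return math.asin(cs_over_c / v_over_c)

# Upper bound on the speed of sound quoted in the text: c_s < c / sqrt(3),
# for a projectile moving at the speed of light (v = c).
theta = mach_half_angle(1.0, 1.0 / math.sqrt(3.0))
print(f"{math.degrees(theta):.1f} degrees")  # prints "35.3 degrees"
```

Since the physical speed of sound lies below this bound, the angle in a static medium would be smaller than this value.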
The jet-induced subatomic Mach cones can be studied theoretically using transport models [19][20][21][22][23], linearized hydrodynamics [4, 6-8, 11, 12, 23-26], as well as full hydrodynamics [13, 23, 27-34], as illustrated in Fig. 1. These Mach cones differ from those in a classical, macroscopic medium. First of all, partonic jets are quantum projectiles. They lose energy-momentum not only through elastic scattering with the medium but also through self- and medium-induced gluon radiation, with intricate quantum interference due to multiple scattering in the medium [35][36][37]. This unique energy loss mechanism will lead to a deformed Mach cone [8,9] if all or part of the radiative energy loss is converted into medium excitation. Secondly, the extremely hot and dense QGP droplet expands very rapidly during the jet propagation because of its large pressure and density gradients. The induced Mach cone excitation will also be deformed by the large density gradient and the radial flow of the medium [20,21,28,33]. Experimental observation and measurement of the jet-induced Mach cone excitation can therefore help to study the dynamical evolution and intrinsic properties of the QGP medium, such as the temperature and flow profiles, the opacity to jet partons, the equation of state and the transport coefficients of the QGP [3,10,38]. Such background can be eliminated in neutral boson ($Z^0$) or photon (γ)-jet events, since the $\gamma/Z^0$ trigger does not interact with the QGP medium and is therefore not correlated with the underlying soft hadrons. However, the most recent experimental data on $\gamma/Z^0$-hadron correlations [43] have not shown any visible and unambiguous signal of jet-induced Mach cones without additional event selections.
The absence of clear experimental evidence for jet-induced Mach cones in $\gamma/Z^0$-jet events could be caused by averaging events over the initial jet production positions and propagation directions.
For a given initial jet production position and propagation direction, simulations show that jet-induced Mach cones lead to unique angular distributions of soft hadrons, owing to distortion by the density gradient and radial flow in an expanding QGP medium [29,44,45]. However, for a fixed transverse momentum of the trigger $\gamma/Z^0$ in experiments, the initial $\gamma/Z^0$-jet can be produced at any transverse position $(x, y)$ according to the distribution of the number of binary collisions in the overlap region of the two colliding nuclei. The final $\gamma/Z^0$-hadron correlations measured in experiments are averaged over events with different $\gamma/Z^0$-jet production positions and jet propagation directions, or effectively over different jet propagation path lengths (different amounts of jet energy loss), temperature gradients, and intersections between the jet propagation and the collective flow of the medium. The final averaged $\gamma/Z^0$-hadron correlation unfortunately loses many of the distinctive features of a typical Mach cone.
Attempts have been made in the past to locate the initial jet production position using jet tomography. Since $\gamma/Z^0$ does not lose energy in the QGP medium, the change of the $\gamma/Z^0$-jet energy asymmetry $\chi = E_{\gamma/Z^0} - E_{\rm jet}$ is related to the jet energy loss, which depends monotonically on the jet propagation length. It was therefore proposed in a longitudinal jet tomography [46] that the $\gamma/Z^0$-jet energy asymmetry can be used to locate the initial longitudinal position (along the jet propagation direction), or equivalently the jet path length. Similarly, the transverse asymmetry $A_T$, which is defined to characterize the jet transverse diffusion due to the transverse density gradient of the medium, has been proposed in a transverse or gradient tomography [47,48] to approximately locate the initial jet position transverse to the jet propagation direction. The 2D jet tomography, which combines the longitudinal and transverse jet tomography, has been shown to be able to locate the initial jet production positions, including both the path length and the perpendicular coordinate, and to enhance detailed structures of the diffusion wake of the Mach cones in γ-hadron correlations [45,49]. Recently, a deep neural network was also employed to estimate the energy loss of hadron-triggered jets, which can also indicate roughly the initial longitudinal position of jet production along the jet direction [50,51].
The methods used in previous studies can provide only rough estimates of the averaged initial jet production positions. To fully understand the jet-induced medium response and look for signals of Mach cones and the diffusion wake, more accurate, event-by-event jet production positions are desired, taking into account more realistic jet energy loss and the fluctuating nature of jet shower development in the QGP medium. We aim to develop a deep learning assisted jet tomography in this study that will address the following questions:
1. Is it possible to determine the initial jet production positions using the momenta of final state particles?
2. Will the signal of Mach cones and the diffusion wake be amplified if initial jet production positions are constrained to specific regions with long jet propagation lengths or fixed propagation direction relative to the radial flow?
3. How reliable is the new deep learning assisted method?
To answer the first question, we employ a point-cloud deep neural network (DNN) with multiple hidden layers, which is powerful in pattern recognition and has been widely used in high energy nuclear physics [52][53][54][55] and in jet-medium interaction studies [50, 51, 56-63]. We will train and validate the DNN using data from simulations with the coupled Linear Boltzmann Transport and hydro (CoLBT-hydro) model [29], which combines the Linear Boltzmann Transport (LBT) model for jet parton propagation in the medium with the concurrent QGP evolution given by the 3+1D CCNU-LBNL viscous (CLVisc) hydrodynamics [64][65][66]. The CoLBT-hydro model also simulates the stochastic energy-momentum deposition to the medium by jet-shower partons together with the relativistic hydrodynamic medium response. For the second question, we average events with the same jet production regions as predicted by the deep neural network. This method, dubbed "jet-position engineering", shows that the signal of Mach cones and the diffusion wake depends strongly on the initial jet production region. The third question concerns the dependence of the neural network on the model used for training. We will use pseudo-data generated from another model, the LIDO partonic transport model [67,68], to cross-check the performance of the trained network, i.e., use the CoLBT-trained network to infer the jet production locations from the pseudo-data generated by the LIDO model and compare them to the ground truth. Although we do not expect either model to reproduce the exact jet dynamics in the QGP, this practice provides an estimate of the model uncertainty of the method.
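To make the term "point-cloud network" concrete, the following minimal sketch shows the defining property of such a network: the same small layers are applied to every hadron, and a symmetric pooling step makes the output independent of the order in which hadrons are listed. The layer sizes, the four input features per hadron, the max pooling and all names are assumptions for illustration; they are not the architecture used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_features=4, hidden=32, latent=64):
    """Random initial weights (all layer sizes are illustrative assumptions)."""
    return {
        "W1": rng.normal(0, 0.1, (n_features, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.1, (hidden, latent)), "b2": np.zeros(latent),
        "W3": rng.normal(0, 0.1, (latent, 2)), "b3": np.zeros(2),
    }

def predict_position(hadrons, params):
    """Map an (n_hadrons, n_features) point cloud to a 2-vector (x, y)."""
    # Shared per-hadron layers: every hadron goes through the same weights.
    h = np.maximum(hadrons @ params["W1"] + params["b1"], 0.0)
    h = np.maximum(h @ params["W2"] + params["b2"], 0.0)
    # Symmetric pooling over hadrons: the ordering of the cloud is irrelevant.
    pooled = h.max(axis=0)
    return pooled @ params["W3"] + params["b3"]

params = init_params()
jet = rng.normal(size=(15, 4))            # toy jet: 15 hadrons, 4 features each
shuffled = jet[rng.permutation(15)]       # same hadrons in a different order
print(np.allclose(predict_position(jet, params),
                  predict_position(shuffled, params)))  # prints True
```

Because pooling collapses the hadron axis, the same parameters can process jets with different numbers of hadrons.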
The remainder of the paper is organized as follows. In Section II, we describe the main physics model, CoLBT-hydro, and the cross-check model, LIDO, for jet evolution in the QGP. We also introduce the point-cloud neural network method for locating the initial jet production positions from the information of final state hadrons. Results are presented in Section III. Section IV summarizes the paper.

II. METHOD
CoLBT-hydro model for jet transport and medium response - The initial momentum configurations of γ-jets are generated by PYTHIA8 [69]. The Trento model is used for both the initial spatial distribution [70] of γ-jet production and the initial energy density distribution for the hydrodynamic evolution of the bulk QGP medium. The jet transport in the QGP is simulated using the LBT model [44, 71-73], in which the evolution of the phase space distribution of jet shower partons in the QGP is described by the linearized Boltzmann equation, Eq. (1), where $p_{a/c}$ and $f_{a/c}$ are the four-momenta and phase space distributions of jet-shower partons before and after each scattering, and $p_{b/d}$ are the four-momenta of medium partons sampled from thermal distributions. Both processes include all possible channels of elastic scattering $a + b \to c + d$, whose amplitudes $|M_{ab\to cd}|$ are computed to leading order in perturbative QCD [74]. The factor $\gamma_b$ is the color and spin degeneracy of parton $b$ in the forward process. The scattering kernel also contains a Lorentz-invariant double step function used to regularize the collinear divergence in the scattering amplitudes, where $\hat{s}$, $\hat{t}$ and $\hat{u}$ are the Mandelstam variables and $\mu_D^2 = \frac{3}{2} g^2 T^2$ is the squared Debye screening mass. The inelastic part in Eq. (1) is the contribution from gluon radiation that is induced by the elastic scattering processes. The induced gluon emission rate $\Gamma^{\rm inel}_a$ is calculated according to the higher-twist formalism [75][76][77]. Please refer to Refs. [44, 71-73] and [29, 78-80] for more details about the LBT and CoLBT-hydro models.
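For orientation, the linearized Boltzmann equation referred to as Eq. (1) is commonly written in the LBT literature in the form below. This is a sketch assembled from the symbols defined in the text ($p_{a/c}$, $f_{a/c}$, $p_{b/d}$, $\gamma_b$, $|M_{ab\to cd}|$, $\mu_D$). The name $S_2$ for the double step function and the overall normalization are assumptions taken from common usage, so Refs. [44, 71-73] should be consulted for the exact expression.

```latex
p_a \cdot \partial f_a = \int \prod_{i=b,c,d} \frac{d^3 p_i}{2E_i (2\pi)^3}\,
  \frac{\gamma_b}{2}\,\left( f_c f_d - f_a f_b \right) \left| M_{ab\to cd} \right|^2
  S_2(\hat{s},\hat{t},\hat{u})\,(2\pi)^4\,\delta^4(p_a + p_b - p_c - p_d)
  + \text{inelastic},
\qquad
S_2(\hat{s},\hat{t},\hat{u}) = \theta\!\left(\hat{s} \ge 2\mu_D^2\right)
  \theta\!\left(-\hat{s} + \mu_D^2 \le \hat{t} \le -\mu_D^2\right).
```

In this form the gain term $f_c f_d$ and the loss term $f_a f_b$ correspond to the two processes mentioned in the text, and the step functions keep $\hat{s}$, $\hat{t}$ and $\hat{u}$ away from the collinear region by an amount set by $\mu_D^2$.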
The four-momenta of thermal partons $b$ and $d$ are sampled from the distribution functions $f(p) = 1/(e^{p\cdot u/T} \pm 1)$, with $+$ for quarks and $-$ for gluons. The local temperature $T$ and fluid velocity $u$ are provided by the CLVisc model, which numerically solves the relativistic hydrodynamic equations $\nabla_\mu T^{\mu\nu} = J^\nu$ for the space-time evolution of the QGP, where $\nabla_\mu$ is the covariant derivative operator and $T^{\mu\nu} = (\varepsilon + P) u^\mu u^\nu - P g^{\mu\nu} + \pi^{\mu\nu}$ is the energy-momentum tensor, in which $\varepsilon$ and $P$ are the energy density and pressure in the co-moving frame of the fluid, $u^\mu$ is the relativistic fluid four-velocity satisfying $u^\mu u_\mu = 1$, $g^{\mu\nu}$ is the metric tensor in the Milne $(\tau, x, y, \eta_s)$ coordinates and $\pi^{\mu\nu}$ is the shear-stress tensor. The source term $J^\nu$ describes the energy and momentum deposition from jet showers to the QGP. It includes the energy and momentum from the recoiled and radiated partons whose energy in the local co-moving frame of the fluid is smaller than a cut-off, which is set to $p_0 = 2$ GeV/c. Partons taken out of the phase space in the medium leave a particle hole behind and deposit negative energy-momentum to the medium through the source term.
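The thermal occupation $f(p) = 1/(e^{p\cdot u/T} \pm 1)$ can be evaluated directly once the invariant $p\cdot u$ is formed. The sketch below does this in Cartesian Minkowski coordinates with signature (+,-,-,-); this choice is an assumption for simplicity, whereas CLVisc itself works in Milne coordinates. The function name and the example numbers are illustrative only.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, signature (+,-,-,-)

def thermal_occupation(p, u, T, species):
    """f(p) = 1 / (exp(p.u / T) +/- 1), with '+' for quarks and '-' for gluons.

    p: parton four-momentum (E, px, py, pz) in GeV
    u: fluid four-velocity (gamma, gamma*vx, gamma*vy, gamma*vz)
    T: local temperature in GeV
    """
    sign = {"quark": +1.0, "gluon": -1.0}[species]
    p_dot_u = p @ METRIC @ u   # parton energy in the local fluid rest frame
    return 1.0 / (np.exp(p_dot_u / T) + sign)

# Massless parton with E = 1 GeV moving along x; fluid flowing along x at v = 0.5c.
v = 0.5
gamma = 1.0 / np.sqrt(1.0 - v**2)
u = np.array([gamma, gamma * v, 0.0, 0.0])
p = np.array([1.0, 1.0, 0.0, 0.0])
print(thermal_occupation(p, u, T=0.3, species="quark"))
print(thermal_occupation(p, u, T=0.3, species="gluon"))
```

A parton moving with the flow has a smaller rest-frame energy $p\cdot u$ than its lab energy, so its occupation is larger than it would be in a static medium at the same temperature.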
This source term $J^\nu$ is responsible for generating the Mach cones and the diffusion wake in the QGP medium. This set of time-dependent partial differential equations is solved numerically on GPUs using OpenCL in the CLVisc model, concurrently with the LBT transport. See Refs. [64-66, 81] for more details. A parton recombination model is used to hadronize both the QGP and the hard partons from jet showers [82][83][84]. The kinematic cuts in this work are $p_T^\gamma \in (200, 250)$ GeV, $|\eta^\gamma| < 0.32$, $p_T^{\rm jet} > 100$ GeV and $|\eta^{\rm jet}| < 0.75$, with jet cone size $R = 0.4$.
LIDO transport model for jet evolution - The LIDO model also contains the essential ingredients of collisional and radiative processes as well as the medium response, but differs from CoLBT-hydro in the details. We will apply the predictor trained with the CoLBT-hydro model to infer the initial jet production locations from a set of pseudo-data generated by the LIDO transport model. This provides an estimate of the model-dependent uncertainty in the predictor. Here, we briefly describe the LIDO model with an emphasis on its differences from the CoLBT-hydro model. Full details of the LIDO model can be found in Refs. [67,68].
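The kinematic selection quoted above for the γ-jet sample can be written as a simple event filter. The sketch below uses the numerical cuts from the text; the `GammaJetEvent` container and its field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass

@dataclass
class GammaJetEvent:
    """Hypothetical container for the quantities entering the cuts."""
    pt_gamma: float   # photon transverse momentum in GeV
    eta_gamma: float  # photon pseudorapidity
    pt_jet: float     # jet transverse momentum in GeV (cone size R = 0.4)
    eta_jet: float    # jet pseudorapidity

def passes_cuts(ev: GammaJetEvent) -> bool:
    """Apply the kinematic cuts quoted in the text."""
    return (200.0 < ev.pt_gamma < 250.0
            and abs(ev.eta_gamma) < 0.32
            and ev.pt_jet > 100.0
            and abs(ev.eta_jet) < 0.75)

print(passes_cuts(GammaJetEvent(220.0, 0.1, 150.0, -0.5)))  # prints True
print(passes_cuts(GammaJetEvent(220.0, 0.1, 150.0, 0.9)))   # prints False
```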
The evolution of the distribution function of hard partons in the LIDO model can be schematically written as The first key difference as compared to the CoLBT-hydro model is that the jet-medium interaction is [75]. Nevertheless, both methods result in a radiative energy loss that scales as $\ln E$ for energetic partons.
The parameters in the CoLBT-hydro and LIDO models have been calibrated to reproduce experimental data on the suppression of single inclusive jet and hadron spectra [68,79,83]. Both models therefore have similar averaged jet/parton energy loss, while the fluctuations and transverse dynamics could be different. This will be elaborated in the next section, where we compare hadron distributions within jets between the two models.
Finally, the LIDO model does not solve the fully coupled hydrodynamic response, but uses a simplified formula to compute the correction to the jet energy from the medium response [68]. Furthermore, LIDO uses only Lund string fragmentation, while the CoLBT-hydro model uses fragmentation plus a parton recombination model for hadronization. Nevertheless, we expect these differences to affect only the particle distribution at low $p_T$. We do not consider them to have a major impact on the following inference of jet locations, where only hadrons with $p_T > 2$ GeV/c are used in the training.
where $f$ is the function represented by the neural network and $\theta$ is the collection of all the trainable parameters, which are initialized with random numbers. The training objective is to minimize the loss, which is the squared difference between the network prediction and the ground truth, where $N$ is the total number of hadrons from jets in a mini-batch of training samples. The loss is reduced step by step using stochastic gradient descent (SGD), where the learning rate $\alpha$ is a small positive number. In this study, we use the Adam optimizer, which improves SGD using a momentum mechanism and an adaptive learning rate. The learning rate is a key parameter in stochastic gradient descent. A large learning rate at the beginning of the training helps to move quickly through the parameter space, while a small learning rate at late times reduces oscillations around the minima and helps the network to converge.
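The training procedure described above (squared-error loss, mini-batches, Adam updates, and a learning rate that shrinks during training) can be sketched on a toy problem. Everything specific below is an assumption for illustration: a linear map stands in for the network $f$, the data are random numbers rather than jets, and the exponential decay schedule and hyperparameters are not taken from this work.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the training set: 8 input features -> (x, y) targets.
X = rng.normal(size=(512, 8))
true_W = rng.normal(size=(8, 2))
Y = X @ true_W + 0.05 * rng.normal(size=(512, 2))

def mse_loss_and_grad(W, xb, yb):
    """Mean squared error between prediction xb @ W and truth yb, and dL/dW."""
    err = xb @ W - yb
    loss = np.mean(err ** 2)
    grad = 2.0 * xb.T @ err / err.size
    return loss, grad

W = rng.normal(scale=0.1, size=(8, 2))      # random initialisation of theta
m, v = np.zeros_like(W), np.zeros_like(W)   # Adam first and second moments
beta1, beta2, eps = 0.9, 0.999, 1e-8
lr0, decay = 0.05, 0.98                      # assumed schedule: lr = lr0 * decay**epoch

initial_loss, _ = mse_loss_and_grad(W, X, Y)
step = 0
for epoch in range(50):
    lr = lr0 * decay ** epoch                # large steps early, small steps late
    order = rng.permutation(len(X))
    for start in range(0, len(X), 64):       # mini-batches of 64 samples
        idx = order[start:start + 64]
        _, g = mse_loss_and_grad(W, X[idx], Y[idx])
        step += 1
        m = beta1 * m + (1 - beta1) * g       # momentum (running mean of gradients)
        v = beta2 * v + (1 - beta2) * g ** 2  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** step)       # bias corrections
        v_hat = v / (1 - beta2 ** step)
        W -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step

final_loss, _ = mse_loss_and_grad(W, X, Y)
print(initial_loss, final_loss)
```

Plain SGD would replace the last update line by `W -= lr * g`; Adam differs by smoothing the gradient and rescaling each parameter's step by its recent gradient magnitude.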
Using pseudo-data from LIDO model simulations to test our trained neural network, we observe a smaller testing error than the validation and testing errors on CoLBT-hydro data. The testing error is only 0.049 on data from the LIDO model, which corresponds to an RMSE of 2.2 fm. At first glance this seems mysterious, since the testing error is usually larger than the training error unless dropout, which randomly discards a fraction of neurons during training to improve generalization, is used. In our case, we did not use dropout, and we suspect that the testing error is smaller on LIDO data because of smaller fluctuations in the LIDO simulations.
One notable difference is that CoLBT-hydro uses both fragmentation and recombination for hadronization, while LIDO uses fragmentation only. To check the effect of this model difference, we switched off the recombination in CoLBT-hydro and found that the testing error is reduced from 0.060 to 0.054.
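The quoted testing errors can be related to distances with a short calculation. The sketch below assumes that the testing error is a mean squared error on positions expressed in units of 10 fm; this normalization is not stated in the text and is inferred only because it reproduces the quoted pairing of 0.049 with an RMSE of 2.2 fm.

```python
import math

LENGTH_SCALE_FM = 10.0  # ASSUMPTION: positions were divided by 10 fm before training

def rmse_fm(mse_normalised: float, scale_fm: float = LENGTH_SCALE_FM) -> float:
    """Convert a mean squared error in normalised units to an RMSE in fm."""
    return scale_fm * math.sqrt(mse_normalised)

for label, mse in [("LIDO", 0.049),
                   ("CoLBT-hydro without recombination", 0.054),
                   ("CoLBT-hydro", 0.060)]:
    print(f"{label}: {rmse_fm(mse):.1f} fm")
# prints:
# LIDO: 2.2 fm
# CoLBT-hydro without recombination: 2.3 fm
# CoLBT-hydro: 2.4 fm
```

Under this assumption, the three testing errors span an RMSE range of roughly 2.2 to 2.4 fm.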
There are other model differences that affect the jet shape and the prediction accuracy. Fig. 5 shows the distribution of the mean absolute angle of hadrons in a jet with respect to the jet axis.
Hadrons spread to larger angles in CoLBT-hydro as compared with LIDO. Both models have taken into account the rescattering of recoiled partons. LIDO considers the t-channel scattering between partons, while CoLBT-hydro has also s and u-channels, which result in more wide-angle scatterings. Other model differences between CoLBT-hydro and LIDO might also contribute to this surprisingly better prediction accuracy on pseudo-data from LIDO whose underlying modeling is different from the CoLBT-hydro model used for training data production.
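A per-jet observable of the kind compared between the two models can be computed as follows. The exact definition used for Fig. 5 is not spelled out in the text, so this sketch assumes an unweighted mean of the three-dimensional opening angle between each hadron momentum and the jet axis; a $p_T$-weighted mean or an azimuthal angle would be equally plausible readings.

```python
import numpy as np

def mean_abs_angle(hadron_p, jet_axis):
    """Unweighted mean opening angle (radians) between hadrons and the jet axis.

    hadron_p: array-like of shape (n, 3) with (px, py, pz) of each hadron
    jet_axis: array-like of shape (3,) pointing along the jet direction
    """
    hadron_p = np.asarray(hadron_p, dtype=float)
    axis = np.asarray(jet_axis, dtype=float)
    cos_theta = hadron_p @ axis / (
        np.linalg.norm(hadron_p, axis=1) * np.linalg.norm(axis))
    # Clip guards against round-off pushing |cos| slightly above 1.
    return float(np.mean(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Three hadrons at 0, +45 and -45 degrees from a jet axis along x.
hadrons = [[10.0, 0.0, 0.0], [5.0, 5.0, 0.0], [5.0, -5.0, 0.0]]
print(mean_abs_angle(hadrons, [1.0, 0.0, 0.0]))  # pi/6, approximately 0.5236
```

A model with more wide-angle scattering would shift the distribution of this quantity over many jets toward larger values.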

III. RESULTS
The Mach cones from CoLBT-
Before we study signals of jet-induced Mach cones and diffusion wake with specified spatial regions of the initial jet production positions using the trained point
It is not possible to determine the jet production position exactly in each event, owing to large fluctuations, even with the complete information of jet hadrons. For the same reason it is also difficult to locate the average jet production points precisely using the longitudinal and transverse gradient tomography. The neural network predictions are also approximate, as one can see here. Even though the true initial production points spread over a larger region than the network predictions, dividing the network-predicted jet production points into different sections will still localize the true production positions to the regions of interest, as one can see by comparing the upper and middle panels of Fig. 7.
the QGP with a shorter path length where the temperature is also lower. We should note that the enhanced soft hadrons in the peak region along the jet direction shown in the lower panels of Fig. 7 can come from both the Mach cone and medium-induced gluon radiation. It is difficult to distinguish these two sources of soft hadron enhancement in the measured distributions.
As one can see in Fig. 6, the diffusion wake associated with jet-induced Mach cones is just a depletion of the hydrodynamic background behind the propagating jet. This will lead to a depletion of soft hadrons in the $\gamma/Z^0$ direction in $\gamma/Z^0$-jet events. One can clearly see this depletion (yields are negative after subtraction of the p+p baseline) in Figs. 7(j) and (l), where the effect of the medium response is strong. As pointed out in Ref. [45], mini-jets from multiple parton interactions (MPI) associated with the hard processes of $\gamma/Z^0$-jet production contribute to a background in the jet-hadron correlation that is constant in azimuthal angle. These mini-jets will also interact with the QGP medium and experience jet quenching in heavy-ion collisions. Quenching of these MPI mini-jets will lead to an enhancement of soft hadrons from the MPI background, some of which remains even after the subtraction of the p+p baseline. This is the reason why one does not see a clear depletion due to the diffusion wake in Fig. 7(i), where the effect of the medium response is weak.
One can use a mixed-event method to estimate the enhanced MPI background in A+A collisions.
The depletion of soft hadrons due to the diffusion wake will become stronger after the subtraction of the MPI background [45].
As we have illustrated in Fig. 6, the shapes of the Mach cones and the diffusion wake are distorted by both the radial flow and the density gradient of the expanding QGP. These also influence the jet-hadron correlation of the final soft hadrons. When the distribution of the initial jet production positions is symmetric with respect to the jet propagation (or γ) direction, the jet-hadron correlation should also be symmetric with respect to the jet direction (∆φ = 0), as shown in Figs. 7(a,e,i) and (b,f,j).
In these symmetric jet-hadron correlations, the width of the correlation is broadened by both the wind of the radial flow and the slope of the density gradient. The broadening is larger for longer jet path lengths. The depletion due to the diffusion wake peaks in the direction of the trigger γ (∆φ = π) as seen in Fig. 7(j).
If the distribution of the initial jet production positions is not symmetric with respect to the jet direction, the jet propagation is tangential to the radial flow most of the time. The Mach cones and diffusion wake will be distorted by the temperature gradient and radial flow and become asymmetric with respect to the jet direction, as seen in Fig. 6. This asymmetry should also be reflected in the soft hadron azimuthal angle distributions, as seen in the right two columns of Fig. 7 [(c,g,k) and (d,h,l)] for γ-jet events in which the initial jet production locations are selected by the neural network such that the jet propagation is mostly tangential to the radial flow. In these asymmetric jet-hadron angular correlations, hadron yields in the direction towards the central region of the QGP are enhanced because of the higher temperature of the medium, larger parton energy loss and stronger medium response. The diffusion wake, on the other hand, is both swept by the radial flow and deflected by the density gradient, such that the minimum angle (the dip in the soft hadron depletion) is shifted away from the γ direction $\Delta\phi_{h\,\mathrm{jet}} = \pi$. The angular shift is larger when the diffusion wake is exposed more to the radial flow and density gradient, as in the case of Figs. 7(c,g,k). The dip there is shallower, owing to the smaller jet path length and weaker medium response, than in the case of Figs. 7(d,h,l).
Similarly, the position of the peak in the jet-hadron angular correlation is also shifted away from the jet direction (∆φ = 0) by the influence of the radial flow and the density gradient. The shift is larger when the jet propagation path length is longer. The asymmetric features in the measured jet-hadron correlations in events with neural-network-engineered jet production locations should be clear evidence of jet-induced Mach cones and the diffusion wake in heavy-ion collisions.
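The event selection behind these region-dependent correlations can be sketched as follows: keep only events whose network-predicted production point falls in a chosen region, then accumulate the per-trigger azimuthal distribution of soft hadrons relative to the jet. The event dictionary layout, the key names, the binning and the toy random events are all hypothetical; real input would come from simulation or data, and the same function applied to a p+p sample would give the baseline to subtract.

```python
import numpy as np

def delta_phi(phi_hadron, phi_jet):
    """Azimuthal difference wrapped into [-pi/2, 3*pi/2)."""
    d = (np.asarray(phi_hadron) - phi_jet + 0.5 * np.pi) % (2.0 * np.pi)
    return d - 0.5 * np.pi

def region_correlation(events, region, bins=20):
    """Per-trigger yield dN/d(dphi) for events whose PREDICTED production
    point (x, y) satisfies region(x, y)."""
    edges = np.linspace(-0.5 * np.pi, 1.5 * np.pi, bins + 1)
    hist = np.zeros(bins)
    n_sel = 0
    for ev in events:
        if region(*ev["xy_pred"]):
            dphi = delta_phi(ev["phi_hadrons"], ev["phi_jet"])
            hist += np.histogram(dphi, bins=edges)[0]
            n_sel += 1
    width = edges[1] - edges[0]
    return edges, hist / (max(n_sel, 1) * width), n_sel

# Toy events with random content, only to exercise the code path.
rng = np.random.default_rng(2)
events = [{"xy_pred": rng.uniform(-5, 5, size=2),
           "phi_jet": rng.uniform(0, 2 * np.pi),
           "phi_hadrons": rng.uniform(0, 2 * np.pi, size=30)}
          for _ in range(200)]

# Illustrative region: predicted y below -1 (same length units as the predictions).
edges, yield_sel, n_sel = region_correlation(events, lambda x, y: y < -1.0)
print(n_sel, yield_sel.sum() * (edges[1] - edges[0]))
```

The window [-π/2, 3π/2) is chosen so that both the jet direction (∆φ = 0) and the γ direction (∆φ = π) sit in the interior of the histogram rather than on its edge.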
3D structure of the diffusion wake - Since both jets and Mach cones are three-dimensional objects, it is interesting to examine the 3D structure of the signals of Mach cones and the diffusion wake, as recently suggested in Ref. [49]. We can do so for γ-jet events with and without restricted regions of the initial jet production positions. Fig. 8
We now restrict the initial jet production positions to (x < −1, y < 0), with the jet traveling in the positive y direction, using the deep neural network predictions. This scenario is very similar to that in the right panels of Fig. 7(d,h,l), where the jet traverses approximately one half of the QGP in a direction that is mostly tangential to the radial flow. The structure of the Mach cone and diffusion wake in the 2D jet-hadron correlation function is much more pronounced, as shown in the right plot of Fig. 8. Restricting the initial jet production positions in this way forces the jet to traverse longer path lengths on average, lose more energy and induce a stronger medium response. The peak in the jet direction is higher than in the left plot. The peak position is also shifted away from the jet direction by the radial flow and density gradient (this is difficult to see here because the tip of the peak is cut off by the limited range of the vertical axis of the plot). The diffusion wake valley in the γ direction is much deeper, with mostly negative yields [86]. Since the jet propagation direction is mostly tangential to the radial flow and the path is off the center of the QGP in this case, the diffusion wake is swept along by the radial flow and drifted by the density gradient away from the γ direction. This leads to a diffusion wake valley whose minimum point (the dip) is shifted away from the γ direction (∆φ = π), as we see in the right plot.
This 3D structure, enhanced by the neural network selection of the initial jet production positions, is much more dramatic than the plain azimuthal angle dependence of jet-hadron correlations.
The corresponding jet distribution is slightly modified from the p+p baseline, with the peak shifting slightly to smaller $p_T$, indicating a small jet energy loss. If we constrain the predicted jet production positions to y ∈ [−1, 1] (dash-dotted green line) and y < −1 (dotted red line), the average path lengths become increasingly longer, with larger jet energy loss. The corresponding jet $p_T$ distributions therefore shift to smaller $p_T$, as shown. These trends in the γ-jet asymmetry are clear evidence of the power of the deep learning assisted jet tomography, which can be used in the future for more detailed studies of jet quenching and jet-induced medium response.

IV. SUMMARY
In this paper, we have developed a deep learning assisted jet tomography and applied the technique to study jet-induced Mach cones and the associated diffusion wake in high-energy heavy-ion collisions. The deep learning assisted jet tomography is based on a point-cloud neural network and allows one to predict the initial jet production locations using information on final state hadrons with $p_T > 2$ GeV/c inside a jet in each event. The neural network is trained with data from CoLBT-hydro model simulations of jet propagation and medium response in high-energy heavy-ion collisions. The network is also tested against pseudo-data from the LIDO model, with high accuracy.
The root-mean-square deviation between the network prediction and the true jet production position is about 2.2-2.4 fm. This network therefore allows one to select events with approximate regions of the initial jet production locations inside the transverse overlap area of the two colliding nuclei, with known jet propagation direction relative to the radial flow.
Using the deep learning assisted jet tomography, we have investigated the signals of jet-induced Mach cones and the associated diffusion wake in the final jet-hadron correlation for soft hadrons from CoLBT-hydro simulations of γ-jet events in high-energy heavy-ion collisions. By restricting the initial jet production positions to different regions of the nuclear overlap area, we can enhance the signals of the Mach cone and diffusion wake and investigate the influence of the radial flow and density gradient. When the initial jet production positions are symmetric with respect to the jet direction, the jet-hadron azimuthal correlation is broadened by the wind of the radial flow and by the drifted diffusion due to the density gradient. The width of the azimuthal correlation and the soft hadron yield increase with the averaged jet path length when we vary the restricted region of the initial jet production location. For a long jet path length, a clear negative dip due to the diffusion wake is observed behind the conical shock front. If the initial jet production positions are restricted to only one half of the nuclear overlap area, the final jet-hadron correlation becomes asymmetric with respect to the jet direction because of the distortion of the Mach cone and diffusion wake by the radial flow and density gradient. The Mach-cone peak in the jet-hadron correlation is shifted away from the jet direction, and the dip of the diffusion wake is shifted away from the γ direction. Both of these shifts are driven by the blowing wind of the radial flow and the drifted diffusion due to the density gradient. We also applied the deep learning assisted jet tomography to the 2-dimensional jet-hadron correlation in both rapidity and azimuthal angle, where the signals of the Mach cone and diffusion wake are more dramatic than in the azimuthal angle correlation alone.
Observing these phenomena in experimental data will provide direct evidence for the existence of Mach cones in heavy-ion collisions and for their distortion by the radial flow and density gradient.
We also illustrated how one can use the jet tomography to extract the path length dependence of jet energy loss from the measured γ-jet $p_T$ distributions. These are just two examples of how one can use the deep learning assisted jet tomography to study jet quenching and jet-induced medium response in high-energy heavy-ion collisions.