1 Introduction

All projectiles moving through a medium feel resistance and eventually slow down, depending on the density of the medium and the frictional force. Typical bullets, for example, lose all their energy to friction and come to a stop after only a few feet in water, which is about 800 times denser than air. The lost energy generates sound waves in the medium that travel with the projectile and leave behind a diffusion wake. When the speed of a fast-moving projectile exceeds the velocity of sound in the medium, the fronts of the sound waves are squeezed together to form a Mach cone, or sonic boom. The same phenomenon is also expected in a small droplet of subatomic liquid [1,2,3,4,5,6,7,8,9,10,11,12,13] called quark-gluon plasma (QGP), which is the size of an atomic nucleus and is created in high-energy nuclear collisions. In contrast to normal nuclear matter, where quarks and gluons are confined to the domain of each nucleon (proton or neutron), quarks and gluons can roam freely in the whole volume of the QGP. In a process called jet quenching [14,15,16,17], energetic quarks and gluons (also called partons) are produced through hard scattering of beam partons from the colliding nuclei before the formation of QGP. These fast partons have to traverse the QGP at the speed of light (c), which is faster than the upper bound of the velocity of sound (\(c_s < {c/\sqrt{3}}\) ) in QGP [18]. As the fast partons lose energy-momentum along their trajectories through jet-medium interaction, their lost energy will induce supersonic sound waves in the form of Mach cones on a femtometer scale inside the QGP.

The jet-induced subatomic Mach cones can be studied theoretically using transport models [19,20,21,22,23], linearized hydrodynamics [4, 6,7,8, 11, 12, 23,24,25,26] as well as full hydrodynamics [13, 23, 27,28,29,30,31,32,33,34] as illustrated in Fig. 1. These Mach cones are different from those in a classical and macroscopic medium. First of all, partonic jets are quantum projectiles. They lose energy-momentum not only through elastic scattering with the medium but also through self and medium-induced gluon radiation with intricate quantum interference due to multiple scattering in the medium [35,36,37]. This unique energy loss mechanism will lead to a deformed Mach cone [8, 9] if all or part of the radiative energy loss is converted into medium excitation. Secondly, the extremely hot and dense QGP droplet expands very rapidly due to the large pressure and density gradients during the parton-shower propagation. The induced Mach cone excitation will also be deformed by the large density gradient and the radial flow of the medium [20, 21, 28, 33]. Experimental observations and measurements of the jet-induced Mach cone excitation can therefore help to study the dynamical evolution and intrinsic properties of the QGP medium, such as the temperature and flow profiles, the opacity to jet partons, the equation of state and the transport coefficients of the QGP [3, 10, 38].

Fig. 1

A snapshot of the 3-dimensional energy density distribution in the transverse plane (xy) and the spatial rapidity (\(\eta _s\)) of the medium response induced by a \(\gamma \)-triggered jet at time \(\tau =3\) fm/c in central Pb+Pb collisions at \(\sqrt{s}=5.02\) TeV simulated by the CoLBT-hydro model (see description in Sect. 2). White dots represent the positions of jet shower partons and the arrows their momenta

Since both jets and the QGP with jet-induced medium excitation will become hadrons in the final state, one can search for and study the jet-induced Mach cone in heavy-ion collision experiments through jet and hadron correlations. Jets in the final state are represented by energetic leading hadrons or clusters of hadrons, and hadronization of the QGP leads to a large number of soft hadrons as recorded by experimental detectors. Due to the large background of soft hadrons from the QGP that are not associated with jets and the jet-induced Mach cone, it is rather difficult to unambiguously identify the signal of the Mach cone in experimental data. For example, fluctuating initial energy density distributions and collective expansion of the QGP can lead to nontrivial two-hadron correlations in the azimuthal angle from anisotropic flows [21, 23, 39,40,41,42] that can be misidentified as a possible signal of a Mach cone. Such background can be eliminated in neutral boson (\(Z^0\)) or photon (\(\gamma \))-jet events since the \(\gamma /Z^0\) trigger does not interact with the QGP medium and is therefore not correlated with the underlying soft hadrons. However, the most recent experimental data on \(\gamma /Z^0\)-hadron correlations [43] have not shown any visible and unambiguous signal of jet-induced Mach cones without additional event selections.

The absence of clear experimental evidence for jet-induced Mach cones in \(\gamma /Z^0\)-jet events could be caused by the averaging of events over the initial jet production positions and propagation directions. For a given initial jet production position and propagation direction, simulations show that jet-induced Mach cones lead to some unique angular distributions of soft hadrons due to distortion by the density gradient and radial flow in an expanding QGP medium [29, 44, 45]. However, for fixed transverse momentum of the trigger \(\gamma /Z^0\) in experiments, the initial \(\gamma /Z^0\)-jet can be produced at any transverse position (xy) according to the distribution of the number of binary collisions in the overlap region of two colliding nuclei. The final \(\gamma /Z^0\)-hadron correlations measured in experiments are averaged over events with different \(\gamma /Z^0\)-jet production positions and parton-shower propagation direction or effectively different parton-shower propagation path length (different amount of jet energy loss), the temperature gradient, as well as the intersection between parton-shower propagation and collective flow of the medium. The final averaged \(\gamma /Z^0\)-hadron correlation unfortunately loses many of the distinctive features from a typical Mach cone.

Attempts have been made in the past to locate the initial jet production position using jet tomography. Since \(\gamma /Z^0\) does not lose energy in the QGP medium, the change of \(\gamma /Z^0\)-jet energy asymmetry \(\chi = E_{\gamma /Z^0} - E_{\textrm{jet}}\) is related to the average jet energy loss, which depends monotonically on the average parton-shower propagation length. It is therefore proposed in a longitudinal jet tomography [46] that the \(\gamma /Z^0\)-jet energy asymmetry can be used to locate the initial longitudinal position (along the parton-shower propagation direction) or the jet path length. Similarly, the transverse asymmetry \(A_T\), which is defined to characterize the jet transverse diffusion due to the transverse density gradient of the medium, has been proposed in a transverse or gradient tomography [47, 48] to approximately locate the initial transverse position of the jet perpendicular to the parton-shower propagation. The 2D jet tomography, which combines the longitudinal and transverse jet tomography, is shown to be able to locate the initial jet production positions, including both the path length and perpendicular coordinates, and to enhance detailed structures of the diffusion wake of the Mach cones in \(\gamma \)-hadron correlations [45, 49]. Recently, a deep neural network was also employed to estimate the energy loss of hadron-triggered jets, which can also roughly indicate the initial longitudinal positions of jet production along the jet direction [50, 51].

The above methods from previous studies can only provide a very rough estimate of the averaged initial jet production positions. To fully understand the jet-induced medium response and look for signals of the Mach cone and the diffusion wake, more accurate and event-by-event jet production positions are desired, taking into account more realistic jet energy loss and the fluctuating nature of jet shower development in the QGP medium. We aim to develop a deep-learning-assisted jet tomography in this study that will address the following questions:

  1. Is it possible to determine the initial jet production positions using the momenta of final state particles?

  2. Will the signal of Mach cones and the diffusion wake be amplified if initial jet production positions are constrained to specific regions with long parton-shower propagation lengths or fixed propagation direction relative to the radial flow?

  3. How reliable is the new deep learning assisted method?

To answer the first question, we employ a point-cloud deep neural network (DNN) with multiple hidden layers, which is powerful in pattern recognition and has been widely used in high energy nuclear physics [52,53,54,55] and jet-medium interaction studies [50, 51, 56,57,58,59,60,61,62,63]. We will train and validate the DNN using data from simulations with the coupled Linear Boltzmann Transport and hydro (CoLBT-hydro) model [29], which combines the Linear Boltzmann Transport (LBT) model for jet parton propagation in the medium with the concurrent QGP evolution given by the 3+1D CCNU-LBNL viscous (CLVisc) hydrodynamics [64,65,66]. The CoLBT-hydro model also simulates the stochastic energy-momentum deposition to the medium by jet-shower partons together with the relativistic hydrodynamic medium response. For the second question, we can average events with the same jet production regions as predicted by the deep neural network. This method, dubbed “jet-position engineering”, shows that the signal of the Mach cone and diffusion wake depends strongly on the initial jet production regions. The third question concerns the dependence of the neural network on the model used for training. We will use pseudo-data generated from another model, the LIDO partonic transport model [67, 68], to cross-check the performance of the trained network, i.e., use the CoLBT-trained network to infer the jet production locations from the pseudo-data generated by the LIDO model and compare them to the ground truth. Although we do not expect either model to provide the exact real jet dynamics in the QGP, this practice can estimate the model uncertainty of the method.

The remainder of the paper is organized as follows. In Sect. 2, we describe the major physics model CoLBT-hydro and the cross-check model LIDO for jet evolution in the QGP. We will also introduce the point-cloud neural network method for locating the initial jet production positions from the information of final state hadrons. Results are presented in Sect. 3. Section 4 summarizes the paper.

2 Methods

2.1 CoLBT-hydro model for jet transport and medium response

The initial configurations of \(\gamma \)-jets in momentum space are generated by PYTHIA8 [69]. High energy partons with large virtuality produced in hard scattering split subsequently towards their mass shell, following the DGLAP evolution equation. The splitting terminates when the virtuality of all daughter partons in the shower becomes smaller than 1 GeV. Please refer to Ref. [23] for more details on the shower generation in momentum space. The Trento model is used for both the initial spatial distribution [70] of \(\gamma \)-jet production and the initial energy density distribution for the hydrodynamic evolution of the bulk QGP medium.

The jet transport in QGP is simulated using the LBT model [44, 71,72,73], where evolution of the phase space distribution of jet shower partons in QGP is described by the linearized Boltzmann equation,

$$\begin{aligned} p_a\cdot \partial f_a{} & {} = \int \sum _{b c d } \prod _{i=b,c,d}\frac{d^3p_i}{2E_i(2\pi )^3} (f_c f_d-f_a f_b)|\mathcal{M}_{ab\rightarrow cd}|^2\nonumber \\{} & {} \quad \times \frac{\gamma _b}{2} S_2({\hat{s}},{\hat{t}},{\hat{u}})(2\pi )^4\delta ^4(p_a\!+\!p_b\!-\!p_c\!-\!p_d)\nonumber \\{} & {} \quad + \textrm{inelastic}, \end{aligned}$$
(1)

where \(p_{a/c}\) and \(f_{a/c}\) are the four-momenta and phase space distributions of jet-shower partons before and after each scattering, \(p_{b/d}\) are the four-momenta of medium partons sampled from thermal distributions \(f_{i} = 1/(e^{p_{i}\cdot u / T} \pm 1)\) (\(i=b, d\)) in a locally thermalized QGP with temperature T and fluid four-velocity u. The left hand side of the equation describes the change of parton phase space density \(f_a\). The right hand side contains the loss term (\(-f_a f_b\)) and the gain term (\(f_c f_d\)). The loss term describes the forward process where a jet shower parton a in the given phase space scatters with a thermal parton b, causing a decrease in \(f_a\) if parton a leaves this specific phase space. The gain term describes the back reaction, where a jet shower parton c scatters with a thermal parton d, causing an increase of phase space density \(f_a\), if the recoiled parton enters the specific phase space. Both processes have considered all possible channels of elastic scattering \(a + b \rightarrow c + d\), whose amplitudes \(|{{\mathcal {M}}}_{ab \rightarrow cd}|\) are computed up to leading order in perturbative QCD [74]. The factor \(\gamma _b\) is the color and spin degeneracy of parton b in the forward process and

$$\begin{aligned} S_2({\hat{s}}, {\hat{t}}, {\hat{u}}) = \theta ({\hat{s}}\ge 2\mu _{D}^2)\theta (-{\hat{s}}+\mu _{D}^2\le {\hat{t}}\le -\mu _{D}^2) \end{aligned}$$
(2)

is a Lorentz-invariant double step function used to regularize the collinear divergence in the scattering amplitudes, where \({\hat{s}}\), \({\hat{t}}\), and \({\hat{u}}\) are the Mandelstam variables, and \(\mu _{D}^2 = \frac{3}{2}g^2 T^2\) is the squared Debye screening mass. The inelastic part in Eq. (1) is the contribution from gluon radiation that is induced by the elastic scattering processes. The induced gluon emission rate \(\Gamma ^{\textrm{inel}}_{a}\) is calculated according to the higher-twist formalism [75,76,77]. Please refer to Refs. [44, 71,72,73] and [29, 78,79,80] for more details about the LBT and CoLBT-hydro models.
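The kinematic cut of Eq. (2) can be sketched in a few lines. The snippet below is only an illustration: the function names are hypothetical, \(g^2 = 4\pi \alpha _s\) is the standard relation between the coupling and \(\alpha _s\), and the values \(\alpha _s = 0.3\) and \(T = 0.3\) GeV are placeholders rather than the parameters used in the simulations.

```python
import math

def debye_mass_sq(alpha_s: float, temperature: float) -> float:
    """mu_D^2 = (3/2) g^2 T^2 with g^2 = 4 pi alpha_s (GeV^2 if T is in GeV)."""
    g_sq = 4.0 * math.pi * alpha_s
    return 1.5 * g_sq * temperature**2

def s2_cutoff(s_hat: float, t_hat: float, mu_d_sq: float) -> int:
    """Double step function of Eq. (2): 1 if the kinematics pass, else 0."""
    s_ok = s_hat >= 2.0 * mu_d_sq
    t_ok = (-s_hat + mu_d_sq) <= t_hat <= -mu_d_sq
    return int(s_ok and t_ok)

# Placeholder values (assumptions), not the calibrated model parameters.
mu2 = debye_mass_sq(alpha_s=0.3, temperature=0.3)
accepted = s2_cutoff(10.0, -1.0, mu2)    # momentum transfer above the screening scale
rejected = s2_cutoff(10.0, -0.01, mu2)   # too soft: |t| < mu_D^2
print(mu2, accepted, rejected)
```

Scatterings with \(|{\hat{t}}| < \mu _D^2\) are rejected, which is how the cut keeps the small-angle region of the matrix elements finite.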

The four-momenta of thermal partons b and d are sampled from distribution functions \(f(p) = 1/(e^{p\cdot u / T} \pm 1)\) with \(+\) for quarks and \(-\) for gluons. The local temperature T and fluid velocity u are provided by the CLVisc model, which numerically solves the relativistic hydrodynamic equations,

$$\begin{aligned} \nabla _{\mu } T^{\mu \nu } = J^{\nu }, \end{aligned}$$
(3)

for the space-time evolution of QGP, where \(\nabla _{\mu }\) is the covariant derivative operator, \(T^{\mu \nu } = (\epsilon +P)u^{\mu }u^{\nu } - P g^{\mu \nu } + \pi ^{\mu \nu }\) is the energy-momentum tensor, in which \(\epsilon \) and P are the energy density and pressure in the co-moving frame of the fluid, \(u^{\mu }\) is the relativistic fluid four-velocity satisfying \(u_{\mu }u^{\mu } = 1\), \(g^{\mu \nu } = \textrm{diag}(1, -1, -1, -1/\tau ^2)\) is the metric tensor in the Milne \((\tau , x, y, \eta _s)\) coordinates and \(\pi ^{\mu \nu }\) is the shear-stress tensor. The source term \(J^{\nu }\) describes the energy and momentum deposition from jet showers to QGP. It includes the energy and momentum from the recoiled and radiated partons whose energy in the local co-moving frame of the fluid is smaller than a cut-off which is set to \(p_0=2\) GeV/c. Partons taken out of the phase space in the medium will leave a particle hole behind, and will deposit negative energy-momentum to the medium through the source term. This source term \(J^{\nu }\) is responsible for generating the Mach cones and the diffusion wake in the QGP medium. This set of time-dependent partial differential equations is solved numerically on GPU using OpenCL in the CLVisc model concurrently with the LBT transport. See Refs. [64,65,66, 81] for more details. A parton recombination model is used to hadronize both the QGP and hard partons from jet showers [82,83,84]. The kinematic cuts in this work are \(p_T^\gamma \in (200,250)\) GeV, \(\left| \eta _\gamma \right| <0.32\), \(p_T^{\textrm{jet}}>100\) GeV, and \(\left| \eta _{\textrm{jet}} \right| <0.75\), with jet cone size \(R=0.4\).
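Both the thermal distributions and the cut-off \(p_0\) depend on the same invariant, the parton energy \(p\cdot u\) in the local rest frame of the fluid. The following sketch shows how this quantity could be evaluated. It assumes Cartesian Minkowski components instead of the Milne coordinates used by CLVisc, and all function names are illustrative rather than taken from the CoLBT-hydro code.

```python
import numpy as np

# Cartesian Minkowski metric (assumption: the simulation itself uses Milne coordinates).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def four_velocity(v):
    """Build u^mu = gamma (1, v) from a 3-velocity in units of c."""
    v = np.asarray(v, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    return gamma * np.concatenate(([1.0], v))

def comoving_energy(p, u):
    """Invariant p.u, the parton energy seen in the local fluid rest frame."""
    return float(np.asarray(p, dtype=float) @ METRIC @ u)

def thermal_occupation(p, u, temperature, fermion):
    """f = 1 / (exp(p.u / T) +/- 1): '+' for quarks, '-' for gluons."""
    sign = 1.0 if fermion else -1.0
    return 1.0 / (np.exp(comoving_energy(p, u) / temperature) + sign)

def goes_to_source_term(p, u, p0=2.0):
    """True if p.u < p0 (GeV), i.e. the parton is soft enough to feed J^nu."""
    return comoving_energy(p, u) < p0

u = four_velocity([0.0, 0.6, 0.0])       # fluid cell flowing along +y
p = np.array([3.0, 0.0, 3.0, 0.0])       # massless parton along +y with E = 3 GeV
print(comoving_energy(p, u), goes_to_source_term(p, u))
```

In this example a 3 GeV parton moving with the flow has a co-moving energy of only 1.5 GeV, so it falls below the cut-off even though its laboratory energy is above it.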

2.2 LIDO transport model for jet evolution

The LIDO model also contains the essential ingredients of collisional and radiative processes as well as the medium response, but differs from CoLBT-hydro in details. We will apply the predictor trained with the CoLBT-hydro model to infer the initial jet production locations from a set of pseudo-data generated by the LIDO transport model. This provides an estimation of the model-dependent uncertainty in the predictor. Here, we briefly describe the LIDO model with an emphasis on its difference as compared to the CoLBT-hydro model. Full details of the LIDO model can be found in Refs. [67, 68].

The evolution of the distribution function of hard partons in the LIDO model can be schematically written as

$$\begin{aligned} \frac{d f(t, x, p)}{dt}{} & {} = \left[ -\eta \nabla _p+ \frac{1}{4}\hat{q}_s\nabla _p^2\right] f\nonumber \\{} & {} \quad +\mathcal {C}_{12}[f] + \mathcal {C}_{22}[f] + \mathcal {C}_{23}[f]. \end{aligned}$$
(4)

The first key difference as compared to the CoLBT-hydro model is that the jet-medium interaction is separated into two parts: (1) hard interactions with \(q_\perp > 2\mu _D\) are simulated according to the matrix elements of scatterings in the vacuum \(\mathcal {C}_{22}[f]\); (2) soft interactions that transfer small momentum \(q_\perp <2\mu _D\) between jet partons and the medium are modeled by a diffusion process, as indicated by the first term on the right-hand side of the equation. The soft diffusion constant is \(\hat{q}_s = \alpha _s C_R \mu _D^2 T \ln 4\) with the corresponding soft drag constant \(\eta = \hat{q}_s/(4ET)\). Since a large fraction of the elastic processes are taken care of by a diffusion equation (Langevin dynamics in particle-based simulations), the fluctuation of elastic energy loss in LIDO is smaller than in CoLBT-hydro with \(-t\approx q_\perp ^2 > \mu _D^2\). The second difference is the handling of radiative processes. The LIDO model splits parton emissions into diffusion-induced radiation \(\mathcal {C}_{12}[f]\) and hard-collision-induced radiation \(\mathcal {C}_{23}[f]\). The LIDO model applies a different implementation of the Landau-Pomeranchuk-Migdal (LPM) interference effects that works best in the deep LPM region, while the CoLBT-hydro model applies the higher-twist formula [75]. Nevertheless, both methods result in a radiative energy loss that scales as \(\ln E\) for energetic partons.
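As a numerical illustration of the soft sector, the sketch below evaluates \(\hat{q}_s\) and \(\eta \) in natural units. It is not the LIDO implementation: the function names are hypothetical, the Debye mass is assumed to follow the same \(\mu _D^2 = \frac{3}{2}g^2T^2\) as in Sect. 2.1 with \(g^2 = 4\pi \alpha _s\), and \(C_R\) is taken to be the usual SU(3) Casimir factor (4/3 for quarks, 3 for gluons).

```python
import math

CASIMIR = {"quark": 4.0 / 3.0, "gluon": 3.0}  # SU(3) colour factors C_F and C_A

def soft_qhat(alpha_s, temperature, species="quark"):
    """q_hat_s = alpha_s C_R mu_D^2 T ln 4, in GeV^3 for T in GeV (natural units)."""
    # Assumption: same Debye mass definition as in Sect. 2.1.
    mu_d_sq = 1.5 * 4.0 * math.pi * alpha_s * temperature**2
    return alpha_s * CASIMIR[species] * mu_d_sq * temperature * math.log(4.0)

def soft_drag(alpha_s, temperature, energy, species="quark"):
    """eta = q_hat_s / (4 E T) for a parton of energy E (GeV)."""
    return soft_qhat(alpha_s, temperature, species) / (4.0 * energy * temperature)

# Placeholder inputs (assumptions): alpha_s = 0.3, T = 0.3 GeV, E = 100 GeV.
qhat_q = soft_qhat(0.3, 0.3, "quark")
qhat_g = soft_qhat(0.3, 0.3, "gluon")
eta_q = soft_drag(0.3, 0.3, 100.0, "quark")
print(qhat_q, qhat_g / qhat_q, eta_q)
```

The gluon-to-quark ratio of 9/4 follows directly from the colour factors, and the drag decreases as 1/E for fixed temperature.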

The parameters in the CoLBT-hydro and LIDO models have been calibrated to reproduce experimental data on the suppression of single inclusive jet and hadron spectra [68, 79, 83]. Therefore, both models have similar averaged jet/parton energy loss, while the fluctuations and transverse dynamics could be different. This will be elaborated in the next section, where we compare hadron distributions within jets between the two models.

Finally, the LIDO model does not solve the fully coupled hydrodynamic response, but uses a simplified formula to compute the correction to jet energy from medium response [68]. Furthermore, LIDO only uses Lund string fragmentation, while the CoLBT-hydro model uses fragmentation plus a parton recombination model for hadronization. Nevertheless, we expect these differences to affect only the particle distributions at low \(p_T\). We do not consider these to have major impacts on the following inference of jet locations, where only hadrons with \(p_T>2\) GeV/c are used in the training.

Fig. 2

The structure of the point cloud network that is used to predict the jet production positions (xy) from hadrons in the jet and the global information of \(\gamma \) and jet

2.3 Point cloud network for particles in momentum space

We employ a point cloud network to locate the initial jet production positions as shown in Fig. 2. The network has two kinds of inputs for each training sample. The first input consists of a list of hadrons in the jet with 6 features (\(p_x\), \(p_y\), \(p_z\), E, mass, \(p_T\)) for each hadron. Since different events may have different numbers of hadrons, we pad the end of each hadron list with zeros so that all inputs to the point cloud network have the same length. Hadrons from jets are selected with \(p_T>2\) GeV/c to reduce the influence of soft hadrons from the hydrodynamic background. The second input contains the global information about the \(\gamma \)-jet event, with 6 floating-point numbers giving the (\(\eta , p_T, \phi \)) of the \(\gamma \) and of the jet reconstructed using FastJet with the anti-\(k_t\) algorithm [85].
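The preparation of the first input can be sketched as follows. This is a minimal example under stated assumptions: the function name `select_and_pad`, the fixed length `max_hadrons` and the column order (\(p_x\), \(p_y\), \(p_z\), E, mass, \(p_T\)) with \(p_T\) in the last column are illustrative choices, not details taken from the actual training code.

```python
import numpy as np

def select_and_pad(hadrons, max_hadrons, pt_min=2.0):
    """Keep hadrons with pT > pt_min (GeV/c) and zero-pad to (max_hadrons, 6)."""
    hadrons = np.asarray(hadrons, dtype=np.float32).reshape(-1, 6)
    kept = hadrons[hadrons[:, 5] > pt_min]        # column 5 holds pT (assumed layout)
    if len(kept) > max_hadrons:
        raise ValueError("event has more selected hadrons than max_hadrons")
    padded = np.zeros((max_hadrons, 6), dtype=np.float32)
    padded[: len(kept)] = kept                    # trailing rows stay zero
    return padded

# Toy event with three hadrons; only the last two pass the pT cut.
event = np.array([
    [1.0, 0.0, 0.0, 1.01, 0.14, 1.0],
    [3.0, 0.0, 0.0, 3.00, 0.14, 3.0],
    [0.0, 5.0, 0.0, 5.00, 0.14, 5.0],
])
padded = select_and_pad(event, max_hadrons=4)
print(padded.shape)
```

Every event then has the same array shape, which allows events with different hadron multiplicities to be stacked into one mini-batch.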

The point cloud network uses a one-dimensional convolutional neural network (1DCNN) to convert the 6 features of each hadron in the cloud to a high-dimensional latent space. The 1DCNN is a feed-forward neural network (also called a multi-layer perceptron or MLP) whose parameters are shared by all hadrons in the cloud. The 1DCNN has a structure of MLP(256, 128, 128), where the numbers stand for the number of neurons in each hidden layer. At each layer, an ELU activation function is used to provide the non-linear transformation capability. The high-dimensional latent features of each hadron from the 1DCNN are independent of each other. Global Max Pooling then takes the maximum over all hadrons along each direction of the latent space, i.e., it locates the boundaries of the point cloud along each direction. This step preserves the permutation symmetry and helps to capture the correlations between hadrons in the jet. The output of this step is the jet substructure information encoded in 128 numbers from the Global Max Pooling. Another MLP(256, 128) is used to convert the second input to high-dimensional features including the \(\gamma \)-jet asymmetry. Latent features from these two inputs, corresponding to local and global information, are fed together to the final MLP(32) to produce the output. The output of the network has two neurons representing the initial jet production position \((x^{\textrm{net}}_i, y^{\textrm{net}}_i)\) in the transverse plane, for given sets of the final hadron four-momenta \(\{\vec {p}\}_i\),

$$\begin{aligned} (x^{\textrm{net}}_i, y^{\textrm{net}}_i) = f(\{\vec {p}\}_i, \theta ), \end{aligned}$$
(5)

where f is the function represented by the neural network and \(\theta \) is the collection of all the trainable parameters, which are initialized with random numbers. The training objective is to minimize the loss, which is the mean squared difference between the network prediction and the ground truth,

$$\begin{aligned} \textrm{loss} = {1\over N}\sum _{i=1}^N \left[ (x_i^{\textrm{net}} - x_i^\textrm{true})^2 + (y_i^{\textrm{net}} - y_i^{\textrm{true}})^2 \right] , \end{aligned}$$
(6)

where N is the number of training samples (\(\gamma \)-jet events) in a mini-batch. The loss is reduced step-by-step using the stochastic gradient descent (SGD),

$$\begin{aligned} \theta ^{\textrm{new}} = \theta ^{\textrm{old}} - \alpha {\partial \textrm{loss} \over \partial \theta }, \end{aligned}$$
(7)

where the learning rate \(\alpha \) is a small positive number. In this study, we use the Adam optimizer which improves SGD using momentum mechanism and adaptive learning rate.
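To make the architecture and the loss of Eqs. (5) and (6) concrete, the following NumPy sketch implements a forward pass with randomly initialized weights. It is only a structural illustration: the weight initialization, the order in which the two latent vectors are concatenated, and the linear final layer are assumptions, and no training is performed here.

```python
import numpy as np

rng = np.random.default_rng(0)

def elu(x):
    """ELU activation: x for x > 0, exp(x) - 1 otherwise."""
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

def init_mlp(sizes):
    """Random weights and zero biases for consecutive layer sizes (assumed init)."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers, final_linear=False):
    """Apply dense layers with ELU; optionally leave the last layer linear."""
    for k, (w, b) in enumerate(layers):
        x = x @ w + b
        if not (final_linear and k == len(layers) - 1):
            x = elu(x)
    return x

params = {
    "point": init_mlp([6, 256, 128, 128]),  # shared per-hadron MLP(256, 128, 128)
    "global": init_mlp([6, 256, 128]),      # MLP(256, 128) on the gamma/jet input
    "head": init_mlp([256, 32, 2]),         # MLP(32) followed by the 2-neuron output
}

def predict_xy(hadrons, global_info, params):
    """Map one event (hadrons: (n, 6), global_info: (6,)) to a scaled (x, y)."""
    latent = mlp(hadrons, params["point"])      # (n, 128), same weights for each hadron
    pooled = latent.max(axis=0)                 # global max pooling over hadrons
    glob = mlp(global_info, params["global"])   # (128,)
    return mlp(np.concatenate([pooled, glob]), params["head"], final_linear=True)

def mse_loss(pred_xy, true_xy):
    """Eq. (6): mean over events of the squared (x, y) deviation."""
    diff = np.asarray(pred_xy) - np.asarray(true_xy)
    return float(np.mean(np.sum(diff**2, axis=-1)))

hadrons = rng.normal(size=(5, 6))   # toy event with 5 hadrons
info = rng.normal(size=6)           # toy global (eta, pT, phi) of gamma and jet
out = predict_xy(hadrons, info, params)
out_shuffled = predict_xy(hadrons[::-1], info, params)
print(out, np.allclose(out, out_shuffled))
```

Reversing the order of the hadrons leaves the output unchanged up to floating-point rounding, which is the permutation symmetry provided by the shared weights and the max pooling.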

Fig. 3

The \(p_x\) and \(p_y\) (circles) of jet hadrons for 6 randomly selected \(\gamma \)-jet events and their corresponding jet production positions (xy) in the transverse plane

Fig. 4

The training history as a function of epochs. The left figure shows the training and validation loss of the network. The right figure shows how the adaptive learning rate changes during the training process. The learning rate is a key parameter in stochastic gradient descent. A large learning rate at the beginning of the training helps to move quickly through the parameter space, while a small learning rate at late times reduces oscillations around the minima and helps the network to converge

Shown in Fig. 3 are \(p_x\) and \(p_y\) (circles) of jet hadrons for 6 randomly selected CoLBT-hydro events with a \(\gamma \)-jet and their corresponding jet production positions (xy) in the transverse plane. These six scatter plots illustrate the training data and the corresponding labels used in the supervised regression. No obvious patterns in the \((p_x, p_y)\) distributions of jet hadrons can be found visually that are directly related to the jet production positions due to the stochastic nature of jet showering and scattering in the medium. This reminds us that the neural network is not expected to determine the jet production positions accurately in all events, if the causality between the input and output is destroyed by event-by-event fluctuations.

The network prefers to work with numbers between \(-1\) and 1. During training, the four-momenta and mass of hadrons are scaled down by a factor of 2000. The transverse momenta \(p_T\) of the \(\gamma \) and the jet are scaled down by a factor of 250. The coordinates x and y of the initial jet production are scaled down by a factor of 10. Because of this scaling, a typical loss of 0.06 corresponds to a mean squared deviation of \(0.06\times 10^2 = 6\) fm\(^2\), i.e., a root-mean-square deviation of \(\sqrt{6}\approx 2.4\) fm from the ground truth.
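The conversion between the scaled loss and a distance in the transverse plane is simple arithmetic, sketched below. The helper name `loss_to_rmse_fm` is hypothetical, and the coordinates are assumed to be in fm before scaling.

```python
import math

POSITION_SCALE = 10.0  # x and y are divided by 10 before training

def loss_to_rmse_fm(loss, scale=POSITION_SCALE):
    """Convert the scaled loss of Eq. (6) to a root-mean-square deviation in fm."""
    return math.sqrt(loss) * scale

rmse = loss_to_rmse_fm(0.06)
print(round(rmse, 2))  # 2.45
```

Since the loss in Eq. (6) adds the squared deviations in x and y, this number is the root-mean-square distance between the predicted and true positions, not the error of a single coordinate.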

2.4 Locating jet production positions using deep learning

Figure 4 shows the training and validation loss as a function of training epochs in the jet-production-position predictions. The first epoch is trained with a batch size of 4, which differs from the other epochs, whose batch size is 32. By doing this, the network can first explore more regions of the parameter space and then continue to optimize from a good starting point. Besides the adaptive learning rate provided by the Adam optimizer, we reduce the learning rate by 20% if the validation loss saturates and stays on a plateau for 50 epochs. The validation and testing errors using data from CoLBT-hydro are about 0.06, which corresponds to a root-mean-square error (RMSE) of \(\sqrt{6}\approx 2.4\) fm.
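The plateau-based reduction of the learning rate can be written as a small stand-alone class, shown here as a sketch rather than the code used for training. The class name, the initial learning rate and the exact definition of a plateau (no new minimum of the validation loss for `patience` consecutive epochs) are assumptions.

```python
class PlateauScheduler:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr, factor=0.8, patience=50):
        self.lr = lr
        self.factor = factor          # 0.8 corresponds to a 20% reduction
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Update with one epoch's validation loss and return the current rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= self.factor
                self.stale_epochs = 0
        return self.lr

# Toy run with a short patience to show a single reduction.
sched = PlateauScheduler(lr=1e-3, patience=3)
history = [sched.step(loss) for loss in (1.0, 1.0, 1.0, 1.0)]
print(history)
```

In the toy run the first value sets the best loss, the next three epochs bring no improvement, and the rate drops from 1e-3 to 8e-4 at the fourth step.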

Using pseudo-data from LIDO model simulations to test our trained neural network, we observe a smaller testing error than the validation and testing errors on CoLBT-hydro data. The testing error is only 0.049 on data from the LIDO model, which corresponds to an RMSE of \(\sqrt{4.9}\approx 2.2\) fm. At first glance, this seems surprising, since the testing error is usually larger than the training error unless dropout, which randomly discards a fraction of neurons during training to improve the generalization ability, is used. In our case, we did not use dropout and suspect that the testing error is smaller on LIDO data because of smaller fluctuations in the LIDO simulations.

One notable difference is that CoLBT-hydro uses both fragmentation and recombination for hadronization while LIDO uses fragmentation only. To check the effect of this model difference, we switched off the recombination in CoLBT-hydro and found that the testing error is reduced from 0.060 to 0.054.

There are other model differences which affect the jet shape and the prediction accuracy. Figure 5 shows the distribution of the mean absolute angle of hadrons in the jet with respect to the jet axis. Hadrons spread to larger angles in CoLBT-hydro as compared with LIDO. Both models have taken into account the rescattering of recoiled partons. LIDO considers only the t-channel scattering between partons, while CoLBT-hydro also includes the s- and u-channels, which result in more wide-angle scattering. Other model differences between CoLBT-hydro and LIDO might also contribute to this surprisingly better prediction accuracy on pseudo-data from LIDO, whose underlying modeling is different from the CoLBT-hydro model used to produce the training data.
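One possible way to compute the per-jet quantity compared between the two models is sketched below. It assumes an unweighted mean of the absolute azimuthal difference between each hadron and the jet axis; the function names are illustrative, and a \(p_T\)-weighted definition would be an equally plausible reading of the text.

```python
import numpy as np

def wrap_phi(dphi):
    """Wrap azimuthal differences into [-pi, pi)."""
    return (np.asarray(dphi, dtype=float) + np.pi) % (2.0 * np.pi) - np.pi

def mean_abs_dphi(px, py, jet_phi):
    """Unweighted mean |phi_hadron - phi_jet| over the hadrons of one jet."""
    phi = np.arctan2(py, px)
    return float(np.mean(np.abs(wrap_phi(phi - jet_phi))))

# Two hadrons on either side of a jet pointing along phi = pi.
phis = np.array([3.0, -3.0])
spread = mean_abs_dphi(np.cos(phis), np.sin(phis), jet_phi=np.pi)
print(spread)
```

The wrapping step matters for jets near \(\phi = \pm \pi \): without it the second hadron in this example would appear almost \(2\pi \) away from the jet axis instead of about 0.14 rad.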

3 Results

3.1 The Mach cones from CoLBT

Before we study signals of the jet-induced Mach cone and diffusion wake with specified spatial regions of the initial jet production positions using the trained point cloud neural network, we first look at features of the Mach cones from CoLBT-hydro simulations and their dependence on the initial jet production positions. Shown in Fig. 6 are the transverse distributions of the energy density at spatial rapidity \(\eta _s=0.0\) from the jet-induced medium response in single \(\gamma \)-jet events with 6 different initial jet production positions. Jets propagate upward in all these events. In this plot, the background from the same hydrodynamic events without jets is subtracted. The negative energy density represents the depletion of the medium due to the diffusion wake. The upper panels show events where jets are initially produced close to the upper edge of the overlap region of two colliding nuclei and travel outward. These jets in general have short path lengths and lose less energy. The lower panels show events where jets are produced near the lower edge of the overlap region but travel inward through the center and across the whole length of the QGP with long path lengths. The average medium density (temperature) on their paths is also higher. The left and right panels show events in which the parton-shower propagation direction is tangential to the radial flow, while the middle panels show events where jets travel along (upper-center) and against (lower-center) the radial flow.

Fig. 5

The distribution of mean \(\Delta \phi \) for hadrons in jet. The green solid line is from LIDO model and the blue dashed line is from CoLBT-hydro model

Fig. 6

Energy density transverse distributions of jet-induced medium response from CoLBT-hydro simulations of 6 \(\gamma \)-jet events with \(p_T^\gamma \in (200,250)\) GeV/c in 0–10% central Pb+Pb collisions at \(\sqrt{s}=5.02\) TeV with different jet production positions

The medium responses and signals of the Mach cone are clearly stronger for longer path lengths and higher average medium density on the parton-shower propagation path. The shape of the Mach cone is also apparently distorted by the temperature gradient and the radial flow of the QGP medium. For jets traveling tangentially to the radial flow (left and right panels), jet shower partons going toward the center of the QGP lose more energy and induce a stronger medium response as compared to those partons going away from the center. On the other hand, momenta of the thermal partons associated with the flowing medium will blow the medium response along the direction of the radial flow, as seen in the shape of the diffusion wake, which is tilted in the direction of the radial flow. The final distorted shape of the Mach cone is the result of competition between the temperature gradient and the radial flow. This distorted pattern is clearly seen in the left and right panels of Fig. 6, where both the Mach cone and the diffusion wake are asymmetrical with respect to the parton-shower propagation (or \(\gamma \)) direction. Observation of this asymmetric or distorted medium response in the final hadron distributions with different jet production positions will provide clear evidence of jet-induced Mach cones. In the following we will demonstrate that one can use a trained deep neural network to locate the initial jet production positions from the full information of final state hadrons inside the jet cone. Soft-hadron distributions in events classified by the neural network according to the initial jet production positions should show the expected asymmetry.

3.2 Jet-hadron correlation with engineered initial jet production positions

We can now use the trained neural network to classify \(\gamma \)-jet events from CoLBT-hydro simulations according to the positions of the initial jet production predicted by the neural network and look at the corresponding \(\gamma \)-hadron (or jet-hadron) correlations. Shown in Fig. 7 are scatter plots of jet production positions selected by the deep neural network (upper panels), the true production positions in the simulated events (middle panels) and the corresponding jet-hadron correlations in azimuthal angle for charged hadrons (lower panels) in 0–10\(\%\) Pb+Pb collisions at the colliding energy \(\sqrt{s}=5.02\) TeV. The baseline correlation in proton+proton collisions is subtracted. The transverse momenta of the prompt photon and jets are \(p_T^\gamma \in (200,250)\) GeV and \(p_T^{\textrm{jet}}>100\) GeV, respectively, with jet cone size \(R=0.4\). Their rapidities lie in \(\left| \eta _\gamma \right| <0.32\) and \(\left| \eta _{\textrm{jet}}\right| <0.75\). The transverse momenta of soft charged hadrons are restricted to \(p_T^h\in (1.0,2.0)\) GeV. The contour plots in the upper panels show the averaged temperature profile of the QGP background aligned to the reaction plane of the collisions. The photon (\(\gamma \)) is along the negative y direction and jet shower partons travel in the opposite direction.
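The correlation observable described here can be sketched as a per-event histogram in \(\Delta \phi = \phi ^{\textrm{jet}}-\phi ^{\textrm{h}}\) with the p+p baseline subtracted. The example below is a schematic illustration with hypothetical function names and an arbitrary binning; it assumes one entry per jet-hadron pair and ignores detector effects and the \(p_T^h\) selection.

```python
import numpy as np

def dphi_yield(jet_phi, hadron_phi, n_events, bins=32):
    """Per-event yield dN/d(dphi), with dphi = phi_jet - phi_h wrapped to [-pi, pi)."""
    dphi = (np.asarray(jet_phi) - np.asarray(hadron_phi) + np.pi) % (2.0 * np.pi) - np.pi
    counts, edges = np.histogram(dphi, bins=bins, range=(-np.pi, np.pi))
    width = edges[1] - edges[0]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return counts / (n_events * width), centers

def medium_modification(pbpb_yield, pp_yield):
    """Pb+Pb yield minus the p+p baseline, bin by bin."""
    return np.asarray(pbpb_yield) - np.asarray(pp_yield)

# Toy input: 100 jet-hadron pairs collected from 10 events (random, for illustration).
rng = np.random.default_rng(1)
yield_toy, centers = dphi_yield(np.zeros(100), rng.uniform(-np.pi, np.pi, 100), n_events=10)
print(yield_toy.sum() * (2.0 * np.pi / 32))  # integrates to pairs per event
```

Normalizing by the number of events and the bin width makes the Pb+Pb and p+p histograms directly comparable before the subtraction.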

It is not possible to determine the jet production position exactly in each event due to large fluctuations, even with the complete information of jet hadrons. For the same reason it is also difficult to locate the average jet production points precisely using the longitudinal and transverse gradient tomography. The neural network predictions are also approximate, as one can see here. Even though the true initial production points spread over a larger region than the network predictions, clipping the network-predicted jet production points into different sections will still localize the true production positions to the region of interest, as one can see by comparing the upper and middle panels of Fig. 7.
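The idea of selecting events by the predicted position and then checking where the true positions fall can be sketched as follows. This is a minimal illustration and not the analysis code: the function `localize`, the event layout and the toy numbers are all assumptions, and the region predicate is supplied by the user.

```python
def localize(events, in_region):
    """Select events by *predicted* position and summarize the *true* ones.

    events    : list of ((x_pred, y_pred), (x_true, y_true)), positions in fm
    in_region : callable (x, y) -> bool applied to the predicted position
    Returns (mean_x_true, mean_y_true, n_selected).
    """
    selected = [true for pred, true in events if in_region(*pred)]
    if not selected:
        raise ValueError("no events fall in the requested region")
    n = len(selected)
    mean_x = sum(x for x, _ in selected) / n
    mean_y = sum(y for _, y in selected) / n
    return mean_x, mean_y, n


# Toy input: two events predicted in the lower-left region, one elsewhere.
toy_events = [
    ((-2.0, -1.0), (-1.5, -0.5)),
    ((-3.0, -2.0), (-2.5, -1.5)),
    ((1.0, 1.0), (0.5, 2.0)),
]
mean_x, mean_y, n_sel = localize(toy_events, lambda x, y: x < -1.0 and y < 0.0)
```

If the selection localizes the events, the mean true position stays inside or near the requested region even though individual predictions are only approximate.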

Fig. 7

The initial jet positions in 0–10% central Pb+Pb collisions selected by the deep neural network (upper panels) as compared to the true initial jet production points in CoLBT-hydro simulations (middle panels) and the corresponding difference in \(\gamma \)-hadron yields as a function of \(\Delta \phi ^{\textrm{hjet}}=\phi ^{\textrm{jet}}-\phi ^{\textrm{h}}\) between Pb+Pb and p+p collisions (lower panels). See text for more details

For the network selected regions of initial jet production points in the right (Fig. 7d, h, l) and center-left panels (Fig. 7b, f, j), the \(\gamma \)-jet has to traverse (in the positive y-direction) the central region of the QGP medium with a longer path length where the temperature is also higher. This causes more jet energy loss and leads to a larger yield of soft hadrons from the medium response and induced radiation as compared to the cases in the left (Fig. 7a, e, i) and center-right panels (Fig. 7c, g, k) where jets are produced in the outer region and travel away from the center of the QGP with a shorter path length where the temperature is also lower. We should note that the enhanced soft hadrons in the peak region along the jet direction shown in the lower panels of Fig. 7 can come from both the Mach cone and medium-induced gluon radiation. It is difficult to distinguish these two sources of soft hadron enhancement in the measured distributions.

As one can see in Fig. 6, the diffusion wake associated with jet-induced Mach cones is just a depletion of the hydrodynamic background behind the propagating jet. This leads to a depletion of soft hadrons in the \(\gamma /Z^0\) direction in \(\gamma /Z^0\)-jet events. One can clearly see this depletion (yields are negative after subtraction of the p+p baseline) in Fig. 7j, l, where the effect of the medium response is strong. As pointed out in Ref. [45], mini-jets from multiple parton interactions (MPI) associated with the hard processes of \(\gamma /Z^0\)-jet production contribute a background to the jet-hadron correlation that is constant in azimuthal angle. These mini-jets also interact with the QGP medium and experience jet quenching in heavy-ion collisions. Quenching of these MPI mini-jets leads to an enhancement of soft hadrons from the MPI background, some of which remains even after the subtraction of the p+p baseline. This is the reason why one does not see a clear depletion due to the diffusion wake in Fig. 7i, where the effect of the medium response is weak. One can use a mixed-event method to estimate the enhanced MPI background in A+A collisions. The depletion of soft hadrons due to the diffusion wake will become stronger after the subtraction of the MPI background [45].
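The baseline subtraction behind the negative yields can be illustrated with a minimal sketch of a per-trigger azimuthal correlation. The function names, the event layout, the choice \(\Delta \phi =\phi _{\textrm{h}}-\phi _{\textrm{jet}}\) wrapped into \([-\pi /2, 3\pi /2)\) and the toy numbers are assumptions made for the illustration. The MPI background estimate with mixed events is not included.

```python
import math


def wrap_dphi(phi_h, phi_jet):
    """Delta phi = phi_h - phi_jet, wrapped into [-pi/2, 3*pi/2)."""
    return (phi_h - phi_jet + 0.5 * math.pi) % (2.0 * math.pi) - 0.5 * math.pi


def per_trigger_yield(events, n_bins=20):
    """Binned (1/N_trig) dN/dDeltaPhi for soft hadrons.

    events : list of (phi_jet, [phi_h, ...]), one entry per trigger
    """
    if not events:
        raise ValueError("need at least one trigger event")
    lo = -0.5 * math.pi
    width = 2.0 * math.pi / n_bins
    counts = [0.0] * n_bins
    for phi_jet, hadron_phis in events:
        for phi_h in hadron_phis:
            i = int((wrap_dphi(phi_h, phi_jet) - lo) / width)
            counts[min(i, n_bins - 1)] += 1.0  # guard against rounding at the edge
    norm = len(events) * width
    return [c / norm for c in counts]


def subtract_baseline(yield_aa, yield_pp):
    """Bin-by-bin difference between the A+A and p+p per-trigger yields."""
    return [a - p for a, p in zip(yield_aa, yield_pp)]


# Toy input: the A+A event has one hadron fewer near the photon side.
aa = [(0.0, [0.3, 3.0])]
pp = [(0.0, [0.3, 3.0, 3.0])]
diff = subtract_baseline(per_trigger_yield(aa, 4), per_trigger_yield(pp, 4))
```

In this toy case the bin containing \(\Delta \phi =3.0\) comes out negative, which is how a depletion relative to the p+p baseline shows up.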

As we have illustrated in Fig. 6, the shapes of the Mach cone and the diffusion wake are distorted by both the radial flow and the density gradient of the expanding QGP. These also influence the correlation of final soft hadrons with the jet. When the distribution of the initial jet production positions is symmetric with respect to the parton-shower propagation (or \(\gamma \)) direction, the jet-hadron correlation should also be symmetric with respect to the jet direction (\(\Delta \phi =0\)) as shown in Fig. 7a, e, i and b, f, j. In these symmetric jet-hadron correlations, the width of the correlation is broadened by both the wind of the radial flow and the slope of the density gradient. The broadening is larger for longer jet path lengths. The depletion due to the diffusion wake peaks in the direction of the trigger \(\gamma \) (\(\Delta \phi =\pi \)) as seen in Fig. 7j.

If the distribution of the initial jet production positions is asymmetric with respect to the jet direction, the parton-shower propagation is tangential to the radial flow most of the time. The Mach cone and diffusion wake will be distorted by the temperature gradient and radial flow and become asymmetric with respect to the jet direction as seen in Fig. 6. This asymmetry should also be reflected in the soft-hadron azimuthal angle distributions, as seen in Fig. 7c, g, k and d, h, l for \(\gamma \)-jet events in which the initial jet production locations are selected by the neural network such that the parton-shower propagation is mostly tangential to the radial flow. In these asymmetric jet-hadron angular correlations, hadron yields in the direction towards the central region of the QGP are enhanced because of the higher temperature of the medium, larger parton energy loss and stronger medium response. The diffusion wake is, on the other hand, both swept by the radial flow and deflected by the density gradient such that the angle of the minimum (the dip in the soft-hadron depletion) is shifted away from the \(\gamma \) direction \(\Delta \phi ^{\textrm{hjet}}=\pi \). The angular shift is larger when the diffusion wake is exposed more to the radial flow and density gradient, as in the case of Fig. 7d, h, l. The dip is shallower due to the smaller jet path length and weaker medium response in the case of Fig. 7c, g, k. Similarly, the position of the peak in the jet-hadron angular correlation is also shifted away from the jet direction (\(\Delta \phi =0\)) due to the influence of the radial flow and the density gradient. The shift is larger when the parton-shower propagation path length is longer. The asymmetric features in the measured jet-hadron correlations in events with neural-network-engineered jet production locations should be clear evidence of the jet-induced Mach cone and diffusion wake in heavy-ion collisions.
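The shift of the diffusion-wake dip away from \(\Delta \phi =\pi \) can be quantified from a binned correlation with a simple estimator. The function `dip_shift`, the search window and the toy numbers below are assumptions for illustration. A real analysis would likely fit the dip and propagate statistical uncertainties.

```python
import math


def dip_shift(bin_centers, yields, window=1.0):
    """Offset of the away-side minimum from Delta phi = pi.

    Only bins with |center - pi| < window are searched.
    Returns (position of the lowest bin) - pi, in radians.
    """
    candidates = [
        (y, c) for c, y in zip(bin_centers, yields) if abs(c - math.pi) < window
    ]
    if not candidates:
        raise ValueError("no bins inside the away-side window")
    _, c_min = min(candidates)  # lowest yield wins
    return c_min - math.pi


# Toy correlation whose dip sits at Delta phi = 2.8 rather than at pi.
centers = [2.4, 2.8, 3.2, 3.6]
values = [0.1, -0.3, -0.1, 0.0]
shift = dip_shift(centers, values)
```

A nonzero `shift` whose sign follows the side of the selected production region is the kind of asymmetry described above.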

3.3 3D structure of the diffusion wake

Since both jets and the Mach cones are 3-dimensional objects, it is interesting to examine the 3D structure of the signals of Mach cones and the diffusion wake as recently suggested in Ref. [49]. We can do so for \(\gamma \)-jet events with and without restricted regions of the initial jet production positions. Figure 8 shows the 2-dimensional jet-hadron correlations as a function of rapidity \(\Delta \eta =\eta _{\textrm{h}}-\eta _{\textrm{jet}}\) and azimuthal angle \(\Delta \phi =\phi _{\textrm{h}}-\phi _{\textrm{jet}}\) for soft charged hadrons with \(p_T=1-2\) GeV/c from \(\gamma \)-jets with \(p_T^\gamma \in (200,250)\) GeV/c, \(\left| \eta _\gamma \right| <0.32\) and \(p_T^{\textrm{jet}}>100\) GeV/c, \(\left| \eta _{\textrm{jet}} \right| <0.75\) in 0–10% central Pb+Pb collisions at \(\sqrt{s}=5.02\) TeV. The left plot is for \(\gamma \)-jet events averaged over all possible initial jet production positions with the jet going in the positive y direction (perpendicular to the reaction plane). There is a peak in the jet direction corresponding to soft hadrons from the Mach cone and the jet shower partons. This peak sits on top of a ridge-like (along the \(\Delta \phi \) direction at \(\Delta \eta =0\)) background from MPI. In the \(\gamma \) direction we see a shallow valley on top of the MPI ridge due to the depletion of soft hadrons by the diffusion wake. The minimum point of the diffusion wake valley is in the direction of the trigger \(\gamma \) at \(\Delta \phi =\pi \) and \(\Delta \eta =0\).

Fig. 8

Distributions in azimuthal angle \(\Delta \phi \) and rapidity \(\Delta \eta \) of charged particles with \(1\le p_T \le 2\) GeV from jet shower and medium response. Events are averaged over all jet production positions (left plot) or restricted to region \((x<-1, y<-1)\) using a deep neural network (right plot)

We now restrict the initial jet production positions to \((x<-1, y<0)\), with the parton shower traveling in the positive y direction, using the deep neural network predictions. This scenario is very similar to that in the right panels of Fig. 7 (d, h, l), where the parton shower traverses approximately one half of the QGP in a direction that is mostly tangential to the radial flow. The structure of the Mach cone and diffusion wake in the 2D jet-hadron correlation function is much more pronounced, as shown in the right plot of Fig. 8. Restricting the initial jet production positions this way forces the jet to traverse longer path lengths on average, lose more energy and induce a stronger medium response. The peak in the jet direction is higher than in the left plot. The peak position is also shifted away from the jet direction (difficult to see here because the tip of the peak is cut off by the limited range of the vertical axis of the plot) due to the radial flow and density gradient. The diffusion wake valley in the \(\gamma \) direction is much deeper, with mostly negative yields (see footnote 1). Since the parton-shower propagation direction is mostly tangential to the radial flow and the path is off the center of the QGP in this case, the diffusion wake is swept along the radial flow and drifted by the density gradient away from the \(\gamma \) direction. This leads to a diffusion wake valley whose minimum point (the dip) is shifted away from the \(\gamma \) direction (\(\Delta \phi =\pi \)), as we see in the right plot. This 3D structure, enhanced by neural network selection of the initial jet production positions, is much more dramatic than the plain azimuthal angle dependence of jet-hadron correlations.

Fig. 9

\(p_T\) distributions of \(\gamma \)-jets in p+p (solid black) and 0–10% central Pb+Pb collisions at \(\sqrt{s}=5.02\) TeV with different path lengths selected by the deep neural network. The initial \(\gamma \)-jet configurations are generated by PYTHIA8 with \({\hat{p}}_{T\textrm{min}}=200\) GeV/c

3.4 The \(\gamma \)-jet asymmetry

Another application of the deep learning assisted jet tomography is to study the path-length dependence of jet energy loss. Using the predictions of the initial jet production positions by the neural network from the information of jet hadrons, one can classify events with different regions of the initial jet production positions and therefore different average parton-shower propagation lengths. One can then extract the path-length dependence of the jet modifications. Shown in Fig. 9 are jet \(p_T\) distributions for \(\gamma \)-jets that propagate in the positive y direction (perpendicular to the reaction plane) with different average path lengths as selected by the deep neural network in 0–10% Pb+Pb collisions at \(\sqrt{s}=5.02\) TeV. The solid-black line is the baseline in proton+proton collisions without jet-medium interaction. The blue-dashed line is for jets with \(y>1\) that are produced close to the surface of the QGP. They have on average short path lengths and small jet energy loss. The corresponding jet distribution is slightly modified from the p+p baseline, with the peak shifting slightly to smaller \(p_T\), indicating a small jet energy loss. If we constrain the predicted jet production positions to \(y \in [-1, 1]\) (dash-dotted-green line) and \(y < -1\) (dotted-red line), the average path lengths become increasingly longer, with larger jet energy loss. The corresponding jet \(p_T\) distributions therefore shift to smaller \(p_T\) as shown. These trends of the \(\gamma \)-jet asymmetry are clear evidence for the power of the deep learning assisted jet tomography, which can be used in the future for more detailed studies of jet quenching and jet-induced medium response.
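The classification into path-length classes by the predicted \(y\) can be sketched as below. The three intervals are those quoted above. The labels, the function names, the assumption that positions are in fm and the toy \(p_T\) values are illustrative only, not results.

```python
def path_length_class(y_pred: float) -> str:
    """Map the predicted production y (jet moving toward +y) to a class.

    y > 1         : produced near the exit surface, short average path
    -1 <= y <= 1  : intermediate average path
    y < -1        : far side of the medium, long average path
    """
    if y_pred > 1.0:
        return "short"
    if y_pred >= -1.0:
        return "medium"
    return "long"


def mean_jet_pt_by_class(events):
    """Average jet pT per path-length class.

    events : list of (y_pred, pt_jet) with pt_jet in GeV
    """
    sums = {}
    for y_pred, pt_jet in events:
        key = path_length_class(y_pred)
        total, n = sums.get(key, (0.0, 0))
        sums[key] = (total + pt_jet, n + 1)
    return {key: total / n for key, (total, n) in sums.items()}


# Toy numbers chosen only to exercise the code.
toy = [(2.0, 180.0), (1.5, 190.0), (0.0, 170.0), (-2.0, 150.0)]
means = mean_jet_pt_by_class(toy)
```

Comparing the per-class mean (or the full \(p_T\) distribution) against the p+p baseline then gives a handle on how the energy loss grows with the average path length.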

4 Summary

In this paper, we have developed a deep learning assisted jet tomography and applied the technique to study jet-induced Mach cones and the associated diffusion wake in high-energy heavy-ion collisions. The deep learning assisted jet tomography is based on a point cloud neural network and allows one to predict the initial jet production locations using information on final-state hadrons with \(p_T>\) 2 GeV/c inside a jet in each event. The neural network is trained with data from CoLBT-hydro model simulations of parton-shower propagation and medium response in high-energy heavy-ion collisions. The network is also tested against pseudo-data from the LIDO model with high accuracy. The root-mean-square deviation between the network prediction and the true jet production position is about 2.2–2.4 fm. This network therefore allows one to select events with approximate regions of the initial jet production locations inside the overlap transverse area of two colliding nuclei with known parton-shower propagation direction relative to the radial flow.

Using the deep learning assisted jet tomography, we have investigated the signals of jet-induced Mach cones and the associated diffusion wake in the final jet-hadron correlation for soft hadrons from CoLBT-hydro simulations of \(\gamma \)-jet events in high-energy heavy-ion collisions. By restricting the initial jet production positions to different regions of the nuclear overlap area, we can enhance the signals of the Mach cone and diffusion wake and investigate the influence of the radial flow and density gradient. When the initial jet production positions are symmetric with respect to the jet direction, the jet-hadron azimuthal correlation is shown to be broadened by the wind of the radial flow and drifted diffusion due to the density gradient. The width of the azimuthal correlation and the soft hadron yield are shown to increase with the average jet path length when we vary the restricted region of the initial jet production location. For a long jet path length, a clear negative dip due to the diffusion wake is observed behind the conical shock front. If the initial jet production positions are restricted to only one half of the nuclear overlap area, the final jet-hadron correlation becomes asymmetric with respect to the jet direction due to the distortion of the Mach cone and diffusion wake by the radial flow and density gradient. The Mach-cone peak in the jet-hadron correlation is shifted away from the jet direction and the dip of the diffusion wake is also shifted away from the \(\gamma \) direction. Both of these shifts are driven by the blowing wind of the radial flow and drifted diffusion due to the density gradient. We also applied the deep learning assisted jet tomography to look at the signals of the Mach cone and diffusion wake in the 2-dimensional jet-hadron correlation in both rapidity and azimuthal angle, where they are more dramatic than in the azimuthal angle jet-hadron correlation alone. Observing these phenomena in experimental data will provide direct evidence for the existence of Mach cones in heavy-ion collisions and their distortion by the radial flow and density gradient.

We also illustrated how one can use the jet tomography to extract the path-length dependence of jet energy loss from the measured \(\gamma \)-jet \(p_T\) distributions. These are just two examples of how one can use the deep learning assisted jet tomography to study jet quenching and jet-induced medium response in high-energy heavy-ion collisions.