Identifying the nature of the QCD transition in relativistic collision of heavy nuclei with deep learning

Using a deep convolutional neural network (CNN), the nature of the QCD transition can be identified from the final-state pion spectra of hybrid model simulations of heavy-ion collisions that combine a viscous hydrodynamic model with a hadronic cascade “after-burner”. Two different types of equations of state (EoS) of the medium are used in the hydrodynamic evolution. The resulting spectra in transverse momentum and azimuthal angle are used as the input data to train the neural network to distinguish the different EoS. Different scenarios for the input data are studied and compared in a systematic way. A clear hierarchy is observed in the prediction accuracy when using the event-by-event, cascade-coarse-grained and event-fine-averaged spectra as input for the network, which are about 80%, 90% and 99%, respectively. A comparison with the prediction performance of a deep neural network (DNN) using only the normalized pion transverse momentum spectra is also made. High-level features of pion spectra captured by a carefully trained neural network are found to be able to distinguish the nature of the QCD transition even in a simulation scenario which is close to the experiments.


Introduction
The dynamics of the strong interactions between quarks and gluons, governing the properties of hot and dense nuclear matter, can be described by the theory of QCD. It predicts that, if the temperature of strongly-interacting matter becomes large enough, a new state of matter is formed in which quarks and gluons can roam freely and are not confined in the hadrons anymore. This state of matter is called the quark-gluon plasma (QGP). Lattice QCD has established that the transition from a hadron gas to the QGP is a smooth crossover at a high temperature $T \sim 140-180$ MeV and low net baryon density [1][2][3]. A variety of theoretical models, such as the Dyson-Schwinger equations model [4][5][6][7], the (Polyakov loop-) Nambu-Jona-Lasinio model [8][9][10][11][12] and the quark-meson coupling model [13][14][15], also predict the existence of a first-order phase transition that occurs at low temperature and moderate to large net baryon densities.
Relativistic heavy-ion experiments have been carried out at the SIS18 [16], at the AGS [17] and at the SPS [18] in the fixed target mode and at the Relativistic Heavy Ion Collider (RHIC) [19] as well as at the Large Hadron Collider (LHC) [20] in the collider mode. The forthcoming Facility for Anti-proton and Ion Research (FAIR) [21,22] and the Nuclotron-based Ion Collider fAcility (NICA) [23] will provide unprecedented intensities and luminosities for future studies. The main goal of these large experiments is to search for signals of the QCD phase transition and to study the properties of the QGP in nucleus-nucleus collisions. A strategy to identify the signals of the QGP is to compare sophisticated model simulations with varying parameter sets and different equations of state (with and without a phase transition) with experimental data such as particle spectra and correlation functions.
Email address: yldu@fias.uni-frankfurt.de (Yi-Lun Du)
Currently some observables, for example, anisotropic flow [24][25][26][27], directed flow [28,29] and fluctuations of particle multiplicities [30][31][32][33], are conjectured to be most sensitive to the appearance of a phase transition. Furthermore, modern statistical methods are used to perform a global fit of a set of different observables with Bayesian methods [34,35]. However, no direct information has been obtained so far on the bulk properties of the QCD medium, such as the EoS and the transport coefficients.
The great advantage of the deep learning (DL) method over conventional ones is its ability to extract hidden features from highly dynamical, rapidly evolving and complex non-linear systems, such as relativistic heavy-ion collisions. Conventional observables rely on human experience and are prone to bias, while machine learning (ML) methods can be used to identify deeply hidden signals and to construct better observables for the classification task at hand. Recently, a deep CNN classifier was developed as an effective "EoS-meter", an excellent tool for revealing the nature of the QCD transition with a high predictive accuracy of $\sim 95\%$ from hadron spectra in a pure hydrodynamic study [38].
The present work studies the performance of a CNN to identify the EoS, trained and tested with hadron spectra from a more realistic simulation of heavy-ion collisions. The generalizability of the method is explored by considering well established dynamics in the state-of-the-art simulation models. First of all, the hadronic rescattering, after the hydrodynamic evolution, is taken into account in the simulation via a hadronic cascade. Consequently, the event-by-event final-state pion spectra are discrete instead of smooth as in hydrodynamic simulations. Secondly, the resonance decays are included, which also contribute to the pion spectra. Due to the finite number of particles, the discrete event-by-event pion spectra will have significant fluctuations that might overwhelm the correlations one is looking for. We will develop modified DL-tools with CNN to identify the EoS in this more complex and more realistic dynamic scenario.
This paper is organized as follows: Sec. 2 introduces the hybrid simulation model. Sec. 3 discusses the neural network and the methods of the data pre-processing. Sec. 4 presents the performance of the trained CNN in different scenarios and comparisons with that of a fully-connected deep neural network (DNN). Finally, Sec. 5 summarizes the results and gives the conclusions. Appendix A gives the details of the neural network structure. Appendix B shows the simulated data and the predictive performance on testing datasets by the trained neural network. Appendix C visualizes the training datasets in Appendix B with traditional observables.

Micro-Macro hybrid model of relativistic heavy-ion collisions
The modeling of relativistic heavy-ion collisions is mostly done by following a "standard prescription" for the spatiotemporal evolution of the collision dynamics. The initial state of the matter right after the violent collision is described by the "color glass condensate", which consists of frozen primordial gluons and is assumed to isotropize within 1 fm/c [65][66][67][68][69]. These gluons may evolve rapidly in accordance with the classical Yang-Mills equation. A few fm/c later, they can achieve approximate local thermal equilibrium [70,71] and may exist briefly as a Yang-Mills gluon plasma, which may quickly expand nearly isentropically due to the high initial temperature. The total entropy and energy are not yet distributed over quark-antiquark degrees of freedom. Subsequently, quarks are produced by gluon-gluon collisions [66][67][68][69], forming a strongly coupled quark-gluon plasma (sQGP). The dynamical evolution of that QGP can be described approximately by macroscopic dissipative hydrodynamics [72][73][74][75][76][77]. Viscous corrections are included to describe some of the remaining deviation from local isotropy and thermal equilibrium. The EoS of the hot QGP medium, the constitutive element used to close the hydrodynamic equations, is one crucial input. As the medium expands and cools quasi-isentropically, the quark-gluon fluid will go through a smooth crossover or, hypothetically in this work as a control experiment, a first order phase transition. The nature of the QCD transition strongly affects the hydrodynamic evolution [78]. Different forms of transitions are associated with different pressure gradients, which consequently lead to different expansion rates. As the matter becomes more dilute, it will form an expanding non-equilibrium hadronic matter with important final state effects. For instance, final absorption of the products of the resonance decays in the hadronic matter can substantially change the yields of the hadrons observed by the experimental detectors. This evolution of the hadronic matter can be successfully described by microscopic hadron cascade models [79][80][81].
To generate the data for the training of the CNN, we use the iEBE-VISHNU hybrid model [82], which can perform event-by-event simulations of relativistic heavy-ion collisions at different energies. Major components of this hybrid model include an initial condition generator (SuperMC), a (2+1)D second-order event-by-event viscous hydrodynamic simulator (VISHNew), a particle sampler (iSS) and a hadron cascade "afterburner" simulator (UrQMD).
This hybrid model uses either the Monte-Carlo Glauber (MC-G) [83][84][85] or the Monte-Carlo Kharzeev-Levin-Nardi (MCKLN) [86,87] model to generate the fluctuating initial conditions in the SuperMC module. The collision centrality can be set up as needed, based on the assumption that, on average, the final charged hadron multiplicity, $dN_{ch}/dy$, is directly proportional to the initially produced total entropy in the transverse plane, $dS/dy|_{y=0}$. The effect of viscous heating will cause a spread in the final $dN_{ch}/dy$, which is considered small (2-3%) for a given $dS/dy|_{y=0}$.
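The centrality selection described above can be illustrated with a minimal sketch, assuming that each simulated event carries its initial entropy $dS/dy|_{y=0}$ and that centrality is defined as the percentile rank of that entropy (0% being the most central). The function names and the percentile convention are assumptions made for illustration and are not taken from the SuperMC code.

```python
import numpy as np

def centrality_percentile(dS_dy):
    """Centrality in percent for each event; 0% is the largest entropy."""
    dS_dy = np.asarray(dS_dy, dtype=float)
    # Sort events from largest to smallest initial entropy.
    order = np.argsort(-dS_dy, kind="stable")
    ranks = np.empty(len(dS_dy), dtype=int)
    ranks[order] = np.arange(len(dS_dy))
    return 100.0 * ranks / len(dS_dy)

def select_centrality_bin(dS_dy, low, high):
    """Indices of events whose centrality lies in [low, high) percent."""
    c = centrality_percentile(dS_dy)
    return np.flatnonzero((c >= low) & (c < high))
```

Under the proportionality assumption stated in the text, ranking events by initial entropy is equivalent, on average, to ranking them by the final charged hadron multiplicity.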
The simulation with the hydrodynamic package VISHNew uses two different EoSs: (1) the crossover type EoS, based on a lattice-QCD parametrization [88], denoted as L-EOS; (2) the first order type EoS with a Maxwell construction [89] between a hadron resonance gas and an ideal gas of quarks and gluons, denoted as Q-EOS. The transition temperature is $T_c$ = 165 MeV. These two EoSs are depicted in Fig. 1.
After the hydrodynamic evolution, the fluid fields are projected via the Cooper-Frye formula into particles, which will then be further propagated in a hadronic cascade, the Ultrarelativistic Quantum Molecular Dynamics (UrQMD) model. In UrQMD, a non-equilibrium transport model, resonance decays and hadronic rescatterings are included in the simulation. In contrast to the hydrodynamic evolution, which is governed by the conservation of energy and momentum with the EoS, shear viscosity $\eta$ and bulk viscosity $\xi$, particles are assumed to be in asymptotic states and the trajectories are given by straight lines between the collisions in the hadronic cascade. The hadronic cascade evolution is not deterministic since the processes involve certain randomness, e.g., scattering angle, scattering probability and decay probabilities. Furthermore, the effects of the finite number of particles, i.e., thermal fluctuations, are included since the cascade propagates the discrete particles instead of the average densities.
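For reference, the two relations invoked in this paragraph can be written in their standard textbook form. The notation is an assumption made here for illustration: $T^{\mu\nu}$ is the energy-momentum tensor of the fluid, $\Sigma$ the switching hypersurface with normal element $d^3\sigma_\mu$, $g_i$ the degeneracy of hadron species $i$, and $f_i(x,p)$ its phase-space distribution, which in a viscous calculation includes a correction to the equilibrium form.

```latex
% Energy-momentum conservation governing the hydrodynamic stage
\partial_\mu T^{\mu\nu} = 0 ,
\qquad
% Cooper-Frye formula: momentum spectrum of species i on the hypersurface \Sigma
E \frac{dN_i}{d^3p} = \frac{g_i}{(2\pi)^3} \int_\Sigma p^\mu \, d^3\sigma_\mu(x) \, f_i(x,p)
```

Integrating the right-hand side numerically yields smooth spectra, whereas sampling discrete particles from it, as done before the cascade, introduces the finite-number fluctuations discussed in the text.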
This hybrid model with some adjustable parameters can fit experimental data on final hadron spectra. These parameters include: the equilibration time $\tau_0$, which defines the point when the local thermal equilibration is reached and the hydrodynamic evolution starts, the ratio of the shear viscosity to the entropy density $\eta/s$, and the freeze-out temperature $T_{sw}$, which defines the switch from the hydrodynamic evolution to the hadronic cascade.
We vary the model parameters in the generation of the training datasets to allow the neural network to capture the intrinsic features encoded in the EoS, instead of those biased by the specific setup of other physical uncertainties. This would require simulating many events for hundreds of different parameter combinations and centrality selections, to make sure that the neural network gains sufficient generalizability. However, in practice this is impossible. Hence we focus on systematic changes of these parameters and study the performance of the network when it reaches the boundary of these parameter values.

Neural network and data pre-processing
In Ref. [38], the DL-tool engine with a CNN has been shown to classify successfully the EoS in pure hydrodynamical simulations, on an event-by-event basis with $\sim 95\%$ accuracy. To apply this strategy to real experimental data, it is crucial to perform realistic simulations with a hadronic "after-burner" and resonance decays. In the present paper, the DL-tool engine is constructed for more realistic simulations of heavy-ion collisions. The CNN architecture used here is similar to that discussed in Ref. [38]. We refer to that paper for technical details. An introduction to this new CNN network is presented in detail in Fig. A.4 in Appendix A.
In general, training or learning algorithms benefit a lot from pre-processing of the datasets. The input to the neural network used here, the pion spectra $\rho(p_T, \Phi)$, is a 24 × 24 matrix. One refers to each matrix element as one "feature" and each matrix as one "sample". The pre-processing of the input data can be applied in a feature-wise (per feature) or sample-wise (per input sample) manner.
In the feature-wise standardization, the inputs $\rho(p_T, \Phi)$ of all the training samples are pre-processed in a sample-interdependent manner. For each feature, the mean over all training samples is subtracted and the result is divided by the corresponding standard deviation. In this way, all features are centered around zero and have variances of the same order. This prevents one feature with a larger variance from dominating the objective function over other features. The transformation is saved and then applied to the testing samples. With this standardization, the testing data should be simulated in one of the same collision systems as the training data, since the multiplicities in different collision systems differ a lot.
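A minimal sketch of the feature-wise standardization follows, assuming the spectra are stored as a NumPy array of shape (number of samples, 24, 24). The function names and the guard against zero-variance bins are assumptions added for illustration; the text does not specify how empty bins are treated.

```python
import numpy as np

def fit_featurewise(train):
    """Per-feature mean and standard deviation over all training samples."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Assumption: bins that never vary are left unscaled to avoid division by zero.
    std = np.where(std > 0, std, 1.0)
    return mean, std

def apply_featurewise(spectra, mean, std):
    """Apply the saved training transformation to any set of spectra."""
    return (spectra - mean) / std
```

The statistics are computed once on the training set and reused unchanged for the validation and testing samples, which is why the testing data need to come from a collision system with comparable multiplicities.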
In the sample-wise standardization, or min-max normalization, the inputs $\rho(p_T, \Phi)$ are pre-processed in a sample-independent manner. Each 24 × 24 matrix can be rescaled to have a zero mean and a unit variance, or to a specific range, such as $[-\frac{1}{2}, \frac{1}{2}]$, respectively. The latter choice is used in Ref. [38] with success.
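The two sample-wise options can be sketched in the same way, again assuming an array of shape (number of samples, 24, 24). The function names and the handling of constant matrices are illustrative assumptions.

```python
import numpy as np

def samplewise_standardize(spectra):
    """Rescale each 24 x 24 matrix to zero mean and unit variance."""
    mean = spectra.mean(axis=(1, 2), keepdims=True)
    std = spectra.std(axis=(1, 2), keepdims=True)
    # Assumption: a constant matrix is only centered, not scaled.
    return (spectra - mean) / np.where(std > 0, std, 1.0)

def samplewise_minmax(spectra, low=-0.5, high=0.5):
    """Rescale each 24 x 24 matrix linearly to the range [low, high]."""
    lo = spectra.min(axis=(1, 2), keepdims=True)
    hi = spectra.max(axis=(1, 2), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    return low + (high - low) * (spectra - lo) / span
```

Because each matrix is rescaled using only its own entries, no statistics need to be saved from the training set, in contrast to the feature-wise case.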
Our training results show that the feature-wise standardization always performs better than the two sample-wise methods. Hence we will show in the following only the results of the feature-wise standardization.

Training and testing results
A systematic analysis of the performance of the CNN described above is presented for hybrid modeling of relativistic heavy-ion collisions. Here an important aspect is the generalizability of the trained CNN model in the testing stage. Overfitting of the network to the training data is checked on the validation data, which are generated with the same physical parameter sets as the training data. The testing is performed on the testing datasets, which are generated with physical parameter sets different from those of the training data. The generalizability of the CNN model with respect to different physical parameter sets is studied systematically. In the previous study with pure hydrodynamics [38], the training data are generated with a viscous (3+1)D hydrodynamics model, CLVisc [76], with AMPT initial conditions [90], while the testing data are generated with a viscous (2+1)D hydrodynamics model, VISHNew, with Monte-Carlo Glauber initial conditions; the latter are used within a hybrid model in this work to generate the training data instead. However, here we find that, even in the pure hydrodynamic study, reversing the simulation models for training and testing data generation yields a testing accuracy of only about 70%, from which we suspect some superiority of the (3+1)D hydrodynamics model with AMPT initial conditions over other ones. Thus in this work, we are not able to discuss the generalizability of the CNN model with respect to different hybrid simulation models.

Hybrid model with late transition to cascade
The CNN in the previous study [38] was directly trained using primordial pion spectra, obtained from a numerical integration of the Cooper-Frye formula over the freeze-out hypersurface in the hydrodynamics. In such a scenario, one neglects the fluctuations due to the finite number of hadrons. In addition, a significant portion of pions originating from resonance decays also needs to be taken into account. In this section, we study the influence of the aforementioned effects on the predictive power of the CNN. To see the influence of the finite number of particles and resonance decays, we first assume a late transition from hydrodynamics to the UrQMD cascade by setting the switching temperature to the same value as the hydrodynamic freeze-out temperature used in Ref. [38], $T_{sw}$ = 137 MeV. In this scenario, the duration and influence of the hadronic cascade are significantly diminished and we are left with the effects of the finite number of particles and resonance decays as compared to the pure hydrodynamic modeling.

Event-by-event input, switch at $T_{sw}$ = 137 MeV
In this scenario, the event-by-event pion spectra $\rho(p_T, \phi)$ are taken as the input to the CNN. 12 training datasets are generated by the iEBE-VISHNU hybrid model with the fluctuating MC-Glauber initial condition and 6 different fine centrality bins with 1% width in the centrality range 0-60% in two collision systems, respectively. We set the ratio of the shear viscosity to entropy density as $\eta/s$ = 0.08 and 0.00, and the equilibration time as $\tau_0$ = 0.5 and 0.4 fm/c in the collision systems Pb+Pb at $\sqrt{s_{NN}}$ = 2.76 TeV and Au+Au at $\sqrt{s_{NN}}$ = 200 GeV, respectively. The validation accuracy is found to be about 83.5% after 1000 epochs of training. This validation accuracy indicates that high-level correlations are extracted from the two-dimensional pion spectra $\rho(p_T, \phi)$ to identify the EoS. However, it is significantly lower than that in the pure hydrodynamic modeling [38], where a validation accuracy up to 99% was obtained. This implies that the fluctuations due to the finite number of particles and resonance decays overwhelm some of the correlation information from the early dynamics in the final-state particle spectra and thus result in an "overlap" between the two types of event-by-event spectra with different EoSs, which hinders the discrimination between them.

Cascade-coarse-grained input, switch at $T_{sw}$ = 137 MeV
To mitigate the effect of fluctuations due to the finite number of particles and resonance decays, we average the pion spectra over a certain number of events. In the model simulations one can repeat the hadronic cascade any number of times for the same hydrodynamic evolution. Then the pion spectra averaged over these simulations are taken as the input for training, which will be called "cascade-coarse-grained input". We would like to find out whether such an event averaging will improve the network performance due to the statistics enhancement or worsen it due to the information loss.
In this scenario, 2 training datasets are generated by the iEBE-VISHNU hybrid model with the fluctuating MC-Glauber initial condition in the centrality range 0-50%. The details are shown in Tab. B.3 in Appendix B. In total, 15747 events are generated with two different EoSs. The hadronic cascade is repeated 30 times after each hydrodynamic evolution. The spectra averaged over these 30 events are taken as the input to the network. The validation accuracy with these cascade-coarse-grained spectra $\rho_c(p_T, \phi)$ can reach about 92%. One can see that such averaging over the cascade stage is beneficial in identifying the EoS information in the early dynamics from the final-state particle spectra. This means that the statistics matters a lot when using particle spectra to decode the EoS information.
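The cascade-coarse-graining step can be sketched as follows, assuming every cascade run is stored as a 24 × 24 histogram together with an integer identifier of the hydrodynamic event it was sampled from. The function name and the `hydro_ids` bookkeeping are hypothetical and only illustrate the averaging described above.

```python
import numpy as np

def cascade_coarse_grain(spectra, hydro_ids):
    """Average spectra over all cascade runs that share a hydro event id."""
    spectra = np.asarray(spectra, dtype=float)      # shape (n_runs, 24, 24)
    hydro_ids = np.asarray(hydro_ids).ravel()       # shape (n_runs,)
    unique_ids, inverse = np.unique(hydro_ids, return_inverse=True)
    sums = np.zeros((len(unique_ids),) + spectra.shape[1:])
    # Accumulate every run into the slot of its parent hydro event.
    np.add.at(sums, inverse, spectra)
    counts = np.bincount(inverse, minlength=len(unique_ids))
    return unique_ids, sums / counts[:, None, None]
```

With 30 cascade runs per hydrodynamic event, as in the text, each output matrix is the mean of 30 histograms and the statistical fluctuation per bin is reduced accordingly.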

Event-fine-averaged input, switch at $T_{sw}$ = 137 MeV
One drawback of the above averaging procedure is that the separation of the collision dynamics into a hydrodynamic and a hadronic cascade stage is purely theoretical. Thus, from a realistic point of view, an averaging procedure based on experimentally controllable event filtering is preferable. In this scenario, spectra are averaged within the same fine centrality bin (with 1% width) instead, which will be called "event-fine-averaged input" in the following. To be specific, we average the spectra of 30 random events within the same fine centrality bin in Tabs. B.1 and B.2 as the input to the network to accumulate the statistics. Compared with the corresponding event-by-event observables in Fig. C.5, the fluctuations are significantly reduced in the 30-events-fine-averaged spectra. This manner of averaging reduces the fluctuations from the initial conditions in addition to those from the hadronic cascade and resonance decays. Consequently, a surprisingly clear improvement of the CNN performance in classifying the two types of EoS is obtained. The validation accuracy reaches about 99% with the 30-events-fine-averaged spectra $\rho_a(p_T, \phi)$ after 1000 epochs of training, a value similar to that in the pure hydrodynamic case [38]. In principle, one can include more datasets generated in different fine centrality bins for training. However, we confirm that it is enough to use the datasets simulated only in 6 representative fine centrality bins as in Tabs. B.1 and B.2, respectively, for training, since the predictive performances on the datasets simulated in other unselected fine centrality bins are as high as the training accuracy. This demonstrates that non-trivial high-level correlations which are independent of the centrality bins are learned by the neural network.
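A minimal sketch of the event-fine-averaging is given below. It assumes that the events of one dataset (a single EoS and parameter set) are stored as 24 × 24 histograms with an integer label for their fine centrality bin, and that the 30-event groups are disjoint random subsets. The text only states that 30 random events from the same bin are averaged, so the disjoint grouping, the function name and the dropped remainder are assumptions.

```python
import numpy as np

def event_fine_average(spectra, centrality_bin, group_size=30, seed=0):
    """Average disjoint random groups of events within each centrality bin."""
    rng = np.random.default_rng(seed)
    spectra = np.asarray(spectra, dtype=float)
    centrality_bin = np.asarray(centrality_bin)
    averaged, labels = [], []
    for b in np.unique(centrality_bin):
        # Shuffle the events of this bin, then cut them into groups.
        idx = rng.permutation(np.flatnonzero(centrality_bin == b))
        n_groups = len(idx) // group_size  # incomplete remainder is dropped
        for g in range(n_groups):
            members = idx[g * group_size:(g + 1) * group_size]
            averaged.append(spectra[members].mean(axis=0))
            labels.append(b)
    return np.array(averaged), np.array(labels)
```

Unlike the cascade-coarse-graining, the grouping here uses only the centrality label, which is accessible in an experiment, and it also averages over fluctuations of the initial conditions.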
After the training with validation, the trained network is confronted with the testing data, which are generated with different physical parameter sets in the simulations to explore the network's generalizability. In Tabs. B.4 and B.5 in Appendix B we show the predictive performance of the neural network trained with the 30-events-fine-averaged spectra. A testing accuracy of 95% on average is obtained on the testing data simulated in the centrality range 0-50% with MC-Glauber or MCKLN initial conditions. This demonstrates that the trained neural network is robust against different model setups such as initial conditions, $\tau_0$, $\eta/s$ and $T_{sw}$ in the range [130, 142] MeV. We observe a slight centrality dependence of the predictive accuracy in the collision system Pb+Pb at $\sqrt{s_{NN}}$ = 2.76 TeV, which decreases for more peripheral events.

A hierarchy of the accuracy in the above scenarios
Fig. 2 shows the training and validation accuracy (upper panel) and loss (lower panel), respectively, of the CNN with the same setup for the first 1000 epochs in the three aforementioned scenarios. In each scenario, training and validation accuracy (loss) are still close after 1000 epochs of training, which implies that over-fitting is avoided. Besides, the network has not been sufficiently trained in the cascade-coarse-grained scenario after 1000 epochs as the accuracy (loss) is still increasing (decreasing).
A clear hierarchy of the prediction accuracy is observed when the averaging is performed over more and more stages of the simulated dynamics. The CNN with event-by-event spectra gives the lowest accuracy, while the one with the 30-events-fine-averaged spectra gives the highest one, which is as high as in the pure hydrodynamic study [38].

Hybrid model with early transition to cascade
The scenario with an early transition from hydrodynamics to the hadronic cascade in hybrid modeling is in accordance with a widely used choice of the switching temperature $T_{sw}$ > 150 MeV. This scenario is different from the one discussed in the previous subsection in two aspects. Firstly, the higher switching temperature decreases the contribution from the primordial pions which are directly emitted from the hydrodynamic evolution, and increases the contribution from resonance decays. Secondly, the elongated duration of the hadronic cascade stage may further blur out the imprint of the phase transition encoded in the final-state particle spectra. In the following, we will study how a higher switching temperature affects the performance of the CNN in the three aforementioned scenarios, respectively.

Event-by-event input, switch at $T_{sw}$ > 150 MeV
The validation accuracy is found to be about 78% for the CNN trained with these event-by-event spectra as input. This validation accuracy is lower than that in the scenario with late transition (switching temperature $T_{sw}$ = 137 MeV). This decrease in the validation accuracy can be understood as a result of the increased contribution from resonance decays and the elongated duration of the hadronic rescattering.

Cascade-coarse-grained input, switch at $T_{sw}$ > 150 MeV
In this scenario, the cascade-coarse-grained pion spectra $\rho_c(p_T, \phi)$ are taken as the input to the CNN. 2 training datasets are generated in analogy to the previous late transition case, by the iEBE-VISHNU hybrid model with the fluctuating MC-Glauber initial condition in the centrality range 0-50% with the hadronic cascade simulated 30 times individually after each hydrodynamic evolution. The switching temperature $T_{sw}$ is set to be 155 or 160 MeV. The details are shown in Tab. B.8 in Appendix B. About 24000 events with two different EoSs are generated in total. The validation accuracy is found to be 87.5% at most, which is also lower than that in the previous scenario with late transition to cascade.
4 testing datasets are generated in this scenario as shown in Tab. B.9 in Appendix B in the centrality range 0-50%. Both MC-Glauber and MCKLN initial conditions are used, and simulation parameters are varied from the training datasets to check the generalizability of the CNN. After training and validating the neural network, the testing accuracy on these datasets is 83% on average, which is slightly lower than the validation accuracy.

Event-fine-averaged input, switch at $T_{sw}$ > 150 MeV
In this scenario, the use of the 30-events-fine-averaged spectra for training is explored with the switching temperature $T_{sw}$ = 160 MeV. This input is generated by averaging the spectra of 30 independent events within the same fine centrality bins (with 1% width) shown in Tabs. B.6 and B.7. The validation accuracy can also reach up to 99% in this scenario, as in the previous late transition one. The testing accuracy is up to 95% on average on the testing datasets as shown in Tab. B.10 in Appendix B. We also observe a slight centrality dependence of the predictive accuracy in the collision system Au+Au at $\sqrt{s_{NN}}$ = 200 GeV, which decreases for more peripheral events.

Comparison with fully-connected deep neural network
As already discussed in subsection 4.1, the event-by-event and 30-events-fine-averaged normalized $p_T$ spectra and elliptic flow $v_2$ with two different EoSs from all centrality bins in Tabs. B.1 and B.2, as shown in Figs. C.5 and C.6, respectively, are non-distinguishable within the range of event-by-event fluctuations. However, one can observe that the peaks of the normalized $p_T$ spectra with Q-EOS are higher than those with L-EOS on the whole. One can see that the all-events-fine-averaged normalized $p_T$ spectra are distinguishable by the human eye. Their derivatives are also helpful to distinguish the EoS in certain $p_T$ bins, which might lead us to construct novel observables from normalized $p_T$ spectra in the future. Inspired by this observation, we use the normalized $p_T$ spectra as the input to a fully-connected DNN to distinguish the EoSs as a first try. In this case, the normalized $p_T$ spectra are regarded as a whole instead of as isolated points at each $p_T$ bin, as they are regarded by the human eye, and high-level correlations including but not limited to high-order derivatives can be extracted in a supervised manner.
We train a fully-connected DNN (footnote 1) with the event-by-event normalized $p_T$ spectra from all centrality bins in Tabs. B.6 and B.7 as the input. The validation accuracy is about 74%, which is below that of the CNN with two-dimensional spectra, about 78%. Here the correlations are not very strong in both cases due to the fluctuations from the particlization and the "afterburner". When the 30-events-fine-averaged normalized $p_T$ spectra are taken as the input instead, the validation accuracy is about 97%, which is also a little below that of the CNN with two-dimensional spectra, about 99%. Here the correlations are very strong in both cases. As for the testing accuracy, the CNN with two-dimensional spectra outperforms the fully-connected DNN with one-dimensional spectra by about 8% with 30-events-fine-averaged spectra. Apparently, in the above cases, the fully-connected DNN with one-dimensional normalized $p_T$ spectra can capture the main correlations, while the CNN with two-dimensional spectra performs better and improves the generalizability.
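The fully-connected network described in footnote 1 (two hidden dense layers of size 128 and 256, each followed by dropout with rate 0.5 and a PReLU activation, "He normal" initialization and L2 regularization) can be illustrated with a plain NumPy forward pass. This is a sketch under several assumptions that are not stated in the text: an input of 24 normalized $p_T$ bins, a two-class softmax output layer, an initial PReLU slope of 0.25, and an L2 coefficient of 1e-4. It is not the authors' implementation and contains no training loop.

```python
import numpy as np

def he_normal(rng, fan_in, fan_out):
    """He-normal weights: zero mean, standard deviation sqrt(2 / fan_in)."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def init_params(rng, n_in=24, hidden=(128, 256), n_out=2):
    sizes = (n_in,) + tuple(hidden) + (n_out,)
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        params.append({"W": he_normal(rng, fan_in, fan_out),
                       "b": np.zeros(fan_out),
                       # PReLU slope; unused for the output layer.
                       "alpha": np.full(fan_out, 0.25)})
    return params

def prelu(x, alpha):
    return np.where(x > 0, x, alpha * x)

def forward(params, x, rng=None, drop_rate=0.5):
    """Class probabilities; pass rng to enable dropout (training mode)."""
    h = x
    for layer in params[:-1]:
        h = h @ layer["W"] + layer["b"]
        if rng is not None:  # inverted dropout, applied before the PReLU
            keep = rng.random(h.shape) >= drop_rate
            h = h * keep / (1.0 - drop_rate)
        h = prelu(h, layer["alpha"])
    logits = h @ params[-1]["W"] + params[-1]["b"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def l2_penalty(params, lam=1e-4):
    """L2 regularization term that would be added to the training loss."""
    return lam * sum((layer["W"] ** 2).sum() for layer in params)
```

Because the input is a one-dimensional vector, such a network sees no azimuthal information, which is consistent with the text's observation that the CNN with two-dimensional spectra generalizes better.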
When the event-by-event normalized $p_T$ spectra from all centrality bins in Tabs. B.1 and B.2 with $T_{sw}$ = 137 MeV and in Tabs. B.6 and B.7 with $T_{sw}$ = 160 MeV are taken as the input to the fully-connected DNN, the validation accuracy is about 62%, which is much lower than that of the CNN with two-dimensional spectra, about 69%. This shows that when physical parameters in the simulation model vary a lot in the generation of the training data, the normalized $p_T$ spectra are more difficult to distinguish and the CNN with two-dimensional spectra will outperform the fully-connected DNN with one-dimensional normalized $p_T$ spectra.

Summary and Conclusion
We extended a previous exploratory study on identifying the EoS in the modeling of heavy-ion collisions from hadron spectra using the DL technique [38]. In this extended study, we consider more realistic hybrid modeling of heavy-ion collisions, where the hadronic cascade "afterburner" with a finite number of particles and resonance decays are properly taken into account. In the hybrid modeling the final-state particle spectra are histograms containing large fluctuations and thus are different from those in the previous study [38], which are smooth hadron spectra from the Cooper-Frye prescription with perfect statistics. Fig. 3 summarizes the predictive performances on the validation datasets in the above exploratory studies of different scenarios.
Figure 3: The green star depicts the pure hydrodynamic result [38]. The orange square, the purple triangle and the red filled circle symbols depict the results for the 30-events-fine-averaged, cascade-coarse-grained and event-by-event spectra, respectively, at different switching temperatures.
Footnote 1: This fully-connected DNN consists of two hidden dense layers of size 128 and 256, respectively, and each is followed by a dropout [91] (with a rate of 0.5) and a PReLU activation layer [92]. These two dense layers are initialized with the "He normal" initializer [92] and constrained with L2 regularization [93].
We have demonstrated that, after the hydrodynamic evolution, stochastic particlization, hadronic cascade and resonance decays, the information about the EoS in the early dynamics is preserved in the final-state pion spectra, from the perspective of the deep CNN, as shown in Fig. 3. The event-by-event input for the network can reveal the EoS-type information with about 80% classification accuracy in a binary classification setup. The downward trend of the validation performance of the network with respect to the switching temperature in Fig. 3 implies that more stochasticity from the resonance decays and the elongated hadronic cascade will diminish the correlation between the EoS information in the early dynamics and the final-state particle spectra. This is in accordance with the common physical interpretation.
Finally, the hierarchy of the validation accuracy in different scenarios in Fig. 3 shows that a proper enhancement of statistics and reduction of fluctuations in the input data, from either the final hadronic dynamics alone or together with the initial conditions, facilitate the revealing of the EoS information by the network from the final-state particle spectra.
In conclusion, a deep CNN can decode the imprint of the EoS in the hydrodynamic evolution (encoded within the phase transition dynamics) on the final-state pion spectra from heavy-ion collisions. The good performance of the network does demonstrate that this "EoS-encoder" works. The fingerprint of the early dynamics of the bulk matter is not washed out by the evolution even when stochasticity is increased due to the hadronization and sequential hadron dynamics. The deep CNN provides an effective decoding method to extract high-level correlations from two-dimensional final-state pion spectra, which are immune to different physical factors, such as centrality bins. In relatively simple cases, a fully-connected deep neural network can also identify the EoS from normalized pion $p_T$ spectra with a validation accuracy close to that of the CNN, which can lead us to discover new observables sensitive to the EoS from normalized pion $p_T$ spectra. The generalizability of the learned features with respect to other simulation models also depends on the simulation model for the training data generation. In the present study, the training data is generated with the well tested iEBE-VISHNU (VISHNew+UrQMD) hybrid model. In the future we will explore how to capture the features which can be generalized to the testing data from other models as well as experimental data. Possible applications of the framework developed here can be extended to classifying fluctuating initial conditions, extracting transport coefficients of QCD matter, analysis of real experimental data filtering and pre-processing, and detector calibration.

Figure 1: Two different EoSs are implemented in the hydrodynamic simulation, as functions of the energy density. A crossover, based on a lattice QCD parametrization, is compared with a first-order phase transition with a transition temperature T_c = 165 MeV, obtained by a Maxwell construction. It is assumed that the baryon-chemical potential is exactly µ_B = 0 throughout the whole simulation.
Fig. A.4 in Appendix A. The input ρ(p_T, Φ) ≡ dN_π/dy dp_T dΦ to this neural network is a histogram of the number of pions with 24 p_T bins and 24 Φ bins. p_T denotes the transverse momenta of the observed pions in the final state, while Φ denotes the azimuthal angles. Only pions with p_T ≤ 2 GeV, rapidity |y| ≤ 1 and Φ ∈ [0, 2π] are accepted and counted in the histogram.
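The binning described above can be sketched as follows. This is a minimal illustration rather than the code used for the study: the function name `pion_histogram`, the `(pt, y, phi)` tuple layout, and the choice to return raw counts (without dividing by the bin widths) are assumptions made here.

```python
import math

def pion_histogram(pions, n_pt=24, n_phi=24, pt_max=2.0, y_max=1.0):
    """Count pions in (p_T, Phi) bins.

    `pions` is an iterable of (pt, y, phi) tuples with pt in GeV and phi in
    radians (hypothetical input layout). Returns an n_pt x n_phi nested list.
    """
    two_pi = 2.0 * math.pi
    hist = [[0] * n_phi for _ in range(n_pt)]
    for pt, y, phi in pions:
        # Acceptance cuts quoted in the text: p_T <= 2 GeV and |y| <= 1.
        if pt < 0.0 or pt > pt_max or abs(y) > y_max:
            continue
        # Values on the upper edge are assigned to the last bin.
        i = min(int(pt / pt_max * n_pt), n_pt - 1)
        j = min(int((phi % two_pi) / two_pi * n_phi), n_phi - 1)
        hist[i][j] += 1
    return hist
```

Each event then yields one 24 × 24 array, which is the shape of the input quoted for the network.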
5 and 0.4 fm/c in the collision systems Pb+Pb at √s_NN = 2.76 TeV and Au+Au at √s_NN = 200 GeV, respectively. The details of the datasets are shown in Tabs. B.1 and B.2 in Appendix B. About 44000 events with two different EoSs are generated in total. Fig. C.5 in Appendix C shows the event-by-event normalized p_T spectra and the elliptic flow v_2 as a function of p_T for these training datasets with the two EoSs. These two traditional one-dimensional observables cannot be distinguished by eye with respect to the EoSs. Thus it is not trivial to identify the EoS from the final-state pion p_T spectra alone. The negative elliptic flow v_2 in Fig. C.5 shows that there are large fluctuations in the event-by-event spectra.
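As an illustration of how these two one-dimensional observables relate to the two-dimensional input, the sketch below reduces a (p_T, Φ) count histogram to a normalized p_T spectrum and to v_2(p_T). It rests on assumptions not stated in the text: v_2 is taken as the average of cos 2(Φ − Ψ) with a reference angle Ψ supplied by the caller (default 0), each Φ bin is represented by its center, and the spectrum is given as the fraction of pions per p_T bin without the bin-width factor.

```python
import math

def pt_spectrum_and_v2(hist, psi=0.0):
    """Reduce a (p_T, Phi) count histogram to two 1D observables.

    Returns (spectrum, v2): the fraction of pions in each p_T bin and the
    mean of cos(2 * (Phi - psi)) in each p_T bin (0.0 for empty bins).
    """
    n_phi = len(hist[0])
    width = 2.0 * math.pi / n_phi
    centers = [(j + 0.5) * width for j in range(n_phi)]
    total = sum(sum(row) for row in hist)
    spectrum, v2 = [], []
    for row in hist:
        n = sum(row)
        spectrum.append(n / total if total else 0.0)
        weighted = sum(c * math.cos(2.0 * (phi - psi))
                       for c, phi in zip(row, centers))
        v2.append(weighted / n if n else 0.0)
    return spectrum, v2
```

With only a few pions in a p_T bin, the average can easily come out negative in a single event, which is the kind of fluctuation noted for Fig. C.5.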
Fig. C.6 in Appendix C shows the 30-events-fine-averaged normalized p_T spectra and the elliptic flow v_2 as a function of p_T for these training datasets with the two EoSs. These two traditional one-dimensional observables are still not distinguishable by eye. By comparing with the corresponding event-by-event observables as shown in Fig. C.

Figure 2: Training and validation accuracy (upper panel) and loss (lower panel) in three different scenarios with switching temperature T_sw = 137 MeV. These three scenarios refer to the 30-events-fine-averaged spectra (purple and brown), the cascade-coarse-grained spectra (red and green) as well as the event-by-event spectra (blue and orange).

4.2.1. Event-by-event input, switch at T_sw > 150 MeV
In this scenario, 9 training datasets are generated by the iEBE-VISHNU hybrid model with the fluctuating MC-Glauber initial condition in the centrality range 0-50%. The switching temperature is T_sw = 160 MeV. Two different values for the equilibration time τ_0 and the ratio of shear viscosity to entropy density η/s are used in the simulations. The details are shown in Tabs. B.6 and B.7 in Appendix B. In total, about 60000 events are generated with two different EoS types.
In Figs. C.7, C.8 and C.9 in Appendix C, we show the event-by-event, 30-events-fine-averaged and all-events-fine-averaged normalized p_T spectra (left panel) and elliptic flow v_2 (right panel), respectively, solely from the centrality bin 14-15% in Pb+Pb collisions at √s_NN = 2.76 TeV in Tab. B.1. Within the same centrality bin, one can see that the all-events-fine-averaged normalized p_T spectra are distinguishable with respect to the different EoSs, the 30-events-fine-averaged normalized p_T spectra are almost distinguishable in certain p_T bins, while the event-by-event normalized p_T spectra are still not. In Fig. C.10 in Appendix C, we show the all-events-fine-averaged normalized p_T spectra (upper left panel) and elliptic flow v_2 (upper right panel) as well as the first (lower left panel), second (lower middle panel) and third (lower right panel) derivatives of the normalized p_T spectra from all centrality bins in Tabs. B.1 and B.2.
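The averaging and the derivatives mentioned above can be illustrated with a short sketch. It is not taken from the paper: it assumes that the spectra being averaged come from events sharing the same EoS and centrality bin, that leftover events that do not fill a complete group are dropped, and that derivatives are estimated by finite differences on equally spaced p_T bins. The names `average_in_groups` and `derivative` are hypothetical.

```python
def average_in_groups(spectra, group_size=30):
    """Average consecutive groups of 1D spectra bin by bin.

    `spectra` is a list of equally long lists; an incomplete final group
    is discarded.
    """
    averaged = []
    for start in range(0, len(spectra) - group_size + 1, group_size):
        group = spectra[start:start + group_size]
        averaged.append([sum(col) / group_size for col in zip(*group)])
    return averaged

def derivative(values, step):
    """Finite-difference derivative for equally spaced bins (len >= 2).

    Central differences are used in the interior, one-sided differences
    at the two ends.
    """
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * step))
    return out
```

Applying `derivative` repeatedly gives estimates of the second and third derivatives. Finite differences amplify statistical noise, so they are mainly useful on averaged spectra.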

Figure 3: Comparison between the validation accuracy in all the different scenarios studied. The green star depicts the pure hydrodynamic result [38]. The orange square, the purple triangle and the red filled circle symbols depict the results for the 30-events-fine-averaged, cascade-coarse-grained and event-by-event spectra, respectively, at different switching temperatures.

Figure A.4: The architecture of our convolutional neural network (CNN) for identifying the QCD transition type by using pion spectra with 24 transverse momentum p_T bins and 24 azimuthal angle Φ bins.

Figure C.5: Event-by-event normalized p_T spectra dN/N dy dp_T (left panel) and elliptic flow v_2 as a function of p_T (right panel) of the training datasets in Tab. B.1 and Tab. B.2 with two EoSs. The vertical spread reflects the event-by-event fluctuations. The green cross and the red point symbols depict the observables with L-EOS and Q-EOS, respectively. These events are generated in different centrality bins with T_sw = 137 MeV in two collision systems.

Figure C.9: All-events-fine-averaged normalized p_T spectra dN/N dy dp_T (left panel) and elliptic flow v_2 as a function of p_T (right panel) of the training datasets in Tab. B.1 with two EoSs. The green cross and the red point symbols depict the observables with L-EOS and Q-EOS, respectively. These events are generated in centrality bin 14%-15% with T_sw = 137 MeV in two collision systems.

T, Φ) computed by the iEBE-VISHNU hybrid model with the MC-Glauber initial conditions in the centrality range 0-60%. The ratio of shear viscosity to entropy density η/s = 0.00. The equilibration time τ_0 = 0.4 fm/c. The switching temperature T_sw = 137 MeV. The collision system is Au+Au at √s_NN = 200 GeV.
Table B.4: Predictive accuracy on the testing datasets 1: 30-events-fine-averaged spectra ρ_a(p_T, Φ) generated with MC-Glauber initial conditions and different √s_NN, η/s, τ_0, and T_sw in the centrality range 0-50%.

Table B.6: Training datasets 4: numbers of event-by-event spectra ρ(p_T, Φ) computed by the iEBE-VISHNU hybrid model with the MC-Glauber initial conditions in the centrality range 0-50%. The ratio of shear viscosity to entropy density η/s = 0.08. The equilibration time τ_0 = 0.6 fm/c. The switching temperature T_sw = 160 MeV. The collision system is Pb+Pb at √s_NN = 2.76 TeV.
Table B.7: Training datasets 5: numbers of event-by-event spectra ρ(p_T, Φ) computed by the iEBE-VISHNU hybrid model with the MC-Glauber initial conditions in the centrality range 0-50%. The ratio of shear viscosity to entropy density η/s = 0.16. The equilibration time τ_0 = 0.4 fm/c. The switching temperature T_sw = 160 MeV. The collision system is Au+Au at √s_NN = 200 GeV.

Table B.8: Training datasets 6: numbers of cascade-coarse-grained spectra ρ_c(p_T, Φ) computed by the iEBE-VISHNU hybrid model with the MC-Glauber initial conditions in the centrality range 0-50%.