Classification of equation of state in relativistic heavy-ion collisions using deep learning

Abstract: Convolutional neural networks, a powerful deep learning method, are applied to classify the equation of state of heavy-ion collision events generated within the UrQMD model. Event-by-event transverse momentum and azimuthal angle distributions of protons are used to train a classifier. An overall classification accuracy of 98% is reached for Au+Au events at √s_NN = 11 GeV. The performance of classifiers trained on events at different collision energies is investigated. The results indicate extensive possibilities for applying deep learning methods to other problems in the physics of heavy-ion collisions.


Introduction
One of the main goals of the physics of heavy-ion collisions is to explore the properties of an exotic state of matter, namely, hot, dense, and strongly interacting baryonic matter. It can be recreated in the laboratory by colliding heavy nuclei at relativistic energies. Lattice Quantum Chromodynamics (QCD) calculations indicate that the transition from quark-gluon plasma (QGP) to hadron gas is a smooth crossover at high energies and low baryon densities [1]. It is widely believed that a first-order phase transition that ends at a tricritical point takes place within the energy range between √s = 3 and 10 GeV, see, e.g., [2] and references therein. Various past and ongoing experiments, such as the beam energy scan (BES) and BES II at the Relativistic Heavy Ion Collider (RHIC) [3,4] and experiments at the Super Proton Synchrotron (SPS) at CERN, explore collisions of gold and lead ion beams to find any peculiarities within this energy range. However, neither the first-order phase transition nor the tricritical point has been observed so far. Future experiments, such as the Nuclotron-based Ion Collider fAcility (NICA) and the Facility for Antiproton and Ion Research (FAIR), aim to perform collisions at these energies with higher luminosity, which gives us hope of seeing something new there. Difficulties in observing the phase transition originate from numerous factors. Among them are the extremely short lifetime of the QGP phase, approximately 10^−24 fm/c, the small number of particles in the system, the high anisotropy of matter both in coordinate and in momentum space, etc. All the information registered by detectors amounts to roughly several thousand particles with their energies and momenta. It is therefore immensely difficult to draw any reasonable conclusions about the medium they came from.

JHEP07(2020)133
The recent success of Artificial Intelligence (AI) methods, such as Machine Learning (ML) and Deep Learning (DL), in approximating highly obscure dependencies gives us a justifiable hope that they can benefit heavy-ion physics. These methods are now widely employed in problems that involve pattern recognition, for instance, image classification and natural language processing. A major advantage of such algorithms is that they do not need to be explicitly programmed. For example, AlphaGo [5], a program developed by Google DeepMind to play the board game Go, beat the human world champion with a score of 4:1. Played on a 19 × 19 board, Go has roughly 2.08 × 10^170 legal positions, which makes the game intractable for an explicitly written algorithm because of computational complexity. Moreover, more advanced versions of the program, AlphaGo Zero and AlphaZero, were trained without any human data or guidance, only by playing against versions of themselves, and reached even better results [6]. Recently, DL algorithms have started to be applied to complex tasks in various fields of physics, namely, high-energy particle physics [7-9], condensed matter physics [10,11], astroparticle physics [12-14], and quantum information [15]. The work [16] demonstrates how optimal representation variables, describing different states of physical systems, can be learned by neural nets. Generative Adversarial Nets, a class of generative Deep Learning models, have started to be successfully applied to event simulation [17,18]. Reviews of possible applications of ML and DL methods to high energy physics can be found in [19-21]. Prominent results of applying DL to problems of high energy nuclear physics and heavy-ion collisions are described in [22-24].
The present work continues efforts to apply deep learning methods to the identification of properties of the equation of state (EoS). The paper is organized as follows. Section 2 introduces the basic features of the microscopic ultrarelativistic quantum molecular dynamics (UrQMD) model and the parameters of the hard and soft mean-field potentials used in the calculations. It is outlined that Deep Learning methods are able to differentiate between calculations made with two different equations of state on an event-by-event basis, in contrast to standard methods, which usually need significant statistics of generated events. The principles of the Convolutional Neural Network (CNN) algorithm are sketched in section 3. Section 4 presents the main results of our study. There we also show the ability of a model trained at a particular collision energy to make correct distinctions for collisions at neighboring energies. The effectiveness of different training algorithms is discussed as well. Conclusions are drawn in section 5.

Features of microscopic model
The Ultra-relativistic Quantum Molecular Dynamics model [25,26] is a Monte-Carlo event generator designed for the description of hh, hA, and A+A collisions in a broad energy range, from a hundred MeV up to several TeV. The model contains 55 baryon and 32 meson states with masses up to 2.25 GeV/c^2 as independent degrees of freedom, together with their antiparticles and explicit isospin-projected states. At energies below a few GeV, the interaction dynamics of hh or A+A collisions can be described via interactions between the hadrons and their excited states, resonances. At higher energies new processes of multiparticle production come into play. UrQMD treats the production of new particles via the formation and fragmentation of specific colored objects, strings. Strings are uniformly stretched between the quarks, diquarks, and their antistates with constant string tension κ ≈ 1 GeV/fm. The excited strings fragment into pieces via the Schwinger mechanism of q q̄-pair production, and the distribution of newly produced hadrons is uniform in rapidity space. The model utilizes Hamiltonian dynamics of particle motion and incorporates particle interactions via geometric cross sections, taken from available experimental data or from quark models.

Hard and soft EoS
Potential interactions can be switched on in the form of Yukawa, Coulomb, and Skyrme potentials. In their definitions, r_i is the coordinate vector of the i-th particle, Z_i is the value of its electric charge, ρ_j^int is the interaction density, and V_0^Yuk, γ_Y, t_1, t_γ, and γ are constants that define the stiffness of the EoS; see [25-27] for details. In the present work we use the potential with two different sets of parameters, which can be associated with two separate equations of state of baryonic matter, hard and soft. The parameters of the hard and the soft potentials employed in the calculations are listed in table 1.

Example: directed flow
Figure 1. Slope of the directed flow of protons from UrQMD minimum bias Au+Au collisions generated with hard (stars) and soft (crosses) EoS, respectively. Data are taken from [28].

It was shown earlier that the EoS in UrQMD influences kinematic observables, such as directed flow, v_1, quite significantly [28]. Recall that directed flow [29] is an experimental observable that characterizes the momentum-space anisotropy of the particle spectrum. It is defined as the first coefficient of the Fourier decomposition of the transverse flow. In this decomposition, p_T is the particle transverse momentum, y is the rapidity, φ is the azimuthal angle between p_T and the participant event plane, and Ψ_n is the azimuthal angle of the event plane of the n-th flow component. Averaging in eq. (2.4) is done over all particles of a given species in a single event and over the whole set of events. Figure 1 shows the midrapidity slopes of the directed flow of protons in minimum bias Au+Au collisions at energies √s = 4, 7.7, 11.5, and 19.6 GeV generated with two different EoS. One can see that the value of the slope is sensitive to the choice of EoS. However, to make the differences in the v_1 slopes visible, about one million events were needed for each data point.
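For reference, flow coefficients are conventionally introduced through the Fourier expansion of the invariant particle distribution in the azimuthal angle. The block below is a sketch of that standard convention, written with the symbols defined above; the normalization prefactor is an assumption and may differ from the form used in eq. (2.4).

```latex
E \frac{d^3 N}{d^3 p}
  = \frac{1}{2\pi} \frac{d^2 N}{p_T \, dp_T \, dy}
    \left[ 1 + 2 \sum_{n=1}^{\infty} v_n \cos\bigl( n (\phi - \Psi_n) \bigr) \right],
\qquad
v_n = \left\langle \cos\bigl( n (\phi - \Psi_n) \bigr) \right\rangle .
```

In this convention directed flow corresponds to n = 1, i.e. v_1 = ⟨cos(φ − Ψ_1)⟩, with the angular brackets denoting the averaging over particles and events described in the text.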

Model setup
It is worth mentioning that the application of Deep Learning methods permits one to differentiate between EoS on an event-by-event basis. To check the efficiency of the method we generated 5000 Au+Au collisions of each kind at √s = 11 GeV with centrality 0-5%. The distribution of the proton number, n_p = n_p(p_T, φ), was calculated for each separate event. Transverse momentum and azimuthal angle were selected in the ranges (0, 1) GeV/c and (−π, π), respectively. Altogether, the dataset consists of 10000 histograms with dimensions of 10 × 10 bins, labeled "0" for the hard potential and "1" for the soft potential. All bin values were scaled to fit the range (0, 1), so that they represent a probability density distribution.
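The preprocessing step described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study: the function names are hypothetical, the toy event is drawn at random instead of coming from UrQMD, and the scaling is assumed to be a division by the total count so that the bins sum to one (the text only states that values are scaled to the range (0, 1)).

```python
import math
import random

N_BINS = 10                      # 10 x 10 bins, as stated in the text
PT_RANGE = (0.0, 1.0)            # transverse momentum range, GeV/c
PHI_RANGE = (-math.pi, math.pi)  # azimuthal angle range, radians


def bin_index(value, lo, hi, n_bins):
    """Return the bin index of value in [lo, hi), or None if it is out of range."""
    if not (lo <= value < hi):
        return None
    return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)


def proton_histogram(protons):
    """Build a scaled 10 x 10 histogram from the (p_T, phi) pairs of one event."""
    hist = [[0.0] * N_BINS for _ in range(N_BINS)]
    for pt, phi in protons:
        i = bin_index(pt, PT_RANGE[0], PT_RANGE[1], N_BINS)
        j = bin_index(phi, PHI_RANGE[0], PHI_RANGE[1], N_BINS)
        if i is not None and j is not None:
            hist[i][j] += 1.0
    total = sum(sum(row) for row in hist)
    if total > 0:
        # Assumption: normalize by the total count, so every bin lies in [0, 1].
        hist = [[value / total for value in row] for row in hist]
    return hist


# Toy event with randomly drawn protons (illustrative only, not UrQMD output).
rng = random.Random(0)
toy_event = [(rng.uniform(0.0, 1.5), rng.uniform(-math.pi, math.pi))
             for _ in range(200)]
histogram = proton_histogram(toy_event)
label = 0  # "0" for the hard potential, "1" for the soft potential
```

Protons outside the selected ranges are simply dropped, which mirrors the selection in p_T and φ quoted in the text.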

Deep Convolutional Neural Network
The Convolutional Neural Network (CNN) [30] has proven to be a powerful deep learning algorithm for tasks dealing with graphical data, such as image recognition and classification. It was inspired by image processing in the mammalian visual cortex. Like other deep learning algorithms, a CNN consists of layers of artificial neurons. Three main types of layers are used in a deep CNN. Convolutional layers extract useful features. Pooling layers reduce the number of parameters. Fully-connected layers make the final prediction: they map the extracted features to the desired output, which is either "1" or "0" in our case. The configuration of layers is determined by the complexity of the task. The network architecture employed here is described in table 2 and visualized in figure 2, built with the help of an online tool for CNN architecture visualization.

Binary cross-entropy is used as the loss function. In its definition, w_n is a sample weight (set to 1), and x_n and y_n are the network outputs and true labels, respectively. The index n runs over the batch. The network was trained with the stochastic gradient descent (SGD) method with a learning rate of 0.01 for 800 epochs. Dropout [31] layers (p = 0.3) were used before every fully-connected layer to reduce overfitting. The validation loss was recorded every 10 epochs, and the model parameters were saved every time it decreased. The model parameters that yielded the lowest loss on the validation set were used for model evaluation afterwards. Eighty percent of the dataset was used for training and twenty percent for validation.

Data acquisition and early preprocessing steps were done within the ROOT framework [32]. Further processing was carried out in JupyterLab [33], an interactive Python notebook environment, with extensive use of the Python libraries NumPy [34,35], Matplotlib [36], Pandas [37], and Scikit-learn [38]. The network was implemented in the PyTorch [39] framework.
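The binary cross-entropy loss used in this section can be illustrated with a short plain-Python sketch. It assumes the usual per-sample form −w_n [y_n log x_n + (1 − y_n) log(1 − x_n)] averaged over the batch; the averaging, the clamping constant `eps`, and the function name are assumptions made for this example and are not taken from the study's code.

```python
import math


def binary_cross_entropy(outputs, labels, weights=None, eps=1e-12):
    """Batch mean of -w_n * [y_n*log(x_n) + (1 - y_n)*log(1 - x_n)]."""
    if weights is None:
        weights = [1.0] * len(outputs)  # w_n = 1, as stated in the text
    total = 0.0
    for x, y, w in zip(outputs, labels, weights):
        x = min(max(x, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= w * (y * math.log(x) + (1.0 - y) * math.log(1.0 - x))
    return total / len(outputs)


# Network outputs close to the true labels give a small loss ...
confident = binary_cross_entropy([0.9, 0.1], [1, 0])
# ... while confidently wrong outputs are penalized heavily.
wrong = binary_cross_entropy([0.1, 0.9], [1, 0])
```

The loss grows without bound as a wrong prediction approaches certainty, which is why the outputs are clamped away from 0 and 1 before taking the logarithm.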

Classification accuracy for central Au+Au collisions at three different energies
Typical particle density profiles from events generated with the hard and soft potentials are shown in figure 3. We have to emphasize that density profiles from a single event are strongly influenced by fluctuations and differ substantially from event to event. They are hardly distinguishable by the naked eye, yet if we consider larger statistics, e.g., 100 events, the differences become more noticeable, as shown in figure 4.
An overall classification accuracy of 98% was achieved for both hard and soft potentials. It means that out of 100 items of each kind 98 were classified correctly. Learning curves are depicted in figure 5. At some point during training, both the training and validation losses start to fall abruptly, whereas both accuracies rapidly grow. This indicates that a direction toward the minimum of the loss function in the model parameter space has been found. All curves enter the saturation region after the 150th epoch. The fact that the training metrics coincide with the validation metrics indicates that our model does not overfit and generalizes well.
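The statement that 98 out of 100 items of each kind are classified correctly refers to the accuracy evaluated separately for each class. A minimal sketch of that bookkeeping is given below; the function name and the toy labels are hypothetical and only reproduce the arithmetic, not the actual predictions of the network.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of events with that true label classified correctly}."""
    correct, total = {}, {}
    for truth, prediction in zip(true_labels, predicted_labels):
        total[truth] = total.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (1 if truth == prediction else 0)
    return {label: correct[label] / total[label] for label in total}


# Toy example: 100 hard (0) and 100 soft (1) events with two mistakes in each class.
true_labels = [0] * 100 + [1] * 100
predicted_labels = [0] * 98 + [1] * 2 + [1] * 98 + [0] * 2
accuracy = per_class_accuracy(true_labels, predicted_labels)
```

Reporting the accuracy per class guards against a classifier that performs well on one EoS only, which an overall figure on a balanced dataset could partly hide.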
Then, an analogous setup was used for energies of √s = 7 and 14 GeV. For the latter, an overall accuracy of 98% was reached; however, the accuracy drops to 94% for √s = 7 GeV within the proposed training scheme. This could be attributed to the low multiplicity of secondary particles produced in a single event. Next, it is interesting to study the ability of a model trained at one particular energy to make predictions for another energy. These results are listed in table 3, which contains three trained models together with their predictions for Au+Au collisions at three energies, √s = 7, 11, and 14 GeV. Table 3 reveals an interesting result: a classifier trained on events at a collision energy of √s = 14 GeV performs well when classifying events at lower energies. The same is true for the classifier trained at √s = 11 GeV. This could indicate similarities in the hadron spatial distributions in both reactions. In contrast, the performance of the classifier trained at √s = 7 GeV is significantly worse. A possible explanation is the low particle yield in a single event at this energy, which is not enough to make the distinction properly. As it turned out, the smallest amount of data needed for an accuracy better than 96% is only 100 images, i.e., only 1% of the whole dataset, for energies √s = 11 and 14 GeV. For the energy of 7 GeV the best performance with the least amount of data is 92% with 400 training instances. However, the training time is roughly 5 times longer compared to 4000 epochs for √s = 11 GeV.

Application of a more efficient algorithm: Adam
It is possible to achieve very good results, namely, 92% accuracy on the validation set for √s = 7 GeV and 96% accuracy on the validation set for √s = 11 and 14 GeV, by using the Adam routine [40], a more advanced loss minimization algorithm. Figures 6, 7, and 8 show the overall validation accuracy during 5000 epochs of training for the three energy data sets with different amounts of training data and different optimizers.
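To make the difference between the two optimizers concrete, the sketch below applies the plain gradient-descent update and the Adam update to a toy two-parameter quadratic loss. Everything in it is an assumption made for illustration: the loss function, the starting point, and the use of exact rather than mini-batch gradients. The Adam hyperparameters (β1 = 0.9, β2 = 0.999, ε = 10^−8) are the commonly used defaults, not values reported in this work.

```python
import math


def loss(theta):
    """Toy, poorly conditioned quadratic loss (chosen only for illustration)."""
    return 0.5 * (theta[0] ** 2 + 0.01 * theta[1] ** 2)


def grad(theta):
    """Analytic gradient of the toy loss."""
    return [theta[0], 0.01 * theta[1]]


def run_sgd(theta, lr=0.01, steps=500):
    """Gradient descent: theta <- theta - lr * g, same step size for all parameters."""
    theta = list(theta)
    for _ in range(steps):
        g = grad(theta)
        theta = [p - lr * gi for p, gi in zip(theta, g)]
    return theta


def run_adam(theta, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: steps are rescaled per parameter by running moments of the gradient."""
    theta = list(theta)
    m = [0.0] * len(theta)  # running mean of gradients (first moment)
    v = [0.0] * len(theta)  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias correction
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        theta = [p - lr * mh / (math.sqrt(vh) + eps)
                 for p, mh, vh in zip(theta, m_hat, v_hat)]
    return theta


start = [1.0, 1.0]
initial_loss = loss(start)
sgd_loss = loss(run_sgd(start))
adam_loss = loss(run_adam(start))
```

Because Adam divides each step by the running root-mean-square of the gradient, a parameter with a very small gradient still moves by roughly the learning rate per step, whereas gradient descent moves it in proportion to that small gradient. This per-parameter rescaling is one plausible reason for the faster convergence reported here, although the toy problem says nothing quantitative about the CNN itself.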

Results obtained for central Au+Au collisions at √s = 7 GeV are displayed in figure 6. To reach the desired accuracy above 90% with the standard SGD method using just 4% of the generated statistics as the train fraction, one needs about 2500 epochs. This time can be shortened to approximately 600 epochs if 20% of the events are used for training. In contrast, the Adam algorithm with a 4% train fraction needs only 30 (!) epochs to reach an accuracy of about 90%. A peculiarity of this method is the slow decline of the validation accuracy after 100 epochs, which means that the model starts to overfit. Recall that training with small amounts of data is unstable with both training algorithms: more time is required for the model to find a path to the minimum of the loss. The problem can be cured by increasing the training statistics. This is important because the multiplicity of produced hadrons in heavy-ion collisions drops with decreasing collision energy. For Au+Au collisions at √s = 11 and 14 GeV the multiplicities of secondaries are larger than at √s = 7 GeV. Here the advantages of the Adam algorithm become more obvious, as shown in figure 7 for collisions at √s = 11 GeV. The standard SGD optimization method demonstrates a sharp increase of the validation accuracy after 3200 epochs for the train fraction of 1%, and after 400 epochs for the train fraction of 20%. Training with the Adam algorithm converges much faster: a rapid increase of accuracy in EoS recognition with the train fraction of 1% is seen after only 30 epochs. The same efficiency is obtained for central Au+Au collisions at √s = 14 GeV, as depicted in figure 8. Therefore, a classifier trained with the Adam optimization algorithm needs very limited statistics and quite a short training time to discriminate between the hard and the soft equation of state with an accuracy of about 95%.
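The late decline of validation performance noted above is the situation in which the checkpoint rule from section 3 matters: the validation loss is checked every 10 epochs and the parameters with the lowest value are kept. A minimal sketch of such a rule is given below. The function names, the callback interface, and the toy "model" (a single number that drifts past its optimum) are all hypothetical stand-ins, not the training loop used in the study.

```python
import copy


def train_with_best_checkpoint(train_one_epoch, validation_loss, params,
                               n_epochs=800, check_every=10):
    """Train for n_epochs and return the parameters with the lowest validation loss."""
    best_loss = float("inf")
    best_params = copy.deepcopy(params)
    best_epoch = 0
    for epoch in range(1, n_epochs + 1):
        params = train_one_epoch(params)
        if epoch % check_every == 0:
            current = validation_loss(params)
            if current < best_loss:
                # Save a copy only when the validation loss has decreased.
                best_loss, best_epoch = current, epoch
                best_params = copy.deepcopy(params)
    return best_params, best_loss, best_epoch


# Toy stand-ins: the parameter keeps growing past its optimum at 1.0, so the
# validation loss first falls and then rises, mimicking the onset of overfitting.
def toy_epoch(params):
    return {"w": params["w"] + 0.01}


def toy_validation_loss(params):
    return (params["w"] - 1.0) ** 2


best_params, best_loss, best_epoch = train_with_best_checkpoint(
    toy_epoch, toy_validation_loss, {"w": 0.0}, n_epochs=200, check_every=10)
```

With this rule the evaluated model corresponds to the best validation point even if training continues well past it.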

Conclusions and discussion
Our main results can be summarized as follows. Convolutional Neural Networks demonstrate excellent performance for equation of state classification in UrQMD-generated heavy-ion collisions. A binary classification accuracy of 98% is achieved for events at √s = 11 and 14 GeV and 94% for √s = 7 GeV, using a data set of primitive experimental observables, such as particle transverse momentum and azimuthal angle. Moreover, it is possible to differentiate between equations of state using a model trained at different neighboring energies.
It turned out that even a small fraction of the data is enough to achieve satisfactory results. The training time depends strongly on the data fraction used for training and on the employed algorithm. Our study shows that to reach the top classification accuracy for central Au+Au collisions at energies of about 11 GeV and higher with the standard Stochastic Gradient Descent method, one needs more than 3000 epochs when training on 1% of the generated statistics, and more than 400 epochs if the training fraction is increased to 20%. Similar accuracy can be achieved with the stochastic optimization algorithm Adam, trained on 1% of the statistics, after just 30 epochs. Therefore, training the Convolutional Neural Network classifier with the Adam algorithm permits one to reach high accuracy in a very short training time with a small amount of training data.
It is worth mentioning that the performance of the trained CNN is very robust against initial state fluctuations and final state interactions, such as decays of resonances. However, similar to other applications of the DL technique to high energy physics [22-24], the program has to decide between two discrete choices: calculations with either a soft or a stiff equation of state. In other words, this is a one-dimensional discrete space. In reality, the space of possible EoS is continuous. The next step, therefore, should be calculations [...] After that one can make a transition to a multi-dimensional continuous space. This will allow us to extract different parameters of the EoS, provided we can fix its functional form. If the limits of the values of each parameter entering the EoS are determined, we have to generate events for each set of parameters. Then we train the model so that it learns the correspondence between the (p_T, φ)-distributions of hadrons and the set of parameters. The whole procedure looks cumbersome but feasible. This interesting problem needs further investigation.

Figure 3. Proton densities for (a) soft and (b) hard potentials from a single UrQMD-generated Au+Au collision at √s = 11 GeV.

Figure 6. Validation loss during 5000 epochs of training for different training data fractions and optimizers in central Au+Au collisions generated within UrQMD at √s = 7 GeV. Solid and dashed curves indicate the results obtained by the SGD method with train fractions of 4% and 20%, respectively. The dotted curve denotes the results obtained by the Adam method with a 4% train fraction.

Figure 7. The same as figure 6 but for central Au+Au collisions at √s = 11 GeV.

Figure 8. The same as figure 6 but for central Au+Au collisions at √s = 14 GeV.

Table 1. Parameters of the hard and soft mean-field potentials used in the current version of UrQMD.

Table 2. Architecture of the Convolutional Neural Network.

Table 3. Performance of models trained at different energies.