Measurement of light-by-light scattering and search for axion-like particles with 2.2 nb$^{-1}$ of Pb+Pb data with the ATLAS detector

This paper describes a measurement of light-by-light scattering based on Pb+Pb collision data recorded by the ATLAS experiment during Run 2 of the LHC. The study uses $2.2$ nb$^{-1}$ of integrated luminosity collected in 2015 and 2018 at $\sqrt{s_\mathrm{NN}}=5.02$ TeV. Light-by-light scattering candidates are selected in events with two photons produced exclusively, each with transverse energy $E_{\mathrm{T}}^{\gamma}>2.5$ GeV, pseudorapidity $|\eta_{\gamma}|<2.37$, diphoton invariant mass $m_{\gamma\gamma}>5$ GeV, and with small diphoton transverse momentum and diphoton acoplanarity. The integrated and differential fiducial cross sections are measured and compared with theoretical predictions. The diphoton invariant mass distribution is used to set limits on the production of axion-like particles. This result provides the most stringent limits to date on axion-like particle production for masses in the range 6-100 GeV. Cross sections above 2 to 70 nb are excluded at the 95% CL in that mass interval.


Introduction
Light-by-light (LbyL) scattering, → , is a process in the Standard Model (SM) that proceeds at lowest order in quantum electrodynamics (QED) via virtual one-loop box diagrams involving charged fermions (leptons and quarks) and ± bosons ( Figure 1). LbyL interactions can occur in relativistic heavy-ion collisions at any impact parameters. However, the large impact parameters i.e. larger than twice the radius of the ions, are experimentally preferred as the strong interaction does not play a role in these ultra-peripheral collision (UPC) events. In general, UPC events allow studies of processes involving nuclear photoexcitation, photoproduction of hadrons, and two-photon interactions. Comprehensive reviews of UPC physics can be found in Refs. [1,2]. The electromagnetic (EM) fields produced by the colliding Pb nuclei can be treated as a beam of quasi-real photons with a small virtuality of 2 < 1/ 2 , where is the radius of the nuclear charge distribution and so 2 < 10 −3 GeV 2 [3][4][5]. The cross section for the reaction Pb+Pb ( ) → Pb ( * ) +Pb ( * ) can then be calculated by convolving the respective photon flux with the elementary cross section for the process → , with a possible EM excitation [6], denoted by ( * ). Since the photon flux associated with each nucleus scales as 2 , the LbyL cross section is strongly enhanced relative to proton-proton ( ) collisions.
In this measurement, the final-state signature of interest is the exclusive production of two photons, where the diphoton final state is measured in the detector surrounding the Pb+Pb interaction region, and the incoming Pb ions survive the EM interaction. Hence, one expects that two low-energy photons will be detected with no further activity in the central detector. In particular, no reconstructed charged-particle tracks originating from the Pb+Pb interaction point are expected.
The LbyL process has been proposed as a sensitive channel to study physics beyond the SM. Modifications of the → scattering rates can be induced by new exotic charged particles [7] and by the presence of extra spatial dimensions [8]. The LbyL cross sections are also sensitive to Born-Infeld extensions of QED [9], Lorentz-violating operators in electrodynamics [10], and the presence of space-time non-commutativity in QED [11]. Additionally, new neutral particles, such as axion-like particles (ALP), can also contribute in the form of narrow diphoton resonances [12], as shown in Figure 1. ALPs are relatively light, gauge-singlet (pseudo-)scalar particles that appear in many theories with a spontaneously broken global symmetry. Their masses and couplings to SM particles may range over many orders of magnitude. The previous ATLAS searches involving ALP decays to photons are based on collision data [13,14].
LbyL scattering via an electron loop has been precisely, albeit indirectly, tested in measurements of the anomalous magnetic moment of the electron and muon [15,16]. The → reaction has been measured in photon scattering in the Coulomb field of a nucleus (Delbrück scattering) [17][18][19][20] and in the photon splitting process [21]. A related process, in which initial photons fuse to form a pseudoscalar meson which subsequently decays into a pair of photons, has been studied at electron-positron colliders [22][23][24].
The authors of Ref. [25] proposed to measure LbyL scattering by exploiting the large photon fluxes available in heavy-ion collisions at the LHC. The first direct evidence of the LbyL process in Pb+Pb UPC at the LHC was established by the ATLAS [26] and CMS [27] Collaborations. The evidence was obtained from Pb+Pb data recorded in 2015 at a centre-of-mass energy of √ NN = 5.02 TeV with integrated luminosities of 0.48 nb −1 (ATLAS) and 0.39 nb −1 (CMS). The CMS Collaboration also set upper limits on the cross section for ALP production, → → , over a mass range of 5-90 GeV. Exploiting a data sample of Pb+Pb collisions collected in 2018 at the same centre-of-mass energy with an integrated luminosity of 1.73 nb −1 , the ATLAS Collaboration observed LbyL scattering with a significance of 8.2 [28]. These two ATLAS measurements used tight requirements on the diphoton invariant mass (> 6 GeV) and single-photon transverse energy (> 3 GeV). This paper presents a measurement of the cross sections for Pb+Pb ( ) → Pb ( * ) +Pb ( * ) production at √ NN = 5.02 TeV using a combination of Pb+Pb collision data recorded in 2015 and 2018 by the ATLAS experiment, corresponding to an integrated luminosity of 2.2 nb −1 . This analysis follows the approach proposed in Ref. [25] and the methodology used in the previous measurements [26 , 28]. However, as a result of improvements in the trigger efficiency and purity of the photon identification, a broader kinematic range in diphoton invariant mass (> 5 GeV) and single-photon transverse energy (> 2.5 GeV) is covered. This extension results in an increase of about 50% in expected signal yield in comparison with the previous tighter requirements.
The integrated fiducial cross section and four differential distributions involving kinematic variables of the final-state photons are measured. Two of the distributions characterise the energy of the process: the invariant mass of the diphoton system, , and the average transverse momentum of two photons, ( 1 T + 2 T )/2. The remaining ones probe angular correlations of the system. These are the rapidity 1 of the diphoton system, , and | cos( * )|, defined as: where * is the scattering angle in the centre-of-mass frame, and Δ 1, 2 is the difference between the rapidities of the photons.
The measured diphoton invariant mass distribution is used to set limits on ALP production via the process → → .

ATLAS detector
The ATLAS detector [29] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, EM and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroid magnets. The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the pseudorapidity 2 range | | < 2.5.
The high-granularity silicon pixel detector (Pixel) covers the collision region. Typically, it provides four measurements per track, with the first hit being in the insertable B-layer (IBL) [30,31], which was installed at a mean distance of 3.3 cm from the beam pipe before the start of Run 2. It is followed by the silicon microstrip tracker (SCT), which usually provides four two-dimensional measurement points per track. These silicon detectors are complemented by the transition radiation tracker, which enables radially extended track reconstruction up to | | = 2.0.
1 Rapidity is defined as = 1 2 ln + − , where and are particle's energy and the component of momentum along the beam axis, respectively. 2 ATLAS uses a right-handed coordinate system with origin at the nominal interaction point in the centre of the detector and the -axis along the beam pipe. The -axis points from the IP to the centre of the LHC ring, and the -axis points upwards. Cylindrical coordinates ( , ) are used in the transverse plane, being the azimuthal angle around the -axis. The pseudorapidity is defined in terms of the polar angle as = − ln tan( /2). Angular distance is measured in units of Δ ≡ √︁ (Δ ) 2 + (Δ ) 2 . The transverse energy of a photon or electron is T = /cosh( ), where is its energy.

Data and Monte Carlo simulation samples
The data used in this measurement is from Pb+Pb collisions with a centre-of-mass energy of √ NN = 5.02 TeV, recorded in 2015 and 2018 with the ATLAS detector at the LHC. The full data set corresponds to an integrated luminosity of 2.2 nb −1 . Only high-quality data with all detectors operating normally are analysed.
Monte Carlo (MC) simulated events for the LbyL signal process were generated at leading order (LO) using SuperChic v3.0 [34]. They take into account box diagrams with leptons and quarks (such as the diagram in Figure 1), and ± bosons, including interference effects. The ± contribution is only important for diphoton masses > 2 . Next-to-leading-order QCD and QED corrections are not included. They increase the → cross section by a few percent [35,36]. An alternative LbyL signal sample was generated using calculations from Ref. [37]. The difference between the nominal and alternative signal prediction is mainly in the implementation of the non-hadronic overlap condition of the Pb ions. In SuperChic v3.0 the probability for exclusive interactions turns on smoothly for Pb+Pb impact parameters in the range of 15-20 fm and it is unity for larger values, while the alternative prediction fully suppresses these interactions for impact parameters below 14 fm when two nuclei overlap during the collision. This difference leads to a fiducial cross section for LbyL scattering that is by about 3% larger in the alternative calculation than in the prediction from SuperChic v3.0.
The exclusive diphoton final state can also be produced via the strong interaction through a quark loop in the exchange of two gluons in a colour-singlet state. This central exclusive production (CEP) background contribution, → , was modelled using SuperChic v3.0. Background from two-photon production of quark-antiquark pairs was estimated using H ++ 2.7.1 [38] where the Equivalent Photon Approximation (EPA) formalism in collisions is implemented. The sample was then normalised to cover the differences in equivalent photon fluxes between the Pb+Pb and cases.
Exclusive dielectron pairs from the reaction Pb+Pb ( ) → Pb ( * ) +Pb ( * ) + − are used for various aspects of the analysis, in particular to validate the EM calorimeter energy scale and resolution. This → + − process was modelled with the STARlight v2.0 MC generator [39], in which the cross section is computed by combining the Pb+Pb photon flux with the LO formula for → + − . The background contribution from a related process, → + − , was modelled using STARlight v2.0 interfaced with Pythia 8.212 [40] for the simulation of -lepton decays.
Events for the ALP signal were generated using STARlight v2.0 for ALP masses ( ) ranging between 5 and 100 GeV. A mass spacing of 1 GeV was used for 5 < < 30 GeV, while for > 30 GeV a 10 GeV mass spacing was used. The width of the simulated ALP resonance is well below the detector resolution in all simulated samples.
All generated events were passed through a detector simulation [41] based on GEANT4 [42] and are reconstructed with the standard ATLAS reconstruction software.

Event selection
Candidate diphoton events were recorded using a dedicated trigger for events with moderate activity in the calorimeter but little additional activity in the entire detector. The trigger strategies for the 2015 and 2018 data sets were different. In particular, the latter aimed at improving the trigger efficiency at low photon transverse energy, T , values. At Level-1 in 2015, the total T registered in the calorimeter after noise suppression was required to be between 5 and 200 GeV. In 2018, a logical OR of two Level-1 conditions was required: (1) at least one EM cluster with T > 1 GeV in coincidence with total T registered in the calorimeter between 4-200 GeV, or (2) at least two EM clusters with T > 1 GeV with total T registered in the calorimeter below 50 GeV. At the HLT, events in 2015 were rejected if more than one hit was found in the inner ring of the MBTS (MBTS veto). In 2018, a requirement of total T on each side of the FCal detector to be below 3 GeV was imposed. Additionally, in both data sets a veto condition on activity in the Pixel detector, hereafter referred to as Pixel-veto, had to be satisfied. The number of hits was required to be at most 10 in 2015, and at most 15 in 2018.
Photons are reconstructed from EM clusters in the calorimeter and tracking information provided by the ID, which allows the identification of photon conversions [43]. Selection requirements are applied to remove EM clusters with a large amount of energy from poorly functioning calorimeter cells, and a timing requirement is made to reject out-of-time candidates. An energy calibration specifically optimised for photons [44] is applied to the candidates to account for upstream energy loss and both lateral and longitudinal shower leakage. The calibration is derived for nominal collisions with dedicated factors applied to account for a negligible contribution from multiple Pb+Pb collisions at the same bunch crossing. A correction [44] is applied to photons in MC samples to account for potential mismodelling of quantities which describe shower shapes of the associated EM showers.
The photon particle identification (photon PID) in this analysis is based on a selection of the shower-shape variables, optimised for the signal events. Only photons with T > 2.5 GeV and | | < 2.37, excluding the calorimeter transition region 1.37 < | | < 1.52, are considered. The pseudorapidity requirement ensures that the photon candidates pass through regions of the EM calorimeter where the first layer is segmented into narrow strips, providing good separation between genuine prompt photons and photons coming from the decay of neutral hadrons. The identification is based on a neural network trained on background photons extracted from data and photons from the signal MC simulation, as already used in the previous ATLAS measurement [28]. The PID requirements are optimised for low-T photons ( T < 20 GeV) to maintain a constant photon PID efficiency of 95% as a function of and T with respect to reconstructed photon candidates. They also select a purer sample of photons than obtained with the cut-based photon PID utilised in collisions [43].
Preselected events are required to have exactly two photons satisfying the above selection criteria, with a diphoton invariant mass greater than 5 GeV. In order to suppress the → + − background, a veto on charged-particle tracks (with T > 100 MeV, | | < 2.5, at least one hit in the Pixel detector and at least six hits in the Pixel and SCT detectors in total) is imposed. In order to reduce the background from electrons with poorly reconstructed tracks, candidate events are required to have no 'pixel tracks' in the vicinity of the photon candidate. Pixel tracks are reconstructed using only the information from the Pixel detector, and are required to have T > 50 MeV, | | < 2.5, and at least three hits in the Pixel detector. In order to suppress fake pixel tracks due to noise in the Pixel detector, only pixel tracks with Δ < 0.5 from the photons are considered. These requirements reduce the fake-photon background from the dielectron final state by a factor of about 10 4 , according to simulation. They have minor impact on → signal events (93% efficiency for the track veto and 99% for the pixel-track veto), since the probability of photon conversion in the Pixel detector is relatively small and the converted photons have a low probability of being reconstructed at very low T due to the presence of low-momentum electron tracks.
Due to the absence of tracks in the LbyL signal events, no primary vertex is reconstructed. The photon direction is estimated using the barycentre of the cluster with respect to the origin of the ATLAS coordinate system.
To reduce other sources of fake-photon background (involving mainly calorimeter noise and cosmic-ray muons), the transverse momentum of the diphoton system ( T ) is required to be below 1 GeV for < 12 GeV and below 2 GeV for > 12 GeV. To reduce real-photon background from CEP → reactions, an additional requirement on the diphoton acoplanarity, = (1 − |Δ |/ ) < 0.01, is used. The CEP process exhibits a significantly broader acoplanarity distribution than the → process because gluons recoil against the Pb nucleus, which then dissociates.
To select → + − candidates, events are required to pass the same trigger as in the diphoton selection. Each electron is reconstructed from an EM energy cluster in the calorimeter matched to a track in the ID [45]. The electrons are required to have a transverse energy T > 2.5 GeV and pseudorapidity | | < 2.47 with the calorimeter transition region 1.37 < | | < 1.52 excluded. They are also required to meet loose identification criteria based on shower-shape and track-quality variables [45]. The → + − events are selected by requiring exactly two oppositely charged electrons, no further charged-particle tracks coming from the interaction region (with the selection requirements as described above), and dielectron acoplanarity below 0.01.

Trigger efficiency
The trigger sequence used in the analysis consists of three independent requirements: Level-1, MBTS/FCal veto, and the requirement on low activity in the ID.
The Level-1 trigger efficiency was estimated with → + − events passing one of the independent supporting triggers. These triggers are designed to select events with single or double dissociation of Pb nuclei and small activity in the ID. They are based on a coincidence of signals in one or both ZDC sides with a requirement on the total T in the calorimeter to be below 50 GeV. Dielectron event candidates are required to have exactly two reconstructed tracks and two geometrically matched EM clusters, each with a minimum T of 1 GeV and | | < 1.47, excluding the calorimeter transition region 1.37 < | | < 1.52. The electron identification requirements are removed in order to accept more events in this very low T region, where the efficiencies to reconstruct and identify electrons are low. Furthermore, dielectron acoplanarity evaluated using electron charged-particle tracks is required to be below 0.01. The extracted Level-1 trigger efficiency is provided as a function of the sum of T of the two EM clusters reconstructed offline (   The MBTS and FCal veto efficiencies are estimated using → + − events recorded by supporting triggers. The MBTS veto efficiency is estimated to be (98 ± 2)% [26] and the FCal veto efficiency is found to be (99.1 ± 0.6)%. Both efficiencies are independent of kinematics.
Due to low conversion probability of signal photons in the Pixel detector, inefficiency of the Pixel-veto requirement at the trigger level is found to be negligible for diphoton event candidates.
The efficiency for selected → + − events to satisfy the Pixel-veto requirement is evaluated using a dedicated supporting trigger accepting events with at most 15 tracks at the HLT, out of which at least two had T > 1 GeV. At Level-1, the same trigger condition was applied as in the diphoton trigger. The FCal veto requirement was also imposed at the HLT. The Pixel-veto efficiency is parameterised using a second-order polynomial as a function of dielectron rapidity, . The efficiency reaches 80-85% for dielectron rapidity | | < 1 and drops to 45-50% at | | ≈ 2.5. This efficiency correction is applied to the → + − MC simulation.

Photon reconstruction and identification
The photon reconstruction efficiency is extracted from data using → + − events, where one of the electrons emits a hard-bremsstrahlung photon when interacting with the material of the detector. A tag-and-probe method is performed for events collected by the diphoton trigger with exactly one identified electron and exactly two reconstructed charged-particle tracks. The electron is considered a tag if it can be matched to one of the tracks with a Δ < 1.0 requirement. The electron is required to have T > 4 GeV and the track that is unmatched with the electron (trk2) must have T < 1.5 GeV. The electron-trk2 transverse momentum difference is treated as the transverse energy of the probe, since the additional hard-bremsstrahlung photon is expected to have T ≈ ( T − trk2 T ). The trk2 T < 1.5 GeV requirement ensures a sufficient Δ separation between the expected photon and the second electron. A hard-bremsstrahlung photon is expected to be within a distance of Δ = 1.0 around trk2 direction. Any additional background contribution to the exclusive → + − reaction is found to be very small in Pb+Pb UPC [46], and therefore it is considered negligible.
The data sample contains 2905 → + − ( ) bremsstrahlung photons and is used to extract the photon reconstruction efficiency, which is presented in Figure 3. The efficiency in data is approximately 60% for T = 2.5 GeV and reaches 90% at T = 6 GeV. Reasonable agreement between data and simulation is found. The distribution from Figure 3 is used to obtain the data-to-simulation scale factors that are used to correct the MC simulation.
High-T exclusive dilepton production ( → ℓ + ℓ − with ℓ ± = ± , ± ) with final-state radiation (FSR) is used for data-driven measurements of the photon PID efficiency, defined as the probability for a reconstructed photon to satisfy the identification criteria. Events with exactly two oppositely charged tracks with T > 0.5 GeV are selected in UPC events recorded by the diphoton or dimuon 3 triggers. In addition a requirement to reconstruct a photon candidate with T > 2.5 GeV and | | < 2.37, excluding the calorimeter transition region 1.37 < | | < 1.52, is imposed. A photon candidate is required to be separated from each track with the requirement Δ > 0.3. This condition avoids the leakage of the photon cluster energy to an electron cluster from the → + − process. The mass of the dilepton system is required to be above 1.5 GeV. The FSR event candidates are identified using a tt T < 1 GeV requirement, where 3 The dimuon trigger required a muon candidate with T > 4 GeV reconstructed at Level-1 and at least two tracks with T above 1 GeV among up to 15 tracks found at the HLT.   In the data for photons with T < 5 GeV, the photon PID efficiency is in the range of 91-93% in the 2018 set, while it is found to be 97-100% in the 2015 set. This difference is due to slightly different detector conditions between the 2015 and 2018 data-taking periods, causing the photon shower-shape distributions to be narrower in the 2015 data. Based on these studies, MC simulated events are corrected using photon T -dependent data-to-simulation scale factors separately for the 2015 and 2018 data sets.

Photon energy calibration
The EM energy scale and energy resolution are validated in data using → + − events. The two electrons from the → + − reaction exhibit balanced transverse momenta with | + T − − T |, expected to be below 30 MeV, which is much smaller than the EM calorimeter energy resolution. Therefore, the energy resolution, cluster T , can be determined from the measurement of cluster1 where cluster1 T and cluster2 T are the transverse energies of the two clusters. At low electron-T (below 10 GeV) the value of cluster T / cluster T is observed to be 8-10% in data, which agrees well with the resolution obtained from simulation.  The EM energy scale is cross-checked using the ratio of electron cluster T to electron track T . It is observed that the simulation provides a good description of the T / trk T distribution. Figure 5 presents detector-level distributions for events passing the → + − selection (outlined in Section 4) in the 2018 Pb+Pb data. In total, 28 045 → + − event candidates are observed. The shaded bands reflect systematic uncertainties due to electron energy scale and resolution, electron reconstruction and identification, and trigger efficiency. In general, the STARlight prediction describes the normalisation and shapes of distributions well. Small systematic differences between the central values of the exclusive dielectron data and the MC prediction are seen in the tail of the dielectron T distribution, likely due to a missing contribution from the QED final-state radiation which is not simulated by the MC generator.

Control distributions for exclusive → + − production
The low number of → + − events collected by a control trigger in the 2015 Pb+Pb data precludes precision comparisons between data and MC simulation in that sample. In particular, the tighter Pixel-veto requirement imposed at the HLT necessitates a dedicated pseudorapidity-dependent trigger efficiency correction which, due to the limited number of → + − events, could only be extracted with 20% precision. Nevertheless, overall reasonable agreement was found within large uncertainties as demonstrated in the previous ATLAS publication [26]. 6 Background estimation 6

.1 Dielectron final states
The → + − process has a relatively high cross section and can be a source of fake diphoton events. The electron-to-photon misidentification can occur when the electron track is not reconstructed or the electron emits a hard bremsstrahlung photon.
The → + − yield in the signal region defined in Section 4 is estimated using a fully data-driven method. A control region is defined requiring exactly two photon candidates passing the signal selection, and one or two pixel tracks. This control region is denoted by CR =1, 2 PixTrk . The event yield observed in CR =1, 2 PixTrk is extrapolated to the signal region using the probability of missing the electron pixel track if the standard track is not reconstructed ( mistag ). The mistag value is measured in data using events with exactly one standard track and two photon candidates having < 0.01. It is measured to be mistag = (47 ± 9)%, where the uncertainty is estimated by relaxing the requirement. It is also found that mistag does not depend on the probed photon T and . The number of → + − events in the signal region is estimated to be → + − = 15 ± 7, where the uncertainty accounts for the mistag uncertainty and limited event yield in CR =1, 2 PixTrk . This uncertainty also covers the differences if the → + − yield is instead extrapolated from event yields for individual pixel-track multiplicities ( = 1 or = 2).
The distribution shapes of various kinematic variables of → + − background in the signal region are taken from data in CR =1 PixTrk . The shape uncertainty is constructed by comparing kinematic distributions from data in CR =1 PixTrk with the distributions from data in CR =2 PixTrk .

Central exclusive diphoton production
The CEP → background is estimated from MC simulation with the overall rate of this process evaluated in the control region in the data. The normalisation is constrained using the condition: where data denotes the number of observed events, → is the expected CEP → event yield, sig is the expected number of signal events (from MC simulation) and → is the + − background yield. The → is estimated using the same data-driven method as described in Section 6.1. The diphoton acoplanarity distribution for events satisfying the signal region selection, but before applying the < 0.01 requirement is shown in Figure 6. The predictions provide a fair description of the shape of the data distribution.
The uncertainty in the CEP → background process takes into account the limited number of events in the > 0.01 control region (11%), as well as experimental and modelling uncertainties. It is found that all experimental uncertainties have a negligible impact on the CEP → background estimate. The impact of the MC modelling uncertainty on the shape of the acoplanarity distribution is estimated using an alternative SuperChic v2.0 MC sample with extra gluon interactions (no absorptive effects). This leads to a 21% change in the CEP background yield in the signal region, which is taken as a systematic uncertainty. An additional check is performed by varying the parton distribution function (PDF) of the gluon. The differences between leading-order MMHT 2014 [47], CT14 [48] and NNPDF3.1 [49] PDF sets have negligible impact on the shape of the diphoton acoplanarity distribution.
In addition, the energy deposition in the ZDC, which is sensitive to the dissociation of Pb nuclei, is studied for events before the < 0.01 requirement is imposed. Good agreement is observed in the > 0.01 control region between the data-driven CEP estimate and the observed events with a signal corresponding to at least one neutron in the ZDC. In the signal region ( < 0.01), approximately 70% of observed events have a signal corresponding to no neutrons in the ZDC, which is consistent with the signal-plus-background hypothesis.
The background due to CEP in the signal region is estimated to be 12 ± 3 events. In the differential cross-section measurements, the shape uncertainty is evaluated using the alternative SuperChic v2.0 MC sample.

Other background sources with prompt photons
The contribution from the → + − process is evaluated using the M G 5_aMC@NLO v2.4.3 MC generator [50] and the Pb+Pb photon flux from STARlight. This contribution is estimated to be below 1% of the expected signal and is consequently ignored in the analysis.
Syst. uncertainty Figure 6: The diphoton acoplanarity distribution for events satisfying the signal region selection, but before applying the < 0.01 requirement. Data are shown as points with statistical error bars, while the histograms represent the expected signal and background levels. The CEP → background is normalised in the > 0.01 control region. The signal prediction is normalised to the same integrated luminosity as the data. The shaded band represents the uncertainties in signal and background predictions, excluding the uncertainty in the luminosity.
The contribution from UPC events where both nuclei emit a bremsstrahlung photon is estimated using calculations from Ref. [53]. The cross section for single-bremsstrahlung photon production from a Pb ion in the fiducial region of the measurement is calculated to be below 10 −4 pb, so the coincidence of two such occurrences is negligible.

Other fake-photon background
The background contribution from →¯production is estimated using MC simulation based on H ++ and it contributes less than 1 event to the total number of events in the signal region. The expected yield for the background from → + − process is estimated using MC simulation based on STARlight + Pythia 8 and is found to be less than 0.5 events. Both of these background sources are considered negligible.
Exclusive two-meson production can be a potential source of background for LbyL scattering events, mainly due to their similar back-to-back topology. Mesons can fake photons either by their decay into photons ( 0 , , ) or by mis-reconstructed charged-particle tracks (for example + , − states). Estimates for such contributions are reported in Refs. [25, 54-57] and these contributions are considered to be negligible in the signal region.
The background from fake diphoton events induced by cosmic-ray muons is estimated using a control region with at least one track reconstructed in the muon spectrometer and further studied using the reconstructed photon-cluster time distribution. The latter method is also used to estimate the background originating from calorimeter noise. After imposing the T requirements, these background contributions are below 1 event and are considered negligible.

Systematic uncertainties
Systematic uncertainties in the → cross-section measurements arise from the reconstruction of photons, the background determination, and integrated luminosity uncertainty, as well as the procedures used to correct for detector effects.
The precision of the Level-1 trigger efficiency estimation is limited by the number of events recorded by the supporting trigger. As a systematic check, the + − event selection is varied. In total, the impact of the Level-1 trigger efficiency uncertainty on the expected signal yield is 5%. The uncertainty in the MBTS/FCal veto efficiency has negligible impact on the results.
The uncertainty in the photon reconstruction and PID efficiencies is estimated by parameterising the scale factors as a function of the photon pseudorapidity, instead of the photon transverse momentum. This affects the expected signal yield by 4% (photon reconstruction efficiency) and 2% (photon PID efficiency). The variation of the selection criteria used in data-driven efficiency measurements has negligible impact on the results. The statistical uncertainty of the photon reconstruction and PID efficiency corrections is propagated using the pseudo-experiment method in which the correction factors are randomly shifted in an ensemble of pseudo-experiments according to the mean and standard deviation of the correction factor. This has negligible impact on the expected signal.
The uncertainties related to the photon energy scale and resolution affect the expected signal yield by 1% and 2%, respectively. The uncertainty due to imperfect knowledge of the photon angular resolution is estimated using electron clusters from the → + − process. The data-MC difference in the electron cluster resolution is applied as an extra smearing to photons from the signal MC sample. This results in a 2% shift of the signal yield, which is taken as a systematic uncertainty.
The uncertainty due to the choice of signal MC generator is estimated by using an alternative signal MC sample, as detailed in Section 3. This affects the signal yield by 1% which is taken as a systematic uncertainty. The uncertainty due to the limited signal MC sample size is 1%.
The uncertainties in the background estimation are evaluated as described in Section 6.
The uncertainty in the integrated luminosity of the data sample is 3.2%. It is derived from the calibration of the luminosity scale using -beam-separation scans, following a methodology similar to that detailed in Ref. [58], and using the LUCID-2 detector for the baseline luminosity measurements.
Systematic uncertainties associated with the background estimate, the photon PID and reconstruction efficiency, photon energy scale, and photon angular and energy resolution are fully correlated between the 2015 and 2018 data-taking periods. Systematic uncertainties in the trigger efficiency are computed separately for each data-taking period. They are dominated by the statistical uncertainty of each data set and are thus uncorrelated.

Kinematic distributions
Photon kinematic distributions comparing the selected data with the sum of expected event yields from simulated signal and background processes in the signal region are shown in Figure 7. In total, 97

Integrated fiducial cross section
The inclusive cross section for the → process is measured in a fiducial phase space, defined by the following requirements on the diphoton final state, reflecting the selection at reconstruction level: both photons have to be within | | < 2.4 with a transverse momentum of T > 2.5 GeV. The invariant mass of the diphoton system has to be > 5 GeV with transverse momentum of T < 1 GeV. In addition, the photons must fulfil an acoplanarity requirement of < 0.01.
The integrated fiducial cross section is obtained as follows: where data = 97 is the number of selected events in data, bkg = 27 ± 5 is the number of background events, ∫ d = 2.22 ± 0.07 nb −1 is the integrated luminosity of the data sample and = 0.263 ± 0.021 is the overall correction factor that accounts for detector efficiencies and resolution effects, and for signal events passing the event selection but originating from outside the fiducial phase space (fiducial corrections). The factor is defined as the ratio of the number of reconstructed MC signal events passing the selection to the number of generated MC signal events satisfying the fiducial requirements.
The uncertainty in is estimated by varying the data/MC correction factors within their uncertainties as described in Section 7, in particular for the photon reconstruction and PID efficiencies, photon energy scale and resolution and trigger efficiency. An overview of the various uncertainties in is given in Table 1.
The uncertainty in bkg is dominated by the uncertainty in the → + − background. This has a 6% impact on the estimated integrated fiducial cross section.     The measured integrated fiducial cross section is fid = 120 ± 17 (stat.) ± 13 (syst.) ± 4 (lumi.) nb, which can be compared with the predicted values of 80 ± 8 nb from Ref.
The theoretical uncertainty in the cross section is primarily due to limited knowledge of the nuclear (EM) form-factors and the related initial photon fluxes. This is extensively studied in Ref. [59] and the relevant uncertainty is estimated to be 10% within the fiducial phase space of the measurement. For masses below 100 GeV, this uncertainty does not exhibit a dependence on the diphoton mass. Higher-order corrections (not included in the calculations) are also part of the theoretical uncertainty and are of the order of 1-3% in the corresponding invariant mass range [35,36].

Differential fiducial cross sections
Differential fiducial cross sections as a function of diphoton invariant mass, diphoton absolute rapidity, average photon transverse momentum and diphoton | cos * | are unfolded to particle level in the fiducial phase space described in the previous section.
The differential fiducial cross sections are determined using an iterative Bayesian unfolding method [60] with one iteration for all distributions. The unfolding procedure corrects for bin migrations between particleand detector-level distributions due to detector resolution effects, and applies reconstruction efficiency as well as fiducial corrections. The reconstruction efficiency corrects for events inside the fiducial region that are not reconstructed in the signal region due to detector inefficiencies; the fiducial corrections take into account events that are reconstructed in the signal region, but originate from outside the fiducial region. The background contributions are subtracted from data prior to unfolding.
The statistical uncertainty of the data is estimated using 1000 Poisson-distributed pseudo-data sets, constructed by smearing the observed number of events in each bin of the detector-level distribution. The root mean square of the differences between the resulting unfolded distributions and the unfolded data is taken as the statistical uncertainty in each bin.
In the measurement of differential fiducial cross sections, the full set of experimental systematic uncertainties described in Section 7 is considered. In addition, uncertainties due to the unfolding procedure and the modelling of the signal process are considered by repeating the cross-section extraction with modified inputs [61]. The distributions are reweighted at generator level to obtain better agreement between data and simulation after event reconstruction. The obtained prediction at detector level, which is then similar to data, is unfolded with the input of the default unfolding and the difference from the reweighted prediction at generator level is considered as an uncertainty. The size of this uncertainty is typically below 1%. The impact of statistical uncertainties in the signal simulation is estimated using pseudo-data and is found to be 1-3%.
The unfolded differential fiducial cross sections are shown in Figure 8. They are compared with the predictions from SuperChic v3.0, which provide a fair description of the data, except for the overall normalisation differences. For nearly all variables and bins the total uncertainties in the cross-section measurements are dominated by statistical uncertainties, ranging from 25% to 75%. The background systematic uncertainties are large and comparable to statistical uncertainties in some bins (up to 40%, mainly at high | |) due to the limited number of events in the data control regions. Global 2 comparisons are carried out for the shapes of differential distributions. They do not display any significant differences between predictions and data, with the largest 2 per degree of freedom being 4.3/3 when comparing the  shape of | cos( * )| distribution. The differential fiducial distribution is measured up to = 30 GeV. For > 30 GeV, no events are observed in data versus a total expectation of 0.8 events.
The cross sections for all distributions shown in this paper, including normalised differential fiducial cross sections, are available in HepData [62].

Search for ALP production
Any particle coupling directly to photons could be produced in an -channel process in photon-photon collisions, leading to a resonance peak in the invariant mass spectrum. One popular candidate for producing a narrow diphoton resonance is an axion-like particle (ALP) [12]. The measured diphoton invariant mass spectrum, as shown in Figure 7, is used to search for → → process, where denotes the ALP.
The LbyL, → + − and CEP → processes are considered as background. The contribution from → + − and CEP → processes is estimated using data-driven techniques as described in Section 6. The LbyL background is estimated using simulated events generated with SuperChic v3.0. These events are normalised to the data yield, after subtracting → + − and CEP → contributions and excluding the mass search region. To smooth statistical fluctuations in the background shape at high mass, a Crystal Ball function is fitted to the sum of all background contributions, while assigning the fit residuals as additional systematic uncertainty.
Events simulated with STARlight v2.0 [39], which implements the ALP couplings as described in Ref. [12], for various ALP masses between 5 GeV and 100 GeV are used to build an analytical model of the ALP signal, interpolating between the simulated mass points. The efficiency of ALP events to satisfy the selection criteria (outlined in Section 4) is about 20% for = 6 GeV and increases up to 45% for = 12 GeV. An efficiency plateau of about 80% is reached for an ALP mass around 40 GeV. The diphoton invariant mass resolution for simulated ALP signal ranges from 0.5 GeV at = 6 GeV to 1.5 GeV at = 100 GeV and is dominated by the photon energy resolution. The impact of the uncertainty on the primary-vertex position has a subdominant effect on the diphoton invariant mass resolution over the full mass range.
In every analysis bin a cut-and-count analysis is performed to estimate the expected numbers of background and signal events. The bin-width is chosen to include at least 80% of a reconstructed ALP signal peak within a given bin and ranges from 2 GeV to 20 GeV. To cover the entire mass range, the analysis bins overlap and have an equidistant distance of 1 GeV between the bin centres. The signal contribution is fitted individually for every bin using a maximum-likelihood fit implemented in the HistFitter software [63][64][65] which is based on HistFactory [66], RooFit [67] and RooStats [68].
Since no significant deviation from the background-only hypothesis is observed, the result is then used to estimate the upper limit on the ALP signal strength ( CLs ) at 95% confidence level (CL). The corresponding test-statistic distributions are evaluated using pseudo-experiments.
Experimental systematic uncertainties affecting the ALP signal model originate from the trigger, photon PID and reconstruction efficiencies, and photon energy scale and resolution. The systematic uncertainties are evaluated identically to the treatment in the cross-section measurements, described in Section 7. The theoretical uncertainty in the calculated ALP signal cross section is 10% in the full mass range, due to the limited knowledge of the initial photon fluxes [59]. This uncertainty is considered uncorrelated with other sources of uncertainty.
The limits set on the signal strength CLs are transformed into limits on the cross section CLs → → = CLs · MC ,gen . Additionally, limits on the ALP coupling to photons (1/Λ CLs ) are calculated from 1/Λ CLs = √ CLs · 1/Λ gen . MC ,gen and Λ gen are the cross section and coupling used in the MC generator. The observed and expected 95% CL limits on the ALP production cross section and ALP coupling to photons are presented in Figure 9. The limits set on the cross section → → for an ALP with a mass of 6-100 GeV range from 70 nb to 2 nb. The derived constraints on 1/Λ range from 0.3 TeV −1 to 0.06 TeV −1 . The widths of the one-and two-standard-deviation bands of the expected limit distribution decrease for ALP masses above 30 GeV. This behaviour is driven by the change in the background rate, which has a low Poisson mean for high ALP masses. For low ALP masses the background rate is sufficiently high to populate the > 0 expected background outcomes and raise the +1 and +2-standard-deviation boundaries. The discontinuity at = 70 GeV is caused by the increase of the mass-bin width which brings an increase in signal acceptance.
Assuming a 100% ALP decay branching fraction into photons, the derived constraints on the ALP mass and its coupling to photons are compared in Figure 10 with those obtained from various experiments [27,[69][70][71][72]. The exclusion limits from this analysis are the strongest so far for the mass range of 6 < < 100 GeV. 6   (left) and ALP coupling 1/Λ (right) for the → → process as a function of ALP mass . The observed upper limit is shown as a solid black line and the expected upper limit is shown by the dashed black line with its ±1 and ±2 standard deviation bands. The discontinuity at = 70 GeV is caused by the increase of the mass-bin width which brings an increase in signal acceptance.    [27,[69][70][71][72] are compared with the limits extracted from this measurement. The exclusion limits labelled "LHC ( )" are based on collision data from ATLAS and CMS. All measurements assume a 100% ALP decay branching fraction into photons. The plot on the right is a zoomed-in version covering the range 1 < < 120 GeV.

Conclusions
This paper presents a measurement of the light-by-light scattering process in quasi-real photon interactions from ultra-peripheral Pb+Pb collisions at √ NN = 5.02 TeV by the ATLAS experiment at the LHC. The measurement is based on the full Run 2 data set corresponding to an integrated luminosity of 2.2 nb −1 . After the selection criteria, 97 events are selected in the data while 27 ± 5 background events are expected. The dominant background processes are estimated using data-driven methods.
After background subtraction and corrections for detector effects are applied, the integrated fiducial cross section of the → process is measured to be fid = 120 ± 17 (stat.) ± 13 (syst.) ± 4 (lumi.) nb. The data-to-theory ratios are 1.50 ± 0.32 and 1.54 ± 0.32 for predictions from Ref.
[37] and from the SuperChic v3.0 MC generator, respectively. Differential fiducial cross sections are measured as a function of several properties of the final-state photons and are compared with Standard Model theory predictions for light-by-light scattering. All measured cross sections are consistent within 2 standard deviations with the predictions. The measurement precision is limited in all kinematic regions by statistical uncertainties.
The measured diphoton invariant mass distribution is used to search for axion-like particles and set new exclusion limits on their production in the Pb+Pb ( ) → Pb ( * ) +Pb ( * ) reaction. Integrated cross sections above 2 to 70 nb are excluded at the 95% CL, depending on the diphoton invariant mass in the range 6-100 GeV. These results provide, to this date and within the aforementioned mass range, the most stringent constraints in the search for ALP signals.