Search for the exotic decay of the Higgs boson into two light pseudoscalars with four photons in the final state in proton-proton collisions at √ s = 13 TeV

A search for the exotic decay of the Higgs boson to a pair of light pseudoscalars, each of which subsequently decays into a pair of photons, is presented. The search uses data from proton-proton collisions at √ s = 13 TeV recorded with the CMS detector at the LHC that corresponds to an integrated luminosity of 132 fb − 1 . The analysis probes pseudoscalar bosons with masses in the range 15–62 GeV, coming from the Higgs boson decay, which leads to four well-isolated photons in the final state. No significant deviation from the background-only hypothesis is observed. Upper limits are set on the product of the Higgs boson production cross section and branching fraction into four photons. The observed (expected) limits range from 0.80 (1.00) fb for a pseudoscalar boson mass of 15 GeV to 0.26 (0.24) fb for a mass of 62 GeV at 95% confidence level.


Introduction
In 2012, a Higgs boson (H) with a mass of 125 GeV was discovered by the ATLAS and CMS experiments [1][2][3] at the CERN LHC. Since its discovery, both collaborations have performed precision measurements of the spin, parity, width, and couplings of the Higgs boson in its various production and decay modes [4][5][6][7][8][9][10][11][12][13][14], all of which indicate that Higgs boson properties are compatible with the standard model (SM) predictions. However, data collected at √ s = 13 TeV provide an upper limit on the branching fraction of the Higgs boson to undetected states of about 40% at 95% confidence level (CL) [15]. This leaves a large margin for beyond-the-SM (BSM) decays of the Higgs boson. Various theoretical models, such as 2HDM+S models [16], describe a BSM Higgs boson that decays to light bosons, which has not yet been excluded.
Multiple searches for exotic decays of the Higgs boson have been performed using the 8 TeV [17][18][19][20] and 13 TeV [21][22][23][24][25][26][27] data collected at the LHC. Decays of the type H → aa, where a is a light pseudoscalar boson, are well motivated in various BSM scenarios [28][29][30][31], in particular in 2HDM+S models. In many scenarios, such as fermiophobic a decays, the branching fraction of the pseudoscalar bosons to a pair of photons is close to unity, which enables this search at the LHC. The final state, with four photons, provides an experimental signature that has very small contributions from SM processes and is thus an important channel for the search for light pseudoscalar bosons. This paper presents a search for light pseudoscalar bosons that arise from the decay of a Higgs boson, with four photons in the final state: H → aa → γγγγ. The event topology depends on the mass of the pseudoscalar boson being probed, which determines the opening angle between the photons for each pair. This analysis considers only events with four isolated photons in which the angular distance between the photons, for both photon pairs, is greater than 0.2. These requirements enable a search for pseudoscalar bosons that range in mass from 15 to 62 GeV and produce an experimental signature with four well isolated and fully reconstructed photons. The 15 GeV mass boundary is chosen because below that value most of the events have at least one merged photon pair or one photon that does not pass the reconstruction criteria. The signal reconstruction efficiency increases as the mass increases. The dominant Feynman diagram contributing to this process is shown in Fig. 1.
A previous search for light pseudoscalar bosons in events with at least three photons was performed by the ATLAS Collaboration using the LHC data collected at √ s = 8 TeV [32]. The first search of this type done by the CMS Collaboration is presented in this paper. Data collected by the CMS experiment from 2016 to 2018, which correspond to an integrated luminosity of

Data samples and simulated events
The proton-proton (p-p) collision data at √ s = 13 TeV used in this analysis were collected in 2016, 2017, and 2018 and correspond, respectively, to integrated luminosities of 36.3, 41.5, and 54.4 fb −1 . This is 5.4 fb −1 less than the standard CMS detector collected luminosity [40][41][42], because a required trigger was missing for a short period. Events are selected using a high-level diphoton trigger, optimized for the low-mass diphoton Higgs boson search [43], with photon transverse momentum (p T ) thresholds of 30 GeV and 18 GeV. Additionally, calorimeter-based identification requirements, which use information such as the shape of the electromagnetic shower, the isolation of the photon candidate, and the ratio between the hadronic and electro-magnetic energy deposit of the shower, are applied to the photon candidates at trigger level. Furthermore, the chosen trigger requires the diphoton candidate to have an invariant mass greater than 55 GeV in data collected during 2016-2017. Each event is required to contain at least one diphoton candidate that satisfies these high-level trigger requirements. The invariant mass selection on the two highest p T photons discards less than 25% of the signal events.
The simulated signal samples were generated corresponding to a pseudoscalar boson mass, m a , ranging from 15 to 60 GeV, in steps of 5 GeV, assuming a Higgs boson mass of 125 GeV. These samples were simulated considering only the gluon fusion production mode of the Higgs boson, using MADGRAPH5 aMC@NLO v2.4.2 [44].
The dominant backgrounds in this search are SM production of γγ + jets, γ + jets, as well as multijet events, in which jets are misidentified as isolated photons. As in Refs. [45,46], the background contributions are modeled entirely from data.
All simulated samples used in this analysis model QCD showering and hadronization with the PYTHIA 8.212 [47] event generator. The CUETP8M1 tune was used for data collected in 2016 and the CP5 tune was used for data collected in 2017-2018 [48,49]. The response of the CMS detector is modeled using the GEANT4 [50] package. The simulated events also include additional p-p interactions within the same or nearby bunch crossings (pileup). The average pileup in the 2016 (2017-2018) datasets is 23 (32) vertices. The simulated events are weighted to reproduce the distribution of pileup in data.

Event reconstruction
The particle-flow algorithm [36] is used to reconstruct photon candidates from energy clusters in the ECAL that are not matched to any charged particle trajectories originating in the pixel detector. The energy of the photon candidates is calculated by applying in-situ measured calibrations to the reconstructed hits in the ECAL. A multivariate regression technique is employed to correct the photon energies measured by the ECAL. These procedures are described in Ref. [37].
Deposits from quark fragmentation and hadronization are clustered into hadronic jets. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, while the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy deposits. Jets are clustered using the anti-k T jet finding algorithm [51,52] with a distance parameter of 0.4. The missing transverse momentum vector ⃗ p miss T is computed as the negative vector sum of the transverse momenta of all the PF candidates in an event, and its magnitude is denoted as p miss T [53].
Prompt photons are distinguished from jets, which could be misidentified as a photon, using a multivariate analysis (MVA) technique that uses information related to the photon's electromagnetic shower shape, isolation, energy, and η. This technique is detailed in Refs. [10,46].
Since photons do not leave deposits in the tracker, the most probable primary interaction vertex in the event is identified using a boosted decision tree (BDT). The primary vertex BDT is trained on simulated H → aa → γγγγ events and uses input variables related to tracks recoiling against the four-photon system and information related to photons converted in the tracker material, similar to Ref. [4]. A separate training is performed for each data-taking year (2016-2018) to properly model the variation in detector conditions over the three years, in particular with respect to the pixel detector upgrade. The analysis identifies the vertex with the highest BDT score as the primary vertex, which improves the resolution of the invariant mass of the Higgs boson candidate by approximately 3%. It also increases the vertex identification efficiency, 80% in total, which is defined as selecting a vertex within 1 cm of the true generator vertex, by about 10%, with respect to the vertex chosen with the largest value of summed p 2 T of the tracks.

Analysis strategy and selection
Events considered in this analysis are required to contain at least one diphoton candidate where both daughter photon candidates pass the identification criteria, which are slightly more stringent than the trigger requirements. Additionally, events must contain at least four identified photon candidates in the ECAL and tracker fiducial region (|η| < 2.5). This excludes the ECAL barrel-endcap transition region (1.44 < |η| < 1.57) where the photon reconstruction is suboptimal. The four photon candidates are also required to pass p T requirements. The p T thresholds on the first-and second-leading photons are 30 and 18 GeV, respectively. These selections are in sync with with the p T requirements at trigger level for the two leading photons. The thresholds on the third-and fourth-leading photons are 15 GeV, since the BDT-based photon identification criteria are optimal for p T > 15 GeV. When more than four photon candidates are found, the four candidates with the highest p T are chosen. Each photon candidate must pass an electron veto, based on the presence of geometrically compatible hits in the pixel detector. The four photon candidates with the highest p T are also required to have an invariant mass (m γγγγ ) between 110 and 180 GeV. The lower bound is chosen to avoid m γγγγ spectrum biases coming from the trigger selections. The Higgs boson candidate is constructed from the four photon candidates, which have passed all the previously described selection requirements.
To improve the sensitivity of the search, a 4-photon event classifier is trained to separate signal events from background events. The 4-photon event classifier utilizes the identification and kinematic information of the photons and pseudoscalar boson candidates. An optimized selection on the output of the event classifier is used to define the final signal regions for the analysis. The input variables for the classifier are uncorrelated with m γγγγ ; therefore, the shape of the m γγγγ spectrum is not affected by any selections placed on the 4-photon event classifier, as verified in simulation.
When all four photons from the decay of the pseudoscalar boson pair are within the acceptance criteria of the analysis, the H → aa → γγγγ signal will create a resonance in the m γγγγ distribution at 125 GeV. The analysis performs an unbinned maximum likelihood fit of the signal and background models to the observed m γγγγ distribution in data after a selection on the classifier output is applied. The signal model is constructed from a parametric fit to the simulated signal, while the background model is created using a data driven approach. After applying the 4-photon event classifier selection, the full analysis acceptance has a negligible dependence on the generated Higgs boson p T or the non-gluon-fusion Higgs boson production modes. Therefore limits are set on the product of the Higgs boson inclusive production cross section and branching fraction into the four photon final state.
In this paper, "nominal signal hypothesis" refers to simulated signal samples corresponding to a particular m a value. This assumes a branching ratio of a → γγ equal to unity. In these hypotheses, m a ranges from 15-60 GeV in steps of 5 GeV. The final results are reported with an m a granularity of 0.5 GeV up to m a = 40 GeV and 1 GeV for m a > 40 GeV, where the signal models for the intermediate mass hypotheses are constructed using the resolution of the signal model of the closest nominal mass and by interpolating the signal model normalization between the nominal masses.

Classifier training background
Because of the low event selection efficiency on the background samples, it is difficult to model the background from simulation. Therefore, the expected background model, which is used to train the 4-photon event classifier, is estimated from data.
The method, referred to as event mixing (a simplified version of "Hemisphere mixing" [54]), does not rely on a control or sideband region, but instead aims to estimate the background contribution using the original dataset as input. This procedure begins with using data events that have passed the trigger selections, as described in Section 3, and replacing three out of the four reconstructed photons in each event with reconstructed photons from three consecutive events to create a "mixed" dataset. Photon candidates in the mixed dataset are required to pass the selections described in Section 5. The psedoscalar boson candidates, a 1 and a 2 , are reconstructed considering all possible pair combinations among the chosen four photons, and taking the two pairs with the smallest value of the difference between the pair invariant masses. The shuffling of the reconstructed photons between the events not only constructs a dataset that is similar to the original data, but is also insensitive to the possible presence of a resonant signal. This procedure is performed separately for data collected during 2016, 2017, and 2018.
The events in the mixed dataset can potentially have different kinematic properties from those in the original data. To correct this, a multi-dimensional per-event weight is calculated by comparing events from the mixed dataset to the original data in the m γγγγ sideband region, i.e. 110-115 or 135-180 GeV. The weight is computed from the ratio of the four-dimensional histograms filled with data and event mixing events, using the following variables: angular distance between the two pseudoscalar boson candidates, defined by where ϕ is azimuthal angle; the transverse momenta of the two pseudoscalar bosons (p T,a1 and p T,a2 ); and the difference between invariant mass of the two pseudoscalar boson candidates (m a1 − m a2 ). This weight is applied to each event in the mixed dataset, and the reweighted events are then used to train the classifier.
The event mixing dataset is used only for training the 4-photon classifier and to optimize the selection on the classifier score. Since the background model used in the final maximum likelihood fit is obtained directly from data, any residual disagreement between data and event mixing in the background-like regions cannot induce any bias, and it could only result in suboptimal performance of the classifier.

Event classification
A dedicated 4-photon boosted decision tree (BDT) event classifier is trained to separate signal from background. The training sample is parameterized as a function of m a in order to make the classifier output uniform and sensitive to the full range of signal hypotheses considered in the search [55]. In this approach, a parameter equal to the hypothesized pseudoscalar boson mass (m a,hyp ) is provided as input to the training. The set of variables is chosen such that m γγγγ cannot be inferred from the inputs. This is done by verifying that their correlation with m γγγγ variable is negligible and that the m γγγγ spectrum is not distorted by applying a selection on the classifier output.
The parameterized classifier requires only a single training and is able to provide a smooth interpolation to pseudoscalar boson mass hypotheses not used for the training. The training signal sample is obtained by merging all generated signal samples of equal size with masses between 15 and 60 GeV in steps of 5 GeV. The value of the m a,hyp parameter is equal to the corre-sponding true mass for the nominal signal simulation. The event mixing dataset, as described in Section 6, is used as the background in the training. For the background, the value of the parameter m a,hyp is randomly distributed as a flat function among all possible nominal m a values in 5 GeV steps.
The variables used in the training help separate isolated photons originating from the decay of the pseudoscalar boson from those from prompt and non-prompt processes. Pseudoscalar boson candidates are constructed from the four photon candidates as follows. For every possible combination of two photon candidate pairs, the difference between the invariant masses of the paired photons (∆m) is evaluated. The pairs with the smallest value of ∆m are chosen to reconstruct the pseudoscalar boson candidates.
The following discriminating variables are provided as input to the training: 1. Photon identification BDT score for all four photons. 5. Angular distance ∆R a 1 a 2 divided by m γγγγ , i.e., ∆R a 1 a 2 /m γγγγ .
6. Angle cos θ * aγ in the pseudoscalar boson rest frame, between the leading pseudoscalar boson candidate and the leading photon produced from its decay, chosen in the laboratory frame. This variable is sensitive to the spin of the pseudoscalar boson object.
As part of this training procedure, simulated signal and background datasets from the three data-taking years (2016-2018) are scaled by their appropriate integrated luminosities and combined. This combination of datasets from three years provides large training statistics. Additionally, the signal and background samples are divided in half to create the training and testing samples.
The distributions of the four most highly ranked variables in the training: (m a1 − m a,hyp )/m γγγγ , m a1 − m a2 , and the photon identification BDT score of the third and fourth photon are shown in Fig. 2 for the event mixing dataset, data, and signal simulation for various pseudoscalar boson mass hypotheses. It shows the contributions from the event mixing dataset and the data from the m γγγγ sidebands, satisfying either m γγγγ = 110-115 or 135-180 GeV. The distributions of the event mixing dataset and data are found to be in reasonable agreement, and the residual disagreement does not induce any biases in the analysis since the final background model is derived directly from data.
A unique 4-photon BDT output is obtained for each pseudoscalar boson mass hypothesis. The difference between the correlations of the input variables used in the training leads to a disagreement in the output shape of the BDT between the event mixing model and data. This difference is addressed by reweighting the BDT output distribution for the event mixing model to match output distribution for data from the m γγγγ sideband region.
In order to maximize the sensitivity of the analysis, events are categorized according to the output of the 4-photon BDT. The categorization is optimized by maximizing the approximate    Figure 2: Distributions of the four most highly ranked discriminating variables: the difference between the invariant masses of the pseudoscalar bosons and the m a,hyp parameter, divided by the invariant mass of the four-photon system (upper left), with off-zero signal peaks from photon pairing mismatches (those events are discarded after the 4-photon event classifier selection); the difference between the invariant masses of the pseudoscalar boson (upper right); the photon identification BDT score of the third leading, γ 3 (lower left) and the fourth leading, γ 4 (lower right) photons. The events shown are selected from the m γγγγ sidebands (110 < m γγγγ < 115 GeV or 135 < m γγγγ < 180 GeV) for event mixing and data after fulfilling the selection criteria described in Section 4, while the signals are scaled with a cross-section of 1 pb. mean significance (AMS) [56] over all possible categories in a window covering the region: 115 < m γγγγ < 135 GeV. In particular, AMS is computed for each category, and the total AMS, defined as the sum in quadrature of the AMS of each category, is maximized. The AMS of each category is defined as: In Eq. 1, S refers to the number of expected signal events and B refers to the number of estimated background events. In order to minimize the impact of statistical fluctuations on the optimization of the BDT selection, the output BDT distribution of the event mixing dataset is smoothed, using the super-smoothing technique [57,58], prior to being used in this procedure. The distributions of the BDT output for data, simulated signal events, and event mixing dataset, after smoothing the BDT output distribution, is shown in Fig. 3 for m a = 15 and 50 GeV. Events shown in this distribution are selected after passing the criteria described in Section 4 and are in the mass window 110 < m γγγγ < 180 GeV. A priori, the greater the number of categories, the better the significance is within the uncertainties. Therefore this optimization procedure was performed separately for each of the nominal signal hypotheses for up to five categories of the 4-photon BDT score. However, as an increase of less than 1% in the AMS value was observed when increasing the number of categories beyond one, only a single category based on the BDT output was created for each pseudoscalar boson mass hypothesis.

Statistical procedure
The statistical procedure used in this analysis is identical to that described in Ref. [59], as developed by the ATLAS and CMS Collaborations. Simultaneous unbinned maximum likelihood fits are performed to the m γγγγ distributions of all analysis categories, with an m a granularity of 0.5 GeV for 15 < m a < 40 GeV and 1 GeV for 40 ≤ m a ≤ 62 GeV. A likelihood function is defined for each analysis category using analytic models to describe the m γγγγ distributions of signal and background events with nuisance parameters to account for the experimental and theoretical systematic uncertainties described in Section 11. The best fit values and confidence intervals for the parameters of interest are estimated using a profile likelihood test statistic: Where the quantities⃗ α and⃗ θ describe the unconditional maximum likelihood estimates for the parameters of interest and the nuisance parameters, respectively, whereas⃗ θ ⃗ α corresponds to the conditional maximum likelihood estimate for fixed values of the parameters of interest, ⃗ α. The value of twice the negative logarithm of the likelihood ratio, Eq. 2 is minimized when a fit of these functions is performed on the m γγγγ distribution. A penalty term, equal to the number of parameters in the functions, is added to the −2∆ ln L to prevent the addition of unnecessary floating parameters in the fit.

Signal model
The signal shape for the m γγγγ distribution, for each nominal signal hypothesis, is constructed from simulation. After all of the analysis selection criteria are applied, a unique signal model is built for each nominal signal hypothesis for each of the three data-taking years (2016-2018).
The m γγγγ distribution is modeled with a double-sided Crystal Ball (CB) function [60], which is a modified version of the standard CB function with two independent power-low tails. These signal models, scaled by the integrated luminosity for each year, are summed in order to construct the final model. The signal models for each year are shown in Fig. 4 for m a = 15 GeV. The full width at half maximum (FWHM) and the effective standard deviation (σ eff ), defined as half the width of the smallest interval containing 68% of the m γγγγ distribution, are also shown.
Two factors need to be considered to build the signal models for the intermediate mass hypotheses: the shape of the m γγγγ distribution and its yield. Since the shape of the m γγγγ distribution is not found to change significantly within a 5

Background model
The background model is built to describe the shape of the m γγγγ distribution that results from processes other than the signal process. Since the shape of this distribution is not known, different functional forms must be considered in the construction of the background model.
The function chosen to describe the background can result in a different number of estimated events in the signal peak and, as a result, affect the measured signal strength. This inherent uncertainty associated with choosing a background function is incorporated into the statistical uncertainty from the fit model via the discrete profiling method [61]. This method describes background modeling performed using data as implemented in Ref. [45], and treats the choice of the background function as a discrete nuisance parameter in the likelihood fit to data.
As part of the background modeling procedure, the candidate functions considered in the fit are exponentials, Bernstein polynomials, Laurent series, and power law functions. A subset of functions from each family are used to build the background model. For each family, the maximum order of parameters is fixed by means of an F-test [62], and the minimum order is determined by applying a requirement on the goodness-of-fit. A penalty is added to −2∆ ln L to take into account the number of floating parameters in each candidate function. When making a measurement of a given parameter of interest, the discrete profiling method minimizes the overall −2∆ ln L considering all allowed functions.
The fits are performed over the range 110 < m γγγγ < 180 GeV, and the data from the three years (2016-2018) are combined in order to construct the background model. A unique background model is created corresponding to each m a hypothesis. For each m a hypothesis, an ensemble of pseudo-experiments was generated using the various background functions. Each pseudo-experiment was fitted using the discrete profiling method, and it was established that the chosen functional form, used to describe the background, does not introduce any potential bias in the signal strength measurement.

Systematic uncertainties
The systematic uncertainty associated with the background estimation is taken into account by the discrete profiling method, as described in Section10. There are two kinds of systematic uncertainties that affect the signal model: those that modify the shape of the m γγγγ distribution, and those that leave the shape of the m γγγγ distribution unchanged but affect the overall normalization of the signal model.
The uncertainties that affect the shape of the m γγγγ distribution, which are incorporated in the signal model as nuisance parameters, are described below. These uncertainties are typically related to the energy of the individual photons, and they affect the mean and width of the signal model. 2. Nonlinearity of the photon energy scale: Any remaining differences in the linearity of the photon energy scale between data and simulation are covered by this uncertainty, which is estimated using Lorentz-boosted Z → ee events. The procedure for estimating this uncertainty is detailed in Ref. [4]. An uncertainty of 0.1% on the photon energy scale is assigned in this analysis, which accounts for the nonlinearity across the full range of photon p T values.

Shower shape corrections:
This uncertainty is associated with the imperfect modeling of shower shapes in simulation, and it is estimated by comparing the energy scale before and after any corrections are applied to the shower shape variables as described in Ref. [37]. This uncertainty in the energy scale is at most 0.01-0.15%, and it is dependent on the photon shower-shape and position in the detector.
4. Nonuniformity of light collection: Within a given ECAL crystal, there is an uncertainty associated with the modeling of the light collection as a function of the emission depth. This uncertainty is estimated by comparing simulation with the longitudinal shower profile estimates, and the procedure is detailed in Ref. [4]. The magnitude of this uncertainty is 0.07-0.25%, depending on the photon shower-shape.

5.
Modeling of material in front of the ECAL: The behavior of electromagnetic showers is affected by the amount of material present in front of the ECAL. This behavior may not be well modeled in simulation, and thus special samples with variations in the amount of upstream material are used to compute the impact on the photon energy scale [45]. For most central photons, the magnitude of this uncertainty ranges 0.02-0.05% and increases to approximately 0.24% for photons in the endcaps.
The uncertainties that affect the normalization of the signal model are listed below.
1. Integrated luminosity: Uncertainties in the luminosity measurement are estimated to be 1.2% (2016), 2.3% (2017), and 2.5% (2018) by CMS luminosity monitoring [40][41][42]. The uncertainty in the total integrated luminosity of the three years together is 1.6%. The uncertainties for each dataset are partially correlated in order to account for the common sources in the luminosity measurement schemes.

2.
Photon identification BDT score: The systematic uncertainty caused by the imperfect simulation of the input variables that are used to train the photon identification MVA is estimated by the procedure described in Ref.
[37]. The average magnitude of the resulting uncertainty is below 0.25% across the full m a range.
3. Trigger efficiency: The efficiency of the trigger selection is measured using a "tag-andprobe" technique on Z → ee events [63]. An additional uncertainty is introduced to account for a gradual shift in the timing to the inputs of the ECAL's hardware level trigger in the region |η| > 2.0, which caused a specific trigger inefficiency during 2016-2017 datataking [34]. The size of this uncertainty across the m a range is around 0.5% for 2016 and 2018 data-taking and around 1.5% for 2017 data-taking.

Photon preselections:
The systematic uncertainty on photon-based preselection is computed as the uncertainty on the ratio between the efficiency measured in data and in simulation. This is measured with the tag-and-probe technique using Z → ee events. The average magnitude of this uncertainty across the m a range is about 5%.
A summary of all the systematic uncertainties considered in this analysis is given in Table 2.
The impact of systematic uncertainties on the expected limit is about 1% across the m a range, and so the analysis sensitivity is primarily limited by the expected signal event yields. Because of this, only the main sources of systematic uncertainties are taken into account in the final result.

Results
The data collected by the CMS experiment in 2016, 2017, and 2018 are combined for the fit. The data and the signal-plus-background model that was fit to the m γγγγ distribution are shown in  No significant deviation from the background-only hypothesis is observed. Upper limits are set at the 95% confidence level (CL) on the product of the production cross section of the Higgs boson and the branching fraction into four photons via a pair of pseuodscalars, σ H B(H → aa → γγγγ). This is done using the modified frequentist approach for CL s , with the LHC profile likelihood ratio used as the test statistic [59,64]. The observed (expected) limit, shown in Fig. 6, ranges from 0.80 (1.00) fb for m a = 15 GeV to 0.26 (0.24) fb for m a = 62 GeV. For comparison, the Higgs production cross section for all channels combined is 52 pb [65].
The results presented in this section are provided in a tabulated form in the HEPDATA record [66] for this analysis.

Summary
A search for a pair of light pseudoscalar bosons produced from the decay of the 125 GeV Higgs boson, which subsequently decay into photons, is presented. The analysis is based on protonproton collision data collected at    [9] CMS Collaboration, "Measurement of the inclusive and differential Higgs boson production cross sections in the leptonic WW decay mode at √ s = 13 TeV", JHEP 03     [33] CMS Collaboration, "The CMS trigger system", JINST 12 (2017) P01020, doi:10.1088/1748-0221/12/01/P01020, arXiv:1609.02366.