Search for the decay of a Higgs boson in the $\ell\ell\gamma$ channel in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search for a Higgs boson decaying into a pair of electrons or muons and a photon is described. Higgs boson decays to a Z boson and a photon (H $\to$ Z$\gamma\to\ell\ell\gamma$, $\ell =$ e or $\mu$), or to two photons, one of which has an internal conversion into a muon pair (H $\to\gamma^{*}\gamma\to\mu\mu\gamma$) were considered. The analysis is performed using a data set recorded by the CMS experiment at the LHC from proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. No significant excess above the background prediction has been found. Limits are set on the cross section for a standard model Higgs boson decaying to opposite-sign electron or muon pairs and a photon. The observed limits on cross section times the corresponding branching fractions vary between 1.4 and 4.0 (6.1 and 11.4) times the standard model cross section for H $\to\gamma^{*}\gamma\to\mu\mu\gamma$ (H $\to$ Z$\gamma\to\ell\ell\gamma$) in the 120-130 GeV mass range of the $\ell\ell\gamma$ system. The H $\to\gamma^*\gamma\to\mu\mu\gamma$ and H $\to$ Z$\gamma\to\ell\ell\gamma$ analyses are combined for $m_\mathrm{H} =$ 125 GeV, obtaining an observed (expected) 95% confidence level upper limit of 3.9 (2.0) times the standard model cross section.


Introduction
Measurements of rare decays of the Higgs boson, such as H → γ * γ and H → Zγ, would enhance our understanding of the standard model (SM) of particle physics, and allow us to probe exotic couplings introduced by possible extensions of the SM [1][2][3][4]. The decay width can be modified by the theories involving heavy fermions, gauge bosons or charged scalars [5][6][7][8][9]. Simple extensions of the SM like two Higgs doublet models, or the minimal supersymmetric standard model also exhibit similar features [10]. Certain coefficients of the dimension-6 extension of the standard model effective field theory can be constrained by measuring the H → Zγ branching ratio precisely [11]. As an example, a model [10] which includes a hypercharge zero triplet extension, shows a modification in B(H → Zγ), with respect to the SM value, of about 10% for an additional scalar field with mass between 0 and 400 GeV.
In the search for H → γ * γ → ℓℓγ, the leptonic channel, γ * /Z → ℓℓ (ℓ = e or µ), is most promising as it has relatively low background. The diagrams in Fig. 1 illustrate the dominant Higgs boson decay channels contributing to these final states. The H → γ * γ → ℓℓγ and H → Zγ → ℓℓγ diagrams correspond to the same initial and final state and interfere with each other. Experimentally one can separate the off- and on-shell contributions, and define the respective signal regions, using a selection based on the invariant mass of the dilepton system, m ℓℓ = m γ * /Z . For the measurements presented in this paper a threshold of m ℓℓ = 50 GeV is used to separate the two processes.
The ATLAS and CMS Collaborations at the CERN LHC have both performed searches for the decay H → Zγ → ℓℓγ [19,20] at √ s = 7 and 8 TeV. The ATLAS Collaboration set an upper limit on σ/σ SM of 11 (where σ SM is the expected cross section of the SM signal process) at 95% confidence level (CL) for an SM Higgs boson with m H = 125.5 GeV, and the CMS Collaboration set an upper limit of 9.5 at 95% CL for m H = 125 GeV. The CMS Collaboration also searched for the H → γ * γ → ℓℓγ process with m ℓℓ < 20 (1.5) GeV in the dimuon (dielectron) channel at 8 TeV [21]. Combining the two channels, an upper limit of 6.7 at 95% CL was set on σ/σ SM for m H = 125 GeV. Recently, the ATLAS Collaboration also performed a search for H → Zγ → ℓℓγ at √ s = 13 TeV using 36.1 fb −1 of data collected in 2016. This search set an upper limit on σ/σ SM of 6.6 at 95% CL for an SM Higgs boson with m H = 125.09 GeV [22]. This paper describes a search for the Higgs boson decays H → γ * γ → µµγ and H → Zγ → ℓℓγ at 13 TeV. The study of the H → γ * γ → eeγ decay is challenging [21] because, if m ℓℓ is low, the two electron showers merge in the electromagnetic calorimeter (ECAL). This merging makes it difficult to trigger on such events and also to reconstruct them offline. Therefore, this channel is not included in the present analysis.
The analysis uses a data sample of proton-proton (pp) collisions at a center-of-mass energy of 13 TeV recorded by the CMS experiment during 2016, corresponding to an integrated luminosity of 35.9 fb −1 . The sensitivity of the search is enhanced by dividing the selected events into mutually exclusive classes, according to the expected mass resolution and the signal-to-background ratio, and then combining the results from each class. This paper is structured as follows. In Section 2, the CMS detector is described. The event selection used in the analysis is outlined in Section 3. Section 4 discusses the signal and background modeling. Systematic uncertainties and the results of this study are presented in Section 5, followed by the summary in Section 6.

The CMS detector and trigger
A detailed description of the CMS detector can be found in Ref. [23]. The central feature of the CMS apparatus is a superconducting solenoid, 13 m in length and 6 m in diameter, which provides an axial magnetic field of 3.8 T. Within the field volume there are several particle detection systems. Charged-particle trajectories are measured by silicon pixel and silicon strip trackers, covering 0 ≤ φ ≤ 2π in azimuth and |η| < 2.5 in pseudorapidity. A lead-tungstate crystal ECAL and a brass and scintillator hadron calorimeter (HCAL) surround the tracking volume and cover the region |η| < 3. They provide energy measurements of photons, electrons and hadronic jets. The ECAL is partitioned into a barrel region with |η| < 1.48 and two endcaps that extend up to |η| = 3. A lead and silicon-strip preshower detector is located in front of the endcap of the ECAL. Muons are identified and measured in gas-ionization detectors embedded in the steel return yoke outside the solenoid. The detector is nearly hermetic, allowing energy balance measurements in the plane transverse to the beam direction.
A two-level trigger system selects collision events of interest for physics analysis [24]. The trigger used in the H → γ * γ → µµγ channel requires a muon and a photon with transverse momenta, p T , greater than 17 and 30 GeV, respectively. The trigger efficiency is determined using signal events in simulation and µµγ events in data, the latter taken from an orthogonal data set selected with a single muon trigger. For events satisfying the selection criteria described in Section 3 the trigger efficiency is 83% in both cases. The H → Zγ → ℓℓγ events are required to pass at least one of the dielectron or dimuon triggers. The dielectron trigger requires a leading (subleading) electron with p T greater than 23 (12) GeV. The dimuon trigger requires a leading (subleading) muon with p T greater than 17 (8) GeV. The efficiencies of these dilepton triggers, measured in data for events satisfying the selection criteria, depend on the p T and η of the leptons and are found to be 90-98% and 93-95% for the eeγ and µµγ channels, respectively.

Event selection
Selected events are required to have at least one good primary vertex, with reconstructed longitudinal position within 24 cm of the geometric center of the detector and transverse position within 2 cm of the beam interaction region. Due to the high instantaneous luminosity of the LHC, there are multiple pp interactions per bunch crossing (pileup). In the case of multiple vertices, the vertex with the largest value of summed physics-object p T ² is taken to be the primary pp interaction vertex. The physics objects chosen are those that have been defined using information from the tracking detector, including jets, the associated missing transverse momentum, which is defined as the negative vector sum of the p T of those jets, and charged leptons. All leptons used to select events are required to have transverse and longitudinal impact parameters with respect to the primary vertex smaller than 5 and 10 mm, respectively.
The particle-flow (PF) event reconstruction algorithm [25] is used to reconstruct and identify each individual particle using an optimized combination of information from the various elements of the CMS detector.
Photon candidates are reconstructed from clusters of crystals in the ECAL with significant energy deposits [26]. Clusters are grouped into superclusters to recover the energy from electron bremsstrahlung and photons converting in the tracker. In the endcaps, the preshower detector energy is also included for the region covered by the preshower detector (1.65 < |η| < 2.6). The clustering algorithms result in almost complete recovery of the energy of photons. Photon candidates are selected with a multivariate discriminant that uses, as inputs, isolation variables, the ratio of the energy in the HCAL behind an electromagnetic supercluster to the supercluster energy, and the transverse width of the electromagnetic shower. Isolation variables are based on particle candidates from the PF algorithm. A conversion-safe electron veto [26] is applied to avoid misidentifying an electron as a photon. This vetoes events that have a charged particle track with a hit in the inner layer of the pixel detector that points to the photon cluster in the ECAL, unless that track is matched to a conversion vertex. Photons are required to lie in the geometrical region |η| < 2.5 and have p T > 15 GeV. The efficiency of the photon identification is measured from Z → ee events using tag-and-probe techniques [27]. It is found to be between 84 and 91 (77 and 94)% in the barrel (endcaps) depending on the p T of the photon, after including the electron veto inefficiencies measured with Z → µµγ events, where the photon is produced by final-state radiation.
Electron reconstruction starts from superclusters in the ECAL, which are matched to hits in the silicon strip and the pixel detectors. The energy of electrons is determined from a combination of the electron track momentum at the main interaction vertex and the energy of the corresponding ECAL cluster. Electrons are selected using a multivariate discriminant that includes observables sensitive to the presence of bremsstrahlung along the electron trajectory, the geometrical and momentum-energy matching between the electron trajectory and the energy of the associated cluster in the ECAL, the shape of the electromagnetic shower in the ECAL, and the variables that discriminate against electrons originating from photon conversions [13]. In this analysis, we accept electrons with p T > 7 GeV and |η| < 2.5.
Muon candidates are reconstructed in the tracker and identified by the PF algorithm using hits in the tracker and the muon systems. The matching between the inner and outer tracks proceeds either outside-in, starting from a track in the muon system, or inside-out, starting from a track in the silicon tracker. In the latter case, tracks that match track segments in only one or two planes of the muon system are also considered in the analysis in order to collect very low-p T muons that may not have sufficient energy to penetrate the entire muon system. The muons are selected from the reconstructed muon track candidates by applying minimal requirements on the track in both the muon system and inner tracker system, and taking into account compatibility with small energy deposits in the calorimeters. We accept muons with p T > 4 GeV and |η| < 2.4 [13].
The relative isolation variable, used to select prompt leptons, is defined as

$$\mathcal{I}^{\ell} \equiv \Big( \sum p_\mathrm{T}^{\text{charged}} + \max\Big[ 0,\ \sum p_\mathrm{T}^{\text{neutral}} + \sum p_\mathrm{T}^{\gamma} - p_\mathrm{T}^{\text{PU}}(\ell) \Big] \Big) \Big/ p_\mathrm{T}^{\ell},$$

and is required to be less than 0.35, where $\sum p_\mathrm{T}^{\text{charged}}$ is the scalar sum of the transverse momenta of charged hadrons originating from the primary vertex, $\sum p_\mathrm{T}^{\text{neutral}}$ and $\sum p_\mathrm{T}^{\gamma}$ are the scalar sums of the transverse momenta for neutral hadrons and photons, respectively, and $p_\mathrm{T}^{\text{PU}}(\ell)$ accounts for the contribution of neutral pileup particles. The isolation sums are performed over a cone of angular radius $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.3$ around the lepton direction at the primary vertex. For muons, $p_\mathrm{T}^{\text{PU}}(\mu) \equiv 0.5 \sum_i p_\mathrm{T}^{\text{PU},i}$, where i runs over the momenta of the charged hadron PF candidates not originating from the primary vertex. For electrons, $p_\mathrm{T}^{\text{PU}}(\mathrm{e}) \equiv \rho A_\text{eff}$, where the effective area $A_\text{eff}$ is a coefficient that depends on the electron η and is chosen in such a way that the isolation efficiency is independent of pileup (PU), and ρ is the median of the p T density distribution for neutral particles [28][29][30]. Finally, $p_\mathrm{T}^{\ell}$ is the transverse momentum of the selected lepton. To suppress muons originating from non-prompt decays of hadrons and electrons from photon conversions, we require each lepton track to have a 3D impact parameter with respect to the primary vertex that is less than four times its uncertainty.
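As an illustration of the isolation requirement, the following sketch evaluates a relative isolation from precomputed cone sums. It assumes the usual pileup-corrected form, in which the neutral-hadron and photon sums minus the pileup term are floored at zero before being added to the charged sum. The function and argument names are hypothetical, and the numerical inputs are invented for the example.

```python
ISO_MAX = 0.35  # upper bound on the relative isolation quoted in the text


def muon_pu_term(pu_charged_pts):
    """Pileup term for muons: half the scalar pT sum of charged hadrons from pileup vertices."""
    return 0.5 * sum(pu_charged_pts)


def electron_pu_term(rho, effective_area):
    """Pileup term for electrons: median pT density times the eta-dependent effective area."""
    return rho * effective_area


def relative_isolation(sum_charged, sum_neutral, sum_photon, pu_term, lepton_pt):
    """Assumed form: (charged + max(0, neutral + photon - PU)) / lepton pT."""
    neutral_part = max(0.0, sum_neutral + sum_photon - pu_term)
    return (sum_charged + neutral_part) / lepton_pt


# Invented cone sums in GeV for a 25 GeV muon.
iso_mu = relative_isolation(2.0, 1.5, 1.0, muon_pu_term([1.0, 2.0]), 25.0)
is_isolated = iso_mu < ISO_MAX
```

Flooring the neutral component at zero keeps an over-subtracted pileup term from cancelling genuine charged activity in the cone.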
The optimized electron selection criteria, including the isolation requirement, give an efficiency of approximately 85-93 (81-92)% in the barrel (endcaps) for electrons from W or Z bosons. For muons, the identification is tuned to maintain efficiency at low ∆R where the two muons are close to each other. The identification and isolation efficiency for single muons from Z → µµ or J/ψ meson decays is 85-97 (88-96)% in the barrel (endcaps). In case of the H → γ * γ → µµγ, the ∆R(µµ) between the two muons is small due to their low invariant mass and the high p T of the γ * . Hence, no isolation requirement is applied to the subleading muons as they are within the isolation cone of the leading muons in most events. The identification efficiency of muons from γ * is approximately 94-98 (92-97)% in the barrel (endcaps).
Selected events are classified as described in detail below. The dijet-tagged (explained in Section 3.1) event class uses jets that are built by clustering the PF candidates using the anti-k T clustering algorithm with distance parameter of 0.4 using the FASTJET software package [28]. Charged PF candidates from pileup vertices are discarded to reduce the contribution to the jet energies from pileup interactions. An offset correction is applied to account for the remaining contributions. In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to account for any residual differences in jet energy scale in data and simulation. Additional selection criteria are applied to each event to remove spurious jet-like features originating from isolated noise patterns in certain HCAL regions. Calibrated and corrected jets are required to have p T greater than 30 GeV and |η| < 4.7, and to be separated by at least 0.4 in ∆R from leptons and photons passing the selection requirements described above.

H → γ * γ → µµγ selection
In the H → γ * γ → µµγ search we select events with two muons and a photon, where the muons must have opposite charges and p T > 20 (4) GeV for the leading (subleading) muon. The p T requirement on the leading muon is driven by the trigger threshold, and that on the subleading muon by the minimum energy needed to reach the muon system. The photon and dimuon transverse momenta both must satisfy p T > 0.30m µµγ , where m µµγ is the invariant mass of the µµγ system. This requirement rejects the γ * +jet and γ+jet backgrounds without any loss in the signal sensitivity and without introducing a bias in the m µµγ spectrum. The separation between each muon and the photon is required to satisfy ∆R > 1 in order to suppress Drell-Yan background events with final-state radiation.
The dimuon invariant mass is required to be less than 50 GeV to make this selection and the Zγ selection described in Section 3.2 mutually exclusive. Events with a dimuon mass in the ranges 2.9 < m µµ < 3.3 GeV and 9.3 < m µµ < 9.7 GeV are rejected to avoid J/ψ → µµ and Υ(nS) → µµ contamination, respectively. The invariant mass m µµγ is required to satisfy 110 < m µµγ < 170 GeV. In the cases where there are multiple dilepton pairs in the event, the one with the smallest dimuon invariant mass is chosen.
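The kinematic requirements of this selection can be collected into a single predicate. The sketch below is only an illustration: the `MuMuGammaCandidate` container and its field names are hypothetical, the candidate values are invented, and object identification, trigger, and photon acceptance requirements are assumed to have been applied upstream.

```python
from dataclasses import dataclass, replace


@dataclass
class MuMuGammaCandidate:
    pt_lead_mu: float     # GeV
    pt_sub_mu: float      # GeV
    charge_product: int   # product of the two muon charges, -1 for opposite sign
    pt_gamma: float       # GeV
    pt_dimuon: float      # GeV
    m_mumu: float         # GeV
    m_mumugamma: float    # GeV
    dr_mu1_gamma: float
    dr_mu2_gamma: float


def in_resonance_veto(m_mumu):
    """True inside the J/psi or Upsilon(nS) veto windows."""
    return 2.9 < m_mumu < 3.3 or 9.3 < m_mumu < 9.7


def passes_selection(c):
    return (
        c.charge_product < 0
        and c.pt_lead_mu > 20.0
        and c.pt_sub_mu > 4.0
        and c.pt_gamma > 0.30 * c.m_mumugamma
        and c.pt_dimuon > 0.30 * c.m_mumugamma
        and min(c.dr_mu1_gamma, c.dr_mu2_gamma) > 1.0
        and c.m_mumu < 50.0
        and not in_resonance_veto(c.m_mumu)
        and 110.0 < c.m_mumugamma < 170.0
    )


# Invented signal-like candidate and a copy placed on the J/psi peak.
signal_like = MuMuGammaCandidate(35.0, 10.0, -1, 50.0, 48.0, 1.2, 125.0, 2.8, 2.9)
jpsi_like = replace(signal_like, m_mumu=3.1)
```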
A variable R 9 is defined as the energy sum of the 3×3 ECAL crystals centered on the most energetic crystal in the supercluster divided by the energy of the supercluster. The selected events are separated into four mutually exclusive event classes based on the R 9 and η of the photon and the presence of jets. An R 9 value of 0.94 is used to separate the reconstructed photons into two regions. The region containing unconverted photons, with larger values of R 9 and better energy resolution, has a smaller background. By separating events into two regions of low/high R 9 value, the sensitivity of the analysis is increased. We therefore have the following four categories: events that require the presence of at least two jets passing the selection criteria as described below; photon in the ECAL barrel (EB) region with a high R 9 value; photon in the barrel with low R 9 value; and photon in the ECAL endcap (EE) regions. Only events that do not pass the dijet tag are included in the EB or EE classes. By using this event classification scheme, as opposed to combining all events into one class, the sensitivity of this analysis is increased by 11%.
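The assignment to the four classes can be sketched as a short decision chain. The function name and class labels below are hypothetical, and the barrel boundary is taken from the |η| < 1.48 barrel coverage quoted in the detector description, which is an assumption about where the analysis places the EB/EE split.

```python
R9_SPLIT = 0.94        # boundary between low- and high-R9 photons
BARREL_ETA_MAX = 1.48  # assumed EB/EE boundary, from the quoted barrel coverage


def mumugamma_event_class(has_dijet_tag, photon_abs_eta, r9):
    """Return the mutually exclusive event class; the dijet tag takes priority."""
    if has_dijet_tag:
        return "dijet"
    if photon_abs_eta < BARREL_ETA_MAX:
        return "EB high R9" if r9 > R9_SPLIT else "EB low R9"
    return "EE"
```

Testing the dijet tag first is what makes the classes mutually exclusive: an event reaches the EB or EE classes only if it has failed the dijet requirements.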
For the dijet tag event class the two highest transverse energy jets are used and the requirements are: (i) the difference in pseudorapidity between the two jets is greater than 3.5; (ii) the Zeppenfeld variable [31] (η ℓℓγ − (η j1 + η j2 )/2) is less than 2.5, where η ℓℓγ is the η of the ℓℓγ system and η j1 and η j2 are the pseudorapidities of the leading and subleading jets, respectively; (iii) the dijet mass is greater than 500 GeV; and (iv) the difference in azimuthal angles between the dijet system and the ℓℓγ system is greater than 2.4. These requirements mainly target the vector boson fusion (VBF) production mechanism of the Higgs boson.
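The four dijet-tag requirements translate directly into a predicate on the two leading jets and the reconstructed Higgs boson candidate. In this sketch the function names and arguments are hypothetical, and the Zeppenfeld requirement is applied to the absolute value of the variable, which is an assumption about how the cut is meant.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal difference folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_dijet_tag(eta_j1, eta_j2, m_jj, phi_jj, eta_cand, phi_cand):
    """eta_cand and phi_cand describe the lepton-pair-plus-photon system."""
    zeppenfeld = eta_cand - 0.5 * (eta_j1 + eta_j2)
    return (
        abs(eta_j1 - eta_j2) > 3.5                 # (i) large rapidity gap
        and abs(zeppenfeld) < 2.5                  # (ii) candidate central w.r.t. the jets
        and m_jj > 500.0                           # (iii) high dijet mass, in GeV
        and delta_phi(phi_jj, phi_cand) > 2.4      # (iv) candidate recoils against the dijet
    )
```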

H → Zγ → ℓℓγ selection
In the H → Zγ → ℓℓγ search, events with a photon and with at least two same-flavor leptons (e or µ) consistent with a Z boson decay are selected. All particles must be isolated, and have p T greater than 25 (15) GeV for the leading (subleading) electron, 20 (10) GeV for the leading (subleading) muon, and 15 GeV for the photon. In the cases where there are multiple dilepton pairs in the event, the one with the mass closest to the Z boson nominal mass [32] is selected. The invariant mass of the selected pair is required to be larger than 50 GeV. This ensures that the H → Zγ → ℓℓγ event selection is orthogonal to that for H → γ * γ → µµγ.
The events are required to have a photon with E T > 0.14m ℓℓγ , which rejects Z+jet background without significant loss in signal sensitivity and without introducing a bias in the m ℓℓγ spectrum. Leptons are required to have ∆R > 0.4 with respect to the photon in order to reject events with final-state radiation. In addition, we require m ℓℓγ + m ℓℓ > 185 GeV to reject events with final-state radiation from Drell-Yan processes. Finally, the invariant mass of the ℓℓγ system is required to be 115 < m ℓℓγ < 170 GeV.
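A minimal sketch of this selection, under stated assumptions, is given below. The function names and arguments are hypothetical, the Z boson mass is set to the commonly quoted 91.1876 GeV as a stand-in for the nominal value of Ref. [32], and isolation and identification are assumed to have been applied beforehand.

```python
Z_MASS = 91.1876  # GeV, assumed nominal Z boson mass

# (leading, subleading) lepton pT thresholds in GeV
LEPTON_PT_MIN = {"e": (25.0, 15.0), "mu": (20.0, 10.0)}


def best_z_pair(pair_masses):
    """Index of the dilepton pair whose mass is closest to the nominal Z mass."""
    return min(range(len(pair_masses)), key=lambda i: abs(pair_masses[i] - Z_MASS))


def passes_zgamma_selection(flavor, pt_lead, pt_sub, et_gamma, m_ll, m_llg, dr_min):
    """dr_min is the smallest Delta R between the photon and either lepton."""
    lead_min, sub_min = LEPTON_PT_MIN[flavor]
    return (
        pt_lead > lead_min
        and pt_sub > sub_min
        and et_gamma > 15.0
        and m_ll > 50.0                  # orthogonal to the low-mass dimuon selection
        and et_gamma > 0.14 * m_llg      # suppresses Z+jet
        and dr_min > 0.4                 # photon not collinear with a lepton
        and m_llg + m_ll > 185.0         # suppresses Drell-Yan final-state radiation
        and 115.0 < m_llg < 170.0
    )
```

The sum requirement works because in a radiative Z decay the three-body mass sits near the Z mass while the dilepton mass falls below it, so their sum stays close to or below 185 GeV.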
The selected events are classified into mutually exclusive categories. A lepton-tag class contains events with an additional electron (or muon) with p T > 7 (5) GeV, to target Higgs boson production in association with either a Z or W boson. Events not included in the lepton class are considered for the dijet class. In this case the criteria described in Section 3.1 are used to select events containing a dijet, targeting Higgs boson production in a VBF process. The next class considered is the boosted class, which requires that the p T of the ℓℓγ system is greater than 60 GeV in order to enhance the fraction of events that contain a Lorentz-boosted Higgs boson recoiling against a jet. Events that do not fall into these three classes are placed in the untagged categories. A significant fraction of the signal events are expected to have the photon and both leptons in the barrel, while only a sixth of the signal events have the photon in the endcap. This is in contrast to the background, where about one third of the events are expected to have a photon in the endcap. Furthermore, events where the photon does not convert to e + e − have a smaller fraction of background events and better energy resolution. For these reasons, the untagged events are classified into four categories according to the pseudorapidity of the leptons and photon, and the R 9 value of the photon. These categories are indicated as untagged 1, untagged 2, untagged 3 and untagged 4 as shown in Table 1.
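The order in which the classes are tested determines which class an event enters. The sketch below illustrates that priority chain only; the function name and labels are hypothetical, and the split of the untagged events into the four categories of Table 1 is not reproduced here.

```python
BOOSTED_PT_MIN = 60.0  # GeV, threshold on the pT of the lepton-pair-plus-photon system


def zgamma_event_class(n_additional_leptons, has_dijet_tag, pt_candidate):
    """Assign the first matching class in the order lepton tag, dijet tag, boosted, untagged."""
    if n_additional_leptons > 0:
        return "lepton tag"
    if has_dijet_tag:
        return "dijet tag"
    if pt_candidate > BOOSTED_PT_MIN:
        return "boosted"
    return "untagged"
```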
It should be noted that the electron and muon channels are considered separately in all classes except for the lepton-tag class, where the number of events is small. This event classification scheme increases the sensitivity of the analysis by 18%. The resulting acceptance times efficiency for pp → H → Zγ → ℓℓγ in the electron (muon) channel is between 18 and 24 (25 and 31)% for m H between 120 and 130 GeV.
A complete list of all the categories considered in the analysis (pp → H → γ * γ → µµγ and pp → H → Zγ → ℓℓγ), together with the expected yields for a 125 GeV SM Higgs boson signal, is shown in Table 2. The table also reports the yields for the individual signal production processes: gluon-gluon fusion (ggH), vector boson fusion (VBF), associated VH production (VH), and Higgs boson production in association with top quarks (ttH).

Signal and background modeling
The search for signal events is performed using a shape-based analysis of the ℓℓγ invariant mass distributions. The background is estimated from data and the signal is estimated using simulation. Even though the background is estimated from data, simulated samples are used in the H → Zγ → ℓℓγ search to optimize the event classes. The main background, pp → Zγ, is generated at next-to-leading order (NLO) using the MADGRAPH5_aMC@NLO generator [33]. The Z(ℓℓ)+jets events with a jet misidentified as a photon are another important source of background and are also generated at NLO using MADGRAPH5_aMC@NLO. The NLO parton distribution function (PDF) set, NNPDF3.0 [34], and the CUETP8M1 [35] underlying event tune are used to generate these samples. All background events are interfaced with PYTHIA 8.205 [36,37] for the fragmentation and hadronization of partons.
Signal samples for the H → γ * γ → µµγ process produced via ggH, VBF, and VH are simulated at NLO with MADGRAPH5_aMC@NLO 2.3.3, with the Higgs boson characterization framework [38,39]. The ttH production mechanism gives a negligible contribution to the signal and is therefore ignored. For the H → Zγ → ℓℓγ process, the simulated events from all four production mechanisms are generated at NLO using POWHEG v2.0 [40,41]. All signal samples are interfaced with PYTHIA 8.212 with the CUETP8M1 underlying event tune for hadronization and fragmentation. The NLO PDF set, NNPDF3.0, is used to produce these samples. The SM Higgs boson production cross sections and branching fractions recommended by the LHC Higgs cross section working group [14] are used for H → Zγ. For H → γ * γ the Higgs boson production cross sections are also taken from Ref. [14], but the branching fraction is taken from the MCFM calculation and given in Eq. (1).
The simulated signal and background events are reweighted by taking into account the difference between data and simulated events so that the distribution of pileup vertices, the trigger efficiencies, the resolution, the energy scale, the reconstruction efficiencies, and the isolation efficiency (for electrons, muons, and photons) observed in data are reproduced. An additional correction is applied to photons to reproduce the performance of the R 9 shower shape variable.
The dominant backgrounds to H → ℓℓγ consist of the irreducible non-resonant SM ℓℓγ production, final-state radiation in Z decays, γ * conversions, and Drell-Yan production in association with jets, where a jet or a lepton is misidentified as a photon. The background is estimated from data, by fitting the observed ℓℓγ mass distributions. Separate fits are performed to the four event classes for the H → γ * γ → µµγ analysis and the thirteen classes for the H → Zγ → ℓℓγ analysis. For the H → γ * γ → µµγ (H → Zγ → ℓℓγ) analysis, the range 110 (115) < m ℓℓγ < 170 GeV is used in the fit. The fit model of the signal is obtained from an unbinned fit to the mass distribution of the corresponding sample of simulated events, using a double Crystal Ball function [42] in the H → γ * γ → µµγ analysis, and a Crystal Ball function plus a Gaussian function in the H → Zγ → ℓℓγ analysis. To derive the signal shapes for the intermediate mass points where simulation was not available, a linear interpolation of the fitted parameters for available mass points was performed.
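The interpolation of signal shape parameters between simulated mass points can be sketched as follows. The function name, the parameter names, and all numerical values are invented for illustration and are not the fitted values of the analysis.

```python
def interpolate_parameters(mass, fitted):
    """Linearly interpolate each shape parameter between the two nearest simulated mass points.

    `fitted` maps a simulated mass (GeV) to a dict of fitted shape parameters.
    """
    masses = sorted(fitted)
    if not masses[0] <= mass <= masses[-1]:
        raise ValueError("mass outside the simulated range")
    for lo, hi in zip(masses, masses[1:]):
        if lo <= mass <= hi:
            t = (mass - lo) / (hi - lo)
            return {k: (1.0 - t) * fitted[lo][k] + t * fitted[hi][k] for k in fitted[lo]}
    return dict(fitted[masses[0]])  # only reached when a single mass point is given


# Invented fit results at three simulated mass points.
fitted_points = {
    120.0: {"mean": 119.8, "sigma": 1.6},
    125.0: {"mean": 124.8, "sigma": 1.7},
    130.0: {"mean": 129.8, "sigma": 1.8},
}
shape_at_122p5 = interpolate_parameters(122.5, fitted_points)
```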
The choice of the background fit function is based on a study that minimizes the bias that could be introduced by the selected function. The study of the bias is performed for four families of functions:
• […] with 2N free parameters: p i < 0 and f i . The lowest order considered has N = 1.
• A sum of N power functions, $\sum_{i=1}^{N} f_i\, m_{\ell\ell\gamma}^{\,p_i}$, with 2N free parameters p i < 0 and f i . The lowest order considered has N = 1.
A test is then performed to determine the best order in each family. In this test, the difference in the negative log-likelihood (NLL) between fits performed to data with two different orders of the same family of functions, N and N+M, indicates whether the data support the hypothesis of the higher-order function. A p-value of this quantity is then calculated as

$$p\text{-value} = \mathrm{prob}\left(\Delta\mathrm{NLL} > \Delta\mathrm{NLL}_{N+M} \mid \chi^2(M)\right),$$

where ∆NLL is the difference in NLL between the two fits; ∆NLL N+M = 2(NLL N − NLL N+M ) follows a χ 2 distribution with M degrees of freedom, where M is the difference in the number of free parameters between the N+M function and the N function; and NLL N and NLL N+M are the values of the NLL of the fit to data using the N th and (N+M) th order functions from a family. If the p-value is less than 0.05, the higher order function is supported by the data and the procedure is then applied to other higher order functions in the same family. The procedure stops when the p-value becomes greater than 0.05.
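The order-selection procedure can be illustrated with a small standard-library sketch. It assumes the p-value is the upper tail probability of a χ² distribution with M degrees of freedom evaluated at 2(NLL_N − NLL_{N+M}); the function names and the NLL values in the example are invented.

```python
import math


def chi2_sf(x, dof):
    """Upper tail probability of a chi-square distribution with integer dof >= 1."""
    if x <= 0.0:
        return 1.0
    half = 0.5 * x
    # Closed forms for dof = 1 and dof = 2, then the upward recurrence in steps of two.
    if dof % 2:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < dof:
        q += math.exp(0.5 * k * math.log(half) - half - math.lgamma(0.5 * k + 1.0))
        k += 2
    return q


def order_test_p_value(nll_low, nll_high, extra_params):
    """p-value for preferring the higher-order function with `extra_params` more parameters."""
    return chi2_sf(2.0 * (nll_low - nll_high), extra_params)


def choose_order(fits, threshold=0.05):
    """fits: list of (n_free_parameters, nll) in increasing order; returns the chosen index."""
    chosen = 0
    for nxt in range(1, len(fits)):
        extra = fits[nxt][0] - fits[chosen][0]
        if order_test_p_value(fits[chosen][1], fits[nxt][1], extra) >= threshold:
            break  # the data do not support the higher order
        chosen = nxt
    return chosen


# Invented NLL values for N = 1, 2, 3 in a family with 2N free parameters.
example_fits = [(2, 1050.0), (4, 1040.0), (6, 1039.2)]
best_index = choose_order(example_fits)
```

In this invented example the step from N = 1 to N = 2 lowers the NLL by 10 units and is accepted, while the step to N = 3 gains only 0.8 units and is rejected, so the procedure stops at N = 2.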
Once the best order of each family is determined for each category, pseudo-experiments (with no injected signal) describing possible experimental outcomes are randomly generated using each of the determined functions as generators of background. A signal-plus-background fit is performed for each of these sets of pseudo-experiments with all other background functions of the chosen order, so that the presence of a possible bias introduced by the fitting function can be determined. In each fit, the bias is estimated with a pull variable, computed as (µ FIT − µ t )/σ FIT , where µ FIT and σ FIT are the mean and the standard deviation of the signal strength determined from the signal-plus-background fit, and µ t is the true injected signal strength, which is zero in this case. A given fit function is deemed acceptable in a given category if its pull is less than 14% of the statistical uncertainty when fitting pseudo-experiments generated with all of the other functional families. With this requirement, the error on the frequentist coverage of the quoted measurement in the analysis is less than 1%, where the coverage is defined as the fraction of experiments in which the true value is contained within the confidence interval. Table 3 shows the fit functions chosen in each category of the analysis.
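The link between the 14% bias requirement and the quoted coverage can be checked numerically. The sketch below assumes Gaussian-distributed fit results and a central one-standard-deviation interval, and it averages per-pseudo-experiment pulls, which is one possible reading of the pull definition; the function names and toy values are invented.

```python
import math
from statistics import mean


def pulls(fitted_mu, fitted_sigma, mu_true=0.0):
    """Per-pseudo-experiment pull of the fitted signal strength."""
    return [(m - mu_true) / s for m, s in zip(fitted_mu, fitted_sigma)]


def is_acceptable(pull_values, max_bias=0.14):
    """Accept the fit function if the average pull stays below the bias threshold."""
    return abs(mean(pull_values)) < max_bias


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage_with_bias(bias, n_sigma=1.0):
    """Coverage of a +-n_sigma Gaussian interval whose centre is shifted by `bias` sigma."""
    return normal_cdf(n_sigma - bias) - normal_cdf(-n_sigma - bias)


nominal_coverage = coverage_with_bias(0.0)
coverage_loss = nominal_coverage - coverage_with_bias(0.14)

# Invented fit results from three pseudo-experiments.
toy_pulls = pulls([0.1, -0.2, 0.3], [1.0, 1.0, 1.0])
toy_ok = is_acceptable(toy_pulls)
```

Under these assumptions a bias of 0.14 standard deviations lowers the coverage of the interval by about half a percentage point, consistent with the statement that the error on the coverage stays below 1%.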
The background fits to the m ℓℓγ data distributions for the event categories of the H → γ * γ → µµγ analysis are shown in Fig. 2 and, for the electron and muon channels in all H → Zγ → ℓℓγ event classes, in Figs. 3 and 4, respectively. Finally, Fig. 5 shows the background fit for the lepton tag category in the H → Zγ → ℓℓγ analysis. As these figures show, the background fits describe the data well.

Systematic uncertainties and results
No significant deviation from the background-only hypothesis is observed. The data are used to derive upper limits on the Higgs boson production cross section times the branching fractions, σ(pp → H) B(H → γ * γ → µµγ) and σ(pp → H) B(H → Zγ → ℓℓγ), divided by the corresponding SM predictions. The limits are evaluated using a modified frequentist approach, asymptotic CL s , taking the profile likelihood as a test statistic [43][44][45][46]. An unbinned evaluation of the likelihood is considered.
Background uncertainties are taken from the fit to the data. As for the uncertainties related to the signal yield, the following sources of systematic uncertainties are considered:
• Electron and photon energy scale and resolution: The electromagnetic energy scale is known with 0.15-0.5 (1)% precision in EB (EE). To quantify the corresponding uncertainty, the electron and photon energies are varied and the effects on signal mean and resolution are propagated as shape nuisance parameters in the estimation of limits.
• Muon momentum scale and resolution: The uncertainty in the muon momentum scale is 1%. To quantify the corresponding uncertainty, the muon momentum scale is varied and the effect on signal mean and resolution is propagated as a shape nuisance parameter in the estimation of limits.
• Integrated luminosity: The uncertainty in the integrated luminosity is 2.5% [47]. This is applied as a normalization uncertainty to the total expected yield of the signal.
• Object identification and isolation: The corrections applied to the simulation to reproduce the performance of the lepton and photon selection are measured with Z → ee and Z → µµ events.
• Pileup: The uncertainty from the description of the pileup in the signal simulation is estimated by varying the total inelastic cross section by ±4.6% [48].
• Jet-energy scale and resolution: The uncertainties in the jet energy scale and resolution are accounted for by changing the jet response and resolution by ∼2%.
• Underlying event and parton shower uncertainty: The uncertainty associated with the choice and tuning of the generator is estimated with dedicated samples which are generated by varying the parameters of the tune used (CUETP8M1) to generate the original signal samples. The difference in signal yields with respect to the nominal configuration is propagated as the uncertainty.
• R 9 reweighting: This shower-shape variable in the signal simulation is reweighted to match that in the data. This reweighting introduces an uncertainty that is estimated by removing the R 9 reweighting in the simulation and then estimating the yields in the categories where R 9 is used for categorization.
• Theoretical uncertainties: These include the systematic uncertainties from the effect of the choice of PDF on the signal cross section [49][50][51] and the uncertainty in the Higgs boson branching fraction prediction. The uncertainty in the branching ratio of H → Zγ is calculated to be 5.6% [14], and in the case of H → γ * γ analysis, it is assumed to be 6%.
The pre-fit values of the nuisance parameters, averaged over all the categories, are summarized in Table 4.
Based on the fit bias studies, the uncertainty in the background estimation due to the chosen functional form is assumed to be negligible. Furthermore, to combine the H → Zγ → ℓℓγ and H → γ * γ → µµγ channels, uncertainties from theoretical sources, integrated luminosity, object identification, R 9 reweighting, jet energy correction and resolution are considered to be correlated across the categories.
The expected and observed exclusion limits at 95% CL for the process H → γ * γ → µµγ are shown in Fig. 6. The expected limits are between 2.1 and 2.3 times the SM cross section and the observed limit varies between about 1.4 and 4.0 times the SM cross section. The limits are calculated at 1 GeV intervals in the mass range of 120 < m H < 130 GeV. Figure 6 also shows the combined limit for the H → Zγ → ℓℓγ channel. The expected exclusion limits at 95% CL are between 3.9 and 9.1 times the SM cross section and the observed limit varies between about 6.1 and 11.4 times the SM cross section.
Finally, Fig. 7 shows the expected limit for each category and the combined limit for both channels for m H = 125 GeV. The combined observed (background-only expected) limit is 3.9 (2.0) for a 125 GeV Higgs boson decaying to ℓℓγ. The same figure shows the combined expected limit of 2.9, assuming an SM Higgs boson with m H = 125 GeV, decaying to the ℓℓγ channel. After combining both analyses, H → γ * γ → µµγ and H → Zγ → ℓℓγ, and considering the background-only hypothesis, the observed p-value at m H = 125 GeV is 0.02, which corresponds to about two standard deviations. The combined expected p-value for an SM Higgs boson at m H = 125 GeV is 0.16, corresponding to a significance of around one standard deviation.
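The correspondence between the quoted p-values and standard deviations can be checked with the standard library. The sketch assumes the usual one-sided Gaussian convention for converting a p-value into a significance; the function name is hypothetical.

```python
from statistics import NormalDist


def significance(p_value):
    """One-sided Gaussian significance, in standard deviations, for a given p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


z_observed = significance(0.02)  # observed p-value at 125 GeV
z_expected = significance(0.16)  # expected p-value for an SM Higgs boson
```

With this convention a p-value of 0.02 gives roughly 2.05 standard deviations and 0.16 gives roughly 0.99, matching the rounded values stated in the text.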

Summary
A search is performed for a standard model (SM) Higgs boson decaying into a lepton pair and a photon. This final state has contributions from Higgs boson decays to a Z boson and a photon (H → Zγ → ℓℓγ, ℓ = e or µ), or to two photons, one of which has an internal conversion into a muon pair (H → γ * γ → µµγ). The analysis is performed using a data set from pp collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb −1 . No significant excess above the expected background is found. Limits on the Higgs boson production cross section times the corresponding branching fractions are set. The expected exclusion limits at 95% confidence level are about 2.1-2.3 (3.9-9.1) times the SM cross section in the H → γ * γ → µµγ (H → Zγ → ℓℓγ) channel in the mass range from 120 to 130 GeV, and the observed limit varies between about 1.4 and 4.0 (6.1 and 11.4) times the SM cross section. Finally, the H → γ * γ → µµγ and H → Zγ → ℓℓγ analyses are combined for m H = 125 GeV, obtaining an observed (expected) 95% confidence level upper limit of 3.9 (2.0) times the SM cross section.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies:

[24] CMS Collaboration, "The CMS trigger system", JINST 12 (2017) P01020, doi:10.1088/1748-0221/12/01/P01020, arXiv:1609.02366.