Search for a very light NMSSM Higgs boson produced in decays of the 125 GeV scalar boson and decaying into τ leptons in pp collisions at s=8$$ \sqrt{s}=8 $$ TeV

A bstractA search for a very light Higgs boson decaying into a pair of τ leptons is presented within the framework of the next-to-minimal supersymmetric standard model. This search is based on a data set corresponding to an integrated luminosity of 19.7 fb−1 of proton-proton collisions collected by the CMS experiment at a centre-of-mass energy of 8 TeV. The signal is defined by the production of either of the two lightest scalars, h1 or h2, via gluon-gluon fusion and subsequent decay into a pair of the lightest Higgs bosons, a1 or h1. The h1 or h2 boson is identified with the observed state at a mass of 125 GeV. The analysis searches for decays of the a1 (h1) states into pairs of τ leptons and covers a mass range for the a1 (h1) boson of 4 to 8 GeV. The search reveals no significant excess in data above standard model background expectations, and an upper limit is set on the signal production cross section times branching fraction as a function of the a1 (h1) boson mass. The 95% confidence level limit ranges from 4.5 pb at ma1$$ {m}_{{\mathrm{a}}_1} $$mh1=8$$ \left({m}_{{\mathrm{h}}_1}\right)=8 $$ GeV to 10.3 pb at ma1$$ {m}_{{\mathrm{a}}_1} $$mh1=5$$ \left({m}_{{\mathrm{h}}_1}\right)=5 $$ GeV.



Introduction
The recently discovered particle with mass close to 125 GeV [1][2][3] has been shown to have properties that are consistent with those of a standard model (SM) Higgs boson [4][5][6][7][8][9][10][11][12][13][14]. Supersymmetric (SUSY) extensions of the SM [15,16] also predict a particle with such properties and resolve some problems of the SM [17]. The minimal supersymmetric standard model (MSSM) [18,19] postulates the existence of two Higgs doublets, resulting in five physical states: two CP-even, one CP-odd, and two charged Higgs bosons. This version of SUSY has been extensively tested using data collected by the ATLAS and CMS experiments at the CERN LHC. However, nonminimal SUSY extensions have received far less attention. One example is the next-to-MSSM (NMSSM), which extends the MSSM by an additional singlet superfield, interacting only with itself and the two Higgs doublets [18,[20][21][22][23][24][25][26]. This scenario has all the desirable features of SUSY, including a solution of the hierarchy problem and gauge coupling unification. In the NMSSM, the Higgs mixing parameter µ is naturally generated at the electroweak scale through the vacuum expectation value of the singlet field, thereby solving the so-called µ problem of the MSSM [27].

JHEP01(2016)079
Furthermore, the amount of fine tuning required in the NMSSM to obtain a CP-even Higgs boson with a mass of 125 GeV is significantly reduced compared to the MSSM [28][29][30]. The Higgs sector of the NMSSM is larger than that of the MSSM. There are seven Higgs bosons: three CP-even (h 1,2,3 ), two CP-odd (a 1,2 ), and two charged Higgs states. By definition, m h 3 > m h 2 > m h 1 and m a 2 > m a 1 . Over large parts of the NMSSM parameter space, the observed boson with mass close to 125 GeV, hereafter denoted H(125), could be identified with one of the two lightest scalar NMSSM Higgs bosons, h 1 or h 2 .
A vast set of next-to-minimal supersymmetric models is consistent with the SM measurements and constraints from searches for SUSY particles made with LHC, Tevatron, SLAC and LEP data, as well as with the properties of the H(125) boson measured using Run 1 LHC data [31][32][33][34][35][36]. These models provide possible signatures that cannot be realized in the MSSM given recent experimental constraints [37]. For example, the decays H(125) → h 1 h 1 and H(125) → a 1 a 1 are allowed when kinematically possible. These decay signatures have been investigated in phenomenological studies considering a variety of production modes at the LHC [38][39][40][41][42][43][44][45]. The analysis presented in this paper is motivated by the NMSSM scenarios that predict a very light h 1 or a 1 state with mass in the range 2m τ < m h 1 (m a 1 ) < 2m b , where m τ is the mass of the τ lepton and m b is the mass of the b quark. Such a light state is potentially accessible in final states with four τ leptons, where H(125) → h 1 h 1 (a 1 a 1 ) → 4τ [46,47]. In these scenarios the decay H(125) → a 2 a 2 is not kinematically allowed.
Several searches for H(125) → φ 1 φ 1 decays, where φ 1 can be either the lightest CP-even state h 1 or the lightest CP-odd state a 1 , have been performed. The analyses carried out by the OPAL and ALEPH Collaborations at LEP [48,49] searched for the decay of the CPeven Higgs boson into a pair of light CP-odd Higgs bosons, exploiting the Higgs-strahlung process, where the CP-even state is produced in association with a Z boson. These searches found no evidence for a signal, and limits were placed on the signal production cross section times branching fraction. However, searches at LEP did not probe masses of the CP-even state above 114 GeV. A similar study has been performed by the D0 Collaboration at the Tevatron [50], searching for inclusive production of the CP-even Higgs boson in pp collisions followed by its decay into a pair of light CP-odd Higgs bosons. No signal was detected and upper limits were set on the signal production cross section times branching fraction in the mass ranges 3.6 < m a 1 < 19 GeV and 89 < m H < 200 GeV. The limits set by the D0 analysis are a factor one to seven times higher compared to the SM production cross section for pp → H(125) + X.
The CMS Collaboration has recently searched for a very light CP-odd Higgs boson produced in decays of a heavier CP-even state [51]. This study probed the mass of the CP-odd state in the range 2m µ < m a 1 < 2m τ , where m µ is the mass of muon. In this mass range the decay a 1 → µµ can be significant. No evidence for a signal was found and upper limits were placed on the signal production cross section times branching fraction. The ATLAS Collaboration has also recently searched for h/H → a 1 a 1 → µµτ τ [52], covering the mass range m a 1 = 3.7-50 GeV for m H = 125 GeV, and m H = 100-500 GeV for m a 1 = 5 GeV. No excess over SM backgrounds was observed, and upper limits were placed on The search for the production of a pair of light bosons with their subsequent decay into four τ leptons has not yet been performed at the LHC and is the subject of this paper. The choice of the 4τ channel makes it possible to probe the signal cross section times branching fraction in a model-independent way.

Signal topology
This paper describes a search for the production of the H(125) boson, with its decay into a pair of light NMSSM Higgs bosons φ 1 . The signal can be associated with one of three possible scenarios: • H(125) corresponds to h 2 and decays into a pair of h 1 states, h 2 → h 1 h 1 ; • H(125) corresponds to h 2 and decays into a pair of a 1 states, h 2 → a 1 a 1 ; • H(125) corresponds to h 1 and decays into a pair of a 1 states, h 1 → a 1 a 1 .
The analysis is optimized for the gluon-gluon fusion process, which is the dominant production mechanism of the H(125) boson at the LHC. The signal topology is illustrated in figure 1. The search is performed for very light φ 1 states, covering a mass range of 4 to 8 GeV. Within this mass range the φ 1 boson is expected to decay predominantly into a pair of τ leptons, φ 1 → τ τ . In the decay of each φ 1 , one of the τ leptons is identified via its muonic decay. The other τ lepton is required to decay into a one-prong mode, i.e. a decay into one charged particle (electron, muon, or hadron) and one or more neutral particles. We identify these decays by the presence of one reconstructed track with charge sign opposite to that of the closest muon. Neutral particles are not considered in the event selection.
Given the large difference in mass between the φ 1 and the H(125) states (m H(125) m φ 1 ), one expects the φ 1 bosons to have large Lorentz boosts and their decay products to be collimated. Furthermore, in the gluon-gluon fusion process the H(125) state is mainly produced with relatively small transverse momentum p T . Thus, in the majority of H(125) → φ 1 φ 1 decays, the φ 1 states would be produced nearly back-to-back in the plane transverse to the beams, with a large separation in azimuthal angle φ between the decay products of the two φ 1 bosons. The H(125) state can be produced with relatively high transverse momentum if a hard gluon is radiated from the initial-state gluons or the heavy-quark loop. In this case, the separation between two φ 1 bosons in azimuthal angle is reduced, while the separation in pseudorapidity η can still be large. The pseudorapidity is defined via the polar angle θ as η ≡ − ln [tan (θ/2)].  The label "µ ∓ /e ∓ /h ∓ " denotes a muon, electron, or charged-hadron track.
The φ 1 → τ τ decays into final states without muons are not considered in the analysis. These decays are mimicked by hadronic jets with a significantly higher probability compared to final states with at least one muon and contribute marginally to the search sensitivity.
The signal properties discussed above are used to define the search topology. The analysis presented here searches for the signal in a sample of dimuon events with large angular separation between the muons. The two muons are required to have the same sign. This criterion almost entirely eliminates background from the Drell-Yan process, gauge boson pair production, and tt production. Each muon is accompanied by one nearby oppositesign track. Further details of the kinematic selection are given in section 4. Throughout this paper, the signal yields are normalized to the benchmark value of the signal production cross section times branching fraction of 5 pb. The choice of the benchmark scenario is motivated by recent phenomenological analyses [46,47].

CMS detector, data, and simulated samples
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a field of 3.8 T. The innermost component of the detector is a silicon pixel and strip tracker, which is used to measure the momenta of charged particles and reconstruct collision vertices. The tracker, which covers the pseudorapidity range |η| < 2.5, is surrounded by a crystal electromagnetic calorimeter and a brass and scintillator hadronic calorimeter, both placed inside the solenoid. These calorimeters cover |η| < 3.0. A quartz fiber Cherenkov forward hadron detector extends the calorimetric coverage to |η| < 5.0. Muons are measured in the pseudorapidity range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. The first level of the CMS trigger system, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a fixed time interval of 4 µs. The high-level trigger processor farm further decreases the event rate from around 100 kHz to less than 1 kHz before data storage. A more detailed -4 -

JHEP01(2016)079
description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [53].
The data set used in this analysis was recorded in 2012 and corresponds to an integrated luminosity of 19.7 fb −1 of pp collisions at √ s = 8 TeV. The Monte Carlo (MC) event generator pythia 6.426 [54] is used to model the NMSSM Higgs boson signal produced via gluon-gluon fusion. The H(125) boson p T spectrum from pythia is reweighted to the spectrum obtained from a next-to-leading-order computation with a next-to-next-to-leading logarithmic accuracy using the HqT 2.0 program [55,56], which performs the resummation of the large logarithmic contributions appearing at transverse momenta much smaller than the mass of the Higgs boson. For optimisation studies, diboson and quantum chromodynamics (QCD) multijet backgrounds are simulated by pythia. Inclusive Z, W, and tt production are modelled with MadGraph 5.1 [57]. The MadGraph generator is interfaced with pythia for parton showering and fragmentation. The pythia parameters that steer the simulation of hadronisation and the underlying event are set to the most recent pythia Z2* tune. This tune is derived from the Z1 tune [58], which uses the CTEQ5L parton distribution function (PDF) set, whereas Z2* adopts the CTEQ6L PDF set [59]. The tauola package [60] is used for τ lepton decays in all cases. All generated events, with the exception of a few special QCD multijet samples discussed in section 6.2, are processed through a detailed simulation of the CMS detector, based on Geant4 [61], and are reconstructed employing the same algorithms as for data.

Event selection
Events are recorded using double-muon triggers with thresholds on the muon transverse momenta of 17 GeV for the leading muon and 8 GeV for the subleading one. To pass the highlevel trigger, the tracks of the two muons are additionally required to have points of closest approach to the beam axis within 2 mm of each other along the longitudinal direction.
In 2012, the average number of pp interactions per LHC bunch crossing (pileup) was about 20. The simulated MC events are reweighted to represent the distribution of the number of pileup interactions per bunch crossing in data.
For each reconstructed collision vertex, the sum of the p T 2 of all tracks associated with the vertex is computed. The vertex for which this quantity is largest is assumed to correspond to the hard-scattering process, and is referred to as the primary vertex (PV). The identification and reconstruction of muons is achieved by matching track segments found in the silicon tracker with those found in the muon detectors [62]. Additional requirements are applied on the number of measurements in the inner pixel and outer silicon strip detectors, on the number of matched segments in the muon detectors, and on the quality of the global muon track fit, quantified by χ 2 .
The data are further selected by requiring at least one pair of muons with the same charge. This requirement significantly suppresses background contributions originating from the Drell-Yan process, from decays of tt pairs, and from QCD multijet events with muonic decays of heavy-flavour hadrons. The leading muon is required to have p T > 17 GeV and |η| < 2.1. The subleading muon is required to have p T > 10 GeV and |η| < 2.1. To -5 -

JHEP01(2016)079
reject QCD multijet events with muonic decays of hadrons containing charm or bottom quarks, selections are applied on the impact parameters of the muon tracks. The impact parameter in the transverse plane is required to be smaller than 300 µm with respect to the PV. The longitudinal impact parameter is required to be smaller than 1 mm with respect to the PV. The two selected same-sign muons are required to be separated by ∆R(µ, µ) = √ ∆η 2 + ∆φ 2 > 2, where ∆η is the separation in pseudorapidity and ∆φ is the separation in azimuthal angle between the two muons. If more than one same-sign muon pair is found in the event, the pair with the largest scalar sum of muon transverse momenta is chosen.
The analysis makes use of reconstructed tracks that fulfill selection criteria based on the track fit quality, the number of measurements in the inner pixel and outer strip silicon detector, and track impact parameters with respect to the PV [63]. Tracks must have p T > 1 GeV and |η| < 2.4. The impact parameter in the transverse plane and the longitudinal impact parameter are required to be smaller than 1 cm relative to the PV.
Given the search topology, we require each muon to be accompanied by exactly one track satisfying these criteria within a ∆R cone of radius 0.5 centred on the muon direction. We label such muon-track pairs as "isolated".
The loose impact parameter requirements on the tracks are designed to suppress background events in which a heavy-flavour hadron decays into a muon and several charged particles. Although tracks from these decay products will be displaced from the PV, they can still satisfy the loose track impact parameter criteria. Such events are rejected by the requirement of exactly one track accompanying the muon.
The track around each muon is identified as a one-prong τ lepton decay candidate if it fulfils the following selection criteria.
• The nearby track is required to have charge opposite to the muon.
• The transverse and longitudinal impact parameters of the track are required to be smaller than 200 µm and 400 µm relative to the PV, respectively.

Signal extraction
The set of selection requirements outlined in the previous section defines the signal region. The number of selected data events, the expected background and signal yields, and the signal acceptances after selection in the signal region are reported in table 1. The expected background and signal yields, along with the signal acceptances, are obtained from simulation. The signal yields are normalized to the benchmark value of the signal production cross section times branching fraction of 5 pb. The quoted uncertainties in predictions from simulation include only MC statistical uncertainties. It should be noted that no MC simulation is used to evaluate the background in the analysis described below as the modelling is based fully on data. The expected background yields presented in table 1 show that the final selected sample is dominated by QCD multijet events, and that the -6 -

JHEP01(2016)079
contribution from other background sources is negligible, constituting less than 1% of all selected events. Although MC simulation is not directly used to estimate background, the simulated samples play an important role in the validation of the background modelling as described in section 6. The signal acceptances are computed with respect to all possible decays of the four τ leptons, and include a branching fraction factor where the factor 1/2 accounts for the selection of same-sign muon pairs, and B(φ 1 → τ µ τ one-prong ) denotes the branching fraction of the φ 1 → τ τ decays to the final states characterized by the presence of only two charged particles where at least one of the charged particles is a muon. This branching fraction is expressed as where B(τ → one-prong) denotes the total branching fraction of the τ decay to one charged particle with any number of neutral particles. The factor of two in the first term accounts for the two possible charges of the required muonic decay: τ − τ + → µ − + one-prong + and τ − τ + → µ + + one-prong − . Subtraction of the term B 2 (τ → µνν) avoids double counting in the case where the two τ leptons produced by a given φ 1 both decay to muons. The invariant mass of each selected muon and the nearby track is reconstructed. The two-dimensional distribution of the invariant mass of each selected muon and the nearby track is used to discriminate between the signal and the QCD multijet background; the signal is extracted by means of a fit to this two-dimensional distribution. The binning of the two-dimensional (m 1 , m 2 ) distributions is illustrated in figure 2. For masses below 3 GeV, bins of 1 GeV width are used for both m 1 and m 2 . For masses in the range 3 < m 1 (m 2 ) < 10 GeV, a single bin is used. This choice avoids poorly populated bins in the two-dimensional (m 1 , m 2 ) distributions in the background control regions used to construct and validate the QCD multijet background model (section 6). For each selected event, the (m 1 , m 2 ) histogram is filled once if the pair of quantities (m 1 , m 2 ) occurs in one of the diagonal bins and twice, once with values (m 1 , m 2 ) and a second time with the swapped values (m 2 , m 1 ), for off-diagonal bins. This procedure insures the symmetry of the twodimensional (m 1 , m 2 ) distribution. To avoid double counting of events, the off-diagonal bins (i, j) with i > j are excluded from the procedure of the signal extraction (the hatched bins in figure 2). Thus, the number of independent bins is reduced from 4 × 4 = 16 to 4 × (4 + 1)/2 = 10.
In order to fit the data in the 10 bins of the two-dimensional distribution of figure 2, a two-component fit is performed using two-dimensional distributions ("templates") describing the QCD multijet background and the signal. The normalisations of background and signal components are free parameters in this fit. The two-dimensional template for the signal is obtained from the simulation using the generator described in section 3. The two-dimensional template for the QCD multijet background is extracted from the data as explained in the next section.
Data -873 Table 1. The number of observed events, expected background and signal yields, and signal acceptances after final selection. The computed signal acceptances include the branching fraction factor B 2 (φ 1 → τ µ τ one-prong )/2. The electroweak background contribution includes the Drell-Yan process, W + jets production, and diboson production of WW, WZ, and ZZ. The numbers of signal events are reported for the benchmark value of the signal production cross section times branching fraction of 5 pb. The expected background and signal yields and signal acceptances are obtained from simulation. The quoted uncertainties in predictions from simulation include only statistical uncertainties related to the size of MC samples.

Modelling of the QCD multijet background shape
A simulation study shows that the sample of same-sign dimuon events selected as described in section 4, but without requiring a presence of one-prong τ candidates and without applying the isolation requirement for the muon-track systems, is dominated by QCD multiparton production, where 94% of all selected events contain b quarks in the final state. The same-sign muon pairs in these events originate mainly in the following cases.
• Muonic decay of a bottom hadron in one b quark jet, and cascade decay of a bottom hadron into a charmed hadron with subsequent muonic decay of a charmed hadron in the other b quark jet.
• Muonic decay of a bottom hadron in one b quark jet, and decay of a quarkonium state into a pair of muons in the other jet. The normalization of the QCD multijet background is not constrained prior to the extraction of the signal. The procedure used to model the shape of the two-dimensional (m 1 , m 2 ) distribution of QCD multijet events in the signal region is described in this section.
Given the symmetry of the two-dimensional (m 1 , m 2 ) distribution, the modelling of the QCD multijet background shape is derived from the two-dimensional probability density is the two-dimensional pdf of the invariant masses of the muon-track systems, m 1 and m 2 , in the sample of QCD multijet events selected in the signal region; • f 1D (m i ) is the one-dimensional pdf of the invariant mass of the muon-track system in the sample of QCD multijet events selected in the signal region; • C(m 1 , m 2 ) is a symmetric function of two arguments, C(m 1 , m 2 ) = C(m 2 , m 1 ), reflecting the correlation between m 1 and m 2 .
A constant correlation function would indicate the absence of correlation between m 1 and m 2 . Based on eq. (6.1), the content of bin (i, j) of the symmetric normalized two- The modelling of f 1D (m) and C(m 1 , m 2 ), described in the following, is necessary in order to build the template f 2D (i, j).
The f 1D (m) pdf is modelled using a QCD-enriched control data sample disjoint from the signal region. Events in the control sample are required to satisfy all selection criteria, except for the isolation of the second muon-track system. The second muon is required to be accompanied by either two or three nearby tracks with p T > 1 GeV and impact parameters smaller than 1 cm relative to the PV both in the transverse plane and along the beam axis. The simulation shows that more than 99% of events selected in this control region, hereafter referred to as N 23 , are QCD multijet events. The modelling of the f 1D (m) pdf is based on the assumption that the kinematic distributions for the first muon-track system are not affected by the isolation requirement imposed on the second, and therefore the f 1D (m) pdf of the isolated muon-track system is the same in the signal region and the region N 23 .
A direct test of this assumption, given the limited size of the simulated sample of QCD multijet events, is not conclusive, and a test is therefore performed with an additional control sample. Events are selected in this control sample if one of the muons has at least one track passing the one-prong τ decay candidate criteria within a ∆R cone of radius 0.5 around the muon direction, with any number of additional tracks within the same ∆R cone. As more than one of these tracks can pass the selection criteria for a one-prong τ decay candidate, we investigate two scenarios. In one scenario, the lowest p T ("softest") track passing the one-prong τ decay candidate criteria is used to calculate the muon-track invariant mass, while in the other scenario the highest p T ("hardest") track passing the one-prong τ decay candidate criteria is used. If only one τ track is found around the first muon, the track is regarded as both "hardest" and "softest". For the second muon, two isolation requirements are considered: when the muon is accompanied by only one track passing the one-prong τ decay candidate criteria (N trk,2 = 1) as in the signal region, or when it is accompanied by two or three tracks (N trk,2 = 2, 3) with p T > 1 GeV and impact parameters smaller than 1 cm relative to the PV as in the region N 23 . The shapes of invariant mass distributions of the first muon and the softest or hardest accompanying track are then compared for the two different isolation requirements on the second muon, N trk,2 = 1 and N trk,2 = 2, 3. The test is performed both on data and on the simulated sample of QCD multijet events. The results of this study are illustrated in figure 3. In all considered cases, the shape of the invariant mass distribution is compatible within statistical uncertainties between the two cases, N trk,2 = 1 and N trk,2 = 2, 3. This observation validates the assumption that the f 1D (m) pdf can be determined in the control region N 23 .   ulation, for two representative mass hypotheses, m φ 1 = 4 and 8 GeV. The invariant mass of the muon-track system is found to have high discrimination power between the QCD multijet background and signal at m φ 1 = 8 GeV. At smaller m φ 1 the signal shape becomes more similar to the background shape, resulting in a reduction of discrimination power.
The normalized distribution f 1D (i) with the binning defined in figure 2 is extracted from the background distribution shown in figure 4.

Modelling of C(m 1 , m 2 )
In order to determine the correlation coefficients C(i, j) we define an additional control region A enriched in QCD multijet events. This control region consists of events that contain two same-sign muons passing the identification and kinematic selection criteria outlined in section 4. Each muon is required to have two or three nearby tracks within a ∆R cone of radius 0.5 around the muon direction. One and only one of these tracks must satisfy the criteria imposed on one-prong τ lepton decay candidates with p T > 2.5 GeV. The additional tracks must have transverse momentum in the range 1 < p T < 2.5 GeV. A total of 9127 data events are selected in this control region. The MC simulation predicts that the QCD multijet background dominates in region A, comprising more than 99% of all selected events. The simulation study also shows that the overall background-to-signal ratio is enhanced compared to the signal region by a factor of 15 to 20, depending on the mass hypothesis m φ 1 . Despite the large increase in the overall background-to-signal ratio, potential signal contamination in individual bins of the mass distributions can be nonnegligible. Bin-by-bin signal contamination in region A is discussed in section 7. For each event in control region A, the pair (m 1 ,m 2 ) of muon-track invariant masses is calculated. This pair is used to build the symmetrized normalized two-dimensional distribution f 2D (i, j) defined in figure 2. Then C(i, j) is obtained according to eq. (6.2) as where f 1D (i) is the one-dimensional normalized distribution with two entries per event (m 1 and m 2 ) built as for figure 4. Correlation coefficients C(i, j) derived from data in region A are presented in figure 5. A direct comparison of C(i, j) between the signal region and region A would be impossible in the simulated sample of QCD multijet events because of the very small numbers of events selected in the signal region and in region A. In order to assess the difference -13 -

JHEP01(2016)079
in C(i, j) between the signal region and region A, a dedicated MC study is performed, making use of a large exclusive sample generated with pythia. The simulation includes only two leading-order QCD multijet production mechanisms: the creation of a bb quark pair via gg → bb and qq → bb. The detector simulation and event reconstruction are not performed for this sample, and the comparison of C(i, j) between the signal region and region A is made using generator-level quantities.
These simplifications are validated by performing a set of consistency tests, making use of the available MC sample of QCD multijet events processed through the full detector simulation and event reconstruction. These tests are performed in a control region B, where each muon is required to have at least one track passing the one-prong τ decay candidate selection criteria, i.e. with p T > 2.5 GeV and impact parameters smaller than 200 µm and 400 µm in the transverse plane and along beam axis, respectively. Along with this requirement each muon is allowed to have one or more tracks within a ∆R cone of radius 0.5 around the muon direction, with p T > 1 GeV and impact parameters smaller than 1 cm. Control region B is characterized by a significantly larger yield of QCD multijet events compared to the signal region and control region A, thus making it possible to perform reliable MC consistency tests and assess the uncertainties in C(i, j). Two scenarios are investigated: 1) muons are paired with the softest one-prong τ decay candidate and 2) muons are paired with the hardest one-prong τ decay candidate. If only one oneprong τ decay candidate is found around a muon, it is regarded as both "softest" and "hardest". In both scenarios the correlation coefficients computed using the reconstructed four-momenta of muons and tracks are found to be compatible with those computed using generator-level four-momenta, within statistical uncertainties. Furthermore, the correlation coefficients computed with the inclusive QCD multijet sample are found to be compatible with those computed in the exclusive MC sample including only the gg(qq) → bb production mechanisms. This observation validates the use of the generator-level information and the exclusive bb MC sample to compare C(i, j) between the signal region and control region A. This comparison is presented in figure 6. The uncertainties in C(i, j) represent a quadratic sum of the systematic and MC statistical uncertainties. The systematic uncertainties are derived from the control region B. They take into account 1) any differences in C(i, j) calculated using the inclusive QCD multijet sample compared with the exclusive bb sample and 2) any differences in C(i, j) calculated using full detector simulation and event reconstruction compared with the study using generator-level quantities. Within their uncertainties the correlation coefficients C(i, j) in the signal region and in region A are compatible. We therefore use C(i, j) derived from data in region A to predict the QCD multijet background shape in the signal region according to eq. (6.2).

Systematic uncertainties
The analysis is affected by various systematic uncertainties, which are classified into two groups. The first group consists of uncertainties related to the background, while the second group includes uncertainties related to the signal. The systematic uncertainties are summarized in table 2. Generator study (8 TeV) Figure 6. The (m 1 , m 2 ) correlation coefficients C(i, j) determined in the control region A (circles) and in the signal region (squares) from the MC study carried out at generator level with the exclusive MC sample of QCD multijet events resulting from gg(qq) → bb production mechanisms. The bin notation follows the definition presented in figure 2. The vertical bars include both statistical and systematic uncertainties.

Uncertainties related to background
The estimation of the QCD multijet background is based solely on data and is therefore not affected by imperfections in the simulation of the detector response and inaccuracies in the modelling of the muon and track reconstruction.
The shape of the background in the two-dimensional (m 1 , m 2 ) distribution is modelled according to eq. (6.2). The uncertainty in the two-dimensional shape f 2D (m 1 , m 2 ) is dominated by uncertainties in the correlation coefficients C(i, j) derived in the QCD multijet background-enriched control region A as described in section 6. The statistical uncertainties in C(i, j) in region A range from 2 to 14%, as seen in figure 5. These uncertainties are accounted for in the signal extraction procedure by 10 independent nuisance parameters, one nuisance parameter per bin in the (m 1 , m 2 ) distribution. The systematic uncertainties related to the extrapolation of C(i, j) from the control region A to the signal region are derived from the dedicated MC study. The correlation coefficients are found to be compatible between the signal region and the control region A within uncertainties ranging from 2 to 22% (Figure 6). These uncertainties are accounted for by 10 additional independent nuisance parameters.
The possible contamination of control region A by the signal may bias the estimation of the correlation coefficients and consequently have an impact on the evaluation of the QCD multijet background. The effect is estimated with a conservative assumption on the branching fraction B(H(125) → φ 1 φ 1 ) B 2 (φ 1 → τ τ ) of 32%, which corresponds to the 95% confidence level (CL) upper limit set by CMS on the branching fraction of the H(125) boson decays to non-standard model particles [4] Table 2. Systematic uncertainties and their effect on the estimates of the QCD multijet background and signal. The effect of the uncertainties in C(i, j) on the total background yield is absorbed by the overall background normalization, which is allowed to vary freely in the fit. is set to the value predicted in the standard model (19.3 pb). Under these assumptions, the contamination of region A by the signal is estimated to be less than 2% for all mass hypotheses m φ 1 and in all bins of the two-dimensional (m 1 , m 2 ) distribution, with the exception of bin (4,4), where the contamination can reach 12% for m φ 1 = 8 GeV. However, the overall effect on the signal extraction is found to be marginal. Within this conservative scenario, variations of C(i, j) due to possible contamination of control region A by the signal modify the observed and expected upper limits at 95% CL on (σB) sig by less than 1% for all considered values of m φ 1 .

Uncertainties related to signal
The following uncertainties in the signal estimate are taken into account, and are summarized in table 2.
An uncertainty of 2.6% is assigned to the integrated luminosity estimate [64]. The uncertainty in the muon identification and trigger efficiency is estimated to be 2% using the tag-and-probe technique applied to a sample of Z → µµ decays. Because final states with two muons are selected in this analysis, this uncertainty translates into a 4% systematic uncertainty in the signal acceptance.

JHEP01(2016)079
The track selection and isolation efficiency is assessed with a study performed on a sample of Z bosons decaying into a pair of τ leptons. In the selected Z → τ τ events, one τ lepton is identified via its muonic decay, while the other is identified as an isolated track resulting from a one-prong decay. The track is required to pass the nominal selection criteria used in the main analysis. From this study the uncertainty in the track selection and isolation efficiency is estimated to be 5%. As the analysis requires each muon to be accompanied by one track, this uncertainty gives rise to a 10% systematic uncertainty in the signal acceptance.
The muon momentum and track momentum scale uncertainties are smaller than 0.5% and have a negligible effect on the analysis.
The bin-by-bin MC statistical uncertainties in the signal acceptance range from 7 to 100%. Their impact on the signal normalization is between 4 and 6% as indicated in table 1. These uncertainties are accounted for in the signal extraction procedure by 10 nuisance parameters, corresponding to 10 independent bins in the (m 1 , m 2 ) distribution.
Theoretical uncertainties have an impact on the differential kinematic distributions of the produced H(125) boson, in particular its p T spectrum, thereby affecting signal acceptance. The uncertainty due to missing higher-order corrections to the gluon-gluon fusion process are estimated with the HqT program by varying the renormalization (µ r ) and factorization (µ f ) scales. The H(125) p T -dependent k factors are recomputed according to these variations and applied to the simulated signal samples. The resulting effect on the signal acceptance is estimated to be of the order of 1%.
The HqT program is also used to evaluate the effect of the PDF uncertainties. The nominal k factors for the H(125) boson p T spectrum are computed with the MSTW2008nnlo PDF set [65]. Variations of the MSTW2008nnlo PDFs within their uncertainties change the signal acceptance by about 1%, whilst using the CTEQ6L1 PDF set changes the signal acceptance by about 0.7%. These variations are covered by the assigned uncertainty of 1%.
The contribution of b quark loops to the gluon-gluon fusion process depends on the NMSSM parameters, in particular tan β, the ratio of the vacuum expectation values of the two NMSSM Higgs doublets. The corresponding uncertainty is conservatively estimated by calculating k factors for the H(125) boson p T spectrum with powheg [66][67][68][69], removing any contribution from the top quark loop and retaining only the contribution from the b quark loop. The modified k factors applied to the simulated signal samples change the signal acceptance by approximately 3% for all mass hypotheses m φ 1 .

Results
The signal is extracted with a binned maximum-likelihood fit applied to the twodimensional (m 1 , m 2 ) distribution in data. For each mass hypothesis of the φ 1 boson, the (m 1 , m 2 ) distribution in data is fitted with the QCD multijet background shape and the gg → H(125) signal shape for the φ 1 mass under test. The contribution to the final selected sample from vector boson fusion and vector boson associated production of the H(125) boson is suppressed by the selection described in section 4, i.e. by the requirement ∆R(µ, µ) > 2. The impact of other backgrounds on the fit is found to be negligible. The signal shapes are derived from simulation. The background shape is evaluated from data, as described in section 6. The systematic uncertainties are accounted for in the fit via nuisance parameters with log-normal pdfs.
The contribution to the final selected sample from vector boson fusion (qqH) and vector boson associated production (VH) of the H(125) boson is suppressed by the selection described in section 4, especially by the requirement ∆R(µ, µ) > 2. For the values of the H(125) boson production cross sections predicted in the SM, the expected contribution from the qqH and VH processes to the final selected sample is estimated to be less than 4% of total signal yield for all tested m φ 1 hypotheses. The shapes of the two-dimensional (m 1 , m 2 ) distributions are found to be nearly indistinguishable among the three considered production modes, making it difficult to extract individual contributions from these processes in a model independent way. In the following these contributions are neglected, resulting in more conservative upper limits on (σB) sig . Subtraction of the qqH and VH contributions assuming the SM cross sections for the H(125) production mechanisms would decrease the upper limits on (σB) sig by less than 4% for all tested values of m φ 1 .
First, the data are examined for their consistency with the background-only hypothesis by means of a fit performed with the normalization of the signal fixed to zero. Figure 7 (left) shows the two-dimensional (m 1 , m 2 ) distribution unrolled into a one-dimensional array of analysis bins after performing the maximum-likelihood fit under the backgroundonly hypothesis. The signal distribution, although not used in the fit, is also included for comparison, for the mass hypotheses m φ 1 = 4 and 8 GeV.   Table 3. The number of observed data events, the predicted background yields, and the expected signal yields, for different masses of the φ 1 boson in individual bins of the (m 1 , m 2 ) distribution. The background yields and uncertainties are obtained from the maximum-likelihood fit under the background-only hypothesis. The signal yields are obtained from simulation and normalized to a signal cross section times branching fraction of 5 pb. The uncertainties in the signal yields include systematic and MC statistical uncertainties. The bin notation follows the definition presented in figure 2. Table 3 presents the number of observed data events, the predicted background yields obtained from a fit under the background-only hypothesis, and the expected signal yields obtained from simulation, for each unique bin in the two-dimensional (m 1 , m 2 ) distribution. The data are well described by the background-only model.
The signal cross section times branching fraction is constrained by performing a fit under the signal+background hypothesis, where both the background and signal normalisations are allowed to vary freely in the fit. A representative example of the fit under the signal+background hypothesis at m φ 1 = 8 GeV is presented in figure 7 (right). No significant deviations from the background expectation are observed in data. Only a small excess is found for 6 ≤ m φ 1 ≤ 8 GeV, with a local significance ranging between 1.2σ (m φ 1 = 8 GeV) and 1.4σ (m φ 1 = 6 GeV). Results of the analysis are used to set upper limits on (σB) sig at 95% CL. The modified frequentist CL s criterion [70,71], implemented in the RooStats package [72], is used for the calculation of the exclusion limits. Figure 8 shows the observed upper limit on (σB) sig at 95% CL, together with the expected limit obtained under the background-only hypothesis, for m φ 1 in the range from 4 to 8 GeV. Exclusion limits are also reported in table 4.
The observed limit is compatible with the expected limit within two standard deviations in the entire tested range of the φ 1 boson mass, 4 ≤ m φ 1 ≤ 8 GeV. The observed limit ranges from 4.5 pb at m φ 1 = 8 GeV to 10.3 pb at m φ 1 = 5 GeV. The expected limit ranges from 2.9 pb at m φ 1 = 8 GeV to 10.6 pb at m φ 1 = 4 GeV.   Table 4. The observed upper limit on (σB) sig at 95% CL, together with the expected limit obtained in the background-only hypothesis, as a function of m φ1 . Also shown are ±1σ and ±2σ probability intervals around the expected limit.
The analysis presented here complements the search for h/H → a 1 a 1 → µµτ τ performed by the ATLAS Collaboration [52], providing results in the 4τ channel, which has not been previously explored at the LHC.

Summary
A search for a very light NMSSM Higgs boson a 1 or h 1 , produced in decays of the observed boson with a mass near 125 GeV, H(125), is performed on a pp collision data set corresponding to an integrated luminosity of 19.7 fb −1 , collected at a centre-of-mass energy of 8 TeV. The analysis searches for the production of an H(125) boson via gluon-gluon fusion, -20 -

JHEP01(2016)079
and its decay into a pair of a 1 (h 1 ) states, each of which decays into a pair of τ leptons. The search covers a mass range of the a 1 (h 1 ) boson of 4 to 8 GeV. No significant excess above background expectations is found in data, and upper limits at 95% CL are set on the signal production cross section times branching fraction, where φ 1 is either the a 1 or h 1 boson. The observed upper limit at 95% CL on (σB) sig ranges from 4.5 pb at m φ 1 = 8 GeV to 10.3 pb at m φ 1 = 5 GeV.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: