Search for the neutral Higgs bosons of the minimal supersymmetric standard model in pp collisions at √ s = 7 TeV with the ATLAS detector

A search for neutral Higgs bosons of the Minimal Supersymmetric Standard Model (MSSM) is reported. The analysis is based on a sample of proton-proton collisions at a centre-of-mass energy of 7 TeV recorded with the ATLAS detector at the Large Hadron Collider. The data were recorded in 2011 and correspond to an integrated luminosity of 4.7 fb⁻¹ to 4.8 fb⁻¹. Higgs boson decays into oppositely-charged muon or τ lepton pairs are considered for final states requiring either the presence or absence of b-jets. No statistically significant excess over the expected background is observed and exclusion limits at the 95% confidence level are derived. The exclusion limits are for the production cross-section of a generic neutral Higgs boson, φ, as a function of the Higgs boson mass and for h/A/H production in the MSSM as a function of the parameters m_A and tan β in the m_h^max scenario for m_A in the range of 90 GeV to 500 GeV.


Introduction
Discovering the mechanism responsible for electroweak symmetry breaking is one of the major goals of the physics programme at the Large Hadron Collider (LHC) [1]. In the Standard Model this mechanism requires the existence of a single scalar particle, the Higgs boson [2][3][4][5][6]. The recent discovery of a particle compatible with the Higgs boson at the LHC [7,8] provides further evidence in support of this simple picture. Even if this recently discovered particle is shown to have properties very close to the Standard Model Higgs boson, there are still a number of problems that are not addressed. For instance, quantum corrections to the mass of the Higgs boson contain quadratic divergences. This problem can be solved by introducing supersymmetry, a symmetry between fermions and bosons, by which the divergent corrections to the Higgs boson mass are cancelled.
In the Minimal Supersymmetric Standard Model (MSSM) [9, 10], two Higgs doublets are necessary, coupling separately to up-type and down-type fermions. This results in five physical Higgs bosons, two of which are neutral and CP-even (h, H), one of which is neutral and CP-odd (A), and two of which are charged (H±). At tree level their properties can be described in terms of two parameters, typically chosen to be the mass of the CP-odd Higgs boson, m_A, and the ratio of the vacuum expectation values of the two Higgs doublets, tan β. In the MSSM, the Higgs boson couplings to τ leptons and b-quarks are strongly enhanced for a large part of the parameter space. This is especially true for large values of tan β, in which case the decay of a Higgs boson to a pair of τ leptons or b-quarks and its production in association with b-quarks play a much more important role than in the Standard Model.
The results presented in this paper are interpreted in the context of the m max h benchmark scenario [11]. In the m max h scenario the parameters of the model are chosen such that the mass of the lightest CP-even Higgs boson, h, is maximised for a given point in the m A -tan β plane, under certain assumptions. This guarantees conservative exclusion bounds from the LEP experiments. The sign of the Higgs sector bilinear coupling, µ, is generally not constrained, but for the analysis presented in this paper µ > 0 is chosen as this is favoured by the measurements of the anomalous magnetic dipole moment of the muon [12]. The most common MSSM neutral Higgs boson production mechanisms at a hadron collider are the b-quark associated production and gluon-fusion processes, the latter process proceeding primarily through a b-quark loop for intermediate and high tan β. Both processes have cross-sections that increase with tan β, with the b-associated production process becoming dominant at high tan β values. The most common decay modes at high tan β are to a pair of b-quarks or τ leptons, with branching ratios close to 90% and 10%, respectively, across the mass range considered. The direct decay into two muons occurs rarely, with a branching ratio around 0.04%, but offers a clean signature.
Previous searches for neutral MSSM Higgs bosons have been performed at LEP [13], the Tevatron [14-19] and the LHC [20,21]. The recently observed Higgs-boson-like particle at the LHC [7,8] is consistent with both the Standard Model and the lightest CP-even MSSM Higgs boson [22,23]. In this paper a search for neutral MSSM Higgs bosons is presented, using 4.7 fb⁻¹ to 4.8 fb⁻¹ of proton-proton collision data collected with the ATLAS detector [24] in 2011 at a centre-of-mass energy of 7 TeV. The µ+µ− and τ+τ− decay modes are considered, with the latter divided into separate search channels according to the subsequent τ lepton decay modes. Events from each channel are further classified according to the presence or the absence of an identified b-jet.

The ATLAS detector
The ATLAS experiment at the LHC is a multi-purpose particle detector with a forward-backward symmetric cylindrical geometry and nearly 4π coverage in solid angle [24]. It consists of an inner tracking detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer. The inner tracking detector covers the pseudorapidity range |η| < 2.5. It consists of silicon pixel, semi-conductor micro-strip, and transition radiation tracking detectors. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A hadronic (iron/scintillator-tile) calorimeter covers the central pseudorapidity range (|η| < 1.7). The end-cap and forward regions are instrumented with LAr calorimeters for both EM and hadronic energy measurements up to |η| = 4.9. The muon spectrometer surrounds the calorimeters and incorporates three large air-core toroid superconducting magnets with bending power between 2.0 Tm and 7.5 Tm, a system of precision tracking chambers and fast detectors for triggering. A three-level trigger system is used to select events. The first-level trigger is implemented in hardware and uses a subset of the detector information to reduce the rate to at most 75 kHz. This is followed by two software-based trigger levels that together reduce the event rate to approximately 300 Hz. The trigger requirements were adjusted to changing data-taking conditions during 2011.

Data and Monte Carlo simulation samples
The data used in this search were recorded by the ATLAS experiment during the 2011 LHC run with proton-proton collisions at a centre-of-mass energy of 7 TeV. They correspond to an integrated luminosity of 4.7 fb −1 (τ + τ − channels) or 4.8 fb −1 (µ + µ − channel) after imposing the data quality selection criteria to require that all relevant detector sub-systems used in these analyses were operational.
Higgs boson production: The Higgs boson production mechanisms considered are gluon-fusion and b-associated production. The cross-sections for the first process have been calculated using HIGLU [25] and ggh@nnlo [26]. For b-associated production, a matching scheme described in reference [27] is used to combine four-flavour [28,29] calculations and the five-flavour bbh@nlo [30] calculation. The masses, couplings and branching ratios of the Higgs bosons are computed with FeynHiggs [31]. Details of the calculations and the associated uncertainties due to the choice of the value of the strong coupling constant, the parton distribution function and the factorisation and renormalisation scales can be found in reference [32]. Gluon-fusion production is simulated with POWHEG [33], while b-quark associated production is simulated with SHERPA [34].
The h/A/H → µ + µ − and h/A/H → τ + τ − modes are considered for the decay of the Higgs boson. The A boson samples are generated for both production mechanisms and are also employed for modelling H and h production. The differences between CP-even and CP-odd eigenstates are negligible for this analysis. The signal modelling for a given combination of m A and tan β takes into account all three Higgs bosons h, H and A, by adding the A boson mass samples corresponding to m h , m H and m A according to their production cross-sections and masses in the m max h , µ > 0 MSSM benchmark scenario. For the τ + τ − decay mode, 15 Monte Carlo samples are generated with Higgs boson masses in the range of 90 GeV to 500 GeV and tan β = 20. These are scaled to the appropriate cross-sections for other tan β values. The simulated signal samples with m A closest to the computed mass of H and h are used for the H and h bosons. The increase in the Higgs boson natural width with tan β, of the order of 1 GeV in the range considered, is negligible compared to the experimental mass resolution in this channel, which is always above 10 GeV.
For the µ + µ − decay mode, seven samples are generated with Higgs boson masses in the range of 110 GeV to 300 GeV and tan β = 40. Additionally, to study the tan β dependence of the width of the resonance, signal samples are generated for both production modes for m A = 150 GeV and 250 GeV, each at tan β = 20 and tan β = 60. Since the mass resolution is better in this channel, signal distributions are obtained using an interpolation procedure for different intermediate m A -tan β values, as described in section 5.
The generated Monte Carlo samples for the h/A/H → τ + τ − decay modes are passed through the full GEANT4 [35,36] detector simulation, while the samples for the h/A/H → µ + µ − decay mode are passed through the full GEANT4 detector simulation or the "fast" simulation, ATLFAST-II [35], of the ATLAS detector.
Background processes: The production of W and Z/γ* bosons in association with jets is simulated using the ALPGEN [37] and PYTHIA [38] generators. PYTHIA is also used for the simulation of bb̄ production, but through an interface which ensures that the simulation is in agreement with b-quark production data [39,40]. The tt̄ production process is generated with MC@NLO [41] and POWHEG [42,43]. MC@NLO is used for the generation of electroweak diboson (WW, WZ, ZZ) samples. Single-top production through the s- and t-channels, and in association with W bosons, is generated using AcerMC [44]. For all event samples described above, parton showers and hadronisation are simulated with HERWIG [45] and the activity of the underlying event with JIMMY [46]. The loop-induced gg → WW processes are generated using gg2WW [47]. The following parton distribution function sets are used: CT10 [48] for MC@NLO, CTEQ6L1 [49] for ALPGEN and modified leading-order MRST2007 [50] for PYTHIA samples.
Decays of τ leptons are simulated using either SHERPA or TAUOLA [51]. Initial-state and final-state radiation of photons is simulated using either PHOTOS [52] or, for the samples generated with SHERPA, PHOTONS++, which is a part of SHERPA. The Z/γ* → τ+τ− background processes are modelled with a τ-embedded Z/γ* → µ+µ− data sample described in section 6. All generated Monte Carlo background samples are passed through the full GEANT4 simulation of the ATLAS detector.
The signal and background samples are reconstructed with the same software as used for data. To take into account the presence of multiple interactions occurring in the same and neighbouring bunch crossings (referred to as pile-up), simulated minimum bias events are added to the hard process in each generated event. Prior to the analysis, simulated events are re-weighted in order to match the distribution of the number of pile-up interactions per bunch crossing in the data.
A multivariate algorithm based on a neural network is used in this analysis to tag jets, reconstructed within |η| < 2.5, that are associated with the hadronisation of b-quarks. The neural network makes use of the impact parameter of associated tracks and the reconstruction of b- and c-hadron decay vertices inside the jet [58]. The b-jet identification has an efficiency of about 70% in tt̄ events, unless otherwise stated. The corresponding rejection factors are about 5 for jets containing charm hadrons and about 130 for light-quark or gluon jets.
Hadronic decays of τ leptons (τ_had) are characterised by the presence of one, three, or in rare cases, five charged hadrons accompanied by a neutrino and possibly neutral hadrons, resulting in a collimated shower profile in the calorimeters with only a few nearby tracks. The visible τ decay products are reconstructed in the same way as jets, but are calibrated separately to account for the different calorimeter response compared to jets. Information on the collimation, isolation, and shower profile is combined into a boosted-decision-tree discriminant to reject backgrounds from jets [59]. In this analysis, three selections are used ("loose", "medium" and "tight"), with identification efficiencies of about 60%, 45% and 35%, respectively. The rejection factor against jets varies from about 20 for the loose selection to about 300 for the tight selection. A τ_had candidate must lie within |η| < 2.5, have a transverse momentum greater than 20 GeV, one or three associated tracks (with p_T > 1 GeV), and a total charge of ±1. Dedicated electron and muon veto algorithms are applied to each τ_had candidate.
When different objects selected according to the above criteria overlap with each other geometrically (within ∆R = 0.2), only one of them is considered for further analysis. The overlap is resolved by selecting muon candidates, electron candidates, τ had candidates and jet candidates in this order of priority.
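The overlap removal described above can be sketched as follows. This is a minimal illustration, not the ATLAS implementation: the `Candidate` class, the function names and the choice that only already-kept objects of a different, higher-priority type can remove a candidate are assumptions; the ΔR = 0.2 cone and the priority order are taken from the text.

```python
import math
from dataclasses import dataclass

# Priority order stated in the text: muons, electrons, tau_had candidates, jets.
PRIORITY = {"muon": 0, "electron": 1, "tau_had": 2, "jet": 3}


@dataclass
class Candidate:
    kind: str   # one of the keys of PRIORITY (hypothetical labels)
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle in radians


def delta_r(a, b):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    d_phi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, d_phi)


def remove_overlaps(candidates, max_dr=0.2):
    """Keep a candidate unless a kept object of another, higher-priority kind is within max_dr."""
    kept = []
    for cand in sorted(candidates, key=lambda c: PRIORITY[c.kind]):
        overlaps = any(
            other.kind != cand.kind and delta_r(cand, other) < max_dr
            for other in kept
        )
        if not overlaps:
            kept.append(cand)
    return kept
```

Sorting by priority first means that a lower-priority object is only ever compared with objects that have already survived, which is one of several possible readings of the rule.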
The magnitude and direction of the missing transverse momentum, E_T^miss, are reconstructed including contributions from muons and energy deposits in the calorimeters [60]. Clusters of calorimeter-cell energy deposits belonging to jets, τ_had candidates, electrons, and photons, as well as cells that are not associated to any object, are treated separately in the E_T^miss calculation. The contributions of muons to E_T^miss are calculated differently for isolated and non-isolated muons, to account for the energy deposited by muons in the calorimeters.

5 The µ+µ− decay channel
Signal topology and event selection: The signature of the h/A/H → µ+µ− decay is a pair of isolated muons with high transverse momenta and opposite charge. In the b-associated production mode, the final state can be further characterised by the presence of one or two low-E_T b-jets. The missing transverse momentum is expected to be small and on the order of the resolution of the E_T^miss measurement. The µ+µ− decay channel search is complicated by a small branching ratio and considerable background rates.
Events considered in the µ+µ− analysis must pass a single-muon trigger with a transverse momentum threshold of 18 GeV. At least one reconstructed muon is required to be matched to the η-φ region of the trigger object and to have p_T > 20 GeV. At least one additional muon of opposite charge and with p_T > 15 GeV is required. A muon pair is formed using the two highest-p_T muons of opposite charge. This muon pair is required to have an invariant mass greater than 70 GeV. In addition, events are required to have E_T^miss < 40 GeV. All muons considered here must be isolated, as defined in section 4. The large background due to Z/γ* production can be reduced by requiring that the event contains at least one identified b-jet. Events satisfying this requirement are included in the b-tagged sample, whereas events failing it are included in the b-vetoed sample. The µ+µ− invariant mass distribution, m_µµ, is shown separately for the b-tagged and the b-vetoed samples in figure 1. For illustration purposes only, the distributions of simulated backgrounds and an assumed MSSM neutral Higgs boson signal with m_A = 150 GeV and tan β = 40 are shown in the same figure. A hypothetical signal would appear as narrow peaks on top of the high-mass tail of the Z boson, superimposed on a continuous contribution from non-resonant backgrounds such as tt̄. The Z/γ* process contributes to the total background with a relative fraction of about 99% (51%) for the b-vetoed (b-tagged) sample for events in the m_µµ range of 110 GeV to 300 GeV, which is most relevant to the Higgs boson searches in this channel. In the b-vetoed sample the remaining non-resonant background is composed of tt̄, W+W− and bb̄ events, while tt̄ events dominate the non-resonant background in the b-tagged sample.
Figure 1. Simulated backgrounds are shown for illustration purposes. The background uncertainties shown here are statistical in nature due to the finite number of simulated events. The contributions of the backgrounds Z/γ* → e+e−, W + jets and those of all diboson production processes but WW production are combined and labelled "Other electroweak".
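The µ+µ− event selection given at the start of this section can be sketched as a single classification function. The `Muon` class and function names are hypothetical, the muon mass is neglected in the invariant mass, and the reading that the trigger-matched muon with p_T > 20 GeV must be one of the two paired muons is an assumption; the thresholds are those quoted in the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Muon:
    pt: float              # transverse momentum in GeV
    eta: float             # pseudorapidity
    phi: float             # azimuthal angle in radians
    charge: int            # +1 or -1
    isolated: bool         # passes the isolation of section 4
    trigger_matched: bool  # matched to the single-muon trigger object


def dimuon_mass(a, b):
    """Invariant mass in GeV, neglecting the muon mass."""
    return math.sqrt(
        2.0 * a.pt * b.pt * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))
    )


def classify_dimuon_event(muons, met, n_bjets):
    """Return "b-tagged", "b-vetoed", or None if the event fails the selection."""
    good = sorted(
        (m for m in muons if m.isolated and m.pt > 15.0),
        key=lambda m: m.pt,
        reverse=True,
    )
    if not good:
        return None
    lead = good[0]
    partners = [m for m in good[1:] if m.charge == -lead.charge]
    if not partners:
        return None
    sub = partners[0]  # highest-pT muon with charge opposite to the leading one
    # Assumption: one muon of the pair is trigger-matched with pT > 20 GeV.
    if not any(m.trigger_matched and m.pt > 20.0 for m in (lead, sub)):
        return None
    if dimuon_mass(lead, sub) <= 70.0 or met >= 40.0:
        return None
    return "b-tagged" if n_bjets >= 1 else "b-vetoed"
```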
Background modelling: The background in the µ+µ− channel is estimated from data. By scanning over the µ+µ− invariant mass distribution, local sideband fits provide the expected background estimate in the mass region of interest. To this end, a parameterisation of the background shape is fitted to the µ+µ− invariant mass distribution. Search windows are defined around each of the neutral Higgs bosons and are excluded from the fit. This results in one or two windows due to the mass degeneracy among the Higgs bosons. The widths of the search windows are motivated by the expected signal width for each point in the scanned m_A-tan β grid and account for asymmetries in the signal invariant mass distribution. The upper and lower boundaries of the search windows are defined by the m_µµ values where the cross-section predictions of the signal model are 10% of their maximum. The lower and upper outer boundaries of the sidebands vary between 98-118 GeV and 160-400 GeV, respectively, depending on m_A and m_h.
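The 10%-of-maximum rule for the search windows can be illustrated on a sampled signal shape. The sketch below is an assumption-laden toy: the function name is hypothetical, the boundaries are reported at grid resolution without interpolation, and the two-Gaussian signal is a stand-in, not the signal model of this analysis.

```python
import math


def search_windows(masses, signal, fraction=0.10):
    """Contiguous mass ranges where the signal is at least `fraction` of its maximum."""
    threshold = fraction * max(signal)
    windows = []
    start = None   # lower edge of the window currently being built
    prev_x = None  # last grid point visited
    for x, s in zip(masses, signal):
        if s >= threshold and start is None:
            start = x
        elif s < threshold and start is not None:
            windows.append((start, prev_x))
            start = None
        prev_x = x
    if start is not None:
        windows.append((start, prev_x))
    return windows


def toy_signal(x):
    """Toy shape: two Gaussian peaks standing in for h and the degenerate A/H."""
    light = 0.5 * math.exp(-0.5 * ((x - 125.0) / 3.0) ** 2)
    heavy = 1.0 * math.exp(-0.5 * ((x - 150.0) / 4.0) ** 2)
    return light + heavy


masses = [100.0 + 0.5 * i for i in range(201)]  # 100 GeV to 200 GeV in 0.5 GeV steps
windows = search_windows(masses, [toy_signal(x) for x in masses])
```

With well-separated peaks the function returns two windows; when the peaks merge above the threshold it returns one, mirroring the "one or two windows" statement in the text.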
The parameterisation of the µ+µ− invariant mass distribution, f_B(x), is given by

$$ f_B(x) = N_B \cdot \left[ f_Z(x) \otimes F_G(x \mid 0, \sigma) \right], $$

where x is the invariant mass, N_B the overall normalisation, ⊗ the convolution operator and F_G(x | 0, σ) the Gaussian distribution with variable x, mean 0 and variance σ². The function f_Z describes the Z/γ* production as the sum of a γ* term, a Z-γ* interference term and a Z term; its convolution with the Gaussian distribution accounts for the finite mass resolution. The function f_Z is a simplification of the pure γ* and Z propagators, including Z-γ* interference contributing to the process qq̄ → Z/γ* → µ+µ−, and hence in principle only describes the background from Z/γ* production. The parameterisation f_B is found to be a good approximation of the shape of the total µ+µ− background even in the b-tagged sample, which has non-negligible contributions from physics processes other than Z/γ* production.
In total, the fit function, f B , has six parameters. The natural width of the Z boson, Γ Z , is fixed to Γ Z = 2.50 GeV, whereas the remaining parameters are unconstrained. Parameter N B describes the total normalisation of the curve and parameters A and B represent the relative normalisations of the γ * and Z-γ * contributions with respect to the Z term. Finally, m Z represents the mass of the Z boson and the parameter σ specifies the mean µ + µ − pair mass resolution.
For every point on the m A -tan β grid, a binned likelihood fit of f B to the data is performed to estimate the five unconstrained parameters and consequently the total background estimate.
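For illustration, the background shape can be evaluated numerically. The sketch below assumes a propagator-inspired form for f_Z, namely A/x² for the γ* term, B(x² − m_Z²)/D for the interference term and x²/D for the Z term, with D = (x² − m_Z²)² + m_Z²Γ_Z². This explicit form, the midpoint-rule convolution over ±5σ and all function names are assumptions of the sketch; only the parameter roles (N_B, A, B, m_Z, Γ_Z = 2.50 GeV, σ) come from the text.

```python
import math


def f_z(x, a, b, m_z, gamma_z=2.50):
    """Assumed Z/gamma* line shape: photon term + interference term + Z term."""
    denom = (x * x - m_z * m_z) ** 2 + (m_z * gamma_z) ** 2
    photon = a / (x * x)
    interference = b * (x * x - m_z * m_z) / denom
    z_term = x * x / denom
    return photon + interference + z_term


def gaussian(t, sigma):
    """Gaussian density with mean 0 and standard deviation sigma."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def f_b(x, n_b, a, b, m_z, sigma, gamma_z=2.50, n_steps=400):
    """N_B times the convolution of f_z with the Gaussian (midpoint rule, +-5 sigma)."""
    half_range = 5.0 * sigma
    dt = 2.0 * half_range / n_steps
    total = 0.0
    for i in range(n_steps):
        t = -half_range + (i + 0.5) * dt
        total += f_z(x - t, a, b, m_z, gamma_z) * gaussian(t, sigma) * dt
    return n_b * total
```

In a fit, `gamma_z` would stay fixed while the remaining five parameters float, matching the parameter count given above.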
The background model is validated with χ 2 -based goodness-of-fit studies. In addition, the background model is extended by polynomials of different orders to test if additional degrees of freedom change the goodness of the fit, which would hint at problems in the shape modelling. Further validation of the capability of the model to describe the shape of the data is performed by varying the fit ranges for certain mass points and accounting for the fit residuals. The goodness-of-fit studies confirm a good background modelling for both the b-vetoed and the b-tagged sample. The uncertainty on the background estimate is obtained from a variation of the fitted background function within its 68% confidence level (CL) uncertainty band. This results in an uncertainty of 5% (2%) on the expected background yield for the b-tagged (b-vetoed) sample.
Signal modelling: The h/A/H → µ + µ − signal is expected to appear as narrow peaks in the µ + µ − invariant mass distribution, as depicted in figure 1. The resolution in the relevant mass range is typically 2.5% to 3%, and numerous mass points are needed for a complete mass scan. In addition, the influence of tan β on the reconstructed width of the signal invariant mass distribution needs to be taken into account. The natural widths of the MSSM neutral Higgs bosons increase with tan β. The reconstructed width can be sensitive to this variation because of the good experimental mass resolution.
To interpolate between the different signal samples obtained from a limited number of simulated signal masses, the signal µ+µ− invariant mass distribution is parameterised with

$$ f_S(x) = N_S \cdot \left[ F_{BW}(x \mid m, \Gamma) \otimes F_G(x \mid 0, \sigma) + c \cdot F_L(x \mid m, \varsigma) \right], $$

where x represents the µ+µ− invariant mass. The parameterisation consists of a Breit-Wigner function, F_BW, describing the signal peak, convolved with a Gaussian distribution, F_G, accounting for the finite mass resolution, and a Landau function, F_L, with a low-mass tail, which models the asymmetric part of the signal invariant mass distribution. The function f_S is characterised by six parameters. The width of the Breit-Wigner function, Γ, is fixed to the theoretical predictions calculated with FeynHiggs [31]. The remaining five parameters are unconstrained. The overall normalisation parameter is N_S and c specifies the relative normalisation of the Landau function with respect to the Breit-Wigner function. Parameter m specifies the mean of the Breit-Wigner and the Landau distributions, σ determines the width of the Gaussian distribution and ς represents the scale parameter of the Landau function.
The function f_S is fitted to each signal sample available from simulation. The signal model is validated with Kolmogorov-Smirnov and χ²-based goodness-of-fit tests, which indicate a good description of the simulated signal µ+µ− invariant mass distributions. Each fit results in a set of fitted parameters, (N_S, m, σ, c, ς), depending on the point in the m_A-tan β plane. The dependence of this set on m_A and tan β is parameterised with polynomials of different orders. The resulting polynomials provide a set of parameters which, in addition to the predicted natural width, Γ, fully define the normalised probability density function for an arbitrary point in the m_A-tan β plane.
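The polynomial parameterisation of a fitted parameter can be sketched in one dimension. The example below fits a least-squares polynomial to the Gaussian width σ as a function of mass only; the mass points, the toy σ values (2.75% of the mass, chosen inside the quoted 2.5% to 3% resolution range), the polynomial degree and the helper names are all assumptions, and the tan β dependence is left out.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... from the normal equations."""
    n = degree + 1
    # Augmented normal-equation matrix [X^T X | X^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - known) / rows[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate c[0] + c[1]*x + c[2]*x^2 + ..."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Hypothetical mass points in GeV and toy fitted widths (not values from the analysis).
masses = [110.0, 130.0, 150.0, 170.0, 200.0, 250.0, 300.0]
sigmas = [0.0275 * m for m in masses]

# Masses are rescaled to units of 100 GeV to keep the normal equations well conditioned.
coeffs = polyfit([m / 100.0 for m in masses], sigmas, degree=1)
sigma_160 = polyval(coeffs, 1.6)  # interpolated width at a 160 GeV mass point
```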
This procedure is used to generate invariant mass distributions for signal masses from 120 GeV to 150 GeV in 5 GeV steps and from 150 GeV to 300 GeV in 10 GeV steps, as well as for tan β values from 5 to 70 in steps of 3 or 5. Higgs boson masses below 120 GeV were not considered because the background model does not provide precise estimates in the mass region close to the Z boson peak. For both the b-tagged and the b-vetoed samples the interpolated and normalised probability density functions are obtained separately for the Higgs boson production from gluon-fusion and in association with b-quarks. As for the background, the uncertainty on the signal prediction from the fit is obtained from its 68% CL uncertainty band. The resulting uncertainty is estimated to be 10% to 20% of the signal event yield.
Results: Figure 2 compares the data with the background estimate predicted from sideband fits in both the b-tagged and b-vetoed samples for the signal mass point m_A = 150 GeV and tan β = 40. The data fluctuate around the background prediction, leading to local bin-by-bin significances that are typically less than 2σ. Table 1 shows the number of observed events in the fit range around the mass point m_A = 150 GeV compared to the number of background events predicted by the sideband fits. The observed numbers of events are compatible with the expected yield from Standard Model processes within the uncertainties.

Table 1. Signal mass point m_A = 150 GeV, tan β = 40. Data: 985 events (b-tagged sample), 36044 events (b-vetoed sample).

6 The τ+τ− decay channel
The h/A/H → τ+τ− decay mode is analysed in several categories according to the τ lepton decay final-state combinations. The four decay modes considered here are: τ_e τ_µ (6%), τ_e τ_had (23%), τ_µ τ_had (23%) and τ_had τ_had (42%), where percentages in the parentheses denote the corresponding branching ratios. The combination of τ_e τ_had and τ_µ τ_had is referred to as τ_lep τ_had.

τ-embedded Z/γ* → µ+µ− data: Z/γ* → τ+τ− events form a largely irreducible background to the Higgs boson signal in all final states. It is not possible to select a Z/γ* → τ+τ− control sample which is Higgs boson signal-free. However, Z/γ* → µ+µ− events can be selected in data with high purity and without significant signal contamination. Furthermore, the event topology and kinematics are, apart from the τ lepton decays and the different masses of τ leptons and muons, identical to those of Z/γ* → τ+τ− events. Therefore Z/γ* → µ+µ− events are selected in data and modified using a τ-embedding technique, in which muons are replaced by simulated τ leptons.
The hits of the muon tracks and the associated calorimeter cells in a cone with radius parameter ∆R = 0.1 around the muon direction are removed from the data event and replaced by the detector response from a simulated Z/γ * → τ + τ − event with the same kinematics. The event reconstruction is performed on the resulting hybrid event. Only the τ decays and their detector response are taken from the simulation, whereas the underlying event kinematics and the associated jets are taken from the data event. The procedure treats consistently the effect of τ polarisation and spin correlations. The event yield of the embedded sample after the selection of the τ decay products is normalised to the corresponding event yield obtained in a simulated Z/γ * → τ + τ − sample. This procedure has been validated as described in references [7, 61]. Systematic uncertainties on the normalisation and shape of the embedded sample are derived by propagating variations of the Z/γ * → µ + µ − event selection and the muon energy subtraction procedure through the τ -embedding process. Additional uncertainties are assigned due to the use of the Z/γ * → τ + τ − cross-section and Monte Carlo acceptance prediction in determining the τ -embedded Z/γ * → µ + µ − sample normalisation. These theoretical uncertainties are described in section 7.
Jets misidentified as hadronic τ decays: A fraction of jets originating from quarks or gluons are misidentified as τ had candidates. It has been shown in reference [62] that this misidentification fraction is higher in simulated samples than in data. To account for this difference, the Monte Carlo background estimate is corrected based on control samples. Details are presented for each decay channel separately.
The ABCD background estimation method: The estimation of the background from multi-jet processes is done from data using the ABCD method for all τ+τ− channels. Two uncorrelated variables are chosen to define four data regions, named A, B, C and D, such that one variable separates A and B from C and D, while the other separates A and C from B and D. The signal region is labelled A, and the other regions are dominated by background from multi-jet processes. An estimate of the background from these processes in the signal region, n_A, is:

$$ n_A = n_B \times \frac{n_C}{n_D} \equiv n_B \times r_{C/D}, $$

where n_B, n_C and n_D denote the populations of regions B, C and D, respectively. The populations of the B, C and D regions may need to be corrected by subtracting the estimated number of events that come from processes other than multi-jet production. This estimate is generally obtained from simulation.
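As a small worked example of the method, the sketch below assumes the usual ABCD relation n_A = n_B · n_C / n_D, consistent with the ratio r_C/D used later in the text, and subtracts optional non-multi-jet contributions from each control region first. The function name, argument names and the event counts in the usage lines are made up for illustration.

```python
def abcd_estimate(n_b, n_c, n_d, other_b=0.0, other_c=0.0, other_d=0.0):
    """Multi-jet estimate in region A from control regions B, C and D.

    The other_* arguments are the estimated non-multi-jet yields (for
    example from simulation) to subtract from each control region.
    """
    b = n_b - other_b
    c = n_c - other_c
    d = n_d - other_d
    if d <= 0.0:
        raise ValueError("region D must have a positive corrected yield")
    return b * c / d


# Toy numbers, not taken from the analysis.
plain = abcd_estimate(120.0, 300.0, 400.0)
corrected = abcd_estimate(120.0, 300.0, 400.0, other_b=20.0, other_c=50.0)
```

The estimate is only meaningful if the two defining variables are uncorrelated for the multi-jet background, which is why the stability of the ratio under varied requirements serves as a systematic check.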
τ + τ − mass reconstruction: The invariant mass of the τ + τ − pair cannot be reconstructed directly due to the presence of neutrinos from the τ lepton decays. Therefore, a technique known as the Missing Mass Calculator (MMC) is used to reconstruct the Higgs boson candidate mass [63]. This algorithm assumes that the missing transverse momentum is due entirely to the neutrinos, and performs a scan over the angles between the neutrinos and the visible τ lepton decay products. For leptonic τ decays, the scan also includes the invariant mass of the two neutrinos. At each point, the τ + τ − invariant mass is calculated, and the most likely value is chosen by weighting each solution according to probability density functions that are derived from simulated τ lepton decays. This method provides a 13% to 20% resolution in the invariant mass, with an efficiency of 99% for the scan to find a solution.
6.2 The h/A/H → τ_e τ_µ decay channel
Signal topology and event selection: Events in this channel must satisfy either a single-electron, single-muon or combined electron-muon trigger. The single-lepton triggers have p_T thresholds of 20 GeV or 22 GeV for electrons, depending on the run period, and 18 GeV for muons, while the combined trigger has a threshold of 10 GeV for the electron and 6 GeV for the muon. Exactly one isolated electron and one isolated muon of opposite electric charge are required, and the lepton pair must have an invariant mass exceeding 30 GeV. The p_T thresholds are 15 GeV for electrons and 10 GeV for muons in cases where the event is selected by the combined electron-muon trigger. These thresholds are raised to 24 GeV for electrons and 20 GeV for muons in cases where the event is selected by a single-lepton trigger only.
The event sample is then split according to its jet flavour content. Events containing exactly one identified b-jet are included in the b-tagged sample. This is based on a b-jet identification criterion with 75% efficiency in tt̄ events. Events without an identified b-jet are included in the b-vetoed sample. The scalar sum of the lepton transverse momenta and missing transverse momentum is required to fulfil the following condition to reduce top quark and diboson backgrounds: E_T^miss + p_T,e + p_T,µ < 125 GeV (< 150 GeV) for the b-tagged (b-vetoed) sample. In order to further suppress these backgrounds and W → ℓν events, the opening angle between the two lepton candidates in the transverse plane must satisfy the condition ∆φ_eµ > 2.0 (> 1.6) for the b-tagged (b-vetoed) sample. In addition, the combination of the transverse opening angles between the lepton directions and the direction of E_T^miss is required to satisfy the condition Σ_{ℓ=e,µ} cos ∆φ(E_T^miss, ℓ) > −0.2 (> −0.4) for the b-tagged (b-vetoed) sample. Finally, the scalar sum of the transverse energies of all jets, H_T, is restricted to be below 100 GeV in the b-tagged sample to further suppress backgrounds containing a higher multiplicity of jets, or jets with higher transverse momenta, than expected from the signal processes. Jets with |η| < 4.5 and E_T > 20 GeV are used to calculate the value of H_T.
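The topological requirements above can be collected into one function. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, jets are passed as (E_T, η) pairs, and trigger, lepton identification and b-jet counting are assumed to have been applied beforehand. The numerical thresholds are those quoted in the text.

```python
import math

# Thresholds from the text; None marks a requirement that is not applied.
CUTS = {
    "b-tagged": {"sum_max": 125.0, "dphi_min": 2.0, "sum_cos_min": -0.2, "ht_max": 100.0},
    "b-vetoed": {"sum_max": 150.0, "dphi_min": 1.6, "sum_cos_min": -0.4, "ht_max": None},
}


def abs_delta_phi(phi1, phi2):
    """Absolute azimuthal separation wrapped into [0, pi]."""
    return abs(math.remainder(phi1 - phi2, 2.0 * math.pi))


def passes_emu_topology(sample, e_pt, e_phi, mu_pt, mu_phi, met, met_phi, jets):
    """Apply the e-mu topological cuts for the "b-tagged" or "b-vetoed" sample."""
    cuts = CUTS[sample]
    # Scalar sum of missing transverse momentum and lepton transverse momenta.
    if met + e_pt + mu_pt >= cuts["sum_max"]:
        return False
    # Transverse opening angle between the two leptons.
    if abs_delta_phi(e_phi, mu_phi) <= cuts["dphi_min"]:
        return False
    # Sum over both leptons of cos(delta phi) with respect to the missing momentum.
    sum_cos = math.cos(met_phi - e_phi) + math.cos(met_phi - mu_phi)
    if sum_cos <= cuts["sum_cos_min"]:
        return False
    # H_T from jets with E_T > 20 GeV and |eta| < 4.5 (b-tagged sample only).
    if cuts["ht_max"] is not None:
        h_t = sum(et for et, eta in jets if et > 20.0 and abs(eta) < 4.5)
        if h_t >= cuts["ht_max"]:
            return False
    return True
```

Keeping the thresholds in a table keyed by sample makes the parallel structure of the two selections explicit and avoids duplicating the cut logic.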
Estimation of the Z/γ * → τ + τ − background: The Z/γ * → τ + τ − background is estimated by using the τ -embedded Z/γ * → µ + µ − event sample outlined in section 6.1. The use of multiple triggers with different p T thresholds has an effect on the lepton transverse momentum spectra, which is accurately reproduced by the trigger simulation. However, in the τ -embedded Z/γ * → µ + µ − data there is no simulation of the trigger response for the decay products of the τ leptons. This has an impact on the MMC mass distribution in the τ -embedded Z/γ * → µ + µ − data, which is comparable to the statistical uncertainty in the b-vetoed sample, and negligible in the b-tagged sample. For this reason the trigger selection is emulated for the b-vetoed sample such that the trigger effect is adequately described. This emulation is based on the p T -dependent trigger efficiencies obtained from data.
Estimation of the tt background: The contribution of tt production is extrapolated from control regions which have purities of 90% (b-tagged sample) and 96% (b-vetoed sample). The selection criteria for these control regions are identical to the respective signal regions with two exceptions: at least two identified b-jets are required, and the selection H T < 100 GeV is not applied. The multi-jet contributions to these control regions are estimated from data with the ABCD method; the other non-tt contributions are taken from simulation. The uncertainty on the normalisation obtained in this manner is 15% (30%) in the b-tagged (b-vetoed) sample, primarily due to uncertainties on the b-tagging efficiency and jet energy scale.
Estimation of the multi-jet background: The multi-jet background is estimated using the ABCD method, by splitting the event sample into four regions according to the charge product of the eµ pair and the isolation requirements on the electron and muon. These requirements are summarised in table 2.
The systematic uncertainty of this method has been estimated by considering the stability of the ratio r C/D in regions where the isolation requirements are varied, or where only the muon is required to fail the isolation. The resulting uncertainty on the normalisation is 14% (23%) in the b-tagged (b-vetoed) sample.

| Region | Charge correlation | Lepton isolation requirement |
|---|---|---|
| A (Signal Region) | Opposite sign | isolated |
| B | Same sign | isolated |
| C | Opposite sign | anti-isolated |
| D | Same sign | anti-isolated |

Table 2. Control regions for the estimation of the multi-jet background for the h/A/H → τ e τ µ and h/A/H → τ lep τ had samples: events are categorised according to the charge product of the two τ leptons and the lepton isolation requirement. In the h/A/H → τ lep τ had channel isolation refers to the isolation of the electron or muon and in the h/A/H → τ e τ µ channel both the electron and muon are required to be isolated or anti-isolated, respectively.
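As an illustration of the ABCD extrapolation, the sketch below scales the same-sign isolated yield (region B) by the opposite-sign to same-sign ratio measured with anti-isolated leptons, r C/D = n C / n D . The function name, its arguments and the optional subtraction of simulated non-multi-jet yields in region B are assumptions made for this example, and the numbers in the usage note are purely illustrative.

```python
def abcd_estimate(n_b, n_c, n_d, n_b_other=0.0):
    """Multi-jet yield predicted in signal region A.

    n_b, n_c, n_d : event counts observed in control regions B, C and D
    n_b_other     : simulated non-multi-jet yield in region B, subtracted
                    before the extrapolation (assumption of this sketch)
    """
    if n_d <= 0:
        raise ValueError("region D must contain events to form r_C/D")
    r_cd = n_c / n_d  # opposite-sign / same-sign ratio for anti-isolated leptons
    return (n_b - n_b_other) * r_cd
```

For example, `abcd_estimate(120, 900, 600, n_b_other=20)` gives 150.0 events. The method relies on the charge product and the isolation being uncorrelated for multi-jet events, which is why the stability of r C/D under varied isolation requirements is used to assign the systematic uncertainty.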
Results: The number of observed τ e τ µ events in data, along with predicted event yields from background processes, is shown in table 3. The observed event yield is compatible with the expected event yield from Standard Model processes within the uncertainties. The MMC mass distributions for these events are shown in figure 3.

Table 3. The number of events observed in data and the expected number of signal and background events of the h/A/H → τ e τ µ channel. Simulated event yields are normalised to the integrated luminosity of the data sample, 4.7 fb −1 . The predicted signal event yields correspond to a parameter choice of m A = 150 GeV and tan β = 20 and include both the b-associated and the gluon-fusion production processes.

The h/A/H → τ lep τ had decay channel
Signal topology and event selection: Events in the h/A/H → τ lep τ had channel are selected using a single-lepton trigger with transverse momentum thresholds of 20 GeV or 22 GeV for electrons, depending on the run period, and 18 GeV for muons. Each event must contain one isolated electron with p T > 25 GeV or one isolated muon with p T > 20 GeV. Events containing additional electrons or muons with transverse momenta greater than 15 GeV or 10 GeV, respectively, are rejected in order to obtain an orthogonal selection to those used in the τ e τ µ and µ + µ − channels. One τ had with a charge of opposite sign to the selected electron or muon is required. The τ had identification criterion in use is the one with medium efficiency as introduced in section 4. The transverse mass of the lepton and the missing transverse momentum, m T , is required to be less than 30 GeV, to reduce contamination from W + jets and tt background processes. Here the transverse mass is defined as

$$ m_T = \sqrt{2\, p_T^{\mathrm{lep}}\, E_T^{\mathrm{miss}} \left(1 - \cos\Delta\phi\right)}, $$

where p lep T denotes the transverse momentum of the electron or muon and ∆φ the angle between p lep T and E miss T .

After imposing these selection criteria, the resulting event sample is split into two categories depending on whether or not the highest-E T jet with |η| < 2.5 is identified as a b-jet. Events are included in the b-tagged sample if the highest-E T jet is identified as a b-jet and its E T is in the range of 20 GeV to 50 GeV. Events are included in the b-vetoed sample if the highest-E T jet fails the b-jet identification criterion and the event has E miss T > 20 GeV.
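The transverse-mass requirement can be sketched with the standard expression m T = sqrt(2 p T lep E T miss (1 − cos ∆φ)), which matches the quantities named in the text. The function and argument names below are chosen for this example only.

```python
import math


def transverse_mass(pt_lep, met, dphi):
    """Transverse mass (GeV) of the lepton and the missing transverse momentum.

    pt_lep : lepton transverse momentum in GeV
    met    : magnitude of the missing transverse momentum in GeV
    dphi   : azimuthal angle between the lepton and E_T^miss in radians
    """
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(dphi)))


def passes_mt_cut(pt_lep, met, dphi, mt_max=30.0):
    """Signal-region requirement m_T < 30 GeV quoted in the text."""
    return transverse_mass(pt_lep, met, dphi) < mt_max
```

A lepton and missing transverse momentum of 40 GeV each, back to back in azimuth, give m T = 80 GeV. Such an event fails the signal selection and would instead fall into the 70 GeV < m T < 110 GeV window used for the W + jets control region.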
Estimation of the W + jets background: W +jets events that pass the event selection criteria up to the m T requirement consist primarily of events in which the selected lepton originates from the W decay and a jet is misidentified as a τ had . To ensure a proper estimation of the jet-to-τ had misidentification rate, the W + jets background normalisation is corrected using control regions with high purity in W + jets events defined by requiring high transverse mass: 70 GeV < m T < 110 GeV. Separate control regions are used for the τ e τ had and τ µ τ had samples, as the kinematic selections are different. The correction factors derived from these control regions are f e W = 0.587 ± 0.009 for the electron channel and f µ W = 0.541 ± 0.008 for the muon channel, where the quoted uncertainties are statistical. The relative systematic uncertainty is estimated to be 5% by varying the m T boundary definition of the control region. The correction factors have been derived separately for the b-tagged and b-vetoed samples; the numbers are in agreement between the two cases, but for the b-tagged sample the statistical uncertainty is 17%. This statistical uncertainty is considered as an additional systematic uncertainty in the b-tagged sample analysis.
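A normalisation correction of this kind is commonly obtained as the ratio of the data yield, after subtracting the other simulated backgrounds, to the simulated yield of the process of interest in the control region. The sketch below assumes that definition; the text does not spell out the formula, and the function name and the input numbers in the usage note are illustrative only.

```python
import math


def control_region_scale_factor(n_data, n_other_mc, n_target_mc):
    """Data-to-simulation correction factor measured in a control region.

    n_data      : events observed in the control region
    n_other_mc  : simulated yield of all processes except the target one
    n_target_mc : simulated yield of the target process (e.g. W + jets)

    Returns (factor, statistical uncertainty). Only the Poisson uncertainty
    of the data count is propagated; this is a simplification of the sketch.
    """
    if n_target_mc <= 0:
        raise ValueError("simulated target yield must be positive")
    factor = (n_data - n_other_mc) / n_target_mc
    stat_unc = math.sqrt(n_data) / n_target_mc
    return factor, stat_unc
```

With the illustrative inputs `control_region_scale_factor(6000, 130, 10000)` the factor is 0.587 with a statistical uncertainty of about 0.008. The resulting factor multiplies the simulated yield in the signal region.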
Estimation of the Z/γ * → τ + τ − /e + e − /µ + µ − background: The Z/γ * → τ + τ − background is estimated using the τ -embedded Z/γ * → µ + µ − sample outlined in section 6.1. The jet activity in the embedded events is independent of the Z boson decay mode. Taking advantage of this feature, the embedding sample is also used to validate the simulated Z/γ * → e + e − and Z/γ * → µ + µ − background samples for the correct b-jet fraction, which may affect the background estimation after imposing the b-tag requirement. Correction factors are derived by comparing τ -embedded Z/γ * → µ + µ − events with simulated Z/γ * → τ + τ − events before and after the b-tagged sample selection. The correction factors are calculated to be f e Zb = 1.08 ± 0.23 and f µ Zb = 1.11 ± 0.13 for the electron and muon channels, respectively, where the quoted uncertainties are statistical. The effect on these correction factors from the tt contribution in the control region is studied by removing the 50 GeV maximum E T requirement on the b-jet. A 7% systematic uncertainty is obtained. These factors are applied to the simulated Z/γ * → e + e − and Z/γ * → µ + µ − background samples passing the b-tagged sample selection.
Estimation of the tt background: The simulated tt samples are normalised from data using a top-enriched control region. This control region is defined by applying the τ lep τ had selection criteria up to the requirement of an eτ had or µτ had pair in the event, with no requirement on the transverse mass. The highest-E T jet in the event must be identified as a b-jet, with E T in the range 50 GeV to 150 GeV, and a second highest-E T jet must satisfy the same b-jet identification requirement. This results in a control region with a purity of tt events over 90% and negligible signal contribution. The tt correction factor is derived in a manner similar to that of the W + jets correction factor, and a value of f tt = 0.88 ± 0.04 (stat.) ± 0.14 (syst.) is obtained with the systematic uncertainty due primarily to the b-jet identification efficiency.

Estimation of the multi-jet background: For the multi-jet background estimation, the ABCD method is used by defining four regions according to whether the charges of the τ jet and the lepton have opposite sign or same sign, and whether the selected lepton passes or fails the isolation criteria. These requirements are summarised in table 2. In regions C and D the contribution from processes other than the multi-jet background is negligible, while in region B there is a significant contribution from other backgrounds, in particular Z/γ * +jets and W + jets, which is subtracted from the data sample using estimates from simulation. The systematic uncertainty on the predicted event yield is estimated by varying the definitions of the regions used, and by testing the stability of the r C/D ratio across the m MMC τ τ range. The resulting uncertainty is 7.5% in the τ µ τ had channel and 15% in the τ e τ had channel.
Results: The number of observed τ lep τ had events in data, along with predicted event yields from background processes, is shown in table 4. The observed event yields are compatible with the expected yields from Standard Model processes within the uncertainties. The MMC mass distributions for these events, with τ µ τ had and τ e τ had statistically combined, are shown in figure 4.

The h/A/H → τ had τ had decay channel
Signal topology and event selection: Events in this channel are selected by a di-τ had trigger with transverse momentum thresholds of 29 GeV and 20 GeV for the two τ had candidates. Events containing identified electrons or muons with transverse momenta above 15 GeV or 10 GeV, respectively, are vetoed. These vetoes suppress background events and ensure that the channels are statistically independent. Two τ had candidates with opposite-sign charges are required, one passing the tight τ had identification requirements and the second passing the medium criteria. These two leading τ had candidates are required to match the reconstructed τ had trigger objects each within a cone with radius parameter ∆R < 0.2. These two τ had candidates are required to have transverse momenta above 45 GeV and 30 GeV, respectively. These values are chosen such that the plateau of the trigger turn-on curve is reached and the electroweak and multi-jet backgrounds are suppressed effectively. The missing transverse momentum is required to be above 25 GeV to account for the presence of neutrinos originating from the τ decays and to suppress multi-jet background. The selected events are split into a b-tagged sample and a b-vetoed sample to exploit the two dominant production mechanisms for neutral Higgs bosons in the MSSM. Events in which the leading jet is identified as a b-jet are included in the b-tagged sample. The transverse momentum of this jet is restricted to the range of 20 GeV to 50 GeV to reduce the tt background. Events without jets, or in which the leading jet is not identified as a b-jet, are included in the b-vetoed sample. Due to the higher background levels in this sample, the threshold on the transverse momentum of the leading τ had candidate is raised to 60 GeV.
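The kinematic thresholds and the split into b-tagged and b-vetoed samples described above can be written as a short classification function. This is a sketch under simplifying assumptions: trigger matching, the lepton vetoes, the charge requirement and the τ had identification are taken to have been applied already, and the argument names and the dictionary layout of the leading jet are invented for the example.

```python
def hadhad_category(tau_pts, met, leading_jet=None):
    """Classify a di-tau_had event as 'b-tagged', 'b-vetoed' or None (rejected).

    tau_pts     : (leading, sub-leading) tau_had transverse momenta in GeV
    met         : missing transverse momentum in GeV
    leading_jet : None for events without jets, otherwise a dict with keys
                  'pt' (GeV) and 'is_b' (bool) for the leading jet
    """
    lead_pt, sublead_pt = tau_pts
    # Common requirements: tau_had pT above 45 and 30 GeV, E_T^miss above 25 GeV.
    if lead_pt <= 45.0 or sublead_pt <= 30.0 or met <= 25.0:
        return None
    if leading_jet is not None and leading_jet["is_b"]:
        # b-tagged sample: leading b-jet pT restricted to 20-50 GeV.
        return "b-tagged" if 20.0 <= leading_jet["pt"] <= 50.0 else None
    # b-vetoed sample: leading tau_had threshold raised to 60 GeV.
    return "b-vetoed" if lead_pt > 60.0 else None
```

An event whose leading jet is a b-jet above 50 GeV enters neither sample in this sketch, since it fails the b-tagged jet requirement and does not satisfy the b-vetoed definition.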
Identification efficiency and misidentification corrections for hadronic τ decays: The τ had identification efficiencies, the τ had trigger efficiencies and the corresponding misidentification probabilities are corrected for differences observed between data and simulation. For the di-τ had trigger it is assumed that these identification and misidentification efficiencies can be factorised into the efficiencies of the corresponding single-τ had triggers with appropriate transverse momentum requirements. This factorisation is validated using a simulated event sample. The single-τ had trigger efficiency for real hadronically decaying τ leptons with respect to the offline τ had selection was measured using a tag-and-probe analysis with Z → τ µ τ had data. A correction factor for the simulation was derived as a function of the transverse momentum of each of the two τ had candidates, and each event was weighted by the product of these factors. The probability to misidentify a jet as a τ had is extracted for both the trigger and the τ had identification algorithm by analysing jets in a high-purity W (→ µν)+jets sample. A correction factor derived on the basis of these probabilities is applied to the simulation when a jet is misidentified as a τ had . The statistical and systematic uncertainty of these correction factors leads to an uncertainty of 21% on the W + jets background and 4% to 6% on the tt background.
Estimation of the Z/γ * → τ + τ − and W (→ τ ν)+jets backgrounds: The estimates of the Z/γ * → τ + τ − and W (→ τ ν)+jets backgrounds are taken from simulation and are validated using τ -embedded Z/γ * → µ + µ − and W (→ µν)+jets samples. The τ -embedded Z/γ * → µ + µ − data are used to validate the simulation, rather than to provide the main estimate for the Z/γ * → τ + τ − background, in the h/A/H → τ had τ had channel. The di-τ had trigger is not modelled in the τ -embedded Z/γ * → τ + τ − data, making it difficult to apply the embedding technique. Correction factors for the efficiency of the b-jet identification requirement on the leading jet are derived in a way equivalent to that described in section 6.3. For the Z/γ * +jets background a factor of f Zb = 1.24 ± 0.34 (stat.) is derived by comparing the simulated and embedded Z/γ * → τ + τ − samples. With the embedded W (→ τ ν)+jets sample, no such correction factor may be derived in this way as the contamination from tt events is quite significant once the b-jet identification requirements are applied. Instead, the procedure in section 6.3 is applied with the selection of this channel. A correction factor of f W b = 1.00 ± 0.31 (stat.) is derived for W (→ τ ν)+jets events. The correction factors for the b-vetoed sample are found to be close to unity and uncertainties are negligible, hence no correction is applied.

| Region | Charge correlation | Hadronic τ decay identification requirement |
|---|---|---|
| A (Signal Region) | Opposite sign | Pass |
| B | Same sign | Pass |
| C | Opposite sign | Fail |
| D | Same sign | Fail |

Table 5. Control regions for the estimation of the multi-jet background for the h/A/H → τ had τ had selection: Events are categorised according to the product of the electric charges of the two τ had candidates and the τ had identification. "Pass" refers to one τ had candidate passing the tight and the other τ had candidate passing the medium identification selection. "Fail" refers to all events in which the two τ had pass the loose identification selection but do not satisfy the selection of the "Pass" category.
Estimation of the multi-jet background: The multi-jet background is estimated using the ABCD method, by splitting the event sample into four regions based on the charge product of the two leading τ had candidates and whether the nominal τ had identification requirements of these two τ had candidates are met. These variables can be assumed to be uncorrelated for multi-jet events. Table 5 illustrates the definition of the four regions. The shape of the MMC mass distribution for the multi-jet background in the signal region is taken from region C for the b-vetoed sample and from region B for the b-tagged sample. Contributions from electroweak backgrounds are subtracted from each control region using simulation. The uncertainties on these backgrounds lead to uncertainties on the multi-jet estimate of 7% for the b-tagged sample and 5% for the b-vetoed sample.
Results: The number of observed τ had τ had events in data, along with predicted event yields from background processes, is shown in table 6. The observed event yields are compatible with the expected yields from Standard Model processes within the uncertainties. The MMC mass distributions for these events are shown in figure 5.

Systematic uncertainties
Data-driven background estimation: Where possible, event yields and mass distributions for the background are estimated using control samples in data. The specific techniques and their associated uncertainties have been presented in the relevant sections. The effect of these uncertainties on the predicted background event yield is less than 5% for the µ + µ − channels and less than 15% for the τ + τ − channels, and is usually small compared to the systematic uncertainties from the simulated samples.
Cross-section for signal and background samples: The uncertainties on the signal cross-sections are estimated to be 10% to 20%, depending on the values of m A and tan β, for both gluon-fusion and b-associated Higgs boson production [32]. An uncertainty of 5% is assumed on the cross-sections for W and Z boson background production [64, 65]. Uncertainties due to the parton distribution functions and the renormalisation and factorisation scales are included in these estimates.
Acceptance modelling for simulated samples: The uncertainty on the acceptance from the parameters used in the event generation of signal and background samples is also considered. This is done by evaluating the change in kinematic acceptance after varying the relevant scale parameters, parton distribution function choices, and if applicable, conditions for the matching of the partons used in the fixed order calculation and the parton shower. Furthermore, the effects of different tunes of the underlying event activity are considered. The resulting uncertainties are typically 2% to 20%, depending on the sample and the channel considered.
Electron and muon identification and trigger: The uncertainties on electron or muon trigger and identification efficiencies are determined from data using samples of W and Z decays [53,54]. The trigger efficiency uncertainties are below about 1%. The identification efficiency uncertainties are between 3% and 6% for electrons and below 1.8% for muons. The total effect of these uncertainties on the event yield is no greater than 4% in any channel.
Hadronic τ identification and trigger: The uncertainties related to hadronic τ trigger and identification efficiencies are also studied with data [59]. The di-τ had trigger efficiency used in the τ had τ had channel has an uncertainty of 2% to 7%. The identification efficiency uncertainty for hadronic τ decays is about 4% for reconstructed τ had p T above 22 GeV and 8% otherwise. These uncertainties are most important in the τ had τ had channel, where the effect on the estimated signal yield reaches 11%.
b-jet identification: The b-jet identification efficiency and the misidentification probabilities for jets other than b-jets have been measured in data [58,66]. The associated uncertainties are treated separately for all jet flavours and depend on jet E T and η. Typical uncertainty values are around 5% for b-jets and between 20% and 30% for other jets, leading to a total uncertainty close to 5% on the event yield in the b-tagged channels.
Energy scale and resolution: The uncertainty on the acceptance due to the energy measurement uncertainty in the calorimeter is considered for each identified object corresponding to the clusters in the calorimeters. For the clusters identified as electrons, typically a 1% (3%) energy scale uncertainty is assigned for the barrel (end-cap) region [53]. The energy scale uncertainties for clusters identified as hadronic τ decays and jets are treated as being fully correlated, and are typically around 3% [57,59]. The uncertainty in the muon energy scale is below 1%. The acceptance uncertainty due to the jet energy resolution, which affects the p T thresholds used to define the b-tagged and b-vetoed samples, is typically less than 1%. The systematic uncertainty due to the energy scales of electrons, muons, hadronic τ decays and jets is propagated to the E miss T vector. Additional uncertainties due to different pile-up conditions in data and simulation are also considered.
The uncertainty on the acceptance due to the energy scale and resolution variations reaches up to 37% for signal in the τ had τ had b-tagged channel, but is usually less than 10% for channels with fewer τ had .
Luminosity: The simulated sample event yield is normalised to the integrated luminosity of the data, which is measured [67, 68] to be 4.7 fb −1 and 4.8 fb −1 for the τ + τ − and µ + µ − channels, respectively, and has an uncertainty of 3.9%. This is applicable to all signal and background processes which are not normalised using data-driven methods.

Statistical analysis
The statistical analysis of the data employs a binned likelihood function. Each one of the µµ, τ e τ µ , τ e τ had , τ µ τ had and τ had τ had final states is split into a b-tagged and a b-vetoed sample. The likelihood in each category is a product over bins in the distributions of the MMC mass in the signal and control regions.
The expected number of events for signal (s j ) and background (b j ), as well as the observed number of events (N j ) in each bin of the mass distributions, enter in the definition of the likelihood function L(µ, θ). A "signal strength" parameter (µ) scales the expected signal in each bin. The value µ = 0 corresponds to the background-only hypothesis, while µ = 1 corresponds to the signal-plus-background hypothesis with all Higgs bosons having the masses and cross-sections specified by the point considered in the m A -tan β plane for the MSSM exclusion limit. Signal and background predictions depend on systematic uncertainties that are parameterised by nuisance parameters, θ, which in turn are constrained using Gaussian functions, F G , so that

$$ L(\mu, \theta) = \prod_{j\,=\,\text{bin and category}} F_P\left(N_j \mid \mu \cdot s_j + b_j\right) \prod_{\theta_i} F_G\left(\theta_i \mid 0, 1\right), \qquad (8.1) $$

where F P (N j | µ · s j + b j ) denotes the Poisson distribution with mean µ · s j + b j for variable N j . The correlations of the systematic uncertainties across categories are taken into account. The expected signal and background event counts in each bin are functions of θ. The parameterisation is chosen such that the rates in each channel are log-normally distributed for a normally distributed θ.
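To make the structure of the likelihood concrete, the sketch below evaluates the negative log-likelihood for a single category with one nuisance parameter. It is a toy illustration, not the fit used in the analysis: the choice to let the nuisance parameter act only on the background rate, the response (1 + rel_unc) ** theta used to obtain a log-normal rate, and all names are assumptions of this example.

```python
import math


def poisson_log_pmf(n, mean):
    """Natural logarithm of the Poisson probability of observing n given the mean."""
    return n * math.log(mean) - mean - math.lgamma(n + 1)


def negative_log_likelihood(mu, theta, observed, signal, background, rel_unc):
    """-ln L(mu, theta) for one category and one nuisance parameter.

    observed, signal, background : per-bin N_j, s_j and b_j
    rel_unc : relative background uncertainty; the background rate is scaled
              by (1 + rel_unc) ** theta, which is log-normally distributed
              when theta follows a unit Gaussian
    """
    nll = 0.0
    for n_j, s_j, b_j in zip(observed, signal, background):
        expected = mu * s_j + b_j * (1.0 + rel_unc) ** theta
        nll -= poisson_log_pmf(n_j, expected)
    # Unit Gaussian constraint term: -ln F_G(theta | 0, 1).
    nll += 0.5 * theta * theta + 0.5 * math.log(2.0 * math.pi)
    return nll
```

Minimising this function over both arguments gives the global maximum of the likelihood, while minimising over `theta` at fixed `mu` gives the conditional maximum that enters a profile likelihood ratio.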
To calculate the upper limit on µ for a given signal hypothesis, the compatibility of the observed or expected dataset with the signal-plus-background prediction is checked following the modified frequentist method known as CL s [69]. The test statistic q̃ µ , used in the upper limit derivation, is defined as

$$ \tilde{q}_\mu = \begin{cases} -2 \ln \dfrac{L(\mu, \hat{\theta}_\mu)}{L(0, \hat{\theta}_0)} & \text{if } \hat{\mu} < 0, \\ -2 \ln \dfrac{L(\mu, \hat{\theta}_\mu)}{L(\hat{\mu}, \hat{\theta})} & \text{if } 0 \le \hat{\mu} \le \mu, \\ 0 & \text{if } \hat{\mu} > \mu, \end{cases} \qquad (8.2) $$

where µ̂ and θ̂ refer to the global maximum of the likelihood and θ̂ µ corresponds to the conditional maximum likelihood for a given µ. The asymptotic approximation [70] is used to evaluate the probability density functions rather than performing pseudo-experiments; the procedure has been validated using ensemble tests.

The significance of an excess in data is quantified with the local p 0 -value, the probability that the background processes can produce a fluctuation greater than or equal to the excess observed in data. The test statistic q 0 , which is used in the local p 0 -value calculation, is defined as

$$ q_0 = \begin{cases} -2 \ln \dfrac{L(0, \hat{\theta}_0)}{L(\hat{\mu}, \hat{\theta})} & \text{if } \hat{\mu} \ge 0, \\ 0 & \text{if } \hat{\mu} < 0, \end{cases} \qquad (8.3) $$

where the notation is the same as in Equation 8.2. The equivalent formulation in terms of the number of standard deviations (σ), Z 0 , is referred to as the local significance and is defined as

$$ Z_0 = \Phi^{-1}(1 - p_0), \qquad (8.4) $$

where Φ −1 is the inverse of the cumulative distribution of the Gaussian distribution. The local p 0 -value is estimated using the asymptotic approximation for the q 0 distribution [70].
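The conversion between a local p 0 -value and the local significance uses only the standard normal distribution, Z 0 = Φ −1 (1 − p 0 ). The sketch below uses Python's standard library for this and adds the asymptotic relation p 0 = 1 − Φ(√q 0 ) for the discovery test statistic; the function names are chosen for this example.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def local_significance(p0):
    """Local significance Z_0 in standard deviations for a local p_0-value."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p0)


def p0_from_q0(q0):
    """Asymptotic local p_0-value for an observed value of the statistic q_0."""
    return 1.0 - _STANDARD_NORMAL.cdf(math.sqrt(q0))
```

Applied to p 0 -values such as 0.014 and 0.004, `local_significance` returns values that round to 2.2 and 2.7 standard deviations, respectively.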

Results
No significant excess of events above the background-only expectation is observed in the considered channels in 4.7 fb −1 to 4.8 fb −1 of √ s = 7 TeV proton-proton collision data. A 95% CL upper limit on tan β is set for each m A point using the frequentist method described in section 8. This is done using Higgs boson cross-sections calculated in the m max h scenario with µ > 0 [11]. Results for each of the µµ, τ e τ µ , τ lep τ had and τ had τ had final states, as well as for their statistical combination, can be seen in figure 6. The tightest constraint is at m A = 130 GeV, where values of tan β > 9.3 are excluded. The expected exclusion for the same point is tan β > 10.3. The exclusion of parameter space is significantly increased in comparison to earlier results by the ATLAS Collaboration [7] and complementary to the excluded parameter space from searches at LEP [13]. A significant portion of the MSSM parameter space that is not excluded is still compatible with the assumption that the newly discovered particle at the LHC is one of the neutral CP-even MSSM Higgs bosons [22,23]. The lowest local p 0 -values per channel are 0.014 (2.2σ) at m A = 125 GeV for the µµ channel, 0.014 (2.2σ) at m A = 90 GeV for the τ e τ µ channel, 0.067 (1.5σ) at m A = 90 GeV for the τ lep τ had channel and 0.097 (1.3σ) at m A = 140 GeV for the τ had τ had channel. The lowest local p 0 for the statistical combination of all channels is 0.004 (2.7σ) at m A = 90 GeV. The significance of this excess is below 2σ, after considering the look-elsewhere effect in the range 90 GeV ≤ m A ≤ 500 GeV and 5 ≤ tan β ≤ 60 [71].
The outcome of the search is further interpreted in the generic case of a single scalar boson φ produced in either the gluon-fusion or b-associated production mode and decaying to µ + µ − or τ + τ − . Figure 7 shows 95% CL limits based on this interpretation. The exclusion limits for the production cross-section times the branching ratio for a Higgs boson decaying to µ + µ − or τ + τ − are shown as a function of the Higgs boson mass.

Figure 6. The 95% CL limits for the expected limit (dashed lines) and the observed limit (continuous lines) for each of the µµ, τ e τ µ , τ lep τ had and τ had τ had channels and their statistical combination are shown on the right plot. The 95% CL exclusion region from neutral MSSM Higgs boson searches performed at LEP [13] is shown in a hatched style.

Figure 7. Expected (dashed line) and observed (solid line) 95% CL limits on the cross-section for gluon-fusion and b-associated Higgs boson production times the branching ratio into τ and µ pairs, respectively, along with the ±1 σ and ±2 σ bands for the expected limit. The combinations of all τ τ and µµ channels are shown. The difference in the exclusion limits obtained for the gluon-fusion and the b-associated production modes is due to the different sensitivity from the b-tagged samples.

Summary
A search is presented for the neutral Higgs bosons of the Minimal Supersymmetric Standard Model in proton-proton collisions at a centre-of-mass energy of 7 TeV with the ATLAS experiment at the LHC. A significant portion of the available MSSM parameter space is consistent with the assumption that the newly discovered particle at the LHC is one of the neutral CP-even MSSM Higgs bosons. The study is based on a data sample that corresponds to an integrated luminosity of 4.7 fb −1 to 4.8 fb −1 . The decay modes of the Higgs bosons considered are h/A/H → µ + µ − , h/A/H → τ e τ µ , h/A/H → τ lep τ had and h/A/H → τ had τ had . The analysis selection criteria exploit the two main production mechanisms in the MSSM, the gluon-fusion and b-associated production modes, by introducing categories for event samples with and without an identified b-jet. Since no excess of events over the expected background is observed in the considered channels, 95% CL limits are set in the m A -tan β plane, excluding a significant fraction of the MSSM parameter space.

Acknowledgments
We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently.