Measurement of the top quark mass with the template method in the tt → lepton + jets channel using ATLAS data

The top quark mass has been measured using the template method in the tt → lepton + jets channel based on data recorded in 2011 with the ATLAS detector at the LHC. The data were taken at a proton-proton centre-of-mass energy of √s = 7 TeV and correspond to an integrated luminosity of 1.04 fb^(−1). The analyses in the e + jets and μ + jets decay channels yield consistent results. The top quark mass is measured to be m_top = 174.5 ± 0.6 (stat) ± 2.3 (syst) GeV.


Introduction
The top quark mass (m_top) is a fundamental parameter of the Standard Model (SM) of particle physics. Due to its large mass, the top quark gives large contributions to electroweak radiative corrections. Together with precision electroweak measurements, the top quark mass can be used to derive constraints on the masses of the as yet unobserved Higgs boson [1,2], and of heavy particles predicted by extensions of the SM. Since the discovery of the top quark in 1995, much work has been devoted to the precise measurement of its mass. The present average value of m_top = 173.2 ± 0.6 (stat) ± 0.8 (syst) GeV [3] is obtained from measurements at the Tevatron performed by CDF and DØ with Run I and Run II data corresponding to integrated luminosities of up to 5.8 fb^−1. At the LHC, m_top has been measured by CMS in tt events in which both W bosons from the top quark decays themselves decay into a charged lepton and a neutrino [4].
CERN, 1211 Geneva 23, Switzerland, E-mail: atlas.publications@cern.ch
The main methodology used to determine m_top at hadron colliders consists of measuring the invariant mass of the decay products of the top quark candidates and deducing m_top using sophisticated analysis methods. The most precise measurements of this type use the tt → lepton+jets channel, i.e. the decay tt → ℓν b_ℓ q_1 q_2 b_had with ℓ = e, µ, where one of the W bosons from the tt decay decays into a charged lepton and a neutrino and the other into a pair of quarks, and where b_ℓ (b_had) denotes the b-quark associated to the leptonic (hadronic) W boson decay. In this paper these tt decay channels are referred to as e+jets and µ+jets channels.
In the template method, simulated distributions are constructed for a chosen quantity sensitive to the physics observable under study, using a number of discrete values of that observable. These templates are fitted to functions that interpolate between different input values of the physics observable, fixing all other parameters of the functions. In the final step a likelihood fit to the observed data distribution is used to obtain the value for the physics observable that best describes the data. In this procedure, the experimental distributions are constructed such that they are unbiased estimators of the physics observable used as an input parameter in the signal Monte Carlo samples. Consequently, the top quark mass determined this way from data corresponds to the mass definition used in the Monte Carlo. It is expected [5] that the difference between this mass definition and the pole mass is of order 1 GeV.
The precision of the measurement of m top is limited mainly by the systematic uncertainty from a few sources. In this paper two different estimators for m top are developed, which have only a small statistical correlation and use different strategies to reduce the impact of these sources on the final uncertainty. This choice translates into different sensitivities to the uncertainty sources for the two estimators. The first implementation of the template method is a one-dimensional template analysis (1d-analysis), which is based on the observable R 32 , defined as the per event ratio of the reconstructed invariant masses of the top quark and the W boson reconstructed from three and two jets respectively. For each event, an event likelihood is used to select the jet triplet assigned to the hadronic decays of the top quark and the W boson amongst the jets present in the event. The second implementation is a two-dimensional template analysis (2d-analysis), which simultaneously determines m top and a global jet energy scale factor (JSF) from the reconstructed invariant masses of the top quark and the W boson. This method utilises a χ 2 fit that constrains the reconstructed invariant mass of the W boson candidate to the world-average W boson mass measurement [6].
The paper is organised as follows: details of the ATLAS detector are given in Section 2, the data and Monte Carlo simulation samples are described in Section 3. The common part of the event selections is given in Section 4, followed by analysis-specific requirements detailed in Section 5. The specific details of the two analyses are explained in Section 6 and Section 7. The measurement of m top is given in Section 8, where the evaluation of the systematic uncertainties is discussed in Section 8.1, and the individual results and their combination are reported in Section 8.2. Finally, the summary and conclusions are given in Section 9.

The ATLAS detector
The ATLAS detector [7] at the LHC covers nearly the entire solid angle around the collision point^1. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and an external muon spectrometer incorporating three large superconducting toroid magnet assemblies.
The inner-detector system is immersed in a 2 T axial magnetic field and provides charged particle tracking in the range |η| < 2.5. The high-granularity silicon pixel detector covers the vertex region and provides typically three measurements per track, followed by the silicon microstrip tracker which provides four measurements from eight strip layers. These silicon detectors are complemented by the transition radiation tracker, which enables extended track reconstruction up to |η| = 2.0. In giving typically more than 30 straw-tube measurements per track, the transition radiation tracker improves the inner detector momentum resolution, and also provides electron identification information.
^1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upward. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle θ as η = −ln tan(θ/2). Transverse momentum and energy are defined as p_T = p sin θ and E_T = E sin θ, respectively.
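As a minimal illustration of the coordinate conventions quoted for ATLAS (pseudorapidity from the polar angle, and transverse quantities from p sin θ), the definitions can be written in Python as follows. The function names are illustrative only and are not part of any ATLAS software.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with the polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def transverse(magnitude, theta):
    """p_T = p sin(theta); the same expression gives E_T from E."""
    return magnitude * math.sin(theta)

# Polar angle that corresponds to the edge of the tracking acceptance, |eta| = 2.5,
# obtained by inverting the definition: theta = 2 atan(exp(-eta)).
theta_edge = 2.0 * math.atan(math.exp(-2.5))
print(math.degrees(theta_edge), pseudorapidity(theta_edge))
```

Inverting the definition in this way shows how a pseudorapidity limit translates into a polar angle with respect to the beam axis.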
The calorimeter system covers the pseudorapidity range |η| < 4.9. Within the region |η| < 3.2, electromagnetic calorimetry is provided by barrel and end cap lead/liquid argon (LAr) electromagnetic calorimeters, with an additional thin LAr presampler covering |η| < 1.8 to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillating-tile calorimeter, segmented into three barrel structures within |η| < 1.7, and two copper/LAr hadronic endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements respectively.
The muon spectrometer comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field with a bending integral up to 8 Tm in the central region, generated by three superconducting air-core toroids. The precision chamber system covers the region |η| < 2.7 with three layers of monitored drift tubes, complemented by cathode strip chambers in the forward region. The muon trigger system covers the range |η| < 2.4 with resistive plate chambers in the barrel, and thin gap chambers in the endcap regions.
A three-level trigger system is used. The first level trigger is implemented in hardware and uses a subset of detector information to reduce the event rate to a design value of at most 75 kHz. This is followed by two software-based trigger levels, which together reduce the event rate to about 300 Hz.

Data and Monte Carlo samples
In this paper, data from LHC proton-proton collisions are used, collected at a centre-of-mass energy of √ s = 7 TeV with the ATLAS detector during March-June 2011. An integrated luminosity of 1.04 fb −1 is included.
Simulated tt events and single top quark production are both generated using the Next-to-Leading Order (NLO) Monte Carlo program MC@NLO [8,9] with the NLO parton density function set CTEQ6.6 [10]. Parton showering and underlying event (i.e. additional interactions of the partons within the protons that underwent the hard interaction) are modelled using the Herwig [11] and Jimmy [12] programs. For the construction of signal templates, the tt and single top quark production samples are generated for different assumptions on m top using six values (in GeV) namely (160, 170, 172.5, 175, 180, 190), and with the largest samples at m top = 172.5 GeV. All tt samples are normalised to the corresponding cross-sections, obtained with the latest theoretical computation approximating the NNLO prediction and implemented in the HATHOR package [13]. The predicted tt cross-section for a top quark mass of m top = 172.5 GeV is 164.6 pb, with an uncertainty of about 8%.
The production of W bosons or Z bosons in association with jets is simulated using the Alpgen generator [14] interfaced to the Herwig and Jimmy packages. Diboson production processes (W W , W Z and ZZ) are produced using the Herwig generator. All Monte Carlo samples are generated with additional multiple soft proton-proton interactions. These simulated events are re-weighted such that the distribution of the number of interactions per bunch crossing (pileup) in the simulated samples matches that in the data. The mean number of primary vertices per bunch crossing for the data of this analysis is about four. The samples are then processed through the GEANT4 [15] simulation [16] and the reconstruction software of the ATLAS detector.
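The pileup re-weighting described above can be sketched as a per-bin ratio of normalised distributions. This is a simplified illustration under the assumption that both distributions are available as histograms with identical binning; the histogram contents below are hypothetical and the function name is not taken from the ATLAS software.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weight = (fraction of data in the bin) / (fraction of simulation in the bin)."""
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        if m == 0:
            weights.append(0.0)  # no simulated events available to re-weight in this bin
        else:
            weights.append((d / data_total) / (m / mc_total))
    return weights

# Hypothetical histograms of the number of interactions per bunch crossing.
data_counts = [5, 20, 40, 25, 10]
mc_counts = [10, 30, 30, 20, 10]

weights = pileup_weights(data_counts, mc_counts)
reweighted_mc = [w * m for w, m in zip(weights, mc_counts)]
```

After applying the weights, the shape of the simulated distribution agrees with that of the data by construction.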

Event selection
In the signal events the main reconstructed objects in the detector are electron and muon candidates as well as jets and missing transverse momentum (E_T^miss). An electron candidate is defined as an energy deposit in the electromagnetic calorimeter with an associated well-reconstructed track. Electron candidates are required to have transverse energy E_T > 25 GeV and |η_cluster| < 2.47, where η_cluster is the pseudorapidity of the electromagnetic cluster associated with the electron. Candidates in the transition region between the barrel and end-cap calorimeter, i.e. candidates fulfilling 1.37 < |η_cluster| < 1.52, are excluded. Muon candidates are reconstructed from track segments in different layers of the muon chambers. These segments are combined starting from the outermost layer, with a procedure that takes material effects into account, and matched with tracks found in the inner detector. The final candidates are refitted using the complete track information, and are required to satisfy p_T > 20 GeV and |η| < 2.5. Isolation criteria, which restrict the amount of energy deposits near the candidates, are applied to both electron and muon candidates to reduce the background from hadrons mimicking lepton signatures and backgrounds from heavy flavour decays inside jets. For electrons, the energy not associated to the electron cluster and contained in a cone of ∆R = √(∆φ² + ∆η²) = 0.2 must not exceed 3.5 GeV, after correcting for energy deposits from pileup, which are of the order of 0.5 GeV. For muons, the sum of track transverse momenta and the total energy deposited in a cone of ∆R = 0.3 around the muon are both required to be less than 4 GeV.
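The cone variable ∆R and the electron isolation requirement can be illustrated with a short Python sketch. The wrapping of ∆φ into (−π, π] is a standard convention assumed here, and the function names and the way the pileup correction is passed in are illustrative assumptions rather than the actual ATLAS implementation.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(delta_phi^2 + delta_eta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def electron_is_isolated(cone_energy_gev, pileup_correction_gev, max_gev=3.5):
    """Energy in the Delta R = 0.2 cone (electron cluster excluded), after pileup correction."""
    return cone_energy_gev - pileup_correction_gev <= max_gev

# Two directions on either side of the phi = +-pi boundary are close in Delta R.
print(delta_r(0.0, 3.1, 0.0, -3.1))
```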
Jets are reconstructed with the anti-k_t algorithm [17] with R = 0.4, starting from energy clusters of adjacent calorimeter cells called topological clusters [18]. These jets are calibrated first by correcting the jet energy using the scale established for electromagnetic objects (EM scale) and then performing a further correction to the hadronic energy scale using correction factors that depend on energy and η, obtained from simulation and validated with data [19]. Jet quality criteria [20] are applied to identify and reject jets reconstructed from energy deposits in the calorimeters that do not originate from particles emerging from the bunch crossing under study. The jets failing the quality criteria, which may have been reconstructed from various sources such as calorimeter noise, non-collision beam-related background, and cosmic-ray induced showers, can efficiently be identified [20].
The reconstruction of E miss T is based upon the vector sum of calorimeter energy deposits projected onto the transverse plane. It is reconstructed from topological clusters, calibrated at the EM scale and corrected according to the energy scale of the associated physics object. Contributions from muons are included by using their momentum measured from the track and muon spectrometer systems in the E miss T reconstruction. Muons reconstructed within a ∆R = 0.4 cone of a jet satisfying p T > 20 GeV are removed to reduce the contamination caused by muons from hadron decays within jets. Subsequently, jets within ∆R = 0.2 of an electron candidate are removed to avoid double counting, which can occur because electron clusters are usually also reconstructed as jets.
Reconstruction of top quark pair events is facilitated by the ability to tag jets originating from the hadronisation of b-quarks. For this purpose, a neural-net-based algorithm [21], relying on vertex properties such as the decay length significance, is applied. The chosen working point of the algorithm corresponds to a b-tagging efficiency of 70% for jets originating from b-quarks in simulated tt events and a light quark jet rejection factor of about 100. Irrespective of their origin, jets tagged by this algorithm are called b-jets in the following, whereas those not tagged are called light jets.
The signal is characterised by an isolated lepton with relatively high p_T, E_T^miss arising from the neutrino from the leptonic W boson decay, two b-quark jets, and two light quark jets from the hadronic W boson decay. The selection of events consists of a series of requirements on general event quality and the reconstructed objects designed to select the event topology described above. The following event selections are applied: it is required that the appropriate single electron or single muon trigger has fired (with thresholds at 20 GeV and 18 GeV, respectively); the event must contain one and only one reconstructed lepton with E_T > 25 GeV for electrons and p_T > 20 GeV for muons which, for the e+jets channel, should also match the corresponding trigger object; in the µ+jets channel, E_T^miss > 20 GeV and in addition E_T^miss + m_T^W > 60 GeV is required^2; in the e+jets channel more stringent cuts on E_T^miss and m_T^W are required because of the higher level of QCD multijet background, these being E_T^miss > 35 GeV and m_T^W > 25 GeV; the event is required to have ≥ 4 jets with p_T > 25 GeV and |η| < 2.5. It is required that at least one of these jets is a b-jet.
Table 1: The observed numbers of events in the data in the e+jets and µ+jets channels, for the two analyses after the common event selection and additional analysis-specific requirements. In addition, the expected numbers of signal and background events corresponding to the integrated luminosity of the data are given, where the single top quark production events are treated as signal for the 1d-analysis, and as background for the 2d-analysis. The Monte Carlo estimates assume SM cross-sections. The W+jets and QCD multijet background contributions are estimated from ATLAS data. The uncertainties for the estimates include different components detailed in the text. All predicted event numbers are quoted using one significant digit for the uncertainties, i.e. the trailing zeros are insignificant.
This common event selection is augmented by additional analysis-specific event requirements described next.

Specific event requirements
To optimise the expected total uncertainty on m top , some specific requirements are used in addition to the common event selection.
^2 Here m_T^W is the W-boson transverse mass, defined as m_T^W = √(2 p_T,ℓ p_T,ν [1 − cos(φ_ℓ − φ_ν)]), where the measured E_T^miss vector provides the neutrino (ν) information.
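The W-boson transverse mass and the channel-dependent requirements on E_T^miss and m_T^W from the common event selection can be sketched as follows. The thresholds are those quoted in the text; the function names and the channel labels "e" and "mu" are illustrative assumptions.

```python
import math

def w_transverse_mass(pt_lep, phi_lep, met, phi_met):
    """m_T^W = sqrt(2 pT_lep pT_nu [1 - cos(phi_lep - phi_nu)]), with E_T^miss used for the neutrino."""
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(phi_lep - phi_met)))

def passes_w_requirements(channel, pt_lep, phi_lep, met, phi_met):
    """E_T^miss and m_T^W requirements (GeV) of the common selection."""
    mtw = w_transverse_mass(pt_lep, phi_lep, met, phi_met)
    if channel == "mu":
        return met > 20.0 and met + mtw > 60.0
    if channel == "e":
        return met > 35.0 and mtw > 25.0
    raise ValueError("channel must be 'e' or 'mu'")

# A lepton and a neutrino back to back in the transverse plane maximise m_T^W.
print(w_transverse_mass(40.0, 0.0, 40.0, math.pi))
```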
For the 1d-analysis, three additional requirements are applied. Firstly, only events with a converging likelihood fit (see Section 6) with a logarithm of the likelihood value ln L > −50 are retained. Secondly, all jets in the jet triplet assigned to the hadronic decay of the top quark are required to fulfill p_T > 40 GeV, and thirdly the reconstructed W boson mass must lie within the range 60 GeV to 100 GeV.
For the 2d-analysis the additional requirement is that only light jet pairs (see Section 7) with an invariant mass in the range 50 GeV to 110 GeV are considered for the χ² fit.
The numbers of events observed and expected, with the above selection and these additional analysis-specific requirements, are given in Table 1 for both channels and both analyses. For all Monte Carlo estimates, the uncertainties are the quadratic sum of the statistical uncertainty, the uncertainty on the b-tagging efficiencies, and a 3.7% uncertainty on the luminosity [22,23]. For the QCD multijet and the W +jets backgrounds, the systematic uncertainty estimated from data [24] dominates and is used instead.
For both analyses and channels, the observed distributions for the leptons, jets, and kinematic properties of the top quark candidates such as their transverse momenta, are all well-described by the sum of the signal and background estimates. This is demonstrated for the properties of the selected jets, before applying the analysis specific requirements, for both channels in Figure 1. The largest difference between the central values of the combined prediction and the data is observed for the rapidity distribution, with the data being higher, especially at central rapidities. Based on the selected events, the top quark mass is measured in two ways as described below.

The 1d-analysis
The 1d-analysis is a one-dimensional template analysis using the reconstructed mass ratio:
$$
R_{32} \equiv \frac{m_{\mathrm{top}}^{\mathrm{reco}}}{m_{W}^{\mathrm{reco}}} .
$$
Here m_top^reco and m_W^reco are the per event reconstructed invariant masses of the hadronically decaying top quark and W boson, respectively.
To select the jet triplet for determining the two masses, this analysis utilises a kinematic fit maximising an event likelihood. This likelihood relates the observed objects to the tt decay products (quarks and leptons) predicted by the NLO signal Monte Carlo, albeit in a Leading Order (LO) kinematic approach, using tt → ℓν b_ℓ q_1 q_2 b_had. In this procedure, the measured jets relate to the quark decay products of the W boson, q_1 and q_2, and to the b-quarks, b_ℓ and b_had, produced in the top quark decays. The E_T^miss vector is identified with the transverse momentum components of the neutrino, p̂_x,ν and p̂_y,ν.
The likelihood is defined as a product of transfer functions (T), Breit-Wigner (B) distributions, and a weight W_btag accounting for the b-tagging information:
$$
\begin{aligned}
L ={}& T(E_{\mathrm{jet}_1} \mid \hat{E}_{b_{\mathrm{had}}}) \cdot T(E_{\mathrm{jet}_2} \mid \hat{E}_{b_\ell}) \cdot T(E_{\mathrm{jet}_3} \mid \hat{E}_{q_1}) \cdot T(E_{\mathrm{jet}_4} \mid \hat{E}_{q_2}) \\
& \cdot T(E_x^{\mathrm{miss}} \mid \hat{p}_{x,\nu}) \cdot T(E_y^{\mathrm{miss}} \mid \hat{p}_{y,\nu}) \cdot
\begin{cases} T(E_e \mid \hat{E}_e) & e\text{+jets} \\ T(p_{T,\mu} \mid \hat{p}_{T,\mu}) & \mu\text{+jets} \end{cases} \\
& \cdot B[m(q_1 q_2) \mid m_W, \Gamma_W] \cdot B[m(\ell\nu) \mid m_W, \Gamma_W] \\
& \cdot B[m(q_1 q_2 b_{\mathrm{had}}) \mid m_{\mathrm{top}}^{\mathrm{reco,like}}, \Gamma_{\mathrm{top}}] \cdot B[m(\ell\nu b_\ell) \mid m_{\mathrm{top}}^{\mathrm{reco,like}}, \Gamma_{\mathrm{top}}] \cdot W_{\mathrm{btag}} .
\end{aligned}
$$
The generator predicted quantities are marked with a circumflex (e.g. Ê_b_had, the energy of the b-quark from the hadronic decay of the top quark). The quantities m_W and Γ_W (which amounts to about one fifth of the Gaussian resolution of the m_W^reco distribution) are taken from Ref. [6]. The transfer functions are derived from the signal Monte Carlo sample at an input mass of m_top = 172.5 GeV, based on reconstructed objects that are matched to their generator predicted quarks and leptons. When using a maximum separation of ∆R = 0.4 between a quark and the corresponding jet, the fraction of events with four matched jets from all selected events amounts to 30% to 40%. The transfer functions are obtained in three bins of η for the energies of b-quark jets, E_jet1 and E_jet2, light quark jets, E_jet3 and E_jet4, the energy, E_e, (or transverse momentum, p_T,µ) of the charged lepton, and the two components of the E_T^miss, E_x^miss and E_y^miss. In addition, the likelihood exploits the values of m_W and Γ_W to constrain the reconstructed leptonic, m(ℓν), and hadronic, m(q_1 q_2), W boson masses using Breit-Wigner distributions. Similarly, the reconstructed leptonic, m(ℓν b_ℓ), and hadronic, m(q_1 q_2 b_had), top quark masses are constrained to be identical, where the width of the corresponding Breit-Wigner distribution is identified with the predicted Γ_top (using its top quark mass dependence) [6]. Including the b-tagging information into the likelihood as a weight W_btag, derived from the efficiency and mistag rate of the b-tagging algorithm, and assigned per jet permutation according to the role of each jet for a given jet permutation, improves the selection of the correct jet permutation. As an example, for a permutation with two b-jets assigned to the b-quark positions and two light jets to the light quark positions, the weight W_btag amounts to 0.48, i.e. it corresponds to the square of the b-tagging efficiency times the square of one minus the fake rate, both given in Section 4.
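The quoted value of W_btag can be checked numerically from the working point given in Section 4 (efficiency of 70%, light quark jet rejection of about 100, i.e. a fake rate of about 1%). The sketch below reproduces the example from the text; the treatment of the other tag configurations (a factor of one minus the efficiency for an untagged jet at a b-quark position, and the fake rate for a tagged jet at a light quark position) is an assumption made for illustration and is not spelled out in the text.

```python
B_TAG_EFFICIENCY = 0.70          # from Section 4
LIGHT_JET_REJECTION = 100.0      # from Section 4 ("about 100")
FAKE_RATE = 1.0 / LIGHT_JET_REJECTION

def btag_weight(b_positions_tagged, light_positions_tagged):
    """Weight of one jet permutation.

    Each argument is a list of booleans stating whether the jet placed at a
    b-quark position (or light-quark position) is b-tagged.
    """
    weight = 1.0
    for tagged in b_positions_tagged:
        weight *= B_TAG_EFFICIENCY if tagged else 1.0 - B_TAG_EFFICIENCY
    for tagged in light_positions_tagged:
        weight *= FAKE_RATE if tagged else 1.0 - FAKE_RATE
    return weight

# Example from the text: two b-jets at the b-quark positions, two light jets elsewhere.
example = btag_weight([True, True], [False, False])
print(round(example, 2))
```

With these inputs the product is 0.70² × 0.99², which rounds to the 0.48 quoted above.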
With this procedure, the correct jet triplet for the hadronic top quark is chosen in about 70% of simulated signal events with four matched jets. However, if R_32 from the likelihood fit, i.e. calculated from m_top^reco,like and m_W^reco,like, is taken, a large residual jet energy scale (JES) dependence of R_32 remains. This is because in the fit m_W^reco is constrained to m_W, while m_top^reco is only constrained to be equal for the leptonic and hadronic decays of the top quarks. This spoils the desired event-by-event reduction of the JES uncertainty in the ratio R_32 [25]. To make best use of the high selection efficiency for the correct jet permutation from the likelihood fit, and the stabilisation of R_32 against JES variations, the jet permutation derived in the fit is used, but m_W^reco, m_top^reco and therefore R_32, are constructed from the unconstrained four-vectors of the jet triplet as given by the jet reconstruction.
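The construction of R_32 from the unconstrained jet four-vectors, and its stability against a common rescaling of all jet energies, can be illustrated with a small Python sketch. The four-vectors below are hypothetical massless jets in the form (E, p_x, p_y, p_z) in GeV, and a single global scale factor is a simplification of a realistic JES variation.

```python
import math

def invariant_mass(*four_vectors):
    """Invariant mass of the summed four-vectors (E, px, py, pz)."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def r32(b_jet, light_jet_1, light_jet_2):
    """Ratio of the three-jet (top) to the two-jet (W) invariant mass."""
    return invariant_mass(b_jet, light_jet_1, light_jet_2) / invariant_mass(light_jet_1, light_jet_2)

def scale(jet, factor):
    return tuple(factor * component for component in jet)

# Hypothetical jet triplet.
b_jet = (60.0, 0.0, 0.0, 60.0)
q1 = (50.0, 50.0, 0.0, 0.0)
q2 = (50.0, -30.0, 40.0, 0.0)

nominal = r32(b_jet, q1, q2)
shifted = r32(scale(b_jet, 1.03), scale(q1, 1.03), scale(q2, 1.03))
print(nominal, shifted)
```

A common scale factor changes both masses by the same factor and therefore cancels in the ratio, which is the motivation for using R_32 as the estimator.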
The performance of the algorithm, shown in Figure 2 for the e+jets channel, is similar for both channels. The likelihood values of wrong jet permutations for signal events from the large MC@NLO sample are frequently considerably lower than the ones for the correct jet permutations, as seen in Figure 2(a). For example, the distribution for the jet permutation in which the jet from the b-quark from the leptonically decaying top quark is exchanged with one light quark jet from the hadronic W boson decay has a second peak at about ten units lower than the one for the correct jet permutation. The actual distribution of ln L values observed in the data is well-described by the signal plus background predictions, as seen in Figure 2(b). The kinematic distributions of the variables used in the transfer functions are also well-described by the predictions, as shown in Figure 2(c), for the example of the resulting p_T of the b-jet associated to the hadronic decay of the top quark. The resulting R_32 distributions for both channels are shown in Figure 3. They are also well accounted for by the predictions.
Signal templates are derived for the R 32 distribution for all m top dependent samples, consisting of the tt signal events, together with single top quark production events. This procedure is adopted, firstly, because single top quark production, although formally a background process, still carries information about the top quark mass and, secondly, by doing so m top independent background templates can be used. The templates are constructed for the six m top choices using the specifically generated Monte Carlo samples, see Section 3.
The R 32 templates are parameterised with a functional form given by the sum of a ratio of two correlated Gaussians and a Landau function. The ratio of two Gaussians [26] is motivated as a representation of the ratio of two correlated measured masses. The Landau function is used to describe the tails of the distribution stemming mainly from wrong jet-triplet assignments. The correlation between the two Gaussian distributions is fixed to 50%. A simultaneous fit to all templates per decay channel is used to derive a continuous function of m top that interpolates the R 32 shape differences among all mass points with m top in the range described above. This approach rests on the assumption that each parameter has a linear dependence on the top quark mass, which has been verified for both channels. The fit minimises a χ 2 built from the R 32 distributions at all mass points simultaneously. The χ 2 is the sum over all bins of the difference squared between the template and the functional form, divided by the statistical uncertainty squared in the template. The combined fit adequately describes the R 32 distributions for both channels. In Figure 4(a) the sensitivity to m top is shown in the e+jets channel by the superposition of the signal templates and their fits for four of the six input top quark masses assumed in the simulation.
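The structure of the simultaneous template fit (shape parameters that depend linearly on m_top, and a χ² summed over all bins of all mass points) can be sketched as follows. This is a toy illustration only: a single Gaussian stands in for the sum of a ratio of two correlated Gaussians and a Landau function, and all parameter values, bin centres and uncertainties are hypothetical.

```python
import math

MASS_POINTS = [160.0, 170.0, 172.5, 175.0, 180.0, 190.0]  # GeV, as listed in Section 3
M_REF = 172.5                                              # reference point of the linear dependence

def linear_param(m_top, offset, slope):
    """Shape parameter assumed to depend linearly on the top quark mass."""
    return offset + slope * (m_top - M_REF)

def toy_shape(x, m_top, params):
    """Stand-in template shape: a Gaussian with mean and width linear in m_top."""
    mean = linear_param(m_top, params["mean0"], params["mean1"])
    width = linear_param(m_top, params["width0"], params["width1"])
    return math.exp(-0.5 * ((x - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def combined_chi2(templates, errors, bin_centres, params):
    """Sum over all mass points and bins of (template - function)^2 / uncertainty^2."""
    total = 0.0
    for m_top in MASS_POINTS:
        for x, content, err in zip(bin_centres, templates[m_top], errors[m_top]):
            if err > 0.0:
                total += ((content - toy_shape(x, m_top, params)) / err) ** 2
    return total

# Hypothetical inputs: templates generated from the toy shape itself.
true_params = {"mean0": 2.1, "mean1": 0.01, "width0": 0.25, "width1": 0.0}
bin_centres = [1.5 + 0.1 * i for i in range(15)]
templates = {m: [toy_shape(x, m, true_params) for x in bin_centres] for m in MASS_POINTS}
errors = {m: [0.05] * len(bin_centres) for m in MASS_POINTS}

wrong_slope = dict(true_params, mean1=0.02)
print(combined_chi2(templates, errors, bin_centres, true_params),
      combined_chi2(templates, errors, bin_centres, wrong_slope))
```

In a real fit the parameters would be varied by a minimiser; here the χ² is only evaluated at two parameter sets to show that a wrong mass dependence is penalised across all mass points at once.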
For the background template, the m top independent parts, see Table 1, are treated together. Their individual distributions, taken either from Monte Carlo or data estimates as detailed above, are summed, and a Landau distribution is chosen to parameterise their R 32 distribution. For each channel this function adequately describes the background distribution as shown in Figure 4(b) for the e+jets channel, which has a larger background contribution than the µ+jets channel.
Signal and background probability density functions, P_sig(R_32|m_top) and P_bkg(R_32), respectively, are used in a binned likelihood fit to the data using a number of bins, N_bins. The likelihood reads:
$$
\begin{aligned}
L(R_{32} \mid m_{\mathrm{top}}) &= L_{\mathrm{shape}}(R_{32} \mid m_{\mathrm{top}}) \times L_{\mathrm{bkg}}(R_{32}), \\
L_{\mathrm{shape}}(R_{32} \mid m_{\mathrm{top}}) &= \prod_{i=1}^{N_{\mathrm{bins}}} \frac{\lambda_i^{N_i}}{N_i!} \, e^{-\lambda_i}, \\
L_{\mathrm{bkg}}(R_{32}) &= \exp\left\{ -\frac{(n_{\mathrm{bkg}} - n_{\mathrm{bkg}}^{\mathrm{pred}})^2}{2 \sigma_{n_{\mathrm{bkg}}^{\mathrm{pred}}}^2} \right\}, \\
\lambda_i &= (N - n_{\mathrm{bkg}}) \cdot P_{\mathrm{sig}}(R_{32} \mid m_{\mathrm{top}})_i + n_{\mathrm{bkg}} \cdot P_{\mathrm{bkg}}(R_{32})_i, \\
N &= \sum_{i=1}^{N_{\mathrm{bins}}} N_i = n_{\mathrm{sig}} + n_{\mathrm{bkg}} .
\end{aligned}
$$
The variable N_i denotes the number of events observed per bin, λ_i the corresponding expected number of events, and n_sig and n_bkg denote the total numbers of signal and background events to be determined. The term L_shape accounts for the shape of the R_32 distribution and its dependence on the top quark mass m_top. The term L_bkg constrains the total number of background events, n_bkg, using its prediction, n_bkg^pred, and the background uncertainty, σ_{n_bkg^pred}, chosen to be 50%, see Table 1. In addition, the number of background events is restricted to be positive. The two free parameters of the fit are the total number of background events, n_bkg, and m_top. The performance of this algorithm is assessed with the pseudo-experiment technique. For each m_top value, distributions from pseudo-experiments are constructed by random sampling of the simulated signal and background events used to construct the corresponding templates. Using Poisson statistics, the numbers of signal events and total background events in each pseudo-experiment are fluctuated around the expectation values, either calculated assuming SM cross-sections and the integrated luminosity of the data, or taken from the data estimate. A good linearity is found between the input top quark mass used to perform the pseudo-experiments, and the result of the fit. Within their statistical uncertainties, the mean values and widths of the pull distributions are consistent with the expectations of zero and one, respectively. The expected statistical uncertainties (mean ± RMS) obtained from pseudo-experiments with an input top quark mass of m_top = 172.5 GeV, and for a luminosity of 1 fb^−1, are 1.36 ± 0.16 GeV and 1.11 ± 0.06 GeV for the e+jets and µ+jets channels, respectively.
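A binned likelihood of this type, with a Poisson term per bin and a Gaussian constraint on the number of background events, can be sketched in Python. Everything specific below is an assumption for illustration: the Gaussian signal shape peaking at m_top / 80.4, the flat background, the binning, the event yields, and the coarse grid scan used in place of a proper minimiser.

```python
import math

N_BINS = 25
BIN_CENTRES = [1.05 + 0.1 * i for i in range(N_BINS)]  # toy R32 binning
BKG_PDF = [1.0 / N_BINS] * N_BINS                       # toy flat background shape

def signal_pdf(m_top):
    """Toy per-bin signal probabilities; the peak moves linearly with m_top."""
    peak = m_top / 80.4
    raw = [math.exp(-0.5 * ((x - peak) / 0.25) ** 2) for x in BIN_CENTRES]
    norm = sum(raw)
    return [r / norm for r in raw]

def neg_log_likelihood(observed, m_top, n_bkg, n_bkg_pred, rel_unc=0.5):
    """-ln(L_shape * L_bkg) with lambda_i = (N - n_bkg) P_sig,i + n_bkg P_bkg,i."""
    n_total = sum(observed)
    if n_bkg <= 0.0 or n_bkg >= n_total:
        return math.inf  # background yield restricted to positive, physical values
    value = 0.0
    for n_i, p_sig, p_bkg in zip(observed, signal_pdf(m_top), BKG_PDF):
        lam = (n_total - n_bkg) * p_sig + n_bkg * p_bkg
        value += lam - n_i * math.log(lam) + math.lgamma(n_i + 1.0)
    sigma = rel_unc * n_bkg_pred
    return value + 0.5 * ((n_bkg - n_bkg_pred) / sigma) ** 2

def grid_scan(observed, n_bkg_pred):
    """Coarse scan over the two free parameters, m_top and n_bkg."""
    best = None
    for i in range(41):
        m_top = 165.0 + 0.5 * i
        for n_bkg in range(10, 300, 10):
            value = neg_log_likelihood(observed, m_top, float(n_bkg), n_bkg_pred)
            if best is None or value < best[0]:
                best = (value, m_top, float(n_bkg))
    return best[1], best[2]

# Hypothetical fluctuation-free dataset: 900 signal events at 175 GeV plus 100 background events.
observed = [900.0 * s + 100.0 * b for s, b in zip(signal_pdf(175.0), BKG_PDF)]
m_fit, n_bkg_fit = grid_scan(observed, n_bkg_pred=100.0)
print(m_fit, n_bkg_fit)
```

Replacing the fluctuation-free dataset by Poisson-fluctuated ones, and repeating the scan many times, gives the pseudo-experiment ensembles from which linearity and pull distributions are studied.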

The 2d-analysis
In the 2d-analysis, similarly to Ref. [27], m_top and a global jet energy scale factor (JSF) are determined simultaneously by using the m_top^reco and m_W^reco distributions^3. Instead of stabilising the estimator of m_top against JES variations as done for the 1d-analysis, the emphasis here is on an in-situ jet scaling. A global JSF (averaged over η and p_T) is obtained, which is mainly based on the observed differences between the predicted m_W^reco distribution and the one observed for the data. This algorithm predicts which global JSF correction should be applied to all jets to best fit the data. Due to this procedure, the JSF is sensitive not only to the JES, but also to all possible differences in data and predictions from specific assumptions made in the simulation that can lead to differences in the observed jets. These comprise: the fragmentation model, initial state and final state QCD radiation (ISR and FSR), the underlying event, and also pileup. In this method, the systematic uncertainty on m_top stemming from the JES is reduced and partly transformed into an additional statistical uncertainty on m_top due to the two-dimensional fit. The precisely measured values of m_W and Γ_W [6] are used to improve on the experimental resolution of m_top^reco by relating the observed jet energies to the corresponding parton energies as predicted by the signal Monte Carlo (i.e. to the two quarks from the hadronic W boson decay, again using LO kinematics). Thereby, this method offers a complementary determination of m_top to the 1d-analysis method, described in Section 6, with different sensitivity to systematic effects and data statistics.
^3 Although for the two analyses m_top^reco and m_W^reco are calculated differently, the same symbols are used to indicate that these are estimates of the same quantities.
For the events fulfilling the common requirements listed in Section 4, the jet triplet assigned to the hadronic top quark decay is constructed from any b-jet, together with any light jet pair with a reconstructed m_W^reco within 50 GeV to 110 GeV. Amongst those, the jet triplet with maximum p_T is chosen as the top quark candidate. For the light jet pair, i.e. for the hadronic W boson decay candidates, a kinematic fit is then performed by minimising the following χ²:
$$
\chi^2 = \sum_{i=1}^{2} \left[ \frac{E_{\mathrm{jet},i}\,(1 - \alpha_i)}{\sigma(E_{\mathrm{jet},i})} \right]^2 + \left[ \frac{M_{\mathrm{jet,jet}}(\alpha_1, \alpha_2) - m_W}{\Gamma_W} \right]^2 ,
$$
with respect to parton scale factors (α_i) for the jet energies. The χ² comprises two components. The first component is the sum of squares of the differences of the measured and fitted energies of the two reconstructed light jets, E_jet,i, individually divided by the squares of their p_T- and η-dependent resolutions obtained from Monte Carlo simulation, σ(E_jet,i). The second term is the square of the difference of their two-jet invariant mass, M_jet,jet, and m_W, divided by the W boson width. From these jets the two observables m_W^reco and m_top^reco are constructed. The m_W^reco is calculated using the reconstructed light jet four-vectors (i.e. jet energies are not corrected using α_i), retaining the full sensitivity of m_W^reco to the JSF. In contrast, m_top^reco is calculated from these light jet four-vectors scaled to the parton level (i.e. jet energies are corrected using α_i) and the above determined b-jet. In this way light jets in m_top^reco exhibit a much reduced JES sensitivity by construction, and only the b-jet is directly sensitive to the JES. The m_W^reco and m_top^reco distributions are shown in Figure 5 for both lepton channels, together with the predictions for signal and background. In both cases these describe the observed distributions well. The correlation of these two observables is found to be small for data and predictions, and amounts to about −0.06.
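The kinematic fit of the light jet pair can be illustrated with a short Python sketch of a χ² with two energy terms and one W mass term, minimised over the scale factors α_1 and α_2. Several simplifications are assumed here: the jets are treated as massless so that the scaled two-jet mass is √(α_1 α_2) times the unscaled one, approximate values m_W = 80.4 GeV and Γ_W = 2.1 GeV are used, the jet energies and resolutions are hypothetical, and a repeated grid refinement replaces a proper minimiser.

```python
import math

M_W = 80.4      # GeV, approximate world-average value (assumed for this sketch)
GAMMA_W = 2.1   # GeV, approximate W boson width (assumed for this sketch)

def chi2(alpha1, alpha2, e1, e2, sigma1, sigma2, m_jj):
    """Energy terms for the two light jets plus the W mass constraint."""
    scaled_mass = math.sqrt(alpha1 * alpha2) * m_jj  # valid for massless jets
    energy_terms = (e1 * (1.0 - alpha1) / sigma1) ** 2 + (e2 * (1.0 - alpha2) / sigma2) ** 2
    mass_term = ((scaled_mass - M_W) / GAMMA_W) ** 2
    return energy_terms + mass_term

def fit_alphas(e1, e2, sigma1, sigma2, m_jj, rounds=6, steps=20):
    """Minimise chi2 by scanning a grid around the current best point and shrinking it."""
    best = (1.0, 1.0)
    half_width = 0.5
    for _ in range(rounds):
        offsets = [half_width * (k - steps) / steps for k in range(2 * steps + 1)]
        candidates = [(best[0] + da, best[1] + db)
                      for da in offsets for db in offsets
                      if best[0] + da > 0.0 and best[1] + db > 0.0]
        best = min(candidates, key=lambda a: chi2(a[0], a[1], e1, e2, sigma1, sigma2, m_jj))
        half_width /= 5.0
    return best

# Hypothetical light jet pair: energies, resolutions and unscaled two-jet mass in GeV.
e1, e2, sigma1, sigma2, m_jj = 60.0, 40.0, 8.0, 6.0, 90.0
alpha1, alpha2 = fit_alphas(e1, e2, sigma1, sigma2, m_jj)
fitted_mass = math.sqrt(alpha1 * alpha2) * m_jj
print(alpha1, alpha2, fitted_mass)
```

Because Γ_W is small compared with the jet energy resolution, the fit pulls the scaled two-jet mass close to m_W at a modest cost in the energy terms, which is why the scaled light jets carry little JES sensitivity in m_top^reco.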
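Templates are constructed for m_top^reco as a function of an input top quark mass in the range 160 GeV to 190 GeV, and of an input value for the JSF in the range 0.9 to 1.1, and, finally, for m_W^reco as a function of the assumed JSF for the same range. The signal templates for the m_W^reco and m_top^reco distributions, shown for the µ+jets channel and for JSF = 1 in Figure 6(a) and 6(b), are fitted to a sum of two Gaussian functions for m_W^reco, and to the sum of a Gaussian and a Landau function for m_top^reco. Since, for this analysis, the background templates are constructed including single top quark production events, the background fit for the m_top^reco distribution is assumed to be m_top dependent. For the background, the m_W^reco distribution, again shown for the µ+jets channel in Figure 6(c), is fitted to a Gaussian function and the m_top^reco distribution, Figure 6(d), to a Landau function. For all parameters of the functions that also depend on the JSF, a linear parameterisation is chosen. The quality of all fits is good for the signal and background contributions and for both channels.
The signal and background probability density functions for the m_top^reco and m_W^reco distributions are used in an unbinned likelihood fit to the data for all events, i = 1, ..., N. The likelihood function maximised is:
$$
\begin{aligned}
L_{\mathrm{shape}}(m_W^{\mathrm{reco}}, m_{\mathrm{top}}^{\mathrm{reco}} \mid m_{\mathrm{top}}, \mathrm{JSF}, n_{\mathrm{bkg}}) &= \prod_{i=1}^{N} P_{\mathrm{top}}(m_{\mathrm{top}}^{\mathrm{reco}} \mid m_{\mathrm{top}}, \mathrm{JSF}, n_{\mathrm{bkg}})_i \times P_{W}(m_W^{\mathrm{reco}} \mid \mathrm{JSF}, n_{\mathrm{bkg}})_i , \\
P_{\mathrm{top}}(m_{\mathrm{top}}^{\mathrm{reco}} \mid m_{\mathrm{top}}, \mathrm{JSF}, n_{\mathrm{bkg}})_i &= (N - n_{\mathrm{bkg}}) \cdot P_{\mathrm{top}}^{\mathrm{sig}}(m_{\mathrm{top}}^{\mathrm{reco}} \mid m_{\mathrm{top}}, \mathrm{JSF})_i + n_{\mathrm{bkg}} \cdot P_{\mathrm{top}}^{\mathrm{bkg}}(m_{\mathrm{top}}^{\mathrm{reco}} \mid m_{\mathrm{top}}, \mathrm{JSF})_i , \\
P_{W}(m_W^{\mathrm{reco}} \mid \mathrm{JSF}, n_{\mathrm{bkg}})_i &= (N - n_{\mathrm{bkg}}) \cdot P_{W}^{\mathrm{sig}}(m_W^{\mathrm{reco}} \mid \mathrm{JSF})_i + n_{\mathrm{bkg}} \cdot P_{W}^{\mathrm{bkg}}(m_W^{\mathrm{reco}} \mid \mathrm{JSF})_i .
\end{aligned}
$$
The three parameters to be determined by the fit are m_top, the JSF and n_bkg. Using pseudo-experiments, a good linearity is found between the input top quark mass used to perform the pseudo-experiments, and the result of the fits. The residual dependence of the reconstructed m_top is about 0.1 GeV for a JSF shift of 0.01 for both channels, which results in a residual systematic uncertainty due to the JES. Within their statistical uncertainties, the mean values and widths of the pull distributions are consistent with the expectations of zero and one, respectively.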
Finally, the expected statistical plus JSF uncertainties (mean ± RMS) obtained from pseudo-experiments at an input top quark mass of m top = 172.5 GeV, and for a luminosity of 1 fb −1 , are 1.20 ± 0.08 GeV and 0.94 ± 0.04 GeV for the e+jets and µ+jets channel, respectively.

Evaluation of systematic uncertainties
Each source of uncertainty considered is investigated, when possible, by varying the respective quantities by ±1σ with respect to the default value. Using the changed parameters, pseudo-experiments are either performed directly or templates are constructed and then used to generate pseudo-experiments, without altering the probability density function parameterisations. The difference of the results for m top compared to the standard analysis is used to determine the systematic uncertainties. For the 2d-analysis, in any of the evaluations of the systematic uncertainties, apart from the JES variations, the maximum deviation of the JSF from its nominal fitted value is ±2.5%.
All sources of systematic uncertainties investigated, together with the resulting uncertainties, are listed in Table 2. The statistical precision on m top obtained from the Monte Carlo samples is between 0.2 GeV and 0.5 GeV, depending on the available Monte Carlo statistics. For some sources, pairs of statistically independent samples are used. For other sources, the same sample is used, but with a changed parameter. In this case the observed m top values for the central and the changed sample are statistically highly correlated. In all cases, the actual observed difference is quoted as the systematic uncertainty on the corresponding source, even if it is smaller than the statistical precision of the difference. The total uncertainty is calculated as the quadratic sum of all individual contributions, i.e. neglecting possible correlations. The estimation of the uncertainties from the individual contributions is described in the following.
Jet energy scale factor: This is needed to separate the quoted statistical uncertainty on the result of the 2d-analysis into a purely statistical component on m top , analogous to the one obtained in a 1d-analysis, and the contribution stemming from the simultaneous determination of the JSF. This uncertainty is evaluated for the 2d-analysis by additionally performing a one-dimensional (i.e. JSF-constrained) fit to the data, with the JSF fixed to the value obtained in the two-dimensional fit. The quoted statistical precision on m top is the one from the one-dimensional fit. The contribution of the JSF is obtained by quadratically subtracting the statistical uncertainty on m top of the one-dimensional fit from that of the two-dimensional fit of the 2d-analysis.
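The quadratic subtraction can be illustrated with a short sketch. The function name and the input values are hypothetical and are not taken from Table 2.

```python
import math

def jsf_component(sigma_2d_fit, sigma_1d_fit):
    """JSF contribution: quadratic difference of the two fit uncertainties (GeV)."""
    if sigma_2d_fit < sigma_1d_fit:
        raise ValueError("the two-dimensional fit uncertainty should not be the smaller one")
    return math.sqrt(sigma_2d_fit ** 2 - sigma_1d_fit ** 2)

# Hypothetical inputs: 1.0 GeV from the two-dimensional fit, 0.8 GeV with the JSF fixed.
print(round(jsf_component(1.0, 0.8), 3))  # 0.6
```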

Method calibration: The limited statistics of the Monte Carlo samples lead to a systematic uncertainty in the template fits, which is reflected in the residual mass differences between the fitted and the input mass for a given Monte Carlo sample. The average difference observed in the six samples with different input masses is taken as the uncertainty from this source.
Signal Monte Carlo generator: The systematic uncertainty related to the choice of the generator program is accounted for by comparing the results of pseudo-experiments performed with either the MC@NLO or the Powheg samples [28], both generated with m top = 172.5 GeV.
Hadronisation: Signal samples for m top = 172.5 GeV from the Powheg event generator are produced with either the Pythia [29] or Herwig [11] program performing the hadronisation. One pseudo-experiment per sample is performed and the full difference of the two results is quoted as the systematic uncertainty.
Pileup: To investigate the uncertainty due to additional proton-proton interactions which may affect the jet energy measurement, on top of the component that is already included in the JES uncertainty discussed below, the fit is repeated in data and simulation as a function of the number of reconstructed vertices. Within statistics, the measured m top is independent of the number of reconstructed vertices. This is also observed when the data are instead divided into data periods according to the average numbers of reconstructed vertices. In this case, the subsets have varying contributions from pileup from preceding events.
However, the effect on m top due to any residual small difference between data and simulation in the number of reconstructed vertices was assessed by computing the weighted sum of a linear interpolation of the fitted masses as a function of the number of primary vertices. In this sum the weights are the relative frequency of observing a given number of vertices in the respective sample. The difference of the sums in data and simulation is taken as the uncertainty from this source.
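The weighted-sum procedure described above can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the "linear interpolation" is taken to be a least-squares straight line, and the same line is applied to both the data and the simulated vertex distributions. All names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def weighted_mass(intercept, slope, vertex_counts):
    """Sum over n_vtx of (relative frequency of n_vtx) times the interpolated mass."""
    total = sum(vertex_counts.values())
    return sum(count / total * (intercept + slope * n_vtx)
               for n_vtx, count in vertex_counts.items())

def pileup_uncertainty(line, counts_data, counts_sim):
    """Absolute difference of the weighted sums in data and simulation (GeV)."""
    intercept, slope = line
    return abs(weighted_mass(intercept, slope, counts_data)
               - weighted_mass(intercept, slope, counts_sim))

# Invented fitted masses (GeV) versus number of reconstructed vertices.
line = fit_line([1, 2, 3], [174.0, 174.1, 174.2])
shift = pileup_uncertainty(line, {1: 1, 3: 1}, {2: 1, 4: 1})
```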
Underlying event: This systematic uncertainty is obtained by comparing the AcerMC [30, 31] central value, defined as the average of the highest and the lowest masses measured on the ISR/FSR variation samples described below, with a dataset with a modified underlying event.
Colour reconnection: The systematic uncertainty due to colour reconnection is determined using AcerMC with Pythia with two different simulations of the colour reconnection effects as described in Refs. [32][33][34]. In each case, the difference in the fitted mass between two assumptions on the size of colour reconnection was measured. The maximum difference is taken as the systematic uncertainty due to colour reconnection.
Initial and final state QCD radiation: Different amounts of initial and final state QCD radiation can alter the jet energies and the jet multiplicity of the events with the consequence of introducing distortions into the measured m reco top and m reco W distributions. This effect is evaluated by performing pseudo-experiments for which signal templates are derived from seven dedicated AcerMC signal samples in which Pythia parameters that control the showering are varied in ranges that are compatible with those used in the Perugia Hard/Soft tune variations [32]. The systematic uncertainty is taken as half the maximum difference between any two samples. Using different observables, the additional jet activity accompanying the jets assigned to the top quark decays has been studied. For events in which one (both) W bosons from the top quark decays themselves decay into a charged lepton and a neutrino, the reconstructed jet multiplicities [35] (the fraction of events with no additional jet above a certain transverse momentum [36]) are measured. The analysis of the reconstructed jet multiplicities is not sufficiently precise to constrain the presently used variations of Monte Carlo parameters. In contrast, for the ratio analysis [36] the spread of the predictions caused by the presently performed ISR variations is significantly wider than the uncertainty of the data, indicating that the present ISR variations are generous.
Table 2: The measured values of m top and the contributions of various sources to the uncertainty of m top (in GeV) together with the assumed correlations ρ between analyses and lepton channels. Here '0' stands for uncorrelated, '1' for fully correlated between analyses and lepton channels, and '(1)' for fully correlated between analyses, but uncorrelated between lepton channels. The abbreviation 'na' stands for not applicable. The combined results described in Section 8.2 are also listed.
Proton PDF: The signal samples are generated using the CTEQ 6.6 [10] proton parton distribution functions (PDFs). These PDFs, obtained from experimental data, have an uncertainty that is reflected in 22 pairs of additional PDF sets provided by the CTEQ group. To evaluate the impact of the PDF uncertainty on the signal templates, the events are re-weighted with the corresponding ratio of PDFs, and 22 pairs of additional signal templates are constructed. Using these templates one pseudo-experiment per pair is performed. The uncertainty is calculated as half the quadratic sum of differences of the 22 pairs as suggested in Ref. [37].
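Read literally, "half the quadratic sum of differences" corresponds to the small calculation sketched below. The function name and the toy inputs are hypothetical; the actual per-pair results are not given in the text.

```python
import math

def pdf_uncertainty(pair_results):
    """Half the quadratic sum of the per-pair mass differences.

    pair_results: list of (m_up, m_down) fitted masses in GeV, one tuple per PDF pair.
    """
    return 0.5 * math.sqrt(sum((up - down) ** 2 for up, down in pair_results))

# Invented example with four pairs, each differing by 1 GeV.
toy_pairs = [(173.0, 172.0)] * 4
toy_result = pdf_uncertainty(toy_pairs)
```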

W+jets background normalisation: The uncertainty on the W +jets background determined from data is dominated by the uncertainty on the heavy flavour content of these events and amounts to ±70%. The difference in m top obtained by varying the normalisation by this amount is taken as the systematic uncertainty.
W+jets background shape: The impact of the variation of the shape of the W +jets background contribution is studied using a re-weighting algorithm [24] which is based on changes observed on stable particle jets when model parameters in the Alpgen Monte Carlo program are varied.
QCD multijet background normalisation: The estimate for the background from QCD multijet events determined from data is varied by ±100% to account for the current understanding of this background source [24] for the signal event topology.
QCD multijet background shape: The uncertainty due to the QCD background shape has been estimated by comparing the results from two data driven methods, for both channels, see Ref. For this uncertainty pseudo-experiments are performed on QCD background samples with varied shapes.
Jet energy scale: The jet energy scale is derived using information from test-beam data, LHC collision data and simulation. Since the energy correction procedure involves a number of steps, the JES uncertainty has various components originating from the calibration method, the calorimeter response, the detector simulation, and the specific choice of parameters in the physics model employed in the Monte Carlo event generator. The JES uncertainty varies between ±2.5% and ±8% in the central region, depending on jet p T and η as given in Ref. [19]. These values include uncertainties in the flavour composition of the sample and mis-measurements from jets close by. Pileup gives an additional uncertainty of up to ±2.5% (±5%) in the central (forward) region. Due to the use of the observable R 32 for the 1d-analysis, and to the simultaneous fit of the JSF and m top for the 2d-analysis, which mitigate the impact of the JES on m top differently, the systematic uncertainty on the determined m top resulting from the uncertainty of the jet energy scale is less than 1%, i.e. much smaller than the JES uncertainty itself.
Relative b-jet energy scale: This uncertainty is uncorrelated with the jet energy scale uncertainty and accounts for the remaining differences between jets originating from light quarks and those from b-quarks after the global JES has been determined. For this, an extra uncertainty ranging from ±0.8% to ±2.5% and depending on jet p T and η is assigned to jets arising from the fragmentation of b-quarks, due to differences between light jets and gluon jets, and jets containing b-hadrons. This uncertainty decreases with p T , and the average uncertainty for the spectrum of jets selected in the analyses is below ±2%.
This additional systematic uncertainty has been obtained from Monte Carlo simulation and was also verified using b-jets in data. The validation of the b-jet energy scale uncertainty is based on the comparison of the jet transverse momentum as measured in the calorimeter to the total transverse momentum of charged particle tracks associated to the jet. These transverse momenta are evaluated in the data and in Monte Carlo simulated events for inclusive jet samples and for b-jet samples [19]. Moreover, the jet calorimeter response uncertainty has been evaluated from the single hadron response. Effects stemming from b-quark fragmentation, hadronisation and underlying soft radiation have been studied using different Monte Carlo event generation models [19].
b-tagging efficiency and mistag rate: The b-tagging efficiency and mistag rates in data and Monte Carlo simulation are not identical. To accommodate this, b-tagging scale factors, together with their uncertainties, are derived per jet [21,38]. They depend on the jet p T and η and the underlying quark-flavour. For the default result the central values of the scale factors are applied, and the systematic uncertainty is assessed by changing their values within their uncertainties.
Jet energy resolution: To assess the impact of this uncertainty, before performing the event selection, the energy of each reconstructed jet in the simulation is additionally smeared by a Gaussian function such that the width of the resulting Gaussian distribution corresponds to the one including the uncertainty on the jet energy resolution. The fit is performed using smeared jets and the difference to the default m top measurement is assigned as a systematic uncertainty.
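One way to realise the additional smearing is sketched below. It assumes that Gaussian widths add in quadrature, that resolutions are expressed relative to the jet energy, and that the varied resolution is the nominal one plus its uncertainty. The function names are hypothetical and this is not the analysis code.

```python
import math
import random

def extra_smearing_width(nominal_resolution, resolution_uncertainty):
    """Width of the extra Gaussian such that the total width becomes
    nominal_resolution + resolution_uncertainty (quadrature addition assumed)."""
    target = nominal_resolution + resolution_uncertainty
    return math.sqrt(target ** 2 - nominal_resolution ** 2)

def smear_energy(energy, nominal_resolution, resolution_uncertainty, rng):
    """Return the jet energy after the additional Gaussian smearing."""
    width = extra_smearing_width(nominal_resolution, resolution_uncertainty)
    return energy * (1.0 + rng.gauss(0.0, width))

# Invented numbers: a 50 GeV jet with 10% resolution known to within 1% absolute.
rng = random.Random(2011)
smeared = smear_energy(50.0, 0.10, 0.01, rng)
```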
Jet reconstruction efficiency: The jet reconstruction efficiency for data and the Monte Carlo simulation are found to be in agreement with an accuracy of better than ±2% [19]. To account for this, jets are randomly removed from the events using that fraction. The event selection and the fit are repeated on the changed sample.
Missing transverse momentum: The E miss T is used in the event selection and also in the likelihood for the 1d-analysis, but is not used in the m top estimator for either analysis. Consequently, the uncertainty due to any mis-calibration is expected to be small. The impact of a possible mis-calibration is assessed by changing the measured E miss T within its uncertainty.
The resulting sizes of all uncertainties are given in Table 2. They are also used in the combination of results described below. The three most important sources of systematic uncertainty for both analyses are the relative b-jet to light jet energy scale, the modelling of initial and final state QCD radiation, and the light jet energy scale. Their impact on the precision of m top differs between the two analyses, as expected from the difference in the estimators used.
Figure 7 shows the results of the 1d-analysis when performed on data. For both channels, the fit function describes the data well, with a χ 2 /dof of 21/23 (39/23) for the e+jets (µ+jets) channel. The observed statistical uncertainties in the data are consistent with the expectations given in Section 6, with the e+jets channel uncertainty being slightly higher than the expected uncertainty of 1.36 ± 0.16 GeV. The results from both channels are statistically consistent and are: m top = 172.9 ± 1.5 stat ± 2.5 syst GeV (1d e+jets), m top = 175.5 ± 1.1 stat ± 2.6 syst GeV (1d µ+jets).
For the 2d-analysis, the results of the two channels are consistent with each other within statistical uncertainties, and the observed statistical uncertainties in the data are in accord with the expectations given in Section 7; for this analysis, however, the e+jets channel uncertainty is slightly lower than the expected uncertainty of 1.20 ± 0.08 GeV. The corresponding values for the JSF are 0.985 ± 0.008 and 0.986 ± 0.006 in the e+jets and µ+jets channels, respectively, where the uncertainties are statistical only.
The JSF values fitted for the two channels are consistent within their statistical uncertainty. For both channels, the correlation of m top and the JSF in the fits is about −0.57.

Results
When separating the statistical and JSF component of the result as explained in the discussion of the JSF uncertainty evaluation in Section 8.1, the result from the 2d-analysis yields: m top = 174.3 ± 0.8 stat ± 2.3 syst GeV (2d e+jets), m top = 175.0 ± 0.7 stat ± 2.6 syst GeV (2d µ+jets).
These values together with the breakdown of uncertainties are shown in Table 2 and are used in the combinations.
Due to the additional event selection requirements used in the 1d-analysis to optimise the expected uncertainty described in Section 5, for both channels the 2d-analysis has the smaller statistical uncertainty, despite the better top quark mass resolution of the 1d-analysis. Both analyses are limited by the systematic uncertainties, which have different relative contributions per source but are comparable in total size, i.e. the difference in total uncertainty between the most precise and the least precise of the four measurements is only 16%.
The four individual results are all based on data from the first part of the 2011 data taking period. The e+jets and µ+jets channel analyses exploit exclusive event selections and consequently are statistically uncorrelated within a given analysis. In contrast, for each lepton channel the data samples of the two analyses partly overlap, see Section 4. However, because the selection of the jet triplet and the construction of the estimator of m top are different, the two analyses are less correlated than the roughly 50% that would be expected from the overlap of events.
The statistical correlation of the two results for each of the lepton channels is evaluated using the Monte Carlo method suggested in Ref. [39], exploiting the large Monte Carlo signal samples. For all four measurements (two channels and two analyses), five hundred independent pseudo-experiments are performed, ensuring that for every single pseudo-experiment the identical events are input to all measurements. The precision of the determined statistical correlations depends purely on the number of pseudo-experiments performed, and in particular, it is independent of the uncertainty of the measured m top per pseudo-experiment. In this analysis, the precision amounts to approximately 4% absolute, i.e. this estimate is sufficiently precise that its impact on the uncertainty on m top , given the low sensitivity of the combined results of m top to the statistical correlation, is negligible. For the 1d-analysis, the signal comprises tt and single top quark production, whereas for the 2d-analysis the single top quark production process is included in the background, see Table 1. Consequently, the MC@NLO samples generated at m top = 172.5 GeV for both processes are used appropriately for each analysis in determining the statistical correlations. The statistical correlation between the results of the two analyses is 0.15 (0.16) in the e+jets (µ+jets) channel. Given these correlations, the two measurements for each lepton channel are statistically consistent for both lepton flavours.
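The quoted precision of about 4% absolute can be cross-checked with the usual large-sample approximation for the standard error of a sample correlation coefficient, (1 − ρ²)/√(N − 1). This is offered as a plausibility check under that assumption, and is not necessarily how the number was obtained in the analysis; the function name is hypothetical.

```python
import math

def correlation_standard_error(rho, n_pseudo_experiments):
    """Large-sample approximation: (1 - rho^2) / sqrt(N - 1)."""
    return (1.0 - rho ** 2) / math.sqrt(n_pseudo_experiments - 1)

# Correlation of 0.15 estimated from 500 pseudo-experiments.
print(round(correlation_standard_error(0.15, 500), 3))  # 0.044
```

Because the expression depends only on ρ and N, it also illustrates why the precision is independent of the per-pseudo-experiment uncertainty on m top .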
The combinations of results are performed for the individual measurements and their uncertainties listed in Table 2 and using the formalism described in Refs. [39,40]. The statistical correlations described above are used. The correlations of systematic uncertainties assumed in the combinations fall into three classes. For the uncertainty in question the measurements are either considered uncorrelated (ρ = 0), fully correlated between analyses and lepton channels (ρ = 1), or fully correlated between analyses but uncorrelated between lepton channels, denoted with ρ = (1). A correlation of ρ = 0 is used for the sources method calibration and jet energy scale factor, which are of purely statistical nature. The sources with ρ = 1 are listed in Table 2. Finally, the sources with ρ = (1) are QCD background normalisation and shape that are based on independent lepton fake rates in each lepton channel.
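For two measurements, a minimum-variance linear combination with a known correlation can be sketched as below. It is assumed here, not stated in the text, that the formalism of Refs. [39,40] is of this type; the sketch also handles only a single total uncertainty per measurement instead of the per-source correlation classes described above. The function name is hypothetical.

```python
import math

def combine_two(x1, s1, x2, s2, rho):
    """Minimum-variance linear combination of two correlated estimates.

    Returns (combined value, combined uncertainty, (weight1, weight2)).
    Undefined when s1 == s2 and rho == 1 (zero denominator).
    """
    cov = rho * s1 * s2
    denom = s1 ** 2 + s2 ** 2 - 2.0 * cov
    w1 = (s2 ** 2 - cov) / denom
    w2 = 1.0 - w1
    value = w1 * x1 + w2 * x2
    variance = (s1 ** 2 * s2 ** 2 * (1.0 - rho ** 2)) / denom
    return value, math.sqrt(variance), (w1, w2)

# Illustration with statistical uncertainties only (uncorrelated channels),
# using the two 1d-analysis channel results; this is not the full combination.
value, sigma, weights = combine_two(172.9, 1.5, 175.5, 1.1, 0.0)
```

In this simplified statistical-only example the more precise µ+jets measurement receives the larger weight, as expected for a minimum-variance combination.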
Combining the results for the two lepton channels separately for each analysis gives the following results (note that these two analyses are correlated as described above): m top = 174.4 ± 0.9 stat ± 2.5 syst GeV (1d-analysis), m top = 174.5 ± 0.6 stat ± 2.3 syst GeV (2d-analysis).
For the 1d-analysis the µ+jets channel is more precise, and consequently carries a larger weight in the combination, whereas for the 2d-analysis this is reversed. However, for both analyses, the improvement on the more precise estimate by the combination is moderate, i.e. a few percent, see Table 2.
The pairwise correlations of the four individual results range from 0.63 to 0.77, with the smallest correlation between the results from the different lepton channels of the different analyses, and the largest correlation between the ones from the two lepton channels within an individual analysis. The combination of all four measurements of m top yields statistical and systematic uncertainties on the top quark mass of 0.6 GeV and 2.3 GeV, respectively. Presently this combination does not improve the precision of the measured top quark mass from the 2d-analysis, which has the better expected total uncertainty. Therefore, the result from the 2d-analysis is presented as the final result. The two analyses will profit differently from progress on the individual systematic uncertainties, which can be fully exploited by the method to estimate the statistical correlation of different estimators of m top obtained in the same data sample together with the outlined combination procedure. The results are summarised in Figure 10 and compared to selected measurements from the Tevatron experiments.

Summary and conclusion
The top quark mass has been measured directly via two implementations of the template method in the e+jets and µ+jets decay channels, based on proton-proton collision data from 2011 corresponding to an integrated luminosity of about 1.04 fb −1 . The two analyses mitigate the impact of the three largest systematic uncertainties on the measured m top with different methods. The e+jets and µ+jets channels, and both analyses, lead to consistent results within their correlated uncertainties. A combined 1d-analysis and 2d-analysis result does not currently improve the precision of the measured top quark mass from the 2d-analysis and hence the 2d-analysis result is presented as the final result: m top = 174.5 ± 0.6 stat ± 2.3 syst GeV. This result is statistically as precise as the m top measurement obtained in the Tevatron combination, but the total uncertainty, dominated by systematic effects, is still significantly larger. In this result, the three most important sources of systematic uncertainty are from the relative b-jet to light jet energy scale, the modelling of initial and final state QCD radiation, and the light quark jet energy scale. These sources account for about 85% of the total systematic uncertainty.

Acknowledgements
We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently. We