Full simulation study of the top Yukawa coupling at the ILC at $\sqrt{s}$ = 1 TeV

We present a study of the expected precision of the measurement of the top Yukawa coupling, $y_t$, in $e^+e^-$ collisions at a center-of-mass energy of 1 TeV, assuming a beam polarization of $P(e^-, e^+) = (-0.8, +0.2)$. Independent analyses of ttH final states containing at least six hadronic jets are performed, based on detailed simulations of SiD and ILD, the two candidate detector concepts for the ILC. We estimate that a statistical precision on $y_t$ of 4% can be obtained with an integrated luminosity of 1 $\mathrm{ab}^{-1}$.


I. INTRODUCTION
The discovery of a Standard Model (SM)-like Higgs boson, announced on July 4th, 2012 by the ATLAS and CMS collaborations [1,2], was celebrated as a major milestone in particle physics. In the SM, the coupling strength of the Higgs boson to a fermion is given by $y_f = \sqrt{2}\, m_f / v$, where $m_f$ is the fermion mass and $v \approx 246$ GeV is the vacuum expectation value. Since the top quark is the heaviest known elementary particle, the measurement of the top Yukawa coupling, $y_t$, serves as the high endpoint to test this prediction. A sizable deviation in $y_t$ from the SM prediction is expected in various new physics scenarios, which motivates a precise measurement of $y_t$. For example, in composite Higgs models, where the Higgs boson is a pseudo-Nambu-Goldstone boson, $y_t$ could deviate by up to tens of percent, even in the scenario that no new particles are discovered in LHC Run 2 data [3].
A recent study of the prospects of measuring $y_t$ at the LHC estimates that a precision of 14-15% (7-10%) is achievable with an integrated luminosity of 0.3 $\mathrm{ab}^{-1}$ (3 $\mathrm{ab}^{-1}$) [4]. For $e^+e^-$ collisions, detailed simulation studies have been carried out using the ttH process at various center-of-mass (CM) energies. At $\sqrt{s}$ = 500 GeV [5-7], where the $e^+e^- \to$ ttH cross section is sharply rising, the precision is estimated to be about 10% for an integrated luminosity of 1 $\mathrm{ab}^{-1}$, while at $\sqrt{s}$ = 800 GeV [8,9], it is estimated that $y_t$ can be measured to a precision of 5-6% for an integrated luminosity of 1 $\mathrm{ab}^{-1}$.
The International Linear Collider (ILC) [10] is a proposed $e^+e^-$ collider with a maximum CM energy of $\sqrt{s}$ = 1 TeV. It has a broad physics potential that is complementary to the LHC, and precision measurements of the Higgs couplings are an integral part of the physics program at this machine. We present studies of the measurement of the top Yukawa coupling in direct observation at the 1 TeV stage of the ILC. The studies are carried out in ILD and SiD [11], the two detector concepts for the ILC. They are performed with detailed detector simulations, taking into account the main beam-induced backgrounds at the collider as well as the dominant backgrounds from other physics processes. Two final states are considered: events where both W bosons from the top quarks decay hadronically, and events where exactly one of the two W bosons decays leptonically.
The studies performed for the two detector concepts have large overlaps, and we highlight significant differences between the two analyses wherever applicable. This document is organized as follows: Section II gives an overview of the signal sample and the considered physics backgrounds. Section III gives brief overviews of the two ILC detector models. The tools for the generation of physics processes and for the detector simulation and reconstruction are listed in Section IV. The two dominant sources of machine-induced background in the detectors are introduced in Section V. The techniques to reduce these backgrounds and reconstruct the top quarks and Higgs bosons are described in Section VI. Details of the event selection are given in Section VII and the results are presented in Section VIII. The dominant sources of systematic uncertainty are given in Section IX and the two analyses are summarized in Section X.

Figure 1 illustrates the lowest order Feynman diagrams for the process $e^+e^- \to$ ttH. The diagram for the reaction $e^+e^- \to Z^*H$ (Higgs-strahlung) with $Z^* \to$ tt, which does not contain $y_t$, has a small yet non-negligible contribution to the total cross section. The size of this effect is studied by evaluating the change of the $e^+e^- \to$ ttH cross section when modifying $y_t$ from the SM value. Using this procedure, we extract the factor $\kappa$ as defined by the relation $\Delta y_t / y_t = \kappa \cdot \Delta\sigma / \sigma$ for the SM value of $y_t$. In the absence of the Higgs-strahlung diagram, the cross section would scale as $y_t^2$ and we would find $\kappa$ = 0.5. Instead, we find $\kappa$ = 0.52, indicating a non-negligible contribution from the Higgs-strahlung diagram to the total cross section at $\sqrt{s}$ = 1 TeV. This factor is used in the extraction of the top Yukawa coupling precision. The correction will be known with good precision, because the Higgs coupling to the Z boson can be extracted from measurements of $e^+e^- \to ZH$ events at $\sqrt{s}$ = 250 GeV with a statistical uncertainty of about 1.5% [12].
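The extraction of the factor κ can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the cross section is a quadratic polynomial in the coupling (in units of its SM value); the coefficients below are toy numbers chosen to reproduce κ = 0.5 and κ = 0.52 and are not taken from the study. The function names `kappa` and `make_sigma` are hypothetical.

```python
def kappa(sigma, y0=1.0, rel_step=1e-4):
    """Return kappa = (dy/y) / (dsigma/sigma) at y0.

    The derivative dsigma/dy is estimated with a central finite difference,
    mimicking the procedure of varying the coupling around its SM value.
    """
    h = y0 * rel_step
    dsigma_dy = (sigma(y0 + h) - sigma(y0 - h)) / (2.0 * h)
    return sigma(y0) / (y0 * dsigma_dy)


def make_sigma(a, b, c):
    """Toy cross section a*y^2 + b*y + c (arbitrary units, y in SM units)."""
    return lambda y: a * y * y + b * y + c


# Only diagrams proportional to y_t: sigma ~ y^2, hence kappa = 0.5.
pure_yukawa = make_sigma(1.0, 0.0, 0.0)

# Toy admixture of a y_t-independent term (hypothetical size), giving kappa = 0.52.
with_higgsstrahlung = make_sigma(1.0, 0.0, 0.04)

print(kappa(pure_yukawa))
print(kappa(with_higgsstrahlung))
```

In this toy picture any term that does not scale as $y_t^2$ pushes κ above 0.5, which is the qualitative effect described in the text.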

II. SIGNAL AND BACKGROUND PROCESSES
For this study the semileptonic and hadronic decays of the tt system were studied with the Higgs decaying via the dominant decay mode into a bb pair. For the fully hadronic decay channel this leads to a signature of eight hadronic jets, four of which are b jets. In the semileptonic mode the final signal in the detector consists of six hadronic jets, four of which are b jets, an isolated lepton, and missing energy and momentum from a neutrino. For isolated leptons, only the prompt electrons and muons are reconstructed and considered as signal, neglecting the decays into τ leptons. These two modes are reconstructed in independent samples and are combined statistically.
Irreducible backgrounds to these processes arise from the eight-fermion final states of ttZ where the Z decays into a bb pair and ttbb where the tt system radiates a hard gluon which forms a bb pair. A large background contribution also arises from tt due to the huge relative cross section compared to the signal. There is also a contribution from the other decay modes of the ttH system such as the Higgs decaying to final states other than a bb pair and the fully leptonic decays of the top quarks.
An overview of the cross sections for the signal final states as well as for the considered backgrounds is shown in Table I. For the measurement using the final state with six jets, all other ttH events, i.e., all events where both top quarks decay either leptonically or hadronically, or events where the Higgs boson does not decay into bb, are treated as background. For the eight-jets final state, events where at least one top quark decays leptonically or where the Higgs boson decays into final states other than bb are considered as background. The non-ttH backgrounds are considered for both measurements.

III. DETECTOR MODELS
The SiD [11, chapter 2] and ILD [11, chapter 3] concepts are designed to be the two general-purpose detectors for the ILC, with 4π coverage, employing highly granular calorimeters for particle flow calorimetry. For SiD, a superconducting solenoid with an inner radius of 2.6 m provides a central magnetic field of 5 T. The calorimeters are placed inside the coil and consist of a 30-layer tungsten-silicon electromagnetic calorimeter (ECAL) with 13 mm$^2$ segmentation, followed by a hadronic calorimeter (HCAL) with steel absorber, instrumented with resistive plate chambers (RPCs), with 40 layers in the barrel region and 45 layers in the endcaps. The read-out cell size in the HCAL is 10 × 10 mm$^2$. The iron return yoke outside of the coil is instrumented with 11 RPC layers with 30 × 30 mm$^2$ read-out cells for muon identification. The silicon-only tracking system consists of five layers of 20 × 20 µm$^2$ pixels followed by five strip layers with a pitch of 25 µm, a read-out pitch of 50 µm and a length of 92 mm per module in the barrel region. The tracking system in the endcap consists of four stereo-strip disks with similar pitch and a stereo angle of 12°, complemented by four pixelated disks in the vertex region with a pixel size of 20 × 20 µm$^2$ and three disks in the far-forward region at lower radii with a pixel size of 50 × 50 µm$^2$. All sub-detectors have the capability of time-stamping at the level of individual bunches, which are 337 ns apart with ≈ 1300 bunches per train. This allows hits originating from different bunch crossings to be separated. The whole detector will be read out in the 200 ms between bunch trains.
The ILD detector model is designed around a different optimization with a larger size. The ECAL and HCAL are placed inside a superconducting solenoid, which provides a magnetic field of 3.5 T. The silicon-tungsten ECAL has an inner radius of 1.8 m and a total thickness of 20 cm, with 5 × 5 mm$^2$ transverse cell size and 30 layers of longitudinal segmentation. The steel-scintillator HCAL has an outer radius of 3.4 m with 3 × 3 cm$^2$ transverse tiles and 48 layers of longitudinal segmentation. ILD employs a hybrid tracking system consisting of a time projection chamber (TPC), which provides up to 224 points per track, and silicon-strip sensors for improved track momentum resolution, which are placed in the barrel region both inside and outside the TPC and in the endcap region outside the TPC. The vertex detector consists of three double layers of silicon pixel sensors with radii ranging from 15 to 60 mm, providing a spatial resolution of 2.8 µm. An iron return yoke instrumented with a muon detector and a tail catcher is placed outside the solenoid. In addition, silicon trackers and beam/luminosity calorimeters are installed in the forward region.

IV. ANALYSIS FRAMEWORK
The ttH, ttZ, and ttbb samples were generated using the physsim [13] event generator. The sample referred to as tt in the following includes six-fermion final states consistent with the tt decays but not limited to the resonant tt production. The tt events were generated using the whizard 1.95 [14,15] event generator. All samples were generated taking into account the expected beam energy spectrum at the $\sqrt{s}$ = 1 TeV ILC, including initial state radiation [16] and beamstrahlung. The spectrum was sampled from a simulation of beam events [17]. The model for the hadronization in pythia 6.4 [18] uses a tune based on OPAL data [19, Appendix B.3].
Detailed detector simulations based on geant4 [20,21] are performed. In the SiD analysis, the event reconstruction is performed in the org.lcsim [22] package. The ILD analysis uses the Marlin [23,24] framework. Both analyses use the PandoraPFA [25] algorithm for calorimeter clustering and combined analysis of track and calorimeter information based on the particle flow approach. The LCFIPlus [11, Section 2.2.2.3] package is used for the identification of heavy flavor jets. The assumed integrated luminosity of the analysis is 1 $\mathrm{ab}^{-1}$, which is split equally between the two polarization configurations (+80%, −20%) and (−80%, +20%) for the polarization of the electron and positron beams ($P_{e^-}$, $P_{e^+}$). Detector hits from beam-induced backgrounds from the processes described in Section V are treated correctly in the simulation of the detector readout and in the reconstruction.

[...] low-momentum particles per bunch crossing. We assume 4.1 hadronic events from two-photon processes ($\gamma\gamma \to$ hadrons) with an energy greater than 300 MeV per signal event. Figure 2 shows the kinematic properties of the particles originating from these processes. They do not affect the reconstruction significantly, but present a challenge to the sub-detector occupancies and pattern recognition. The SiD analysis includes both effects, while the ILD analysis includes only the $\gamma\gamma \to$ hadrons processes.
While the most energetic particles from incoherent pair production are primarily outside of the detector acceptance of both detectors, some low-$p_T$ particles lead to an occupancy of up to 0.06 hits/mm$^2$ per bunch crossing in the vertex detector and up to 5 × 10$^{-5}$ hits/mm$^2$ per bunch crossing in the main tracker for the SiD detector model. They do not, however, affect the energy reconstruction. Particles from $\gamma\gamma \to$ hadrons processes, on the other hand, can have sizable values of $p_T$ and reach the calorimeters, affecting the jet energy resolution. The beam-induced backgrounds do not degrade the tracking performance significantly [11].
The primary vertices of the beam-induced backgrounds are distributed with a Gaussian profile along the beam direction across the luminous region of 225 µm, taking into account the bunch length along the beam direction.
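As a rough illustration of how such an overlay could be sampled, the sketch below draws the number of $\gamma\gamma \to$ hadrons events per signal event from a Poisson distribution with mean 4.1 and places each vertex along the beam axis with a Gaussian profile. This is not the simulation chain used in the study: the Poisson assumption, the interpretation of 225 µm as the Gaussian standard deviation, and all function names are assumptions made for this example.

```python
import math
import random

MEAN_GG_HADRONS = 4.1  # from the text: mean number of gamma gamma -> hadrons events
SIGMA_Z_UM = 225.0     # assumption: 225 um taken as the Gaussian sigma along z


def sample_poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_overlay(rng):
    """Return the z positions (in um) of the overlaid background vertices."""
    n_events = sample_poisson(MEAN_GG_HADRONS, rng)
    return [rng.gauss(0.0, SIGMA_Z_UM) for _ in range(n_events)]


rng = random.Random(12345)  # fixed seed so the example is reproducible
overlays = [sample_overlay(rng) for _ in range(20000)]
mean_multiplicity = sum(len(o) for o in overlays) / len(overlays)
print(f"average overlay multiplicity: {mean_multiplicity:.2f}")
```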

A. Suppression of Beam-Induced Backgrounds
The particles originating from beam-induced backgrounds as described in Sec. V tend towards low transverse momenta and small angles with respect to the beam axis. Different approaches are used to suppress the impact of the beam-induced backgrounds. For the SiD analysis, only the reconstructed objects in the range 20° < θ < 160° are considered, because the ttH final state is produced via s-channel exchange and is not suppressed by this selection. In the ILD analysis, the longitudinally-invariant $k_T$ jet algorithm [26,27] with a value of 1.2 for the R parameter is employed to suppress the particles close to the beam axis. Only the particles grouped into the physics jets by the $k_T$ algorithm are considered further in the analysis. Figure 3 shows how the impact of the beam-induced backgrounds on the reconstructed Higgs mass is mitigated by the removal procedure. A modified version of the Durham jet finding algorithm [28] then groups all particles in the event into a specified number of jets, without splitting decay products of secondary vertices across different jets.
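The polar-angle requirement used in the SiD analysis amounts to a simple filter on the reconstructed objects. The following minimal sketch assumes each object is represented as a hypothetical `(px, py, pz, E)` tuple; the actual reconstruction software and data model are not shown here.

```python
import math


def polar_angle_deg(px, py, pz):
    """Polar angle with respect to the beam (z) axis, in degrees."""
    return math.degrees(math.atan2(math.hypot(px, py), pz))


def within_acceptance(obj, theta_min=20.0, theta_max=160.0):
    """True if the object lies in the range theta_min < theta < theta_max."""
    px, py, pz, _energy = obj
    return theta_min < polar_angle_deg(px, py, pz) < theta_max


def select_central(objects):
    """Keep only objects away from the beam axis."""
    return [obj for obj in objects if within_acceptance(obj)]


# Hypothetical objects: one central, one forward, one backward.
objects = [
    (30.0, 0.0, 5.0, 30.5),   # theta close to 80 degrees, kept
    (0.1, 0.0, 1.0, 1.01),    # theta close to 6 degrees, removed
    (0.1, 0.0, -1.0, 1.01),   # theta close to 174 degrees, removed
]
central = select_central(objects)
```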

B. Reconstruction of Isolated Leptons
Signal events with six jets contain one high-energy isolated lepton from the leptonic W boson decay. No isolated leptons are expected in signal events with two hadronic W decays. Hence the number of isolated leptons is an important observable in the signal selections for both final states.
The electron and muon identification criteria used in this study are based on the energy deposition in the ECAL and HCAL and the momentum measured by the tracker. Electron candidates are selected by requiring that almost all of the energy deposition is in the ECAL and that the total calorimetric energy deposition is consistent with the momentum measured by the tracker. For the muon candidates, most of the energy deposition is in the HCAL, while the calorimetric energy is required to be small compared to the corresponding momentum measured by the tracker. A selection on the impact parameter reduces non-prompt leptons.
The SiD analysis uses the IsolatedLeptonFinder processor implemented in MarlinReco [23] to identify leptons in regions with otherwise little calorimetric activity. The ILD analysis additionally exploits the transverse distance from the jet axis to identify leptons from leptonic W decays.
The electron and muon identification capabilities of the reconstruction within a multi-jet environment were tested in a sample of four jets, one lepton and missing energy. The efficiency is defined as the fraction of leptons with correctly identified flavor in a sample of isolated lepton candidates. The purity is the ratio of the number of leptons of a given type stemming from a leptonic W decay to the number of all identified isolated leptons of that type. An efficiency of 82% (89%) and purity of 95% (97%) for electrons (muons) is observed in ILD and 86% (86%) efficiency and 94% (95%) purity for electrons (muons) in SiD.

C. Jet Clustering and Flavor Identification
Depending on the signal definition for the semileptonic or hadronic final state, the Durham jet clustering algorithm is used in the exclusive mode to cluster the event into six or eight jets, respectively.
Heavy flavor identification is primarily used to remove the tt background. Both the six-jets and eight-jets final states contain four b jets. The flavor tagging classifier for the measurement of ttH production was trained on events with six quarks of the same flavor produced in electron-positron annihilation. For the training, 60000 c and b jets, and 180000 light quark jets are used. These samples were chosen since the jets have similar kinematic properties as those in ttH signal events. The tt events contain no more than two b jets from the top decays, as do ∼80% of ttZ events. Figure 4(a) shows the distribution of the response from the flavor-tagging multivariate selection for the jet that has the third-highest tagging probability. In both analyses, the shape of the distribution of the flavor tagging response, rather than a simple cut, is used. The background channels, in particular tt, are dominated by the peak at low values. The peak at higher values in the ttZ channel is due to events with four genuine b jets.

D. Reconstruction of W, top and Higgs Candidates
To form W, top and Higgs candidates from the reconstructed jets, the function given in Eq. (2) is minimized for the final state with eight jets.
In the ILD analysis, the b tagging information is also used to reduce the number of combinations by forming the Higgs candidate from pairs of jets that have the highest value of the b-tagging classifier. The other jets in the event are used to form the top candidates.
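The explicit form of the minimized function is not reproduced in this text. A common choice for this kind of jet pairing, assumed here purely for illustration, is a χ²-like sum of squared deviations of the candidate masses from reference values, divided by assumed mass resolutions. The sketch below applies this to the six-jets case with the masses $M_{12}$ (W candidate), $M_{123}$ (top candidate) and $M_{45}$ (Higgs candidate); the reference masses, the resolutions and the function names are hypothetical and not taken from the study.

```python
import itertools
import math

# Reference masses in GeV and hypothetical resolutions (assumptions for this sketch).
M_W, M_TOP, M_H = 80.4, 173.0, 125.0
SIGMA_W, SIGMA_TOP, SIGMA_H = 10.0, 15.0, 12.0


def inv_mass(jets):
    """Invariant mass of a set of (E, px, py, pz) four-vectors."""
    e = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def best_assignment(jets):
    """Scan all assignments and return (chi2, w_pair, b_index, h_pair) with minimal chi2."""
    best = None
    indices = range(len(jets))
    for w_pair in itertools.combinations(indices, 2):
        remaining = [i for i in indices if i not in w_pair]
        for b_index in remaining:
            others = [i for i in remaining if i != b_index]
            for h_pair in itertools.combinations(others, 2):
                m12 = inv_mass([jets[i] for i in w_pair])
                m123 = inv_mass([jets[i] for i in w_pair + (b_index,)])
                m45 = inv_mass([jets[i] for i in h_pair])
                chi2 = (((m12 - M_W) / SIGMA_W) ** 2
                        + ((m123 - M_TOP) / SIGMA_TOP) ** 2
                        + ((m45 - M_H) / SIGMA_H) ** 2)
                if best is None or chi2 < best[0]:
                    best = (chi2, w_pair, b_index, h_pair)
    return best
```

For six jets this scans 180 assignments, so a brute-force loop is sufficient. Restricting the Higgs candidate to the jets with the highest b-tag values, as in the ILD analysis, would reduce the number of combinations further.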

VII. EVENT SELECTION
Events were selected using Boosted Decision Trees (BDTs) as implemented in TMVA [29]. The BDTs were trained separately for the eight- and six-jets final states. The following input variables were used:
• the four highest values of the b-tagging classifier. The third (see Figure 4(a)) and fourth highest b-tag values are especially suited to reject tt and most of the ttZ events, both of which contain only two b jets;
• the event thrust [30] (see Figure 4(b)), defined as $T = \max_{\hat{n}} \sum_i |\hat{n} \cdot \vec{p}_i| / \sum_i |\vec{p}_i|$, where $\vec{p}_i$ is the momentum of jet $i$. Since the top quarks in tt events are produced back-to-back, the thrust variable has larger values in tt events compared to ttH, ttZ or ttbb events;
• the jet resolution parameter from the Durham algorithm in the E recombination scheme, $y_{ij}$, when combining $i$ jets to $j = (i - 1)$ jets. For the six-jets final state $Y_{54}$ and $Y_{65}$ (see Figure 4(c)) are found to be effective, while $Y_{76}$ and $Y_{87}$ are used for the eight-jets final state. Isolated leptons are removed prior to the jet clustering;
• the number of identified isolated electrons and muons (ILD only);
• the missing transverse momentum, $p_T^{\mathrm{miss}}$. Due to the leptonic W boson decay, finite values of $p_T^{\mathrm{miss}}$ are reconstructed for six-jets signal events, while $p_T^{\mathrm{miss}}$ tends towards zero for eight-jets signal events;
• the visible energy of the event, defined as the scalar sum of all jet energies;
• the masses $M_{12}$, $M_{123}$ and $M_{45}$ as defined in Section VI D.
For the eight-jets final state, the two variables $M_{456}$ and $M_{78}$ as defined in Section VI D are included in addition. The ILD analysis includes the helicity angle of the Higgs candidate, defined as the angle between the two b jet momenta in the dijet rest frame.
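The event thrust is conventionally the maximum over unit vectors of the summed absolute longitudinal momenta, divided by the summed momentum magnitudes. For a handful of jets it can be evaluated exactly by brute force, using the identity that the maximum equals the largest $|\sum_i s_i \vec{p}_i|$ over all sign choices $s_i = \pm 1$. The sketch below is one way to do this; it is not the implementation used in either analysis, and the function name is hypothetical.

```python
import itertools
import math


def thrust(momenta):
    """Return the thrust of a list of (px, py, pz) jet momenta.

    Uses T = max_s |sum_i s_i p_i| / sum_i |p_i| with s_i = +1 or -1.
    The cost grows as 2^(N-1), which is acceptable for six or eight jets.
    """
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    if total == 0.0:
        raise ValueError("need at least one non-zero momentum")
    first, rest = momenta[0], momenta[1:]
    best = 0.0
    # The sign of the first momentum is fixed: a global sign flip leaves |sum| unchanged.
    for signs in itertools.product((1.0, -1.0), repeat=len(rest)):
        sx, sy, sz = first
        for sign, (px, py, pz) in zip(signs, rest):
            sx += sign * px
            sy += sign * py
            sz += sign * pz
        best = max(best, math.sqrt(sx * sx + sy * sy + sz * sz))
    return best / total


# Two back-to-back jets give the maximal value T = 1.
pencil_like = thrust([(0.0, 0.0, 50.0), (0.0, 0.0, -50.0)])

# Six equal jets along the coordinate axes give a much lower value.
spread_out = thrust([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
                     (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
                     (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)])
```

The two examples mirror the qualitative statement in the text: back-to-back topologies such as tt yield thrust values close to one, while events with more evenly distributed jets yield lower values.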
To select events, cuts on the BDT response are applied. The cuts were optimized by maximizing the signal significance given by $S / \sqrt{S + B}$, where $S$ is the number of signal events and $B$ is the number of background events. As an example, the reconstructed top and Higgs masses in six-jets events after the cut on the BDT output are shown in Figure 5. The selection efficiencies (purities) for signal events are 33.1% (27.7%) and 56.0% (25.2%) for the six- and eight-jets analyses in ILD, respectively, and 30.5% (28.9%) and 45.9% (26.7%) in SiD. In Table II the expected yields are shown separately for all investigated final states.
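The optimization of the cut on the BDT response can be sketched as a scan over candidate thresholds, keeping the one that maximizes $S/\sqrt{S+B}$. The example below is a minimal illustration with hypothetical classifier scores and per-event weights; it does not use TMVA and does not reproduce the yields of the study.

```python
import math


def significance(n_sig, n_bkg):
    """Signal significance S / sqrt(S + B); zero if no events are selected."""
    total = n_sig + n_bkg
    return n_sig / math.sqrt(total) if total > 0 else 0.0


def optimize_cut(sig_scores, bkg_scores, sig_weight=1.0, bkg_weight=1.0):
    """Scan thresholds and return (best_cut, best_significance).

    Events with score >= cut are selected. The weights scale the raw counts
    to expected yields, for example to a given integrated luminosity.
    """
    best_cut, best_z = None, -1.0
    for cut in sorted(set(sig_scores) | set(bkg_scores)):
        n_sig = sig_weight * sum(1 for x in sig_scores if x >= cut)
        n_bkg = bkg_weight * sum(1 for x in bkg_scores if x >= cut)
        z = significance(n_sig, n_bkg)
        if z > best_z:
            best_cut, best_z = cut, z
    return best_cut, best_z


# Hypothetical classifier outputs for a few signal and background events.
signal_scores = [0.9, 0.8, 0.7, 0.6]
background_scores = [0.1, 0.2, 0.3, 0.65]
cut, z = optimize_cut(signal_scores, background_scores)
```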
Figure 5: Reconstructed top (a) and Higgs (b) masses for selected six-jets events in the SiD detector. All histograms were normalized to an integrated luminosity of 1 $\mathrm{ab}^{-1}$. The distribution for tt was scaled by a factor of 0.5.

VIII. RESULTS
The cross section can be directly obtained from the number of background-subtracted signal events after the selection.
The uncertainty of the cross section measurement was estimated using the number of selected signal and background events. Assuming an integrated luminosity of 1 $\mathrm{ab}^{-1}$ split equally between the $P(e^-, e^+)$ = (−80%, +20%) and $P(e^-, e^+)$ = (+80%, −20%) beam polarization configurations, the cross section can be measured with a statistical precision of 10-11% using the eight-jets final state and with a statistical precision of ≈ 13% for the six-jets final state.
The uncertainties of the measured cross sections translate to precisions on the top Yukawa coupling of 5-6% and ≈ 7% from the eight- and six-jets final states, respectively. If both measurements are combined, the top Yukawa coupling can be extracted with a statistical precision of better than 4.5%.
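The arithmetic behind these numbers can be checked with a short sketch. It assumes that the relative cross-section uncertainty is $\sqrt{S+B}/S$, that the coupling precision follows from multiplying by κ = 0.52, and that the two final states are combined as uncorrelated measurements with inverse-variance weights. The value 10.5% is simply the midpoint of the quoted 10-11% range, and the function names are hypothetical.

```python
import math

KAPPA = 0.52  # from the text: Delta y_t / y_t = kappa * Delta sigma / sigma


def xsec_precision(n_sig, n_bkg):
    """Relative statistical uncertainty of a background-subtracted cross section."""
    return math.sqrt(n_sig + n_bkg) / n_sig


def yukawa_precision(rel_xsec_unc, kappa=KAPPA):
    """Translate a relative cross-section uncertainty into one on y_t."""
    return kappa * rel_xsec_unc


def combine(precisions):
    """Inverse-variance combination of uncorrelated relative uncertainties."""
    return 1.0 / math.sqrt(sum(1.0 / p ** 2 for p in precisions))


eight_jets = yukawa_precision(0.105)  # assumption: midpoint of the 10-11% range
six_jets = yukawa_precision(0.13)     # approximately 13% from the text
combined = combine([eight_jets, six_jets])
print(f"eight jets: {eight_jets:.1%}, six jets: {six_jets:.1%}, combined: {combined:.1%}")
```

Under these assumptions the combined value falls below 4.5%, consistent with the statement above.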
For 1 $\mathrm{ab}^{-1}$ of data with only $P(e^-, e^+)$ = (−80%, +20%) polarization, this number improves to 4%. The precision for the six-jets final state could be improved further if τ leptons were included in the reconstruction. Additional improvement is also foreseen by employing kinematic fitting. The achieved precision of both analyses indicates that the reconstruction of the investigated final states is limited by confusion when combining the jets to form top, Higgs and W boson candidates rather than by the differences in performance between the two investigated detector concepts.

IX. SYSTEMATIC UNCERTAINTIES
Given the low cross section and relatively clean environment at a $\sqrt{s}$ = 1 TeV ILC, it is expected that the statistical uncertainty of the measurement of the top Yukawa coupling in direct observation dominates over the systematic uncertainty. In the following we estimate the contributions from the main sources of systematic errors to this measurement.
The number of background events in the final selection is comparable to the number of signal events, making the estimation of normalization and shapes of the background an important source of systematic uncertainty. The total cross section is expected to be calculable from theory to very good precision for the ttZ and tt processes. QCD contributions to the ttbb cross section make this value more challenging to compute precisely; in principle the measurement of the gluon splitting rate at relevant energies will provide a handle to estimate its size. A crucial aspect in the estimation of the efficiencies is the accurate modeling of the event selection variables. Here we illustrate how one might arrive at control samples for different background sources in order to estimate the efficiency of each component accurately.
The ttZ final state can be reconstructed in a similar fashion to the ttH final state. For hadronically decaying Z bosons, the number of jets in the final state will be the same as in the ttH analysis. For our nominal integrated luminosities of 0.5 $\mathrm{ab}^{-1}$ for each of the two polarization states, 1400 events are expected for ttH(→ bb) and 800 events are expected for ttZ(→ bb), taking into account the Z → bb branching ratio. Other hadronic decays of the Z boson will have large tt background due to the absence of the two b tags. Including leptonic decays of the Z boson will help increase the sensitivity to this channel. Overall, one can expect that the statistical uncertainty for ttZ will be similar to that of ttH, i.e. at the few percent level.
The large cross section of tt events will allow for detailed systematic studies. While only a certain class of these events may enter the final selection, we estimate that their contribution to the systematic uncertainty on the top Yukawa coupling can be constrained with a precision comparable to that of ttZ.
Other sources of systematic uncertainty such as the luminosity measurement, jet energy scale, and flavor tagging are typically at the 1% level or better for $e^+e^-$ colliders. The uncertainty on BR(H → bb) is not taken into account in our calculation of the top Yukawa coupling from the ttH production cross section. It is expected that this quantity can be measured with a precision of better than 1% using $e^+e^- \to \nu\nu H$ events [31,32].

X. SUMMARY
The physics potential for a measurement of the top Yukawa coupling at 1 TeV at the ILC is investigated. The study is based on detailed detector simulations using both the SiD and ILD detector concepts. Beam-induced backgrounds are considered in the analysis. The combination of results obtained for two different final states leads to a statistical uncertainty on the top Yukawa coupling of better than 4.5% for an integrated luminosity of 0.5 $\mathrm{ab}^{-1}$ with the $P(e^-, e^+)$ = (−80%, +20%) beam polarization configuration and 0.5 $\mathrm{ab}^{-1}$ with $P(e^-, e^+)$ = (+80%, −20%) polarization. If 1 $\mathrm{ab}^{-1}$ of data were recorded with only the $P(e^-, e^+)$ = (−80%, +20%) beam polarization configuration, the expected precision would improve to 4%.
The results from the studies presented in this paper demonstrate the robustness of the physics reconstruction of high jet multiplicity final states at $\sqrt{s}$ = 1 TeV under realistic simulation conditions. The expected precisions for measurements of the top Yukawa coupling were found to be very similar for two different detector concepts.