Search for a heavy Higgs boson decaying into a $Z$ boson and another heavy Higgs boson in the $\ell\ell bb$ and $\ell\ell WW$ final states in $pp$ collisions at $\sqrt{s}$ = 13 TeV with the ATLAS detector

A search for a heavy neutral Higgs boson, $A$, decaying into a $Z$ boson and another heavy Higgs boson, $H$, is performed using a data sample corresponding to an integrated luminosity of 139 fb$^{-1}$ from proton-proton collisions at $\sqrt{s}$ = 13 TeV recorded by the ATLAS detector at the LHC. The search considers the $Z$ boson decaying into electrons or muons and the $H$ boson into a pair of $b$-quarks or $W$ bosons. The mass range considered is 230-800 GeV for the $A$ boson and 130-700 GeV for the $H$ boson. The data are in good agreement with the background predicted by the Standard Model, and therefore 95% confidence-level upper limits for $\sigma \times B(A\rightarrow ZH) \times B(H\rightarrow bb$ or $H\rightarrow WW)$ are set. The upper limits are in the range 0.0062-0.380 pb for the $H\rightarrow bb$ channel and in the range 0.023-8.9 pb for the $H\rightarrow WW$ channel. An interpretation of the results in the context of two-Higgs-Doublet models is also given.


Introduction
The addition of a second Higgs doublet leads to five Higgs bosons after electroweak symmetry breaking. The phenomenology of such a model is very rich and depends on many parameters, such as the ratio of the vacuum expectation values of the two Higgs doublets (tan ) and the Yukawa couplings of the scalar sector [18]. When CP conservation is assumed, the model contains two CP-even Higgs bosons, ℎ and with > ℎ , one that is CP-odd, , and two charged scalars, ± . There have been many searches for a CP-even Higgs boson at the LHC, in channels that include → / [25 -30] and → ℎℎ [31, 32], as well as dedicated searches for the heavy CP-odd Higgs boson, as in the → ℎ channel [33,34]. Some 2HDM searches are agnostic with respect to whether the heavy Higgs bosons are CP-even or CP-odd, for example searches in the / → / 1 channels [35][36][37]. In the interpretation of this last category of channels it is usually assumed that both heavy Higgs bosons are degenerate in mass, a hypothesis that is motivated in certain supersymmetric models [20]. Finally, there have been searches for signatures that explicitly assume different masses for the heavy Higgs bosons, for example searches in the → → ℓℓ /ℓℓ channels [38][39][40].
The case in which the heavy Higgs bosons have different masses, in addition to being in an allowed part of the parameter space, is further motivated by electroweak baryogenesis scenarios in the context of the 2HDM [41][42][43][44]. For 2HDM electroweak baryogenesis to occur, the requirement > is favoured [41] for a strong first-order phase transition to take place in the early universe. The boson mass is also constrained to be less than approximately 800 GeV, whereas the lighter CP-even Higgs boson, ℎ, is required to have properties similar to those of a SM Higgs boson and is assumed to be the Higgs boson with a mass of 125 GeV that was discovered at the LHC [41]. Under such conditions and for large parts of the 2HDM parameter space, the CP-odd Higgs boson, , decays into [41,45]. At the LHC, the production of the boson in the relevant 2HDM parameter space proceeds mainly through gluon-gluon fusion and in association with -quarks ( -associated production).
This search for → decays uses proton-proton collision data at √ = 13 TeV corresponding to an integrated luminosity of 139 fb −1 recorded by the ATLAS detector at the LHC. The search considers → ℓℓ, where ℓ = , , to take advantage of the clean leptonic final state. The boson is studied in the → and → decay channels. The → channel takes advantage of the high branching ratio in large parts of the 2HDM parameter space, especially in the weak decoupling limit, where the boson decays into weak vector bosons are suppressed. The → decay channel is considered in the case where both bosons decay hadronically. This heavy Higgs boson decay is dominant in parts of the 2HDM parameter space close to, but not exactly at, the weak decoupling limit [41] and it provides a new way to look for ℓℓ resonances in a final state that has been less explored by other LHC searches. Both final states considered allow full reconstruction of the boson's decay kinematics. This search considers both the gluon-gluon fusion (see Figure 1(a)) and -associated production mechanisms (see Figure 1(b)) for the → → ℓℓ channel. The -associated production mode of the → → ℓℓ channel is theoretically allowed, but leads to more complicated jet combinatorics and would necessitate changing the event reconstruction strategy. For this reason, only the gluon-gluon fusion production mode (see Figure 1(c)) is considered here.
This article is organised as follows. Section 2 introduces the ATLAS detector. A description of the collision and simulated data samples used in this article is given in Section 3. The algorithms used to reconstruct the objects used in this search are described in Section 4. The event selection and background estimates for the two channels considered and the modelling of the signal are discussed in Sections 5 and 6, respectively. Section 7 is devoted to the description of the systematic uncertainties. The results are discussed in Section 8 and the conclusions are given in Section 9.

ATLAS detector
The ATLAS experiment [46] at the LHC is a general-purpose particle detector with cylindrical geometry and forward-backward symmetry. It includes an inner-detector tracker surrounded by a 2 T superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer with a toroidal magnetic field. The inner detector consists of a high-granularity silicon pixel detector, including the insertable B-layer [47,48], a silicon microstrip tracker, and a straw-tube tracker. It provides precision tracking of charged particles with pseudorapidity | | < 2.5. 2 The calorimeter system covers the pseudorapidity range | | < 4.9. It is composed of sampling calorimeters with either lead/liquid-argon, steel/scintillator-tiles, copper/liquid-argon or tungsten/liquid-argon as the absorber/sensitive material. The muon spectrometer provides muon identification and momentum measurement for | | < 2.7. A two-level trigger system [49] is employed to select events to be recorded at an average rate of about 1 kHz for offline analysis.

Data and simulated event samples
The data used in this search were collected between 2015 and 2018 from √ = 13 TeV proton-proton collisions and correspond to an integrated luminosity of 139 fb −1 [50-53], which includes only data-taking periods where all relevant detector subsystems were operational [54]. The data sample was collected using a set of single-muon [55] and single-electron triggers [56]. The single-muon triggers had T thresholds in the range of 20-26 GeV for isolated muons and 50 GeV for muons without any isolation requirement. The single-electron triggers employed a range of T thresholds in the range 24-300 GeV and a combination of quality and isolation requirements depending on the data-taking period and the T threshold.
Simulated signal events with bosons produced by gluon-gluon fusion were generated at leading order (LO) with M G 5_aMC@NLO 2.3.3 [57,58], using P 8.210 [59] with a set of tuned parameters called the A14 tune [60] for parton showering. The decays of → and were considered. Additionally, in the → → ℓℓ channel, bosons produced in association with -quarks were generated at next-to-leading-order (NLO) with M G 5_aMC@NLO 2.1.2 [58,61,62] following Ref. [63] together with P 8.212 and the A14 tune for parton showering. The gluon-gluon fusion production used NNPDF2.3lo [64] as the parton distribution functions (PDFs), while the -associated production used CT10nlo_nf4 [65]. The signal samples were generated for bosons with masses in the range of 230-800 GeV (300-800 GeV) and widths up to 20% of the mass, and for narrow-width bosons with masses in the range of 130-700 GeV (200-700 GeV) for the ℓℓ (ℓℓ ) channel.
Background events from the production of and bosons in association with jets were simulated with S v2.2.1 [66] using NLO matrix elements (ME) for up to two partons, and LO matrix elements for up to four partons calculated with the Comix [67] and OpenLoops [68,69] libraries. They were matched with the S parton shower [70] using the MEPS@NLO prescription [71][72][73][74] using the set of tuned parameters developed by the S authors. The NNPDF3.0nnlo set of PDFs [75] was used and the samples were normalised to a next-to-next-to-leading-order (NNLO) prediction [76]. Production of , and pairs was simulated using the same generator and parameters as for the and boson samples.
The production of¯events was modelled using the P -B v2 [77][78][79][80] generator at NLO with the NNPDF3.0nlo [75] PDF set and the ℎ damp parameter 3 set to 1.5 top [81]. The events were interfaced to P 8.230 to model the parton shower, hadronisation, and underlying event, with parameters set according to the A14 tune and using the NNPDF2.3lo set of PDFs. The decays of bottom and charm hadrons were performed by E G v1.6.0 [82]. The associated production of a single top quark and boson ( ) and single top production in the -channel were modelled using the P -B v2 [78-80, 83, 84] generator at NLO in QCD using the five-flavour scheme and the NNPDF3.0nlo set of PDFs. The diagram removal scheme [85] was used to remove interference and overlap with¯production in the case of production. The production of¯events was modelled using the M G 5_aMC@NLO v2.3.3 generator at NLO with the NNPDF3.0nlo PDF set. The events were interfaced to P 8.210 using the A14 tune and the NNPDF2.3lo PDF set. The decays of bottom and charm hadrons were simulated using the E G v1.2.0 program.
Finally, SM Higgs boson production in association with a vector boson was simulated using P [78][79][80]86] and interfaced with P 8.186 [87] for parton shower and non-perturbative effects. The P prediction is accurate to NLO for the ℎ boson plus one jet production. The loop-induced → ℎ process was generated separately at LO. The PDF4LHC15 PDF set [88] and the AZNLO tune [89] of P 8.186 were used. The simulation prediction was normalised to cross sections calculated at NNLO in QCD with NLO electroweak corrections for¯/ → ℎ and at NLO and next-to-leading-logarithm accuracy in QCD for → ℎ [90][91][92][93][94][95][96].
The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) was modelled by overlaying the original hard-scattering event with simulated inelastic proton-proton events generated with P 8.186 using the NNPDF2.3lo set of PDFs and the A3 tune [97]. The simulated events were weighted to reproduce the distribution of the average number of interactions per bunch crossing ( ) observed in the data. The value in the simulation was rescaled by a factor of 1.03 ± 0.07 to improve agreement between data and simulation in the visible inelastic proton-proton cross section [98]. All generated background samples were passed through the G 4-based [99] detector simulation [100] of the ATLAS detector. The ATLFAST-II simulation [100] was used for the signal samples to allow for the generation of many different and boson masses. The simulated events were reconstructed in the same way as the data.

Object reconstruction
Selected events are required to contain at least one vertex having at least two associated tracks with T > 500 MeV, and the primary vertex is chosen to be the vertex reconstructed with the largest Σ 2 T of its associated tracks.
Electrons are reconstructed from energy clusters in the electromagnetic calorimeter that are matched to tracks in the inner detector [101]. Electrons are required to have | | < 2.47 and T > 7 GeV. The associated track must have | 0 |/ 0 < 5 and | 0 | sin < 0.5 mm, where 0 ( 0 ) is the transverse (longitudinal) impact parameter relative to the primary vertex and 0 is the error in 0 . To distinguish electrons from jets, isolation and quality requirements are applied. The quality requirements refer to both the inner detector track and the calorimeter shower shape. The isolation requirements are defined using tracking and calorimeter measurements. Electrons used in this search satisfy the 'Loose' quality and isolation requirements.
Muons are reconstructed by matching tracks reconstructed in the inner detector to tracks or track segments in the muon spectrometer [102]. Muons used for this search must have | | < 2.5, T > 7 GeV, | 0 |/ 0 < 3, and | 0 | sin < 0.5 mm. They are also required to satisfy 'Loose' isolation requirements, similar to those used for electrons, as well as 'Loose' quality criteria for tracks in the inner detector and muon spectrometer [103].
Jets containing -hadrons are identified using a multivariate tagging algorithm ( -tagging) [109,110], which makes use of track impact parameters and reconstructed secondary vertices. The -tagging algorithm output is used to define a criterion to select jets originating from -quark hadronisation for jets with | | < 2.5. The jets that are selected in this way are referred to as -jets in the following. The criterion in use has an average efficiency of 70% for jets from -quarks in simulated¯events, with rejection factors of 8.9, 36 and 300 for jets initiated by -quarks, hadronically decaying -leptons and light-flavour quarks or gluons, respectively [110].
Electrons, muons and jets are reconstructed and identified independently. When those objects are spatially close, these algorithms can lead to ambiguous identifications. An overlap removal procedure [111] is therefore applied to remove ambiguities.
The missing transverse momentum, whose magnitude is denoted by miss T , is computed as the negative vectorial sum of the transverse momenta of calibrated leptons and jets, plus an additional soft term constructed from all tracks that originate from the primary vertex but are not associated with any identified lepton or jet [112,113].

Event selection and background estimation
The final states for the → → ℓℓ / decays feature a pair of oppositely charged, same-flavour leptons and either two -jets or four mostly light-flavour jets from the bosons decays. Three resonances can be formed by combining the selected objects: (i) the boson (ℓℓ), (ii) the boson ( or → 4 ), and (iii) the boson ( system).
Events are required to contain exactly two muons or two electrons. The two muons must have opposite electric charges. This requirement is not applied to electrons, since they have a non-negligible charge misidentification rate due to conversions of bremsstrahlung photons. The highest-T lepton must satisfy

The ℓℓ final state
The events that are used for the → → ℓℓ search are required to have at least two -jets, with at least one of them having T > 45 GeV. The two highest-T -jets of the event form the → system candidate. The boson candidate is formed by these two -jets and, in addition, the two leptons that are matched to the boson.
The requirement of a same-flavour lepton pair along with several -jets implies that the signal region is contaminated by boson production in association with jets and backgrounds including top quarks, likep roduction. The presence of neutrinos in semileptonic top-pair production provides a handle to reduce this background by requiring miss T / √ T < 3.5 GeV 1/2 , where T is the scalar sum of the T of all jets and leptons in the event. The +jets background is reduced by requiring where ℓℓ is the four-body invariant mass of the two-lepton, two--jet system assigned to the boson candidate and the summation is performed over the 2 T of these objects. This discriminating variable is chosen because the signal distributions are similar across the two production mechanisms and the and range used in the search. The requirement is optimised separately for each signal hypothesis and, subsequently, an average value is chosen. The distribution of the √︃ Σ 2 T / ℓℓ variable is shown in Figure 2 separately for the cases where exactly two -jets and three or more -jets are present in the event. The distribution is shown before the The two signal production mechanisms, gluon-gluon fusion and -associated production, differ mainly in the number of heavy-flavour jets that are produced in association with the boson. This motivates a categorisation based on the number of -jets present in the event. In particular, two categories are defined: the = 2 category, which contains events with exactly two -jets, and the ≥ 3 category, which contains events with three or more -jets. For gluon-gluon fusion production, more than 95% of the events passing the above selection fall into the = 2 category. For -associated production, only 25-35% of the selected events fall into the ≥ 3 category, and the others enter the = 2 category. This is because of the relatively soft T spectrum of the associated -jets and the geometric acceptance of the tracker.
Finally, the invariant mass of the -jets that are assigned to the boson must be compatible with the assumed boson mass. This is ensured by requiring to be within optimised boundaries that depend on the assumed : 0.85 · − 20 GeV < < + 20 GeV for the = 2 category, and 0.85 · − 25 GeV < < + 50 GeV for the ≥ 3 category. The wider window for ≥ 3 is motivated by a slightly poorer resolution due to potential -jet misassignments. The -jets that are matched to the boson are the highest-T -jets in the event and, hence, in the case of -associated production, where more -jets are present, may not be the ones that actually come from the → decay. In -associated production, the fraction of bosons for which the correct -jets are chosen is in the range 50-90% for the ≥ 3 category and is at least 65% for the = 2 category.
The signal efficiency in the = 2 category after the window requirement is 5.1-11% (2.5-6.6%) for gluon-gluon fusion ( -associated) production, depending on the and values. Similarly, the efficiency in the ≥ 3 category after the window requirement is 1.3-3.2% for -associated production. The quoted numbers refer to the efficiencies for bosons decaying into , with → / / and → , to pass the event selection for each of the categories. The inclusion of → in this definition lowers the quoted signal efficiency because these decays have a very small efficiency to pass in this selection (which aims at → / ). The signal region selection is summarised in Table 1. Σ 2 T / ℓℓ distributions shown before the requirement on this variable is applied for events with (a) exactly two -jets and (b) three or more -jets. Corrections from a fit to the data are applied to the simulation, as described in Sections 5.1 and 8. The signal distribution for ( , ) = (600, 300) GeV is also shown, and is normalised such that the production cross section times the branching ratios ( → ) and ( → ) corresponds to 1 pb. The signal shown includes only bosons produced in association with -quarks. The lower panel shows the ratio of the data to the background prediction (black filled circles) and the relative uncertainty, which includes both statistical and systematic components, in the background prediction (hatched area). The notations , and ℎ refer to top-pair production in association with a vector boson, diboson production and SM Higgs boson production in association with a vector boson, respectively. The production of a boson in association with jets is split based on jet flavour. The notation Z+(bb,bc,cc,bl) refers to the case where the jets originate from heavy flavour, which includes at least one jet originating from a -quark or two jets originating from -quarks, whereas the notation Z+(cl,l) includes all the remaining cases.
The ℓℓ distribution after the requirement is the final discriminating variable, which is fitted to obtain the result of the search in this channel. To improve the ℓℓ resolution, the system's four-momentum components are scaled to match the assumed boson mass and the ℓℓ system's four-momentum components are scaled to match the boson mass. This procedure, performed after the event selection, improves the ℓℓ resolution by a factor of two without significantly distorting the background distributions, resulting in an boson mass resolution that is at best about 1% and up to 4% for gluon-gluon fusion, up to 10% for -associated production in the = 2 category and up to 16% for -associated production in the ≥ 3 category, depending on the and values.
Despite the dedicated selection criteria against +jets and top-quark production, these background processes dominate the signal region: the +jets contribution is ∼60-70% depending on the category, while the top-quark contribution is ∼30-35%. In the ≥ 3 category, other processes (¯, dibosons, ℎ) contribute up to ∼5% of the total background, while their contribution to the = 2 category is less than 1%. The accurate determination of +jets and top-quark contributions is paramount for the sensitivity of this search. Their estimation employs a combination of data-driven corrections to simulated events.
The most abundant background in this channel is from +jets production. The normalisation of this process is constrained by a control region defined by inverting the window criterion for each Single-electron or single-muon trigger Exactly 2 leptons ( or ) ( T > 7 GeV) with the leading one having T > 27 GeV Opposite electric charge for pairs; 80 GeV < ℓℓ, < 100 GeV, ℓ = , At least 2 -jets ( T > 20 GeV) with one of them having T > 45 GeV Exactly 2 -tagged jets At least 3 -tagged jets boson mass hypothesis (see also Table 1). The control regions are distinct for the = 2 and ≥ 3 categories, since the accuracy of the background simulation depends on the number of -jets present in the event. The modelling of the +jets simulated events is examined extensively in a number of kinematic variables, including the T of the boson ( T ), the distribution and the √︃ Σ 2 T / ℓℓ distribution. The simulated distributions of these variables are compared against data in a control region that requires two jets with exactly one of them being a -jet, as well as an early selection stage, before the window and √︃ Σ 2 T / ℓℓ requirements. For this early selection stage, it was verified that even those signals that were already excluded in Ref.
[39] would be washed out by the background and would not bias the results. These regions are not used in the likelihood fit described in Section 8 and thus they are not included in Table 1. As a result of these studies, corrections to the distributions of T , and √︃ Σ 2 T / ℓℓ in the simulated +jets events are applied. The corrections are found to be uncorrelated and they are applied sequentially. The most significant effect on the sensitivity of this search (see also Section 7) is due to the corrections to the modelling of the T distribution, which range from +5% to −10% for most of the +jets events. As an example, Figure 3 compares the T distributions in data with the background model after all corrections used in this search for events that satisfy all the requirements of the signal region with the exception of the window requirement, separately for = 2 and ≥ 3 categories.
Top-quark production is heavily dominated by¯production in which both top quarks decay semileptonically. Therefore, it is possible to define a pure top-quark control region by keeping the same selection as discussed previously, apart from an opposite-flavour lepton criterion, i.e. an pair is required instead of an or pair (see also Table 1). This region is used for top-pair production normalisation, and also to check that kinematic distributions such as the top-quark T spectrum are adequately modelled in simulation. Different control regions are used in the = 2 and ≥ 3 categories. This is because in the ≥ 3 category the top-quark background is dominated by top-quark pair production in association with jets, which is more difficult to model than the inclusive top-quark pair production that dominates the top-quark background in the = 2 category. Finally, the window requirement is also applied to the top-quark control region, resulting in a separate control region for each hypothesis tested in the search. Good agreement within uncertainties is observed between data and simulation in the shape of all variables considered. Backgrounds from diboson, single top-quark, and SM Higgs boson production, as well as¯production in association with a vector boson are minor contributions to the total background composition. The shapes of their distributions are taken from simulation, whereas they are normalised using precise inclusive cross sections calculated from theory. The diboson samples are normalised using NNLO cross sections [114][115][116][117]. Single-top-quark production and top-quark-pair production in association with vector bosons are normalised to NLO cross sections from Refs. [118][119][120] and Ref. [58], respectively. The normalisation of SM Higgs boson production in association with a vector boson follows the recommendations of Ref. [63] using NNLO QCD and NLO electroweak corrections.

The ℓℓ final state
The decay → → ℓℓ features a pair of electrons or muons and four jets from the hadronic boson decays. The selected events are required to have at least four jets with the highest-and second-highest-T jets satisfying T > 40 GeV and T > 30 GeV, respectively. In addition, the lowest-T electrons or muons are required to have T > 15 GeV.
The selection of the correct jet pairs in the reconstruction of the two boson candidates is important for improving the signal resolution and suppressing backgrounds. For this task, all possible jet pairs that can be formed by considering up to the five highest-T jets in the event are taken into account. A set of requirements on kinematic variables, such as the angular distances between the jets within a pair, the jet transverse momenta and the reconstructed masses of the , and boson candidates, is optimised to test the various combinations for compatibility with the signal hypothesis so that the signal efficiency and background rejection are maximised. This procedure results in a signal efficiency that ranges from 50% to 70% depending on and , whereas for background processes the efficiency is about 40%. The fraction of events in which the correct jet pairs are assigned to the boson candidates after this procedure is in the range from 50% to 70%, depending on the and values.
The main background in this channel is from the production of a boson in association with jets. A criterion similar to that in the ℓℓ channel is employed to discriminate against it: 2ℓ4 is the six-body invariant mass of the two-lepton, four-jet system assigned to the boson and the summation is performed over the 2 T of these objects. This discriminating variable is chosen for the same reasons as in the ℓℓ channel and it is optimised following the same considerations. The distribution of this variable before the requirement is applied is shown in Figure 4. Σ 2 T / 2ℓ4 distribution shown before the requirement on this variable is applied. Corrections from a fit to the data are applied to the simulation, as described in Sections 5.2 and 8. The notation VV in the legend corresponds to the production of diboson events. The signal distribution for ( , ) = (600, 300) GeV is also shown, and is normalised such that the production cross section times the branching ratios ( → ) and ( → ) corresponds to 1 pb. The lower panel shows the ratio of the data to the background prediction (black filled circles) and the relative uncertainty, which includes both statistical and systematic components, in the background prediction (dashed area).
Finally, the invariant mass of the four selected jets, 4 , must be compatible with the assumed boson mass. This is ensured by requiring 4 to be within optimised boundaries that depend on : − 53 GeV < 4 < 0.97 · + 54 GeV. After this requirement the signal efficiency for bosons decaying into with → / / and → → is 6.5-11%, depending on the and values. The signal region selection is summarised in Table 2.
The 2ℓ4 distribution after the 4 requirement is the final discriminating variable, which is fitted to obtain the results of the search in this channel. To improve the 2ℓ4 resolution, the four-jet system's four-momentum components are scaled to match the assumed boson mass and the ℓℓ system's fourmomentum components are scaled to match the boson mass. The final boson mass resolution is in the range from 1% to 17% of , depending on the and values.
The dominant backgrounds after the event selection are from +jets (∼90% of total background), top-quark (∼5%), and diboson (∼5%) production. Smaller backgrounds ( +jets,¯ℎ,¯, and ℎ) contribute less than 1% to the total background and are not included in the background composition. Single-electron or single-muon trigger Exactly 2 leptons ( or ) ( T > 15 GeV) with the leading one having T > 30 GeV Opposite electric charge for pairs; 80 GeV < ℓℓ, < 100 GeV, ℓ = , At least 4 jets ( T > 20 GeV) with leading and second leading jets having T > 40, 30 GeV Jets chosen with a dedicated discriminant √︃ The shape of the +jets background is taken from simulation combined with data-driven corrections, and the normalisation is constrained by the control region outside the 4 mass window of each signal region (see Table 2), using a procedure similar to that in the ℓℓ channel. To address shape differences between distributions of kinematic variables in data and simulated backgrounds, two corrections are applied to the T of the boson candidates and to the leading jet's T . Those corrections are derived from a control region orthogonal to the signal region, obtained by selecting 3. This region is not used subsequently in the likelihood fit described in Section 8 and therefore it is not included in Table 2. The corrections are found to be uncorrelated and they are applied sequentially. The correction to the T distribution in the simulation is as large as 20% at low T values and it becomes smaller as T increases, whereas the correction to the leading jet's T does not exceed ±10%. The distributions of the T of the boson candidates and of the leading jet's T , after the reweighting, are shown in Figure 5 for events satisfying all requirements for the signal region with the exception of the 4 window cut.
The top-quark background shape is taken from simulated events. The normalisation is constrained using a high-purity control region defined by keeping the same selection as for the signal region, but replacing the electron or muon pairs by opposite-flavour leptons ( pairs), as indicated in Table 2. The single-top-quark, +jets and diboson production contributions in this control region are estimated from simulation. The diboson background shape and normalisation are taken from the simulated samples, using the same cross-section calculation as in the ℓℓ channel.

Signal modelling
This analysis searches for two new particles, with their mass hypotheses considered in the two-dimensional space -, with good mass resolution of the and reconstructed final states. The investigation of the relevant phase space requires a large number of signal mass hypotheses to be tested. In addition, various new physics scenarios which are of interest for this search, like the 2HDM, include bosons with natural widths comparable to, or larger than, the experimental mass resolution for large parts of the parameter space in which this search has sensitivity. The bosons are considered to always have negligible natural width, in accordance with the 2HDM scenarios used to interpret this search (see Section 8). For these reasons, the Both the EGE and DSCB functions have a Gaussian core but they differ in the way the tails are treated. The tails of the EGE function are exponential, described by two parameters, whereas DSCB has power-law tails described by four extra parameters. The values of the function parameters are extracted from unbinned maximum-likelihood fits to the simulated ℓℓ and 2ℓ4 distributions. Polynomial functions are used to interpolate the parameters to mass points that were not simulated. These interpolated parametric functions are used to model the signal mass shapes for all the signal assumptions considered in this search. The fit uncertainties of the DSCB and EGE function parameters, as well as the parameters of the polynomial functions used for the interpolation, are used to derive a shape uncertainty for each of the interpolated distributions.
A typical example of the result of the signal parameterisation is shown in Figure 6 for the ( , ) = (500, 300) GeV mass point. The figure shows a comparison of the simulated mass distribution and the interpolated parametric function, as well as the shape variation that is taken as an estimate of the systematic uncertainty from the procedure. In general, the cores of the ℓℓ and 2ℓ4 distributions are well-parameterised by the chosen functional forms. There are some small differences between the function description and the simulated distribution in the tails of the distributions, but those have negligible effects on the final results and they are covered by interpolation uncertainties.  Figure 6: Signal ℓℓ or 2ℓ4 distributions assuming = 500 GeV and = 300 GeV for the following cases: ℓℓ channel: (a) gluon-gluon fusion in the = 2 category, (b) -associated production in the = 2 category, and (c) -associated production in the ≥ 3 category; (d) ℓℓ channel. In the upper panels, the black filled circles correspond to the simulated distributions, which are compared against the interpolated parameterised signal distributions shown as solid red curves. Also in the same panels, the shape variations of the interpolated parameterised signal distributions are shown in dotted blue (+1 ) and black (−1 ) lines. In the lower panels, the black filled circles correspond to the ratio of the simulation to the interpolated parameterised curve. The dotted blue (black) line corresponds to the ratio of the +1 (−1 ) shape variation of the interpolated curve to the interpolated curve.
The parameterisation procedure described in the previous paragraph is modified to allow for cases where the width of the boson is comparable to, or larger than, the experimental mass resolution. This can be modelled by convolving a modified Breit-Wigner distribution 4 with the EGE or DSCB function. This procedure is valid as long as the width of the boson remains narrow relative to the experimental resolution, which is the case for the 2HDM scenarios considered in Section 8. Widths of up to approximately 20% of the boson mass are considered, which is the range relevant to the sensitive parameter space of the 2HDM scenarios that are of interest for this search.
Finally, the signal efficiencies for the interpolated mass points are obtained through separate two-dimensional interpolations on the ( , ) plane using thin-plate splines [123].

Systematic uncertainties
Several sources of systematic uncertainty in the signal and background estimates are considered, including experimental and theoretical sources. Experimental uncertainties comprise those in the luminosity measurement [124] (obtained using the LUCID-2 detector [53]), trigger, object identification, energy/momentum scale and resolution as well as underlying-event and pile-up modelling [98,102,103,107]. These uncertainties impact the simulations of signal and background processes.
The signal and background modelling have associated theoretical uncertainties. For the signal modelling, the uncertainties due to the factorisation and renormalisation scale choice, the initial-and final-state radiation treatment and the PDF choice are considered. No additional signal modelling uncertainties related to model-specific cross-section predictions, such as the 2HDM predictions used in Section 8, are considered. The renormalisation and factorisation scales are varied up and down separately by a factor of two, and the largest deviation from the nominal signal is taken as the estimated uncertainty. The uncertainties due to initial-and final-state radiation as well as the multiple parton interaction modelling are estimated using a subset of A14 tuning variations [60]. PDF uncertainties are computed using the prescription from PDF4LHC15 [88], which include the envelope of three PDF sets, namely CT14, MMHT2014 and NNPDF3.0.
Additional systematic uncertainties are assigned to cover the differences in signal efficiencies and ℓℓ and ℓℓ resolution differences between the interpolations and the simulations, as shown by the dotted blue and black lines in the lower panels of Figure 6.
For the background modelling, the most important sources of systematic uncertainty are the modelling of shapes of several kinematic distributions of +jets events. In the ℓℓ channel, they arise from the shape corrections for the T ,

√︃
Σ 2 T / ℓℓ and variables described in Section 5.1. An uncertainty is estimated by comparing the corrections and the agreement between the background prediction and the data for various variables and among various control regions. For each of the corrections, the applied uncertainty is half the size of the correction in the = 2 category, and the full size of the correction in the ≥ 3 category. In the ℓℓ channel, the uncertainties are due to the shapes of the T and leading-jet T distributions (Section 5.2). The uncertainty is estimated similarly to that in the ℓℓ channel and is half the size of the correction. For other background processes, modelling uncertainties are obtained by varying the factorisation and renormalisation scales, and the amount of initial-and final-state radiation.
The effect of these systematic uncertainties on the search is studied using a signal-strength parameter for hypothesised signal production (see also Section 8). The uncertainties found to have the largest impact depend on the choice of ( , ) signal point. Table 3 shows the relative uncertainties in the value from the leading sources of systematic uncertainty for two example mass points of gluon-gluon fusion and -associated production for the ℓℓ channel. The uncertainties are evaluated using an Asimov dataset [125] generated with the signal cross section set to the expected limits for the particular ( , ) signal point, considering a narrow-width boson. Table 4 shows the same information for the ℓℓ channel. The leading sources of systematic uncertainty are similar for other mass points studied and for larger boson widths.
For the ℓℓ channel, the most relevant sources of systematic uncertainty are the background modelling, the signal interpolation, and the jet energy scale and resolution. The limited size of the simulated samples has a higher impact at low masses, since at higher masses other sources are more dominant. Other systematic uncertainties with non-negligible impact include those associated with -tagging and theoretical errors. In the ℓℓ channel, the most relevant systematic uncertainties are those related to the jet energy scale and resolution, as expected in a channel with four jets in the final state. The limited size of the simulated samples, the background modelling and the signal interpolation also have a non-negligible impact on the signal-strength parameter. In both channels, the data statistical uncertainties have lower impact at low masses compared to the systematic uncertainties. In addition, the search sensitivity is affected at high masses by the limited size of the data sample, an effect which is more pronounced in the ℓℓ channel.    Table 4: The effect of the most important sources of uncertainty on the signal-strength parameter at two example mass points of ( , ) = (500, 300) GeV and ( , ) = (700, 200) GeV in the ℓℓ channel for gluon-gluon fusion production of a narrow-width boson. The same notation as in Table 3

Results
The ℓℓ and 2ℓ4 distributions are expected to exhibit a resonant structure if signal events are present, while background events result in a smoothly falling spectrum. Therefore, those are chosen as the final variables to discriminate between signal and background. The shape differences between the signal and background contributions in the ℓℓ and 2ℓ4 distributions are exploited through binned maximumlikelihood fits of the signal-plus-background hypotheses to extract potential signal contributions. The fits are based on the statistical framework described in Refs. [125][126][127]. For a given mass hypothesis of ( , ), the likelihood is constructed as the product of Poisson probabilities for event yields in the ℓℓ or 2ℓ4 bins: where is the number of observed events, and ( , , ì ) and ( ì, ì ) are the expected number of signal events and estimated number of background events in bin . The vector ì represents free background normalisation scale factors (described later) and the vector ì denotes all non-explicitly listed parameters of the likelihood function such as nuisance parameters associated with systematic uncertainties. Systematic uncertainties are incorporated in the likelihood as nuisance parameters with either Gaussian or log-normal constraint terms, denoted by ( ì ) in the formula above. The parameter of interest, , is a multiplicative factor applied to the expected signal rate. The ℓℓ and 2ℓ4 bin widths are chosen according to the expected detector resolution and taking into account the statistical uncertainty in the number of simulated background events. The bin centres are adjusted such that at least 65% of the test signal is contained in one bin.
For each bin, is calculated from the total integrated luminosity, the assumed cross section times branching ratio for the signal and its selection efficiency. The sum of all background contributions in the bin, , is estimated from simulation, which includes the modelling corrections discussed in Sections 5.1 and 5.2. The number of events in the¯and +jets control regions is included in the likelihood calculation to constrain their normalisation in the signal regions. This is achieved by introducing two free normalisation scale factors per channel, represented by ì in the likelihood description earlier in this section. In the ℓℓ channel these scale factors apply to the¯contribution and the heavy-flavour component of the +jets contribution, whereas the rest of the contributions in the control region are estimated from simulation. In the ℓℓ channel the scale factors apply to the¯contribution and the flavour-inclusive +jets contribution. Typical values of the scale factors are close to unity with the exception of +jets in the ℓℓ channel, which is scaled by a factor of 1.2, and¯in the ℓℓ ≥ 3 category, which is typically scaled by a factor of 1.4.
The signals that are fitted in each category are motivated by signal efficiency considerations and the interpretation of the search in the context of the 2HDM. In the ℓℓ channel the following fits are performed. First, bosons produced by gluon-gluon fusion are considered in the = 2 category. Second, a combined fit for the -associated production mechanism in both the = 2 and ≥ 3 categories is performed. Finally, there is a combination of the -associated production fit with the gluon-gluon fusion fit, which is interpreted in the context of the 2HDM. In the ℓℓ channel, only bosons produced by gluon-gluon fusion are considered and, hence, it is the only fit that is considered.

→ → ℓℓ results
The ℓℓ distributions from different mass windows are scanned for potential excesses beyond the background expectations through signal-plus-background fits. The scan is performed in steps of 10 GeV for both the range 230-800 GeV and the range 130-700 GeV, such that − ≥ 100 GeV. The step sizes are chosen to be compatible with the detector resolution for ℓℓ and . In total, there are 58 windows that are probed for the = 2 and ≥ 3 categories. The overall number of ( , ) signal hypotheses that are tested is 1711 per category. Figure 7 shows the distribution of the boson candidate mass before the window requirement in each of the two categories. Typical examples of ℓℓ distributions after the application of the window requirement are shown in Figures 8(a)-8(d). In particular, the window defined for = 300 GeV is shown in Figures 8(a) and 8(b) for the = 2 and ≥ 3 categories, respectively. On the same figures, a signal distribution is shown as well, which corresponds to gluon-gluon fusion production in Figure 8(a) and -associated production in Figure 8 In all cases, the data are found to be well described by the background model. The most significant excess for the gluon-gluon fusion production signal assumption is at the ( , ) = (610, 290) GeV signal point, for which the local (global) significance [128] is 3.1 (1.3) standard deviations. For -associated production, the most significant excess is at the ( , ) = (440, 220) GeV signal point, for which the local (global) significance is 3.1 (1.3) standard deviations. The significances are calculated for each production process separately, ignoring the contribution from the other.
In the absence of any statistically significant excess, the results of the search in this channel are interpreted as upper limits on the production cross section of an boson decaying into followed by the → decay, × ( → ) × ( → ). The cross-section upper limits consider bosons that are produced only by a single mechanism, i.e. either gluon-gluon fusion or -associated production. Modified frequentist [129] 95% confidence level (CL) upper limits on the production cross section of this process are obtained using the asymptotic approximation [125] for the various signal hypotheses that are tested. In particular, expected and observed upper limits for gluon-gluon fusion production of narrow-width bosons in the = 2 category are shown in Figures 9(a) and 9(b), respectively. For -associated production of narrow-width bosons, the expected and observed limits for the combination of the = 2 and ≥ 3 Upper limits are also calculated for signal assumptions where the natural width of the boson is large in comparison with the experimental mass resolution, which is needed for the interpretation of the search in the context of the 2HDM. The cross-section upper limit decreases as the natural width of the boson increases. In particular, a gluon-gluon produced boson with a natural width of 10% of its mass has a cross-section upper limit that is reduced on average by a factor of approximately 3 from the narrow-width case. This factor becomes approximately 4 when the natural width increases to 20%. The bosons from -associated production have worse experimental mass resolution and the deterioration of the limit is on average smaller: the upper limits are reduced by a factor of about 1.9 (2.3) for a natural width of 10% (20%).
The results for boson natural widths that are comparable to, or larger than, the experimental mass resolution are used for the interpretation of the search in the context of the CP-conserving 2HDM. The 2HDM benchmark against which the search results are compared has three free parameters: , and tan . In addition, there are four ways to assign the Yukawa couplings to fermions, defining type-I, type-II, lepton-specific and flipped 2HDMs. The remaining parameters are fixed. The mass of the lightest Higgs boson in the model is fixed to 125 GeV and its couplings are set to be the same as those of the SM Higgs boson by choosing cos( − ) = 0 [19], which is known as the 2HDM weak decoupling limit. The charged Higgs boson is assumed to have the same mass as the boson and the potential parameter 2 12 [18] is  fixed to 2 tan /(1 + tan 2 ).
The cross sections for boson production in the 2HDM are calculated using corrections at up to NNLO in QCD for gluon-gluon fusion and -associated production in the five-flavour scheme as implemented in SusHi [130][131][132][133]. For -associated production a cross section in the four-flavour scheme is also calculated as described in Refs. [134,135] and the results are combined with the five-flavour scheme calculation following Ref. [136]. The Higgs boson widths and branching ratios are calculated using 2HDMC [137]. The procedure for the calculation of the cross sections and branching ratios, as well as for the choice of 2HDM parameters, follows Ref. [63].
The interpretation of the search in the 2HDM is performed in the ( , ) plane, as shown in Figure 10. In this plot, colour-shaded areas indicate expected and observed exclusions for various tan values. There is one plot for each of the four 2HDM types. For the type-I and lepton-specific 2HDMs, only gluon-gluon fusion production is relevant. The exclusion region reaches 350 GeV for tan = 1 and the sensitivity decreases for larger tan values. In type-I 2HDM for instance, for tan = 10 the exclusion reaches 320 GeV and 500 GeV. The limiting value at 350 GeV is due to the drop of the →   branching ratio, which competes with →¯at larger values. The type-II and flipped 2HDMs are dominated by bosons from -associated production as tan increases, although gluon-gluon fusion is still important for tan ≈ 1. Like the type-I and lepton-specific 2HDMs, the type-II and flipped 2HDMs provide similar constraints because they only differ in the lepton Yukawa couplings. The contribution from -associated signal production increases the sensitivity at large tan values, excluding 650 GeV for tan = 20. The search sensitivity deteriorates at lower tan values, excluding 350 GeV for tan = 1.

→ → ℓℓ results
The 2ℓ4 distributions from different 4 mass windows are scanned for possible excesses using a procedure similar to the one in the ℓℓ channel. The scan is performed in steps of 10 GeV for both the range 300-800 GeV and the range 200-700 GeV, such that − ≥ 100 GeV. This gives in total 51 4 mass windows and the overall number of ( , ) signal hypotheses that are tested is 1326. Figure 11 shows the distribution of the boson candidate mass 4 before the 4 window requirement.  Using the same method as for the ℓℓ channel, constraints on the production of → followed by → decay are derived. The 95% CL upper limits are shown in Figure 13 for a narrow-width boson produced via gluon-gluon fusion. The upper limit varies from 0.023 pb for the ( , ) = (770, 660) GeV signal point to 8.9 pb for the ( , ) = (340, 220) GeV signal point. This is to be compared with the corresponding expected limits of 0.041 pb and 3.6 pb for these two signal points. The upper limits deteriorate when the natural width of the boson is comparable to, or larger than, the experimental mass resolution. In particular, for a natural width that is 10% of the upper limits decrease on average by a factor of 3. This factor becomes approximately 5 when the natural width increases to 20%.
The sensitivity of the ℓℓ channel in the context of the CP-conserving 2HDM was examined. The same 2HDM calculations as in the ℓℓ channel are used and the only differences are related to the parameter space of the model that is probed. In particular, because only bosons produced by gluon-gluon fusion are studied in this search, only type-I and lepton-specific 2HDMs are considered. In addition, the partial width Γ( → ) vanishes when cos( − ) = 0 and is maximal at | cos( − )| = 1, whereas for the partial width Γ( → ) the opposite is true, i.e. it vanishes when | cos( − )| = 1 and it is maximal when cos( − ) = 0. These observations imply that this channel should be most sensitive between these two extreme values of | cos( − )|.
The interpretation of the observed and expected upper limits on the cross section times branching ratio in the context of the type-I and lepton-specific 2HDM scenarios show that the ℓℓ channel has little sensitivity in regions that are not already excluded by the 125 GeV Higgs boson coupling measurements [3], an analysis that also provides similar limits in this parameter space. In particular, for the range considered in this channel, there is sensitivity up to < 250 GeV and for tan shown in Figure 14 for the type-I 2HDM. The results are very similar for the lepton-specific 2HDM, since the only difference between the two 2HDM types is the lepton Yukawa couplings, which only affect the total width.

Conclusion
Data recorded by the ATLAS experiment at the LHC, corresponding to an integrated luminosity of 139 fb −1 from proton-proton collisions at a centre-of-mass energy 13 TeV, are used to search for a heavy Higgs boson, , decaying into , where denotes another heavy Higgs boson with mass > 125 GeV. Two final states were considered, where the boson decays into a pair of -quarks or bosons, and in both cases the boson decays into a pair of electrons or muons. In the ℓℓ channel, the boson is assumed to be produced via either gluon-gluon fusion or -associated production. In the ℓℓ channel, only gluon-gluon fusion production is considered. No significant deviation from the SM background predictions is observed in the → ℓℓ and → ℓℓ → ℓℓ final states that are considered in this search. Considering each channel and each production process separately, upper limits are set at the 95% confidence level for × ( → ) × ( → or → ). For ℓℓ , upper limits are set in the range 6.2-380 fb for gluon-gluon fusion and 6.8-210 fb for -associated production of a narrow boson in the mass range 230-800 GeV, assuming the boson is in the mass range 130-700 GeV. For ℓℓ , the observed upper limits are in the range 0.023-8.9 pb for gluon-gluon fusion production of a narrow boson in the mass range 300-800 GeV, assuming the boson is in the mass range 200-700 GeV. Taking into account both production processes, the ℓℓ search tightens the constraints on the 2HDM scenario in the case of large mass splittings between its heavier neutral Higgs bosons. The ℓℓ channel has not been explored previously at the LHC, and this search explicitly demonstrates its potential to constrain 2HDM parameters away from the weak decoupling limit.