In a large part of the 2HDM(I) parameter space, the branching ratio of H±→AW± dominates. The possible decay modes of the A boson and the W± lead to many possible H+H−→AW+∗AW−∗ event topologies. Above \(m_{\mathrm{A}}\approx 12\ \mbox{GeV}\), the A boson decays predominantly into a \(\mathrm {b} \bar {\mathrm {b}}\) pair, and thus its detection is based on b-flavor identification. Two possibilities, covering 90 % of the decays of two W±, are considered: quark pairs from both W± bosons or a quark pair from one and a leptonic final state from the other. The event topologies are therefore “eight jets” or “six jets and a lepton with missing energy”, with four jets containing b-flavor in both cases.
The background comes from several Standard Model processes. ZZ and W+W− production can result in multi-jet events. While ZZ events can contain true b-flavored jets, W+W− events are selected as candidates when c-flavored jets fake b-jets. Radiative QCD corrections to \(\mathrm{e}^{+}\mathrm{e}^{-}\to\mathrm{q}\bar{\mathrm{q}}\) also give a significant contribution to the expected background.
Due to the complexity of the eight-parton final state, it is more efficient to use general event properties and variables designed specifically to discriminate against the main background than a full reconstruction of the event. As a consequence, no attempt is made to reconstruct the charged Higgs-boson mass.
The analysis proceeds in two steps. First a preselection is applied to select b-tagged multi-jet events compatible with the signal hypothesis. Then a likelihood selection (with three event classes: signal, four-fermion background and two-fermion background) is applied.
The preselection of multi-jet events uses the same variables as the search for the hadronic final state in [16] with optimized cut positions. However, it introduces a very powerful new criterion, especially against the W+W− background, on a combined b-tagging variable (\({{\mathcal{B}}_{\mathrm{evt}}}\)) requiring the consistency of the event with the presence of b-quark jets.
The neural network method used for b-tagging in the OPAL SM Higgs-boson search [24] is used to calculate on a jet-by-jet basis the discriminating variables \(f^{i}_{\mathrm{c/b}}\) and \(f^{i}_{\mathrm{uds/b}}\). These are constructed for each jet i as the ratios of probabilities for the jet to be c- or uds-like versus the probability to be b-like. The inputs to the neural network include information about the presence of secondary vertices in a jet, the jet shape, and the presence of leptons with large transverse momentum. The Monte Carlo description of the neural network output was checked with LEP1 data with a jet energy of about 46 GeV. The main background in this search at LEP2 comes from four-fermion processes, in which the mean jet energy is about 50 GeV, very close to the LEP1 jet energy; therefore, an adequate modeling of the background is expected with the events reconstructed as four jets.
The AW+∗AW−∗ signal topology depends on the Higgs-boson masses. At \(m_{\mathrm{A}}\approx 12\ \mbox{GeV}\) or \(\mbox {$m_{\mathrm{A}}$}\approx m_{\mathrm{H}^{\pm}}\), the available energy in the A or W± system is too low to form two clean, collimated jets. At high \(m_{\mathrm{H}^{\pm}}\), the boost of the A and W± bosons is small in the laboratory frame and the original eight partons cannot be identified. At low \(m_{\mathrm{H}^{\pm}}\), the A and W± bosons might have a boost, but it is still not possible to resolve correctly the two partons from their decay. From these considerations, one can conclude that it is not useful to require eight (or even six) jets in the event, as these jets will not correspond to the original partons. Consequently, to get the best possible modeling of the background, four jets are reconstructed with the Durham jet-finding algorithm [46–49] before the b-tagger is run.
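The forced four-jet reconstruction can be illustrated with a minimal sketch of Durham-type clustering. It uses the standard Durham measure \(y_{ij}=2\min(E_i^2,E_j^2)(1-\cos\theta_{ij})/E_{\mathrm{vis}}^2\) with E-scheme recombination. The function names, the input format and the toy event are assumptions made for illustration and are not taken from the OPAL software. The value recorded at each merge only approximates the jet-resolution parameter, since the experimental definition refers to the resolution at which the jet multiplicity changes.

```python
import math


def durham_y(p_i, p_j, e_vis):
    """Durham measure y_ij = 2 min(E_i, E_j)^2 (1 - cos theta_ij) / E_vis^2."""
    mag_i = math.sqrt(sum(c * c for c in p_i[1:]))
    mag_j = math.sqrt(sum(c * c for c in p_j[1:]))
    if mag_i == 0.0 or mag_j == 0.0:
        return 0.0  # degenerate momentum: treat as zero distance
    cos_ij = sum(a * b for a, b in zip(p_i[1:], p_j[1:])) / (mag_i * mag_j)
    cos_ij = max(-1.0, min(1.0, cos_ij))  # guard against rounding
    return 2.0 * min(p_i[0], p_j[0]) ** 2 * (1.0 - cos_ij) / e_vis ** 2


def cluster_to_jets(particles, n_jets=4):
    """Cluster (E, px, py, pz) tuples and return (jets, y_flip).

    jets is the configuration with exactly n_jets objects;
    y_flip[(n - 1, n)] is the y value of the merge taking n objects to n - 1.
    """
    if len(particles) < n_jets:
        raise ValueError("fewer particles than requested jets")
    jets = [tuple(p) for p in particles]
    e_vis = sum(p[0] for p in jets)
    y_flip = {}
    selected = list(jets) if len(jets) == n_jets else None
    while len(jets) > 1:
        n = len(jets)
        # Find the pair with the smallest Durham measure.
        y, i, j = min(
            (durham_y(jets[a], jets[b], e_vis), a, b)
            for a in range(n)
            for b in range(a + 1, n)
        )
        y_flip[(n - 1, n)] = y
        merged = tuple(u + v for u, v in zip(jets[i], jets[j]))  # E-scheme sum
        jets = [jets[k] for k in range(n) if k not in (i, j)] + [merged]
        if len(jets) == n_jets:
            selected = list(jets)
    return selected, y_flip


# Toy event (hypothetical): four hard particles plus one soft collinear one.
event = [
    (10.0, 10.0, 0.0, 0.0),
    (10.0, -10.0, 0.0, 0.0),
    (10.0, 0.0, 10.0, 0.0),
    (10.0, 0.0, -10.0, 0.0),
    (1.0, 1.0, 0.0, 0.0),
]
four_jets, y_flip = cluster_to_jets(event)
log10_y34 = math.log10(y_flip[(3, 4)])
```

In this toy event the soft collinear particle is absorbed first, so the four-jet configuration coincides with the four hard directions.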
The flavor-discriminating variables are combined for the four reconstructed jets by
$$ \mathcal{B}_{\mathrm{evt}} = \frac{1}{1+\alpha\cdot\prod_i f^i_\mathrm{c/b} + \beta \cdot\prod_i f^i_\mathrm{uds/b}}. \tag{1} $$
The index i runs over the reconstructed jets (i=1,…,4) and the parameters α and β are numerical coefficients whose optimal values depend on the flavor composition of the signal and background final states. However, since the expected sensitivity of the search is only slightly dependent on the values of α and β, they are fixed at α=0.1 and β=0.7. Events are retained if \(\mbox {${{\mathcal{B}}_{\mathrm{evt}}}$}>0.4\).
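Equation (1) and the preselection cut can be evaluated with a few lines of code. The sketch below uses the quoted values α=0.1, β=0.7 and the cut at 0.4. The function names and the example values of the flavor-discriminating variables are hypothetical and serve only to illustrate the behavior of the variable.

```python
import math

ALPHA = 0.1      # coefficient of the c/b term, as quoted in the text
BETA = 0.7       # coefficient of the uds/b term, as quoted in the text
B_EVT_CUT = 0.4  # events are retained if B_evt exceeds this value


def b_evt(f_cb, f_udsb, alpha=ALPHA, beta=BETA):
    """Combined b-tagging variable of Eq. (1) for four reconstructed jets.

    f_cb and f_udsb hold one value per jet: the ratio of the probability for
    the jet to be c-like (uds-like) to the probability to be b-like.
    """
    if len(f_cb) != 4 or len(f_udsb) != 4:
        raise ValueError("exactly four jets are expected")
    return 1.0 / (1.0 + alpha * math.prod(f_cb) + beta * math.prod(f_udsb))


def passes_b_tag(f_cb, f_udsb):
    """Apply the preselection requirement B_evt > 0.4."""
    return b_evt(f_cb, f_udsb) > B_EVT_CUT


# Hypothetical jets: small ratios mean b-like, large ratios mean light-like.
b_like = b_evt([0.1] * 4, [0.1] * 4)
light_like = b_evt([2.0] * 4, [2.0] * 4)
```

Because the ratios enter as products over the four jets, a single strongly b-like jet can pull the event towards \(\mathcal{B}_{\mathrm{evt}}\approx 1\), while four light-like jets push it towards zero.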
The preselections of the two event topologies (8j and 6j+ℓ) are very similar. However, in the 6j+ℓ channel, no kinematic fit is made to the \(\mathrm{W}^{+}\mathrm{W}^{-}\to\mathrm{q}\bar{\mathrm{q}}\mathrm{q}\bar{\mathrm{q}}\) hypothesis and, therefore, no cuts are made on the fit probabilities. No charged lepton identification is applied; instead the search is based on indirect detection of the associated neutrino by measuring the missing energy.
After the preselection the observed data show an excess over the predicted Monte Carlo background. This can partly be explained by the apparent difference between the gluon splitting rate into \(\mathrm{c}\bar{\mathrm{c}}\) and \(\mathrm{b}\bar{\mathrm{b}}\) pairs in the data and in the background Monte Carlo simulation. The measured rates at \(\sqrt{s}=91\ \mbox{GeV}\) are \(g_{\mathrm{c}\bar{\mathrm{c}}} = 3.2 \pm0.21 \pm0.38\ \%\) [50] and \(g_{\mathrm{b}\bar{\mathrm{b}}} = 0.307 \pm0.053 \pm0.097\ \%\) [51] from the LEP1 OPAL data. The gluon splitting rates in our Monte Carlo simulation are extracted from \(\mathrm{e}^{+}\mathrm{e}^{-}\to\mathrm{ZZ}\to\ell^{+} \ell^{-}\mathrm{q}\bar{\mathrm{q}}\) events, where the \(\mathrm{Z}\to \mathrm{q}\bar{\mathrm{q}}\) decays have similar kinematic properties to the ones in the LEP1 measurement. Note that \(\mathrm{e}^{+}\mathrm{e}^{-}\to\mathrm{ZZ}\to \mathrm{q}\bar{\mathrm{q}}\mathrm{q}\bar{\mathrm{q}}\) events cannot be used as the two \(\mathrm {q}\bar {\mathrm {q}}\) pairs interact strongly with each other. The rates are found to be \(g_{\mathrm{c}\bar{\mathrm{c}}}^{\mathrm{MC}} = 1.33 \pm0.06\ \%\) and \(g_{\mathrm{b}\bar{\mathrm{b}}}^{\mathrm{MC}} = 0.116 \pm0.0167\ \%\), averaged over all center-of-mass energies. This mismodeling can be compensated by reweighting the SM Monte Carlo events with gluon splitting to heavy quarks and at the same time deweighting the non-split events to keep the total numbers of W+W−, ZZ and two-fermion background events fixed at generator level. The reweighting factor is 2.41 for \(\mathrm{g}\to \mathrm{c}\bar{\mathrm{c}}\) and 2.65 for \(\mathrm{g}\to \mathrm{b}\bar{\mathrm{b}}\). The same reweighting factors are used for W+W−, ZZ and two-fermion events with gluon splitting at all LEP2 energies, noting that all background samples were hadronized with the same settings and assuming that the \(\sqrt{s}\) dependence of the gluon splitting of a fragmenting two-fermion system is correctly modeled by the Monte Carlo generator.
It is known that the generator reproduces the energy dependence predicted by QCD in the order \(\alpha_{\mathrm{s}}\) with resummed leading-log and next-to-leading log terms [52]. This correction results in a background enhancement factor of 1.08 to 1.1 after the preselection, depending on the search channel and the center-of-mass energy, but it does not affect the shape of the background distributions.
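The quoted reweighting factors follow from the ratios of the measured to the simulated gluon splitting rates, as the short sketch below shows. The weight of the non-split events is derived here from the stated requirement that the total number of generated events stays fixed. It assumes that the split categories are mutually exclusive and, purely for illustration, uses the simulated rates as the event fractions of a sample. This is not the bookkeeping of the original analysis.

```python
# Rates in percent, as quoted in the text (central values only).
MEASURED = {"cc": 3.2, "bb": 0.307}
SIMULATED = {"cc": 1.33, "bb": 0.116}


def splitting_weights(measured, simulated):
    """Reweighting factor per splitting type: measured rate / simulated rate."""
    return {kind: measured[kind] / simulated[kind] for kind in measured}


def non_split_weight(weights, fractions):
    """Weight for non-split events that keeps the total event count fixed.

    fractions gives the (assumed) fraction of generated events of each split
    type; the categories are taken to be mutually exclusive.
    """
    split_before = sum(fractions.values())
    split_after = sum(weights[kind] * fractions[kind] for kind in fractions)
    return (1.0 - split_after) / (1.0 - split_before)


weights = splitting_weights(MEASURED, SIMULATED)

# Hypothetical sample composition: simulated rates used as event fractions.
fractions = {kind: rate / 100.0 for kind, rate in SIMULATED.items()}
w_rest = non_split_weight(weights, fractions)

# Closure check: the weighted fractions still sum to one.
total = sum(weights[k] * fractions[k] for k in fractions) + w_rest * (
    1.0 - sum(fractions.values())
)
```

Rounded to two decimals, the computed factors agree with the values 2.41 and 2.65 given in the text.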
The numbers of preselected events after the reweighting are given in columns 2 and 3 of Table 4. At this stage of the analysis the 8j and 6j+ℓ data samples are highly overlapping. The observed rates still show an excess over the background predictions, adding up to about 1.6 standard deviations in both samples. Although this difference is statistically not significant, it can be shown that the Monte Carlo prediction has minor imperfections. For the 8j case, the distributions of three variables used in the analysis, namely \(y_{34}\), \(y_{56}\) and \(\mathcal{B}_{\mathrm{evt}}\), are plotted in the right part of Fig. 3. As can be seen, the variable \(y_{56}\) is the most powerful for rejecting the background. Both the \(y_{34}\) and the \(y_{56}\) distributions are slightly shifted towards the position of a hypothetical Higgs signal and the \(\mathcal{B}_{\mathrm{evt}}\) distribution shows an excess over the predicted background at intermediate \(\mathcal{B}_{\mathrm{evt}}\) values, but the excess events are not distributed according to the expectation for a Higgs signal. The shifts are visible with better statistical significance in the left part of Fig. 3. It shows the same variables for a background-enriched data sample, where the preselection cuts on \(y_{34}\) and \(\mathcal{B}_{\mathrm{evt}}\) are dropped, except for the study of the \(y_{56}\) variable where we keep the cut on \(y_{34}\) to select multi-jet events. The resulting samples are completely dominated by background, the contribution of a Higgs signal being at most 0.5 %. Since heavy quark production in the Monte Carlo generator is already corrected, the origin of the discrepancies is likely a slight mismodeling of the topology of multi-jet events, especially if they contain heavy quarks. No further correction is applied to the estimated background. Excess events passing the final selection, even if they do not look signal-like, are thus counted with a certain weight as signal events in the statistical analysis, to be discussed later.
Table 4 Observed data and expected SM background events for each year in the AW+∗AW−∗ searches. The 8j and 6j+ℓ event samples after the preselection step (3rd and 4th columns) are highly overlapping. After the likelihood selection, the overlapping events are removed from the 8j and 6j+ℓ samples and form a separate search channel (last three columns). The uncertainty on the background prediction due to the limited number of simulated events is given. The Monte Carlo reweighting to the measured gluon splitting rates is included
As a final selection, likelihood functions are built to identify signal events. The reference distributions depend on the LEP energy, but they are constructed to be independent of the considered \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) combination. To this end, we form the signal reference distributions by averaging all simulated H+H− samples in the \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) mass range of interest.
Since the selections at \(\sqrt{s}=192\mbox{--}209\ \mbox{GeV}\) are aimed at charged Higgs-boson masses around the expected sensitivity reach of about 80–90 GeV, all masses up to the kinematic limit are included. On the other hand, at \(\sqrt{s}=189\ \mbox{GeV}\) only charged Higgs-boson masses up to 50 GeV are included, since the selections at this energy are optimized to reach down to a charged Higgs-boson mass of 40 GeV, where the LEP1 exclusion limit lies.
The input variables for the 8j final state are: the Durham jet-resolution parameters (see footnote 2) \(\log_{10}y_{34}\) and \(\log_{10}y_{56}\), the oblateness [53] event shape variable, the opening angle of the widest jet defined by the size of the cone containing 68 % of the total jet energy, the cosine of the W production angle multiplied with the W charge (calculated from the jet charges [54]) for the \(\mathrm{e}^{+}\mathrm{e}^{-} \rightarrow \mathrm{W}^{+}\mathrm{W}^{-} \rightarrow \mathrm{q} \bar{\mathrm{q}} \mathrm{q}\bar{\mathrm{q}}\) interpretation, and the b-tagging variable \({{\mathcal{B}}_{\mathrm{evt}}}\). At \(\sqrt{s}=189\ \mbox{GeV}\), \(\log_{10}y_{23}\), \(\log_{10}y_{45}\), \(\log_{10}y_{67}\), and the maximum jet energy are also used. Moreover, the sphericity [55] event shape variable has more discriminating power and thus replaces oblateness. Although the \(y_{ij}\) variables are somewhat correlated, they contain additional information: their differences reflect the kinematics of the initial partons.
The input variables for the 6j+ℓ selection are: \(\log_{10}y_{34}\), \(\log_{10}y_{56}\), the oblateness, the missing energy of the event, and \({{\mathcal{B}}_{\mathrm{evt}}}\). At \(\sqrt{s}=189\ \mbox{GeV}\), \(\log_{10}y_{23}\), the maximum jet energy and the sphericity are also included.
Events are selected if they pass a lower cut on the likelihood output. The likelihood distributions are shown in Fig. 4. The positions of the likelihood cuts are indicated by vertical lines. The discrepancies observed in Fig. 3 in background-enriched samples propagate into the likelihood distributions. Since the excess events in Fig. 3 are shifted relative to the background expectation, but do not agree with the Higgs distribution, they give likelihood values between the mean background and signal values in Fig. 4. With large statistical errors, the effect can be seen at intermediate likelihood values. Some of the excess events pass the final likelihood cut.
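A likelihood selection with three event classes (signal, four-fermion and two-fermion background) can be sketched as follows. The text does not spell out the exact form of the likelihood, so the sketch assumes the common projective approach: for each class the per-variable reference probabilities are multiplied, and the output is the signal product divided by the sum over all classes, with equal class weights. The reference histograms, variable names and bin contents are invented for illustration.

```python
from bisect import bisect_right

CLASSES = ("signal", "four_fermion", "two_fermion")


def pdf_value(hist, x):
    """Return the normalized bin content for x; hist = (edges, contents)."""
    edges, contents = hist
    idx = bisect_right(edges, x) - 1
    idx = max(0, min(len(contents) - 1, idx))  # clamp under/overflow
    return contents[idx]


def signal_likelihood(event, reference):
    """Signal likelihood from per-class products of 1D reference densities."""
    products = {}
    for cls in CLASSES:
        prod = 1.0
        for var, x in event.items():
            prod *= pdf_value(reference[cls][var], x)
        products[cls] = prod
    total = sum(products.values())
    return products["signal"] / total if total > 0.0 else 0.0


# Hypothetical reference histograms for two of the input variables.
Y_EDGES = [-3.0, -2.0, -1.0, 0.0]
B_EDGES = [0.4, 0.6, 0.8, 1.0]
reference = {
    "signal": {
        "log10_y34": (Y_EDGES, [0.1, 0.4, 0.5]),
        "b_evt": (B_EDGES, [0.2, 0.3, 0.5]),
    },
    "four_fermion": {
        "log10_y34": (Y_EDGES, [0.3, 0.5, 0.2]),
        "b_evt": (B_EDGES, [0.5, 0.3, 0.2]),
    },
    "two_fermion": {
        "log10_y34": (Y_EDGES, [0.6, 0.3, 0.1]),
        "b_evt": (B_EDGES, [0.6, 0.3, 0.1]),
    },
}

signal_like = signal_likelihood({"log10_y34": -0.5, "b_evt": 0.9}, reference)
background_like = signal_likelihood({"log10_y34": -2.5, "b_evt": 0.5}, reference)
```

An event would then be accepted if its output exceeds the chosen lower cut. The product form ignores correlations between the input variables, which is a simplification of this sketch.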
To ensure that every event is counted only once in the final analysis, the overlapping 8j and 6j+ℓ event samples, as obtained after the final likelihood cut, are redistributed into three event classes: (i) events exclusively classified as 8j candidates, (ii) events exclusively classified as 6j+ℓ candidates and (iii) events accepted by both selections. If an event falls into class (iii), the larger likelihood output of the two selections is kept for further processing. The final results using the above classification are quoted in Table 4. After all selection cuts, an excess of events appears in the 1999 data sample. The excess (1.9σ) is not statistically significant and it is consistent with the results of the other years. This modified channel definition not only removes the overlap but also increases the efficiency for detecting signal events by considering the cross-channel efficiencies (e.g. the efficiency to select \(\mathrm{H}^{+}\mathrm{H}^{-}\to \mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\) signal by the exclusive 6j+ℓ selection can be as high as 18 %, though it is typically only a few %). The efficiencies are determined independently for all simulated \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) combinations and interpolated to arbitrary \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) by two-dimensional spline interpolation. The behavior of the selection efficiencies depends strongly on the targeted charged Higgs-boson mass range and also varies with the mass difference \(\Delta m=m_{\mathrm{H}^{\pm }}-m_{\mathrm{A}}\). In most cases the overlap channel has the highest efficiency. 
At \(\sqrt {s}=189\ \mbox{GeV}\) and \(m_{\mathrm{H}^{\pm}}=45\ \mbox{GeV}\), it reaches 32 % for the \(\mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\mathrm{b} \bar{\mathrm{b}}\ell\nu_{\ell}\) and 44 % for the \(\mathrm{H}^{+}\mathrm{H}^{-}\to \mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\) signal. At \(\sqrt{s}=206\ \mbox{GeV}\) and \(m_{\mathrm{H}^{\pm}}=90\ \mbox{GeV}\), the overlap efficiency can be as high as 62 % for the \(\mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\mathrm{b} \bar{\mathrm{b}}\ell\nu_{\ell}\) and 71 % for the \(\mathrm{H}^{+}\mathrm{H}^{-}\to \mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\mathrm{b} \bar{\mathrm{b}}\mathrm{q}\bar{\mathrm{q}}\) signal. The exclusive 6j+ℓ selection has efficiencies typically below 20–30 %, while those of the exclusive 8j selection are typically below 10–15 %. Table 5 gives the selection efficiencies at selected \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) points.
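The redistribution of the overlapping samples into three exclusive classes can be written compactly. The sketch below assigns each event to the exclusive 8j class, the exclusive 6j+ℓ class or the overlap class, and keeps the larger of the two likelihood outputs in the overlap case, as described above. The cut values and the function name are hypothetical placeholders, not the values used in the analysis.

```python
def classify_event(lik_8j, lik_6jl, cut_8j=0.9, cut_6jl=0.9):
    """Assign an event to one exclusive class after the likelihood cuts.

    Returns (class_name, likelihood), or (None, None) if the event fails
    both selections. The default cut positions are placeholders.
    """
    passes_8j = lik_8j > cut_8j
    passes_6jl = lik_6jl > cut_6jl
    if passes_8j and passes_6jl:
        # Class (iii): keep the larger of the two likelihood outputs.
        return "overlap", max(lik_8j, lik_6jl)
    if passes_8j:
        return "8j", lik_8j        # class (i)
    if passes_6jl:
        return "6j+l", lik_6jl     # class (ii)
    return None, None


# Hypothetical events given as (8j likelihood, 6j+l likelihood).
events = [(0.95, 0.50), (0.30, 0.92), (0.95, 0.97), (0.10, 0.20)]
counts = {"8j": 0, "6j+l": 0, "overlap": 0}
for lik_8j, lik_6jl in events:
    name, _ = classify_event(lik_8j, lik_6jl)
    if name is not None:
        counts[name] += 1
```

Each accepted event enters exactly one class, so the three counts can be treated as independent search channels.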
Table 5 Signal selection efficiencies in percent for the H±→AW± final states in the different search channels at \(\sqrt{s}=189\ \mbox{and}\ 206\ \mbox{GeV}\) at representative \((\mbox {$m_{\mathrm {H}^{\pm }}$}, \mbox {$m_{\mathrm{A}}$})\) points
The composition of the background depends on the targeted Higgs-boson mass region. In the low-mass selection (\(\sqrt{s}=189\ \mbox{GeV}\)) that is optimized for \(m_{\mathrm{H}^{\pm}}=40\mbox{--}50\ \mbox{GeV}\), the Higgs bosons are boosted and therefore the final state is two-jet-like with the largest background contribution coming from two-fermion processes: they account for 52 % in the exclusive 8j, 80 % in the exclusive 6j+ℓ and 76 % in the overlap channel. On the other hand, in the high-mass analysis (\(\sqrt{s}=192\mbox{--}209\ \mbox{GeV}\)) the four-fermion fraction is dominant: 69 % in the 8j, 56 % in the 6j+ℓ and 70 % in the overlap channel.
Systematic errors arise from uncertainties in the preselection and from mismodeling of the likelihood function. The variables \(y_{34}\) and \({{\mathcal{B}}_{\mathrm{evt}}}\) appear both in the preselection cuts and in the likelihood definition. The total background rate is known to be underestimated after the preselection step. The computation of upper limits on the production cross section, with this background rate subtracted, results in conservative limits, assuming the modeling of the other preselection variables and the signal and background likelihoods to be correct. Therefore, no systematic uncertainty is assigned to the percentage of events passing the \(y_{34}\) and \({{\mathcal{B}}_{\mathrm{evt}}}\) preselection cuts. The systematic errors related to preselection variables other than \(y_{34}\) and \({{\mathcal{B}}_{\mathrm{evt}}}\), evaluated from background enriched data samples, are taken into account.
As already mentioned, the discrepancies shown in Fig. 3 have an impact on the likelihood function. Event-by-event correction routines for the variables \(y_{34}\) and \({{\mathcal{B}}_{\mathrm{evt}}}\) were developed to describe the observed shapes, keeping the normalization above the preselection cuts fixed. The systematic errors were estimated by computing the likelihood for all MC events with the modified values of \(y_{34}\) and \({{\mathcal{B}}_{\mathrm{evt}}}\) and counting the accepted MC events. The systematic errors related to all other reference variables were estimated in the same manner.
Systematic uncertainties also arise due to the gluon splitting correction. The experimental uncertainty on the gluon splitting rate translates into uncertainties on the total background rates. Moreover, there is an uncertainty due to the Monte Carlo statistics of the \(\mathrm{g}\to \mathrm{c}\bar{\mathrm{c}}\) and bb̅ events.
Finally, uncertainties due to the limited number of simulated signal and background events are included. The different contributions are summarized in Table 6. Uncertainties below the 1 % level are neglected.
Table 6 Relative systematic uncertainties in percent for the AW+∗AW−∗ searches. Where two values are given separated by a “/”, the first belongs to the 189 GeV selection and the second to the 192–209 GeV selections. For the signal, the uncertainties due to the limited Monte Carlo statistics are calculated by binomial statistics for a sample size of 500 events and they also depend, via the selection efficiency, on the assumed Higgs-boson masses. N.A. stands for not applicable. The multiplicative gluon splitting correction factors, used to obtain the background-rate estimates as explained in the text, are given in the last line
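The dependence of the signal Monte Carlo statistical uncertainty on the selection efficiency, mentioned in the caption of Table 6, can be made explicit with the binomial formula. The sketch below computes the relative uncertainty \(\sqrt{\varepsilon(1-\varepsilon)/N}/\varepsilon\) for a sample of N=500 generated events. The function name is an assumption, and the efficiencies are the overlap-channel values quoted in the text, used here only as examples.

```python
import math


def relative_binomial_uncertainty(efficiency, n_generated=500):
    """Relative binomial uncertainty on a selection efficiency.

    efficiency is a fraction in (0, 1]; n_generated is the number of
    simulated signal events (500 in the caption of Table 6).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    sigma = math.sqrt(efficiency * (1.0 - efficiency) / n_generated)
    return sigma / efficiency


# Overlap-channel efficiencies quoted in the text, as fractions.
for eff in (0.32, 0.44, 0.62, 0.71):
    rel = relative_binomial_uncertainty(eff)
    print(f"efficiency {eff:.2f}: relative uncertainty {100.0 * rel:.1f} %")
```

The relative uncertainty grows as the efficiency decreases, which is why the entries for the signal depend on the assumed Higgs-boson masses.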