Pulling the Higgs and Top needles from the jet stack with Feature Extended Supervised Tagging

Jet tagging has become an essential tool for new physics searches at the high-energy frontier. For jets that contain energetic charged leptons we introduce Feature Extended Supervised Tagging (FEST) which, in addition to jet substructure, considers the features of the charged lepton within the jet. With this method we build dedicated taggers to discriminate among boosted $H \to \ell \nu q \bar q$, $t \to \ell \nu b$, and QCD jets (with $\ell$ an electron or muon). The taggers have an impressive performance, allowing for overall light jet rejection factors of $10^4-10^5$ for top quark / Higgs boson efficiencies of $0.5$. The taggers are also excellent in the discrimination of Higgs bosons from top quarks and vice versa, for example rejecting top quarks by factors of $100-300$ for Higgs boson efficiencies of $0.5$. We demonstrate the potential of these taggers to improve the sensitivity to new physics, using as an example a search for a new $Z'$ boson decaying into $Z H$ in the fully-hadronic final state.


I. INTRODUCTION
Over the last decade, the Large Hadron Collider (LHC) has been probing the high-energy frontier of particle interactions. With the high luminosity achieved, it has been possible to explore multi-TeV scales, not only in the search for new resonances but also to test production mechanisms at high energy, looking for possible deviations from the predictions of the Standard Model (SM). Being the two most massive SM particles, the Higgs boson and the top quark play a unique role in the search for physics beyond the SM, in particular in probing electroweak symmetry breaking. The Higgs boson mainly decays hadronically or semileptonically (fully leptonic and diphoton decay modes are rare), while the top quark always produces a b quark in its decay. Therefore, when they are produced with a large boost, their decay products merge into a single jet J.
Jet tagging has witnessed tremendous progress in the last decade [1][2][3][4][5] (see Ref. [6] for a review). The goal of the different tagging methods is to distinguish a 'signal' jet resulting from the hadronic decay of a boosted heavy particle, such as a weak W/Z boson, a Higgs boson, or a top quark, from a 'background' quark or gluon jet. The discrimination is done by analysing the jet substructure: while the former jets are multi-pronged (containing two or three quarks, depending on the decaying particle), the latter only have one prong. Jet tagging methods have been extensively used, for instance, in searches for new gauge bosons, scalar and spin-2 particles [7][8][9][10][11][12][13][14][15][16], vectorlike quarks [17][18][19][20] and dark matter [21], as well as in SM measurements [22,23].
Generic supervised taggers have also been developed [24][25][26] aiming to distinguish arbitrary multipronged jets from QCD jets. They have been found capable of separating jets containing 'prompt' (produced in the hard process) non-isolated leptons from QCD jets in which the leptons result from the decay of b, c quarks. However, to the best of our knowledge, no tagger has been specifically developed for jets containing such leptons. (Notice, however, that non-isolated leptons are routinely used as one of the ingredients for b-tagging of jets.) This paper aims to fill that gap. We build upon the previously introduced Mass Unspecific Supervised Tagging (MUST) [26] to develop neural network (NN) taggers which, in addition to jet mass, transverse momentum ($p_T$) and substructure variables, use as input the charged lepton energy fraction $z = E_\ell / E_J$ and the lepton distance from the jet axis in the plane of pseudorapidity ($\eta$) and azimuthal angle ($\phi$), $\Delta R = (\Delta\eta_{\ell J}^2 + \Delta\phi_{\ell J}^2)^{1/2}$. The method introduced here is dubbed Feature Extended Supervised Tagging (FEST). We build dedicated taggers that can discriminate among $H \to \ell\nu q\bar q$, $t \to \ell\nu b$ and QCD jets, treating the $\ell = e, \mu$ cases separately. These two examples are of the highest interest, since there are numerous measurements and searches by the ATLAS and CMS experiments involving top quarks or Higgs bosons in the boosted regime. We note that early work [27,28] pointed out the usefulness of non-isolated leptons for the identification of $t \to \mu\nu b$. The related lepton $p_T$ fraction $z = p_T^\ell / p_T^J$ has been shown [29][30][31] to be very useful to discriminate boosted top quarks from QCD jets. A variant, using the lepton $p_T$ fraction with respect to a sub-jet, was explored in Ref. [32], where a detailed study on lepton isolation was also performed. The electron energy fraction has also been indirectly used in Ref. [33].
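As an illustration of the two lepton features defined above, the following minimal sketch computes $z$ and $\Delta R$ from the lepton and jet kinematics. The function names and the toy numbers are hypothetical and not taken from the analysis code; the only ingredient beyond the definitions in the text is the usual wrapping of the azimuthal difference into $(-\pi, \pi]$.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def lepton_features(lep_e, lep_eta, lep_phi, jet_e, jet_eta, jet_phi):
    """Return (z, delta_r): lepton energy fraction and distance to the jet axis."""
    z = lep_e / jet_e
    deta = lep_eta - jet_eta
    dphi = delta_phi(lep_phi, jet_phi)
    return z, math.hypot(deta, dphi)


# Toy example (hypothetical values): lepton and jet on opposite sides of phi = pi.
z, dr = lepton_features(300.0, 0.10, 3.10, 1000.0, 0.00, -3.10)
```

Without the wrapping, a lepton and a jet axis lying on either side of $\phi = \pm\pi$ would be assigned a spuriously large $\Delta R$.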

II. GENERATING THE EVENT SAMPLES
The Monte Carlo samples used to train and test the NNs are obtained as follows. Boosted Higgs bosons are generated with MadGraph [34], in the SM process $pp \to ZH$, with $Z \to \nu\bar\nu$ and $H \to \ell\nu q\bar q$. For boosted top (anti-)quarks we use $pp \to Zt + Z\bar t$ mediated by a vector flavour-changing $tcZ$ coupling [35], with $Z \to \nu\bar\nu$ and $t \to \ell\nu b$. For these processes the top flavour-changing neutral interactions are implemented in FeynRules [36] and interfaced to MadGraph5 using the universal FeynRules output [37]. QCD jets are generated in the inclusive process $pp \to jj$, with j a light jet (not including b quarks). A possible extension could include $b\bar b$ in the training too; however, the tagger trained on light jets has excellent performance for b jets, as is explicitly seen in the example presented in Section VI.
Event samples are generated in 100 GeV bins of $p_T$ starting at [300, 400] GeV, and up to [2.1, 2.2] TeV in the case of QCD samples. For Higgs bosons and top quarks the jet $p_T$ is actually smaller than the $p_T$ of the decaying heavy particle, due to the missing neutrino. Therefore, we extend the generation up to the [2.9, 3.0] TeV and [3.4, 3.5] TeV bins, respectively. This guarantees coverage of the entire jet $p_T$ range up to 2.2 TeV. Even though within each bin the events mainly populate the lower end of the interval, the bins are narrow enough to adequately parameterise the $p_T$ dependence. For testing purposes, $b\bar b$ samples are generated using the same $p_T$ binning.
The parton-level event samples so generated are passed through Pythia [38] for hadronisation and Delphes [39] for a fast detector simulation, using the CMS card. Jets are reconstructed with FastJet [40] applying the anti-$k_T$ algorithm [41] with radius $R = 0.8$, and groomed with Recursive Soft Drop [42]. In the subsequent analysis we only keep jets with groomed mass $m_J \in [40, 170]$ GeV and $p_T \geq 400$ GeV. The chosen mass range encompasses the jet mass distributions for top quark and Higgs boson jets, and the latter cut is imposed in order to have a sufficient boost for top quarks, so that their decay products are contained within an $R = 0.8$ jet. We also require that the jets contain a charged lepton with $p_T \geq 10$ GeV within a distance $\Delta R = 0.8$ of the jet axis. As discussed in the Appendix, the overall selection efficiencies for jet preselection plus tagging are nearly independent of this mild lower cut. For top and Higgs high-$p_T$ jets, the leptons are already very energetic and the lepton $p_T$ threshold has little influence. On the other hand, for QCD jets a higher threshold at preselection significantly lowers the efficiency. However, the NNs eventually learn that leptons are much softer for QCD jets, and a lower preselection efficiency is compensated by a higher mistag rate by the tagger.
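The jet preselection just described can be summarised in a few lines. This is only a sketch of the cuts quoted in the text; the function name, the argument names and the convention of passing `None` for a jet without a lepton are assumptions made for illustration.

```python
def passes_preselection(jet_mass, jet_pt, lepton_pt, lepton_dr,
                        mass_range=(40.0, 170.0), min_jet_pt=400.0,
                        min_lepton_pt=10.0, max_dr=0.8):
    """Apply the jet preselection quoted in the text (masses and momenta in GeV).

    lepton_pt and lepton_dr are None when the jet contains no charged lepton
    (a convention chosen for this sketch).
    """
    if lepton_pt is None or lepton_dr is None:
        return False
    low, high = mass_range
    return (low <= jet_mass <= high
            and jet_pt >= min_jet_pt
            and lepton_pt >= min_lepton_pt
            and lepton_dr <= max_dr)


# Toy jets (hypothetical values).
good_jet = passes_preselection(120.0, 950.0, 180.0, 0.15)
soft_lepton_jet = passes_preselection(120.0, 950.0, 6.0, 0.15)
no_lepton_jet = passes_preselection(120.0, 950.0, None, None)
```

Raising `min_lepton_pt` to 20 GeV reproduces the alternative threshold studied in the Appendix.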
We note that the requirement to contain a lepton, even with a threshold as low as $p_T \geq 10$ GeV, has a very low efficiency for the QCD jet samples. With our simulation we find that, for example, for the sample with $p_T \in [1, 1.1]$ TeV at the partonic level the efficiencies to find an electron or a muon above this threshold are 0.041 and 0.020, respectively. Therefore, huge samples of dijet events are needed to have sufficient statistics: $4 \times 10^5$ events per $p_T$ bin for NN training and validation, and $6 \times 10^5$ for testing, totaling 19 million jj pairs.
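The binning and the total dijet sample size quoted above can be cross-checked with a short calculation. The helper name `pt_bins` is hypothetical; the bin edges and per-bin event counts are those given in the text.

```python
def pt_bins(first_low, last_low, width=100):
    """List of (low, high) p_T bins in GeV, from first_low up to the bin starting at last_low."""
    return [(low, low + width) for low in range(first_low, last_low + width, width)]


qcd_bins = pt_bins(300, 2100)    # [300, 400] GeV ... [2.1, 2.2] TeV
higgs_bins = pt_bins(300, 2900)  # extended up to [2.9, 3.0] TeV
top_bins = pt_bins(300, 3400)    # extended up to [3.4, 3.5] TeV

# 4e5 events per bin for training and validation plus 6e5 for testing.
events_per_bin = 4 * 10**5 + 6 * 10**5
total_dijet_events = len(qcd_bins) * events_per_bin
```

With 19 QCD bins and $10^6$ events per bin, the total is the 19 million dijet events quoted in the text.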

III. BUILDING THE TAGGERS
Jet substructure is characterised by the set of N-subjettiness variables proposed in [5], listed in Eq. (1) and computed for the ungroomed jets. By means of a principal component analysis, it can be seen that the number of physically relevant combinations is actually smaller. Still, because computational speed is not a serious issue, we keep the above set. As done in Ref. [26], we include as NN inputs the jet mass, but varying over a narrower range $m_J \in [40, 170]$ GeV, and the jet $p_T \in [0.4, 2.2]$ TeV. Moreover, as previously pointed out, for these taggers we also include the lepton energy fraction $z$ and $\Delta R$ with respect to the jet. A standardisation of the 21 inputs, based on the SM background distributions, is performed to improve the NN learning.
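A minimal sketch of the input standardisation is given below, assuming the common choice of subtracting the mean and dividing by the standard deviation of each input, both computed on the background sample only. The text does not specify the exact prescription, so this choice, the use of the population standard deviation and the function names are assumptions. Two toy inputs are used instead of the full set of 21.

```python
import statistics


def fit_standardiser(background_rows):
    """Per-input mean and standard deviation, computed on background jets only."""
    columns = list(zip(*background_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def standardise(row, means, stds):
    """Shift and scale one jet's inputs; constant inputs are mapped to zero."""
    return [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]


# Toy background sample with two inputs per jet: (m_J, p_T) in GeV (hypothetical values).
background = [[80.0, 500.0], [100.0, 700.0], [120.0, 900.0]]
means, stds = fit_standardiser(background)

# The same transformation is then applied to jets of every class.
signal_jet = standardise([125.0, 1000.0], means, stds)
scaled_background = [standardise(row, means, stds) for row in background]
```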
Our goal is to simultaneously discriminate among jets corresponding to Higgs bosons, top quarks and light quarks / gluons. Therefore, we build NNs whose inputs are the aforementioned variables for jets corresponding to the three classes (H, t, j). The NN output is a list of three numbers $(p_1, p_2, p_0)$, with $p_1 + p_2 + p_0 = 1$, giving the probabilities that a jet corresponds to the H, t or j class, respectively. The NNs contain two hidden layers of 512 and 64 nodes, with Rectified Linear Unit (ReLU) activation for the hidden layers and a softmax function for the outputs. The NNs are optimised with the categorical cross-entropy loss function, using the Adam [44] optimiser. Two independent NNs are built, for $\ell = e$ and $\ell = \mu$, using Keras [45] with a TensorFlow backend [46]. The training sets for the e (µ) NN contain 6000 (5000) events from each class (H, t, j) and $p_T$ slice, totaling around $3 \times 10^5$ training events. The validation sets used to monitor the NN performance have similar size and composition as the training ones.
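To make the output layer and the loss concrete, the sketch below implements the softmax and the categorical cross-entropy in plain Python for a single jet. It is not the Keras model used in the analysis; the logits are toy values, and the parameter count assumes fully connected layers with bias terms, which the text does not state explicitly.

```python
import math


def softmax(logits):
    """Map raw network outputs to probabilities that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Loss for one jet: minus the log-probability assigned to the true class."""
    return -math.log(probs[true_index])


CLASSES = ("H", "t", "j")  # output ordering (p1, p2, p0) used in the text

# Toy logits for a jet that the network considers Higgs-like.
p1, p2, p0 = softmax([2.0, 0.5, -1.0])
loss = categorical_cross_entropy([p1, p2, p0], CLASSES.index("H"))

# Trainable parameters of a 21 -> 512 -> 64 -> 3 network,
# assuming dense layers with one bias per node.
layer_sizes = [21, 512, 64, 3]
n_params = sum((n_in + 1) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
```

Under these assumptions the network has 44291 trainable parameters, small enough that the training set size quoted above is comfortably larger than the number of weights.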
For testing, we build additional two-class NNs to discriminate between (i) H and j; (ii) t and j; (iii) H and t, using the same architecture except for the loss function, for which we use the (binary) cross-entropy, and the output layer, which only contains one node with a sigmoid activation function. These NNs are trained using only the events corresponding to the two classes (H, j), (t, j) or (H, t), respectively. Furthermore, we also build NNs using only the jet mass and $p_T$, and the charged lepton $z$ and $\Delta R$ as input, to investigate to what extent the jet substructure variables contribute to the discrimination.
Let us finally mention here some checks concerning the NN architecture. We have not found any performance improvement when doubling the size of the first hidden layer. In previous work [26] we also verified that including higher-order $\tau_n^{(\beta)}$ does not improve the tagger discrimination. We also investigated the possibility of using unbalanced samples in the training, or other generalised loss functions such as the one proposed in [47], without noticeable improvements.

IV. TAGGER PERFORMANCE
We test the ability of our taggers to discriminate between different pairs of classes, marginalising over the third one. Figure 1 shows the receiver operating characteristic (ROC) curves for H versus j (top), t versus j (middle) and H versus t (bottom). In all plots, the horizontal axis gives the tagging efficiency $\varepsilon$ for a given type of jet, and the vertical axis the rejection $\varepsilon^{-1}$ for another type of jet. In H versus t we consider t as 'background' because Higgs boson production is not usually a background for top quark measurements, but the discrimination can be performed either way. Figure 2 shows the rejection factors $\varepsilon^{-1}$ for fixed efficiencies of 0.7, as a function of the jet $p_T$. The efficiencies are evaluated within intervals $[p_T - 200, p_T + 200]$ GeV and plotted as a function of the central value $p_T$. We also include here lines corresponding to the discrimination against b-quark jets, which have not been used in the NN training. As can be readily seen, the discrimination of both H and t jets from b jets is excellent, and likely sufficient to reject backgrounds involving b quarks.
The tagger rejection for QCD jets is impressive. Furthermore, let us remind the reader that the test samples, for which the ROC curves in Fig. 1 and rejection factors in Fig. 2 are computed, are composed of jets that already pass the preselection requirement of a charged lepton with $p_T \geq 10$ GeV. For QCD jets, the efficiency of this lepton requirement is quite small (see the Appendix). For a given overall H efficiency $\bar\varepsilon_H$, the overall QCD jet rejection $\bar\varepsilon_j^{-1}$ is straightforwardly calculated as follows:
• By dividing the selected overall efficiency $\bar\varepsilon_H$ by the preselection efficiency (either for electrons or for muons) we get an H tagging efficiency $\varepsilon_H$, to which corresponds a j rejection $\varepsilon_j^{-1}$.
• Then, dividing $\varepsilon_j^{-1}$ by the preselection efficiency for QCD jets (either for electrons or muons), we obtain the overall QCD jet rejection factor $\bar\varepsilon_j^{-1}$.
For example, the preselection efficiencies for $H \to \ell\nu q\bar q$ jets with $p_T \in [1, 1.1]$ TeV are 0.61 and 0.91 in the electron and muon channel, respectively. For QCD jets, they are 0.041 and 0.020. Therefore, considering jets with $p_T \sim 1$ TeV and an overall efficiency $\bar\varepsilon_H = 0.5$, the corresponding light jet rejection factors $\bar\varepsilon_j^{-1}$ follow from these two steps; the electron-channel chain is quoted in the Appendix.
Similar comments can be made regarding the top jet discrimination from QCD jets. The preselection efficiencies for t jets with $p_T \in [1, 1.1]$ TeV are 0.73 and 0.80 in the electron and muon channel, respectively. Therefore, for an overall t efficiency $\bar\varepsilon_t = 0.5$, the corresponding light jet rejection factors $\bar\varepsilon_j^{-1}$ are
e : $\varepsilon_t = 0.68 \to \varepsilon_j^{-1} = 2000 \to \bar\varepsilon_j^{-1} = 4.8 \times 10^4$
µ : $\varepsilon_t = 0.62 \to \varepsilon_j^{-1} = 11000 \to \bar\varepsilon_j^{-1} = 5.5 \times 10^5$
As expected, the QCD jet rejection is much larger than for the top fully-hadronic decay. For reference, NN taggers for the hadronic top quark decay mode have a light jet rejection factor of 500 for a top tagging efficiency of 0.5, working in the same $p_T$ range [48,49]. (Note that neither these taggers nor the FEST tagger presented here use b tagging to identify top quarks.) Of course, the figures are not directly comparable because they refer to different decay modes. A meaningful comparison can be made considering the improvement in the $S/\sqrt{B}$ ratio (with S standing for signal and B for background) brought by the different taggers, also taking into account the branching ratio for the hadronic and leptonic modes, which defines the figure of merit of Eq. (2). With this figure of merit, one can see that tagging the top semileptonic decays with FEST offers much better prospects to probe for new physics. The discrimination between H and t jets is also excellent, as seen in the lower panel of Fig. 1, and this is of high importance because top quark production may constitute a background to Higgs boson measurements, as will be seen in the $Z' \to ZH$ example presented in the following.
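The two-step conversion between tagger-level and overall quantities described in this section amounts to two divisions, sketched here for the top quark numbers quoted above. The function names are hypothetical; the tagger rejections (2000 and 11000) are the values read off the ROC curves in the text, not computed here.

```python
def tagger_efficiency(overall_efficiency, signal_preselection_eff):
    """Step 1: tagging efficiency needed to reach a given overall signal efficiency."""
    return overall_efficiency / signal_preselection_eff


def overall_rejection(tagger_rejection, background_preselection_eff):
    """Step 2: overall rejection, including the background preselection efficiency."""
    return tagger_rejection / background_preselection_eff


# Top quark jets with p_T in [1, 1.1] TeV, overall efficiency 0.5.
eps_t_e = tagger_efficiency(0.5, 0.73)   # electron channel
eps_t_mu = tagger_efficiency(0.5, 0.80)  # muon channel

# Tagger rejections at those working points, taken from the text.
rej_e = overall_rejection(2000, 0.041)
rej_mu = overall_rejection(11000, 0.020)
```

The results, about 0.68 and 0.62 for the tagging efficiencies and about $4.9 \times 10^4$ and $5.5 \times 10^5$ for the overall rejections, agree with the quoted values up to rounding of the inputs.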

V. COMPARISON WITH TWO-CLASS TAGGERS
We restrict ourselves to the electron channel and the test interval $p_T \in [0.85, 1.15]$ TeV to compare the three-class tagger discriminating among H, t and j with less general two-class taggers. The results are shown in Fig. 3. Interestingly, the discrimination power is the same for the three-class and the two-class taggers, with minor differences that may well be statistical in nature. This shows that the discrimination power between two given classes is not degraded when building a tagger that simultaneously tries to distinguish among H, t and j.
Because the lepton energy (or $p_T$) fraction has previously been used as a simple discriminating variable between top quarks decaying semileptonically and QCD jets [27][28][29][30][31][32][33], it is worth exploring to what extent the jet substructure variables add to the discrimination. For this purpose, we build two-class taggers that only use as input the jet mass and $p_T$, as well as $z$ and $\Delta R$. As expected, for $H \to \ell\nu q\bar q$ (with two quarks) the jet substructure significantly enhances the discrimination with respect to light jets. For $t \to \ell\nu b$, jet substructure variables help but are less important. For H versus t discrimination the analysis of the jet substructure is crucial, as expected, because the former jets have two quarks and the latter only one.
Conversely, as seen in Refs. [25,26], generic taggers only using substructure variables have a poorer discrimination between jets with leptons and QCD jets. The tests in those references are performed using as signal jets from boosted heavy neutrinos decaying N → eqq, but the conclusion is expected to be general.

VI. EXAMPLE: Z ′ → ZH
We investigate here the usefulness of the taggers introduced above to improve the sensitivity of LHC measurements. Tagging of boosted $H \to b\bar b$ is performed both by the ATLAS and CMS Collaborations by looking at b-tagged subjets of a large-radius jet containing the $H \to b\bar b$ decay products. Namely, the ATLAS Collaboration uses $R = 0.2$ subjets in earlier searches [51] and variable radius jets in the most recent one [52] with the full Run 2 dataset. The CMS Collaboration uses subjets of $R = 0.4$ [53]. Requiring one or two b-tagged subjets significantly suppresses the QCD background, especially in the latter case. The ATLAS Collaboration has considered the decay $H \to \ell\nu q\bar q$ in a search for HH resonances [54] in the resolved case, where this decay produces two narrow $R = 0.4$ jets and a charged lepton that can be independently reconstructed. As the Higgs bosons become more boosted, the efficiency of the resolved final state decreases and the final state where all H decay products are merged into a single jet becomes more sensitive. This can be seen in Fig. 4, where we show the $\Delta R$ separation between the charged lepton and the axis of the jet containing the H decay products in $Z' \to ZH$, $H \to \ell\nu q\bar q$. We select three different $Z'$ masses to illustrate the dependence on the heavy resonance mass. Because the lepton isolation criterion requires the absence of significant energy in a cone of radius $\Delta R \sim 0.1$ around the charged lepton, the resolved channel is disfavoured for resonances beyond the TeV scale. Future studies are required to compare the sensitivity of the resolved and merged final states for boosted $H \to \ell\nu q\bar q$. Our goal here is to evaluate the potential sensitivity of new physics searches targeting the $H \to \ell\nu q\bar q$ decay in the merged final state, tagged using FEST.
The branching ratio Br(H → ℓνqq) = 0.13 (summing over ℓ = e, µ and lepton charges) is much smaller than Br(H → bb) = 0.58 [50] but the excellent performance of the FEST tagger makes the decay mode competitive for large luminosities, and especially in final states where the background is large. Otherwise, the large background rejection achieved by FEST is less useful.
We investigate the sensitivity of ZH resonance searches in the decay modes $Z \to q\bar q$, $H \to \ell\nu q\bar q$. This fully-hadronic final state also allows us to show the usefulness of the tagger to simultaneously suppress backgrounds with light jets and top quarks; in the end the latter turn out to be the dominant ones. We take as our reference for comparison the search for ZH resonances in the fully-hadronic channel by the ATLAS Collaboration with the full Run 2 dataset [52], focusing on the $Z \to q\bar q$, $H \to b\bar b$ decay modes. Because our results are obtained with fast simulation, the comparison with the sensitivity achieved in Ref. [52] has the caveat of a possible degradation of the tagger performance in the environment of a real experiment, and has to be taken with a grain of salt. We perform a simulation including the backgrounds from $jj$, $t\bar t$, $Wjj$ and $tW$ production. Potential backgrounds with fake leptons cannot be handled with the fast simulation, but we expect them not to be dominant. In any case, in an experimental analysis they must be included. The dijet sample is the same one used to test the NN performance, and $t\bar t$, $Wjj$ and $tW$ samples are also generated in the same 100 GeV slices of $p_T$. Samples with $p_T \geq 2.2$ TeV are also generated, and the different samples are combined with weights proportional to the cross section. A 2 TeV $Z' \to ZH$ signal is generated with $Z \to q\bar q$, $H \to \ell\nu q\bar q$. For $M_{Z'} = 2$ TeV, the 95% confidence level upper limit on the production cross section times decay branching ratio from Ref. [52] is $\sigma(pp \to Z' \to ZH) \leq 5.3$ fb. We use this cross section as reference for comparison between the two H decay channels. Events are passed through the simulation chain described before. In addition to $R = 0.8$ jets, we use a collection of 'track jets' of radius $R = 0.2$, reconstructed using only tracks. A jet is considered as b-tagged if a b-tagged track jet (using the 70% efficiency working point) within the $R = 0.8$ jet is found.
As event preselection we require two jets with $m_J \geq 40$ GeV, $p_T \geq 400$ GeV and $|\eta| \leq 2.5$. At least one of them is required to have a charged lepton inside the jet. That jet is labeled as the 'H' jet; if both jets have charged leptons, the one having the lepton with highest $z$ is selected. The remaining jet is labeled as 'Z'. As a proxy for the $Z'$ mass we use the invariant mass of the two jets plus the neutrino, $m_{JJ\nu}$. The neutrino three-momentum is taken parallel to that of the charged lepton, with its transverse component equal to the missing energy in the event.$^4$ The $m_{JJ\nu}$ distribution for the background (overwhelmingly jj) at preselection is shown in Fig. 5, normalised to a luminosity of 139 fb$^{-1}$.
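The neutrino reconstruction and the mass variable $m_{JJ\nu}$ described above can be sketched as follows, with four-momenta written as `(E, px, py, pz)` tuples in GeV. The function names and the toy event are hypothetical, and the neutrino is treated as massless, which is an assumption of this sketch.

```python
import math


def neutrino_from_lepton(lep_px, lep_py, lep_pz, missing_et):
    """Massless neutrino collinear with the lepton, with transverse momentum equal to the missing E_T."""
    lep_pt = math.hypot(lep_px, lep_py)
    scale = missing_et / lep_pt
    px, py, pz = scale * lep_px, scale * lep_py, scale * lep_pz
    energy = math.sqrt(px * px + py * py + pz * pz)
    return (energy, px, py, pz)


def invariant_mass(*four_vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[i] for v in four_vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# Toy event (hypothetical values): massless back-to-back jets in the transverse plane.
h_jet = (900.0, 540.0, 720.0, 0.0)
z_jet = (1000.0, -600.0, -800.0, 0.0)
neutrino = neutrino_from_lepton(30.0, 40.0, 0.0, 100.0)

m_jjnu = invariant_mass(h_jet, z_jet, neutrino)
```

In this toy event the neutrino carries the 100 GeV of transverse momentum missing from the H jet, and adding it restores the full 2 TeV invariant mass.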
Before jet tagging, we require a separation $|\Delta\eta| \leq 1.5$ between the two jets, jet masses $m_J \leq 110$ GeV, and perform a b-tag veto on the H jet. These simple cuts reduce the background (which is still dominated by jj production) by a factor of $10-100$, as shown in Fig. 5.
Finally, tagging of both jets is performed. For the H jet we require probabilities $p_0 \leq 0.01$, $p_2 \leq 0.9$ that the jet corresponds to the j and t class, respectively. For the Z jet we use the two-pronged MUST-based tagger $T_{2P}$ developed in Ref. [26], requiring a NN score (quantifying the probability that the jet is two-pronged) $X \geq 0.8$. Tagging the H jet scales the dijet background by a factor of $2.8 \times 10^{-3}$, and tagging the Z jet by an additional factor of 0.04. Thus, the tagging reduces the background by $3-4$ orders of magnitude, as shown in Fig. 5, and allows the injected $Z'$ signal to be seen as a bump in the falling $m_{JJ\nu}$ distribution. For clarity, the background-only distributions after tagging are shown as thin lines. After tagging, the expected number of events for the signal and the different backgrounds near 2 TeV is given in Table I. Other backgrounds from Zj and Wj production, with Z/W hadronic decay, are less important, and $b\bar b$ is even smaller. In the region near 2 TeV, the former two amount to 1/7 and 1/3 of the jj background in the electron and muon channel, respectively, and the latter to 1/20 and 1/9, with the final event selection.
$^4$ We have also explored an alternative neutrino momentum reconstruction, with the longitudinal component and energy determined by requiring that the invariant mass of the neutrino and the H jet equal the Higgs boson mass. This constraint yields a second-degree equation; of the two solutions we choose the one that gives the smaller longitudinal momentum. The results with this alternative reconstruction are slightly worse.
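The tagging requirements on the two jets, and the combined suppression of the dijet background they imply, can be written compactly as below. The function and argument names are hypothetical; the thresholds and the two suppression factors are those quoted in this section.

```python
def passes_tagging(p0, p2, x_score, max_p0=0.01, max_p2=0.9, min_x=0.8):
    """H jet: light-jet and top probabilities below threshold; Z jet: two-prong score above threshold."""
    return p0 <= max_p0 and p2 <= max_p2 and x_score >= min_x


# Toy events (hypothetical tagger outputs).
signal_like = passes_tagging(p0=0.005, p2=0.30, x_score=0.95)
qcd_like = passes_tagging(p0=0.40, p2=0.10, x_score=0.95)
top_like = passes_tagging(p0=0.001, p2=0.97, x_score=0.95)

# Combined factor on the dijet background from tagging the H and Z jets.
combined_dijet_factor = 2.8e-3 * 0.04
```

The combined factor is about $1.1 \times 10^{-4}$, which is the reduction of three to four orders of magnitude mentioned in the text.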
The expected significance of the $Z'$ signal can be computed by performing likelihood tests for the presence of narrow resonances over the expected background, using the CLs method [55] with the asymptotic approximation of Ref. [56]. The local significance at $m_{JJ\nu} = 1.95$ TeV is $2.2\sigma$ in the e channel and $2.4\sigma$ in the µ channel, neglecting systematic uncertainties. Combining both, the local significance reaches $3.2\sigma$. Therefore, even bearing in mind that the comparison with full simulation is not fair, it seems likely that the sensitivity to $Z' \to ZH$ may be improved, or at least matched, by the $H \to \ell\nu q\bar q$ decay mode.

VII. CONCLUDING REMARKS
We have developed a three-class tagger to discriminate among boosted $H \to \ell\nu q\bar q$, $t \to \ell\nu b$, and light jets, with an impressive rejection rate for the latter, and excellent discrimination between top quarks and Higgs bosons. For top quarks, its possible applications are numerous, because the huge rejection factor for light jets more than compensates for the smaller semileptonic decay branching ratio. Using as figure of merit the branching ratio times significance improvement, cf. Eq. (2), tagging top quarks in the electron and muon channels improves over the hadronic decay mode previously considered by factors of 1.6 and 5, respectively. For Higgs bosons the prospects are quite good too, despite the smaller branching fraction for $H \to \ell\nu q\bar q$.
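The improvement factors of 1.6 and 5 quoted above can be approximately recovered with a short calculation. Since the explicit form of Eq. (2) is not reproduced in this text, the sketch assumes a figure of merit equal to the branching ratio times $\varepsilon_S / \sqrt{\varepsilon_B}$, which is one reading of "branching ratio times significance improvement". The branching ratios used (0.674 for hadronic and 0.107 per lepton flavour for leptonic W decays) are approximate values introduced here as assumptions; the efficiencies and rejections are those of Section IV.

```python
import math


def figure_of_merit(branching_ratio, signal_eff, background_rejection):
    """Assumed figure of merit: Br * eps_S / sqrt(eps_B), with eps_B = 1 / rejection."""
    return branching_ratio * signal_eff * math.sqrt(background_rejection)


BR_HADRONIC = 0.674  # assumed branching ratio for t -> b q q'
BR_LEPTONIC = 0.107  # assumed branching ratio for t -> b l nu, per lepton flavour

hadronic = figure_of_merit(BR_HADRONIC, 0.5, 500)
electron = figure_of_merit(BR_LEPTONIC, 0.5, 4.8e4)
muon = figure_of_merit(BR_LEPTONIC, 0.5, 5.5e5)

ratio_electron = electron / hadronic
ratio_muon = muon / hadronic
```

Under these assumptions the ratios come out close to 1.6 for electrons and slightly above 5 for muons, in line with the factors quoted in the text.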
Our tagger has been built to work on a very wide range of jet p T ∈ [0.4, 2.2] TeV. (In contrast, several hadronic top taggers in the literature [48,49] are trained with jets within a narrow p T range.) This interval is sufficiently large so as to demonstrate that the tagger can correctly learn to distinguish the differences in jet substructure arising from different p T regimes and from different jet prongness. Moreover, it has been shown in Ref. [26] that the performance of a tagger trained on wide intervals of jet mass and p T nearly matches the performance of a tagger trained on narrow intervals. Therefore, the arbitrarily chosen range p T ∈ [0.4, 2.2] TeV can be further extended and we do not expect a performance drop.
One possible caveat to the practical application of the tagger is the possible difficulty and uncertainties in the measurement of $z$ and $\Delta R$ for electrons embedded within jets, and the possible appearance of fakes. Reference [32] performed a detailed study regarding electron isolation, and there are good prospects that the measurements will be feasible. But even in a worst-case scenario in which measurements in the electron channel could not be performed (which, we stress again, seems unlikely), the sensitivity in the muon channel alone is better than in hadronic top decays, cf. Eq. (2), and the same is expected for Higgs decays, as shown in the previous section.
Generally, one expects that H → ℓνqq and t → ℓνb with the taggers here introduced will provide the best sensitivity for boosted Higgs boson and top quark measurements, except at the kinematical end of the spectrum where the background is already quite small. Therefore, for large integrated luminosities, and especially at the high-luminosity upgrade of the LHC, tagging these decay modes may provide the best sensitivity for boosted H, t measurements across a very wide kinematical range.
Finally, let us comment that more generic taggers for jets containing leptons can also be built, which could be sensitive for example to boosted heavy neutrinos decaying N → ℓqq, and may be presented elsewhere.

APPENDIX
The tagger is built based on a sample of jets that already contain a charged lepton, with a minimum transverse momentum $p_T \geq 10$ GeV. As has been argued, the overall performance should have little dependence on this choice, within reasonable limits. In this appendix we explicitly test this, by restricting ourselves to the electron channel and using jet samples that contain electrons with $p_T \geq 20$ GeV. The preselection efficiencies for jets of the three classes are collected in Table II. The same procedure is followed to train the NN, and the results are compared in Fig. 6 with the results previously obtained. We denote by PT10 and PT20 the taggers built using electron thresholds $p_T \geq 10$ GeV and $p_T \geq 20$ GeV, respectively. As expected, the performance in H versus j and t versus j in the ROC plots is degraded, since the higher preselection threshold already does part of the work of the tagger in separating H and t (with energetic electrons) from j. Also as expected, the discrimination between H and t is practically unaltered, up to small differences arising from the use of different NNs.
Still, as argued in Section II, the overall performance of the tagger is nearly independent of the lepton $p_T$ threshold. Let us calculate for example the j rejection for jets with $p_T \sim 1$ TeV, for an H overall efficiency $\bar\varepsilon_H = 0.5$, as done in Section IV. For the two taggers, we have
PT10 : $\varepsilon_H = 0.61 \to \varepsilon_j^{-1} = 3000 \to \bar\varepsilon_j^{-1} = 7.4 \times 10^4$
PT20 : $\varepsilon_H = 0.63 \to \varepsilon_j^{-1} = 1900 \to \bar\varepsilon_j^{-1} = 8.3 \times 10^4$
The O(10%) difference in the overall light jet rejection factor is due to statistical fluctuations in the jet samples, caused by the high value of $\varepsilon_j^{-1}$. Likewise, we can test the light jet rejection for an overall t efficiency $\bar\varepsilon_t = 0.5$,
PT10 : $\varepsilon_t = 0.68 \to \varepsilon_j^{-1} = 2000 \to \bar\varepsilon_j^{-1} = 4.8 \times 10^4$
PT20 : $\varepsilon_t = 0.69 \to \varepsilon_j^{-1} = 1100 \to \bar\varepsilon_j^{-1} = 4.9 \times 10^4$
and in this case the j rejection is nearly the same when using either preselection threshold.