In this chapter, we present the analyses of the Higgs boson and of new physics searches as examples of data analysis. The data handled here are already calibrated, and the particle identification for each object has also been done (Footnote 1). In the so-called “data analysis” of collider experiments, the event selection, the background estimation, and the signal extraction or measurement, including the evaluation of systematic uncertainties, are performed.

8.1 Higgs

8.1.1 Higgs Production Mechanism in Hadron Colliders

There are several Higgs production processes. Figure 8.1 shows the Higgs production cross sections in pp collisions as a function of Higgs mass. The largest contribution comes from gluon fusion (Fig. 8.2a), in which there is no additional topology or feature other than the Higgs production itself. Hence, we are forced to perform an inclusive analysis (see Sect. 2.3) as long as we consider the gluon fusion process. On the other hand, the final states of the other three processes contain not only the Higgs boson but also extra particles, resulting in characteristic topologies.

Fig. 8.1 Higgs production cross section as a function of Higgs mass at \(\sqrt{s} = 8\) TeV. Reprinted under the Creative Commons Attribution 3.0 License from [1] © 2013–2022 CERN

Fig. 8.2 Feynman diagrams of Higgs production

The second-largest cross section is that of the vector boson fusion (VBF) process, where either Ws or Zs radiated from quarks fuse to produce a Higgs boson (Fig. 8.2b). The quarks radiating W or Z bosons appear as forward jets because their \(p_{\textrm{T}}\), which tends to be close to the W or Z mass, is much smaller than the momentum of the colliding protons. In addition, since this process does not involve any colour exchange between the incoming quarks, little parton radiation is expected around the produced Higgs or in the central region of the detector, in contrast to the overwhelming multijet background, where not only hard jets but also many soft jets are produced. Putting these points together, Higgs production through the vector boson fusion process has a distinctive topology with two forward jets and little QCD activity (partons due to colour exchange) in the central region apart from the Higgs decay products. This feature allows us to significantly reduce the background due to multijet production as well as the other types of background.

Another important production mechanism is the associated production with a vector boson, i.e., either W or Z (Fig. 8.2c). If the W or Z decays hadronically, it does not help to improve the signal-to-noise ratio because of the overwhelming multijet backgrounds. However, leptonic decays of the W or Z produce isolated leptons, allowing us to significantly improve the signal-to-noise ratio at the cost of the small leptonic branching fractions of W and Z.

The cross section of the associated production with \(t\bar{t}\) (Fig. 8.2d) is one order of magnitude smaller than that of WH production. It is still accessible because of the characteristic topology. This production mechanism is of special importance because it allows direct access to the top Yukawa coupling.

Below we describe the basic idea of the analysis for \(H\rightarrow \gamma \gamma \), \(H\rightarrow b\bar{b}\), and \(H\rightarrow W^{+}W^{-}\).

8.1.2 \(H \rightarrow \gamma \gamma \)

The Higgs boson was discovered by the ATLAS [2] and CMS [3] experiments in 2012. In this discovery, the \(H \rightarrow \gamma \gamma \) and \(H \rightarrow ZZ^* \rightarrow \ell \ell \ell ^\prime \ell ^\prime \) channels played the most important role because they can reconstruct the invariant mass of the Higgs boson precisely compared to other channels, for example, \(H \rightarrow WW^* \rightarrow \ell \nu \ell ^\prime \nu ^\prime \), even though the expected statistics for \(H \rightarrow \gamma \gamma \) and \(H \rightarrow ZZ^* \rightarrow \ell \ell \ell ^\prime \ell ^\prime \) are not high. In the distribution of the invariant mass of the Higgs boson candidates, we can observe a clear signal peak on top of the background events, which is one of the most reliable pieces of evidence for claiming the discovery of a resonance. In this section, we explain how to search for the Higgs boson with the \(H \rightarrow \gamma \gamma \) channel in the ATLAS experiment.

As mentioned before, the signal statistics are limited since the branching ratio of \(H \rightarrow \gamma \gamma \) is very small, about 0.2%, for a mass of around 125 GeV. However, thanks to the good resolution of the diphoton invariant mass \(m_{\gamma \gamma }\), a narrow resonance was expected to be observed on a huge but smooth background, as shown in Fig. 8.3 [4]. Below we explain how this result is obtained.

Fig. 8.3 Invariant mass distribution of diphotons for the combined 7 and 8 TeV data in ATLAS. The result of a fit to the data with the sum of a SM Higgs boson (126.8 GeV) and background is superimposed. The lower panel shows the residuals of the data with respect to the fitted background. Reprinted under the Creative Commons Attribution License 3.0 from [4] Copyright © 2013 CERN

We need two photons to reconstruct the diphoton invariant mass, which is the final discriminant used to extract the signal. Events with two photon candidates must be recorded to offline storage to perform the analysis, and diphoton triggers (with photon \(E_{\textrm{T}}\) thresholds of 35 and 25 GeV) were used for the trigger selection. Since the rate of events with jets faking photons, which are called fake photons, is not negligible, we cannot use, for example, single-photon triggers with a low \(E_{\textrm{T}}\) threshold such as 25 GeV (Footnote 2). In the analysis, two photon candidates were selected with \(p_{\textrm{T}}>40\) GeV and 30 GeV, which are high enough to ensure that the offline selected events have a trigger efficiency of essentially 100%. This is a common technique in physics analyses because the estimation of the trigger efficiency is not easy in general, especially for momenta close to the efficiency turn-on. We can avoid using events near the trigger turn-on by requiring a much higher \(p_{\textrm{T}}\) in the offline selection than at the trigger level. In this way, a possible source of large systematic uncertainty can be removed at the cost of losing some fraction of the signal events.

There are three different processes in the background events: two real photons, one real photon plus one fake photon, and two fake photons, which are called \(\gamma \gamma \), \(\gamma \)+jet, and dijet, respectively. These background events do not make a peak but a smoothly falling curve in the \(m_{\gamma \gamma }\) distribution, as shown in Fig. 8.3. The background composition can be measured using photon identification variables, for example, an isolation variable. The fractions were determined to be \(\sim \)74% for \(\gamma \gamma \), \(\sim \)22% for \(\gamma \)+jet, and \(\sim \)3% for dijet. In addition, the Drell-Yan process (\(Z^{(*)}/\gamma \rightarrow e^+e^-\), DY) remains at the level of \(\sim \)1% of the background due to hard bremsstrahlung.

It is important to improve the resolution of the \(m_{\gamma \gamma }\) distribution. For this purpose, we need to measure the photon energies and the angle between the two photons as precisely as possible. Since the EM calorimeter in ATLAS has three longitudinal layers, the direction of a photon can be determined from the measured positions of its cluster in each layer. The production vertex of the diphoton system is then calculated from the directions of the two photons. This method is called calo-pointing. The position obtained with calo-pointing is precise enough in terms of the \(m_{\gamma \gamma }\) resolution, while a more precise determination is required for the association of charged tracks to jets, because jets from pile-up are identified using this association. The production vertex position is finally obtained by using several pieces of information, for example, charged tracks not matched to any photon, charged tracks from conversions, and the balance between the two photons and the charged tracks. The resolution of \(m_{\gamma \gamma }\) is about 3%, and events with two unconverted photons have a resolution that is better, by about 10% in relative terms, than that of events with at least one converted photon.
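As a minimal sketch of why both the photon energies and the opening angle enter the mass resolution, the diphoton mass for massless photons can be written as \(m_{\gamma \gamma } = \sqrt{2E_1E_2(1-\cos \alpha )}\), where \(\alpha \) is the angle between the two photons. The function names and the numerical inputs below are illustrative assumptions, not values taken from the analysis, and the error propagation assumes uncorrelated measurements.

```python
import math


def diphoton_mass(e1, e2, opening_angle):
    """Invariant mass (GeV) of two massless photons.

    e1, e2: photon energies in GeV; opening_angle: angle between them in rad.
    """
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(opening_angle)))


def relative_mass_resolution(rel_e1, rel_e2, angle_error, opening_angle):
    """First-order error propagation, assuming uncorrelated uncertainties.

    From m = 2 * sqrt(e1 * e2) * sin(alpha / 2):
    dm/m = 0.5 * sqrt(rel_e1^2 + rel_e2^2 + (d_alpha / tan(alpha / 2))^2)
    """
    angle_term = angle_error / math.tan(opening_angle / 2.0)
    return 0.5 * math.sqrt(rel_e1 ** 2 + rel_e2 ** 2 + angle_term ** 2)


# Hypothetical back-to-back photons of 62.5 GeV each
mass = diphoton_mass(62.5, 62.5, math.pi)
print(f"m_gg = {mass:.1f} GeV")
```

The second function makes the point of the paragraph explicit: a poorly known vertex position degrades the angle term even when the calorimeter energy measurement is good.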

Selected events are classified into several categories for two reasons: the first is to improve the sensitivity of the search itself, which is called a global search here, and the second is to measure properties of specific production processes, for example, the VBF and VH processes, using extra leptons, jets, and the missing \(E_\textrm{T}\). For example, 14 categories were introduced in the 8 TeV data analysis using jets, leptons, and missing transverse energy: 2 for VBF, 3 for VH, and the other 9 for the improvement of the discovery sensitivity. The global search sensitivity improved by about 30% compared to the result without categorisation.
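The gain from categorisation can be illustrated with a rough counting sketch. It assumes the simple significance estimate \(s/\sqrt{b}\) per category and statistically independent categories combined in quadrature; the actual analysis uses a likelihood fit to \(m_{\gamma \gamma }\) in each category. The yields below are hypothetical and chosen only to show the effect.

```python
import math


def significance(s, b):
    """Approximate counting significance s / sqrt(b), reasonable for s << b."""
    return s / math.sqrt(b)


def combined_significance(categories):
    """Quadrature sum over statistically independent (s, b) categories."""
    return math.sqrt(sum(significance(s, b) ** 2 for s, b in categories))


# Hypothetical yields: one cleaner category and one background-dominated one
categories = [(10.0, 100.0), (10.0, 400.0)]
inclusive = significance(
    sum(s for s, _ in categories), sum(b for _, b in categories)
)
categorised = combined_significance(categories)
print(f"inclusive: {inclusive:.2f}, categorised: {categorised:.2f}")
```

Separating events with a different signal-to-background ratio prevents the cleaner events from being diluted by the background-dominated ones, so the combined value is never smaller than the inclusive one in this approximation.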

The event excess, which is a signature of Higgs decays, is evaluated with a local \(p_0\), which is the probability that the background-only hypothesis produces a distribution at least as signal-like as the observed one. If the \(p_0\) value is 0.5, the observation is consistent with the background-only hypothesis, that is, there is no excess. If the \(p_0\) value is smaller (larger) than 0.5, there is an excess (a deficit) (Footnote 3) over the background. In addition, if a search is performed for a new narrow resonance (a width of \(\sim \)4 MeV in the case of the SM Higgs boson) with an unknown mass in the invariant mass distribution (\(m_{\gamma \gamma }=\) [110, 160] GeV in the case of the SM Higgs boson), we need to take into account the so-called look-elsewhere effect. This effect accounts for the fact that excesses of, say, \(3\sigma \) due to statistical fluctuations can occur somewhere in the search region even if there is no new resonance, and that such fake excesses become more frequent in narrow resonance searches (Footnote 4). The \(p_0\) value after taking this effect into account is called a global \(p_0\) (Footnote 5). This effect is negligible in broad resonance searches, where the peak is wide because of an intrinsic particle width, a worse detector resolution, etc. With the full dataset of LHC Run 1 in ATLAS (2011–2012), the largest excess with respect to the background-only hypothesis (based on the local \(p_0\)) was observed (expected) with 7.4 (4.3)\(\sigma \) at 126.5 GeV, as shown in Fig. 8.4 [4].
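The relation between a local \(p_0\) and the quoted number of \(\sigma \) can be sketched as follows, assuming the usual one-sided Gaussian convention \(p_0 = 1 - \Phi (Z)\), where \(\Phi \) is the cumulative distribution of the standard normal. The function names are illustrative; the look-elsewhere correction is not included here.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def significance_to_p0(z):
    """One-sided tail probability p0 = 1 - Phi(z), via erfc for accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p0_to_significance(p0):
    """Inverse relation: z = -Phi^{-1}(p0), valid for 0 < p0 < 1."""
    return -_STANDARD_NORMAL.inv_cdf(p0)


# p0 = 0.5 corresponds to no excess; smaller p0 means a larger excess
for z in (0.0, 3.0, 4.3, 7.4):
    print(f"Z = {z:.1f} sigma  ->  local p0 = {significance_to_p0(z):.2e}")
```

With this convention a \(p_0\) of 0.5 maps to \(Z = 0\), and a \(p_0\) above 0.5 gives a negative \(Z\), matching the description of a deficit in the text.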

Fig. 8.4 Observed local \(p_0\) as a function of the Higgs boson mass \(m_{\textrm{H}}\) for 7 TeV data (blue), 8 TeV data (red), and their combination (black). The dashed curves show the expected median local \(p_0\) for the SM Higgs boson hypothesis when tested at a given \(m_{\textrm{H}}\). Reprinted under the Creative Commons Attribution License 3.0 from [4] Copyright © 2013 CERN

8.1.3 \(H\rightarrow b\bar{b}\)

This section outlines the analysis of \(H \rightarrow b\bar{b}\). As the branching fraction of \(H \rightarrow b\bar{b}\) is the largest (\({\sim }58\%\)) among the various decays of a Higgs boson with a mass of 125 GeV, \(H \rightarrow b\bar{b}\) could be the most useful and natural decay mode to search for the Higgs boson and to study its properties from the viewpoint of statistics. On the other hand, the signature of the final state consists of just two b-jets. This would not be an issue at \(e^{+}e^{-}\) colliders such as the ILC, which provide a very clean experimental environment, resulting in a very high signal-to-noise ratio. At hadron colliders, however, the study of \(H \rightarrow b\bar{b}\) is not straightforward at all because of the overwhelming QCD backgrounds (multijet background processes). At the energy of the LHC, for example, the production cross section of inclusive b-jets is eight orders of magnitude larger than that of the Higgs boson. In addition, the identification of b-jets is not perfect, so light jets can mimic the signal. In this case, any jet production can be a background, and its production cross section is even higher than the inclusive b-jet cross section. Therefore, at hadron colliders, we need some clever ideas to separate the \(H \rightarrow b\bar{b}\) signal from the huge background.

In the following, we discuss the analysis method of \(H \rightarrow b\bar{b}\) using the vector boson fusion process first and then the associated production with W and Z.

8.1.3.1 Vector Boson Fusion Process

The final state consists of two b-jets from the Higgs decay and two forward jets. Since there are no isolated leptons or large missing \(E_\textrm{T}\), which are commonly used to trigger an event, careful study and optimisation of the trigger are needed. The most obvious choice of trigger would be to require four jets with relatively high \(p_{\textrm{T}}\). In addition, a requirement on the topology, i.e., the existence of two forward jets on opposite sides in \(\eta \), may be applied if such a topological trigger is available. Even with the requirements above, the remaining events would still be dominated by the multijet background because of its huge production cross section. In order to suppress the multijet events further, the existence of a muon (see Sect. 6.6.1.2) that arises from the semi-leptonic decay of b-hadrons (directly or through the cascade decay to c, like \(b\rightarrow c\ell ^-\bar{\nu }\)) may be required at the cost of statistics. Even though there are two b-hadrons (and hence, most of the time, two c-hadrons produced in the decays of the b-hadrons), the branching fraction of the semi-leptonic decay is only of the order of 10% (see Sect. 6.6.1). Moreover, the \(p_{\textrm{T}}\) of the lepton from the semi-leptonic decay is not so large. Because of these two factors, the signal efficiency is relatively low. Therefore, one has to optimise the trigger condition with a careful study. In other words, this is where potential for improvement exists.

The offline analysis starts by selecting events with four jets. Out of the four, two are required to be in the central region (rather small \(|\eta |\)), and the other two in the forward region (\(\equiv \) forward jets). The forward jets tend to keep the direction of their parent protons, and hence to be in opposite hemispheres in \(\eta \). In order to select only the VBF process, the forward jets are commonly required to have a large separation in \(\eta \), where if one is at \(\eta > 0\) then the other must be at \(\eta <0\), and to have a large invariant mass.
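The forward-jet requirements described above can be sketched as a simple tagging function. The jets are treated as massless, so that \(m_{jj}^2 = 2p_{\textrm{T},1}p_{\textrm{T},2}(\cosh \Delta \eta - \cos \Delta \phi )\). The function names and, in particular, the threshold values are placeholder assumptions for illustration; they are not the cuts of any specific analysis.

```python
import math


def dijet_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass (GeV) of two jets treated as massless."""
    return math.sqrt(
        2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    )


def passes_vbf_tag(jet1, jet2, min_delta_eta=4.0, min_mjj=500.0):
    """Forward-jet topology: opposite hemispheres, large eta gap, large m_jj.

    jet1, jet2: (pt, eta, phi) tuples with pt in GeV.
    min_delta_eta and min_mjj are hypothetical thresholds.
    """
    (pt1, eta1, phi1), (pt2, eta2, phi2) = jet1, jet2
    opposite_hemispheres = eta1 * eta2 < 0.0
    large_gap = abs(eta1 - eta2) > min_delta_eta
    large_mass = dijet_mass(pt1, eta1, phi1, pt2, eta2, phi2) > min_mjj
    return opposite_hemispheres and large_gap and large_mass


# Hypothetical jets: one VBF-like pair and one same-side pair
print(passes_vbf_tag((60.0, 2.5, 0.3), (50.0, -2.8, 2.9)))  # opposite sides
print(passes_vbf_tag((60.0, 2.5, 0.3), (50.0, 2.8, 2.9)))   # same side
```

The three conditions are strongly correlated, since a large \(\eta \) gap automatically produces a large \(m_{jj}\) through the \(\cosh \Delta \eta \) term, which is why either variable alone already carries much of the discrimination.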

Once an event passes the selection criteria for the forward jets, the remaining part is rather straightforward. The two central jets must be identified as b-jets, where there is always room for optimisation or tuning of the b-tagging requirement. For example, a requirement of at least one b-tag is also possible. The tightness of the b-tagging requirement is another knob for tuning. Finally, we look for a signal peak in the dijet mass distribution, which is reconstructed from the two central jets.

8.1.3.2 Associated Production with W or Z

The idea behind using the associated production with W/Z is to exploit an isolated lepton from the W/Z decay to reduce the background. In both the trigger and the offline event selection, an event is required to have at least one isolated lepton satisfying some criteria on, for example, \(p_{\textrm{T}}\) or \(\eta \). Then, in the offline selection, the W can be identified by reconstructing the transverse mass from the isolated lepton and the missing \(E_\textrm{T}\). In the case of the Z, the dilepton mass is a powerful tool to separate the signal from the backgrounds.
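A minimal sketch of the W transverse mass built from the isolated lepton and the missing \(E_\textrm{T}\) is shown below. It assumes a massless lepton, for which \(m_\textrm{T} = \sqrt{2p_{\textrm{T}}^{\ell }E_\textrm{T}^{\textrm{miss}}(1-\cos \Delta \phi )}\), with \(\Delta \phi \) the azimuthal angle between the lepton and the missing transverse momentum. The function name and the example values are illustrative.

```python
import math


def w_transverse_mass(lepton_pt, met, delta_phi):
    """Transverse mass (GeV) of a lepton + missing-ET system.

    lepton_pt, met: magnitudes in GeV; delta_phi: azimuthal separation in rad.
    The lepton mass is neglected.
    """
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(delta_phi)))


# Hypothetical event: lepton and missing ET back-to-back, 40 GeV each
print(f"mT = {w_transverse_mass(40.0, 40.0, math.pi):.1f} GeV")
```

For genuine W decays the distribution of this quantity has an edge near the W mass, whereas events with a fake lepton or mismeasured missing \(E_\textrm{T}\) tend to populate low values.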

Fig. 8.5 The invariant mass distribution reconstructed from two jets. The dots represent data. The red histogram shows the expected signal contribution where the signal yield is assumed to be 1.06 times the standard model expectation. The grey histogram shows the expected background contribution by ZZ or WZ events. Reprinted under the Creative Commons Attribution 4.0 International License from [5] © CERN for the benefit of the ATLAS collaboration 2021

The procedure after selecting or tagging the W/Z is very similar to that in the VBF analysis. The dijet mass reconstructed from the b-tagged jets is the most effective variable to discriminate the signal from the background. In the end, the dominant source of background is W/Z production associated with heavy flavour jets, whose final state is exactly the same as that of the signal. On top of that, \(t\bar{t}\) production is also a main component of the remaining background. Therefore, the jet energy resolution, which determines how well a possible peak from the \(H \rightarrow b\bar{b}\) decay can be identified, is one of the key elements in this analysis, as is the efficiency to detect and identify the final state objects. Figure 8.5 shows the distribution of the dijet mass reconstructed from two b-tagged jets in the ATLAS experiment, where all the expected background contributions, except for \(VZ, Z\rightarrow b\bar{b}\) \((V=Z\) or W), are subtracted. One can see a peak from \(Z\rightarrow b\bar{b}\) as well as a small enhancement around 125 GeV, which is evidence of \(H\rightarrow b\bar{b}\).

8.1.4 \(H \rightarrow W^\pm W^{\mp *}\)

8.1.4.1 Analysis Overview

The branching ratio of the decay channel \(h \rightarrow W^+W^-\) is about 22%, which is the second largest for \(m_h = 125\) GeV. Since the mass of the Higgs boson is less than the sum of two W boson masses (about 161 GeV), one of the two W bosons is produced off-shell (\(h \rightarrow WW^*\)). The analysis of the 8 TeV data by the ATLAS collaboration is described in detail in Ref. [6]. The corresponding 13 TeV analysis is given in Ref. [7], which describes the analysis procedure only briefly and refers to the former paper [6]. In this section, some key points of the \(H \rightarrow WW^*\) analysis are described.

The analysis uses the leptonic decay channel for both of the W bosons (\(h \rightarrow WW^*\rightarrow \ell \nu \ell \nu \)) (Fig. 8.6a), where \(\ell \) is either an electron (e) or a muon (\(\mu \)), in order to reduce the background from multijet production, \(pp \rightarrow jets\). Since the multijet final state can be produced by a process with only QCD vertices, which involve the strong coupling, the cross section of such production is many orders of magnitude larger than that of the \(h \rightarrow WW^*\) signal. The decay products of the Higgs boson are, therefore, one of the \(ee, e\mu , \mu \mu \) combinations with two or more neutrinos. The analysis also includes a smaller number of events containing \(W \rightarrow \tau \nu _\tau \) decays where the \(\tau \)-lepton further decays into an electron or muon with two additional neutrinos. Only the sum of the transverse momenta of the neutrinos (here denoted as \(\textbf{p}_\textrm{T}^{\nu \nu }\)) can be measured, through the missing \(\textbf{p}_\textrm{T}\).

Fig. 8.6 Diagrams of (a) \(h\rightarrow WW\), (b) WW-pair production through a \(Z^0\), and (c) a top-quark pair with both W bosons decaying leptonically into \(e\mu \)

Major sources of the background are resonant-like WW production (Fig. 8.6b) and top-pair events where both top quarks decay leptonically, \(t \rightarrow Wb, W \rightarrow \ell \nu \) (Fig. 8.6c). The former process has the same final state as the signal and is an irreducible background source if the WW pair is produced from a colourless state such as a virtual \(Z^0\) boson. The latter process gives two b-jets and is the main background for the VBF production process, where we require two jets in the final state; it is also a significant background for events with one jet in the final state.

The reconstruction of the Higgs boson mass is not possible because of the neutrinos in the final state. Instead, the transverse mass, \(m_\textrm{T}\), is calculated as an estimate of the invariant mass of the \(WW^*\) system. It uses the transverse components of the kinematic variables: \(\textbf{p}_\textrm{T}^{\nu \nu } (\textbf{p}_\textrm{T}^{\ell \ell })\), the vector sum of the transverse momenta of the neutrinos (leptons), with magnitude \(p_\textrm{T}^{\nu \nu } (p_\textrm{T}^{\ell \ell })\), and \(E_\textrm{T}^{\ell \ell } = \sqrt{(p_\textrm{T}^{\ell \ell })^2 + (m_{\ell \ell })^2}\). The \(m_\textrm{T}\) is defined as

$$ m_\textrm{T} = \sqrt{(E_\textrm{T}^{\ell \ell }+p_\textrm{T}^{\nu \nu })^2 - |\textbf{p}_\textrm{T}^{\ell \ell } + \textbf{p}_\textrm{T}^{\nu \nu }|^2} . $$
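As a minimal sketch, the transverse mass can be evaluated from the dilepton transverse momentum vector, the dilepton mass, and the missing transverse momentum vector. The code follows the form of the definition used in Ref. [6], in which the first bracket is the sum of \(E_\textrm{T}^{\ell \ell }\) and the magnitude of the neutrino (missing) transverse momentum, and the second term is the squared magnitude of the vector sum. The function name and the example inputs are illustrative assumptions.

```python
import math


def higgs_transverse_mass(pt_ll, m_ll, pt_miss):
    """Transverse mass (GeV) of the dilepton + missing-pT system.

    pt_ll: (px, py) of the dilepton system in GeV
    m_ll: dilepton invariant mass in GeV
    pt_miss: (px, py) of the missing transverse momentum in GeV
    """
    et_ll = math.sqrt(pt_ll[0] ** 2 + pt_ll[1] ** 2 + m_ll ** 2)
    pt_nunu = math.hypot(pt_miss[0], pt_miss[1])
    sum_x = pt_ll[0] + pt_miss[0]
    sum_y = pt_ll[1] + pt_miss[1]
    mt_squared = (et_ll + pt_nunu) ** 2 - (sum_x ** 2 + sum_y ** 2)
    # Guard against tiny negative values from floating-point rounding
    return math.sqrt(max(mt_squared, 0.0))


# Hypothetical event: dilepton system recoiling against the neutrinos
print(f"mT = {higgs_transverse_mass((30.0, 0.0), 40.0, (-30.0, 0.0)):.1f} GeV")
```

When the missing transverse momentum vanishes, the expression reduces to \(m_{\ell \ell }\), which is a quick sanity check of the implementation.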

Since \(m_\textrm{T} \le m_h\) and \(m_h\) is below twice the W boson mass, \(h \rightarrow WW^{*}\) events populate the \(m_\textrm{T}\) region below that of resonant WW production (see Fig. 8.8). The \(m_\textrm{T}\) values for top-pair production also tend to lie well above those from the Higgs decays. This shape difference is used to quantitatively distinguish the signal from the background. The peak structure in \(m_\textrm{T}\), for both the signal and the WW background, however, is broad. Also, the production rate of \(WW^*\) pairs from Higgs decays is much smaller than that of SM diboson production. Several other features of the signal events are therefore used to reduce the background processes.

The number of jets, especially the number of b-jets, is one such key ingredient used to classify events into categories. As described in Sect. 8.1.1, at leading order there is no jet in the ggF process, while in the VBF process, each of the two incoming quarks emits a vector boson and recoils, giving two jets close to the outgoing beam directions, one on each side. This means that two forward jets are observed with a large separation in rapidity. Since these forward jets in the VBF process originate from light quarks, the background from \(t\bar{t}\) production is greatly suppressed by removing events with one or more b-quark jets.

The azimuthal correlation of the two leptons is also used in order to further enhance the signal. Since the Higgs boson is a scalar particle and has no spin, the spin directions of the two W bosons are opposite (Fig. 8.7a). The momentum direction of the charged lepton in \(W^- \rightarrow \ell ^-\bar{\nu }\) decays tends to be opposite to the spin direction of the \(W^-\), since the anti-neutrino is right-handed and its momentum is aligned with the spin direction. For the \(W^+\), the charged lepton is emitted along the direction of the \(W^+\) spin. As the two W bosons tend to have a back-to-back topology in the \(x-y\) plane, the two charged leptons tend to be emitted in similar directions, as shown in Fig. 8.7b. Also, the invariant mass of the lepton pair, \(m_{\ell \ell }\), peaks around 30–40 GeV for the signal, while for WW pair production it peaks at around 60 GeV, as seen in Fig. 7b in Ref. [6].

Fig. 8.7 (a) Relation between the spin direction and momentum direction for \(H \rightarrow WW^*\rightarrow \ell \nu \ell \nu \) decays. (b) Illustration of typical decay topology in \(x-y\) plane (perpendicular to the beam direction)

Fig. 8.8 Distributions of \(m_\textrm{T}\) for 0- and 1-jet events selected for ggF signal. Reprinted under the Creative Commons Attribution 3.0 License from [6] © 2015 CERN, for the ATLAS Collaboration

Since the signal-to-background ratio is quite small in the \(WW^*\) decay channel, the amount of remaining background is still very large after selecting Higgs-like events using the properties given above. The remaining background depends strongly on the number of accompanying jets. The events are, therefore, classified according to the number of jets: 0-jet, 1-jet, and \(\ge 2\)-jet categories. The main background sources for the 0-jet category are irreducible WW production and other diboson production, especially WZ events where one of the leptons is missed. In addition, events from \(W + \textrm{jets}\) production contribute significantly if a jet is misidentified as a lepton. Here, the \(W + \textrm{jets}\) process represents higher-order DY events \(q\bar{q} \rightarrow W^{\pm }\), i.e., with one or more associated jets. For the 1-jet category, \(t\bar{t}\) production also becomes significant, since it produces two b-jets, one of which may fail to be tagged as a b-jet. For the 2-jet events, the major contribution comes from \(t\bar{t}\) events. The basic idea of how to suppress these background events is described for each category in the next subsection.

8.1.4.2 Background Reduction

  • 0-jet category

    After the basic requirements of two leptons in the final state, significant missing \(E_\textrm{T}\), and no jet, most of the background comes from the DY process, \(pp \rightarrow Z^0/\gamma ^*+ X, Z^0/\gamma ^*\rightarrow ee, \mu \mu , \tau \tau \), especially when the two leptons have the same flavour (ee or \(\mu \mu \)). This background is, however, also significant for the \(e\mu \) channel, since both \(\tau \) leptons in \(pp \rightarrow \tau ^+\tau ^- X\) processes may decay leptonically, giving an \(e\mu \) pair.

    In order to further reduce the DY events, the correlation of the lepton pair is used, by requiring the \(p_\textrm{T}\) of the dilepton system to be high: \(p_\textrm{T}^{\ell \ell } > 30\) GeV (Fig. 7a in Ref. [6]). Since the \(Z^0/\gamma ^*\) in the DY process is produced from \(q\bar{q}\) annihilation, with each of the quarks coming from one of the incoming protons, the transverse momentum of the produced \(Z^0/\gamma ^*\) tends to be small and the lepton pair from the decay tends to be produced back-to-back in the \(x-y\) plane.

    The missing \(E_\textrm{T}\) may arise in background processes through mismeasurement of the energy or momentum of the final state particles, i.e., the two leptons. In such cases the missing \(E_\textrm{T}\) tends to be aligned with the momentum direction of these particles. A few requirements are applied based on the relative momentum of the missing \(\textbf{p}_\textrm{T}\) to the leptons.

    Finally, requirements on the azimuthal opening angle of the two leptons, \(\Delta \phi _{\ell \ell } < 1.8\), and on the mass of the dilepton system, \(m_{\ell \ell } < 55\) GeV, are applied to select events with the \(H \rightarrow WW^*\) topology described above.

  • 1-jet category

    The event selection for the 1-jet category is very similar to that for the 0-jet events apart from a few points: the required jet must not be tagged as a b-jet; \(\textbf{p}_\textrm{T}^{\ell \ell }\) is replaced by \(\textbf{p}_\textrm{T}^{\ell \ell j}\), which adds the momentum of the jet; and an additional requirement on the \(m_{\tau \tau }\) variable is imposed: \(m_{\tau \tau } < m_Z - 25\) GeV, where \(m_Z\) is the mass of the \(Z^0\) boson. The \(m_{\tau \tau }\) variable is calculated by using the so-called “collinear approximation”, which assumes that the leptons come from the decays of \(\tau \) leptons originating from a \(Z^0\) and that the momenta of the remaining \(\tau \) decay products, two neutrinos for each decay, can be estimated by projecting the missing \(p_{\textrm{T}}\) vector onto the two lepton directions.

  • 2-jet category

    The signal-to-noise ratio for the two-jet VBF category is much smaller than that of the other categories at the stage after the dilepton + missing \(E_\textrm{T}\) selection. In order to enrich the signal, a machine-learning technique (boosted decision tree, BDT) is used. The details of the technique are beyond the scope of this book. Here we merely explain the main variables used as inputs. Two variables related to the two forward jets, the jet-jet mass \(m_{jj}\) and the rapidity difference between the two jets \(y_{jj}\), play the main role in the selection, since both tend to have large values in the VBF process. Some other variables related to the angular ordering of the VBF jets and the decay products of the Higgs boson are used to enrich the VBF process, based on the fact that the Higgs boson is produced in between the two jets, each of which goes close to the outgoing beam direction on opposite sides (see Fig. 8.2b). In addition, since the VBF is a quark-induced process without a QCD vertex (see Sect. 8.1.1), the amount of initial and final state radiation from partons is largely suppressed with respect to the main background process, \(t\bar{t}\) production. The vector sum of \(\textbf{p}_\textrm{T}\) over the hard objects in an event is sensitive to the amount of such radiation, since the size of this vector indicates the amount of recoil received by the objects.
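The collinear approximation used for \(m_{\tau \tau }\) in the 1-jet category can be sketched as follows. The missing transverse momentum is decomposed along the two lepton directions in the transverse plane, \(\textbf{p}_\textrm{T}^{\textrm{miss}} = a_1\textbf{p}_\textrm{T}^{\ell _1} + a_2\textbf{p}_\textrm{T}^{\ell _2}\), the visible momentum fractions are \(x_i = 1/(1+a_i)\), and, neglecting the lepton and \(\tau \) masses, \(m_{\tau \tau } \approx m_{\ell \ell }/\sqrt{x_1x_2}\). The function names, the treatment of events without a physical solution, and the example inputs are assumptions made for this illustration.

```python
import math

Z_MASS = 91.19  # GeV, approximate Z boson mass


def collinear_mass(lep1, lep2, pt_miss, m_ll):
    """Estimate m_tautau (GeV) with the collinear approximation.

    lep1, lep2, pt_miss: (px, py) transverse vectors in GeV
    m_ll: dilepton invariant mass in GeV
    Returns None when no physical decomposition exists.
    """
    det = lep1[0] * lep2[1] - lep1[1] * lep2[0]
    if abs(det) < 1e-9:  # leptons (anti)parallel in the transverse plane
        return None
    # Solve pt_miss = a1 * lep1 + a2 * lep2 by Cramer's rule
    a1 = (pt_miss[0] * lep2[1] - pt_miss[1] * lep2[0]) / det
    a2 = (lep1[0] * pt_miss[1] - lep1[1] * pt_miss[0]) / det
    if a1 < 0.0 or a2 < 0.0:  # neutrinos would point against the lepton
        return None
    x1 = 1.0 / (1.0 + a1)  # fraction of tau momentum carried by lepton 1
    x2 = 1.0 / (1.0 + a2)  # fraction of tau momentum carried by lepton 2
    return m_ll / math.sqrt(x1 * x2)


def passes_ztautau_veto(m_tautau):
    """Requirement m_tautau < m_Z - 25 GeV.

    Assumption of this sketch: events without a physical solution are kept.
    """
    return m_tautau is None or m_tautau < Z_MASS - 25.0


# Hypothetical event in which each lepton carries half of its tau momentum
m_tt = collinear_mass((40.0, 0.0), (0.0, 30.0), (40.0, 30.0), 45.0)
print(m_tt, passes_ztautau_veto(m_tt))
```

The method fails when the two leptons are back-to-back in the transverse plane, because the decomposition of the missing momentum is then not unique; this is handled here by returning no solution.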

Figure 8.8 shows the \(m_\textrm{T}\) distribution of the events after all the selections for the \(e\mu \) channel. A clear excess over the sum of the backgrounds is observed for both the 0- and 1-jet categories. The size of the excess divided by the number of signal events expected from the Standard Model Higgs boson production cross section is called the signal strength parameter (denoted as \(\mu \)). The value of \(\mu \) is extracted from a fit to the \(m_\textrm{T}\) distributions of all the event categories after fixing the background distributions including their normalisations, as described below.
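The definition of the signal strength can be illustrated with a simple counting estimate. This is only a sketch of the definition given above; the analysis itself extracts \(\mu \) from a fit to the \(m_\textrm{T}\) distributions rather than from a single event count. The function name and the yields are hypothetical.

```python
def signal_strength(n_observed, n_background, n_signal_sm):
    """Counting estimate: mu = (N_obs - N_bkg) / N_sig^SM."""
    if n_signal_sm <= 0:
        raise ValueError("expected SM signal yield must be positive")
    return (n_observed - n_background) / n_signal_sm


# Hypothetical yields: 1100 observed, 1000 background, 100 SM signal expected
print(f"mu = {signal_strength(1100.0, 1000.0, 100.0):.2f}")
```

A value of \(\mu = 1\) corresponds to the Standard Model expectation, and \(\mu = 0\) to the absence of a signal.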

8.1.4.3 Background Estimation

It is difficult to determine the amount of background through a template fit, in which the shapes of the signal and background are assumed and the normalisation of each contribution is determined by the fit, since the \(m_\textrm{T}\) distribution for the signal is relatively broad and the shape is somewhat similar for the signal and some of the backgrounds, as seen in Fig. 8.8. The background contribution is, therefore, estimated by using event distributions in control regions, where some of the selection criteria are inverted so that there is no overlap in events between the signal and control regions.

In this analysis, the control regions are prepared for each category of events (0, 1, or 2 jets, \(e\mu \) or \(ee + \mu \mu \) final states) and for each background process (WW, top, Drell-Yan, etc.). Instead of going through all of them, we pick out a few of the most relevant ones.

For example, the normalisation of the WW contribution for the 0-jet category is obtained from events in the high \(m_{\ell \ell }\) region, \(55< m_{\ell \ell } < 110\) GeV, so that the purity of the WW contribution is improved while keeping event selection criteria similar to those of the signal region. The remaining background from non-WW processes in this control region is subtracted by using simulated events.
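The control-region procedure can be sketched as a normalisation factor: the simulated yield of the target process is scaled so that, after subtracting the other simulated backgrounds, it matches the data in the control region, and the same factor is then applied to the simulated yield in the signal region. The function names and all yields below are hypothetical; the analysis itself determines these factors in a simultaneous fit.

```python
def normalisation_factor(n_data_cr, n_other_mc_cr, n_target_mc_cr):
    """Data-driven scale factor for the target background in a control region.

    n_data_cr: observed events in the control region
    n_other_mc_cr: simulated yield of all other processes in the control region
    n_target_mc_cr: simulated yield of the target process in the control region
    """
    return (n_data_cr - n_other_mc_cr) / n_target_mc_cr


def extrapolate_to_signal_region(factor, n_target_mc_sr):
    """Corrected target-background yield in the signal region."""
    return factor * n_target_mc_sr


# Hypothetical WW control region and signal region yields
nf = normalisation_factor(n_data_cr=1200.0, n_other_mc_cr=300.0,
                          n_target_mc_cr=750.0)
print(f"NF = {nf:.2f}, WW in SR = {extrapolate_to_signal_region(nf, 500.0):.0f}")
```

The extrapolation relies on the simulation only for the ratio of signal-region to control-region yields, which is why the control region is kept kinematically close to the signal region.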

The strongest constraint on the normalisation of the \(t\bar{t}\) contribution comes from the \(e\mu \) final state of the 1-jet category, but with one b-tagged jet explicitly required, since practically all top quarks decay to the bW final state. In addition, the requirement on the leptons is tightened by requesting \(m_\textrm{T}^\ell > 50\) GeV, where \(m_\textrm{T}^\ell \) is defined as the transverse mass of one of the leptons and the missing \(p_{\textrm{T}}\) vector in the \(x-y\) plane. It is meant to reconstruct the transverse mass of the W bosons from the top-quark decays. After applying these criteria, the control region consists almost entirely of top-quark production. The background normalisations determined in this way are consistent with the simulation for most of the control regions, despite the fact that the event selection for \(H \rightarrow WW\) may be at the corner of the phase space for the background processes.

After repeating similar exercises for the other event categories, the normalisation factors for the background processes as well as the signal contribution are finally fixed by performing a simultaneous likelihood fit, where some of the normalisation factors are allowed to float while others are fixed. The final result for the 8 TeV analysis is \(\mu = 1.09^{+0.16}_{-0.15}\mathrm{(stat)}^{+0.17}_{-0.14}\mathrm{(syst)}\). The main sources of systematic uncertainty are of theoretical origin, such as the cross section prediction of the signal itself, since the strength parameter is the ratio of the measured cross section to the predicted one. For the 13 TeV analysis, the statistical uncertainty was improved and became smaller than the total systematic uncertainty.
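To compare the sizes of the statistical and systematic components quoted above, they can be combined into a total uncertainty. The sketch below assumes the two components are uncorrelated and adds them in quadrature, separately for the upward and downward variations; the function name is illustrative.

```python
import math


def combine_in_quadrature(*components):
    """Total uncertainty from uncorrelated components."""
    return math.sqrt(sum(c * c for c in components))


# Statistical and systematic uncertainties of the 8 TeV result quoted above
total_up = combine_in_quadrature(0.16, 0.17)
total_down = combine_in_quadrature(0.15, 0.14)
print(f"mu = 1.09 +{total_up:.2f} -{total_down:.2f}")
# prints "mu = 1.09 +0.23 -0.21"
```

The two components are of similar size in this result, so reducing only the statistical uncertainty with more data improves the total precision by a limited amount.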

8.2 Search for Physics Beyond the Standard Model

One of the main goals of high-energy experiments is the discovery of new phenomena, so there are many data analyses searching for physics beyond the standard model (BSM). Among them, we explain SUSY (supersymmetry) and resonance searches, which are typical BSM searches; their ideas are applicable to data analyses for other BSM scenarios.

8.2.1 SUSY

Many searches for phenomena beyond the Standard Model target a signal that does not produce any resonance peak of new particles. One of the best examples is the SUSY search. Supersymmetry is a new fermion-boson symmetry, in which new fermion (boson) partners are introduced for all Standard Model bosons (fermions). For example, the supersymmetric partners of the electron (e), weak boson (W), quark (q), and gluon (g) are the scalar electron (selectron, \(\tilde{e}\)), wino (\(\tilde{W}\)), scalar quark (squark, \(\tilde{q}\)), and gluino (\(\tilde{g}\)), respectively.Footnote 6 If the symmetry were exact, they would have the same masses as their partners; however, no such particles have been observed so far. The symmetry is therefore assumed to be broken, so that the supersymmetric partners can be heavy. The lightest supersymmetric particle (LSP) is assumed to be neutral and stable (under R-parity conservation) and therefore escapes detection, which makes the LSP a good candidate for dark matter. This is one of the motivations for SUSY models.

In R-parity conserving SUSY models, SUSY particles, which have not been observed so far, can be produced in pairs at the LHC, and each SUSY particle eventually decays into SM particles and one LSP. Because the LSPs in the decay chains escape detection, we cannot reconstruct the mass of any SUSY particle produced in the decay chain. Even in such cases, there are several useful variables to search for the SUSY signal, and more variables are being developed.

Fig. 8.9
figure 9

Feynman diagrams of SUSY production via strong interaction in the pp collider

Since the LHC is a pp collider, we expect large production cross sections for SUSY signals via the strong interaction: \(gg \rightarrow \tilde{g}\tilde{g}\), \(gg \rightarrow \tilde{q}\tilde{q}\), and \(gq \rightarrow \tilde{g}\tilde{q}\), as shown in Fig. 8.9. Searches for SUSY in these channels are therefore important at the LHC. We focus on the search for SUSY through the \(gg \rightarrow \tilde{g}\tilde{g}\) production process, where we assume that all the other SUSY particles except for the lightest neutralino \(\tilde{\chi _1}^0\) are heavier than the gluino \(\tilde{g}\).

In such a simple scenario, each gluino decays into two quarks plus \(\tilde{\chi _1}^0\) via \(\tilde{g} \rightarrow q\tilde{q}^* \rightarrow qq\tilde{\chi _1}^0\), giving four quarks (including anti-quarks) and two \(\tilde{\chi _1}^0\) in the final state. In this analysis, we therefore require four or more high-\(p_{\textrm{T}}\) jets and large missing transverse energy. Additional high-\(p_{\textrm{T}}\) jets might come from initial- and final-state radiation. In nominal SUSY searches, there are two useful variables to separate signal events from background events: the missing transverse energy \(E_{\textrm{T}}^{\textrm{miss}}\) and the so-called “effective” mass \(m_\textrm{eff}\). The \(m_\textrm{eff}\) variable is defined as the scalar sum of the transverse momenta of the jets and \(E_{\textrm{T}}^{\textrm{miss}}\): \(m_\textrm{eff}= \sum _\textrm{jet} p_{\textrm{T}}+ E_{\textrm{T}}^{\textrm{miss}}\). The number of jets included in the sum depends on the analysis, for example, up to the four leading jets in \(p_{\textrm{T}}\) order. The \(m_\textrm{eff}\) variable reflects the mass scale of the SUSY particle pair initially produced. Figure 8.10 shows the \(m_\textrm{eff}\) distribution for a gluino and squark search in ATLAS [8]. SUSY signal events can appear as bumps in the high \(E_{\textrm{T}}^{\textrm{miss}}\) and \(m_\textrm{eff}\) regions. This is the typical SUSY signature for which we have searched.
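The definition of \(m_\textrm{eff}\) can be illustrated with a short Python sketch. The function names, the event values, and the selection thresholds below are hypothetical and chosen only for illustration; the actual thresholds depend on the signal region of each analysis.

```python
def effective_mass(jet_pts, met, max_jets=None):
    """m_eff = scalar sum of jet pT (GeV) plus missing transverse energy (GeV).

    If max_jets is given, only the leading max_jets jets in pT order are summed.
    """
    leading = sorted(jet_pts, reverse=True)
    if max_jets is not None:
        leading = leading[:max_jets]
    return sum(leading) + met

def passes_selection(jet_pts, met, min_jets=4, jet_pt_min=50.0,
                     met_min=300.0, meff_min=1200.0):
    """Toy selection: >= min_jets hard jets, large ETmiss, large m_eff.

    All thresholds are hypothetical placeholders, not values from a real analysis.
    """
    hard_jets = [pt for pt in jet_pts if pt > jet_pt_min]
    if len(hard_jets) < min_jets or met < met_min:
        return False
    return effective_mass(hard_jets, met, max_jets=min_jets) > meff_min

# Toy event with five jets and large missing transverse energy
jets = [300.0, 500.0, 100.0, 200.0, 50.0]
print(effective_mass(jets, met=400.0, max_jets=4))  # 1500.0
print(passes_selection(jets, met=400.0))            # True
```

Restricting the sum to a fixed number of leading jets makes the variable less sensitive to soft additional jets from radiation or pile-up.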

In practice, we are moving to more complex analyses to improve the signal sensitivity, since no SUSY signal has been seen at the LHC so far. We have adopted multivariate analysis techniques such as BDTs, deep learning (DL), etc. The variables \(E_{\textrm{T}}^{\textrm{miss}}\) and \(m_\textrm{eff}\) are among their input variables. In these techniques, not only each input variable but also the correlations among the input variables are utilised to separate the signal from the background. Since the selection criteria are determined by using MC events (for example, a BDT or DL model is trained with MC samples), we must make sure that the correlations among the variables in MC events are as similar as possible to those in the real data. Such checks are required when adopting multivariate analysis techniques.
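One simple way to carry out such a check is to compare the linear correlation coefficient between two input variables in MC and in data, for example in a control region. The sketch below uses the Pearson coefficient as an assumed figure of merit; the helper names and the toy event values are invented for illustration, and a real analysis would typically inspect all pairs of inputs and their full two-dimensional distributions.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_difference(mc_events, data_events):
    """Difference (data - MC) of the correlation between two variables.

    Each sample is a list of (ETmiss, m_eff) pairs in GeV.
    """
    r_mc = pearson(*zip(*mc_events))
    r_data = pearson(*zip(*data_events))
    return r_data - r_mc

# Invented toy samples of (ETmiss, m_eff) pairs
toy_mc = [(200.0, 900.0), (350.0, 1400.0), (500.0, 2100.0), (650.0, 2500.0)]
toy_data = [(210.0, 950.0), (340.0, 1300.0), (520.0, 2000.0), (640.0, 2600.0)]

print(f"r(MC)     = {pearson(*zip(*toy_mc)):.3f}")
print(f"r(data)   = {pearson(*zip(*toy_data)):.3f}")
print(f"data - MC = {correlation_difference(toy_mc, toy_data):+.3f}")
```

A difference that is large compared with its statistical uncertainty would indicate that the MC does not model the correlation well enough for the multivariate selection to be trusted.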

Fig. 8.10
figure 10

Reprinted from [8] under the Creative Commons Attribution 4.0 International License © 2018 CERN, for the ATLAS Collaboration. The points with bars show observed data. The histograms show the MC background predictions prior to the fits. The arrows indicate the values at which the requirements on \(m_\textrm{eff}\) are applied. The expected distribution for a SUSY signal model point is shown with a dotted-line (masses in GeV)

Distribution of \(m_\textrm{eff}\) for a 6-jet region in the SUSY search.

8.2.2 Resonance Search

Just as charmonium was discovered as a resonance in electron pairs and the Higgs boson was recently discovered as peaks in pairs of \(\gamma \)s and in four leptons, it is historically evident that looking for resonances of new particles is one of the most effective and easiest ways to search for new physics, independent of the theoretical models.

Fig. 8.11
figure 11

Dimuon invariant mass distribution for oppositely charged muon pairs with transverse momentum above 4 GeV and pseudorapidity \(|\eta | < 2.5\) and selected by muon triggers. Reprinted under the Creative Commons Attribution 4.0 International License from [9] © 2015 CERN for the benefit of the ATLAS Collaboration. The \(J/\psi \), \(\psi '\), \(\varUpsilon \), and Z resonances are clearly visible in this distribution

The distribution of the invariant mass for oppositely charged muon pairs with transverse momentum above 4 GeV and pseudorapidity \(|\eta | < 2.5\), selected by muon triggers at the ATLAS experiment, is shown in Fig. 8.11. When two reconstructed muons originate from a particle with a narrow decay width, such as the \(J/\psi \), \(\psi '\), \(\varUpsilon \), or Z boson, the invariant mass reconstructed from the momenta and energies of the two muons is measured to be close to the mass of the resonance. On the other hand, the invariant mass reconstructed from two muon candidates (including charged particles misidentified as muons) that do not originate from the decay of a single particle is distributed continuously, according to the combination of the momenta and energies of the two muons. From this example of a peak search for “known” particles, one can learn that

  • more precise measurement of the invariant mass provides a sharper peak of the resonance over the backgrounds,

  • the level of the reducible backgrounds due to mismeasurement, such as fakes, needs to be lowered as much as possible, and

  • the distribution of irreducible backgrounds needs to be well under control to estimate the number of background events.
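The invariant mass used in this kind of peak search can be sketched in Python as follows. The snippet builds four-vectors from \(p_{\textrm{T}}\), \(\eta \), and \(\phi \) under the muon mass hypothesis and applies the kinematic requirements quoted for Fig. 8.11; the function names are illustrative, and trigger and charge requirements are not modelled.

```python
import math

MUON_MASS = 0.10566  # GeV

def four_vector(pt, eta, phi, mass=MUON_MASS):
    """Return (E, px, py, pz) in GeV from pT, pseudorapidity, and azimuth."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero

def passes_muon_selection(pt, eta):
    """Kinematic requirements quoted in the text: pT > 4 GeV, |eta| < 2.5."""
    return pt > 4.0 and abs(eta) < 2.5

# Two back-to-back central muons with pT = 45 GeV each: mass close to 90 GeV
mu1 = four_vector(45.0, 0.0, 0.0)
mu2 = four_vector(45.0, 0.0, math.pi)
print(f"m(mu mu) = {invariant_mass(mu1, mu2):.2f} GeV")
```

Since the mass is computed from the measured momenta, any smearing of \(p_{\textrm{T}}\) propagates directly into the width of the reconstructed peak, which is the content of the first point in the list above.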

The LHC experiments can search for new resonances predicted by BSM hypotheses up to 10 TeV by using the invariant mass reconstructed from combinations of two or more electrons, muons, photons, and jets, including those from the decays of heavier particles such as top quarks. The following sections show two examples of BSM resonance searches.

8.2.2.1 Dilepton Resonances

Since we expect to measure the electron energy and the muon momentum more precisely than the energy of jets, the dilepton (dielectron and dimuon) final state is the most promising channel in BSM resonance searches. From the theoretical point of view, various models predict resonances that decay into dileptons, and these resonances can be categorised according to their spin. Thus, the experimentalists first search for any excess in the dilepton mass distribution and then interpret the result of the search in terms of models with such new resonances.

Fig. 8.12
figure 12

Reprinted under the Creative Commons Attribution 4.0 International License from [10] © 2019 The Author. Generic zero-width signal shapes, scaled to 20 times the value of the corresponding expected upper limit at 95% CL on the fiducial cross section times branching ratio, with pole masses of \(m_X =\) 1.34, 2, and 3 TeV, as well as background-only fits, are superimposed. The data points are plotted at the centre of each bin. The error bars indicate statistical uncertainties only. The differences between the data and the fit results in units of standard deviations of the statistical uncertainty are shown in the bottom panels

Distribution of the (a) dielectron and (b) dimuon invariant mass for events passing the full selection.

The filled points in Fig. 8.12 show the distribution of the dielectron and dimuon invariant mass (\(m_{\ell \ell }\)) for events passing the full selection using 139 \(\text{ fb}^{-1}\) of pp collision data collected at \(\sqrt{s}=13\) TeV with the ATLAS detector [10]. The event selection is based on quality cuts on the electrons and muons, their \(p_{\textrm{T}}\), and fiducial cuts. The \(m_{\ell \ell }\) distribution of the backgrounds, shown as red solid lines in Fig. 8.12, is modelled by the function

$$\begin{aligned} f(m_{\ell \ell }) = f_{\textrm{BW}, Z} (m_{\ell \ell }) \cdot (1-x^c)^b \cdot x^{\sum ^3_{i=0} p_i \log {(x)}^i}, \end{aligned}$$
(8.1)

where \(x=m_{\ell \ell }/\sqrt{s}\) and b, c, and \(p_i\) with \(i=0,\ldots ,3\) are the parameters determined by the fit. The function \(f_{\textrm{BW}, Z} (m_{\ell \ell }) \) is a Breit-Wigner function with \(m_Z = 91.1876\) GeV and \(\Gamma _Z = 2.4952\) GeV, which models the tail of the Z boson line shape in the high-mass region. If new heavy particles with pole masses of 1.34, 2, and 3 TeV existed, one could find peaks in the dilepton mass over the background prediction, as shown by the dashed curves in Fig. 8.12. In the prediction for these new particles, zero width is assumed, i.e., the width of the distributions is only due to the detector resolution. In the energy region of interest, the electron energy measured from the electromagnetic shower is more precise than the momentum measured for a charged particle, so the dielectron mass is reconstructed from the energy measurement. For the dimuon channel, on the other hand, only the momentum measurement is available. Therefore, the mass resolution of the dielectron channel is better than that of the dimuon channel. Figure 8.12 does not show any sign of a signal from a new particle. To quantify whether a signal exists or not, one can calculate the probability that the data are compatible with the background-only hypothesis, as described for the \(H\rightarrow \gamma \gamma \) peak search (see Sect. 8.1.2).
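To make the structure of Eq. (8.1) concrete, the following Python sketch evaluates the background function for given parameters. The Breit-Wigner form used here (an unnormalised relativistic one) is an assumption, since the text does not spell out the exact expression, and the parameter values in the example are arbitrary placeholders rather than fit results.

```python
import math

M_Z = 91.1876      # GeV, value quoted in the text
GAMMA_Z = 2.4952   # GeV, value quoted in the text
SQRT_S = 13000.0   # GeV, centre-of-mass energy of the data set

def breit_wigner_z(m):
    """Unnormalised relativistic Breit-Wigner for the Z boson (assumed form)."""
    return 1.0 / ((m * m - M_Z * M_Z) ** 2 + (M_Z * GAMMA_Z) ** 2)

def dilepton_background(m, b, c, p):
    """Eq. (8.1): f_BW,Z(m) * (1 - x^c)^b * x^(sum_i p_i * log(x)^i), x = m / sqrt(s).

    p is a sequence (p_0, p_1, p_2, p_3).
    """
    x = m / SQRT_S
    log_x = math.log(x)
    exponent = sum(p_i * log_x ** i for i, p_i in enumerate(p))
    return breit_wigner_z(m) * (1.0 - x ** c) ** b * x ** exponent

# Arbitrary placeholder parameters, NOT the fitted values of the analysis
params = dict(b=1.5, c=1.0 / 3.0, p=(-12.0, -4.0, -1.0, -0.1))
for mass in (500.0, 1340.0, 2000.0, 3000.0):
    print(f"m = {mass:6.0f} GeV  f = {dilepton_background(mass, **params):.3e}")
```

In a fit, the overall normalisation and the parameters b, c, and \(p_i\) would be adjusted so that the function describes the observed \(m_{\ell \ell }\) spectrum; here only the functional form is illustrated.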

Fig. 8.13
figure 13

Reprinted under the Creative Commons Attribution 4.0 International License from [11] © CERN, for the benefit of the ATLAS Collaboration. The solid line depicts the background prediction from the sliding-window fit. The vertical lines indicate the most discrepant interval, for which the p-value is 0.89 as reported in the figure. The expected contributions for \(q^*\) signal with a mass of 4 and 6 TeV are overlaid, normalised to 10 times their predicted cross section. The lower panel shows the bin-by-bin significance of the data-fit discrepancy, based only on statistical uncertainties

The reconstructed dijet mass distribution, \(m_{jj}\), is shown for events with \(p_{\textrm{T}}> 150\) GeV for the two leading jets, with \(|y^*| <0.6\), and \(m_{jj}\) greater than 1.1 TeV (filled points).

8.2.2.2 Dijet Resonances

New heavy particles that couple to partons, such as excited quarks (\(q^*\)), are predicted in many BSM theories; they can be produced directly in pp collisions at the LHC and decay into partons. Events with such a new heavy particle produce a peak in the distribution of the dijet invariant mass (\(m_{jj}\)). On the other hand, since the production of jet pairs at hadron colliders in the SM primarily results from \(2 \rightarrow 2\) parton scattering processes described by QCD, a smooth and monotonically decreasing \(m_{jj}\) distribution is expected. The filled points in Fig. 8.13 show the \(m_{jj}\) distribution for events with \(p_{\textrm{T}}> 150\) GeV for the two leading jets, with \(\displaystyle |y^{*}| \equiv \frac{1}{2} |y_1 - y_2| <0.6\), and \(m_{jj}\) greater than 1.1 TeV, where \(y_1\) and \(y_2\) are the rapidities of the two leading jets [11]. The \(m_{jj}\) distribution of the backgrounds, shown as the solid red line in Fig. 8.13, is empirically known to be well described by the function

$$\begin{aligned} f(x) = p_1 (1-x)^{p_2} x^{p_3 + p_4 \ln {x}}, \end{aligned}$$
(8.2)

where \(x=m_{jj}/\sqrt{s}\). The parameters \(p_1\) to \(p_4\) are determined by a fit to the real data. If a new heavy resonance existed, one could find a peak in the dijet mass above the background prediction, as shown by the open points in Fig. 8.13. The interval of the \(m_{jj}\) distribution in which the data deviate most from the background prediction is indicated by the two vertical blue lines in Fig. 8.13. The p-value for this most discrepant interval is calculated to be 0.89, so no significant excess is observed.
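The event selection and the background parametrisation of Eq. (8.2) can be sketched in Python as follows. The value \(\sqrt{s} = 13\) TeV is assumed for the data set of [11], the function names are illustrative, and the parameter values in the example are arbitrary placeholders rather than fit results.

```python
import math

SQRT_S = 13000.0  # GeV, assumed centre-of-mass energy

def y_star(y1, y2):
    """Half the rapidity difference of the two leading jets."""
    return 0.5 * (y1 - y2)

def passes_dijet_selection(pt1, pt2, y1, y2, m_jj):
    """Selection quoted in the text: pT > 150 GeV, |y*| < 0.6, m_jj > 1.1 TeV."""
    return (min(pt1, pt2) > 150.0
            and abs(y_star(y1, y2)) < 0.6
            and m_jj > 1100.0)

def dijet_background(m_jj, p1, p2, p3, p4):
    """Eq. (8.2): f(x) = p1 * (1 - x)^p2 * x^(p3 + p4 * ln x), x = m_jj / sqrt(s)."""
    x = m_jj / SQRT_S
    return p1 * (1.0 - x) ** p2 * x ** (p3 + p4 * math.log(x))

# Toy events: the second one fails the |y*| requirement (|y*| = 1.0)
print(passes_dijet_selection(800.0, 700.0, 0.5, -0.5, 2000.0))  # True
print(passes_dijet_selection(800.0, 700.0, 1.0, -1.0, 2000.0))  # False

# Arbitrary placeholder parameters, NOT the fitted values of the analysis
for mass in (1100.0, 2000.0, 4000.0, 6000.0):
    print(f"m_jj = {mass:6.0f} GeV  f = {dijet_background(mass, 1.0, 10.0, -5.0, 0.0):.3e}")
```

The requirement on \(|y^*|\) favours central scattering, where a heavy resonance decay would be concentrated, while the QCD background is dominated by scattering at small angles.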