Search for a new Z' gauge boson in $4\mu$ events with the ATLAS experiment

This paper presents a search for a new Z' vector gauge boson with the ATLAS experiment at the Large Hadron Collider using pp collision data collected at $\sqrt{s} = 13$ TeV, corresponding to an integrated luminosity of 139 fb$^{-1}$. The new gauge boson Z' is predicted by $L_{\mu}-L_{\tau}$ models to address observed phenomena that can not be explained by the Standard Model. The search examines the four-muon (4$\mu$) final state, using a deep learning neural network classifier to separate the Z' signal from the Standard Model background events. The di-muon invariant masses in the $4\mu$ events are used to extract the Z' resonance signature. No significant excess of events is observed over the predicted background. Upper limits at a 95% confidence level on the Z' production cross-section times the decay branching fraction of $pp \rightarrow Z'\mu\mu \rightarrow 4\mu$ are set from 0.31 to 4.3 fb for the Z' mass ranging from 5 to 81 GeV. The corresponding common coupling strengths, $g_{Z'}$, of the Z' boson to the second and third generation leptons above 0.003 - 0.2 have been excluded.


Introduction
This paper presents a search for a new vector boson  ′ in the four-muon (4) final state with data recorded in proton-proton ( ) collisions at √  = 13 TeV by the ATLAS detector [1] at the Large Hadron Collider (LHC), corresponding to an integrated luminosity of 139 fb −1 .The new gauge boson  ′ , which only interacts with the second and third generation leptons, is predicted by   −   models [2] which extend the Standard Model (SM) with an additional  (1)   −  symmetry to address the observed anomalies of the muon magnetic dipole moment ( − 2) [3][4][5][6] and of rare B decays [7][8][9][10].These models also aim to probe outstanding physics questions related to dark matter and neutrino mass [11][12][13].
The  ′ kinematics, mass, and interactions (with the same coupling strength to the second and third lepton families), are described by the Lagrangian below: where   =    ′  −    ′  is the  ′ field strength tensor; ℓ  = (  ,   )  ( = 2, 3, denoting the second and the third generation left-handed lepton doublets);  and  represent the right-handed muon and tau singlets; and   ′ (which will be referred to as  in the rest of this paper) is the interaction coupling constant.The parameter space of (  ′ , ) has not been strongly constrained in experiments since the  ′ does not directly couple to the electron nor to any quarks, hence it could not be directly produced from  +  − and   colliding beams.
In   collisions at the LHC, the  ′ could be produced from final state radiation of  or  pairs of the Drell-Yan (DY) process as shown in Figure 1(a) with a 4 final state, which provides the cleanest signature to search for the  ′ .For relatively low  ′ mass, the most promising experimental signature would be an excess of 4 events with the invariant mass of one  +  − pair peaking around the  ′ mass.The major background comes from the SM 4 production processes shown in Figure 1(b) to 1(d).The  ′ could also be produced in  production through the DY process,   →  →  ′  → 3 + .The experimental signature would have a final state of 3 plus large missing transverse energy.This final state is not included in this analysis.Both the ATLAS and CMS Collaborations have measured the cross-sections of the SM  → 4 process [14][15][16].The measurement by ATLAS was used by theorists to set limits in the parameter space of the   −   model [11].The CMS Collaboration has directly searched for the  ′ boson in the mass region between 5 to 70 GeV with the 4 final state using 77.3 fb −1 of data [17], and set upper limits on the  ′ to muon coupling strength, , of 0.004 -0.3 at 95% confidence level, depending on the  ′ mass.
The organization of this paper is as follows.The ATLAS detector is described in the next section.
The dataset and Monte Carlo samples used in this analysis are detailed in Section 3. Event reconstruction and selection, followed by background estimation from data, are described in Section 4 and Section 5. Event classification using a deep learning approach is presented in Section 6. Systematic uncertainties and the statistical approach to interpret data to obtain the results are reported in Section 7 and Section 8.The conclusion is given in the final section.

ATLAS Detector
The ATLAS detector [1] at the LHC covers nearly the entire solid angle around the collision point. 1 It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadron calorimeters, and a muon spectrometer incorporating three large superconducting air-core toroidal magnets.
The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides chargedparticle tracking in the range || < 2.5.The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer (IBL) installed before Run 2 [18,19].It is followed by the silicon microstrip tracker (SCT), which usually provides eight measurements per track.These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to || = 2.0.The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.
The calorimeter system covers the pseudorapidity range || < 4.9.Within the region || < 3.2, electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering || < 1.8 to correct for energy loss in material upstream of the calorimeters.Hadron calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within || < 1.7, and two copper/LAr hadron endcap calorimeters.The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements respectively.
The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroidal magnets.The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector.Three layers of precision chambers, each consisting of layers of monitored drift tubes, covers the region || < 2.7, complemented by cathode-strip chambers in the forward region, where the background is highest.The muon trigger system covers the range || < 2.4 with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.
Interesting events are selected by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [20].
The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger further reduces in order to record events to disk at about 1 kHz.
An extensive software suite [21] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Dataset and Monte Carlo Simulations
The data used for this analysis were recorded using single-muon and multi-muon triggers, corresponding to an integrated luminosity of 139 fb −1 after the application of data quality requirements [22].The transverse momentum ( T ) thresholds for the single-muon trigger vary from 20 to 26 GeV, for the di-muon trigger from 10 to 14 GeV, and for the tri-muon trigger from 4 to 6 GeV, depending on the data-taking periods [23].The overall trigger efficiency in the phase space defined by the event selections used in this analysis is higher than 98%.

Simulation of 𝒁 ′ Production
The   −   model is used for Monte Carlo (MC)  ′ signal sample production, where the  ′ couples to the left-handed (LH) muon or tau leptons and their corresponding neutrinos, and to the right-handed (RH) muon or tau leptons.In the model, the  ′ couplings to the first lepton families (electron and its neutrino) and all quarks are set to zero.The branching fractions of the  ′ decay to a pair of muons and a pair of muon neutrinos are 1 3 and 1 6 , respectively.The signal from tau decays in the 4 final state is found to be negligible in this analysis and is not included as signal.
The interactions [24] mediated by a resonance  ′ which couples to the second and third generation leptons are used for four-muon signal event generation.The signal events are generated with MadGraph5 aMC@NLO 2.7.3 [25] at leading-order (LO) accuracy in QCD by using the Universal FeynRules Output (UFO) format [26,27].Based on theoretical calculations [28,29] for related processes, the appropriate NNLO/LO correction factor K of 1.3 is used to correct the MC LO signal cross-sections.In the signal production simulations, the NNPDF2.3nloset [30] is used for parton distribution function (PDF) for the   collisions.Photos++ 3.61 [31,32] is used to simulate the effect of QED final-state radiation.
The MC simulated events are generated for a range of masses and coupling parameters of the   −   model.The  ′ mass ranges from 5 to 81 GeV, and the value of coupling constant  ranges from 0.008 to 0.316, as summarized in Table 1.The value of the  at each  ′ mass was chosen to be close to the expected experimental sensitivity to allow a sensitive search for a very small  ′ signal in the full Run 2 dataset.
With the chosen , the natural width Γ of the  ′ and the cross-section, both are proportional to  2 , are calculated as listed in Table 1.It should be noted that all the generated mass points have the ratio Γ/  ′ much smaller than 1%.Therefore, the ATLAS detector mass resolution (2%) dominates the width of the signal Z' mass peak in the experimental dimuon mass spectrum.The  ′ signal samples were simulated with the ATLAS fast simulation framework (Atlfast-II) [33] to produce predictions that can be directly compared with the data.

Simulation of Background Events
Dominant SM backgrounds in this analysis come from the SM  → 4 processes where the four leptons have an invariant mass close to that of the  boson.In the higher mass region the   * production contributes a sizable number of prompt 4 events.In addition, there are very small contributions from the Higgs boson,  t ( = , ), and tri-boson () production processes.These events are estimated with MC simulations.The background events including muons from heavy-flavour hadron decays, misidentified jets, or photon conversions (collectively referred to as "non-prompt muon background") are mostly coming from  + jets,  t and single-top-quark production processes and estimated from data in this analysis, as described in Section 5.The  + jets and  t MC samples are also produced for background studies.
Samples of diboson production  q →  ( * ) , including the processes shown in Figures 1(b) and 1(c), were simulated with the Sherpa 2.2.2 [34] generator, including off-shell effects and Higgs boson contributions, where appropriate.Fully leptonic final states and semileptonic final states, where one boson decays leptonically and the other hadronically, were generated using matrix elements at next-to-leading-order (NLO) accuracy in QCD for up to one additional parton and at LO accuracy for up to three additional parton emissions.Samples for the loop-induced processes  →   ( * ) , shown in Figure 1(d), were generated using LO-accurate matrix elements for up to one additional parton emission for both the cases of fully leptonic and semileptonic final states.The matrix element calculations were matched and merged with the Sherpa parton shower based on Catani-Seymour dipole factorisation [35,36] using the MEPS@NLO prescription [37][38][39][40].The virtual QCD corrections were provided by the OpenLoops library [41][42][43].The NNPDF3.0nnlo set of PDFs was used [44], along with the dedicated set of parton-shower parameters (tune) developed by the Sherpa authors.
The production of  t events was modelled using the Powheg Box v2 [45][46][47][48] generator at NLO in QCD with the NNPDF3.0nloPDF set and the ℎ damp parameter2 set to 1.5  top [49].The events were interfaced to Pythia 8.230 [50] to model the parton shower, hadronisation, and underlying event, with parameters set according to the A14 tune [51] and using the NNPDF2.3loset of PDFs.The associated production of top quarks with  bosons () was modelled by the Powheg Box v2 [46][47][48]52] generator at NLO in QCD using the five-flavour scheme and the NNPDF3.0nloset of PDFs.The diagram removal scheme [53] was used to remove interference and overlap with  t production.The events were interfaced to Pythia 8.230 using the A14 tune and the NNPDF2.3loset of PDFs.
The production of +jets was simulated with the Sherpa 2.2.1 generator using NLO matrix elements for up to two partons, and LO matrix elements for up to four partons, calculated with the Comix and OpenLoops libraries.They were matched with the Sherpa parton shower using the MEPS@NLO prescription with the set of tuned parameters developed by the Sherpa authors.The NNPDF3.0nnlo set of PDFs was used and the samples were normalised to a next-to-next-to-leading-order (NNLO) prediction [54].
The production of  t events was modelled using the MadGraph5 aMC@NLO 2.3.3 generator at NLO with the NNPDF3.0nloPDF.The events were interfaced to Pythia 8.210 using the A14 tune and the NNPDF2.3loPDF set.The production of tri-boson () events was simulated with the Sherpa 2.2.2 generator.Matrix elements accurate to LO in QCD for up to one additional parton emission were matched and merged with the Sherpa parton shower based on Catani-Seymour dipole factorisation using the MEPS@NLO prescription.Samples were generated using the NNPDF3.0nnloPDF set, along with the dedicated set of tuned parton-shower parameters developed by the Sherpa authors.
The generated background MC samples were processed by the full ATLAS detector simulation based on Geant4 [55].The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) was modelled by overlaying the simulated hard-scattering event with inelastic   events generated with Pythia 8.186 [56] using the NNPDF2.3loset of PDF and the A3 set of tuned parameters [57].Simulated events were reweighted to match the pile-up conditions in the data.All simulated events were processed using the same reconstruction algorithms and triggering requirements as used in data.

Event Reconstruction and Pre-selection
Proton-proton interaction vertices are reconstructed in events with at least two tracks, each with  T > 0.5 GeV.The primary hard-scatter vertex is defined as the one with the largest value of the sum of squared track transverse momenta.
Muons are identified by matching tracks reconstructed in the MS to tracks reconstructed in the ID (referred to as combined muons).To increase the muon reconstruction efficiency, non-combined muon identification algorithms are also used in the analysis, including using the MS stand-alone tracks in the region 2.5 < || < 2.7, and matching the ID tracks with calorimeter hit information within || <0.1, as well as using the ID tracks associated with at least one local track segment in the MS.In the 4 event selection at most one of the selected muons can be a non-combined muon.Each muon is then required to satisfy the 'loose' identification criteria [58].Muons are required to be isolated using a particle-flow algorithm [59] and associated with the primary hard-scatter vertex by satisfying |  0   0 | < 3 and | 0 × sin | < 0.5 mm, where  0 is the transverse impact parameter calculated with respect to the measured beam-line position,   0 its uncertainty, and  0 is the longitudinal distance between the point at which  0 is measured and the primary vertex.The minimum muon  T threshold is 3 GeV.
In addition to muons, electrons, jets and missing transverse momentum ( miss T ) are also used to select control samples for background estimation in this analysis.The reconstructions of these objects are described below.
Electrons are identified with a likelihood discriminator built from the shower shapes of electron energy deposits in the calorimeter, track-cluster matching, and some of the track quality distributions.Each electron is required to satisfy the 'medium' likelihood identification criteria [60], as well as similar vertex and isolation requirements as muons.The electrons are reconstructed in the region || < 2.5, excluding the transition area between the barrel and endcap calorimeters, 1.37 < || < 1.52, and required to have  T > 7 GeV.
Jets are reconstructed with the anti-  algorithm [61,62] with a radius parameter of  = 0.4.The jet clustering input objects are based on particle-flow [59] in the ID and the calorimeter.Jets are required to have  T > 20 GeV and || < 2.5.Jets containing B hadrons, referred to as -jets, are identified with a multivariate discriminant [63].To reduce the effect of pile-up, an additional quality requirement based on the 'Jet-Vertex-Tagger' algorithm (JVT) [64] is applied in jet identification.

𝐸 miss
T is determined as the magnitude of the negative vectorial sum of the transverse momenta of the selected and calibrated physics objects (including muons, electrons, photons, and jets including hadronically decaying tau-leptons) and the ID tracks coming from the main vertex and not associated with any physics object (soft term) [65].
Events containing at least four muons with kinematics consistent with  ( * / * ) → 4 production are then selected as follows.The four leading  T -ordered muons are required to pass the  T thresholds of 20, 15, 8, and 3 GeV, respectively.If a muon is selected as a non-combined muon, its  T must be greater than 15 GeV.Any di-muon pair in the event must have an invariant mass   greater than 4 GeV and an angular separation Δ larger than 0.2.To search for the  ′ →  +  − signature, two muon pairs are selected based on their invariant mass values.The first pair (referred to as  1 ) is selected from all the possible  +  − pairs to have the smallest mass difference between the  1 mass and the  mass, |  −   1 |.The second  +  − pair is selected from the remaining muons that has the highest invariant mass (referred to as  2 ).The correct signal di-muon pairing fraction varies with the  ′ mass, where the selected di-muon pair that forms  1 or  2 originates from the  ′ .For example, for   ′ = 5, 42, 63, 72, and 81 GeV, the correct di-muon pairing fractions are about 78%, 50%, 88%, 82%, and 90%, respectively.Finally, the selected four muons must have an invariant mass in a range of 80 to 180 GeV, but excluding the Higgs boson resonance mass region of 110 to 130 GeV.Including the 4 events in the 4 mass region between 130 to 180 GeV improves the  ′ signal detection efficiency for its mass greater than 60 GeV.
The  ′ signal efficiency at various stages of the event selection is shown in Table 2 for five representative mass points.The event selection efficiencies vary significantly depending on the  ′ mass.At generator level, an MC filter is applied, which requires at least four muons with   > 2 GeV and || < 3.0.The MC filter efficiencies of these representative  ′ signal samples are listed in the table as well.The  ′ production signature is searched for in the  1 or  2 mass spectrum depending on the assumed  ′ mass.The relatively high-mass  ′ signals mostly appear as a peak in the  1 spectrum while the relatively low-mass signals mostly appear as a peak in the  2 spectrum.Representative examples of the predicted signal over background, after further selection with a deep learning approach which will be described in Section 6, are shown in Figure 4.In the analysis the  1 and  2 mass spectra are scanned to search for a  ′ with mass greater or smaller than 42 GeV, respectively.The boundary value of 42 GeV is chosen based on the studies to optimize the search sensitivity.
The numbers of 4 events in data and the estimated background yields are given in Table 3.More details about the estimation of the reducible backgrounds containing non-prompt muons can be found in Section 5.The total uncertainties of simulated backgrounds are also listed in the table.
The evaluations of systematic uncertainties will be described in Section 7.

Estimation of Reducible Background from Data
A data-driven technique is used to estimate the reducible background from  + jets and  t production.Events from these processes may contain two prompt leptons from  or  boson decays, together with additional activity such as heavy-flavor jets or misidentified components of jets yielding reconstructed muons, collectively referred to as non-prompt muons.Such backgrounds are estimated from data using a control sample of  +  −     events, selected with the standard signal requirements except that non-prompt muons,   , are selected in place of two of the signal muons.Non-prompt muons are defined as muon candidates using the standard selection but fail the requirements on isolation and impact parameter.However, they are required to pass a much looser impact parameter significance requirement, |  0   0 | < 10.The background in the signal sample is estimated by scaling each event in the selected  +  −     control sample by  1 ×  2 , where the   ( = 1, 2) is referred to as a "fake-factor" for each of the two non-prompt muons and is computed as a function of their  T and ||.
The fake-factor  is derived from two independent non-prompt muon enriched data control samples dominated by  + jets or  t events.The  + jets ( t) sample contains non-prompt muons dominantly from light jets (heavy-flavor jets).The  + jets sample is tagged by two isolated prompt leptons from  decays ( →  +  − ,  +  − ).The  T -threshold for the two prompt leptons are 27 and 25 GeV and their invariant mass must be within 10 GeV of the  mass.The  miss T of the event must be less than 25 GeV.The  t sample is tagged by two prompt isolated high- T  ±  ∓ leptons with  T thresholds of 28 and 25 GeV and associated with at least one -jet.The event is required to have  miss T > 50 GeV.If there is an additional isolated high- T lepton in the event, the transverse mass calculated with the  miss T and the third lepton is required to be less than 60 GeV to reduce the contribution from the  t events.
Events with four muons that satisfy the signal selection criteria are removed from both  + jets and  t control samples.In addition, MC simulated diboson events satisfying the selection criteria are subtracted from these control samples.Excluding the target leptons, the additional reconstructed muon objects, passing either the signal muon or the non-prompt muon selection criteria are counted in the control samples.The fake-factor  is calculated as the ratio of the probability for a non-prompt muon to satisfy the signal muon selection criteria to the probability for a non-prompt muon to fail the signal muon selection.The factor  is derived as a function of || and  T of the non-prompt muons, which varies from 0.11 to 0.76 (0.01 to 0.27) in the  + jets ( t) control sample.Then, the fake-factors derived from two data control samples are combined using the  1 spectra obtained from simulated 4 background events from  + jets and  t processes.A simultaneous fit of the  1 spectra of the MC events to data selected in the  +  −     control sample is performed to determine the fractions of the  + jets and  t events in each bin of the mass spectrum.
The overall systematic uncertainty of  is about 14%.It is determined with alternative non-prompt muon selections, such as changing the muon isolation criteria in data control samples (7.6%), and the uncertainties of the  + jets and  t event fractions when combining the fake-factors derived from two control samples (8.6%).The statistical uncertainties of the control samples are also accounted as part of the systematic uncertainties (6.1%).
Additional reducible background contributions come from the   process.These events contain three prompt muons from  and  decays and one non-prompt muon.This background is estimated by scaling a simulated 3 +   sample with the  derived from the  + jets sample.
The total estimated number of reducible background events is 61.1 +8.3 −9.1 (with 88% and 12% contributions from the  +  −     and 3 +   control samples, respectively) as listed in Table 3.

Event Classification with Deep Learning Approach
The selected 4 events are classified with a "Deep Learning" approach to further separate the  ′ signal from the SM background.A parametrized deep neural network (pDNN) architecture [66] is used in the analysis.This algorithm allows the training of a single classifier for multiple signal mass hypotheses in the search range.In practice, the kinematic inputs together with the  ′ mass parameters of signal and background are used for training.The MC generated  ′ masses (listed in Table 1) are used as the multiple signal mass parameters, while the distribution of the mass parameter for background events is randomly drawn from the same distribution as for the signal class.The algorithm was implemented in the Keras [67] framework with the TensorFlow [68] backend.Two classifiers are trained for low (high)  ′ mass searches using MC mass parameters lower (higher) than 40 GeV.During the training of the classifier, the training samples were composed of simulated signal and background 4 events using the pre-selection described in Section 4. A set of kinematic distributions were used for pDNN training input features.They are the  T and  of each muon, the  T of the  1 and  2 , the mass difference of the  1 and  2 , Δ and Δ of each muon pair that forms the  1 and  2 , and the  T and mass of the 4 system.Examples of the input variable distributions, both for representative signals (  ′ = 15, and 51 GeV) and background, are shown in Figure 2 to compare the predicted and observed transverse momenta (  1 T and   2 T ) distributions and the mass difference of the  1 and  2 .In addition to the major background from the SM  ( * ) → 4 production, other backgrounds, including 4 events containing non-prompt muons estimated from data, and from , , and Higgs boson production processes, are included in the plots.The training fits the weight parameters of the pDNN models which are then applied to real data or MC samples to obtain the pDNN output discriminating variable.The  1 or  2 are used as the mass parameter when applying the model to data.
To improve the power of the pDNN classifier, the model is optimized through an automatic process developed with the package Tune [69] and HEBO Search algorithm [70].The neural-network structure was chosen to have two fully connected hidden layers, each with 256 (32) nodes for the high (low) mass search in the analysis.Other training hyper-parameters, such as the learning rate with decay and class weight, for both high and low mass searches were also selected.Two pDNN output discriminant classifiers for high and low mass searches were obtained and are shown in Figure 3 for data and simulated signal and background events.
To maximize the search sensitivity, a scan of the pDNN output scores was performed to find the optimal cut values for each  ′ mass hypothesis.These cut values vary from 0.42 to 0.74 (0.12 to 0.16) for the low (high) mass region.These cuts keep high signal efficiencies between 98% and 95% (90% and 50%), while the corresponding background reductions range from 10% to 50% (50% to 96%) for the low (high) mass region.
After the 4 event selection with the pDNN classifier, the final discriminant to search for the  ′ resonance signature is the  1 (for   ′ ≥ 42 GeV) or  2 (for   ′ ≤ 42 GeV) mass spectrum as shown in Figure 4. Data are compared to the estimated background together with two representative signals with masses of 15 and 51 GeV, shown in Figure 4(a) and 4(b), respectively.The values of the gauge coupling strengths () for the two mass points are chosen for the purpose of illustration.
One should note that the dimuon masses ( 1 ,  2 ) are not used as the pDNN training variables, therefore cutting on the pDNN output scores does not sculpt the dimuon invariant mass distributions of the backgrounds (see Figure 4).To interpret the observed mass spectra, systematic uncertainties in the predictions are estimated in Section 7.

Systematic Uncertainties
Systematic uncertainties in the simulated event yields and shapes, for both signal and background processes, may arise from the calibration of the physics objects and from uncertainties in theoretical modelling used in the predictions.
The major experimental uncertainties come from the muon reconstruction, identification, and isolation requirement efficiencies.These efficiencies are corrected based on studies performed in data control regions.The energy and momentum scales and resolutions of the simulated objects are corrected to reproduce data from  →  +  − and / →  +  − decays [58].The uncertainties on the SM 4 detection efficiency are determined by varying the nominal calibrations in the   MC samples by one standard deviation, including muon momentum resolutions and scales, and the trigger, reconstruction, identification, and isolation requirement efficiencies.The overall relative experimental uncertainties in the SM 4 background event selection efficiency is about 3.9%, dominated by the uncertainty of the isolation efficiency (2.9%) and the low  T calibration uncertainty (2.0%).The hypothetic  ′ signal event selection efficiency uncertainties vary from 8.3% to 3.9%, depending on the  ′ mass.In addition, the uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [71], obtained using the LUCID-2 detector [72] for the primary luminosity measurements.Sources of theoretical uncertainties come from the choice of QCD scales (renormalization   and factorization   ), strong coupling constant   , and PDFs, as well as the parton shower models.These uncertainties affect the signal and background event selection efficiencies, normalization, and the shape of their kinematic distributions.The scales are varied independently from 0.5 to 2.0 times the nominal values and the largest deviation is chosen as the systematic uncertainty.The PDF uncertainty is estimated by comparing events generated with different PDF sets, as well as the uncertainties from the nominal PDF set itself.The maximal variation (envelope) is accounted for as the systematic uncertainty, following the PDF4LHC [73] recommendations.The   uncertainty is estimated by comparing events generated with different   values using the nominal PDF set.The parton shower uncertainty is estimated by comparing events with different parton shower parameters in the Sherpa MC samples.For the  ( * ) → 4 process, the relative uncertainties of event yields for scale, PDF,   , and parton shower, are 4.6%, 1.8%, 1.0%, and 1.9%, respectively.The uncertainties of small background contributions from the  t (5.2 ± 0.7),  (0.5 ± 0.25), and Higgs boson (0.53 ± 0.06) processes are evaluated and included in the analysis.
The interference effect between the signal and background is estimated (using generator-level MC samples simulated with MadGraph5 aMC@NLO) by evaluating the cross-section ratio, Δ/  ′ , where Δ is the difference of the cross-section from the inclusive MC sample and the sum of the cross-sections of the signal and background samples in the  ′ detection phase space.The effect varies from 1.8 to 7.5% for the  ′ mass from 5 -81 GeV, which is accounted for as additional signal yield systematic uncertainties.In addition, the MC  ′ signal filter acceptance uncertainty is estimated to be about 2%, which is calculated by varying the QCD scales, the PDF-sets, and the strong coupling constant using MC events at the generator-level for different  ′ mass points.
All the uncertainties (from both experiment and theory) on the final discriminant di-muon mass spectra are included as nuisance parameters in the signal extraction fitting process described in Section 8.1.

Data Interpretation and Results
The statistical analysis is performed by comparing the data to the sum of the background prediction and the signal to search for the  ′ signature and information about the   →  +  −  ′ → 4 signal production cross-section and the associated coupling strength.In case of no significant data excess over the background prediction, upper limits on the signal production cross-section times the decay branching fraction for different  ′ masses are set at 95% confidence level (CL).

Statistical Procedure and Results
The statistical procedure used to interpret the data is described in Ref. [74].The likelihood function is constructed from the product of Poisson probabilities: where  is the bin-index of the fitted variable distribution;  is the signal strength, and  denotes the nuisance parameters, which represent the uncertainties of the measurements; G( θ|) is the Gaussian scaling function of the nuisance parameters constructed as deviations from the nominal model of the systematic uncertainties, where θ provides a maximum likelihood estimate for .The parameter of interest in the statistical analysis is the global signal strength factor , which acts as a scale factor on the total number of events predicted by the signal model.This factor is defined such that  = 0 corresponds to the background-only hypothesis and  > 0 corresponds to a  ′ signal in addition to the background.Hypothesised values of  are tested based on the profile likelihood ratio [75], which compares data with background-only () and signal+background ( + ) models using the following test statistic: where μ and θ are the values for  and  when maximizing the  with all parameters floating, which are named as the unconditional maximum-likelihood (ML) estimators; θ is the conditional ML estimator of  for a fixed value of .This test statistic extracts the information on the signal strength from a full likelihood fit to the data.The likelihood function includes all the parameters that describe the systematic uncertainties and their correlations.
The significance of an excess in the data is first quantified with the local  0 , the probability that the background can produce a fluctuation greater than or equal to the excess observed in data.The equivalent formulation in terms of number of standard deviations is referred to as the local significance, which is computed from test statistic  0 : The global probability for the most significant excess to be observed anywhere in a given search region is estimated with the method described in Ref. [76].
In case of no significant data excess over the background, exclusion limits are set, based on the CLs prescription [77], which calculates the ratio of the probabilities based on the  +  background-only models: A value of  is regarded as excluded at 95% confident level when CLs is less than 5%.
The statistical tests are performed by first searching for data excesses by means of a  0 scan, and in case no such significant data excess is found, by setting limits on the  ′ production cross-section times the branching fraction.In both steps the values and reconstructed mass distributions of hypothesized  ′ mass   ′ are used.The steps between the mass points used in these test procedures are set to match the mass resolutions determined by reconstruction of the di-muon invariant mass.A linear interpolation of the expected event yields and shapes between the MC generated signal samples is used in the fitting process.The asymptotic approximation [75] upon which the results are based has been validated against the method (detailed in Ref. [74]) of using pseudo-experiments for several mass points.
In both steps of the statistical tests, data are fit to the  1 and  2 spectrum with background () only and signal+background ( + ) hypotheses.In the fitting process each mass spectrum is divided in the signal region (SR) and the background control region (CR).For each  ′ mass point the SR is defined in a mass window of   ′ ± 3   of the di-muon mass spectrum.The sidebands outside of the SR are defined as the CR.The di-muon mass resolution    is determined by the fully simulated  ′ mass distribution, which combines the  ′ natural width and the detector resolution, ranging from 0.10 to 1.75 GeV.The mass resolution is mostly dominated by the detector resolution.Finer binning is used in the SR to enhance the sensitivity.The background CR is used to constrain the overall normalization for the background in the signal region.The shape of the major background from  ( * ) → 4 is fixed with prior uncertainties included in the fitting process, but the normalization (or strength) floats in the fit.Other background normalizations and shapes are fixed with prior uncertainties included in the fitting.

The 𝒑 0 Scan Results
The  0 -values corresponding to the background-only hypothesis are scanned in the mass range of this analysis.A binned profile-likelihood fit [75] is performed simultaneously across the  ′ signal-region and the background control region using the predicted and observed mass spectrum as inputs.Data are fit to the  1 and  2 distributions for   ′ ≥ 42 GeV and   ′ ≤ 42 GeV, respectively, with a "sliding" mass window as the defined SR changes for different  ′ mass points.The chosen bin-size inside the SR is around 0.3    , for each mass point.The total number of bins in the CR is 20.The fit mass range of  1 ( 2 ) is [30,85] GeV ([0, 45] GeV).
The  0 -values at different  ′ mass hypothesis points are computed and transformed into Gaussian standard deviations to indicate the significance as shown in Figure 5.The smallest  0 -value is at 39.6 GeV, corresponding to a local 2.65 deviation from the background-only hypothesis, while the global deviation [76] is found to be 0.52, indicating that no significant data excess over the expected background is observed.

Upper Limits
The upper limits on the production cross-section times branching fraction of the   →  +  −  ′ → 4 process are calculated using a similar fitting procedure described in Section 8.2.Confidence intervals are computed based on the profile-likelihood-ratio test statistics [75].The observed and expected upper limits at 95% CL on the cross-section times branching fraction, (   →  ′  → 4), are shown in Figure 6(a).The upper limits on the coupling parameter  are extracted from the limits of the  ′ production cross-section times branching fraction using the   −   model, which is determined by counting all the possible  ′ decay modes in this model.At each generated  ′ mass point, a limit on the coupling strength  has been obtained from the cross-section limit.The observed and expected upper limits on the coupling parameter  are shown in Figure 6(b).The limits on the coupling  are in the range of 0.003 (for   ′ = 5 GeV) to 0.2 (for   ′ = 81 GeV) depending on the  ′ mass ranging from 5 to 81 GeV.This ensures that the ratio of the  ′ natural width and mass, Γ( ′ )/  ′ , is well below 1% in this mass range.Motivated by theoretical interpretations in Ref. [11], a 2-dimensional exclusion contour at 95% CL in the parameter-space of (  ′ , ) of the   −   model from this analysis is produced and shown in Figure 7.The parameter space exclusion regions calculated by theorists using data from the Neutrino Trident experiment [78] and the   mixing measurements by a global analysis performed in Ref. [11] are also shown in Figure 7.This had left a large gap in the parameter space not yet excluded.This gap is now largely excluded by this analysis.The coupling parameter  limits from this search as a function of the  ′ mass compared to the limits [11] from the Neutrino Trident (red) and the   mixing (green) experimental results.

Conclusion
A search for a new vector gauge boson  ′ predicted by the   −   models has been performed with a 4 final state in the invariant mass range of [80,180] GeV, excluding the Higgs boson mass window [110, 130] GeV, using 139 fb −1 of √  = 13 TeV proton-proton collision data collected with the ATLAS detector.No significant excess of events over the expected SM background is observed.Therefore, upper limits are set on the  ′ production cross-section times the decay branching fraction of the   →  ′  +  − →  +  −  +  − process, varying from 0.31 to 4.3 fb at 95% CL, in a  ′ mass range of [5,81] GeV, from which the coupling strength  of the  ′ to muons above 0.003 to 0.2 (depending on the  ′ mass) are excluded in the same mass range.This search shows significant sensitivity improvements over previous indirect and direct searches of the  ′ with 4 final state.An interesting parameter space of the   −   model prediction that was not excluded by previous experiments is now largely excluded by this search.

Figure 1 :
Figure 1: Feynman diagrams of  ′ production through radiation in a Drell-Yan process (a), and of the corresponding SM background processes (b -d) with a 4 final state.

Figure 2 :
Figure 2: Distributions of   1T (a), and   2 T (b), and the mass difference of the  1 and  2 candidates (c).Small background contributions are denoted as "other backgrounds", including 4 events containing non-prompt muons estimated from data and from , , and Higgs boson production processes.

Figure 3 :
Figure 3: The pDNN output discriminant variable distributions for low mass (a) and high mass (b) with a signal sample at 35 GeV and 51 GeV, respectively.

Figure 4 :
Figure 4: Mass spectra of  2 (left) and  1 (right) for the pDNN-selected events with a signal sample at 15 GeV and 51 GeV, respectively.

Figure 5 :
Figure 5: The  0 -value scan across the  ′ mass signal regions.

Figure 6 :
Figure 6: 95% CL upper limits (expected and observed) on the cross-sections times branching fraction (a) and coupling parameter (b).The discontinuity at 42 GeV represents the border of the low/high mass classifiers.Considering the dimuon mass resolution of 2% of the ATLAS detector, the horizontal dashed line in (b) indicates the upper limit on the valid coupling parameter of the model used in this analysis.

Figure 7 :
Figure7: The coupling parameter  limits from this search as a function of the  ′ mass compared to the limits[11] from the Neutrino Trident (red) and the   mixing (green) experimental results.

Table 2 :
The  ′ signal event selection efficiencies compared to the events passing the previous cut level for several representative mass points.The overall signal efficiencies are the products of the 4 MC filter and the combined event selection efficiencies.

Table 3 :
The selected 4 events in data and the estimated backgrounds and their combined statistical and systematic uncertainties.