Searching for anomalous quartic gauge couplings at muon colliders using principal component analysis

Searching for new physics (NP) is one of the areas of high-energy physics that requires the most processing of large amounts of data. At the same time, quantum computing has huge potential advantages when dealing with large amounts of data. The principal component analysis (PCA) algorithm may be one of the bridges connecting these two aspects: on the one hand, it can be used for anomaly detection, and on the other hand, there are corresponding quantum algorithms for PCA. In this paper, we investigate how to use PCA to search for NP. Taking anomalous quartic gauge couplings in the tri-photon process at muon colliders as an example, we find that PCA can indeed be used to search for NP. Compared with the traditional event selection strategy, the expected constraints on the operator coefficients obtained by the PCA-based event selection strategy are even tighter.


Introduction
Since its establishment, the Standard Model (SM) has withstood the challenge of high-precision experimental measurements in describing strong, electromagnetic, and weak interaction phenomena. The theoretical results are almost in perfect agreement with the experimental observations, but some important fundamental problems remain difficult to explain within the framework of the SM [1][2][3][4][5]. The search for new physics (NP) beyond the SM has become one of the most active and important topics in high energy physics (HEP) [6].
As collider performance improves and luminosities increase, efficient ways to process large amounts of data become important. One of these is to use machine learning (ML) algorithms. ML is a general term for a class of algorithms drawing on probability theory, statistics, and other disciplines. Because of its advantages in complex data processing, ML has been applied in many fields, including HEP [7][8][9][10][11][12][13][14][15][16][17][18]. At present, many applications show that anomaly detection (AD) ML methods can be applied effectively in phenomenological studies of NP. One of their advantages is that, when AD is applied to search for NP, the implementation is often independent of the NP model being searched for [19][20][21][22][23][24][25][26]. Nevertheless, it should be pointed out that the tunable parameters of AD methods often depend on the NP models and processes to be studied.

⋆ Corresponding author. E-mail addresses: dyf2818051165@163.com, myc@lnnu.edu.cn, yangjichong@lnnu.edu.cn
In addition, quantum computing is another efficient way to process large amounts of complex data. Many ML algorithms can be accelerated by quantum computing [27][28][29]. One example is the principal component analysis (PCA) algorithm [30,31]. The PCA algorithm can also be used for AD, but it is not clear whether a PCA anomaly detection (PCAAD) algorithm is useful for searching for NP. If it is, this strongly implies that quantum PCA can also be used to search for NP signals, and we may obtain a way to discover NP using quantum computers. There are other examples, such as the autoencoder (AE) [32][33][34][35]. The AE is better than PCA at data dimensionality reduction, so it also has good potential to perform better than PCA in AD [21,22]. Moreover, the AE also has potential for quantum acceleration [36][37][38][39]. However, PCA is more explicit in a geometric sense, as the events are mainly distributed along the eigenvectors of the covariance matrix. Moreover, PCAAD does not require the cooperation of other algorithms. Therefore, we focus on PCAAD in this paper and leave the study of the AE for the future.
To test whether the PCAAD algorithm is feasible, we conduct experiments by searching for dimension-8 operators in the SM effective field theory (SMEFT) contributing to anomalous quartic gauge couplings (aQGCs). Note that there are already ML approaches targeting the SMEFT which have been shown to be able to enhance the signal significance [13,17,18,23,40].
The rest of this paper is organized as follows. In Sec. 2, the aQGCs and the tri-photon process are briefly reviewed. The event selection strategy using PCAAD is discussed in Sec. 3, with the contribution of the interference ignored and the contributions of the SM and NP considered separately, because we focus on the features of the SM and NP events in that section. In our approach the eigenvectors are determined solely by the SM events. Sec. 4 presents the expected constraints on the operator coefficients, with the interference terms between the SM and the NP taken into account. Sec. 5 summarizes our main conclusions.
The effect of aQGCs in the tri-photon process

At muon colliders, which are also called gauge boson colliders [86], it is expected that the vector boson scattering/fusion (VBS/VBF) processes are the dominant production modes for both the SM and NP starting from a few TeV, because of the logarithmic enhancement from gauge boson radiation [85][86][87]. For the O_Ti operators, the dimension of the cross-section indicates σ_NP ∼ s³(f/Λ⁴)²; apart from logarithmic enhancement, the contributions of both VBS and tri-boson production grow as s³, i.e., the momentum dependence in the Feynman rule of the aQGCs cancels the 1/s of the propagator in the tri-boson process. Besides, different from the LHC, where the tri-boson contributions from the aQGCs are further suppressed because they must be led by sea-quark partons, no such suppression occurs at muon colliders, except that the s-channel propagator cannot be a W boson. It has been shown that, at muon colliders, the tri-boson contribution for O_T5 is competitive with the VBS one for the process µ⁺µ⁻ → γγνν̄: it surpasses the VBS when √s < 5 TeV, and is about 1/3 of the VBS when √s = 30 TeV [78]. Moreover, for the tri-photon process there are no subsequent decays in the final state. As a result, the sensitivity of the tri-photon process at muon colliders to the O_Ti operators is competitive with, or even better than, that of the VBS processes. For simplicity, we consider only the tri-photon process in this paper.
The left panel of Fig. 1 shows the Feynman diagrams induced by the O_Ti operators, while the right panel shows the SM background. The contributions of the O_T1 and O_T6 operators are exactly equal to those of O_T0 and O_T5, respectively. Therefore, in the following we consider only the O_T0,2,5,7,8,9 operators.

PCA-assisted event selection strategy
The PCA algorithm is one of the most commonly used linear dimensionality reduction methods. It uses mathematical dimensionality reduction or feature extraction to linearly transform a large number of original variables into a smaller number of important variables, which retain most of the information of the original data-set.
Denoting the points in a data-set as ⃗p_i = (p_i^1, p_i^2, …, p_i^m), where 1 ≤ i ≤ N, N is the number of points in the data-set, and m is the number of dimensions of the points (also called the number of features), the PCA algorithm starts with a standardization of the points. In this paper, we use z-score standardization, x_i^j = (p_i^j − p̄^j)/ε^j, where p̄^j and ε^j are the mean and standard deviation of the j-th feature over the whole data-set. Then the data-set can be expressed as a matrix X = (⃗x_1^T, ⃗x_2^T, …, ⃗x_N^T). An m × m covariance matrix C can be obtained as C = XX^T. Next, we calculate the eigenvectors of C, denoting ⃗η_j as the eigenvector corresponding to the eigenvalue λ_j, where the {λ_j} are in descending order. The PCA algorithm projects the points in the data-set onto the ⃗η_j to obtain new features, x̃_i^j = ⃗η_j · ⃗x_i. Reducing the dimension of the original data by PCA is equivalent to projecting the original features onto the directions that retain the maximum amount of information. Each eigenvector ⃗η_j is a projection axis, and the eigenvalue λ_j corresponding to ⃗η_j is the variance of the original features projected onto that axis. According to the maximum variance theory, the larger the variance, the more information is retained after projection; to ensure that little information is lost, one has to choose the projection axes with larger variances, that is, the ⃗η_j corresponding to the larger λ_j. By selecting the m′ eigenvectors corresponding to the first m′ eigenvalues, the dimension of the points is reduced from m for ⃗x_i to m′ for ⃗x̃_i. The components of ⃗x̃_i, i.e. x̃_i^j with 1 ≤ j ≤ m′, are the first m′ principal components of the data-set.
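The steps above can be sketched in a few lines of numpy; this is a minimal illustration of the standardization, diagonalization, and projection described in the text (the function name and variable names are ours, not from the paper):

```python
import numpy as np

def pca_project(points, m_prime):
    """Standardize a (N, m) data-set, diagonalize its covariance matrix,
    and project onto the leading m_prime eigenvectors (principal components)."""
    # z-score standardization: x = (p - mean) / std, feature by feature
    mean = points.mean(axis=0)
    std = points.std(axis=0)
    X = (points - mean) / std              # shape (N, m)
    C = X.T @ X                            # m x m covariance matrix (up to a 1/N factor)
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # new features: projections of each point onto the leading eigenvectors
    return X @ eigvecs[:, :m_prime], eigvals, eigvecs
```

For a 12-feature data-set as used in this paper, `pca_project(points, 4)` returns the first four principal components of each event together with the full spectrum and eigenvectors.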
The eigenvectors reflect the directions of variance of the original data, and each eigenvalue is the variance of the data in the corresponding direction. Since the background is mainly distributed along the most important eigenvectors, it can be expected that points in the background are close to these eigenvectors after decentralization. If the signal does not coincide with the background distribution, the distances between the signal points and the eigenvectors will be larger. Therefore, after the ⃗η_j are obtained, we use the distances between the points and the ⃗η_j (denoted as d_{i,j}) as anomaly scores to search for the signal events.
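The distance d_{i,j} is the ordinary Euclidean distance from a standardized point to the line through the origin spanned by an eigenvector; a minimal sketch (our own helper, assuming the point is already decentralized by the z-score standardization):

```python
import numpy as np

def distance_to_axis(x, eta):
    """Euclidean distance from standardized point x to the line spanned
    by eigenvector eta (through the origin): remove the projection
    of x onto eta and take the norm of the remainder."""
    eta = eta / np.linalg.norm(eta)        # ensure eta is a unit vector
    return np.linalg.norm(x - (x @ eta) * eta)
```

A background-like point lying almost along an important eigenvector gives a small distance, while a signal-like point off the background distribution gives a large one.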

Data preparation
To test the feasibility of the PCAAD algorithm for searching for aQGCs, we build the data-set using Monte Carlo (MC) simulation with the help of the MadGraph5@NLO toolkit [93][94][95], and a muon-collider-like detector simulation with Delphes [96] is applied. The data preparation is performed with MLAnalysis [97]. To avoid infrared divergences, in this section we use the standard cuts as default; the cuts associated with infrared divergences are given in Eq. (2), where p_T,γ and η_γ are the transverse momentum and pseudo-rapidity of each photon, respectively, and ∆R_γγ = √(∆φ² + ∆η²), where ∆φ and ∆η are the differences in azimuth angle and pseudo-rapidity between two photons, respectively. The signal events are generated one operator at a time. In this section, we choose the coefficients as the upper bounds listed in Table 1. We require that each event contains at least three photons, so that an event in the data-set consists of 12 numbers, the components of the four-momenta of three photons; that is, an event corresponds to a 12-dimensional vector. In this paper, the hardest three photons are selected, and the photons are arranged in descending order of energy for each event. In this section, we generate 600000 events for the SM and 30000 events for the NP, respectively. It needs to be pointed out that the interference is ignored in this section because we concentrate on the features of the background and signal. Note that we do not use the physical information in the data, because we want to verify that our method is independent of the physical content. For example, for photons, the four components of the four-momentum are not independent of each other, and the four-momenta of the three photons are not independent of each other. We assume that the PCAAD does not know that these data represent the four-momenta of photons, do not use the above relationships, and treat the data only as vectors of 12 numbers.

Event selection strategy
To search for NP in the target data-set, based on PCAAD, we adopt the following procedure:
1. Prepare a data-set for the SM using MC. The data-set of the SM is denoted as p^SM, and the data-set of the target is denoted as p^tar.
2. Apply the z-score standardization to the data-set of the SM, x_i^{SM,j} = (p_i^{SM,j} − p̄^{SM,j})/ε^{SM,j}, where p̄^{SM,j} and ε^{SM,j} are the mean and the standard deviation of the j-th feature of the SM data-set. When standardizing the target data-set, we also use p̄^{SM,j} and ε^{SM,j}, so that x_i^{tar,j} = (p_i^{tar,j} − p̄^{SM,j})/ε^{SM,j}. Then X^SM = ((⃗x_1^SM)^T, (⃗x_2^SM)^T, …, (⃗x_{N_SM}^SM)^T) and X^tar = ((⃗x_1^tar)^T, (⃗x_2^tar)^T, …, (⃗x_{N_tar}^tar)^T), where N_{SM,tar} are the total numbers of points in the SM and target data-sets, respectively.
3. Calculate the covariance matrix of the SM as C^SM = X^SM (X^SM)^T.
4. Find the eigenvalues and eigenvectors of the covariance matrix C^SM.
5. Sort the eigenvalues in descending order and select the eigenvectors ⃗η_j^SM corresponding to the top m′ (m′ < m = 12) largest eigenvalues. As will be explained later, in this paper we use m′ = 4.
6. Find the m′ new features by projecting the points of both X^SM and X^tar onto the ⃗η_j^SM, i.e. x̃_i^{SM,j} = ⃗η_j^SM · ⃗x_i^SM and x̃_i^{tar,j} = ⃗η_j^SM · ⃗x_i^tar.
7. Calculate the distance from a point ⃗x_i to an eigenvector, d_{i,j} = |⃗x_i − (⃗x_i · ⃗η_j^SM) ⃗η_j^SM|. The d_{i,j} for the points in the SM data-set and the target data-set are denoted as d_{i,j}^SM and d_{i,j}^tar, respectively.
8. Set cuts on the d_{i,j} to select events.
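The full procedure can be sketched as a single numpy function; this is a minimal illustration under our own naming conventions (e.g. `pcaad_scores`, the `std == 0` guard), not the authors' implementation:

```python
import numpy as np

def pcaad_scores(p_sm, p_tar, m_prime=4):
    """Steps 1-8: standardize both data-sets with the SM statistics,
    diagonalize the SM covariance matrix, and return the anomaly
    scores d_{i,j} for the SM and target data-sets."""
    mean = p_sm.mean(axis=0)
    std = p_sm.std(axis=0)
    std[std == 0] = 1.0                          # guard against constant features
    x_sm = (p_sm - mean) / std
    x_tar = (p_tar - mean) / std                 # target standardized with SM stats
    C = x_sm.T @ x_sm                            # covariance matrix of the SM set
    vals, vecs = np.linalg.eigh(C)
    vecs = vecs[:, np.argsort(vals)[::-1][:m_prime]]   # leading m' eigenvectors

    def dists(x):
        proj = x @ vecs                          # new features x-tilde
        # per-axis distance: sqrt(|x|^2 - (x . eta_j)^2) for each eigenvector
        return np.sqrt(np.maximum((x**2).sum(axis=1, keepdims=True) - proj**2, 0.0))

    return dists(x_sm), dists(x_tar)             # d^SM_{i,j}, d^tar_{i,j}
```

Step 8 then amounts to keeping, say, the events with `(d_tar < d_cut).all(axis=1)` for a chosen threshold `d_cut`.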
It can be seen that the PCAAD event selection strategy requires a data-set of the SM; it is therefore a supervised machine learning algorithm. In this section, we use the data-set of the NP as the target data-set.
Taking the case of O_T0 at √s = 3, 10, 14, and 30 TeV as examples, the standard deviations of x̃_i^{SM,j} (denoted as ε̃^{SM,j}) are shown in Fig. 2. It can be seen that the ε̃^{SM,j} with 1 ≤ j ≤ 4 are much larger than the others, which indicates that the background points are mainly distributed along the first 4 eigenvectors. Therefore, in this paper we use m′ = 4. We have also verified that the improvements of the expected constraints on the operator coefficients achievable with a larger m′ are negligible compared with the case of m′ = 4. The normalized distributions of d_{i,j}^SM and d_{i,j}^tar are shown in Fig. 3. It can be seen that the d_{i,j}^tar are generally larger than the d_{i,j}^SM, as expected. In this paper, we use d_{i,j}^tar as the anomaly score to discriminate the signal from the background. The means and standard deviations of p^SM used in the z-score standardization are listed in Appendix A, as are the components of ⃗η_{1,2,3,4}^SM. Although we need not care about the physical meanings of d_{i,1}, d_{i,2}, d_{i,3}, and d_{i,4} when searching for NP signals with an AD approach such as PCAAD, there is still an interesting and noteworthy phenomenon. For the SM, the positions of the d_{i,1}, d_{i,3}, and d_{i,4} peaks are delayed to around 2, while the peak of d_{i,2} is closer to 0 than the others. Because of the decentralization, the center points are those with high energies and small momenta, which cannot be the four-momenta of photons, which should be light-like. This is the reason that d_{i,1}^SM cannot be too close to 0. Moreover, ⃗η_2^SM approximately corresponds to the direction with all photon momenta along the z-axis. Thus the fact that d_{i,2} is distributed closer to 0 shows that the photons are more inclined to the z-axis, which can be seen as a consequence of the infrared divergence. Meanwhile, d_{i,2} also peaks at a position larger than 0 rather than at 0, due to the cuts imposed to avoid the infrared divergence.
Another noteworthy result is that ε̃^{SM,9,10,11,12} ≈ 0: PCA automatically finds the redundant information. Ignoring the effect of the detector simulation, 4 of the 12 variables are linearly related to the others by ∑_{k=1}^{3} E_k = √s and ∑_{k=1}^{3} p_k^{x,y,z} = 0, where E_{1,2,3} are the energies of the three photons and p_{1,2,3}^{x,y,z} are the components of their momenta.
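These four linear relations are just energy-momentum conservation for an exclusive tri-photon final state; a small sketch of the residuals one would check (our own helper, assuming the 12-vector layout (E, px, py, pz) per photon as described above):

```python
import numpy as np

def conservation_residuals(event, sqrt_s):
    """Residuals of the four linear relations among the 12 features:
    the summed energy minus sqrt(s), and each summed momentum component.
    'event' is the 12-vector (E, px, py, pz) for the three photons."""
    e, px, py, pz = event.reshape(3, 4).T   # rows: energies, then each momentum component
    return np.array([e.sum() - sqrt_s, px.sum(), py.sum(), pz.sum()])
```

For an ideal (undetected-smearing) event all four residuals vanish, which is why the last four eigenvalues of the standardized SM covariance matrix are nearly zero.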

Constraints on the coefficients
When no NP signal is found, the PCAAD event selection strategy can also be used to constrain the coefficients of the NP. To this end, we generate events with the coefficients in Table 2. In this section, the target data-sets consist of events generated with the SM, the NP, and the interference between the SM and the NP included. In Ref. [78], p_T,γ > 0.12 E_beam is used as a part of the event selection strategy, where E_beam is the energy of the beam. To avoid dealing with too many events, when generating events the standard cut requires p_T,γ > 0.1 E_beam, while the other standard cuts are the same as those in Eq. (2).
The cross-section can be expressed as a parabola in f_Ti/Λ⁴, σ = σ_SM + (f_Ti/Λ⁴) σ_int + (f_Ti/Λ⁴)² σ_NP, where σ_SM denotes the contribution of the SM, (f_Ti/Λ⁴) σ_int denotes the interference between the SM and the aQGCs, and (f_Ti/Λ⁴)² σ_NP is the contribution induced by the aQGCs. After obtaining the cross-section after cuts for each coefficient, we fit the cross-sections to parabolas using the least-squares method. For simplicity, we use the same criterion on d_{i,1≤j≤4} for each √s (denoted as d). Taking d_{i,1≤j≤4} < 4.2 at √s = 3 TeV and d_{i,1≤j≤4} < 4.8 at √s = 10, 14, and 30 TeV as examples, the cross-sections after the event selection strategy and the fitted cross-sections are shown in Fig. 4. It can be seen that the cross-sections fit the parabola functions well after the cuts.
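The least-squares parabola fit is a one-liner in numpy; a minimal sketch with mock numbers (the coefficient values 0.8, 0.05, and 2.4 are illustrative placeholders, not the paper's cross-sections):

```python
import numpy as np

# Hypothetical post-cut cross-sections sigma(f) scanned over the coefficient f = f_Ti/Lambda^4
f = np.linspace(-1.0, 1.0, 9)
sigma = 0.8 + 0.05 * f + 2.4 * f**2      # mock data: sigma_SM + f*sigma_int + f^2*sigma_NP

coeffs = np.polyfit(f, sigma, 2)         # least-squares fit of a degree-2 polynomial
sigma_np, sigma_int, sigma_sm = coeffs   # polyfit returns the highest power first
```

The fitted σ_SM, σ_int, and σ_NP can then be reused to predict the post-cut cross-section at any coefficient value when scanning for the expected constraints.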
In order to investigate the effect of different criteria, the performance of the PCAAD event selection strategy for O_T0 at √s = 3, 10, 14, and 30 TeV is studied. The cross-sections after cuts corresponding to different d are fitted. The expected constraints on the coefficients are estimated with the fitted cross-sections and the statistical sensitivity defined as [99,100] S_stat = √(2[(N_s + N_bg) ln(1 + N_s/N_bg) − N_s]), where N_s = (σ − σ_SM)L, N_bg = σ_SM L, and L is the luminosity. The luminosities corresponding to the conservative case are 1, 10, 10, and 10 ab⁻¹ for √s = 3, 10, 14, and 30 TeV, respectively [86]. In the conservative case, N_s/N_bg and S_stat are calculated using the fitted cross-sections with the coefficients f_T0 at the upper bounds in Table 2. N_s/N_bg and S_stat are shown in Figs. 5 and 6, respectively. It can be seen that S_stat peaks at d = 4.2 at √s = 3 TeV and at d = 4.8 at √s = 10, 14, and 30 TeV. As a consequence, these d's are chosen as the criteria for the PCAAD event selection strategy.
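The significance can be evaluated directly from the event counts; a minimal sketch, assuming the standard Poisson-likelihood significance formula is the definition used in Refs. [99,100]:

```python
import math

def s_stat(n_s, n_bg):
    """Expected statistical significance of n_s signal events on top of
    n_bg background events, using the Poisson-likelihood formula
    sqrt(2[(N_s + N_bg) ln(1 + N_s/N_bg) - N_s])."""
    return math.sqrt(2.0 * ((n_s + n_bg) * math.log(1.0 + n_s / n_bg) - n_s))
```

For n_s ≪ n_bg this reduces to the familiar n_s/√n_bg; scanning `s_stat` over the cut threshold d reproduces the kind of optimization shown in Fig. 6.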
Table 2: The ranges of the operator coefficients used in the scanning.

In this paper, the luminosities corresponding to both the conservative and optimistic cases are considered [86], and the expected constraints are listed in Tables 3 and 4, respectively. The energies and luminosities in this paper are the same as those used in Ref. [78]; however, compared to the traditional event selection strategy used in Ref. [78], the constraints are at the same order of magnitude but generally strengthened, especially the lower bounds.

Fig. 6: Same as Fig. 5 but for S_stat.

Summary
Searching for NP signals at the LHC and future colliders requires a lot of data processing. Meanwhile, quantum computers have great potential in processing large amounts of data. In this paper, we investigate an event selection strategy based on the PCA algorithm, which can be accelerated by quantum computers: we propose a PCAAD event selection strategy to search for NP signals.
The PCAAD is an automatic event selection strategy that does not require prior knowledge of the physics content of the NP. Since both the aQGCs and the muon colliders are of interest to the HEP community, in this paper we use the tri-photon process at muon colliders as an example. It can be shown that PCAAD is useful and efficient in searching for NP signals. The expected upper bounds on the operator coefficients w.r.t. the aQGCs are generally tighter than those obtained by a traditional event selection strategy.

Fig. 1 :
Fig. 1: Feynman diagrams for the process µ⁺µ⁻ → γγγ. A process induced by the O_Ti operators is on the left; a representative diagram of the SM is on the right. In the case of the SM, permuting the photons in the final state yields the other five diagrams.

Table 1 :
Constraints on the O_Ti coefficients at 95% C.L. obtained by the LHC [91,92]. Because only the transverse operators O_Ti contribute to the tri-photon process [78], in this paper we consider only the O_Ti operators; their definitions are given in Refs. [91,92], where B_µ and W_µ^i are the U(1)_Y and SU(2)_I gauge fields, and B_µν and W_µν^i are the corresponding field strength tensors. Table 1 shows the constraints on the coefficients obtained by the large hadron collider (LHC).

Table 3 :
Expected constraints on the operator coefficients at √s = 3, 10, 14, and 30 TeV in the conservative case.

Table 4 :
Expected constraints on the operator coefficients at √s = 14 and 30 TeV in the optimistic case.

Table 5 :
The means of the SM data-sets.

Table 6 :
The standard deviations of the SM data-sets.