1 Introduction

Since its establishment, the Standard Model (SM) has withstood the challenge of high-precision experimental measurements in describing strong, electromagnetic and weak interaction phenomena. The theoretical results are in almost perfect agreement with the experimental observations, but some important fundamental problems remain difficult to explain within the framework of the SM [1,2,3,4,5]. The search for new physics (NP) beyond the SM has become one of the most active and important topics in high energy physics (HEP) [6].

As collider performance improves and luminosities increase, efficient ways to process large amounts of data become important. One such way is to use machine learning (ML) algorithms. ML is a general term for a class of algorithms drawing on probability theory, statistics and other disciplines. Because of its advantages in complex data processing, ML has been applied in many fields, including HEP [7,8,9,10,11,12,13,14,15,16,17,18]. Many applications have shown that anomaly detection (AD) ML methods can be effectively applied in phenomenological studies of NP. One advantage is that, when AD is applied to search for NP, the implementation is often independent of the NP model being searched for [19,20,21,22,23,24,25,26]. Nevertheless, it should be pointed out that the tunable parameters of AD methods often depend on the NP models and processes under study.

In addition, quantum computing is another efficient way to process large amounts of complex data. Many ML algorithms can be accelerated by quantum computing [27,28,29]. One example is the principal component analysis (PCA) algorithm [30, 31]. The PCA algorithm can also be used for AD, but it is not clear whether a PCA anomaly detection (PCAAD) algorithm is useful for searching for NP. If it is, this strongly suggests that quantum PCA can also be used to search for NP signals, and we may obtain a way to discover NP using quantum computers. There are other examples, such as the autoencoder (AE) [32,33,34,35]. The AE outperforms PCA in data dimensionality reduction, so it has good potential to perform better than PCA in AD as well [21, 22]. Moreover, the AE also has potential for quantum acceleration [36,37,38,39]. However, PCA has a more explicit geometric interpretation, as the events are mainly distributed along the eigenvectors of the covariance matrix. Moreover, PCAAD does not require the cooperation of other algorithms. Therefore, we focus on PCAAD in this paper and leave the study of the AE for the future.

To test whether the PCAAD algorithm is feasible, we conduct experiments by searching for dimension-8 operators in the SM effective field theory (SMEFT) contributing to anomalous quartic gauge couplings (aQGCs). Note that there are already ML approaches targeting the SMEFT which have been shown to be able to enhance the signal significance [13, 17, 18, 23, 40].

There are many reasons to consider dimension-8 operators contributing to aQGCs. For example, the dimension-8 operators are important from the perspective of the convex geometry of the operator space [41,42,43]. Moreover, various NP models generate dimension-8 effective operators relevant for aQGCs [44,45,46,47,48,49,50,51,52,53], and there are distinct cases where dimension-6 operators are absent but dimension-8 operators show up [54,55,56]. As a result, aQGCs have received a lot of attention [57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77]. The existence of aQGCs makes the tri-photon process at muon colliders inconsistent with the predictions of the SM [78]. Recently, studies of muon colliders have also drawn a lot of attention [79,80,81,82,83,84,85,86,87,88,89,90]. Although the PCAAD might be suitable for various cases, as a test bed we use the PCAAD to search for the signals of aQGCs in the tri-photon process at muon colliders.

The rest of this paper is organized as follows. In Sect. 2, the aQGCs and the tri-photon process are briefly reviewed. The event selection strategy using PCAAD is discussed in Sect. 3; there the interference is ignored and the contributions of the SM and NP are considered separately, because we focus on the features of the SM and NP events, and in our approach the eigenvectors are determined solely by the SM events. Section 4 presents the expected constraints on the operator coefficients, with the interference terms between the SM and the NP taken into account. Section 5 summarizes our main conclusions.

2 The effect of aQGCs in the tri-photon process

Because only the transverse operators \(O_{T_i}\) contribute to the tri-photon process [78], in this paper we consider only \(O_{T_i}\); they are [91, 92]

$$\begin{aligned} \begin{aligned}&O_{T,0}=\textrm{Tr}\left[ {\widehat{W}}_{\mu \nu }{\widehat{W}}^{\mu \nu }\right] \times \textrm{Tr}\left[ {\widehat{W}}_{\alpha \beta }{\widehat{W}}^{\alpha \beta }\right] ,\\&O_{T,1}=\textrm{Tr}\left[ {\widehat{W}}_{\alpha \nu }{\widehat{W}}^{\mu \beta }\right] \times \textrm{Tr}\left[ {\widehat{W}}_{\mu \beta }{\widehat{W}}^{\alpha \nu }\right] ,\\&O_{T,2}=\textrm{Tr}\left[ {\widehat{W}}_{\alpha \mu }{\widehat{W}}^{\mu \beta }\right] \times \textrm{Tr}\left[ {\widehat{W}}_{\beta \nu }{\widehat{W}}^{\nu \alpha }\right] ,\\&O_{T,5}=\textrm{Tr}\left[ {\widehat{W}}_{\mu \nu }{\widehat{W}}^{\mu \nu }\right] \times B_{\alpha \beta }B^{\alpha \beta },\\&O_{T,6}=\textrm{Tr}\left[ {\widehat{W}}_{\alpha \nu }{\widehat{W}}^{\mu \beta }\right] \times B_{\mu \beta }B^{\alpha \nu },\\&O_{T,7}=\textrm{Tr}\left[ {\widehat{W}}_{\alpha \mu }{\widehat{W}}^{\mu \beta }\right] \times B_{\beta \nu }B^{\nu \alpha },\\&O_{T,8}=B_{\mu \nu }B^{\mu \nu }\times B_{\alpha \beta }B^{\alpha \beta },\\&O_{T,9}=B_{\alpha \mu }B^{\mu \beta }\times B_{\beta \nu }B^{\nu \alpha },\\ \end{aligned} \end{aligned}$$
(1)

where \({\widehat{W}}\equiv \vec {\sigma }\cdot {\vec W}/2\), \(\vec {\sigma }\) are the Pauli matrices, \({\vec W}=\{W^1,W^2,W^3\}\), \(B_{\mu }\) and \(W_{\mu }^i\) are the \(U(1)_{\textrm{Y}}\) and \(SU(2)_{\textrm{I}}\) gauge fields, and \(B_{\mu \nu }\) and \(W_{\mu \nu }\) are the corresponding field strength tensors. Table 1 shows the constraints on the coefficients obtained at the large hadron collider (LHC).

Table 1 Constraints on \(O_{T_{i}}\) coefficients at \(95~\%\) C.L. obtained at the LHC
Fig. 1

Feynman diagrams for the process \(\mu ^{+} \mu ^{-} \rightarrow \gamma \gamma \gamma \). The diagram induced by the \(O_{T_{i}}\) operators is on the left; a representative diagram of the SM is on the right. In the SM case, permuting the final-state photons yields the other five diagrams

At muon colliders, which are also called gauge boson colliders [86], it is expected that the vector boson scattering/fusion (VBS/VBF) processes are the dominant production modes for both the SM and NP starting from a few TeV, because of the logarithmic enhancement from gauge boson radiation [85,86,87]. For \(O_{T_i}\) operators, dimensional analysis of the cross section indicates \(\sigma _{NP}\sim s^3 (f/\Lambda ^4)^2\); apart from the logarithmic enhancement, the contributions of both VBS and tri-boson production grow as \(s^3\), i.e., the momentum dependence in the Feynman rule of the aQGCs cancels the 1/s of the propagator in the tri-boson process. Besides, unlike at the LHC, where the tri-boson contributions from the aQGCs are further suppressed because they must be initiated by sea quark partons, no such suppression occurs at muon colliders, except that the s-channel propagator cannot be a W boson. It has been shown that, at muon colliders, the tri-boson contribution of \(O_{T_5}\) is competitive with the VBS for the process \(\mu ^+\mu ^-\rightarrow \gamma \gamma \nu {\bar{\nu }}\): it surpasses the VBS when \(\sqrt{s}<5\;\textrm{TeV}\), and is about 1/3 of the VBS when \(\sqrt{s}=30\;\textrm{TeV}\) [78]. Moreover, for the tri-photon process, there are no subsequent decays in the final state. As a result, the sensitivity of the tri-photon process at muon colliders to the \(O_{T_i}\) operators is competitive with or even better than that of the VBS processes. For simplicity, we consider only the tri-photon process in this paper.

The left panel of Fig. 1 shows the Feynman diagram induced by the \(O_{T_i}\) operators, while the right panel shows the SM background. The contributions of the \(O_{T_{1,6}}\) operators are exactly equal to those of the \(O_{T_{0,5}}\) operators, respectively. Therefore, in the following we consider only the \(O_{T_{0,2,5,7,8,9}}\) operators.

3 PCA assistant event selection strategy

The PCA algorithm is one of the most commonly used linear dimensionality reduction methods. It linearly transforms a large number of original variables into a smaller number of important variables that retain the main information of the original data-set.

Denoting the points in a data-set as \(\vec {p}_i=(p_i^1, p_i^2, \ldots , p_i^m)\), where \(1 \le i \le N\), N is the number of points in the data-set and m is the number of dimensions of the points (also called the number of features), the PCA algorithm starts with a standardization of the points. In this paper, we use z-score standardization, \(x_i^j=(p_i^j - {\bar{p}}^j)/\epsilon ^j\), where \({\bar{p}}^j\) and \(\epsilon ^j\) are the mean and standard deviation of the j-th feature over the whole data-set. The data-set can then be expressed as a matrix \(X=(\vec {x}_1^T, \vec {x}_2^T, \ldots , \vec {x}_N^T)\), and an \(m\times m\) covariance matrix C is obtained as \(C=XX^T\). Next, we calculate the eigenvectors of C, denoting \(\vec {\eta }_j\) as the eigenvector corresponding to eigenvalue \(\lambda _j\), where \(\{\lambda _j\}\) are in descending order. The PCA algorithm projects the points in the data-set onto the \(\vec {\eta }_j\) to obtain the new features \({\tilde{x}}_i^j = \vec {\eta }_j \cdot \vec {x}_i\).
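The standardization, covariance construction, and eigen-decomposition described above can be sketched in Python with NumPy (a minimal illustration on toy data; the array shapes, the fixed mixing matrix, and all variable names are our own choices, not from the analysis):

```python
import numpy as np

def pca_eigvecs(P):
    """P: (N, m) array of N points with m features.
    Returns eigenvalues (descending), eigenvectors (columns),
    and the mean/std used for z-score standardization."""
    mean, std = P.mean(axis=0), P.std(axis=0)
    X = (P - mean) / std                  # z-score standardization
    C = X.T @ X / len(X)                  # m x m covariance matrix
    lam, eta = np.linalg.eigh(C)          # eigh returns ascending order
    order = np.argsort(lam)[::-1]         # sort eigenvalues descending
    return lam[order], eta[:, order], mean, std

# toy correlated data: 2 latent directions mixed into 5 features
rng = np.random.default_rng(0)
Z = rng.normal(size=(1000, 2))
M = np.array([[1.0, 0.5, -0.3, 0.8, 0.2],
              [0.2, -0.7, 0.6, 0.1, -0.9]])
P = Z @ M + 0.01 * rng.normal(size=(1000, 5))

lam, eta, mean, std = pca_eigvecs(P)
Xtilde = ((P - mean) / std) @ eta         # projected (new) features
```

Because the toy data lie near a 2-dimensional plane, only the first two eigenvalues are sizable; the rest are close to zero, illustrating how the eigenvalue spectrum reveals the effective dimensionality.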

Reducing the data dimension by PCA is equivalent to projecting the original features onto the directions that retain the maximum amount of information. Each eigenvector \(\vec {\eta }_j\) is a projection axis, and the eigenvalue \(\lambda _j\) corresponding to \(\vec {\eta }_j\) is the variance of the original features projected onto that axis. According to the maximum variance theory, the larger the variance, the more information is retained after projection; to ensure that little information is lost, one has to choose the projection axes with larger variances, that is, the \(\vec {\eta }_j\) corresponding to the larger \(\lambda _j\). By selecting the \(m'\) eigenvectors corresponding to the first \(m'\) eigenvalues, the dimension of the points is reduced from m for \(\vec {x}_i\) to \(m'\) for \(\vec {{\tilde{x}}}_i\). The components of \(\vec {{\tilde{x}}}_i\), i.e. \({\tilde{x}}_i^j\) with \(1\le j \le m'\), are the first \(m'\) principal components of the data-set.

The eigenvectors reflect the directions of variance of the original data, and each eigenvalue is the variance of the data in the corresponding direction. Since the background is mainly distributed along the most important eigenvectors, it can be expected that, after decentralization, the background points lie close to those eigenvectors. If the signal does not coincide with the background distribution, the distances between the signal points and the eigenvectors will be larger. Therefore, after the \(\vec {\eta }_j\) are obtained, we use the distance between a point and \(\vec {\eta }_j\) (denoted as \(d_{i,j}\)) as the anomaly score to search for signal events.
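The anomaly score just described, the distance from a standardized point to the axis spanned by a unit eigenvector, can be computed as follows (a sketch; it assumes the point has already been standardized with the background statistics):

```python
import numpy as np

def distance_to_axis(x, eta):
    """Anomaly score: distance from point x to the axis spanned by
    the unit eigenvector eta, d = sqrt(|x|^2 - (x . eta)^2)."""
    eta = eta / np.linalg.norm(eta)       # make sure eta is unit length
    proj = np.dot(x, eta)                 # projected feature x~
    return np.sqrt(np.dot(x, x) - proj**2)

# a point at (3, 4) is at distance 4 from the first coordinate axis
d = distance_to_axis(np.array([3.0, 4.0]), np.array([1.0, 0.0]))  # 4.0
```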

3.1 Data preparation

Fig. 2

The variances of the principal components

Fig. 3

Normalized distributions of \(d_{i,1\le j \le 4}\) for the SM and \(O_{T_0}\) at \(\sqrt{s} = 3\;\textrm{TeV}\) (the first row), \(10\;\textrm{TeV}\) (the second row), \(14\;\textrm{TeV}\) (the third row), and \(30\;\textrm{TeV}\) (the fourth row)

To test the feasibility of the PCAAD algorithm for searching for aQGCs, we build the data-set using Monte Carlo (MC) simulation with the help of the MadGraph5@NLO toolkit [93,94,95], and a muon-collider-like detector simulation is applied with Delphes [96]. In this section we use the standard cuts as defaults; the cuts imposed to avoid infrared divergences are

$$\begin{aligned} \begin{aligned}&p_{T,\gamma }> 10\;\textrm{GeV},\; |\eta _{\gamma }| < 2.5, \; \Delta R_{\gamma \gamma } > 0.4, \end{aligned} \end{aligned}$$
(2)

where \(p_{T,\gamma }\) and \(\eta _{\gamma }\) are the transverse momentum and pseudo-rapidity of each photon, respectively, and \(\Delta R_{\gamma \gamma }=\sqrt{\Delta \phi ^2 + \Delta \eta ^2}\), where \(\Delta \phi \) and \(\Delta \eta \) are the differences in azimuth angle and pseudo-rapidity between two photons, respectively. The signal events are generated one operator at a time. In this section, we choose the coefficients to be the upper bounds listed in Table 1. We require each event to contain at least three photons, so that an event in the data-set consists of 12 numbers, the components of the four-momenta of three photons; that is, an event corresponds to a 12-dimensional vector. In this paper, the three hardest photons are selected, and the photons in each event are arranged in descending order of energy. In this section, we generate 600,000 events for the SM and 30,000 events for the NP, respectively. It should be pointed out that the interference is ignored in this section because we concentrate on the features of the background and the signal.

Note that we do not use the physical information in the data, because we want to verify that our method is independent of the physical content. For example, the four components of a photon's four-momentum are not independent of each other, and the four-momenta of the three photons are not independent of each other. We assume that the PCAAD does not know that these data represent photon four-momenta, make no use of the above relations, and simply treat the data as 12-component vectors.

3.2 Event selection strategy

To search for NP in the target data-set, based on PCAAD, we propose the following procedure:

  1. Prepare a data-set for the SM using MC simulation. The data-set of the SM is denoted as \(p^{\textrm{SM}}\), and the data-set of the target is denoted as \(p^{\textrm{tar}}\).

  2. Apply the z-score standardization to the SM data-set: \(x_i^{\textrm{SM},j}=(p_i^{\textrm{SM},j} - {\bar{p}}^{\textrm{SM},j})/\epsilon ^{\textrm{SM},j}\), where \({\bar{p}}^{\textrm{SM},j}\) and \(\epsilon ^{\textrm{SM},j}\) are the mean and the standard deviation of the j-th feature of the SM data-set. When standardizing the target data-set, we also use \({\bar{p}}^{\textrm{SM},j}\) and \(\epsilon ^{\textrm{SM},j}\), so that \(x_i^{\textrm{tar},j}=(p_i^{\textrm{tar},j} - {\bar{p}}^{\textrm{SM},j})/\epsilon ^{\textrm{SM},j}\).

  3. Denote \(X^{\textrm{SM}}=((\vec {x}_1^{\textrm{SM}})^T, (\vec {x}_2^{\textrm{SM}})^T, \ldots , (\vec {x}_{N^{\textrm{SM}}}^{\textrm{SM}})^T)\) and \(X^{\textrm{tar}}=((\vec {x}_1^{\textrm{tar}})^T, (\vec {x}_2^{\textrm{tar}})^T, \ldots , (\vec {x}_{N^{\textrm{tar}}}^{\textrm{tar}})^T)\), where \(N^{\textrm{SM,tar}}\) are the total numbers of points in the SM and target data-sets, respectively. Calculate the covariance matrix of the SM as \(C^{\textrm{SM}}=X^{\textrm{SM}}(X^{\textrm{SM}})^T\).

  4. Find the eigenvalues and eigenvectors of the covariance matrix \(C^{\textrm{SM}}\).

  5. Sort the eigenvalues in descending order and select the eigenvectors \(\vec {\eta } _j^{\textrm{SM}}\) corresponding to the \(m'\) (\(m'< m=12\)) largest eigenvalues. As will be explained later, in this paper we use \(m'=4\).

  6. Compute the \(m'\) new features by projecting both \(X^{\textrm{SM}}\) and \(X^{\textrm{tar}}\) onto \(\vec {\eta } _j^{\textrm{SM}}\), i.e. \({\tilde{x}}_i^{\textrm{SM},j}=\vec {\eta } _j^{\textrm{SM}}\cdot \vec {x}_i^{\textrm{SM}}\) and \({\tilde{x}}_i^{\textrm{tar},j}=\vec {\eta } _j^{\textrm{SM}} \cdot \vec {x}_i^{\textrm{tar}}\).

  7. Obtain the distance from a point \(\vec {x}_i\) to an eigenvector as \(d_{i,j} = \sqrt{|\vec {x}_i|^2 -\left( \vec {x}_i\cdot \vec {\eta }_j\right) ^2 }= \sqrt{|\vec {x}_i|^2 -({\tilde{x}}_i^j)^2 }\). The \(d_{i,j}\) for the points in the SM and target data-sets are denoted as \(d_{i,j}^{\textrm{SM}}\) and \(d_{i,j}^{\textrm{tar}}\), respectively.

  8. Set cuts on \(d_{i,j}\) to select events.

It can be seen that the PCAAD event selection strategy requires a data-set of the SM, and is therefore a supervised machine learning algorithm. In this section, we use the data-set of the NP as the target data-set.
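The eight steps above can be sketched end to end as follows. This is an illustrative toy implementation, not the analysis code: the Gaussian "events", the shifted "signal", and the cut threshold are placeholders, and the target data-set is standardized and projected with the SM means, standard deviations, and eigenvectors, as the procedure requires:

```python
import numpy as np

def pcaad_scores(P_sm, P_tar, m_prime=4):
    """Anomaly scores d[i, j]: distance of each target point to the
    j-th SM eigenvector (steps 2-7 of the procedure)."""
    mean, std = P_sm.mean(axis=0), P_sm.std(axis=0)
    X_sm = (P_sm - mean) / std            # step 2: SM standardization
    X_tar = (P_tar - mean) / std          # step 2: reuse SM statistics
    C = X_sm.T @ X_sm / len(X_sm)         # step 3: SM covariance matrix
    lam, eta = np.linalg.eigh(C)          # step 4: eigen-decomposition
    eta = eta[:, np.argsort(lam)[::-1][:m_prime]]  # step 5: top m' axes
    proj = X_tar @ eta                    # step 6: new features
    norms2 = np.sum(X_tar**2, axis=1, keepdims=True)
    return np.sqrt(np.maximum(norms2 - proj**2, 0.0))  # step 7: distances

rng = np.random.default_rng(1)
P_sm = rng.normal(size=(500, 12))         # toy "SM" events, 12 features
P_tar = rng.normal(size=(50, 12)) + 3.0   # toy shifted "signal" events
d = pcaad_scores(P_sm, P_tar)             # shape (50, 4)
selected = (d < 4.8).all(axis=1)          # step 8: cut on the scores
```

Since the toy signal is shifted away from the toy background, its distances to the SM eigenvectors are systematically larger, mirroring the behaviour of \(d_{i,j}^{\textrm{tar}}\) versus \(d_{i,j}^{\textrm{SM}}\) described in the text.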

Taking the case of \(O_{T_0}\) at \(\sqrt{s} = 3\), 10, 14 and \(30\;\textrm{TeV}\) as examples, the standard deviations of \({\tilde{x}}_i^{\textrm{SM},j}\) (denoted as \({\tilde{\epsilon }}^{\textrm{SM},j}\)) are shown in Fig. 2. It can be seen that \({\tilde{\epsilon }}^{\textrm{SM},j}\) with \(1\le j\le 4\) are much larger than the other \({\tilde{\epsilon }}^{\textrm{SM},j}\)'s, which indicates that the background points are mainly distributed along the first 4 eigenvectors. Therefore, in this paper, we use \(m'=4\). We have also verified that the improvements of the expected constraints on the operator coefficients one can achieve with a larger \(m'\) are negligible compared with the case of \(m'=4\). The normalized distributions of \(d_{i,j}^{\textrm{SM}}\) and \(d_{i,j}^{\textrm{tar}}\) are shown in Fig. 3. It can be seen that \(d_{i,j}^{\textrm{tar}}\) are generally larger than \(d_{i,j}^{\textrm{SM}}\), as expected. In this paper, we use \(d_{i,j}^{\textrm{tar}}\) as the anomaly score to discriminate the signal from the background. The means and standard deviations for \(p^{\textrm{SM}}\) used in the z-score standardization are listed in Sect. 1, as are the components of \(\vec {\eta } ^{\textrm{SM}}_{1,2,3,4}\).

Although the physical meanings of \(d_{i,1}\), \(d_{i,2}\), \(d_{i,3}\) and \(d_{i,4}\) need not concern us when searching for NP signals with an AD approach such as PCAAD, there is still an interesting and noteworthy phenomenon. For the SM, the peaks of the \(d_{i,1}\), \(d_{i,2}\), \(d_{i,3}\) and \(d_{i,4}\) distributions are shifted to around 2, and the peak of \(d_{i,2}\) is closer to 0 than the others. Because of the decentralization, the central points are those with high energies and small momenta, which cannot be photon four-momenta, since those should be light-like. This is why \(d_{i,1}^{\textrm{SM}}\) cannot be too close to 0. Meanwhile, \(\vec {\eta } _2^{\textrm{SM}}\) approximately corresponds to the direction with all photon momenta along the \(\textbf{z}\)-axis; the fact that \(d_{i,2}\) is distributed closer to 0 shows that the photons tend to be inclined towards the \(\textbf{z}\)-axis, which can be seen as a consequence of the infrared divergence. At the same time, \(d_{i,2}\) also peaks at a position larger than 0 because of the cuts imposed to avoid the infrared divergence.

Another noteworthy result is that \({\tilde{\epsilon }} ^{\textrm{SM}}_{9,10,11,12}\approx 0\): PCA automatically finds the redundant information. Ignoring the effect of the detector simulation, 4 of the 12 variables are linearly related to the others by \(\sum p_n^{\gamma } = (\sqrt{s},0,0,0)\), where \(p_{n=1,2,3}^{\gamma }\) are the momenta of the photons, and there are 3 other nonlinear relations \((p_n^{\gamma })^2=0\). The numerical results for \(\vec {\eta } ^{\textrm{SM}}_{9,10,11,12}\) at \(\sqrt{s}=3\;\textrm{TeV}\) indicate that,

$$\begin{aligned} \begin{aligned} {\tilde{x}}_i^9&\approx 0.0033\;\textrm{GeV}^{-1}\left( (E_1-1.452\times 10^3\;\textrm{GeV})\right. \\&\quad \left. +(E_2-1.313\times 10^3\;\textrm{GeV})+(E_3-2.351\times 10^2\;\textrm{GeV} )\right) \\ {\tilde{x}}_i^{10}&\approx 10^{-3}\;\textrm{GeV}^{-1}\left( 0.67p_1^x-0.98p_1^y\right. \\&\quad \left. +0.67p_2^x-0.98p_2^y+0.67p_3^x-0.98p_3^y\right) \\ {\tilde{x}}_i^{11}&\approx 10^{-3}\;\textrm{GeV}^{-1}\left( 0.98p_1^x+0.67p_1^y\right. \\&\quad \left. +0.98p_2^x+0.67p_2^y+0.98p_3^x+0.67p_3^y\right) \\ {\tilde{x}}_i^{12}&\approx 0.00062\;\textrm{GeV}^{-1}\left( p_1^z+p_2^z+p_3^z\right) \\ \end{aligned} \end{aligned}$$
(3)

where \(E_{1,2,3}\) are the energies of the three photons, and \(p_{1,2,3}^{x,y,z}\) are the components of \(p_n^{\gamma }\). \({\tilde{\epsilon }} ^{\textrm{SM}}_{9,10,11,12}\approx 0\) indicates that \({\tilde{x}}_{9,10,11,12}^{\textrm{SM}}\) are almost constant, which corresponds to \(\sum p_n^{\gamma } = (\sqrt{s},0,0,0)\). However, to cleanly remove the nonlinear redundant variables, nonlinear PCA must be used [97].
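The observation that linearly redundant variables produce vanishing eigenvalues can be checked on toy data: if the features obey an exact linear constraint, the covariance matrix of the standardized data is singular (a toy illustration with made-up numbers, not the paper's events):

```python
import numpy as np

# three features obeying a linear constraint: they sum to a constant
rng = np.random.default_rng(2)
a = rng.normal(size=(1000, 3))
a[:, 2] = 10.0 - a[:, 0] - a[:, 1]

# z-score standardize and diagonalize the covariance matrix
mean, std = a.mean(axis=0), a.std(axis=0)
X = (a - mean) / std
lam = np.sort(np.linalg.eigvalsh(X.T @ X / len(X)))
# the smallest eigenvalue vanishes: one standardized direction carries
# no variance, exactly as PCA flags the linearly redundant variable
```

A nonlinear constraint such as \((p_n^{\gamma })^2=0\), by contrast, does not produce an exactly zero eigenvalue in this linear analysis, which is why nonlinear PCA would be needed to remove those variables cleanly.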

Table 2 The ranges of operator coefficients used in the scanning
Fig. 4

The cross-sections after cuts and the fitted cross-sections at \(\sqrt{s}\) = 3 TeV (row 1), 10 TeV (row 2), 14 TeV (row 3), and 30 TeV (row 4)

4 Constraints on the coefficients

When no NP signal is found, the PCAAD event selection strategy can also be used to constrain the coefficients of the NP operators. To this end, we generate events with the coefficients in Table 2. In this section, the target data-sets consist of events generated with the SM, the NP, and the interference between the SM and the NP included. In Ref. [78], \(p_{T,\gamma } > 0.12 E_{\textrm{beam}}\) is used as part of the event selection strategy, where \(E_{\textrm{beam}}\) is the beam energy. To avoid dealing with too many events, we require \(p_{T,\gamma } > 0.1 E_{\textrm{beam}}\) as a standard cut when generating events, while the other standard cuts are the same as those in Eq. (2).

Fig. 5

\(N_s/N_{\textrm{bg}}\) in the conservative cases for \(O_{T_0}\) as functions of d, with \(f_{T_0}\) as the upper bounds of Table 2. The top-left panel corresponds to \(\sqrt{s}=3\;\textrm{TeV}\), the top-right panel corresponds to \(\sqrt{s}=10\;\textrm{TeV}\), the bottom-left panel corresponds to \(\sqrt{s}=14\;\textrm{TeV}\), and the bottom-right panel corresponds to \(\sqrt{s}=30\;\textrm{TeV}\)

Fig. 6

Same as Fig. 5 but for \({\mathcal {S}}_{\textrm{stat}}\)

Table 3 Expected constraints on the operator coefficients at \(\sqrt{s}\) = 3 TeV, 10 TeV, 14 TeV, and 30 TeV in the conservative case
Table 4 Expected constraints on the operator coefficients at \(\sqrt{s}\) = 14 TeV, and 30 TeV in the optimistic case

The cross-section can be expressed as a parabola in \(f_{T_{i}}/\Lambda ^{4}\): \(\sigma =\sigma _{\textrm{SM}}+ \sigma _{\textrm{int}} f_{T_{i}}/\Lambda ^{4}+ \sigma _{\textrm{NP}} \left( f_{T_{i}}/\Lambda ^{4}\right) ^2\), where \(\sigma _{\textrm{SM}}\) denotes the contribution of the SM, \(\sigma _{\textrm{int}} f_{T_{i}}/\Lambda ^{4}\) denotes the interference between the SM and the aQGCs, and \(\sigma _{\textrm{NP}} \left( f_{T_{i}}/\Lambda ^{4}\right) ^2\) is the contribution induced by the aQGCs. After obtaining the cross-section after cuts for each coefficient, we fit the cross-sections to this parabola using the least squares method. For simplicity we use the same criterion on \(d_{i,1\le j \le 4}\) for each \(\sqrt{s}\) (denoted as d). Taking \(d_{i,1\le j \le 4}<4.2\) at \(\sqrt{s}=3\;\textrm{TeV}\) and \(d_{i,1\le j \le 4}<4.8\) at \(\sqrt{s}=10\), 14, and \(30\;\textrm{TeV}\) as examples, the cross-sections after the event selection strategy and the fitted cross-sections are shown in Fig. 4. It can be seen that the cross-sections still fit the parabolas well after the cuts.
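The least-squares fit of the post-cut cross-sections to the parabola \(\sigma =\sigma _{\textrm{SM}}+ \sigma _{\textrm{int}} f+ \sigma _{\textrm{NP}} f^2\) amounts to a degree-2 polynomial fit (a sketch with made-up scan values, not the paper's results):

```python
import numpy as np

# toy scan: coefficients f = f_T/Lambda^4 and "measured" cross-sections
f = np.linspace(-2.0, 2.0, 9)
sigma_sm_true, sigma_int_true, sigma_np_true = 0.5, 0.1, 2.0
sigma = sigma_sm_true + sigma_int_true * f + sigma_np_true * f**2

# least-squares parabola fit; np.polyfit returns coefficients from
# the highest power down: [sigma_NP, sigma_int, sigma_SM]
sigma_np_fit, sigma_int_fit, sigma_sm_fit = np.polyfit(f, sigma, 2)
```

In practice the input cross-sections come from the MC scan after the PCAAD cuts, so the fit residuals reflect how well the post-cut cross-sections still follow the parabolic form.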

In order to investigate the effect of different criteria, the PCAAD event selection strategy for \(O_{T_0}\) at \(\sqrt{s}=3\), 10, 14, and \(30\;\textrm{TeV}\) is studied. The cross-sections after cuts corresponding to different d are fitted. The expected constraints on the coefficients are estimated with the fitted cross-sections and the statistical sensitivity defined as \({\mathcal {S}}_{\textrm{stat}}=\sqrt{2 \left[ (N_{\textrm{bg}}+N_{s}) \ln (1+N_{s}/N_{\textrm{bg}})-N_{s}\right] }\) [98, 99], where \(N_s=(\sigma -\sigma _{\textrm{SM}})L\), \(N_{\textrm{bg}}=\sigma _{\textrm{SM}}L\), and L is the luminosity. The luminosities corresponding to the conservative case are 1, 10, 10 and \(10\;\textrm{ab}^{-1}\) for \(\sqrt{s}=3\), 10, 14, and \(30\;\textrm{TeV}\), respectively [86]. In the conservative case, \(N_s/N_{\textrm{bg}}\) and \({\mathcal {S}}_{\textrm{stat}}\) are calculated using the fitted cross-sections with the coefficients \(f_{T_0}\) set to the upper bounds in Table 2; they are shown in Figs. 5 and 6, respectively. It can be seen that \({\mathcal {S}}_{\textrm{stat}}\) peaks at \(d=4.2\) for \(\sqrt{s}=3\;\textrm{TeV}\) and at \(d=4.8\) for \(\sqrt{s}=10\), 14, and \(30\;\textrm{TeV}\). As a consequence, these values of d are chosen as the criteria for the PCAAD event selection strategy.

In this paper, the luminosities corresponding to both the conservative and optimistic cases are considered [86], and the expected constraints are listed in Tables 3 and 4, respectively. The energies and luminosities are the same as those used in Ref. [78]; compared to the traditional event selection strategy used there, the constraints are of the same order of magnitude but generally strengthened, especially the lower bounds.

5 Summary

Searching for NP signals at the LHC and future colliders requires processing a large amount of data, and quantum computers have great potential for such tasks. In this paper, we propose a PCAAD event selection strategy, based on the PCA algorithm, to search for NP signals; the PCA algorithm can be accelerated by quantum computers.

The PCAAD is an automatic event selection strategy that does not require prior knowledge of the physics content of the NP. Since both the aQGCs and the muon colliders are of interest to the HEP community, we use the tri-photon process at muon colliders as an example. We have shown that PCAAD is useful and efficient in searching for NP signals. The expected constraints on the operator coefficients relevant to the aQGCs are generally tighter than those obtained with a traditional event selection strategy.