1 Introduction

Quantum machine learning (QML) is a promising candidate for real-world applications of quantum technology (Biamonte et al. 2017). In recent years, a multitude of different techniques have been developed with the aim of using quantum computers to perform machine learning tasks (Zeguendry et al. 2023; Sajjan et al. 2022) and QML techniques have been applied to a wide range of fields, including particle physics (Heredge et al. 2021; Tüysüz et al. 2021), medical data (Pregnolato and Zizzi 2023; Azevedo et al. 2022), aerodynamics (Yuan et al. 2022) and natural language processing (Meichanetzidis et al. 2023). In many QML techniques, classical data is encoded into an exponentially larger quantum space where there is the possibility that the data may be separated more easily (Havlíček et al. 2019). This is a proposed source of quantum advantage over classical routines in the case that the quantum circuit performing the encoding cannot be efficiently simulated classically (Liu et al. 2021). When trying to find suitable QML techniques for real-world data, it is important to use an advantageous encoding method for that data. A current active area of research is the search for methods to encode or represent different types of data in quantum devices, for example, finding techniques to represent two-dimensional images (Lisnichenko and Protasov 2022; Anand et al. 2022; West et al. 2022).

Many QML techniques, such as the quantum support vector machine (QSVM) (Havlíček et al. 2019), are motivated by the idea that encoding classical data into a higher dimensional quantum Hilbert space can simplify data classification. The circuit architecture used to encode classical data into a quantum state influences the class of functions that a QML algorithm can learn (Schuld et al. 2021). Therefore, it is critical to find an optimal quantum encoding for the data in any QML method. However, some QML techniques do not generalise well as the number of qubits, and hence the dimensionality of the Hilbert space, increases (Huang et al. 2021). To improve generalisation, attempts have been made to reduce the expressivity of QML methods by introducing some form of inductive bias into the method. These attempts include projected kernels, where only a limited number of qubits are measured, thus projecting to a lower-dimensional space (Kübler et al. 2021); introducing inductive bias through the tuning of quantum kernel hyperparameters (Shaydulin and Wild 2022); and variational state-based approaches that are capable of encoding inductive biases directly into quantum states, which have been shown to improve generalisation in the context of learning zero-sum games (Bowles et al. 2023). In this work, we present a method of introducing an inductive bias into a quantum encoding when the underlying data exhibits a permutation symmetry. By projecting our quantum-encoded state onto a symmetric subspace, this method exponentially reduces the encoding’s dimensionality, leading to improved generalisation in our experiments.

Fig. 1

(a) Example point cloud generated using the Point-E demo by OpenAI using the prompt “Grand Piano” (Nichol et al. 2022). Distinguishing between different objects could be a possible classification task that uses point cloud data. (b) Demonstration of point permutation symmetry in the input for a point cloud. Changing the order of points in a point cloud does not have an effect on the point cloud itself. However, when stored as a classical input array in computer memory, exchanging the point order produces a different array. Unless it has been purposely constructed otherwise, as is the case for PointNet (Qi et al. 2016), a general machine learning classifier function, denoted by f, may produce a different classification output given a different permutation of the point order in its input: \(f(x_1, y_1, z_1, x_2, y_2, z_2) \ne f(x_2, y_2, z_2, x_1, y_1, z_1)\)

In this study, we consider point cloud data types, which are three-dimensional images that consist of a set of three-dimensional points. The point cloud may represent various objects (e.g. piano, car, tree) that need to be classified. This could be in the context of identifying pedestrians in self-driving vehicles (Chen et al. 2021) or classifying different particle decay events in a high-energy physics experiment (Mikuni and Canelli 2021). Point clouds are a natural data type to study when investigating the effects of permutation invariant machine learning methods. While they may also occasionally exhibit internal data-specific symmetries such as rotation or translation symmetry, we focus here on their point order permutation symmetry when they are input into a model and present a method of encoding this symmetry into a quantum state. Permutation symmetry is a property point cloud data possesses that is not normally captured in a classical input vector. There is no inherent ordering to the points in a point cloud. Therefore, if an order is assigned to the points in a point cloud, then it should be invariant under any permutation of these point labels. Classical computers are generally forced to assign an order to the points \({\textbf {p}}_i\) when they are input, since the data must be stored in an array in memory that has an intrinsic ordering of its elements. Consider the input array \([{\textbf {p}}_1, {\textbf {p}}_2]\); exchanging the two points in this array will produce a different array \([{\textbf {p}}_2, {\textbf {p}}_1]\), which may give a different result when input into a given algorithm. In general, a machine learning classifier model \(f([{\textbf {p}}_1, {\textbf {p}}_2])\), without purposeful construction, will return a different answer if given a different permutation of the same points in the input, \(f([{\textbf {p}}_1, {\textbf {p}}_2]) \ne f([{\textbf {p}}_2, {\textbf {p}}_1])\), while in reality the point cloud would be physically unchanged by this reordering. By creating an encoding that is invariant to the permutation of point ordering, we can exponentially reduce the effective dimensionality of the encoded states in a manner that respects an underlying symmetry of the data. An example point cloud is shown in Fig. 1, demonstrating the permutation invariance of points in the input.

There are other machine learning methods that aim to respect this permutation symmetry by building symmetric functions into the model, such as the max pool function in the classical PointNet (Qi et al. 2016) and the proposed quantum extension of PointNet (Shi et al. 2020). Similarly, techniques in geometric quantum machine learning have shown it is possible to construct a variational circuit that respects qubit permutation (and hence point permutation if one were to encode a single point per qubit) (Meyer et al. 2023; Nguyen et al. 2022; Schatzki et al. 2022; Kazi et al. 2023). The technique we discuss in this work differs in that we focus on implementing these symmetries into the encoding circuit, meaning that the classification part of the algorithm is free to take any form. Recent work suggests that permutation invariant operators, such as permutation equivariant variational circuits, may be classically tractable to simulate under certain conditions (Anschuetz et al. 2023). In our case, the classification part does not necessarily need to be permutation equivariant, as permutation invariance is captured in the encoding step. More broadly, the recent advancements in identifying the prerequisites for the emergence of barren plateaus (Fontana et al. 2023; Ragone et al. 2023) have prompted inquiries into the classical simulatability of variational quantum circuits devoid of such barren plateaus (Cerezo et al. 2023). In this work, we introduce techniques to include symmetry in the encoding step of the circuit, meaning the variational trainability/simulatability issue does not need to be considered here as we can avoid using a quantum variational classifier with this technique.

2 Permutation invariant encoding

In this section, we theoretically outline the structure and properties of a permutation invariant quantum state within the context of point cloud data. Point clouds were chosen as a natural use case of this encoding; however, it could be applied to any data that exhibits permutation symmetry. We consider a point cloud data input, denoted as X, to be an array of values of the form \(X = [x_1, y_1, z_1, x_2, y_2, z_2,...,x_n, y_n, z_n]\). Each point cloud therefore consists of n points, where a single point in the point cloud can be denoted as \({\textbf {p}}_i = [x_i, y_i, z_i]\). Each point is first encoded into a quantum state \(|{\textbf {p}}_i\rangle \) using a quantum circuit U, consisting of k qubits, that maps three-dimensional classical data \({\textbf {p}}_i\) to a \(2^k\) dimensional quantum state, \(U: \textbf{R}^3 \xrightarrow {} \textbf{C}^{2^k}\). We apply this circuit to an initial \(|0\rangle ^{\otimes k}\) state such that \(|{\textbf {p}}_i\rangle = U({\textbf {p}}_i) |0\rangle ^{\otimes k}\). While in our experiments U was implemented using angle encoding, this general technique could be used with any encoding strategy. To enforce the point-exchange invariance, we construct a state that is in a symmetric superposition of all \(|{\textbf {p}}_i\rangle \) permutation states. For a point cloud X with only two points, \(n = 2\), this can be represented as

$$\begin{aligned} |X_s\rangle = \mathcal {N}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ), \end{aligned}$$
(1)

which is invariant under permutation of the order of the two points. Regarding the normalisation constant \(\mathcal {N}\), it should be noted that, depending on the data and the encoding method U, the initial states \(|{\textbf {p}}_1\rangle \) and \(|{\textbf {p}}_2\rangle \) may or may not be orthogonal. Hence, the normalisation constant \(\mathcal {N}\) for the two-point case, in general, is

$$\begin{aligned} \mathcal {N} = \frac{1}{\sqrt{2(1 + |\langle {\textbf {p}}_1 |{\textbf {p}}_2 \rangle |^2)}}. \end{aligned}$$
(2)
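The construction in Eqs. 1 and 2 can be illustrated with a short numerical sketch (Python with Qiskit, which we use for our simulations); the simple RY angle encoding used for U below is only an illustrative choice, not the exact circuit of our experiments.

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def encode_point(p, k=3):
    """Illustrative angle-encoding choice of U mapping a 3D point to a k-qubit state."""
    qc = QuantumCircuit(k)
    for q in range(k):
        qc.ry(p[q % 3], q)  # each coordinate sets a rotation angle
    return Statevector.from_instruction(qc).data

def symmetrise_two(s1, s2):
    """Return N(|p1>|p2> + |p2>|p1>) with N as given in Eq. 2."""
    raw = np.kron(s1, s2) + np.kron(s2, s1)
    overlap = np.vdot(s1, s2)
    norm_const = 1.0 / np.sqrt(2.0 * (1.0 + np.abs(overlap) ** 2))
    return norm_const * raw

s1 = encode_point([0.3, -0.7, 1.1])
s2 = encode_point([-1.2, 0.4, 0.9])
X_s = symmetrise_two(s1, s2)
assert np.isclose(np.linalg.norm(X_s), 1.0)  # N from Eq. 2 normalises the state
```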

The point order invariant encoded state \(|X_s\rangle \) can then be evaluated using techniques such as QSVM or passed to a variational method. As the input quantum state is now in a permutation invariant state, the quantum classification algorithm \(g(|X_s\rangle )\) is free to have any design and we are guaranteed invariance under point order permutation as \(g\Big ( \mathcal {N}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ) \Big ) = g\Big ( \mathcal {N}(|{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle + |{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle ) \Big )\) regardless of the structure of the quantum classification function g. This contrasts with a general machine learning classifier \(f([{\textbf {p}}_1, {\textbf {p}}_2])\), accepting its input as a classical array, that may give different results depending on the order of points in the input such that \(f([{\textbf {p}}_1, {\textbf {p}}_2]) \ne f([{\textbf {p}}_2, {\textbf {p}}_1])\). In the case of n points, we wish to construct the symmetric state defined by

$$\begin{aligned} |X_s\rangle = \mathcal {N}_n \sum _{\sigma \in S_n}|{\textbf {p}}_{\sigma _1}\rangle |{\textbf {p}}_{\sigma _2}\rangle ...|{\textbf {p}}_{\sigma _n}\rangle , \end{aligned}$$
(3)

where we sum over all permutations in the symmetric group \(S_n\). This symmetric state will be identical under any permutation of points.

By combining the quantum states into a symmetric superposition state, we utilise the inherent quantum property of state superposition to embed an underlying symmetry of the data structure into the encoding. This symmetry exploitation allows for a reduction in the expressivity of the encoding by exponentially reducing the effective dimensionality of the state. This can be demonstrated by considering a three-qubit state. In general, a three-qubit state is \(2^3 = 8\) dimensional and can be written as

$$\begin{aligned} \begin{aligned} |\psi \rangle =&\,\alpha _0 |0\rangle |0\rangle |0\rangle \\&+\alpha _1 |1\rangle |0\rangle |0\rangle + \alpha _2 |0\rangle |1\rangle |0\rangle + \alpha _3 |0\rangle |0\rangle |1\rangle \\&+\alpha _4 |1\rangle |1\rangle |0\rangle + \alpha _5 |1\rangle |0\rangle |1\rangle + \alpha _6 |0\rangle |1\rangle |1\rangle \\&+\alpha _7 |1\rangle |1\rangle |1\rangle . \end{aligned} \end{aligned}$$
(4)

If we now insist that the state \(|\psi \rangle \) is fully symmetric with respect to its qubits, then by exchanging qubits in Eq. 4 and ensuring that the state remains unchanged under this action, it can be seen that \(\alpha _1 = \alpha _2 = \alpha _3\) and \(\alpha _4 = \alpha _5 = \alpha _6\) (Barenco et al. 1996). Hence, a three-qubit symmetric quantum state is effectively 4 dimensional with the following basis states

$$\begin{aligned} \begin{aligned} |\psi \rangle =&\,\beta _0 |0\rangle |0\rangle |0\rangle \\&+\beta _1 (|1\rangle |0\rangle |0\rangle + |0\rangle |1\rangle |0\rangle + |0\rangle |0\rangle |1\rangle ) \\&+\beta _2 (|1\rangle |1\rangle |0\rangle + |1\rangle |0\rangle |1\rangle + |0\rangle |1\rangle |1\rangle ) \\&+\beta _3 |1\rangle |1\rangle |1\rangle . \end{aligned} \end{aligned}$$
(5)

It follows that an n qubit system that is permutation invariant with respect to its qubits has dimension \(n+1\), which is exponentially smaller than \(2^n\) (Barenco et al. 1996). In the general case, where each initial state \({\textbf {p}}_i\) has k qubits, it has been shown that the dimension of this symmetric state is

$$\begin{aligned} \binom{n+2^k-1}{2^k - 1} = \frac{1}{(2^k-1)!}n^{2^k - 1} + O(n^{2^k - 2}), \end{aligned}$$
(6)

which exhibits polynomial scaling in n, in contrast to the general case where the dimension is \((2^k)^n\) and scales exponentially (Barenco et al. 1996). Note that, if the data requires it, such as in cases of underfitting the training data, we retain the ability to reintroduce some expressivity through breaking the symmetry or by increasing the number of qubits k used per point.
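A short numerical comparison of Eq. 6 with the unrestricted dimension \((2^k)^n\) (a sketch using only Python's standard library):

```python
from math import comb

def symmetric_dim(n, k):
    """Dimension of the permutation-symmetric subspace of n registers of k qubits (Eq. 6)."""
    d = 2 ** k
    return comb(n + d - 1, d - 1)

def full_dim(n, k):
    """Dimension of the full Hilbert space of n registers of k qubits."""
    return (2 ** k) ** n

for n in (2, 5, 10, 20):
    print(n, symmetric_dim(n, k=1), full_dim(n, k=1))
# For k = 1 the symmetric dimension is n + 1, versus 2**n for the full space.
```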

For the theoretical results in this work, we used Qiskit \(\textit{statevector\_simulator}\) (Abraham et al. 2019) along with an analytical symmetrisation process as described in Algorithm 1 that mathematically constructs the permutation invariant quantum states. A discussion around possible implementations of this procedure on a real quantum machine is the focus of Sect. 4.

Algorithm 1

Encode point cloud data with point permutation invariance using the Qiskit \(\textit{statevector\_simulator}\) and analytical symmetrisation. This algorithm demonstrates the mathematical structure of the encoding intuitively, at the cost of being computationally inefficient, requiring \(\mathcal {O}(n!)\) classical operations.
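Since the algorithm listing itself is not reproduced here, the following sketch shows the core of the analytical symmetrisation step: given the individual point statevectors (obtained, for example, as in the earlier sketch in this section), it builds \(|X_s\rangle \) of Eq. 3 by brute force over all \(n!\) orderings.

```python
import itertools
from functools import reduce
import numpy as np

def symmetrise(point_states):
    """Brute-force construction of |X_s> (Eq. 3) from a list of point statevectors.

    Sums the tensor product over all n! orderings of the points and renormalises,
    absorbing the constant N_n. Intended only for small n: the cost is O(n!).
    """
    total = None
    for perm in itertools.permutations(point_states):
        term = reduce(np.kron, perm)  # |p_sigma(1)> |p_sigma(2)> ... |p_sigma(n)>
        total = term if total is None else total + term
    return total / np.linalg.norm(total)
```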

3 Methodology

We compare the performance of various quantum and classical machine learning techniques when applied to the classification of two different point cloud data distributions: a sphere and a torus. Numerical results were found using Qiskit \(\textit{statevector\_simulator}\) (Abraham et al. 2019) for the initial point encodings, followed by standard operations to construct the symmetrised states and kernel entry matrices, as will be further detailed in this section.

3.1 Dataset specifications

To generate the data, we randomly sample n points from the surface of each shape to form a point cloud for each distribution. This sampling process is repeated until there are N point cloud samples in total; the resulting data is divided into training and testing sets containing 80% and 20% of the data, respectively. The performance of the various models is then evaluated on the testing set. The entire process is then repeated with new randomly generated datasets, and we record the average test accuracy over 10 repeated experiments.

The sphere and torus distributions were both centred at the origin. To ensure that the sphere and torus distributions are as similar as possible, the torus was scaled such that the average magnitude of the points that lie on the torus surface matches the radius of the sphere. An illustration of the two distributions and an example point cloud sample is shown in Fig. 2. All data was normalised to the range \(-\frac{\pi }{2}\) to \(\frac{\pi }{2}\) to allow it to be encoded as rotation angles.
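A sketch of the sampling and normalisation described above; the torus radii and its rescaling to match the sphere are not restated here, so the values below are placeholders, and the parametric torus sampling shown is only approximately uniform over the surface.

```python
import numpy as np

def sample_sphere(n, radius=1.0):
    """Sample n points uniformly on the surface of a sphere."""
    v = np.random.normal(size=(n, 3))
    return radius * v / np.linalg.norm(v, axis=1, keepdims=True)

def sample_torus(n, R=1.0, r=0.4):
    """Sample n points on a torus surface (R, r are placeholder radii)."""
    u, v = np.random.uniform(0.0, 2.0 * np.pi, size=(2, n))
    x = (R + r * np.cos(v)) * np.cos(u)
    y = (R + r * np.cos(v)) * np.sin(u)
    z = r * np.sin(v)
    return np.stack([x, y, z], axis=1)

def normalise_to_angles(points):
    """Rescale coordinates into the range [-pi/2, pi/2] for use as rotation angles."""
    return (np.pi / 2.0) * points / np.abs(points).max()
```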

Fig. 2

(Left) Sphere distribution and (right) torus distribution surfaces from which points are sampled to form point clouds. The red points in each figure represent an example point cloud with \(n = 5\) points drawn from the corresponding distribution. Each dataset is generated by randomly sampling these distributions to create a set of N different point clouds

3.2 Encoding process for individual points

We tested a variety of quantum and classical techniques, summarised in Table 1, where some of the algorithms contain permutation invariance and others do not. In order to focus entirely on the encoding method, without having to consider the structure of a variational component, we used a QSVM to classify the data with various choices of encoding circuits. For the order permutation invariant encoding, we tested several different point encoding circuits U for encoding the individual points, showing results for the best-performing circuit denoted \(U_{\alpha }\) alongside a more generic point encoding circuit that uses the instantaneous quantum polynomial (IQP) encoding (Havlíček et al. 2019), denoted by \(U_{\beta }\). We provide a brief description of the different methods reported:

  • Permutation Invariant QSVM (Best) — a QSVM with the permutation invariant encoding method using the point encoding circuit \(U_{\alpha }\) that provided the highest accuracy, as shown in Fig. 3.

  • Permutation Invariant IQP QSVM — a QSVM with the permutation invariant encoding method where the point encoding circuit \(U_{\beta }\) is an IQP encoding, as shown in Fig. 4. This provides a fairer comparison between the regular IQP encoding and the invariant encoding.

  • IQP Encoding QSVM — a QSVM using the IQP encoding applied to all variables in the input (without permutation invariance) as described by Havlíček et al. (2019).

  • PointNet — classical point cloud classifier algorithm utilising neural networks with a symmetric max pool function to ensure point order permutation invariance (Qi et al. 2016). PointNet was run over 100 training epochs.

  • RBF Kernel SVM — classical SVM using the radial basis function (RBF) kernel. Hyperparameters were optimised using grid search over a cross-validation set.
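For reference, the classical RBF kernel baseline can be reproduced with standard tooling; a minimal sketch with toy stand-in data (the hyperparameter grid shown is illustrative rather than the exact grid used in our experiments):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split

# Toy stand-in data: each row is a flattened point cloud (5 points x 3 coordinates),
# with labels 0 (sphere) or 1 (torus). Replace with the sampled datasets of Sect. 3.1.
X = np.random.uniform(-np.pi / 2, np.pi / 2, size=(500, 15))
y = np.random.randint(0, 2, size=500)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

param_grid = {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]}
rbf_svm = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
rbf_svm.fit(X_train, y_train)
print("test accuracy:", rbf_svm.score(X_test, y_test))
```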

Fig. 3

a Pink boxes represent parameterised Z rotation gates. b Single layer of the best-performing point encoding circuit \(U_{\alpha }\) found for the sphere/torus dataset during our investigation, which is used for results titled Permutation Invariant QSVM (Best)

Table 1 Summary of the various algorithms tested indicating whether they contain permutation invariance in their design and whether they are quantum or classical approaches

Our results were obtained from noiseless simulations using the Qiskit \(\textit{statevector\_simulator}\) (Abraham et al. 2019). This allows us to obtain a quantum state \(|{\textbf {p}}_i\rangle \) for each point in the point cloud. Notably, as we only simulate a single point at a time and each point uses k qubits, the scaling of this simulation step is at worst \(\mathcal {O}(n 2^k)\), still maintaining linear scaling in the number of points n. A fully quantum implementation would scale as \(\mathcal {O}(nk)\) in the number of qubits used.

Fig. 4

The point encoding circuit \(U_{\beta }\) that uses the IQP encoding (Havlíček et al. 2019), which corresponds to results titled Permutation Invariant IQP QSVM. The entanglement function was defined as \(f(x,y)=\frac{1}{\pi }(\pi -x)(\pi -y)\)
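One way to realise an IQP-style point encoding such as \(U_{\beta }\) is via Qiskit's ZZFeatureMap with a custom data map matching the entanglement function in the caption; this is a sketch, and the circuit used in our experiments may differ in the number of repetitions and gate ordering.

```python
import numpy as np
from qiskit.circuit.library import ZZFeatureMap

def iqp_data_map(x):
    """phi(x) = x for single-qubit terms; f(x, y) = (1/pi)(pi - x)(pi - y) for pairs."""
    if len(x) == 1:
        return float(x[0])
    return float((1.0 / np.pi) * (np.pi - x[0]) * (np.pi - x[1]))

# Three features per point (x, y, z); one repetition of the IQP-style layer.
u_beta = ZZFeatureMap(feature_dimension=3, reps=1, data_map_func=iqp_data_map)
point_circuit = u_beta.assign_parameters([0.3, -0.7, 1.1])  # bind one point's coordinates
```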

3.3 Symmetric state preparation

While a quantum circuit capable of probabilistically preparing symmetric superposition states is shown in Sect. 4, for relatively small-scale classical simulations, a brute force method can be used to calculate all the permutations of the quantum states \(|{\textbf {p}}_i\rangle \), sum them together, and then normalise the resulting state to find the symmetrised state

$$\begin{aligned} |X_s\rangle = \mathcal {N}_n \sum _{\sigma \in S_n}|{\textbf {p}}_{\sigma _1}\rangle |{\textbf {p}}_{\sigma _2}\rangle ...|{\textbf {p}}_{\sigma _n}\rangle . \end{aligned}$$
(7)

This brute force classical technique scales as \(\mathcal {O}(n!)\), which would quickly become infeasible for large n. The fully quantum implementation discussed in Sect. 4 utilises \(kn + \frac{1}{2}kn(kn-1)\) qubits and hence would exhibit scaling of \(\mathcal {O}(k^2n^2)\) which is more feasible for large n than classical alternatives. An extra consideration in a real quantum device is that the state is only prepared with a certain success probability; hence, the true scaling of the quantum model may be worse than \(\mathcal {O}(k^2n^2)\). However, our empirical investigation in Appendix F suggests the additional scaling would not be worse than polynomial.

3.4 Quantum support vector machine integration

After preparing a symmetrised quantum state \(|X_s\rangle \), the point cloud X has successfully been encoded in a manner which is invariant to permutations of the point orderings. This is a key novel proposal of this work, as it subsequently allows any classification algorithm that utilises this symmetric state as an input to remain permutation invariant. This differs from many other approaches in the literature, which often rely on utilising a permutation equivariant encoding followed by a permutation equivariant variational ansatz to guarantee label invariance in the overall model (Meyer et al. 2023; West et al. 2024).

Fig. 5

The average accuracy over 10 repeated experiments as the number of points in each point cloud increases. Shaded regions indicate the error bounds on the average accuracy. Each experiment consists of a random dataset sample of 500 point clouds. Each point cloud is generated by randomly sampling a number of points from either the torus or the sphere distribution. The training and testing data contained 80% and 20% of the total data respectively

As we have implemented permutation invariance into the quantum encoding step itself, we are able to utilise any classification technique and retain permutation invariance. For this study, we chose to use a quantum support vector machine (QSVM) to perform the classification. A support vector machine (SVM) works by finding a hyperplane that maximally separates data x that has been encoded into some higher dimensional space as \(\phi (x)\). A crucial feature of an SVM is that the explicit form of \(\phi (x_i)\) need not be known, only the inner product between them for all data points, \(K_{i,j} = \phi (x_i)^T \phi (x_j)\), with these entries forming a matrix called the kernel matrix. It has been proposed that QSVMs can perform classification by using the overlap of quantum states as the kernel entries (Havlíček et al. 2019). In this case, the kernel entries are given by \(K_{i,j} = |\langle \psi (X_i) |\psi (X_j) \rangle |^2\). Once we have prepared the symmetric quantum states \(|X_i\rangle \) for each data point \(X_i\), we can calculate the kernel entries by using a swap test, or otherwise, to evaluate \(K_{i,j} = |\langle X_i |X_j \rangle |^2\) (Havlíček et al. 2019). In the case of classical simulations, we can directly calculate the inner product between the vector representations of the quantum states, which in general would scale as \(\mathcal {O}(2^n)\).
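A sketch of this kernel construction from classically simulated statevectors, passed to a standard SVM with a precomputed kernel (assuming the symmetrised statevectors \(|X_s\rangle \) have already been prepared, e.g. with the symmetrisation sketch in Sect. 2):

```python
import numpy as np
from sklearn.svm import SVC

def quantum_kernel(states_a, states_b):
    """Kernel entries K_ij = |<X_i|X_j>|^2 between two lists of statevectors."""
    K = np.zeros((len(states_a), len(states_b)))
    for i, a in enumerate(states_a):
        for j, b in enumerate(states_b):
            K[i, j] = np.abs(np.vdot(a, b)) ** 2
    return K

# train_states / test_states: symmetrised statevectors |X_s>; y_train / y_test: labels.
# K_train = quantum_kernel(train_states, train_states)
# K_test = quantum_kernel(test_states, train_states)
# clf = SVC(kernel="precomputed").fit(K_train, y_train)
# test_accuracy = clf.score(K_test, y_test)
```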

3.5 Results

Figure 5 displays how the various techniques scale as the number of points in the point cloud increases. Although more points provide more information, the non-invariant IQP QSVM’s performance decreases as the number of points increases. This finding is consistent with previous research indicating that generic QSVM methods may struggle to generalise as the number of qubits increases (Huang et al. 2021). In contrast, the permutation invariant IQP encoding exhibits an improvement in accuracy as the number of points increases. This result indicates that the symmetrisation technique may help prevent poor scaling due to the reduced expressivity in the encoding. Additionally, our tests reveal that using the best encoding circuit \(U_{\alpha }\) produces a classifier that can outperform the classical PointNet algorithm for this dataset. This is further demonstrated in the results depicted in Table 2 which shows that for small datasets, the permutation invariant quantum encoding outperforms both non-invariant quantum/classical methods and the permutation invariant classical method PointNet.

Table 2 The average accuracy over 10 repeated experiments
Fig. 6

The average accuracy over 10 repeated experiments as the number of samples in the training and testing dataset increases. Shaded regions indicate the error bounds on the average accuracy. Each experiment consists of a random dataset sample of point clouds. Each point cloud is generated by randomly sampling 3 points from either the torus or the sphere distribution. The training and testing data contained 80% and 20% of the total data respectively

The effect of increasing the size of the data sample is shown in Fig. 6. In this case, more data is available, but the number of points, and thus qubits, is fixed. All algorithms tested generally improve with more data samples, as expected. Comparing again to Fig. 5 shows that the IQP encoding, while generally improving with more data samples, specifically struggles as the number of qubits increases. This problem is not apparent with the permutation invariant encodings.

For larger data samples, PointNet starts to approach the accuracy of the permutation invariant QSVM encoding. It is worth noting that PointNet uses deep neural networks consisting of a total of 3.5 million parameters that can be better utilised with a large amount of training data. There are also additional aspects to the PointNet algorithm that tackle geometric symmetries, such as rotational invariance, in point cloud data that have not been implemented in this quantum approach. Implementing these geometric symmetries into the encoding could be a subject of further investigation.

3.6 Quantum errors

If, during the encoding step, the points are subject to a source of quantum noise, then it is expected that the effectiveness of the overall classification will decrease. In order to investigate this effect, we considered point clouds where the individual points had initially been encoded into a state \(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle ...|{\textbf {p}}_n\rangle \). We then introduced random errors to these initial point states by generating a random Hermitian matrix \(H_\gamma \) and subsequently applying the unitary matrix

$$\begin{aligned} U_\gamma = e^{i \epsilon H_\gamma }, \end{aligned}$$
(8)

where \(\epsilon \) parameterises the magnitude of the error. We generate n different unitaries in this manner and apply one to each qubit in the state \(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle ...|{\textbf {p}}_n\rangle \).
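A sketch of this error model, assuming single-qubit point states for simplicity (for k-qubit points the same construction applies with \(2^k \times 2^k\) matrices):

```python
import numpy as np
from scipy.linalg import expm

def random_error_unitary(dim, epsilon, rng=None):
    """Draw a random Hermitian H_gamma and return U_gamma = exp(i * epsilon * H_gamma) (Eq. 8)."""
    rng = np.random.default_rng() if rng is None else rng
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    H = (A + A.conj().T) / 2.0  # Hermitian part of a random complex matrix
    return expm(1j * epsilon * H)

# Apply an independent error unitary to each encoded point state before symmetrisation.
# noisy_states = [random_error_unitary(2, epsilon=0.1) @ p for p in point_states]
```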

Figure 7 shows how the QSVM classifier performs after errors have been applied to the initial states, either with or without the symmetrisation process applied to the state \(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle ...|{\textbf {p}}_n\rangle \). It can be seen that the advantage obtained by the symmetrisation procedure is diminished beyond a certain threshold of noise in the initial point states. This suggests that a possible drawback of the symmetrisation method in this work is that it may not exhibit an advantage if the initial input states are accompanied by a certain amount of noise.

Fig. 7

Mean test accuracy of the QSVM classifier as the noise applied to the initial point states is increased. Blue indicates that the permutation invariant symmetrisation suggested in this work was performed; green indicates no symmetrisation procedure was used (the encoding process simply consisted of the initial point states in one particular order). The mean test accuracy reported is the average over ten repeated experiments, with the shaded region indicating the uncertainty of the mean. The points were encoded using the IQP encoding as described in Sect. 3. A total of 200 point cloud samples were used per experiment

4 Symmetric state projection circuit implementation

The results in Sect. 3 suggest that permutation invariant encodings could be useful for point cloud data. These results were, however, obtained using analytical simulations, which required \(\mathcal {O}(n!)\) classical operations. We now discuss the practical aspects of implementing the encoding directly on a real quantum device.

4.1 Symmetric projection for two points

Initially, let us consider a point cloud X consisting of only two points, \({\textbf {p}}_1\) and \({\textbf {p}}_2\). As described previously, these points are separately encoded into quantum states represented by \(|{\textbf {p}}_1\rangle \) and \(|{\textbf {p}}_2\rangle \). It has been shown that the symmetrisation process \(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle \Rightarrow \mathcal {N}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\) cannot be done perfectly by a unitary transformation (Buzek and Hillery 2000). However, it can be implemented in a probabilistic manner using controlled swap gates and ancilla qubits (Barenco et al. 1996). For a point cloud data sample that contains only two points, we start with each point having been encoded in a separate quantum state \(|{\textbf {p}}_1\rangle = U({\textbf {p}}_1) |0\rangle ^{\otimes k}\) and \(|{\textbf {p}}_2\rangle = U({\textbf {p}}_2) |0\rangle ^{\otimes k}\), using an encoding circuit U. The symmetrisation procedure needs to produce the state

$$\begin{aligned} |X_s\rangle = \mathcal {N}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ), \end{aligned}$$

such that exchanging the ordering of the two points in the input will leave this new quantum state invariant. This can be achieved in the two-qubit case by preparing an ancilla qubit in the state \(\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle )\) using a Hadamard gate, using it to apply a controlled swap gate to the two input states \(|{\textbf {p}}_1\rangle \) and \(|{\textbf {p}}_2\rangle \), and then applying another Hadamard gate to the ancilla qubit. This action leaves the system in the state

$$\begin{aligned} \frac{1}{2}|0\rangle (|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ) + \frac{1}{2}|1\rangle (|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle - |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ), \end{aligned}$$
(9)

which contains both the permutation symmetrised and anti-symmetrised states. By measuring the ancilla qubit and discarding any result when the ancilla qubit is in the \(|1\rangle \) state (corresponding to the anti-symmetric state), we arrive at the permutation symmetrised state whenever the ancilla qubit is measured in the \(|0\rangle \) state. Inspecting Eq. 9, it can be seen that the probability of measuring the ancilla qubit in the desired \(|0\rangle \) state is \(\frac{1}{2}(1 + |\langle {\textbf {p}}_1 |{\textbf {p}}_2 \rangle |^2)\) (Buzek and Hillery 2000), which means that in the worst-case scenario, when the input states are orthogonal, the probability is \(\frac{1}{2}\). This symmetrisation procedure for a two-qubit system is illustrated in Fig. 8.

Fig. 8

Permutation symmetrisation circuit for two points. The circuit consists of a \(|p_1\rangle \) state, a \(|p_2\rangle \) state, and an ancilla qubit that performs a controlled swap operation. The final state of this circuit is given by \(|X\rangle = \frac{1}{2}|0\rangle (|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ) + \frac{1}{2}|1\rangle (|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle - |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\). This state is symmetric when the ancilla qubit is measured in the state \(|0\rangle \) and anti-symmetric when it is measured in the state \(|1\rangle \). By measuring the ancilla qubit and discarding any measurements in the state \(|1\rangle \), we are left with the desired symmetric quantum state \(|X_s\rangle = \mathcal {N}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\)
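The circuit of Fig. 8 can be written down directly; a minimal Qiskit sketch for two single-qubit point states (k = 1), with the ancilla post-selection carried out on the simulated statevector:

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def symmetrise_two_points(theta1, theta2):
    """Probabilistic symmetrisation of two single-qubit point states (Fig. 8).

    Qubit 0 is the ancilla; qubits 1 and 2 hold |p1> and |p2> (angle encoded here
    for illustration). Post-selecting the ancilla on |0> leaves N(|p1>|p2> + |p2>|p1>).
    """
    qc = QuantumCircuit(3)
    qc.ry(theta1, 1)  # |p1>
    qc.ry(theta2, 2)  # |p2>
    qc.h(0)
    qc.cswap(0, 1, 2)
    qc.h(0)
    state = Statevector.from_instruction(qc)
    # Project onto ancilla = |0> and renormalise (a stand-in for post-selection).
    amps = state.data.reshape(2, 2, 2)  # little-endian axes: (qubit 2, qubit 1, qubit 0)
    symmetric = amps[:, :, 0].reshape(4)
    return symmetric / np.linalg.norm(symmetric)
```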

4.2 Generalisation to n points

This procedure can be generalised to n qubits using the technique outlined by Barenco et al. (1996), which involves the iterative application of controlled swap symmetrisation operations using ancilla qubits. In this technique, we group the ancilla qubits into \(n-1\) collections. The collection indexed by f contains f ancilla qubits. The unitary operator \(V_f\) prepares the f ancilla qubits in an equal superposition of all states with Hamming weight 0 or 1. This state can be written as

$$\begin{aligned} \frac{1}{\sqrt{f + 1}}(|00...0\rangle + |10...0\rangle + |01...0\rangle + ... + |00...1\rangle ). \end{aligned}$$
(10)

This is carried out for the collections labelled from \(f=1\) to \(f=n-1\); hence, in total, there are \(c = \frac{1}{2}n(n-1)\) ancilla qubits. Construction of the gates that can implement \(V_f\) is discussed in Appendix E.

Fig. 9

Generalisation of the permutation symmetrisation process, as proposed by Barenco et al. (1996), applied to a circuit containing four input states. The unitary operator \(V_3\) performs a transformation on the ancilla qubits such that they are in the state \(\frac{1}{\sqrt{4}}(|000\rangle + |100\rangle + |010\rangle + |001\rangle )\). Similarly, \(V_2\) prepares \(\frac{1}{\sqrt{3}}(|00\rangle + |10\rangle + |01\rangle )\) and \(V_1\) prepares \(\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle )\). Through the controlled swap gates, these ancilla qubits enact every possible permutation of the input states. The input registers are left in an equal superposition of every permutation of the input states once the controlled swap gates have been applied and the ancilla qubits, after having \(V^\dagger \) applied, are measured in the \(|000000\rangle \) state. This process can be extended to any number of input states

These collections of ancilla qubits are then used to apply controlled swap gates onto the input states. This is done in an iterative manner. Unitary \(V_1\) prepares the state \(\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle )\), which controls a swap gate between states \(|{\textbf {p}}_1\rangle \) and \(|{\textbf {p}}_2\rangle \). This produces \(\frac{1}{\sqrt{2}}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\) as an outcome. In terms of the group elements of \(S_n\), the action of this ancilla corresponds to applying the combination \(I + \sigma _{12}\) to the initial state. Subsequently, unitary \(V_2\) prepares the state \(\frac{1}{\sqrt{3}}(|00\rangle + |01\rangle + |10\rangle )\), which controls swaps between the third qubit and the first two in the state \(\frac{1}{\sqrt{2}}(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle ) \otimes |{\textbf {p}}_3\rangle \). This results in the creation of the state

$$\begin{aligned} \frac{1}{\sqrt{6}}\big ( |{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle |{\textbf {p}}_3\rangle + |{\textbf {p}}_3\rangle |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle + |{\textbf {p}}_1\rangle |{\textbf {p}}_3\rangle |{\textbf {p}}_2\rangle \\ + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle |{\textbf {p}}_3\rangle + |{\textbf {p}}_3\rangle |{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_3\rangle |{\textbf {p}}_1\rangle \big ). \end{aligned}$$

To get a clearer view, one may consider these two steps as applying the permutations

$$\begin{aligned} (I + \sigma _{12})(I + \sigma _{13} + \sigma _{23}) \!=\! I + \sigma _{12} + \sigma _{13} + \sigma _{23} + \sigma _{312} +\sigma _{231}. \end{aligned}$$
(11)

As we iteratively multiply by the sum of the identity and all transpositions involving the newly added point, the end result is a combination of all possible permutations in \(S_n\), providing an equal superposition of all permutations.
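This counting argument can be checked directly; a small sketch verifying that the iterated products of Eq. 11 (extended to n points) produce every element of \(S_n\) exactly once:

```python
import itertools

def compose(a, b):
    """Compose permutations given as tuples (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(a)))

def transposition(i, j, n):
    p = list(range(n))
    p[i], p[j] = p[j], p[i]
    return tuple(p)

def iterative_permutations(n):
    """Multiply out (I + s_12)(I + s_13 + s_23)...(I + s_1n + ... + s_(n-1)n)."""
    terms = [tuple(range(n))]  # start from the identity
    for m in range(1, n):
        step = [tuple(range(n))] + [transposition(i, m, n) for i in range(m)]
        terms = [compose(s, t) for t in terms for s in step]
    return terms

perms = iterative_permutations(4)
assert sorted(perms) == sorted(itertools.permutations(range(4)))  # all 4! elements, once each
```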

This process is followed by the application of the inverse unitary operations \(V_f^\dagger \) to the ancilla qubits before their measurement. This recursively applies every possible qubit permutation to the input states, resulting in the desired symmetric superposition state when the ancilla qubits are measured in the state \(|0\rangle ^{\otimes c}\). An example circuit for the four-qubit scenario is shown in Fig. 9 and an example of the generalisation to n qubits is shown in Fig. 10. When each state is composed of multiple qubits, no extra ancilla qubits are necessary. Instead, additional controlled swap gates are applied to the extra qubits in the same manner (Buzek and Hillery 2000). This is demonstrated in Fig. 11, which shows the symmetrisation of a two-dimensional point cloud consisting of only two points.

Fig. 10

Generalisation of the permutation symmetrisation process, as proposed by Barenco et al. (1996), applied to a circuit containing n input states. The unitary operators \(V_{n-1}\) perform a transformation on the ancilla qubits such that they are in an equal superposition of all basis states with Hamming weight zero or one. If the ancilla qubits are measured to be in the zero state, then the input states will have been placed into a superposition of every possible permutation

In general, one will require \(\frac{1}{2}n(n-1)\) ancilla qubits and \(\frac{1}{2}n(n-1)\) controlled swap gates to perform the symmetrisation. We are also required to implement the unitary gates \(V_f\) on the ancilla qubits, each of which can be created out of a single-qubit gate and \(f-1\) two-qubit gates. Hence, counting the \(V_f^\dagger \) gates as well, we would require \(n(n-1)\) two-qubit gates to deal with the ancilla unitaries. Overall, the number of gates and qubits for the quantum algorithm scales as \(\mathcal {O}(n^2)\).

One drawback of this probabilistic implementation is the need to discard any states when the ancilla qubits are not in the state \(|0\rangle ^{\otimes c}\). The probability of this happening depends on the states themselves, with the probability being 1 if the input states happen to be all identical, and decreasing as the overlap between states decreases. The probability of measuring the ancilla qubit in the state \(|0\rangle \) for the case of two qubits can be shown to be \(\frac{1}{2}(1 + |\langle {\textbf {p}}_1 |{\textbf {p}}_2 \rangle |^2)\) (Buzek and Hillery 2000). In this work, we utilised angle encoding for the point encoding circuit U to produce the input states \(|{\textbf {p}}_1\rangle \) and \(|{\textbf {p}}_2\rangle \). This technique results in a relatively low probability for two specific data points to be orthogonal. As a result, the average probability of success remains relatively high even with an increase in the number of data points n. Additional details and supporting evidence for this claim are presented in Appendix F.

Although this method would be viable for small point clouds, the scaling of the probabilistic ancilla-based symmetrisation becomes problematic when a point cloud consists of a large number of points: as the number of ancilla qubits increases, so does the number of gates in the circuit, as well as the fraction of runs that must be discarded. An alternative approach could be to approximately prepare these states, producing states of the form \(|X_e\rangle \approx \mathcal {N}_e(|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle + |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\). This could potentially be achieved using techniques such as genetic state preparation algorithms (Creevey et al. 2023) or quantum generative adversarial networks (Zoufal et al. 2019). Creating approximate symmetric states within some error \(\epsilon \) could even be used to introduce a parameterised symmetry-breaking term that may help fine-tune the model, as has been shown to be useful in some variational quantum eigensolvers (Park 2021). A parameterised method to increase the dimensionality of the problem through symmetry breaking would allow control over the expressibility of the encoding, as well as potentially making the encoding harder to simulate classically.

Fig. 11

Permutation symmetrisation of a 2-dimensional point cloud consisting of two points. On the left side, we first create an entangled state for each point using an encoding function U. On the right-hand side, we implement the symmetrisation process for a two-state system. If we discard any measurements in which the ancilla qubit is in the state \(|1\rangle \), we will have prepared a symmetric encoding of this point cloud

Alternatively, there is also the possibility of not discarding any states and instead using other superposition states (mixtures of anti-symmetric and symmetric permutation terms with respect to different points). For example, in the two-point case, one could accept the anti-symmetric state \(|X_a\rangle = \mathcal {N}_a (|{\textbf {p}}_1\rangle |{\textbf {p}}_2\rangle - |{\textbf {p}}_2\rangle |{\textbf {p}}_1\rangle )\). This state will exhibit a quasi-symmetry through the fact that permuting points results in a phase shift in the quantum state, but keeps their relative magnitudes intact. The effect that quasi-symmetric or approximately symmetric encodings would have on a QML technique, especially in situations where the model may be able to learn to overcome this peculiarity, is a possible subject for further research.

5 Conclusion

This study presents a method for encoding point cloud data into a quantum state that is invariant under point order permutation, using a symmetrisation process to create a quantum superposition of all order permutations of the points. This exponentially reduces the dimensionality of the encoding, leading to an encoding that exhibits better generalisation. This was demonstrated by noting that the accuracy of the permutation invariant IQP encoding improved as the number of points increased, whereas the non-invariant IQP encoding performed worse as the number of points increased. These findings suggest that this method may be a promising solution to the problem of QSVM generalisation performance worsening with increasing qubit numbers (Kübler et al. 2021; Huang et al. 2021), and may have potential applications in future QML algorithms in various fields such as object recognition and particle physics.

This work demonstrates an encoding method to improve generalisation for permutation invariant data; however, it does not necessarily guarantee a quantum advantage exists. Recently there have been results showing that qubit permutation invariant operators may be classically tractable under certain conditions due to their reduction in dimensionality (Anschuetz et al. 2023). A key difference between our work and methods that utilise equivariant variational models (Meyer et al. 2023; Nguyen et al. 2022; Schatzki et al. 2022; Kazi et al. 2023) is that the method presented in this paper implements permutation invariance directly into the encoding step, without considering the classification model. Hence, the variational model does not need to be permutation invariant in order to capture the symmetry. Further investigation into methods of efficient implementation could help improve the technique by overcoming problems such as the \(\mathcal {O}(n^2)\) scaling of ancilla qubits in the probabilistic symmetrisation circuit (Barenco et al. 1996). Additionally, there have recently been efficient methods demonstrated for finding expectation values of symmetric states that could be of use in this technique (Zhang and Tong 2023).

Future work could focus on extending this encoding to other types of symmetries, such as rotational symmetry or translation symmetry, which could be relevant for point clouds and other data types such as images or time series. There is also the possible challenge of finding an efficient implementation on real devices and assessing how the encodings will perform in the presence of noise. The technique suggested here could be used for any data that exhibits permutation invariance, including cases where the input data itself is quantum. Real applications will be dependent on the rate of technological advancement of quantum machines; however, near-term use cases could focus on point clouds with a small number of points, as is often the case in particle physics data.