1 Introduction

Quantum machine learning (QML) is one of the most recent and most popular directions of scientific investigation in the area of quantum computing. In particular, the application of quantum computation to machine learning (ML) tasks offers solutions that, at least at a theoretical level, exhibit a quantum advantage over their classical counterparts. Furthermore, QML seems to be a good way to exploit existing prototypes of quantum computers for tackling real-world problems. In this sense, a general “practical” approach consists in executing quantum algorithms as subroutines of more complex learning schemes, in which a quantum machine is used as a co-processor within a hybrid architecture. This approach is an interesting alternative to the development of quantum algorithms that fully accomplish ML tasks under the (strong) assumptions of ideality and universality of the quantum hardware.

In the last decade, several interesting QML algorithms have been proposed and characterized from a theoretical viewpoint; sometimes, they have also been empirically tested. Remarkable examples are the quantum SVM proposed by Rebentrost et al. (2014), distance-based classifiers like the one defined by Schuld et al. (2017), and quantum neural networks, whose performance has been discussed by Abbas et al. (2021). In particular, several quantum versions of the k-nearest neighbors (k-NN) algorithm have been proposed (see Section 2). In ML, the k-NN is a very simple and widely used classification algorithm that assigns a label to an unclassified data instance according to the labels of the k nearest training instances. To do so, a suitable reference distance in the space in which the data are represented must be selected. In the classical realm, typical choices are the Hamming distance and the Euclidean distance; in the quantum realm, by contrast, the Euclidean distance has not received much consideration among the quantum k-NN variants. In this work, we propose a quantum k-nearest neighbors algorithm in which the calculation of the Euclidean distances is based on a novel quantum encoding with low qubit requirements and a simple quantum circuit, making the implementation particularly advantageous. As with other algorithms involving the quantum computation of the Euclidean distance (e.g., by means of the SWAP test), an exponential speedup over the classical calculation can be obtained assuming the availability of a quantum random access memory (QRAM, Giovannetti et al. 2008) for data retrieval; otherwise, there is no true quantum advantage in terms of time complexity. From a practical viewpoint, in this article, we analyze the performance of the proposed quantum k-NN in terms of classification accuracy and correctness of the nearest neighbors found (evaluated through the Jaccard index). The algorithm has been implemented using Qiskit and run with three different execution modalities: classical, statevector, and simulation. An empirical evaluation on a real quantum machine, instead, was prevented by the number of qubits required by the considered experiments.

The article is structured as follows: Section 2 provides some background information; Section 3 presents the new quantum k-nearest neighbors algorithm based on the Euclidean distance metric; Section 4 deals with the implementation of the algorithm; Section 5 describes the experimental evaluation and the results obtained; Section 6 concludes the article.

2 Background

This section presents background information about quantum machine learning, the quantum k-nearest neighbors algorithms available in the literature, and the uses of the (squared) Euclidean distance in the field of quantum machine learning.

2.1 Quantum machine learning

In general, ML is the automation of methodologies for extracting information from collected data. If the data analysis techniques are implemented on conventional digital computers, we refer to classical ML; if quantum machines are employed, we refer to quantum ML. A general reason justifying the efforts in developing new QML schemes is suggested by Biamonte et al. (2017): since (even small) quantum systems are difficult to simulate with classical computers, we can conjecture that (even small) quantum processors can find structures in data that are difficult to discover classically. Therefore, QML could be the right path towards non-trivial applications of the small-scale quantum machines available today and in the near future. On the other hand, under strong assumptions of universality, large scale, and fault tolerance, it is possible to formulate several QML algorithms that outperform their classical counterparts. This is very important for the comprehension of the foundations of quantum computing and for showing the actual potential of quantum computers. However, to promote the advent of quantum technologies in the near term, it is useful to take into account the limitations of the current quantum hardware while seeking new QML schemes.

From the mathematical viewpoint, there is another relevant motivation for developing ML algorithms to be executed by quantum machines, given by a formal analogy between quantum mechanics and ML: both fields rely on matrix operations in high-dimensional vector spaces. In practice, the Hilbert spaces, in which physical quantum systems are described, can be used as feature spaces for data representations. In this framework, linear algebraic operations are physically realized by the time evolution of quantum states; for instance, in the circuit model of quantum computation, the evolution is described as the action of quantum gates, i.e., unitary operators. In addition, representing data as quantum states is also advantageous in terms of space resources, since the dimension of the Hilbert space of a multi-qubit system is exponential in the number of qubits. Hence, the controlled dynamics of a small number of qubits towards a target state may correspond to the application of a complex linear algebraic operation on the considered feature space.

A crucial notion in QML is quantum encoding, that is, any procedure that encodes classical data (e.g., a list of symbols) into quantum states. In particular, efficiently loading large amounts of data into quantum architectures is a serious bottleneck at the current stage of QML; indeed, the state preparation required for running several well-known QML algorithms can be done efficiently only under the strong assumption of the availability of a QRAM. In more detail, given an n-qubit register, let \(\{|{i}\rangle \}_{i=0,...,2^n-1}\) be a fixed orthonormal basis of the corresponding Hilbert space, which we call the computational basis. The simplest quantum encoding is the basis encoding, in which the bit strings of length n are encoded into the states that form the computational basis. Therefore, n qubits are used to encode n bits of classical information, with interesting quantum opportunities, like creating superpositions of data and enabling non-classical correlations via entanglement. A more space-efficient quantum encoding is the amplitude encoding, in which a data instance represented by a normalized complex vector \(\textbf{x} \in \mathbb {C}^{2^n}\) is encoded into the coordinates (or amplitudes) of a quantum state with respect to the computational basis, namely,

$$\begin{aligned} |\psi \rangle =\sum _{i=0}^{2^n-1}x_i |i\rangle \qquad (n\text {-qubit state}). \end{aligned}$$

The amplitude encoding exploits the exponential storing capacity of a quantum memory, but it does not allow the direct retrieval of the stored data. Indeed, the amplitudes cannot be observed, and only the probabilities \(|x_i|^2\) can be estimated. The encoding procedure used in this work is based on amplitude encoding.
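As a minimal illustration (a sketch of ours, not taken from any specific QML algorithm), the amplitude encoding of a real vector of dimension \(2^n\) can be realized in Qiskit as follows:

```python
import numpy as np
from qiskit import QuantumCircuit

# Data instance of dimension 2^n (here, n = 2 qubits)
x = np.array([0.5, -1.0, 2.0, 0.0])
x = x / np.linalg.norm(x)   # the amplitudes must form a unit vector

qc = QuantumCircuit(2)
qc.initialize(x, [0, 1])    # |psi> = sum_i x_i |i>
```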

Let us conclude this introductory section by arguing that QML is probably the most promising way to find effective applications of the existing small-scale quantum computers. In particular, one can also drop the requirement that an ML task must be entirely accomplished by quantum computations in favor of hybrid approaches, in which quantum co-processors efficiently solve specific subproblems within more complex learning schemes. Moreover, the quantum speedup is not the only quantum advantage that can be pursued: accuracy in prediction, expressive power, generalization capability, and the ability to avoid plateaus in training are also noteworthy figures of merit in evaluating the learning performance of quantum machines.

2.2 Quantum k-NN

The k-nearest neighbors algorithm (Fix and Hodges 1951) is a classification algorithm that consists of three steps: the computation of the distances between the test instance and the training elements; the identification of the k nearest neighbors, i.e., the k training elements closest to the test instance; and the prediction of the class label through a majority voting (see the sketch below). Several quantum variants with different distance measures have been proposed, but an aspect common to all of them is the exploitation of a superposition state in order to perform parallel operations, such as computing the distances from all training elements simultaneously (quantum parallelism).
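For reference, the three steps correspond to the following minimal classical implementation (a NumPy sketch of ours, here with the Euclidean distance):

```python
import numpy as np

def knn_classify(train_X, train_y, test_x, k):
    # 1. Compute the distances from the test instance to all training instances
    dists = np.linalg.norm(train_X - test_x, axis=1)
    # 2. Identify the k nearest neighbors
    nn_idx = np.argsort(dists)[:k]
    # 3. Majority voting on the neighbors' labels
    labels, counts = np.unique(train_y[nn_idx], return_counts=True)
    return labels[np.argmax(counts)]
```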

First of all, quantum k-NN algorithms employing the Hamming distance, thus requiring binary features, have been proposed by Schuld et al. (2014), Wiśniewska and Sawerwain (2018), Ruan et al. (2017), Zhou et al. (2021), and Li et al. (2021). Specifically, the first two works compute the Hamming distances by encoding the sums of the qubit differences (obtained through controlled-NOT gates) into the amplitudes by means of a unitary operation (an idea first proposed by Trugenberger 2002); the classification is then performed directly by measuring, without explicitly selecting the nearest neighbors. The other works, instead, exploit the incrementation circuit presented by Kaye (2004) in order to obtain the distance values in basis encoding. After that, Ruan et al. (2017) select the data with a distance lower than a given threshold by means of an OR gate and a projection operation to directly perform the classification, Zhou et al. (2021) exploit Dürr’s minimization algorithm (Dürr and Høyer 1999) to find the k minimum distance values, while Li et al. (2021) apply a novel quantum search procedure inspired by binary search in order to identify the minimum.

Concerning non-binary features, distance measures related to the angle between vectors, such as the cosine distance, are widely used. For instance, Dang et al. (2018) and Wang et al. (2019) have applied a quantum k-NN variant of this type to image classification tasks. In particular, the SWAP test (Buhrman et al. 2001) without measurements is used to compute the distances, whose values are then transferred to the qubit states through the amplitude estimation algorithm (Brassard et al. 2002). Finally, the nearest neighbors are found by means of Dürr’s algorithm. This workflow was first presented by Wiebe et al. (2015), although for finding only the nearest neighbor. Instead, Afham et al. (2020) and Ma et al. (2021) have proposed a conceptually simpler variant, which consists in iterating SWAP tests and measurements in order to estimate a quantity proportional to the squared cosine similarity with respect to the training instances. In addition, the model allows processing multiple test instances in parallel, as shown by Ma et al. (2021). Afham et al. have also recently proposed another variant (Basheer et al. 2021) whose workflow, however, is not so different from that of the previously described works; indeed, it involves the SWAP test, a quantum analog-to-digital conversion algorithm (Mitarai et al. 2019), and a variation of Dürr’s algorithm.

Other interesting distance measures for non-binary features are the Euclidean, the Mahalanobis, and the polar distances. Specifically, the Euclidean distance is dealt with in depth in Section 2.3. Regarding the other ones, Gao et al. (2022) have proposed a variant based on the Mahalanobis distance, while Feng et al. (2023) have presented a quantum k-NN based on the polar distance, which combines angle and module length information through an adjustable parameter. In detail, the Mahalanobis distance is computed by exploiting the phase estimation algorithm (Cleve et al. 1998), combined with Hamiltonian simulation (Rebentrost et al. 2018), and a controlled rotation; instead, the polar distance is calculated through the SWAP test without measurements and a pair of Toffoli gates (one of which is extended). After that, in both works, the distances are encoded in the qubit states by applying the amplitude estimation algorithm (or its coherent version, proposed by Wiebe et al. 2015), and the nearest neighbors are retrieved through Dürr’s algorithm (or an algorithm based on it, proposed by Miyamoto et al. 2019). To conclude, it is also worth mentioning the quantum k-NN based on a quantum sorting subroutine that has been proposed by Quezada et al. (2022). It requires a metric operator computing distances and encoding them in qubit states, an oracle that identifies sorted sequences, and Grover’s algorithm (Grover 1996); as in other works, the classification is performed directly, without identifying the k nearest neighbors.

2.3 Quantum Euclidean distance

The Euclidean distance is a well-known distance metric in ML. Here, the definition of its squared version is provided, since it will be useful in the following sections. In particular, given two vectors \(\textbf{u}, \textbf{v} \in \mathbb {R}^n\), the squared Euclidean distance between them, \(d^2(\textbf{u},\textbf{v})\), is defined as

$$\begin{aligned} d^2(\textbf{u},\textbf{v}) = \Vert \textbf{u} - \textbf{v} \Vert ^2 = \Vert \textbf{u}\Vert ^2 - 2 \langle \textbf{u}, \textbf{v}\rangle + \Vert \textbf{v}\Vert ^2, \end{aligned}$$
(1)

where \(\langle \textbf{u}, \textbf{v}\rangle \) is the scalar product between \(\textbf{u}\) and \(\textbf{v}\).

The distance metric in question has also been employed in the field of QML. For instance, Lloyd et al. (2013) have proposed a quantum procedure to estimate the squared Euclidean distance between a data point and the centroid of a cluster, i.e., the mean of the elements contained in a group of data. Specifically, the algorithm relies on the SWAP test, which is applied to the index registers, and does not require the input vectors to have unit norms. An analogous procedure has been used by Sarma et al. (2020) to provide a hybrid k-means clustering algorithm (in which the centroids are classically computed), and by Getachew (2020) for a hybrid version of the k-medians one. Instead, Yu et al. (2020) have proposed three quantum algorithms to estimate three similarity measurements for datasets, based on the squared Euclidean distance. In particular, none of these procedures requires unit-norm input vectors; all of them exploit quantum interference (given by a change of basis) and use the amplitude estimation algorithm to determine the similarity measures. Finally, it is worth mentioning the quantum binary classifier devised by Schuld et al. (2017). In detail, the classifier circuit consists of a Hadamard gate (necessary for the quantum interference), a conditional measurement, and a final measurement. By iterating the procedure just described, a probability value related to the squared Euclidean distances is estimated for each class. In this last work, input vectors with unit norms are considered.

Regarding the quantum k-NN model, as far as the authors know, the only variant based on the Euclidean distance available in the literature has been presented by Fastovets et al. (2019); it exploits the procedure proposed by Lloyd et al. (2013) to estimate the pairwise distance values, and Dürr’s minimization algorithm to find the k nearest neighbors. The main drawback lies in the need for multiple iterations of each of these steps, since both involve a final measurement. In addition, Dürr’s algorithm requires an oracle, i.e., a black-box function, to be used. Actually, the nearest neighbor algorithm proposed by Wiebe et al. (2015) also admits the Euclidean distance as the distance metric. However, its workflow, which has been described in Section 2.2, is quite complex to implement. Finally, the computation of the single linkage, namely, one of the set similarity measures considered by Yu et al. (2020), could be seen as a generalization of the nearest neighbor search. Nevertheless, their quantum algorithm uses the reciprocals of the input vectors; as a consequence, theoretically, the original distance relationships are not preserved.

3 Method

In this section, the new quantum k-NN algorithm based on the Euclidean distance metric is presented. In addition, a brief discussion of the algorithm’s complexity compared with that of the classical counterpart is provided.

3.1 Algorithm

In the quantum k-NN algorithm introduced in this work, a quantity related to the squared Euclidean distance is computed in parallel for all training instances by means of a novel encoding and a simple quantum circuit, which performs a SWAP-test-like procedure without controlled-SWAP gates. In practice, the algorithm exploits quantum interference and encodes these distance-related values, which are then estimated through measurements, in the amplitudes of the quantum states. It is worth highlighting that the input vectors do not undergo a unit-norm normalization, which would result in a significant loss of information. In addition, the number of qubits needed is low, and no oracle is involved, making the implementation feasible. A more detailed and formal description of the steps of this new algorithm is provided below.

3.1.1 Data preprocessing

Let us consider a training set \(\mathcal {U} = \{\textbf{u}_0,..., \textbf{u}_{N-1} \}\) of real-valued data instances \(\textbf{u}_j \in \mathbb R^d\), and let \(\mathcal {L} = \{l_0,..., l_{N-1} \}\) be the set of corresponding labels. In addition, let us consider a test instance \(\textbf{u}' \in \mathbb R^d\), whose label is unknown.

The preprocessing step of the algorithm consists in centering and normalizing the data features into the range \(\left[ -\frac{1}{2\sqrt{d}}, \frac{1}{2\sqrt{d}}\right] \) (procedure detailed in Section 4). In this way, the maximum norm of the resulting vectors turns out to be \(\frac{1}{2}\) and the maximum (squared) Euclidean distance turns out to be 1.

3.1.2 Initial state and encoding(s)

Let \(\mathcal {V} = \{\textbf{v}_0,..., \textbf{v}_{N-1} \}\) and \(\textbf{v}'\) be the training set and the test instance after the preprocessing step described above. The quantum circuit is then initialized in the state

$$\begin{aligned} |{\psi }\rangle = |{0}\rangle \otimes \left( \frac{1}{\sqrt{2}} (|{0}\rangle |{\alpha }\rangle + |{1}\rangle |{\beta }\rangle )\right) , \end{aligned}$$
(2)

where

$$\begin{aligned} |{\alpha }\rangle = \frac{1}{\sqrt{N}} \sum _{j=0}^{N-1}|{j}\rangle \sum _{i=0}^{F-1}x_{ji}|{i}\rangle , \end{aligned}$$
$$\begin{aligned} |{\beta }\rangle = \frac{1}{\sqrt{N}} \sum _{j=0}^{N-1}|{j}\rangle \sum _{i=0}^{F-1}x'_{ji}|{i}\rangle . \end{aligned}$$

Here, F is a positive integer value depending on the encoding used, while \(\textbf{x}_j = \{ x_{ji} \}_{i=0,...,F-1}\) and \(\textbf{x}'_j = \{ x'_{ji} \}_{i=0,...,F-1}\) represent the quantum encoded versions of the preprocessed training and test data, respectively. Therefore, the number of qubits required is \(2 + \lceil \log _2 N \rceil + \lceil \log _2 F \rceil \). In particular, two encodings, whose advantages are discussed in the next sections, have been devised and tested in this work: extension and translation. Let us look at their definitions. As regards the extension encoding, \(F = 2d + 3\) and

$$\begin{aligned} x_{ji}=\left\{ \begin{array}{ll} \frac{2}{\sqrt{3}}v_{ji} &{} 0 \le i< d \\ \frac{2}{\sqrt{3}}v_{j(i-d)} &{} d \le i< 2d \\ \frac{2}{\sqrt{3}}\Vert \textbf{v}_j\Vert &{} i = 2d \\ 0 &{} i = 2d+1 \\ \sqrt{1 - 4\Vert \textbf{v}_j\Vert ^2} &{} i = 2d+2, \end{array}\right. \quad x'_{ji}=\left\{ \begin{array}{ll} -\frac{2}{\sqrt{3}}v'_{i} &{} 0 \le i< d \\ -\frac{2}{\sqrt{3}}v'_{(i-d)} &{} d \le i< 2d \\ \frac{2}{\sqrt{3}}\Vert \textbf{v}_{j}\Vert &{} i = 2d \\ \sqrt{1 - \frac{4}{3}(2\Vert \textbf{v}'\Vert ^2 + \Vert \textbf{v}_j\Vert ^2)} &{} i = 2d+1 \\ 0 &{} i = 2d+2, \end{array}\right. \end{aligned}$$

with \(v_{ji}\) being the i-th feature of the j-th preprocessed training instance, and \(v'_i\) being the i-th feature of the preprocessed test instance. Instead, for the translation encoding, \(F = 2d + 4\) and

$$\begin{aligned} x_{ji}=\left\{ \begin{array}{ll} v_{ji} &{} 0 \le i< d \\ v_{j(i-d)} &{} d \le i< 2d \\ \Vert \textbf{v}_j\Vert &{} i = 2d \\ \frac{1}{2} &{} i = 2d+1 \\ 0 &{} i = 2d+2 \\ \sqrt{\frac{3}{4} - 3\Vert \textbf{v}_j\Vert ^2} &{} i = 2d+3, \end{array}\right. \quad x'_{ji}=\left\{ \begin{array}{ll} -v'_{i} &{} 0 \le i< d \\ -v'_{(i-d)} &{} d \le i< 2d \\ \Vert \textbf{v}_j\Vert &{} i = 2d \\ -\frac{1}{2} &{} i = 2d+1 \\ \sqrt{\frac{3}{4} - (2\Vert \textbf{v}'\Vert ^2 + \Vert \textbf{v}_j\Vert ^2)} &{} i = 2d+2 \\ 0 &{} i = 2d+3. \end{array}\right. \end{aligned}$$

As a consequence, the number of qubits required is the same for both encodings. It is also worth highlighting that, in both cases, \(\textbf{x}_j\) (and therefore \(|{\alpha }\rangle \)) is independent of the preprocessed test instance \(\textbf{v}'\), whereas \(\textbf{x}'_j\) (and therefore \(|{\beta }\rangle \)) depends on the preprocessed training set \(\mathcal {V}\).
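To make the construction concrete, the following sketch (ours; `v_j` and `v_prime` are 1-D NumPy arrays) builds the extension-encoded vectors for a single preprocessed training instance and the preprocessed test instance. The preprocessing of Section 3.1.1 guarantees that the square-root arguments are non-negative:

```python
import numpy as np

def extension_encoding(v_j, v_prime):
    """Extension encoding (F = 2d + 3) of one preprocessed training
    instance v_j and the preprocessed test instance v_prime."""
    c = 2.0 / np.sqrt(3.0)
    n_j = np.linalg.norm(v_j)
    n_p = np.linalg.norm(v_prime)
    x_j = np.concatenate([
        c * v_j, c * v_j,                       # features (duplicated)
        [c * n_j,                               # norm of v_j
         0.0,
         np.sqrt(1.0 - 4.0 * n_j ** 2)]         # unit-norm completion
    ])
    x_p = np.concatenate([
        -c * v_prime, -c * v_prime,
        [c * n_j,
         np.sqrt(1.0 - (4.0 / 3.0) * (2.0 * n_p ** 2 + n_j ** 2)),
         0.0]
    ])
    # Both are unit vectors, and
    # <x_j, x_p> = (4/3) * (d^2(v_j, v_prime) - ||v_prime||^2)
    return x_j, x_p
```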

3.1.3 Bell-H operation and final state

After the initial state preparation, an operation denoted here as Bell-H is performed. In detail, the Bell-H corresponds to a SWAP-test-like procedure in which the states of interest (\(|{\alpha }\rangle \) and \(|{\beta }\rangle \)), initially prepared in superposition, interfere by means of a controlled-NOT (CNOT) gate. The overall circuit, including the initial state preparation, acts on \(2 + I\) qubits, where \(I = \lceil \log _2 N \rceil + \lceil \log _2 F \rceil \). In practice, the Bell-H circuit consists of a Hadamard gate applied to the first qubit, a CNOT gate with the first qubit as control and the second qubit as target, and another Hadamard gate applied to the first qubit. Hence, the difference with respect to a standard Bell circuit, commonly used to generate Bell states, lies in the presence of an additional downstream Hadamard gate. A significant advantage with respect to the standard SWAP test lies in the constant number of elementary gates required (three), independently of the size of the states involved; as a drawback, the preparation of the input state is more complex, especially without the availability of a QRAM.
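A possible Qiskit realization of this circuit is sketched below (ours; it uses Qiskit's initialize function for the state preparation, as the implementation described in Section 4 does, and assumes `alpha` and `beta` are the unit-norm amplitude vectors of \(|{\alpha }\rangle \) and \(|{\beta }\rangle \)):

```python
import numpy as np
from qiskit import QuantumCircuit

def bell_h_circuit(alpha, beta):
    """Prepare |0> (x) (|0>|alpha> + |1>|beta>)/sqrt(2), then apply
    the Bell-H operation. alpha and beta have length 2^I."""
    I = int(np.log2(len(alpha)))          # index + features qubits
    qc = QuantumCircuit(2 + I)
    # Qubits 0..I-1: index and features registers (little-endian);
    # qubit I: "second qubit"; qubit I+1: "first qubit" (the ancilla)
    amplitudes = np.concatenate([alpha, beta]) / np.sqrt(2.0)
    qc.initialize(amplitudes, list(range(I + 1)))
    # Bell-H: Hadamard, CNOT (first qubit as control), Hadamard
    qc.h(I + 1)
    qc.cx(I + 1, I)
    qc.h(I + 1)
    return qc
```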

Table 1 Properties of the two encodings

The output state obtained after the Bell-H operation is

$$\begin{aligned} |{\gamma }\rangle = \frac{1}{2} \left( |{0}\rangle \otimes \left( \frac{1}{\sqrt{2}}(|{0}\rangle |{\alpha }\rangle + |{0}\rangle |{\beta }\rangle + |{1}\rangle |{\beta }\rangle + |{1}\rangle |{\alpha }\rangle )\right) + |{1}\rangle \otimes \left( \frac{1}{\sqrt{2}}(|{0}\rangle |{\alpha }\rangle - |{0}\rangle |{\beta }\rangle + |{1}\rangle |{\beta }\rangle - |{1}\rangle |{\alpha }\rangle )\right) \right) , \end{aligned}$$
(3)

and the probability of measuring 1 on the first qubit is \(\frac{1}{2}(1 - \langle {\alpha }|{\beta }\rangle )\) (the derivation is shown in Appendix A.1). By pulling out the summation on the index register (\(|{j}\rangle \) inside \(|{\alpha }\rangle \) and \(|{\beta }\rangle \)) and tracing out (i.e., discarding) the second qubit and the features register (\(|{i}\rangle \) inside \(|{\alpha }\rangle \) and \(|{\beta }\rangle \)), it is possible to write the final state as

$$\begin{aligned} |{\delta }\rangle = \frac{1}{\sqrt{N}} \sum _{j=0}^{N-1}\left[ \sqrt{1 - s(\textbf{v}_j, \textbf{v}')}|{0}\rangle + \sqrt{s(\textbf{v}_j, \textbf{v}')}|{1}\rangle \right] |{j}\rangle , \end{aligned}$$
(4)

with \(s(\textbf{v}_j, \textbf{v}')\) being a similarity measure related to the squared Euclidean distance between \(\textbf{v}_j\) and \(\textbf{v}'\); hence, the lower the distance, the higher the \(s(\textbf{v}_j, \textbf{v}')\) value. Specifically, \(s(\textbf{v}_j, \textbf{v}')\) is given by

$$\begin{aligned} s(\textbf{v}_j, \textbf{v}') = P(\textit{qubit}_1 = 1 \mid j) = \frac{1}{2}(1 - \langle \textbf{x}_j, \textbf{x}'_j\rangle ), \end{aligned}$$
(5)

where \(\textit{qubit}_1\) is the first qubit in the circuit, and the value of \(\langle \textbf{x}_j, \textbf{x}'_j\rangle \) depends on the encoding used, as shown in Table 1 (a more detailed description of how Eq. 4 is obtained is provided in Appendix A.2).

Looking at the first row of Table 1 and recalling Eq. (1), two aspects can be noticed: \(\langle \textbf{x}_j, \textbf{x}'_j\rangle \) is strictly related to the squared Euclidean distance between \(\textbf{v}_j\) and \(\textbf{v}'\) for both encodings, and the term \(\Vert \textbf{v}'\Vert ^2\) does not appear. However, the latter is not an issue, since \(\Vert \textbf{v}'\Vert ^2\) is the same for all training instances. The other information contained in the table allows understanding the strong point of each encoding. In particular, the extension encoding maximizes the range of possible values of \(s(\textbf{v}_j, \textbf{v}')\), allowing a better representation of the similarity values, namely, a representation less sensitive to the presence of noise. Instead, the translation encoding maximizes the probability of measuring 1 on the first qubit, a favorable situation for reasons that will become clear in the next section. Finally, it is worth highlighting that the range of \(s(\textbf{v}_j, \textbf{v}')\) is determined by \(\textbf{v}'\); in detail, the minimum range corresponds to a test instance with norm 0, whereas the maximum range corresponds to a test instance with norm equal to \(\frac{1}{2}\) (the maximum possible value).

3.1.4 Measurements and distance estimate(s)

After the Bell-H operation, the state of the qubits shown in Eq. (4), i.e., the first qubit in the circuit and the index register \(|{j}\rangle \), is measured. In particular, the first qubit is measured first. As a consequence, when 1 (0) is obtained, the indices of the nearest neighbors will have the highest (lowest) probability. If the index register were measured first, the indices would be uniformly sampled. By iterating the circuit execution and the measurement process, the joint probabilities P(0, j) and P(1, j) are estimated as relative frequencies, allowing in turn the estimation of the Euclidean distances. Indeed, the following relationships hold:

$$\begin{aligned} P(0, j) = \frac{1 + \langle \textbf{x}_j, \textbf{x}'_j\rangle }{2N} \implies \langle \textbf{x}_j, \textbf{x}'_j\rangle = 2N \times P(0, j) - 1, \end{aligned}$$
(6)
$$\begin{aligned} P(1, j) = \frac{1 - \langle \textbf{x}_j, \textbf{x}'_j\rangle }{2N} \implies \langle \textbf{x}_j, \textbf{x}'_j\rangle = 1 - 2N \times P(1, j). \end{aligned}$$
(7)

Moreover, for the extension encoding (see Table 1),

$$\begin{aligned} d(\textbf{v}_j, \textbf{v}') = \sqrt{\frac{3}{4} \langle \textbf{x}_j, \textbf{x}'_j\rangle + \Vert \textbf{v}'\Vert ^2}, \end{aligned}$$
(8)

where \(d(\textbf{v}_j, \textbf{v}')\) is the Euclidean distance between \(\textbf{v}_j\) and \(\textbf{v}'\), while, for the translation encoding,

$$\begin{aligned} d(\textbf{v}_j, \textbf{v}') = \sqrt{\langle \textbf{x}_j, \textbf{x}'_j\rangle + \frac{1}{4} + \Vert \textbf{v}'\Vert ^2}. \end{aligned}$$
(9)

Regardless of the encoding used, the Euclidean distances \(d(\textbf{v}_j, \textbf{v}')\) can be estimated from either P(0, j) or P(1, j). In this work, two ways of combining the information of the two joint probabilities have been devised and tested: avg and diff. In detail, the avg distance estimate is the average of the Euclidean distance estimated from P(0, j) and the Euclidean distance estimated from P(1, j). Instead, for the diff distance estimate, the scalar product value is obtained as

$$\begin{aligned} \langle \textbf{x}_j, \textbf{x}'_j\rangle = N \times (P(0,j) - P(1,j)), \end{aligned}$$

and the Euclidean distance is retrieved through Eq. (8) or (9), depending on the encoding used.
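The two estimation strategies can be summarized by the following sketch (ours), where `p0` and `p1` are arrays containing the estimates of P(0, j) and P(1, j), and out-of-range square-root arguments are clipped to [0, 1] as described in Section 4:

```python
import numpy as np

def estimate_distances(p0, p1, test_norm, N, encoding="extension"):
    """Estimate the N Euclidean distances from the joint probability
    estimates, using both the avg and the diff strategies."""
    def dist(scalar_prod):
        if encoding == "extension":
            arg = 0.75 * scalar_prod + test_norm ** 2        # Eq. (8)
        else:  # translation
            arg = scalar_prod + 0.25 + test_norm ** 2        # Eq. (9)
        return np.sqrt(np.clip(arg, 0.0, 1.0))
    avg = 0.5 * (dist(2 * N * p0 - 1)       # scalar product via Eq. (6)
                 + dist(1 - 2 * N * p1))    # scalar product via Eq. (7)
    diff = dist(N * (p0 - p1))
    return avg, diff
```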

Finally, it is worth making two observations: given the estimates of P(0, j) and P(1, j) for two indices \(j_1, j_2 \in \{0,..., N-1\}\), the corresponding training instances might be sorted differently according to the avg and diff distance estimates; moreover, if the distance estimates can be mathematically retrieved (i.e., the arguments of the square roots are non-negative), the avg distance estimate is always lower than or equal to the corresponding diff estimate. More details about these observations can be found in Appendix B.

3.1.5 k nearest neighbors and classification

Once all the Euclidean distances \(d(\textbf{v}_j, \textbf{v}')\) have been estimated, the training elements are classically sorted according to them. Then, the k nearest neighbors are identified, and the test instance is classified by means of a majority voting on the labels of the nearest neighbors.

3.2 Complexity observations

In terms of complexity, the difference between the proposed quantum k-NN algorithm and its classical counterpart lies in the estimation/computation of the Euclidean distances. Indeed, the data preprocessing, the k nearest neighbors identification (performed through a distance sorting operation), and the classification are done classically in both cases. Specifically, the data preprocessing has complexity O(Nd), and the identification of the k nearest neighbors has complexity \(O(N \log N + k)\), while the classification step has complexity O(k).

In the proposed algorithm, the Euclidean distances are not computed exactly but are estimated. To this end, a certain number of shots, namely, measurement iterations, is needed. In particular, each iteration requires preparing the initial state, running the Bell-H quantum circuit, and measuring the state of the qubits. The initial state preparation is a rather complex operation that can be efficiently accomplished if a QRAM is available.

Proposition 1

Assuming the availability of a QRAM, the estimation of the Euclidean distances in the proposed quantum k-NN algorithm has a complexity of \(O( shots \times (\log N + \log d) + N)\).

Proof

Let \(x_{ji}\) and \(x'_{ji}\) be real numbers stored in the QRAM as classical floating point numbers. Then, the initial state can be prepared with complexity \(O(\log (NF))\). Indeed, the states \(|{\alpha }\rangle \) and \(|{\beta }\rangle \) can be retrieved from the QRAM with complexity \(O(\log (NF))\) by definition of QRAM. Instead, the Bell-H quantum circuit consists of a constant number of elementary gates; therefore, its complexity is O(1). The measurement step also has constant complexity, as the index qubits are measured simultaneously. Finally, given the state counts, the computation of the distance estimates has cost O(N). Hence, the complexity of the Euclidean distances estimation is \(O( shots \times \log (NF) + N)\), which is equal to \(O( shots \times (\log N + \log d) + N)\). \(\square \)

Instead, if a QRAM is not available, it is necessary to prepare the desired state starting from \(|{0}\rangle ^{\otimes (1+I)}\), and the number of gates needed depends on the architecture of the quantum processor. Notice that, since the index register states have non-uniform probabilities that depend also on the outcome of the first qubit measurement, it is not possible to obtain a predefined precision for all the distance estimates. Moreover, \( shots \) accounts for the estimation of all N Euclidean distances.

In the classical k-NN algorithm, the computation of the Euclidean distances has complexity O(Nd). If we assume that shots is a constant value, the complexity of the Euclidean distances estimation in the proposed quantum k-NN turns out to be \(O(\log d + N)\), which is lower than O(Nd). In practice, however, the higher N, the higher the number of shots needed to properly estimate the Euclidean distances. If, for instance, we assume that shots grows logarithmically with N, the complexity turns out to be \(O(\log N \log d + N)\), which is still lower than O(Nd). Therefore, under different assumptions on the growth of shots, the proposed quantum k-NN algorithm exhibits a lower complexity than its classical counterpart.

4 Implementation

This section deals with the implementation of the algorithm presented in Section 3. In detail, the Euclidean distance quantum k-NN has been implemented in Python using Qiskit, the open-source SDK provided by IBM (Anis et al. 2021). The code, which is publicly available at https://github.com/ZarHenry96/euclidean-quantum-k-nn, supports different execution modalities, among which:

  • classical, which does not involve quantum circuits but runs a classical k-NN with the Euclidean distance metric, after the preprocessing step described in Section 3.1.1;

  • statevector, which processes the final state vector of the circuit and thus represents an ideal execution with an infinite number of shots (in this case, no measurement is performed);

  • simulation (local simulation in the code), which samples from the final probability distribution of the circuit in order to provide state counts.

None of these modalities takes into account the presence of noise. Furthermore, a sample circuit (for the simulation modality) is shown in Fig. 1.

Fig. 1 Example of quantum circuit for the quantum k-NN based on the Euclidean distance. In detail, \(N=4\), \(d=2\), and the execution modality is simulation (statevector does not include the final measurements)

In general, the implementation of the algorithm adheres to the description provided in Section 3.1, but some technical aspects deserve to be mentioned. Concerning the preprocessing step, each feature is normalized by subtracting the average of the maximum and minimum feature values in the training set, and dividing by the feature range (computed on the training set, and set to 1 if the feature is constant) multiplied by \(\sqrt{d}\); in addition, if a feature of the test instance exceeds the target range after the normalization, it is clipped to the exceeded edge value (see the sketch below). Let us then focus on the execution modalities involving quantum circuits, since the functioning of the classical one is quite straightforward. Specifically, the preparation of the initial state \(|{\psi }\rangle \) (Eq. 2) is done by providing the initialization function supplied by Qiskit with the amplitudes of all qubits except the first one, which is in state \(|{0}\rangle \) by default. As regards the indices not associated with any training instance, which exist if N is not a power of 2, they are excluded when computing the distance values (although they are kept in the joint probabilities estimation). In addition, if the argument of the square root in Eq. (8) or (9) is negative or larger than 1 due to the state counts distribution obtained, the distance value is approximated to 0 or 1, respectively. Finally, training instances with the same Euclidean distance are sorted by increasing index in the training set (this also holds for the classical execution modality).
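For illustration, the preprocessing step just described corresponds to the following sketch (ours):

```python
import numpy as np

def preprocess(train_X, test_x):
    """Center and rescale the features into [-1/(2*sqrt(d)), 1/(2*sqrt(d))],
    using statistics computed on the training set only."""
    d = train_X.shape[1]
    f_min, f_max = train_X.min(axis=0), train_X.max(axis=0)
    center = (f_max + f_min) / 2.0
    f_range = np.where(f_max > f_min, f_max - f_min, 1.0)  # constant -> 1
    train_V = (train_X - center) / (f_range * np.sqrt(d))
    test_v = (test_x - center) / (f_range * np.sqrt(d))
    bound = 1.0 / (2.0 * np.sqrt(d))
    test_v = np.clip(test_v, -bound, bound)  # clip out-of-range features
    return train_V, test_v
```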

To conclude, it is worth highlighting that the Laplace smoothing (Wilson 1927) has been applied to the estimation of the joint probabilities in the simulation modality. In practice, given a number of counts c for the state \(|{a}\rangle |{j}\rangle \), with \(a \in \{0, 1\}\), the probability P(aj) is estimated as

$$\begin{aligned} P(a, j) = \frac{c + p}{ shots + 2Np}, \end{aligned}$$

where p is the number of pseudocounts added (for each state), and \( shots \) is the total number of measurements. In detail, the pseudocounts are added only to the counts of the significant indices, i.e., the indices actually associated with training instances, as illustrated below.
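In code, the smoothed estimate reads (a short sketch of ours):

```python
def smoothed_joint_probability(c, shots, N, p=10):
    """Laplace-smoothed estimate of P(a, j), given c counts of the
    state |a>|j> and p pseudocounts per (significant) state."""
    return (c + p) / (shots + 2 * N * p)
```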

5 Empirical evaluation

In this section, the methods tested, the datasets used, the experimental setup employed, and the results obtained are presented. In particular, the experiments have been run on a shared machine with an Intel Xeon Gold 6238R processor running at 2.20 GHz and 125 GB of RAM.

5.1 Methods

The quantum k-NN based on the Euclidean distance introduced in Section 3 has been tested with different execution modalities and under different (encoding, distance estimate) configurations, which are reported in Table 2. Runs on real quantum devices have not been performed due to the lack of free-access devices with enough qubits. In addition, for comparison, some classical baseline methods (listed in the same table) have been considered; in particular, the results for these methods have been taken from the article by Zardini et al. (2023) (more precisely, from Zardini 2023b).

Table 2 Methods tested
Table 3 Properties of the datasets used
Table 4 Parameter setting for the quantum k-NN experiments

5.2 Datasets

The datasets used in all the experiments have been taken from the article by Zardini et al. (2023) (more precisely, from Zardini 2023a), mainly for the comparability of the results with the baseline methods. Specifically, the properties of these datasets are reported in Table 3. As explained in the aforementioned article, the original versions of the datasets have been picked from the UCI Machine Learning Repository (Dua and Graff 2017) according to specific criteria, such as numerical features. Then, they have been preprocessed in order to meet precise requirements, like binary class labels. The datasets used here, which are available together with the code at https://github.com/ZarHenry96/euclidean-quantum-k-nn, are the ones obtained after the reduction of the number of classes, without subsampling. Additional information about the selection criteria and the preprocessing procedure can be found in the original article.

5.3 Experimental setup

In all experiments, stratified k-fold cross-validation has been adopted as the validation technique. In practice, each dataset is split into k folds, i.e., subsets. Then, \(k-1\) folds form the training set, whereas the leftover fold represents the test set; this last step is executed k times so that each subset is used once as the test set. The adjective “stratified” means that the ratio between classes in each fold is kept as close as possible to that of the full dataset. In addition, the same seed has been used for the folds generation in all experiments; in this way, all methods have been evaluated on the same folds.
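An equivalent setup can be reproduced, for instance, with scikit-learn (a sketch of ours; the dataset here is a toy placeholder):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X, y = np.random.rand(100, 4), np.random.randint(0, 2, 100)  # toy data
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, test_idx in skf.split(X, y):
    train_X, train_y = X[train_idx], y[train_idx]
    test_X, test_y = X[test_idx], y[test_idx]
    # ... evaluate the (quantum) k-NN on this fold ...
```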

The parameter values employed in the quantum k-NN experiments are reported in Table 4. In particular, for all execution modalities, the number of folds in the k-fold cross-validation (folds) has been set to 5 (a common value in ML), and four different numbers of selected nearest neighbors (k) have been considered. Concerning the simulation modality, all (encoding, distance estimate) configurations have been tested with 1024 measurements (shots), which is the default value provided by Qiskit, and the best configuration has also been evaluated while varying this parameter. In addition, the number of pseudocounts for the Laplace smoothing has been arbitrarily set to 10, and five runs with different seeds have been performed in order to gain statistical evidence. Specifically, the simulation seed for each test instance is randomly generated starting from a “root” run seed. Finally, it is worth highlighting that the different k values have been evaluated on different seeds, while the avg and diff distance estimates have been evaluated on the same seeds (namely, for each test instance, the two distance estimates are computed using the same state counts).

Regarding the baseline methods considered for comparison (Zardini et al. 2023), the normalization procedure applied to the input data features is a canonical min-max normalization, whose output range is [0, 1]. The number of folds, the folds generation seed, and the k values considered for the classical k-NN with cosine distance are the same as those used in the quantum k-NN experiments. Finally, the number of runs for the random forest, which is a stochastic method, is 5.

Fig. 2 Comparison between classical and statevector execution modalities in terms of accuracy (a), Jaccard index (b), and Average Jaccard score (c). The configuration used for statevector is (extension, avg), but the results are the same for all configurations. Each point is related to a dataset fold

5.4 Results

The results are presented by means of scatterplots and boxplots. In particular, the quantum k-NN has been evaluated in terms of classification accuracy and correctness of the nearest neighbors found, whereas, for the baseline methods, only the classification accuracy has been considered. More in detail, given a fold, the accuracy is defined as

$$\begin{aligned} {accuracy} = \frac{ number~of~correctly~classified~instances~in~the~fold }{ total~number~of~instances~in~the~fold }\,; \end{aligned}$$
Table 5 Wilcoxon signed-rank test (\(\alpha \,{=}\,0.05\)) applied to the distributions shown in Fig. 2

in the case of multiple runs, the average value across runs is reported. Instead, regarding the correctness of the nearest neighbors found, the Jaccard index and the Average Jaccard score (Greene et al. 2014) have been taken into account. Specifically, given a test instance, the Jaccard index (JI) is defined as

$$\begin{aligned} Jaccard\ index\ (JI) = \frac{|\mathcal {S}_c \cap \mathcal {S}_f |}{|\mathcal {S}_c \cup \mathcal {S}_f |}\,, \end{aligned}$$

where \(\mathcal {S}_c\) is the set of correct nearest neighbors (classically computed), and \(\mathcal {S}_f\) is the set of nearest neighbors found. Since there is a Jaccard index value for each test instance, the average value has been considered for each fold, and the average of this average value across runs is reported. The same is done for the Average Jaccard (AJ) score, which, given a test instance, is defined as

$$\begin{aligned} Average\ Jaccard\ (AJ) = \frac{1}{k}\sum _{m=1}^{k} h (\mathcal {S}_{cm}, \mathcal {S}_{fm})\,, \end{aligned}$$

where k is the number of nearest neighbors selected, \( h \) is the function computing the Jaccard index, and \(\mathcal {S}_{cm}\) is the set containing the correct nearest neighbors up to the m-th most similar item (analogously for \(\mathcal {S}_{fm}\)). Finally, the statistical significance of the results obtained has been assessed through the Wilcoxon signed-rank test (Wilcoxon 1945), as the data are paired; in some cases (difference boxplots), the one-sample T-test (Gosset 1908) has also been taken into account.
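For clarity, the two scores defined above correspond to the following sketch (ours), where the neighbor lists are sorted from the most to the least similar item:

```python
def jaccard_index(correct_nn, found_nn):
    """Jaccard index between the sets of correct and found neighbors."""
    s_c, s_f = set(correct_nn), set(found_nn)
    return len(s_c & s_f) / len(s_c | s_f)

def average_jaccard(correct_nn, found_nn):
    """Average Jaccard score: mean of the Jaccard indices computed on
    the prefixes (up to the m-th item) of the two neighbor lists."""
    k = len(correct_nn)
    return sum(jaccard_index(correct_nn[:m], found_nn[:m])
               for m in range(1, k + 1)) / k
```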

Fig. 3 Comparison between statevector (extension, avg) and simulation (extension, avg) in terms of accuracy (a), Jaccard index (b), and Average Jaccard score (c). The number of shots for simulation is 1024, and each point is related to a dataset fold

5.4.1 Execution modalities comparison

Let us first consider the classical and statevector execution modalities. As shown in Fig. 2, the two modalities are equivalent in terms of accuracy (Fig. 2a), Jaccard index (Fig. 2b), and Average Jaccard score (Fig. 2c); the absence of a statistical difference is confirmed by the Wilcoxon signed-rank test (Table 5). Only one statevector configuration, i.e., (extension, avg), is shown here, but the results are identical for all of them. This confirms that the algorithm presented in Section 3 is correct; indeed, in the ideal case, it is able to achieve the same results as its classical counterpart. It is also worth remembering that the advantage of the quantum algorithm with respect to its classical counterpart lies in the execution time.

Then, let us focus on the statevector and simulation execution modalities. Specifically, Fig. 3 shows the comparison in terms of accuracy (Fig. 3a), Jaccard index (Fig. 3b), and Average Jaccard score (Fig. 3c) for the (extension, avg) configuration. As expected, the limited number of measurements (1024) leads to a substantial performance degradation, and statevector turns out to statistically outperform the simulation modality in both classification accuracy and correctness of the nearest neighbors found, as reported in Table 6. In particular, the drop in performance is more marked for the Jaccard index and the Average Jaccard score. These observations hold for all the (encoding, distance estimate) configurations; analogous plots and related significance tables for the other configurations are available in Appendix C.1.

Table 6 Wilcoxon signed-rank test (\(\alpha \,{=}\,0.05\)) applied to the distributions shown in Fig. 3

5.4.2 Encodings and distance estimates comparison

Since the various (encoding, distance estimate) configurations have achieved the same results for the statevector execution modality (the Euclidean distance estimates are exact), only simulation (with 1024 shots) is taken into account here. In detail, the configurations are compared by means of difference boxplots, in which each data point represents the difference for a (dataset fold, k value) pair. The comparisons in accuracy and Jaccard index are shown in Fig. 4a and 4b, respectively, while the plot for the Average Jaccard score is available in Appendix C.2 (Fig. 10) for space reasons.

Fig. 4 Comparison of (encoding, distance estimate) configurations in terms of accuracy (a) and Jaccard index (b) for the simulation execution modality. The number of shots is 1024, and each data point corresponds to the difference for a (dataset fold, k value) pair

Table 7 Wilcoxon signed-rank test and one-sample T-test applied to the distributions shown in Fig. 4a (a) and 4b (b)

Concerning the classification accuracy, the configuration that has achieved the best results is (translation, avg), which has statistically outperformed all the others, as confirmed by Table 7a. In general, the translation encoding has performed better than the extension encoding in accuracy and, for a given encoding, the avg distance estimate has outperformed the diff distance estimate. Moreover, the differences are statistically significant in terms of both median (Wilcoxon signed-rank test) and mean (one-sample T-test), except for the (extension, avg) - (translation, diff) comparison in terms of median.

Surprisingly, the configuration with the highest classification accuracy is not the one that has found the best nearest neighbors. Indeed, the configuration that has achieved the best results in Jaccard index is (extension, avg), and the differences with respect to the other ones are almost always significant, as reported in Table 7b. In general, the extension encoding has outperformed the translation encoding in Jaccard index, while, for a given encoding, there is no clear winning distance estimate: the avg distance estimate has performed better with the extension encoding, whereas the diff distance estimate has achieved better results with the translation encoding. The differences are almost all significant in terms of both median and mean; the only exceptions are the (extension, avg) - (extension, diff) comparison in mean, and the (translation, avg) - (translation, diff) comparison for both statistics. Regarding the Average Jaccard score (Fig. 10), the trend is similar, with (extension, avg) being the best configuration. However, in this case, the extension encoding with the diff distance estimate has performed the worst, which means that, among the k nearest neighbors selected by this configuration, the correct ones are placed in the last positions. In addition, few differences are statistically significant, as reported in Table 13. Among these, it is worth mentioning that the (extension, avg) configuration statistically outperforms all the others in terms of median, and (extension, diff) also in terms of mean.

Fig. 5 Comparison between some classical baseline methods and statevector in terms of accuracy. The configuration used for statevector is (translation, avg), but the results are the same for all configurations. Each point is related to a dataset fold

5.4.3 Comparison with baseline methods

Some classical baseline methods have been chosen for comparison in terms of classification accuracy. Figure 5 shows the comparisons with the statevector execution modality in the (translation, avg) configuration; actually, the configuration used is irrelevant for statevector, as explained in the previous sections. In practice, in the ideal case, the quantum k-NN based on the Euclidean distance metric statistically outperforms both the classical k-NN with the cosine distance metric (Fig. 5a) and the SVM with the linear kernel (Fig. 5d), as confirmed by Table 8. Instead, it is outperformed by both the random forest (Fig. 5b) and the SVM with the Gaussian kernel (Fig. 5c), although the differences are almost never statistically significant (only the difference with respect to the SVM with the Gaussian kernel, for \(k=3\), is significant).

Table 8 Wilcoxon signed-rank test (\(\alpha \,{=}\,0.05\)) applied to the distributions shown in Fig. 5
Fig. 6 Comparison of different numbers of shots in terms of accuracy (a) and Jaccard index (b) for the simulation execution modality in the (extension, avg) configuration. Each data point corresponds to the difference for a (dataset fold, k value) pair

Analogous comparison plots for the simulation execution modality in the (translation, avg) configuration, namely, the configuration that has achieved the best results in classification accuracy, are available in Appendix C.3 (Fig. 11). Specifically, all baseline methods considered have statistically outperformed the quantum k-NN in the simulation execution modality, as confirmed by Table 14.

5.4.4 Number of shots analysis

The last analysis is devoted to the relationship between the number of shots (measurements) and the performance of the simulation execution modality. In particular, for this investigation, only the best quantum k-NN configuration has been considered. Since the primary goal of the quantum k-NN is to correctly find the k nearest neighbors, the (extension, avg) configuration has been used; indeed, it has achieved the best Jaccard index and Average Jaccard score, as shown in Section 5.4.2. The results are presented in Figs. 6 and 12 (the latter is available in Appendix C.4) by means of difference boxplots in which 512 is employed as the baseline number of shots.

In practice, the performance tends to improve as the number of shots increases, and the trend is more evident for the Jaccard index (Fig. 6b) and the Average Jaccard score (Fig. 12), although the differences in absolute value are smaller than those for the accuracy (Fig. 6a). Furthermore, almost all performance differences are statistically significant in terms of both median (Wilcoxon signed-rank test) and mean (one-sample T-test), as reported in Tables 9 and 15 (the latter available in Appendix C.4); the only exception is the \(1024 - 512\) comparison in terms of mean for the accuracy. Finally, it is worth highlighting that, the larger the dataset, the higher the number of shots required to properly estimate the joint probability values.

Table 9 Wilcoxon signed-rank test and one-sample T-test applied to the distributions shown in Fig. 6a (a) and 6b (b)

6 Conclusion

In this article, a novel quantum k-NN algorithm based on the Euclidean distance metric has been introduced. In detail, two new encodings of the input data into the quantum state amplitudes, with different properties and low qubit requirements, have been presented (these encodings do not require the unit-norm normalization of the input data). The quantum circuit employed, which does not involve oracles, performs a SWAP-test-like procedure characterized by a fixed number of elementary gates; in this way, quantities related to the pairwise Euclidean distances are computed in parallel. Finally, given the measurement results (the measurements must be repeated several times), two different ways of estimating the Euclidean distance values have been illustrated; the final sorting of the training data and the classification are classical.

In addition to the theoretical formulation and some complexity observations, an implementation of the algorithm in Python and an extensive empirical evaluation have been provided. First of all, the experimental results have confirmed the correctness of the formulation, with the statevector execution modality (ideal execution with an infinite number of shots) achieving the same performance as the classical one; it is worth remarking that the advantage over the classical counterpart lies in the execution time. As expected, statevector has outperformed simulation, for which the number of measurements is limited, in both classification accuracy and correctness of the nearest neighbors found (the difference is more marked for the latter). Among the (encoding, distance estimate) configurations tested, (translation, avg) has achieved the best results in terms of classification accuracy, whereas (extension, avg) has found the best nearest neighbors. It is worth highlighting that the two configurations just mentioned differ in the encoding but share the distance estimate; since the primary goal of the algorithm is to find the correct nearest neighbors, (extension, avg) can be considered the best configuration overall. Concerning the classical baseline methods considered, half of them have achieved better results than the quantum k-NN in the statevector modality, whereas the quantum k-NN in the simulation modality has always been outperformed. Finally, the analysis on the number of shots has shown that the performance of the algorithm in simulation can be improved by increasing the number of measurements.

Possible future work includes testing the model presented here on different datasets, with a higher number of shots, and on real quantum machines, which are characterized by the presence of noise.