Abstract
Methods of processing quantum data become more important as quantum computing devices improve in quality on the way towards fault-tolerant universal quantum computers. These methods include discrimination and filtering of quantum states given as an input to the device, which may find numerous applications in quantum information technologies. In the present paper, we address a scheme for the classification of input states that is nondestructive and deterministic for certain inputs, while being probabilistic in the general case. This can be achieved by incorporating the phase estimation algorithm into a hybrid quantum-classical computation scheme, where the quantum block is trained classically. We perform a proof-of-principle implementation of this idea using a superconducting quantum processor of the IBM Quantum Experience. Another aspect we are interested in is the mitigation of errors occurring due to quantum device imperfections. We apply a series of heuristic tricks at the stage of classical postprocessing in order to improve raw experimental data and to recognize patterns in them. These ideas may find applications in other realizations of hybrid quantum-classical computations with noisy quantum machines.
Introduction
Machine learning is a computing paradigm in which recognition of patterns in available data plays a central role, while the computing system is not explicitly programmed; many examples demonstrate the success of this approach in real-world problems. Quantum machine learning is an emergent technology based on the assumption that quantum resources can be useful in pattern analysis; see, e.g., Wiebe et al. (2012), Schuld et al. (2015a), Biamonte et al. (2017), Amin et al. (2018), Preskill (2018), and Adcock et al. (2015). Quantum algorithms within such applications can be used as a part of a larger computation scheme which also incorporates classical blocks.
There are two major approaches to the construction of a quantum block in such schemes: it can be represented either by a quantum annealer or by an algorithmic quantum computer (Biamonte et al. 2017). Unfortunately, most of the proposals are characterized by input/output bottlenecks occurring at the stages of encoding classical data into quantum states and decoding them back (Aaronson 2015; Arunachalam et al. 2015). However, these bottlenecks seem to be less severe when the input states are quantum (Granade et al. 2012; Wiebe et al. 2014). The role of the quantum machine is then to recognize their underlying patterns, which may have no classical counterpart (for example, characteristics of quantum entanglement), and to classify these states or filter them. Let us stress that the classification of quantum states is supposed to play a crucial role in quantum metrology and sensing (Degen et al. 2017). For instance, in the quantum illumination problem, one has to operate with entangled photonic states and to reveal their characteristics (Lloyd 2008; Tan et al. 2008). Another possible source of quantum data can be a quantum simulator or another quantum computer (for example, one that is more noisy and/or of a larger size) (Biamonte et al. 2017).
Machine learning tasks can be roughly divided into supervised and unsupervised. In the present paper, we address a hybrid quantum-classical approach to the problem of classification of input quantum states, where the quantum block is trained classically with a set of labeled input vectors (supervised learning). An essential ingredient of the model we consider is a phase estimation algorithm embedded into the quantum part of the computational scheme. Using ancilla qubits, it is possible to extract information about a quantum state without a direct measurement of the qubits encoding this state. It is thus possible to classify certain input quantum states both nondestructively and deterministically. For general input states, the classification is probabilistic. This idea is motivated by a recent suggestion on the simulation of a perceptron on a quantum computer (Schuld et al. 2015b).
We also perform a proof-of-principle realization of our scheme with a real superconducting quantum computer of the IBM Quantum Experience, available through the cloud service. Its performance, as well as the performance of existing quantum computers based on other physical realizations, is limited by imperfections of quantum hardware, which include effects of decoherence and quantum gate errors. This limitation restricts possible realizations of quantum machine learning algorithms to few-qubit examples; see, e.g., Cai et al. (2015) and Li et al. (2015). We therefore address a rather simple toy model, which is associated with the classification of maximally entangled two-qubit states. In order to obtain valuable information from raw experimental data affected by noise, we apply a series of tricks based on classical postprocessing, which are also associated with pattern recognition. These ideas can be of interest in the general context of hybrid quantum-classical computation, which currently attracts a lot of attention; see, e.g., Farhi et al. (2014), McClean et al. (2016), Peruzzo et al. (2014), Preskill (2018), Kandala et al. (2017), and Ristè et al. (2017).
This paper is organized as follows. In Section 2, we explain the basic ideas behind the approach used. In Section 3, we present an explicit treatment of a toy model dealing with the classification of two-qubit maximally entangled states. In Section 4, we describe the realization of this toy model on a superconducting quantum computer of the IBM Quantum Experience and apply different approaches to mitigate the effect of errors. We conclude in Section 5.
Phase estimation algorithm in classification problems
Programmable quantum computers operate with data encoded into quantum states. One potential application of quantum computers is the classification of states given as input, according to some criterion or criteria. In order to accomplish this task, one has to construct a circuit which signals whether a state belongs to one of the predefined classes. Another example is associated with the filtering problem: a quantum device must nondestructively pass a state which belongs to a predefined class and should also signal this event. How to construct such a circuit is obvious only for trivial cases, and it is not simple for more complex quantum states.
One of the possible solutions is to use ideas from the machine learning field. For example, it is reasonable to construct a quantum circuit with a limited number of free parameters which enter certain blocks of the algorithm. Then, the quantum algorithm can be “trained” by sending training states to the input, tuning the parameters, and finding their optimal values allowing for the desired classification, which can include multiple groups. It is difficult to implement the training as a purely quantum procedure, so this part of the whole scheme might be accomplished classically, i.e., by a classical computer. The scheme, in this case, represents one of the numerous examples of hybrid quantum-classical computation. The classical training procedure can be based on various methods, such as grid search, the Monte Carlo method, or gradient descent.
In the present paper, for the quantum part of the scheme, we adopt ideas based on the phase estimation algorithm, which makes it possible to obtain information about an input state without a direct measurement of the qubits encoding this state, exploiting ancilla qubits instead. For certain input states, the classification can be made both nondestructive and deterministic. The quantum block of this circuit is shown schematically in Fig. 1, where U(ω) is a unitary operator parametrized by a set of tunable parameters ω to be adjusted during the training procedure. If the input state is an eigenstate of U(ω), the measurements of the ancilla qubits do not destroy it, so |ψ〉 is passed nondestructively through the scheme (apart from a global phase it acquires). Moreover, in this case, the measurements of the ancilla qubits are deterministic, provided the eigenvalue of |ψ〉 is \(\exp (2i \pi n/2^{N_{a}})\), where N_{a} is the number of ancilla qubits and n is an integer ranging from 0 to \(2^{N_{a}} - 1\). The inverse statement is also true: deterministic results of the ancilla measurements are possible only if the input state is one of the eigenstates of U(ω) and its eigenvalue is of the form \(\exp (2i \pi n/2^{N_{a}})\).
Hence, if there are two input states, each being an eigenstate of U(ω) with different eigenvalues of the above type, it is possible to classify these states both nondestructively and deterministically by measuring the ancillas. Otherwise, the classification is probabilistic: the probability to get a set of 0 and 1 corresponding to the eigenstates of U(ω) with a given eigenvalue \(\exp (2i \pi n/2^{N_{a}})\) is the sum of overlaps between |ψ〉 and all mutually orthogonal eigenstates of U(ω) characterized by this particular eigenvalue. If |ψ〉 is an eigenstate of U(ω), the classification is nondestructive but, in the general case, probabilistic. Notice that the nondestructive character of state transfer through the circuit can be probed by the SWAP test.
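The outcome statistics described above can be checked numerically. The following is a minimal sketch, assuming the textbook phase estimation circuit (Hadamard gates on the ancillas, controlled powers of U, and an inverse quantum Fourier transform); the function name `ancilla_distribution`, the example operator `U_demo`, and the labelling of outcomes by the integer n are illustrative choices and are not taken from the paper.

```python
import numpy as np


def ancilla_distribution(U, psi, n_ancilla):
    """Ideal probabilities of the ancilla outcomes n = 0 .. 2**n_ancilla - 1."""
    K = 2 ** n_ancilla
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    # After the controlled powers of U the register holds
    # sum_k |k> (x) U^k |psi> / sqrt(K); store the vectors U^k |psi>.
    powers = [psi]
    for _ in range(K - 1):
        powers.append(U @ powers[-1])
    powers = np.array(powers)  # shape (K, dim)
    k = np.arange(K)
    probs = np.empty(K)
    for n in range(K):
        # The inverse Fourier transform projects the ancillas onto outcome n.
        phases = np.exp(-2j * np.pi * k * n / K)
        branch = (phases[:, None] * powers).sum(axis=0) / K
        probs[n] = np.vdot(branch, branch).real
    return probs


# Hypothetical example: eigenvalues exp(2*pi*i*n/4) for n = 0, 1, 2, 3.
U_demo = np.diag([1, 1j, -1, -1j])
p_eigen = ancilla_distribution(U_demo, [0, 1, 0, 0], n_ancilla=2)  # eigenstate
p_mixed = ancilla_distribution(U_demo, [1, 0, 1, 0], n_ancilla=2)  # superposition
print(p_eigen, p_mixed)
```

In this sketch, an eigenstate with an eigenvalue of the special form yields a single outcome with certainty, while a superposition of two eigenstates yields each of the two corresponding outcomes with the weight of its overlap.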
We now discuss the same problem from another perspective. Let us assume that we have M orthogonal input states. We may try to perform an ideal classification of these states, i.e., to construct an operator U(ω) for which these states are eigenstates and, moreover, the results of the ancilla measurements allow for an unambiguous deterministic discrimination between them. Let us stress that such a circuit provides a nondestructive and deterministic classification among the given set of M input states, while a general input state is classified probabilistically. In the latter case, through repeated measurements, we may recognize which of the M states of the training set the input state is closest to. It is clear that the minimum necessary number of ancilla qubits is determined by the condition \(2^{N_{a}}\geqslant M\). Apparently, the requirements on the operator U(ω) for such a classification are quite restrictive. Alternatively, it is possible to find a U(ω) which yields a nondestructive but probabilistic classification of M orthogonal training states. Again, the nondestructive character of the input state transfer through the circuit can be verified by the SWAP test.
The problem of an efficient construction of the desired U(ω) is far from being obvious. In principle, it is possible to try a brute-force strategy, which seems rather universal: one may use a fixed entangler of all qubits of the register and apply it multiple times, inserting a set of single-qubit rotations between successive applications of the entangler; the rotation angles can be treated as variational parameters. A similar approach was utilized in Kandala et al. (2017) for the preparation of variational many-body states for the modelling of molecules. It is then possible to optimize some error function in order to minimize the level of “destructiveness” or “nondeterminism” of the classification. Another possible strategy is to rely on heuristics when finding a suitable form of U(ω), which depends on the characteristics of the vectors from the training set. Below, we discuss a toy model which contains all essential ingredients of the scheme and can be tested with existing quantum machines. Within this simple example, we follow the heuristic approach for the construction of a proper operator U(ω).
Toy model: classification of maximally entangled two-qubit states
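Let us consider four possible input states defined as two-qubit maximally entangled states. In other words, we assume that there are four training vectors, which are the Bell states |Φ_{±}〉 and |Ψ_{±}〉, given by
$$ |{\varPhi}_{\pm}\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle \pm |11\rangle\right), \quad |{\varPsi}_{\pm}\rangle = \frac{1}{\sqrt{2}}\left(|01\rangle \pm |10\rangle\right). $$ (1)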
Our aim is to construct an ideal classification scheme allowing for the nondestructive and deterministic classification of these four states into two classes, |Φ_{±}〉 and |Ψ_{±}〉.
The states of these two classes differ from each other by their “internal structure”, reflected in the probabilities to be in the orthogonal states of the computational basis, which is not sensitive to the phases. Therefore, it is promising to construct U on the basis of rotations around the z axis. We thus parametrize U as U = U_{z1}(ω_{1})U_{z2}(ω_{2}), where indices 1 and 2 refer to the qubit number and \(U_{z}(\omega )=\left [\begin {array}{cc} e^{-i\pi \omega /2} & 0 \\ 0 & e^{i\pi \omega /2} \end {array}\right ]\) is a single-qubit rotation around the z axis.
We first show explicitly that such a parametrization of U gives the desired result and determine the optimal values of ω_{1} and ω_{2} yielding a nondestructive and deterministic classification. We then do the same using a real quantum computer, finding the optimal parameters through a grid search, which can be treated as a learning procedure.
It is easy to see that |Φ_{±}〉 are eigenstates of U provided ω_{1} + ω_{2} = 2k, where k is an integer. The eigenvalue of U for both |Φ_{+}〉 and |Φ_{−}〉 is the same, U_{Φ} = e^{−iπk}. Similarly, |Ψ_{±}〉 are eigenstates of U provided ω_{1} − ω_{2} = 2q, where q is an integer, and the eigenvalue of U for both |Ψ_{+}〉 and |Ψ_{−}〉 is the same, U_{Ψ} = e^{−iπq}. Let us choose k and q in such a way as to make U_{Ψ} and U_{Φ} different from each other, which is necessary for the classification to work. Obviously, the parities of k and q must be opposite. We may choose, for instance, k = 0 and q = 1, which leads to U_{Φ} = −U_{Ψ} = 1 and ω_{1} = −ω_{2} = 1. Fortunately, for our simple toy model, both eigenvalues fall automatically into the discrete set \(\exp (2i \pi n/2^{N_{a}})\), which enables a deterministic classification. This can be achieved using a single ancilla. The whole quantum scheme for this case is shown in Fig. 2. For the input state |Φ_{±}〉⊗|0〉, the output state at the end of the circuit is \(|{\varPhi }_{\pm }\rangle \otimes \frac {1}{2} ((1+U_{{\varPhi }}) |0\rangle + (1-U_{{\varPhi }}) |1\rangle ) = |{\varPhi }_{\pm }\rangle \otimes |0\rangle \). For the input state |Ψ_{±}〉⊗|0〉, the output is \(|{\varPsi }_{\pm }\rangle \otimes \frac {1}{2} ((1+U_{{\varPsi }}) |0\rangle + (1-U_{{\varPsi }}) |1\rangle ) = |{\varPsi }_{\pm }\rangle \otimes |1\rangle \). Thus, we see that a nondestructive and deterministic classification of the two groups of input states is indeed possible, since for |Φ_{±}〉 the probability P_{0}(|Φ_{±}〉) to find the ancilla in the state |0〉 is exactly 1, while for |Ψ_{±}〉 the probability P_{0}(|Ψ_{±}〉) to find the ancilla in the state |0〉 is exactly 0. The scheme basically performs a parity check, and the parity is to be considered as a “quantum pattern”.
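The single-ancilla circuit can be simulated directly with a few lines of linear algebra. The sketch below assumes the circuit consists of a Hadamard gate on the ancilla, a controlled-U, and a second Hadamard gate, with the sign convention U_z(ω) = diag(e^{−iπω/2}, e^{iπω/2}), which is consistent with the eigenvalues quoted above; the names `u_z` and `run_circuit` and the qubit ordering (ancilla first) are illustrative assumptions.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)


def u_z(omega):
    """Single-qubit z rotation, diag(e^{-i*pi*omega/2}, e^{+i*pi*omega/2})."""
    return np.diag([np.exp(-0.5j * np.pi * omega), np.exp(0.5j * np.pi * omega)])


def run_circuit(psi, omega1, omega2):
    """Return (P0, final state) with the ordering ancilla (x) data qubits."""
    U = np.kron(u_z(omega1), u_z(omega2))
    controlled_u = np.kron(np.diag([1, 0]), np.eye(4)) + np.kron(np.diag([0, 1]), U)
    h_ancilla = np.kron(H, np.eye(4))
    state = np.kron([1, 0], psi)  # the ancilla starts in |0>
    out = h_ancilla @ controlled_u @ h_ancilla @ state
    # The first four amplitudes belong to the ancilla outcome 0.
    return float(np.sum(np.abs(out[:4]) ** 2)), out


s = 1 / np.sqrt(2)
bell = {
    "Phi+": s * np.array([1, 0, 0, 1]),
    "Phi-": s * np.array([1, 0, 0, -1]),
    "Psi+": s * np.array([0, 1, 1, 0]),
    "Psi-": s * np.array([0, 1, -1, 0]),
}
results = {name: run_circuit(vec, 1.0, -1.0)[0] for name, vec in bell.items()}
for name, p0 in results.items():
    print(f"{name}: P0 = {p0:.3f}")
```

With ω_{1} = −ω_{2} = 1, the two |Φ〉 states give P_{0} = 1 and the two |Ψ〉 states give P_{0} = 0, and the data-qubit part of the final state coincides with the input, which illustrates the nondestructive character of the classification.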
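For an input two-qubit state of a general form
$$ |{\varPsi}\rangle = \alpha |00\rangle + \beta |01\rangle + \gamma |10\rangle + \delta |11\rangle, $$ (2)
after some straightforward calculations, we obtain the expression for the probability P_{0}(|Ψ〉) to find the ancilla in the state |0〉, provided the optimal values ω_{1} = −ω_{2} = 1 are incorporated into the circuit:
$$ P_{0}(|{\varPsi}\rangle) = |\alpha|^{2} + |\delta|^{2}. $$ (3)
It can be rewritten as
$$ P_{0}(|{\varPsi}\rangle) = |\langle {\varPhi}_{+}|{\varPsi}\rangle|^{2} + |\langle {\varPhi}_{-}|{\varPsi}\rangle|^{2}. $$ (4)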
In this general case, the scheme works as a probabilistic classifier, and the classification occurs according to the distance between the input state and two subspaces, in which |Φ_{±}〉 and |Ψ_{±}〉 form local bases. We stress that P_{0}(|Ψ〉) is no longer exactly 0 or 1, while a measurement of the ancilla cannot be treated as nondestructive. A nondestructive classification is possible between quantum states of two classes, α|00〉 + δ|11〉 and β|01〉 + γ|10〉.
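Now let us come back to the previous stage and consider the learning procedure. If an explicit treatment is impossible, the optimal values of ω_{1} and ω_{2} have to be determined from the results of measurements of the ancillas. Let us introduce the probability P_{0}(|Ψ〉;ω_{1}, ω_{2}) to find the ancilla in the state |0〉 for general (ω_{1}, ω_{2}) and for the input state |Ψ〉. This quantity is a generalization of P_{0}(|Ψ〉) given by Eq. 3 and it can be written as
$$ P_{0}(|{\varPsi}\rangle;\omega_{1}, \omega_{2}) = \left(|\alpha|^{2} + |\delta|^{2}\right)\cos^{2}\frac{\pi(\omega_{1}+\omega_{2})}{4} + \left(|\beta|^{2} + |\gamma|^{2}\right)\cos^{2}\frac{\pi(\omega_{1}-\omega_{2})}{4}. $$ (5)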
The training procedure consists in finding the optimal (ω_{1}, ω_{2}) by evaluating both P_{0}(|Φ_{±}〉;ω_{1}, ω_{2}) and P_{0}(|Ψ_{±}〉;ω_{1}, ω_{2}) and extracting the points in the (ω_{1}, ω_{2}) space where the first quantity is exactly 1, while the second quantity is exactly 0 (or vice versa). The values of (ω_{1}, ω_{2}) can be tuned by the classical computer, while the quantum algorithm is implemented on the quantum computer. The brute-force method to determine the optimal (ω_{1}, ω_{2}) is a grid search. In the next section, we perform such a search using a real quantum computer. The experimental results will be compared with the explicit treatment. In order to facilitate this comparison, in Fig. 3, we show the results of our calculations for P_{0}(|Φ_{±}〉;ω_{1}, ω_{2}) and P_{0}(|Ψ_{±}〉;ω_{1}, ω_{2}) based on Eq. 5. From this figure, we again see that there are values of ω_{1} and ω_{2} supporting a discrimination between the two pairs of Bell states in a single measurement.
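A noiseless version of this grid search can be sketched as follows. The sketch assumes the ideal single-ancilla circuit, for which P_{0} equals the squared norm of (1 + U)|Ψ〉/2, and uses the fact that U is diagonal in the computational basis. The grid range [−2, 2], the 41 grid points, and the names `p0_map` and `contrast` are assumptions made for illustration only.

```python
import numpy as np


def p0_map(psi, omegas):
    """Ideal P0 on a grid of (omega1, omega2) for amplitudes psi = (a, b, c, d)."""
    w1, w2 = np.meshgrid(omegas, omegas, indexing="ij")
    # Phases of the diagonal of U in the basis |00>, |01>, |10>, |11>.
    angles = 0.5 * np.pi * np.stack([-(w1 + w2), -(w1 - w2), w1 - w2, w1 + w2])
    eigenvalues = np.exp(1j * angles)               # shape (4, n, n)
    weights = np.abs(np.asarray(psi)) ** 2          # |amplitude|^2 per basis state
    return np.tensordot(weights, np.abs(1 + eigenvalues) ** 2 / 4, axes=1)


omegas = np.linspace(-2.0, 2.0, 41)                 # assumed grid
s = 1 / np.sqrt(2)
map_phi = p0_map([s, 0, 0, s], omegas)              # same map for Phi+ and Phi-
map_psi = p0_map([0, s, s, 0], omegas)              # same map for Psi+ and Psi-

# "Training": pick the grid point where the two classes are best separated.
contrast = np.abs(map_phi - map_psi)
i, j = np.unravel_index(np.argmax(contrast), contrast.shape)
best = (omegas[i], omegas[j])
print("best (omega1, omega2):", best, "contrast:", contrast[i, j])
```

Several grid points reach full contrast (for example, ω_{1} = −ω_{2} = 1 and its "vice versa" counterpart ω_{1} = ω_{2} = 1), so the point returned by `argmax` is only one of the equivalent optima.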
Implementation on a noisy quantum device
Quantum circuit
Having a simple algorithm at hand, we perform a proof-of-principle realization on a currently available quantum device. An additional important issue we are interested in is error mitigation in hybrid quantum-classical computation schemes, so we consider the realization of the given algorithm as a playground for this quite general problem.
We use the 16-qubit IBMqx5 superconducting quantum chip, which is available through the cloud service within the IBM Quantum Experience project. The realization of our scheme is illustrated in Figs. 4 and 5. Figure 4 shows a schematic image of the chip. The qubits utilized in our quantum algorithm are shown in red. The quantum circuit itself is presented in Fig. 5. Due to the limitations in connectivity, the quantum circuit includes an additional SWAP gate required to interchange the quantum states of two physical qubits. Note that this gate is composed of three CNOT gates and therefore provides an additional significant contribution to the total error rate.
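The decomposition of SWAP into three CNOT gates mentioned above can be verified with explicit matrices. The sketch below also gives a crude estimate of the accumulated error under the assumption of independent gate errors; the per-gate error value of 10^{−2} is a hypothetical, illustrative number and not a measured characteristic of the device.

```python
import numpy as np

# Basis ordering |q1 q2> = |00>, |01>, |10>, |11>.
CNOT_12 = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])  # control q1, target q2
CNOT_21 = np.array([[1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0],
                    [0, 1, 0, 0]])  # control q2, target q1
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

swap_from_cnots = CNOT_12 @ CNOT_21 @ CNOT_12
print(np.array_equal(swap_from_cnots, SWAP))

# Crude error estimate: independent errors with a hypothetical rate per CNOT.
cnot_error = 1e-2
three_cnot_error = 1 - (1 - cnot_error) ** 3
print(f"approximate SWAP error: {three_cnot_error:.4f}")
```

Under this simple model, a single SWAP roughly triples the two-qubit gate error, which is why extra SWAP gates imposed by limited connectivity are costly.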
Raw data
State-of-the-art quantum computers still suffer from the decoherence problem, as well as from imperfections of quantum gates and readouts. In order to use such devices for the realization of quantum algorithms, one has to deal with the accumulation of errors. It is worth discussing the sources of errors for quantum circuits of different lengths realized on available superconducting quantum devices. Roughly, they can be divided into readout errors, quantum gate errors, and the bare influence of decoherence, which are characterized as follows:

(i) Readout error is typically of the order of 10^{−2}.
(ii) Average gate error is of the order of 10^{−3}. It is also known that errors of two-qubit gates are nearly one order of magnitude larger than those of single-qubit gates.
(iii) Longitudinal and transverse relaxation times of individual qubits are typically tens of microseconds. They must be compared to typical timescales of individual quantum gates: the duration of single-qubit gates is nearly 80 ns and that of two-qubit gates is about 300 ns; there is also a 10 ns buffer between two gates.
To partially suppress or mitigate the errors, different tricks have been suggested (Temme et al. 2017; Li and Benjamin 2017; McClean et al. 2017; Endo et al. 2018). These tricks are usually efficient in the regime of a low error rate, which is achieved provided shallow quantum circuits are used within the schemes of quantum-classical computation. In contrast, the implementation of our toy model is already associated with a quantum circuit which is not so shallow. Therefore, the error rates in our experiments are relatively high, with the dominant contribution provided by CNOT errors. We therefore use a series of tricks based on classical postprocessing techniques applied to the output of a noisy quantum device. Since our final goal is to find experimentally probability patterns similar to the theoretical ones depicted in Fig. 3, in our treatment we also use certain analogies with the problem of image denoising. Thus, we again address the problem of pattern recognition, but now classically.
Figure 6 shows the results for P_{0}(|Ψ_{−}〉;ω_{1}, ω_{2}) obtained from the IBM classical simulator (left panel), which does not take into account device imperfections, and from the real quantum machine (right panel). The results from the classical simulator are, of course, the same, within the computational accuracy and disregarding discretization, as the ones obtained analytically (see Fig. 3b). Both experimental and theoretical maps contain 40 × 40 points. There were 8192 measurements for each point. We have chosen the state |Ψ_{−}〉 among the four possibilities in order to illustrate our results and ideas on error mitigation; the results for the remaining three states are rather similar. The comparison of the experimental and theoretical data shows that the agreement is not satisfactory: the experimental data, even for our toy classification model, are heavily damaged by noise. In particular, the experimental probabilities tend to approach 0.5 instead of being distributed from 0 to 1. Moreover, the experimental probability pattern also lacks “connecting bridges between islands”: the exact pattern contains diagonal areas with high values of P_{0}(|Ψ_{−}〉;ω_{1}, ω_{2}), while in the experimental data these diagonal areas are dissociated into five separate islands with suppressed values of P_{0}(|Ψ_{−}〉;ω_{1}, ω_{2}) between them. Nevertheless, in the next subsections, we apply a combination of tricks in order to extract valuable information from such noisy raw data.
The poor quality of the experimental data is the reason why we restricted ourselves to an oversimplified classification problem involving only a few of the 16 qubits of the device. Indeed, the classification of quantum states involving a larger number of qubits implies the application of a much larger number of two-qubit gates, which provide the main contribution to the total error rate.
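For orientation, it is useful to estimate how much of the scatter in such a map could be attributed to finite sampling alone. The sketch below assumes independent shots, so that the number of 0 outcomes is binomially distributed; the function name `shot_noise_std` is illustrative.

```python
import math


def shot_noise_std(p, shots):
    """Standard deviation of an estimated probability p from `shots` binomial samples."""
    return math.sqrt(p * (1 - p) / shots)


# Worst case (p = 0.5) for the 8192 measurements per grid point quoted above.
worst_case = shot_noise_std(0.5, 8192)
print(f"sampling error per point: {worst_case:.4f}")
```

Under this assumption the sampling error is roughly half a percent per point, so deviations of the order of several tenths between the experimental and ideal maps cannot be explained by finite statistics.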
As a measure of difference between ideal (theoretical) results and the experimental data, we have chosen several standard metrics:

(i) A signal-to-noise measure, defined as
$$ SNR(M, M^{\prime}) = 10\log_{10} \left( \frac{\sigma^{2}(M)}{MSE(M,M^{\prime})}\right), $$ (6)
where M and \(M^{\prime }\) are arrays of data obtained from the quantum chip and the ideal classical simulator, respectively; σ^{2}(M) is the variance; and \(MSE(M,M^{\prime })\) is the mean square error.
(ii) L_{1} distance (Manhattan distance), defined as
$$ d_{L_{1}}(M, M^{\prime}) = \underset{m \in M, m^{\prime} \in M^{\prime}}{\sum} |m - m^{\prime}|, $$ (7)
where m and \(m^{\prime }\) are elements of matrices M and \(M^{\prime }\), respectively.
(iii) Pearson correlation, defined as
$$ \rho_{M, M^{\prime}} = \frac{E[(M - E[M])(M^{\prime} - E[M^{\prime}])]}{\sigma(M)\sigma(M^{\prime})}, $$ (8)
where E[M] is the expectation value of M.
We are going to trace the evolution of these three quantities after each step of our denoising procedure.
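The three metrics can be computed with a few NumPy calls. The following is a minimal sketch; it assumes that the sum in the L_{1} distance runs over pairs of elements at the same position of the two maps, and the function names `snr_db`, `l1_distance`, and `pearson` are illustrative.

```python
import numpy as np


def snr_db(m, m_ref):
    """Signal-to-noise measure: 10*log10(var(M) / MSE(M, M'))."""
    m, m_ref = np.asarray(m, dtype=float), np.asarray(m_ref, dtype=float)
    mse = np.mean((m - m_ref) ** 2)
    return 10 * np.log10(np.var(m) / mse)


def l1_distance(m, m_ref):
    """Manhattan distance, summed over elements at the same grid position."""
    return float(np.abs(np.asarray(m, dtype=float) - np.asarray(m_ref, dtype=float)).sum())


def pearson(m, m_ref):
    """Pearson correlation coefficient between the flattened maps."""
    return float(np.corrcoef(np.ravel(m), np.ravel(m_ref))[0, 1])


# Tiny synthetic example (not experimental data).
m = np.array([[0.0, 1.0], [0.0, 1.0]])
shifted = m + 0.5
print(snr_db(m, shifted), l1_distance(m, shifted), pearson(m, 2 * m + 1))
```

The example illustrates a property used later in the text: a linear transformation of the data changes the SNR and the L_{1} distance, but leaves the Pearson coefficient equal to 1.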
Postselection
The first step of our procedure is the postselection of experimental data. The underlying idea is the following: suppose we run some quantum circuit on a noisy quantum device and there are certain constraints on the possible outputs. These constraints can originate from, e.g., symmetry considerations, and the knowledge of the constraints does not necessarily require the solution of the full problem (otherwise, the quantum computer would be useless). For example, in simulations of many-body systems, there may be certain conditions dictated by electron-hole symmetry or particle-number conservation. Thus, in computations with noisy quantum devices, we may discard wrong outputs which explicitly violate such requirements. Note that some of us have recently used this idea in Zhukov et al. (2019), dealing with the benchmarking of quantum computers using quantum communication protocols.
In the situation we consider here, similar constraints can be deduced from the explicit derivation of the circuit’s output. Since the main goal of this part of our paper is linked to error mitigation, it is legitimate to use some information from the explicit treatment. Namely, if the quantum machine works properly and the input state is |Ψ_{−}〉, the output must be a superposition of |Ψ_{−}〉 and |Ψ_{+}〉 irrespective of (ω_{1}, ω_{2}), because U is diagonal in the computational basis. Thus, if the result of the measurement of the two register qubits in the computational basis is 00 or 11, this result can be discarded. In order to perform such a postselection, we need to measure not only the ancillas, but also the data qubits.
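A minimal sketch of such a postselection on measurement counts is given below. The bitstring format (two data-qubit bits followed by the ancilla bit), the function name `postselect_p0`, and the counts themselves are hypothetical and serve only as an illustration; the real device output may use a different bit ordering.

```python
def postselect_p0(counts, allowed=("01", "10")):
    """Estimate P0 of the ancilla from shots whose data-qubit bits are allowed.

    `counts` maps bitstrings "d1 d2 a" (written without spaces) to shot numbers.
    Returns (postselected P0, fraction of discarded shots).
    """
    kept = {bits: n for bits, n in counts.items() if bits[:2] in allowed}
    total_kept = sum(kept.values())
    if total_kept == 0:
        raise ValueError("no shots survive postselection")
    p0 = sum(n for bits, n in kept.items() if bits[2] == "0") / total_kept
    discarded = 1 - total_kept / sum(counts.values())
    return p0, discarded


# Synthetic counts for illustration only (not experimental data).
counts = {"010": 300, "100": 300, "011": 200, "101": 200,
          "000": 500, "110": 300, "001": 100, "111": 100}
raw_p0 = sum(n for bits, n in counts.items() if bits[2] == "0") / sum(counts.values())
p0, discarded = postselect_p0(counts)
print(raw_p0, p0, discarded)
```

The returned fraction of discarded shots is worth tracking, since it indicates how much of the statistics is sacrificed at this step.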
The approach we use is not completely universal, since it relies on constraints or symmetries which do not exist for an arbitrary problem. However, we would like to stress that, under certain conditions, it may be efficient to use a redundant coding, i.e., to encode a single logical qubit into a larger number of physical qubits and thus to create constraints artificially. Automatic error correction or classical postselection of results can then be applied to discard part of the wrong outputs associated with certain quantum errors. Of course, a redundant coding is associated with an increase in the number of noisy gates of the algorithm, but the advantages due to the postselection can nevertheless outweigh the disadvantages due to the increased gate number. The success of this strategy depends on the details of the algorithm as well as on the error mechanisms and error rates. For example, in Zhukov et al. (2019), redundant encoding supplemented by postselection was utilized and a certain improvement of the results was achieved.
The results of the postselection for the problem we address here are shown in Fig. 7, while Table 1 provides the metric values before and after the postselection. All three quantities indicate a certain improvement of the data after the procedure. However, there are also some qualitative changes in the overall distribution of the probability, which can be noticed by comparing the experimental data after postselection with the raw experimental data (Fig. 7). Namely, postselection leads to the emergence of the correct pattern structure of the probability distribution: separate “islands” now tend to be connected by “bridges.” This fact is crucial for the subsequent analysis, since it allows for a partial reconstruction of the correct data at the end of our procedure.
The fraction of data discarded at this step is approximately 1/2, and it depends only weakly on ω_{1} and ω_{2}. Of course, the additional measurement of two qubits leads to an increase in the total readout error rate. However, these extra errors are definitely much smaller than the total error accumulated by the whole algorithm. This conclusion is evident from the fraction of discarded results, which is as high as 1/2, and from the known readout error rates, which are typically only several percent.
Image denoising
Let us now discuss another series of heuristic tricks we use to partially suppress the effects of noise. They are associated with image denoising. Before that, however, let us stress that postselection should be used before this step; otherwise, the reconstruction will completely fail. In particular, without the postselection, the probability pattern lacks the connecting “bridges” between “islands” we mentioned before. These features are, of course, crucial for the reconstruction of the correct pattern.
We start with the observation that the experimentally determined values of the probability are generally close to 0.5 instead of being distributed between 0 and 1. Nevertheless, the spatial variations of the probability, as well as its pattern structure in the (ω_{1}, ω_{2}) plane, are reproduced much more adequately. Notice that both controlling parameters (ω_{1}, ω_{2}) enter the circuit only through single-qubit rotations. The obtained results imply that, in our noisy experiments with real hardware, the output results can be roughly divided into two classes: (i) wrong outputs, which are due to single or multiple errors occurring during the execution of the algorithm, and (ii) correct results corresponding to zero errors. The first contribution is apparently dominant. An important observation is that it is nearly independent of the controlling parameters (ω_{1}, ω_{2}). A similar behavior has been recently observed by some of us in Zhukov et al. (2018), dealing with the simulation of the unitary evolution of spin clusters using programmable quantum hardware, where a similar controlling parameter was associated with the dimensionless time. The uniformity of the wrong part of the output data with respect to this parameter was attributed to the fact that the circuit was not so shallow and contained a considerable number of noisy quantum gates. An error occurring at a particular gate produces its own dependence of the corresponding output on the controlling parameter. However, such dependencies for errors occurring at different gates of the circuit are also different, so that they finally average out into a nearly uniform dependence on the controlling parameter. Hence, this nearly uniform “background” can be simply eliminated by considering properly normalized differences instead of absolute values of the quantities of interest.
Let us stress that this situation is a direct consequence of the relatively large number of noisy gates in the circuit: noise in this regime can, in some sense, help to extract valuable information from imperfect data. Of course, as the number of noisy gates grows, the fraction of correct outputs decreases exponentially; as a result, the trick we discuss can be utilized only in the regime of “intermediate-depth” circuits.
In order to get rid of the background, we apply the following transform:
$$ P_{0}^{\prime} = \frac{P_{0} - \min P_{0}}{\max P_{0} - \min P_{0}}, $$ (9)
where we introduced the notation P_{0} = P_{0}(|Ψ_{−}〉;ω_{1}, ω_{2}), and the minimum and maximum are taken over the whole map. This transform linearly rescales the measured quantity in such a way that the lowest value is mapped to 0 and the highest value is brought to 1. We point out that this trick is not a fit to the already known result. Our methodology is that, in our reconstruction, we use only partial information on the correct and unknown probability distribution, which in this case is just the minimum and maximum values of the quantity of interest. In many cases, such additional parameters can be deduced from quite general considerations and do not require full knowledge of the output of the quantum computer. The result of the procedure is shown in Fig. 8, and Table 2 gives the evolution of the metric values. We see that the SNR was improved, as well as the Manhattan distance. However, this is not true for the Pearson coefficient, which did not change. This latter result is natural, since the Pearson coefficient must be insensitive to linear transformations.
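The effect of this linear rescaling can be illustrated on synthetic data. The sketch below assumes a simple model in which the measured probability is a mixture of the ideal pattern and a uniform background; the mixing fraction 0.3, the background level 0.5, the noise amplitude 0.02, and the grid are hypothetical values chosen for illustration, and the ideal pattern is the one expected for |Ψ_{−}〉 in the ideal single-ancilla circuit.

```python
import numpy as np


def normalize(p):
    """Map the lowest value of the map to 0 and the highest value to 1."""
    p = np.asarray(p, dtype=float)
    return (p - p.min()) / (p.max() - p.min())


omegas = np.linspace(-2.0, 2.0, 41)                 # assumed grid
w1, w2 = np.meshgrid(omegas, omegas, indexing="ij")
ideal = np.cos(np.pi * (w1 - w2) / 4) ** 2          # ideal P0 for |Psi_->

# Hypothetical noise model: correct outputs plus a uniform background.
fraction_correct, background = 0.3, 0.5
measured = fraction_correct * ideal + (1 - fraction_correct) * background
recovered = normalize(measured)                      # background removed exactly

# With point-to-point fluctuations the recovery is no longer exact,
# but the Pearson coefficient is unaffected by the rescaling.
rng = np.random.default_rng(0)
noisy = measured + rng.normal(0.0, 0.02, size=measured.shape)
r_before = np.corrcoef(noisy.ravel(), ideal.ravel())[0, 1]
r_after = np.corrcoef(normalize(noisy).ravel(), ideal.ravel())[0, 1]
print(np.allclose(recovered, ideal), r_before, r_after)
```

In the noiseless mixture model the rescaling recovers the ideal pattern exactly, which is the motivation for using it; the second part of the example shows the invariance of the Pearson coefficient under linear transformations.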
Although the transformation defined by Eq. 9 enables us to partially get rid of the nearly constant background, it has a serious drawback. The problem is that only a single value of probability corresponding to some particular point of the map is brought to 1, while the probability generally fluctuates significantly from one discrete point in (ω_{1}, ω_{2}) plane to another. The origin of these fluctuations is associated with imperfections of quantum gates.
The particular point of maximum probability resides nearly at the center of the map shown in Fig. 6, i.e., at ω_{1}, ω_{2} ≈ 0. The same problem, of course, exists for the particular point of the map for which the measured probability is lowest and hence is mapped to 0 by the rescaling of Eq. 9. In order to circumvent this problem, we apply the well-known sigmoid transformation. It maps \({P}_{0}^{\prime }\) to the new value \({P}_{0}^{\prime \prime }\), according to
$$ P_{0}^{\prime\prime} = \frac{1}{1 + \exp\left[-a\left(P_{0}^{\prime} - b\right)\right]}, $$ (10)
where a and b are free parameters. The value of b is fixed by the requirement that the point \({P}_{0}^{\prime } = b\) must stay invariant under the transformation, so that b = 0.5 in our case.
Again, from general considerations, we can deduce partial information about the true probability pattern, which includes not only the minimum and maximum values of this quantity, but also a typical length scale of its variation in the space of parameters (ω_{1}, ω_{2}). For the two-qubit input state and the problem we consider here, this length scale can be roughly estimated as ≈ 1/2. Next, we can define another length scale, which is much smaller, and evaluate the mean value of the probability over the corresponding area. It is clear that the probability must be essentially constant within this area. Thus, we choose the parameter a of the sigmoid transformation in such a way as to map the mean value of the probability within the corresponding area, \(\langle f\rangle _{\max }\), in the vicinity of its maximum to some number slightly lower than 1 (or, alternatively, slightly higher than 0 in the vicinity of its minimum). We choose this number to be 0.9, i.e., we equate \({P}_{0}^{\prime \prime } (\langle f\rangle _{\max })\) to 0.9. We thus find \(a \approx 5/(2\langle f\rangle _{\max } - 1)\). For our set of data, \(\langle f\rangle _{\max }\) is nearly 0.65 in the close vicinity of the point ω_{1}, ω_{2} ≈ 0 (averaging has been performed over an area of 5 × 5 points), and hence a ≈ 15. Let us stress that the quality of the reconstruction is nearly the same as long as a ranges from 10 to 20; thus, neither the choice of the characteristic number 0.9 nor the area of the averaging region is critical.
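The choice of the steepness parameter can be reproduced with a short calculation. The sketch below assumes that the sigmoid is the standard logistic function 1/(1 + exp[−a(P − b)]), which is consistent with the fixed point b = 0.5 and with the value a ≈ 15 quoted above; the function names are illustrative.

```python
import math


def sigmoid(p, a, b=0.5):
    """Logistic sigmoid centred at b with steepness a (assumed functional form)."""
    return 1 / (1 + math.exp(-a * (p - b)))


def steepness_for_target(f_max, target=0.9, b=0.5):
    """Solve sigmoid(f_max, a, b) = target for a."""
    return math.log(target / (1 - target)) / (f_max - b)


a = steepness_for_target(0.65)       # mean probability near the maximum
print(f"a = {a:.2f}")
print(f"check: sigmoid(0.65) = {sigmoid(0.65, a):.3f}")
```

Inverting the logistic function exactly gives a = 2 ln 9/(2〈f〉_max − 1) ≈ 4.4/(2〈f〉_max − 1), which for 〈f〉_max = 0.65 is close to the rounded value a ≈ 15 and lies well inside the interval from 10 to 20 where the reconstruction is reported to be insensitive to a.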
Table 2 shows how the metric values evolve. The sigmoid transformation with a = 15, applied after the normalization, gives a further improvement of data quality according to the SNR metric. However, the L_{1} metric and the Pearson coefficient indicate a certain decrease in the agreement between experiment and theory. The reason is that the sigmoid transformation, at this stage, produces artifacts: it enhances fluctuations at some points of the plane by bringing the probabilities close either to 0 or to 1. L_{1} is a pointwise local metric and is therefore rather sensitive to the enhancement of such local fluctuations. The Pearson coefficient is also more sensitive to local fluctuations than SNR, which is consistent with the fact that it is invariant under rescaling of the probability pattern as a whole.
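The different behavior of the pointwise and correlation-type metrics can be illustrated on toy numbers. The sketch below is only an illustration: it assumes that L_{1} is the mean absolute difference between two patterns and uses the textbook Pearson coefficient; the data are made up, and SNR is omitted because its definition is not restated in this section.

```python
import math


def l1_distance(x, y):
    """Mean absolute pointwise difference (assumed definition of L1)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def pearson(x, y):
    """Textbook Pearson correlation coefficient of two sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up "exact" pattern and the same pattern squeezed toward 0.5.
theory = [0.0, 0.2, 0.8, 1.0, 0.6]
measured = [0.5 + 0.3 * (t - 0.5) for t in theory]

print(round(pearson(theory, measured), 6))      # unaffected by the rescaling
print(round(l1_distance(theory, measured), 3))  # nonzero pointwise deviation
```

A pattern that is merely compressed toward 0.5 keeps a Pearson coefficient of 1 while its L_{1} distance from the exact pattern is nonzero, which is why the two metrics can move in different directions at a given postprocessing step.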
The procedures we have used do not completely suppress fluctuations of the probability between neighboring discrete points of the map; moreover, the sigmoid transformation even enhances them to a certain extent. A natural idea is to use mean filtering, i.e., to average the discrete data over the small areas discussed in relation to the sigmoid transformation. However, this again shifts the probability towards 0.5. As seen from Table 2, the filtering is accompanied by a decrease of SNR, although the other metrics improve because the artifacts of the sigmoid transformation discussed above are partially suppressed. To compensate for the decrease of SNR at this stage, we afterwards reapply the normalization and the sigmoid transformation with the same parameters a and b, and achieve a further improvement of data quality according to all three metrics we used.
The whole postprocessing procedure is the following: postselection (step 0) → normalization (step 1) → sigmoid transform (step 2) → mean filtering (step 3) → normalization (step 4) → sigmoid transform (step 5). The final result after the last three steps of this sequence is shown in Fig. 8. Table 2 shows the evolution of the metric values: all of them have improved significantly, although they evolve differently from step to step because they are sensitive to different types of correlations. The comparison between the final pattern and the exact pattern shows that the agreement is good, although certain discrepancies are still present. As a whole, the improvement compared to the raw data is significant. Thus, our procedure provides a case study which illustrates that it is possible to extract valuable information from the data of a noisy quantum computer even if they are heavily damaged by decoherence and gate errors.
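Steps 1 to 5 of this sequence can be sketched in a few lines of code. The sketch below is not the code used for the analysis; it assumes that the normalization is a min-max rescaling of the whole map to [0, 1], that the sigmoid has the logistic form 1/(1 + exp[−a(P − b)]), and that the mean filter uses a square window truncated at the edges of the map. The input is synthetic, and postselection (step 0) is taken as already done.

```python
import math


def normalize(grid):
    """Min-max rescaling: lowest value maps to 0, highest to 1 (assumed form)."""
    flat = [p for row in grid for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        raise ValueError("constant map cannot be rescaled")
    return [[(p - lo) / (hi - lo) for p in row] for row in grid]


def sigmoid(grid, a=15.0, b=0.5):
    """Pointwise logistic map with steepness a and fixed point b."""
    return [[1.0 / (1.0 + math.exp(-a * (p - b))) for p in row] for row in grid]


def mean_filter(grid, half_width=2):
    """Average over a (2*half_width + 1)^2 window, truncated at the edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for i in range(n_rows):
        rows = range(max(0, i - half_width), min(n_rows, i + half_width + 1))
        new_row = []
        for j in range(n_cols):
            cols = range(max(0, j - half_width), min(n_cols, j + half_width + 1))
            window = [grid[k][m] for k in rows for m in cols]
            new_row.append(sum(window) / len(window))
        out.append(new_row)
    return out


def postprocess(grid, a=15.0, b=0.5, half_width=2):
    """Steps 1-5: normalize, sigmoid, mean filter, normalize, sigmoid."""
    grid = sigmoid(normalize(grid), a, b)
    grid = mean_filter(grid, half_width)
    return sigmoid(normalize(grid), a, b)


# Synthetic 13 x 13 map squeezed toward 0.5 (not experimental data).
raw = [[0.5 + 0.15 * math.cos(0.5 * i) * math.cos(0.5 * j) for j in range(13)]
       for i in range(13)]
final = postprocess(raw)
```

The default half-width of 2 corresponds to the 5 × 5 averaging area mentioned above; the treatment of the map edges is a choice made for this sketch only.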
Conclusion
In this paper, we have addressed a hybrid quantum-classical scheme for the classification of input quantum states, where the quantum part is represented by the phase estimation algorithm. It is based on a tunable unitary operator which can be adjusted to accomplish a desired classification of input quantum states from the training set. Because measurements are performed on ancilla qubits, the classification can be made nondestructive and deterministic. For a general input quantum state, the scheme works as a probabilistic classifier and can be used to classify underlying patterns in quantum data.
We demonstrated a proof-of-principle implementation of this idea using a superconducting quantum computer of IBM Quantum Experience and a specific simple example of the hybrid scheme we suggested. This scheme is able to classify maximally entangled two-qubit states into two groups depending on their parity. Real quantum hardware is characterized by various imperfections which lead to the accumulation of errors during the execution of the algorithm. Error mitigation within our realization was another issue addressed in this paper. We have applied a series of tricks at the stage of classical postprocessing to improve the raw experimental data and to recognize the patterns contained in them. These ideas may be used in other realizations of hybrid quantum-classical computation schemes. Our results also demonstrate that pattern recognition can be an important ingredient of classical postprocessing of data from noisy quantum hardware.
References
Aaronson S (2015) Read the fine print. Nat Phys 11:291
Adcock J, Allen E, Day M, Frick S, Hinchliff J, Johnson M, Morley-Short S, Pallister S, Price A, Stanisic S (2015) Advances in quantum machine learning. arXiv:1512.02900
Amin MH, Andriyash E, Rolfe J, Kulchytskyy B, Melko R (2018) Quantum boltzmann machine. Phys Rev X 8:021050
Arunachalam S, Gheorghiu V, Jochym-O’Connor T, Mosca M, Srinivasan PV (2015) On the robustness of bucket brigade quantum RAM. New J Phys 17:123010
Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N, Lloyd S (2017) Quantum machine learning. Nature 549:19
Cai XD, Wu D, Su ZE, Chen MC, Wang XL, Li L, Liu NL, Lu CY, Pan JW (2015) Entanglement-based machine learning on a quantum computer. Phys Rev Lett 114:110504
Degen CL, Reinhard F, Cappellaro P (2017) Quantum sensing. Rev Mod Phys 89:035002
Endo S, Benjamin SC, Li Y (2018) Practical quantum error mitigation for near-future applications. Phys Rev X 8:031027
Farhi E, Goldstone J, Gutmann S (2014) A quantum approximate optimization algorithm. arXiv:1411.4028
Granade CE, Ferrie C, Wiebe N, Cory DG (2012) Robust online hamiltonian learning. New J Phys 14:103013
Kandala A, Mezzacapo A, Temme K, Takita M, Brink M, Chow JM, Gambetta JM (2017) Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549:242
Li Y, Benjamin SC (2017) Efficient variational quantum simulator incorporating active error minimization. Phys Rev X 7:021050
Li Z, Liu X, Xu N, Du J (2015) Experimental realization of a quantum support vector machine. Phys Rev Lett 114:140504
Lloyd S (2008) Enhanced sensitivity of photodetection via quantum illumination. Science 321:1463
McClean JR, Romero J, Babbush R, Aspuru-Guzik A (2016) The theory of variational hybrid quantum-classical algorithms. New J Phys 18:023023
McClean JR, Kimchi-Schwartz ME, Carter J, de Jong WA (2017) Hybrid quantum-classical hierarchy for mitigation of decoherence and determination of excited states. Phys Rev A 95:042308
Peruzzo A, McClean J, Shadbolt P, Yung MH, Zhou XQ, Love PJ, Aspuru-Guzik A, O’Brien JL (2014) A variational eigenvalue solver on a photonic quantum processor. Nat Comm 5:4213
Preskill J (2018) Quantum computing in the NISQ era and beyond. Quantum 2:79
Ristè D, da Silva MP, Ryan CA, Cross AW, Smolin JA, Gambetta JM, Chow JM, Johnson BR (2017) Demonstration of quantum advantage in machine learning. npj Quantum Information 3:16
Schuld M, Sinayskiy I, Petruccione F (2015a) An introduction to quantum machine learning. Contemp Phys 56(2):1034
Schuld M, Sinayskiy I, Petruccione F (2015b) Simulating a perceptron on a quantum computer. Phys Lett A 379:660
Tan SH, Erkmen BI, Giovannetti V, Guha S, Lloyd S, Maccone L, Pirandola S, Shapiro JH (2008) Quantum illumination with gaussian states. Phys Rev Lett 101:253601
Temme K, Bravyi S, Gambetta JM (2017) Error mitigation for short-depth quantum circuits. Phys Rev Lett 119:180509
Wiebe N, Braun D, Lloyd S (2012) Quantum algorithm for data fitting. Phys Rev Lett 109:050505
Wiebe N, Granade C, Ferrie C, Cory DG (2014) Hamiltonian learning and certification using quantum resources. Phys Rev Lett 112:190501
Zhukov AA, Remizov SV, Pogosov WV, Lozovik YE (2018) Algorithmic simulation of far-from-equilibrium dynamics using quantum computer. Quantum Inf Process 17:223
Zhukov AA, Kiktenko EO, Elistratov AA, Pogosov WV, Lozovik YE (2019) Quantum communication protocols as a benchmark for programmable quantum computers. Quantum Inf Process 18:31
Acknowledgments
We acknowledge use of the IBM Quantum Experience for this work. The viewpoints expressed are those of the authors and do not reflect the official policy or position of IBM or the IBM Quantum Experience team. W. V. P. acknowledges support from RFBR (project no. 190200421).
Cite this article
Babukhin, D.V., Zhukov, A.A. & Pogosov, W.V. Nondestructive classification of quantum states using an algorithmic quantum computer. Quantum Mach. Intell. 1, 87–96 (2019). https://doi.org/10.1007/s42484019000109
Keywords
 Quantum computing
 Quantum data processing
 Postprocessing
 Quantum error correction
 Error mitigation