1 Introduction

With the help of increasingly powerful supercomputers, we are nowadays able to solve image processing problems that were unsolvable just a few years ago. Especially in artificial intelligence and machine learning, higher computing power and the rise of deep convolutional neural networks recently enabled a leap forward. For some problems, often referred to as NP-hard problems, no efficient exact solution on classical computers is known. Some problems that are considered intractable for classical computers, however, can be solved efficiently on quantum computers. The best known examples are integer factorization and the discrete logarithm problem [16]. One goal of research in quantum computing is to demonstrate quantum supremacy, that is, to show the superiority of a quantum computer over classical supercomputers for solving a complex problem [12, 13].

In image processing, it has become increasingly important to process very large images, e.g., images bigger than one terabyte, efficiently. A classical solution for this is to apply algorithms to a subdivision of the image and to subsequently merge the results. To avoid such splitting of the data and the associated edge effects, the transfer of the problem to the quantum world is promising.

In general, quantum image processing consists of three parts: quantum image representation, quantum image processing algorithms, and quantum image measurement. In classical image processing, the focus is on the actual algorithm and its efficiency. In quantum image processing, it is of highest priority to convert classical images into quantum states. For this step, there is a variety of algorithms including the qubit lattice [20], the Real Ket quantum image representation [7], and the flexible representation of quantum images (FRQI) [8]. These three methods are considered to be the fundamental image representation options and serve as basic building blocks of many algorithms and starting points for other quantum image representations, see, e.g., [18, 19, 21,22,23].

In this paper, we take a closer look at FRQI. Often this method is only described in theory and not explicitly applied on real backends. Mostly, only the minimal image size of \(2\times 2\) pixels is considered. Here, we show how larger images can be implemented. Furthermore, we explore the current limitations in the noisy intermediate-scale quantum (NISQ) era, in which the results of the quantum computer are affected by noise. On a circuit-based superconducting quantum computer of IBM [6], we investigate up to which size an image can currently (August 2021) be encoded and measured. We increase the possible image size by improving the implementation of Qiskit [1] based on an idea that is also beneficial for other image representations and algorithms.

This paper is organized as follows. In Sect. 2, we first explain why we chose FRQI and how classical data are encoded into quantum states with it. A naive implementation together with two preprocessing approaches is presented in Sect. 3. Additionally, we explain two approaches for recovering the classical data from the empirical probabilities of the quantum computing results. In Sect. 4, we introduce an alternative to the naive implementation which has advantages especially for larger images. In Sect. 5, we propose a way to extend both implementations to larger images. In Sect. 6, we present the experimental setup for evaluating the accuracy of the implementation on actual quantum computers. The results and the current maximum possible image size are discussed in Sect. 7. Sect. 8 concludes the paper.

2 Image encoding method

2.1 Choice of the image encoding method

FRQI, novel enhanced representation for quantum images (NEQR) [23], and quantum probability image encoding (QPIE) [21] are the most common encoding methods [1, 21]. The goal of this paper is to cover not only the theory but also the application on IBM’s gate-based quantum computers. On these devices, we do not have all-to-all connectivity, so missing connections must be bridged by additional SWAP-gates.

With NEQR, encoding a \(2\times 2\) pixel gray value image requires 2 qubits for the pixel positions and 8 qubits to store the intensities (bit-wise representation of values in [0, 255]). Theoretically, NEQR performs better than FRQI in converting the probabilities of the quantum states to classical pixel values, because we only need to know the most frequently observed states, not their exact probabilities. However, on the current NISQ hardware, NEQR’s performance is limited. For example, on IBM’s backend ibmq_16_melbourne, we would need around 368 CX-gates and a circuit depth of 340 for encoding a \(2\times 2\) image with values \(\{0,85,170,255\}\) due to the additional SWAP-gates needed.

For the amplitude encoding method QPIE, we initialize the qubits in the state with the amplitudes corresponding to the pixel values of the image. For that, we can use the initialize command in Qiskit. Thus, in principle, the encoding procedure for small images is simple. However, it is more difficult to transform the probabilities of the quantum states back to pixel values. Normally, the median amplitude is used: Let h be the height and w the width of an image, shots the number of shots (i.e., how often we run the circuit), and counts(key) the frequency of observing a specific state in the measurements. We retrieve a pixel’s 8-bit gray value as

$$\begin{aligned} \textrm{value}=\textrm{round}\left( 128\frac{h\cdot w\cdot counts(key)-shots}{shots}+128\right) . \end{aligned}$$
(1)

However, we cannot distinguish between a black and a white image with this image retrieval.
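The ambiguity can be illustrated with a minimal sketch of Equation (1). The helper name qpie_pixel_value is hypothetical, and we assume ideal (noise-free) counts. Since amplitude encoding normalizes the pixel vector, every constant nonzero image leads to the uniform distribution over the \(h\cdot w\) states, and Equation (1) then returns the same mid-gray value regardless of the original brightness.

```python
def qpie_pixel_value(count, shots, height, width):
    """Eq. (1): 8-bit gray value from the count of one basis state."""
    return round(128 * (height * width * count - shots) / shots + 128)


# A constant image has equal amplitudes, so each of the h*w states is
# expected shots / (h*w) times, whatever the constant gray value is.
shots, h, w = 8192, 2, 2
uniform_count = shots // (h * w)
print(qpie_pixel_value(uniform_count, shots, h, w))  # 128
```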

Due to these limitations of NEQR and QPIE on the actual quantum computers, we choose FRQI. We conjecture that our measures to improve FRQI would apply to NEQR, too.

2.2 FRQI image representation

For representing a \(2^n\times 2^n\) pixel gray value image, just \(2n+1\) qubits are needed: 2n qubits for the position and one qubit for the gray value information of the pixels. In the basis states, each qubit is in a two-dimensional state, either \(|{0} \rangle \) or \(|{1} \rangle \). A combination of 2n qubits via the tensor product \(\otimes \) yields \(2^{2n}\)-dimensional computational basis states \(|{i} \rangle \) for the position information. For simplicity, the basis states \(|{i} \rangle \) are named by the decimal number i corresponding to the vector of zeros and ones read as binary number.

The gray value information of the pixels is encoded by

$$\begin{aligned} |{c_i} \rangle =\cos (\theta _i)|{0} \rangle +\sin (\theta _i)|{1} \rangle \text { and } \theta _i\in \left[ 0,\frac{\pi }{2}\right] , \end{aligned}$$
(2)

where \(\theta =\left( \theta _0, \theta _1, \dots , \theta _{2^{2n}-1}\right) \) is a vector of angles encoding the gray values of each pixel. Section 3.1 provides more details on how to convert gray values into angles. In total, the two states for position and gray value information of every pixel are converted into the FRQI state by the formula

$$\begin{aligned} |{Img(\theta )} \rangle =\frac{1}{2^n}\sum _{i=0}^{2^{2n}-1}|{c_i} \rangle \otimes |{i} \rangle . \end{aligned}$$
(3)

Figure 1 shows a \(2\times 2\) sample gray value image and its FRQI state.

Fig. 1

\(2\times 2\) pixel gray value image, transformation to angles in the range \([0,\pi /2]\), and the resulting quantum state representation
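As a minimal sketch, the state vector of Equation (3) can be written out with NumPy. The function name frqi_state and the example angles are hypothetical; we assume that the gray value qubit is the leading tensor factor, as in Equation (3), so the amplitudes of \(|{0} \rangle \otimes |{i} \rangle \) precede those of \(|{1} \rangle \otimes |{i} \rangle \).

```python
import numpy as np


def frqi_state(theta):
    """State vector of Eq. (3); gray value qubit is the leading tensor factor."""
    theta = np.asarray(theta, dtype=float)
    n_pixels = theta.size            # 2^(2n) pixels
    norm = np.sqrt(n_pixels)         # equals 2^n
    # Amplitudes of |0>|i> come first, then those of |1>|i>.
    return np.concatenate([np.cos(theta), np.sin(theta)]) / norm


theta = np.array([0.0, np.pi / 6, np.pi / 3, np.pi / 2])  # hypothetical 2x2 angles
state = frqi_state(theta)
print(np.isclose(np.sum(state**2), 1.0))  # True: the state is normalized
```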

In quantum computing, the qubits are usually initialized in well-prepared states. In Qiskit [1], the \(|{0} \rangle \) state is the initial one for all qubits. Thus, the initial state of a quantum circuit for FRQI is \(|{0} \rangle ^{\otimes 2n+1}\). To convert this initial state into the FRQI state from Equation (3), we apply single- and multi-qubit gates. The polynomial preparation theorem [8] shows that the desired FRQI state can be constructed efficiently in two steps. First, all 2n position qubits have to be superposed to cover all \(2^{2n}\) pixels in the classical image. This is achieved by using 2n Hadamard (H) gates in parallel for the position qubits. That is, we apply the tensor product \(I \otimes H^{\otimes 2n}\) to the initial state

$$\begin{aligned} |{H} \rangle =(I \otimes H^{\otimes 2n})|{0} \rangle ^{\otimes 2n+1}=\frac{1}{2^n}|{0} \rangle \otimes \sum _{i=0}^{2^{2n}-1}|{i} \rangle . \end{aligned}$$
(4)

In the second step, controlled rotation gates are applied

$$\begin{aligned} |{Img(\theta )} \rangle =\left( \prod _{i=0}^{2^{2n}-1} R_i\right) |{H} \rangle , \end{aligned}$$
(5)

where

$$\begin{aligned} R_i=\left( I \otimes \sum _{j=0,j\ne i}^{2^{2n}-1}|{j} \rangle \langle {j}\vert \right) +R_y(2\theta _i)\otimes |{i} \rangle \langle {i}\vert . \end{aligned}$$
(6)

Here, \(\langle {\cdot }\vert \) represents the adjoint of \(|{\cdot } \rangle \) and \(|{\cdot } \rangle \langle {\cdot }\vert \) is the outer product. The first part of Equation (6) describes the control qubits and the second part the rotation using the R\(_\text {y}\)-gate [11] defined as

$$\begin{aligned} R_y(\theta _i)=\left( \begin{array}{ccr} \cos (\theta _i/2) &{}&{} -\sin (\theta _i/2) \\ \sin (\theta _i/2) &{}&{} \cos (\theta _i/2) \end{array}\right) . \end{aligned}$$
(7)
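Equations (4) to (7) can be checked numerically for a small example. The following sketch assumes dense matrices, the tensor ordering of Equation (3) (gray value qubit first), and hypothetical angles; the function names are ours. It is meant as a consistency check of the formulas, not as a circuit implementation.

```python
import numpy as np


def ry(angle):
    """R_y gate of Eq. (7)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def controlled_rotation(i, theta_i, n_pixels):
    """R_i of Eq. (6): rotate the gray value qubit only for position |i>."""
    proj_i = np.zeros((n_pixels, n_pixels))
    proj_i[i, i] = 1.0
    rest = np.eye(n_pixels) - proj_i
    return np.kron(np.eye(2), rest) + np.kron(ry(2 * theta_i), proj_i)


def prepare_frqi(theta):
    n_pixels = len(theta)
    # Eq. (4): |0> on the gray value qubit, uniform superposition of positions.
    state = np.concatenate([np.ones(n_pixels), np.zeros(n_pixels)]) / np.sqrt(n_pixels)
    for i, theta_i in enumerate(theta):  # Eq. (5)
        state = controlled_rotation(i, theta_i, n_pixels) @ state
    return state


theta = [0.1, 0.5, 1.0, 1.5]  # hypothetical angles of a 2x2 image
expected = np.concatenate([np.cos(theta), np.sin(theta)]) / 2  # Eq. (3)
print(np.allclose(prepare_frqi(theta), expected))  # True
```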

Each multi-controlled operation, like the controlled rotation above, can be decomposed into single-qubit and two-qubit operations [11]. This allows us to implement such operations on quantum computers. On these, all operations have to be transformed into basis gates, which for IBM’s backends are currently identity (I), NOT (X), square-root NOT (SX), rotation (R\(_\text {z}\)), and controlled-NOT (CX) gates [6]. Typically, an operation can be decomposed in several ways. The decomposition chosen is decisive for the number of gates required and the resulting noise level.

3 Practical implementation of FRQI

3.1 Preprocessing: Converting gray values into angles

For representing an image on the quantum computer with FRQI, the pixel values have to be converted into angles \(\theta _i\in \left[ 0,\frac{\pi }{2}\right] \) (see Equation (2)). Here, we only consider 8-bit gray value images with input pixel values \(v_{in, i}\in [0,255]\). The first and obvious approach for obtaining the angle representation is a linear transformation. The pixel values are converted to fall into the range \(\left[ 0,\frac{\pi }{2}\right] \) via

$$\begin{aligned} \theta _i= \frac{v_{in,i}}{255}\cdot \frac{\pi }{2}. \end{aligned}$$
(8)

Alternatively, we follow [4] and use the transformation

$$\begin{aligned} \theta _i=\arcsin \left( \frac{v_{in,i}}{255}\right) , \end{aligned}$$
(9)

to get angles in the required interval \(\left[ 0,\frac{\pi }{2}\right] \). Of course, the arcsin in Equation (9) can also be replaced by an arccos to achieve angles in the required range.
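Both conversions are one-liners. The sketch below uses hypothetical function names and only the Python standard library; it prints the angles for the four gray values used later as sample image so that the two mappings can be compared.

```python
import math


def angle_linear(v):
    """Eq. (8): linear map of a gray value in [0, 255] to [0, pi/2]."""
    return v / 255 * math.pi / 2


def angle_arcsin(v):
    """Eq. (9): arcsin map of a gray value in [0, 255] to [0, pi/2]."""
    return math.asin(v / 255)


for v in (10, 85, 170, 255):
    print(v, round(angle_linear(v), 4), round(angle_arcsin(v), 4))
```

Both maps agree at the end points 0 and 255 but differ in between, so the retrieval step has to invert the same map that was used for encoding.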

3.2 MCRY-implementation

Independent of the two transformation variants, the multi-controlled rotation gates from Equation (6) must be converted into gates which are executable on a real quantum computer. The multi-controlled Y rotation gates (MCRY) are converted by some standard routines of Qiskit [1]. The MCRY-gates only apply a rotation to the gray value qubit if all position qubits are in state \(|{1} \rangle \). Subsequently, the result of the rotation has to be transferred to the correct position qubit state by application of X-gates. This allows us to encode the next pixel using state \(|{1} \rangle \) of all position qubits.

One MCRY-gate is applied per input pixel. An example of a decomposition of one MCRY-gate with an arbitrary angle \(\gamma _i=2\theta _i\) is shown in Fig. 2a. Here, the factor 2 is needed because the R\(_\text {y}\)-gate in Equation (7) uses half angles; rotating by \(2\theta _i\) yields the sine and cosine terms required in Equation (2).
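The role of the X-gates can be made explicit with a small matrix sketch for the \(2\times 2\) case. We assume dense matrices, the gray value qubit as leading tensor factor, and a hypothetical convention in which pixel i corresponds to the binary position state with the first position qubit as the higher bit. All function names are ours. The X-layer maps \(|{i} \rangle \) to \(|{11} \rangle \), the rotation controlled on \(|{11} \rangle \) is applied, and the same X-layer maps the state back.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)


def ry(angle):
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def mcry_all_ones(gamma):
    """Rotation of the gray value qubit controlled on position state |11>."""
    proj = np.diag([0.0, 0.0, 0.0, 1.0])
    return np.kron(I2, np.eye(4) - proj) + np.kron(ry(gamma), proj)


def x_layer(i):
    """X on every position qubit whose bit in i is 0, so that |i> maps to |11>."""
    hi, lo = (i >> 1) & 1, i & 1
    pos = np.kron(I2 if hi else X, I2 if lo else X)
    return np.kron(I2, pos)


def pixel_gate(i, theta_i):
    return x_layer(i) @ mcry_all_ones(2 * theta_i) @ x_layer(i)


theta = [0.1, 0.5, 1.0, 1.5]                       # hypothetical angles
state = np.array([1, 1, 1, 1, 0, 0, 0, 0]) / 2     # after the Hadamard gates
for i, t in enumerate(theta):
    state = pixel_gate(i, t) @ state
expected = np.concatenate([np.cos(theta), np.sin(theta)]) / 2
print(np.allclose(state, expected))  # True
```

In a circuit, the X-layer closing one pixel and the X-layer opening the next one combine, so only the position qubits whose bits differ between consecutive pixels need an X-gate.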

An MCRY-gate is decomposed into two CX-gates and three controlled-U-gates (CU). The latter have to be converted into basis gates available on the real backends, that is, into 2 CX-, 4 SX-, and 4 R\(_\text {z}\)-gates per CU-gate. These numbers of basis gates are obtained when using the MCRY-gate in Qiskit and decomposing it with the transpiler with the basis gates as input [1]. Thus, the number of gates, especially CX-gates, increases further (8 CX-, 12 SX-, and 12 R\(_\text {z}\)-gates for one MCRY-gate). The circuit for an arbitrary \(2\times 2\) example using MCRY-gates is visualized in Fig. 2b. Note that in the graphical representation of gates and circuits, the top or gray value qubit is always the target qubit. The remaining qubits are controls. In this implementation, the number of CX-gates is high and requires the entanglement of all qubits. This means that we have to insert SWAP-gates for missing connections on the real backends.

Fig. 2

MCRY-gate and MCRY-implementation for a \(2\times 2\) image

3.3 Post-processing: Converting probabilities to gray values

The state of the quantum computer after running the above circuit cannot be determined exactly. By a measurement, frequencies of the possible states are observed. The classical representation of the output image then has to be determined from the thus derived empirical probability distribution.

Consider a state \(i=c\otimes j\), where c represents the state of the gray value qubit and j the state of the position qubits. Let \(p_{j|c=|{0} \rangle }\) be the conditional probability of observing state j given that the gray value qubit is in state \(|{0} \rangle \). Analogously, \(p_{j|c=|{1} \rangle }\) is the conditional probability of observing j with gray value qubit in \(|{1} \rangle \). The output pixel value can be determined by

$$\begin{aligned} v_{out,i}=\arccos \left( \sqrt{\frac{p_{j|c=|{0} \rangle }}{p_{j|c=|{0} \rangle }+p_{j|c=|{1} \rangle }}}\right) \cdot 255 \cdot \frac{2}{\pi } \end{aligned}$$
(10)

for the linear conversion as given in Equation (8). The probabilities for the states with gray value qubit in state \(|{0} \rangle \) are considered and normalized with the sum of the probabilities with gray value qubit in state \(|{0} \rangle \) and \(|{1} \rangle \) for each possible state. This ensures that the quotient in the first part of Equation (10) is in [0, 1] and the pixel values in the gray value range [0, 255]. According to Equation (2), arccos is applied and the pixel values are retrieved by inverting Equation (8).
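A minimal sketch of Equation (10), assuming ideal probabilities of a single pixel of a \(2\times 2\) image and the linear encoding of Equation (8). The function name retrieve_linear is hypothetical. Note that the expression is undefined when neither outcome of a pixel has been observed, which an implementation has to handle separately.

```python
import math


def retrieve_linear(p0, p1):
    """Eq. (10): gray value from the probabilities of (c=0, j) and (c=1, j)."""
    ratio = p0 / (p0 + p1)
    return math.acos(math.sqrt(ratio)) * 255 * 2 / math.pi


# Ideal probabilities of one pixel of a 2x2 image encoded with Eq. (8).
v_in = 170
theta = v_in / 255 * math.pi / 2
p0 = math.cos(theta) ** 2 / 4
p1 = math.sin(theta) ** 2 / 4
print(round(retrieve_linear(p0, p1)))  # 170
```

Because only the ratio of the two probabilities enters, a common scaling of both values, e.g., when a position is observed too often or too rarely, does not change the result.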

In the second approach, by definition, we only need to consider the states where the gray value qubit is in state \(|{1} \rangle \) for retrieving the image. The output pixel values \(v_{out,i}\) are given by

$$\begin{aligned} v_{out,i}=2^{n}\cdot \sqrt{ p_{j|c=|{1} \rangle }}\cdot 255. \end{aligned}$$
(11)

The factor \(2^n\) cancels the equal state weighting in the FRQI definition (3) and the factor 255 converts the normalized angles into gray values.

Obviously, states with gray value qubit in state \(|{0} \rangle \) are not considered in Equation (11). Instead, the general weighting factor \(2^n\) is used. When using a small number of measurements/shots and in the presence of noise, there is no way to ensure that \(p_{j|c=|{0} \rangle }+p_{j|c=|{1} \rangle }=1/{2^{2n}}\) for \(j\in \{0,\dots ,2^{2n}-1\}\). This can lead to incorrectly weighted states and result in pixel values that are not in the gray value range [0, 255]. Therefore, this approach is only useful if high numbers of measurements/shots can be performed.
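The range violation can be reproduced with a short sketch of Equation (11). The function name retrieve_sine and the value 0.27 are hypothetical; the latter stands for an empirical estimate in which one position of a \(2\times 2\) image was observed slightly more often than the ideal share of 1/4.

```python
import math


def retrieve_sine(p1, n):
    """Eq. (11): uses only the outcomes with gray value qubit in |1>."""
    return 2**n * math.sqrt(p1) * 255


n = 1                     # 2x2 image
ideal_p1 = 1 / 4          # white pixel: sin^2(pi/2) / 2^(2n)
print(retrieve_sine(ideal_p1, n))         # 255.0
# Hypothetical finite-shot estimate: the position is observed slightly too often.
noisy_p1 = 0.27
print(retrieve_sine(noisy_p1, n) > 255)   # True: outside the gray value range
```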

Due to these problems, we choose the first approach and apply the linear transformation. Note that an exact quantum image recovery requires a state tomography [14] which is challenging, so far untested on real backends, and scales poorly with the size of the quantum system.

4 Modification of the MCRY-implementation: MARY-implementation

A more efficient method to implement the FRQI state for a \(2\times 2\) image is inspired by [4]. The basic structure as well as the X- and Hadamard gates remain unchanged. However, instead of MCRY-gates we use MARY-gates for a part of the decomposition, leaving only CX-gates or single-qubit Y rotation gates in the implementation. Figure 3a shows a MARY-gate with an arbitrary angle \(\gamma _i=2\theta _i\), which is decomposed into four R\(_\text {y}\)-gates and four CX-gates. The R\(_\text {y}\)-gates are further decomposed into SX- and R\(_\text {z}\)-gates, but the number of CX-gates stays unchanged.

This way the number of error-prone CX-gates is halved in comparison with the MCRY-implementation. Furthermore, the CX-gates act only on mixed pairs of a gray value and a position qubit but not pairs of position qubits anymore. Consequently, not all three qubits have to be pairwise connected. Thus, the SWAP-gates required in the MCRY-implementation are avoided here. Figure 3b shows the circuit for an arbitrary \(2\times 2\) sample image. We only need to replace the MCRY- by MARY-gates. Additional information about the MARY-implementation is presented in Appendix 1.

Fig. 3

MARY-gate and MARY-implementation for a \(2\times 2\) image

5 Extension to larger images

For larger images, the difference between the two implementations is more pronounced. In both cases, the position qubits must be entangled with the gray value qubit. In the MCRY-implementation, only the additional position qubits are added to the Qiskit MCRY-gate as control qubits. The decomposition of these multi-controlled rotation gates is left to the transpilation step and does not have to be adapted by the user. The effort to adapt the implementation to another image size is therefore very low. However, the required number of CX-gates is very large, which increases the error rate, circuit depth, and execution time.

To reduce the number of CX-gates, we adapt the MARY-implementation for the larger images. We have to entangle the additionally required position qubits with the gray value qubit. Similar to [4], we use RCCX- or RCCCX-gates [9, 17] to entangle three or four qubits, respectively. These simplified versions of the Toffoli gate or multi-controlled X-gate with three control qubits yield the same output up to relative phases, so the elements in the matrix representation differ by a factor of \(e^{i\pi \phi }, \phi \in {\mathbb {R}}\) [9]. Definitions of the gates are given in Appendix 1.

The change of relative phase allows for a simpler implementation such that we can reduce the number of CX-gates compared to the direct use of multi-controlled CX-gates, see Fig. 4. Hence, we only need 3 CX-gates in the case of RCCX- and 6 CX-gates in the case of RCCCX-gates instead of 6 or 14, respectively.
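The statement that such a gate agrees with the Toffoli gate up to relative phases can be checked numerically. The sketch below assumes a commonly used relative-phase Toffoli construction with two Hadamard gates, four T-type gates, and three CX-gates; this is our assumption for illustration, and the decomposition shown in Fig. 4 may differ in detail. The qubit ordering (two controls, then the target) is also an assumption.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])


def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)


def on_target(gate):
    """Single-qubit gate on the target; qubit order is (control a, control b, target)."""
    return kron3(I2, I2, gate)


CX_A = kron3(P0, I2, I2) + kron3(P1, I2, X)   # control a, target
CX_B = kron3(I2, P0, I2) + kron3(I2, P1, X)   # control b, target

# Gates in the order they are applied (assumed decomposition with 3 CX-gates).
sequence = [on_target(H), on_target(T), CX_B, on_target(T.conj().T), CX_A,
            on_target(T), CX_B, on_target(T.conj().T), on_target(H)]

rccx = np.eye(8, dtype=complex)
for gate in sequence:
    rccx = gate @ rccx

toffoli = np.eye(8)
toffoli[[6, 7]] = toffoli[[7, 6]]   # swap |110> and |111>

print(np.allclose(np.abs(rccx), toffoli))   # True: same up to phases
print(np.allclose(rccx, toffoli))           # False: relative phases differ
```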

The main idea in constructing the circuits for larger images is to keep the operations on the gray value qubit the same as in [4] for all sizes, using X-gates to change the desired state and entangle the position qubits with the gray value qubit. We can entangle two qubits with a CX-gate, three qubits with an RCCX-gate, and four qubits with an RCCCX-gate. We have to combine these gates to entangle the 2n position qubits with the gray value qubit for a \(2^n\times 2^n\) pixel gray value image. For that we have two ways: enlarging the MARY-gates by replacing CX-gates with RCCX- or RCCCX-gates, or entangling the qubits before the MARY-gate and applying the corresponding gates symmetrically after the MARY-gate. In practice, one of the approaches or a combination of both is chosen depending on the image size.

Enlarging the MARY-gate allows us to entangle a maximum of ten qubits using RCCCX-gates. An example of a MARY8-gate which entangles eight qubits is shown in Fig. 5. A detailed description of how to derive the MARY8 decomposition in Fig. 5 from the basic MARY-gate in Fig. 3a is given in Appendix 1.

For images smaller than \(32\times 32\), we can just enlarge the MARY-gate and keep the original workflow. However, for larger images, we have to both enlarge the MARY-gate and entangle outside the MARY-gate. Figure 6 shows this combined way for a \(64\times 64\) pixel gray value image. We entangle the position qubits before and symmetrically after the MARY8-gate. With the use of one ancilla qubit and one RCCX-gate, we can further combine the information from two position qubits. Using X-gates prepares the position qubits to receive information from other position qubits. This is achieved by two RCCCX-gates. Analogously, more RCCX- and RCCCX-gates can be used to encode images of other sizes.

Fig. 4

Gates for entangling qubits instead of using multi-controlled CX-gates. Their decompositions into single-qubit-gates and CX-gates are shown on the right. Top: RCCX-gate; Bottom: RCCCX-gate

Fig. 5

Decomposed MARY8-gate for an arbitrary angle \(\gamma _i=2\theta _i\)

Fig. 6

Circuit for one angle of a \(64\times 64\) image. Hadamard gates are used for superposition, RCCX- and RCCCX-gates for entangling, and MARY8 for entangling and rotation. X-gates after the Hadamard gates are not shown here

Examples of possible circuits for image sizes \(2^n\times 2^n\), with \(n=1,\dots , 9, 13\), are shown in Appendix 2. This way, encoding of images of size \(8192 \times 8192\) is conceivable by using one gray value qubit, three ancilla qubits, and 26 position qubits as visualized in the supplementary information in Fig. 6. The ancilla qubits serve only as storage qubits for entangled position qubits.

6 Quantum computing environment used

Here, we describe our framework for realizing the methods from the previous section. It includes software, a classical computer, a quantum computer, and error models.

We use the open-source software development kit Qiskit [1] for working with IBM’s circuit-based superconducting quantum computers [6]. Via cloud access, they provide a variety of systems, also known as backends, which differ in the number and performance of the qubits and their connectivity. In this paper, we use the backends ibmqx2, ibmq_16_melbourne, ibmq_santiago, ibmq_manila, ibmq_toronto, and the German backend ibmq_ehningen. The corresponding coupling maps are shown in Fig. 7.

Fig. 7

Backends used in this paper. The qubit frequencies (points) and the CX errors for the connections between the qubits (lines) are encoded in the color values. Dark blue indicates small, purple high frequency/error

The backends are subject to external influences. As a consequence, the characteristics of the backends, such as CX error, readout error, or decoherence times, can change hourly. The errors are usually reported as averages over 24 hours. Intermediate calibration is used to keep the errors low. Typical average error values and frequencies are shown in Table 1.

Table 1 Typical average calibration data of the six chosen backends

Currently, all IBM quantum computers are affected by the errors just addressed. Measurement error mitigation is a way to reduce the errors, especially the readout error, and to improve the results. Various methods for quantum error mitigation have been introduced including measurement error mitigation, zero noise extrapolation or probabilistic error cancellation to name just a few. An overview is given in [3].

Error mitigation is an active field of research. It is beyond the scope of this paper to explore and compare all possibilities. We therefore only demonstrate the effect using a simple measurement error mitigation according to [2]. The idea is to prepare all \(2^n\) basis input states, execute them on the qasm_simulator with an error model, and compute the probability of measuring basis states differing from the true input. Based on this, a backend-specific calibration matrix is calculated which compares the desired and the measured states. Applying the inverse of this calibration matrix finally yields the improved results.

Two error models are used in the following. The Pauli error is applied to the measurement of the qasm_simulator. It consists in randomly flipping each bit in the output with a probability \(p_{meas}\). In contrast, the depolarizing error model captures imperfections in operations. While processing, the state of any qubit is replaced by a completely random state with a probability of \(p_{gate}\). For two-qubit gates, like CX-gates, the depolarizing error is applied independently to each qubit. For further details on these two error models and measurement error mitigation in general we refer to [2].
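The principle of the calibration matrix can be sketched for the measurement bit-flip error alone. The example assumes independent, symmetric flips with probability \(p_{meas}\) on each qubit and exact probabilities instead of sampled counts; the function name calibration_matrix and the example distribution are hypothetical. With sampled counts, applying the inverse is only approximate and can produce small negative entries.

```python
import numpy as np


def calibration_matrix(num_qubits, p_meas):
    """Entry [k, l]: probability of reading basis state k when l was prepared."""
    flip = np.array([[1 - p_meas, p_meas], [p_meas, 1 - p_meas]])
    matrix = np.array([[1.0]])
    for _ in range(num_qubits):
        matrix = np.kron(matrix, flip)
    return matrix


p_meas = 0.1
cal = calibration_matrix(3, p_meas)          # 3 qubits as for a 2x2 image
ideal = np.array([0.2, 0.1, 0.05, 0.0, 0.05, 0.15, 0.2, 0.25])  # hypothetical
noisy = cal @ ideal                          # what the noisy readout reports
mitigated = np.linalg.solve(cal, noisy)      # applies the inverse of cal
print(np.allclose(mitigated, ideal))         # True
```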

In addition to quantum computers, a classical computer is needed for testing algorithms, reducing errors, and generating and storing the circuits before sending them to the quantum computer. The latter is the main limitation as circuits for encoding larger images need a lot of gates and have to be stored completely in the RAM. We use a computer with an Intel Xeon E5-2670 processor running at 2.60 GHz, a total RAM of 64 GB, and Red Hat Enterprise Linux 7.9.

7 Results and comparison of MCRY- and MARY-implementations

7.1 Sample image \(2\times 2\)

Our aim is to encode a classical image into a quantum state, measure the outcome, and compare the result with the classical input image.

We set the pixel values of the input image \(image_{in,i}, i\in \{0,1,2,3\}\), to [10, 85, 170, 255] to cover the whole gray value range. We start at pixel value 10, since 0 is not generally interesting as only identity gates are needed for encoding it. We measure deviation by the relative difference:

$$\begin{aligned} \mathrm{{diff}}_{rel}=\frac{1}{2^{2n}}\sum _{i=0}^{2^{2n}-1}|image_{out,i}-image_{in,i}|\cdot \frac{100}{255} \end{aligned}$$
(12)

for a general image size of \(2^n\times 2^n\). The term \(image_{out}\) is the resulting image reconstructed from the measurements of the quantum computer as described in Sect. 3.3. To compare the implementations, we ran the two circuits in one job on five different backends. This ensures that the same calibration is used for the two implementations and that differences in the outcome are not due to varying error rates or coherence times. Box plots for 42 executions between June 7 and June 14, 2021, are shown in Fig. 8.
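Equation (12) is the mean absolute gray value deviation expressed in percent of the gray value range. A minimal sketch with a hypothetical function name and a hypothetical reconstructed image:

```python
def relative_difference(image_in, image_out):
    """Eq. (12): mean absolute gray value deviation in percent of 255."""
    n_pixels = len(image_in)
    total = sum(abs(o - i) for i, o in zip(image_in, image_out))
    return total / n_pixels * 100 / 255


image_in = [10, 85, 170, 255]
image_out = [15, 80, 172, 250]   # hypothetical reconstruction
print(relative_difference(image_in, image_out))
```

A value of 0 means perfect reconstruction, and 100 is reached only if every pixel deviates by the full range of 255 gray values.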

Fig. 8

Box plots for relative difference between input image [10, 85, 170, 255] and output image from five different backends

Obviously, the MARY-implementations perform better than those with the MCRY-gates. This is due to the lower circuit depth (see Fig. 9), especially the lower number of CX-gates (see Fig. 10).

Fig. 9

Box plots for circuit depth when using the sample image [10, 85, 170, 255]

Fig. 10

Box plots for number of CX-gates when using the sample image [10, 85, 170, 255]

Differences between the five backends are clearly visible, too. Backend ibmqx2 has three fully connected qubits and (like ibmq_16_melbourne) no heavy-hexagonal topology (see Fig. 7). This yields the lower variance of the outcomes for this backend. The difference is even more apparent for the MCRY-implementation, where we can benefit from the high connectivity of the qubits in that no SWAP-gates have to be used. Newer backends use a purely heavy-hexagonal topology or segments thereof because it yields lower error rates than more connected topologies [24]. In Fig. 8 this effect is visible in the lower relative difference of the newer backends ibmq_santiago, ibmq_manila, and ibmq_ehningen compared to the older ibmq_16_melbourne and ibmqx2.

As shown in Fig. 8, the MARY-implementation’s results are more precise. Measurement error mitigation is a way to improve them further.

Here, we pursue two ways to obtain the calibration matrix. The first one, mitigation_own, is executed on the qasm_simulator assuming a self-determined noise model. We follow the suggestions from [2] and assume Pauli-errors for the measurements and depolarizing errors for X- and CX-gates. We set \(p_{meas}=p_{gate}=10\%\), which exceeds the actual error probabilities of the backends (see Table 1). This way, we can also account for other errors that cannot be incorporated directly, e.g., errors caused by the environment of the quantum computer.

For the second approach, mitigation_backend, we replace our noise model by the noise model from the Qiskit Aer Noise module [1] which is obtained via the command from_backend(backend). It includes all error rates, coherence times, and the coupling map of the backend at that specific time. The outcome for the MARY-implementation is shown in Fig. 11.

Fig. 11

Box plots for relative difference between input image [10, 85, 170, 255] and output image from five different backends using the MARY-implementation. The backends ibmq_santiago, ibmq_manila, and ibmq_ehningen have heavy-hexagonal topology with lower error rates

Both variants of measurement error mitigation reduce the relative difference for all tested backends. Fig. 11 shows that mitigation_own works better. This is probably because mitigation_backend takes into account the actual daily qubit and gate errors of the backends but ignores other error sources such as crosstalk errors. In addition, other factors such as the environment of the quantum computer can also affect the result. With mitigation_own, we circumvent this problem by fixing the error to \(10\%\) which surely overestimates the true values. Changing or adjusting this value offers further potential for improving measurement error mitigation, but will not be considered further in this paper.

Relative differences in the range of 2 to \(3\%\), in some executions also lower, can be achieved by using mitigation_own. Thus, with our adjustments and improvements, it is possible to reconstruct the input image. The outcomes with the smallest, highest, and mean relative difference are shown in Fig. 12.

Fig. 12

\(2\times 2\) pixel gray value input image and resulting output images. We use the qasm_simulator with 8192 shots and in total 42 executions on the five backends. We show the output images with the smallest (best) and highest relative difference (worst) as well as the mean of the 42 executions and the five backends. The best results were obtained on backend ibmq_manila. The worst cases came from ibmq_16_melbourne

7.2 Current maximal possible image sizes for simulator and NISQ quantum computer

In [4], an input image of size \(32 \times 32\) was implemented. Following the same idea, we implement circuits for images of size \(2^n\times 2^n\), where \(n\le 9\) (details in Appendix 2). Input pixel values are chosen randomly up to a size of \(16\times 16\). From \(32\times 32\) on, we use downscaled versions of the 8-bit gray value Lena image [15], which originally has a size of \(512\times 512\) and is a standard test image in image processing.

We apply the steps from Sects. 3, 4, and 5 to check feasibility but do not numerically compare the output with the original input. The results are strongly influenced by noise as the image size is larger than \(2\times 2\) and thus the associated circuit depth increases. This complicates the retrieval of an image enormously, since in a probabilistic model like FRQI the exact measurement of the probabilities for the individual states is crucial. The retrieved pixel values are all around 125 using the five backends from above or the backend ibmq_toronto and do not reflect the input image at all.

We further illustrate this in Fig. 13. Instead of the random image, we use a \(4\times 4\) pixel downscaled gray value image from the MNIST dataset [5]. This image shows a well recognizable digit seven. However, even with the MARY-implementation and when including error mitigation, the seven is no longer visible in the output image.

Fig. 13

\(4\times 4\) pixel downscaled gray value input image from the MNIST dataset [5] and resulting output images. We use the MARY-implementation, 8192 shots for the qasm_simulator, and ibmq_toronto as real backend. The results can only be improved slightly with the measurement error mitigation method

Consequently, we can only recover images with the minimal size of \(2\times 2\) with FRQI on a real backend. Above that, the noise makes the recovery of an image impossible with the implementation used, even when using measurement error mitigation.

Even with the noise free and fully connected qasm_simulator, using 8192 shots for larger images results in images strongly deviating from the input image, as shown in Fig. 14a.

Fig. 14

Relative difference (in %) between \(image_{in}\) and \(image_{out}\) for varying image sizes, with qasm_simulator as backend and 8192 or \(10^6\) shots

Even a much higher number of shots is no complete remedy (see Fig. 14b). The visual differences for the downscaled \(256\times 256\) Lena image are shown in Fig. 15.
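The effect of finite sampling can be reproduced with a minimal sketch in plain Python. It does not use the qasm_simulator. Instead, it samples directly from an ideal, noise-free FRQI state under the assumption that a gray value \(g\) is encoded as \(\theta = \frac{\pi }{2}\cdot \frac{g}{255}\). The names `sample_frqi` and `mean_abs_error` as well as the toy image are hypothetical.

```python
import math
import random

def sample_frqi(pixels, shots, seed=0):
    """Sample an ideal FRQI state and return the reconstructed gray values.

    Each shot picks a pixel position uniformly at random and then measures
    the gray value qubit, which yields 1 with probability sin^2(theta).
    Pixels that were never observed are returned as None.
    """
    rng = random.Random(seed)
    thetas = [math.pi / 2 * g / 255 for g in pixels]  # assumed angle encoding
    zeros = [0] * len(pixels)
    ones = [0] * len(pixels)
    for _ in range(shots):
        i = rng.randrange(len(pixels))
        if rng.random() < math.sin(thetas[i]) ** 2:
            ones[i] += 1
        else:
            zeros[i] += 1
    result = []
    for n0, n1 in zip(zeros, ones):
        if n0 + n1 == 0:
            result.append(None)
        else:
            theta = math.acos(math.sqrt(n0 / (n0 + n1)))
            result.append(theta * 255 / (math.pi / 2))
    return result

def mean_abs_error(pixels, estimate):
    """Mean absolute gray value error over the pixels that were observed."""
    pairs = [(g, e) for g, e in zip(pixels, estimate) if e is not None]
    return sum(abs(g - e) for g, e in pairs) / len(pairs)

pixels = list(range(40, 200, 10))  # a toy 4x4 image with 16 gray values
for shots in (512, 8192, 200_000):
    err = mean_abs_error(pixels, sample_frqi(pixels, shots))
    print(shots, round(err, 2))
```

In this simplified model, the standard error of the recovered angle is roughly \(1/(2\sqrt{N})\) for \(N\) shots landing on a pixel. With a fixed shot budget, \(N\) shrinks as the number of pixels grows, so the error grows with the image size even without any gate noise.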

Fig. 15

\(256\times 256\) pixel gray value downscaled Lena image [15] with qasm_simulator outcomes and varying number of shots

The observed deviation is related to the number of states that can occur for a given image size. For example, in the MARY-implementation, we only increase the size of the MARY-gates without adding an ancilla qubit for images smaller than \(32\times 32\). Therefore, we have in total \(2^{2n+1}\) possible states (as in the MCRY-implementation), where \(2n+1\) is the number of required qubits. For images larger than or equal to \(32\times 32\), we have an ancilla qubit in addition to the position qubits and the gray value qubit, so \(2^{2n+2}\) possible states are conceivable.

This results in more than a million possible states for images of size \(512\times 512\), since 20 qubits are needed. Thus, accurate estimation of the distribution of possible states requires a sufficiently large sample. The number of possible states and the deviation from the input image grow exponentially as visualized in Fig. 14.
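The counting argument above can be tabulated with a short sketch. It follows the MARY-implementation as described in the text (\(2n\) position qubits, one gray value qubit, and an ancilla from \(32\times 32\) on). The function name `frqi_qubits` and its parameter `ancilla_from` are hypothetical.

```python
def frqi_qubits(n, ancilla_from=5):
    """Qubits for a 2^n x 2^n image: 2n position qubits, one gray value qubit,
    and one ancilla once n >= ancilla_from (32 x 32 corresponds to n = 5)."""
    return 2 * n + 1 + (1 if n >= ancilla_from else 0)

shots = 8192
for n in range(1, 10):
    qubits = frqi_qubits(n)
    states = 2 ** qubits
    print(f"{2**n}x{2**n}: {qubits} qubits, {states} states, "
          f"{shots / 4**n:.3f} shots per pixel")
```

For \(512\times 512\) and 8192 shots, this gives about 0.03 shots per pixel on average, so the vast majority of pixels cannot be observed even once.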

The circuit depth increases exponentially, too. This is due to the entanglement of the position qubits with the gray value qubit and the associated exponential increase of the number of CX-gates. See Fig. 16 for a visualization.
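The growth can be made concrete with a rough sketch. It assumes the textbook FRQI construction with one rotation per pixel, each controlled by all \(2n\) position qubits. The CX cost per rotation depends on the chosen decomposition (MCRY or MARY) and is represented here only by the hypothetical placeholder parameter `cx_per_control`, which does not reflect measured values.

```python
def controlled_rotations(n):
    """Number of multi-controlled rotations in the textbook FRQI circuit
    for a 2^n x 2^n image: one per pixel."""
    return 4 ** n

def cx_estimate(n, cx_per_control=2):
    """Rough CX count, assuming (hypothetically) that each rotation costs
    cx_per_control CX-gates per control qubit; real decompositions differ."""
    return controlled_rotations(n) * cx_per_control * 2 * n

for n in range(1, 10):
    print(f"{2**n}x{2**n}: {controlled_rotations(n)} rotations, "
          f"about {cx_estimate(n)} CX-gates under the placeholder cost")
```

Whatever the per-rotation cost, the factor \(4^n\) dominates, so a cheaper decomposition lowers the curve but does not change its exponential shape.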

Fig. 16

Circuit depth for varying image sizes and MCRY-/MARY-implementation. Values are mean values from 10 observations and shown in logarithmic scale. Used backends: a qasm_simulator, b ibmq_toronto

With qasm_simulator, the circuit depth can be determined up to an image size of \(32\times 32\) for the MCRY- and \(512\times 512\) for the MARY-implementation without exceeding the available memory (see Fig. 16a). However, actual execution is no longer possible from these image sizes onward with the 64 GB of RAM available in the classical machine used for generating and storing the circuits.

For the backend ibmq_toronto, the circuit depth can be calculated only for image sizes up to \(16\times 16\) for the MCRY- and \(64\times 64\) for the MARY-implementation, see Fig. 16b. This is because more gates are needed than on the simulator. Additionally, backend-specific properties such as coupling maps and errors are taken into account in the transpilation step, which increases the memory needed. For the real backend, circuit depths vary due to changing errors, which is why we average over 10 observations in Fig. 16b.

The circuit depth is significantly higher for the MCRY-implementation, which implies that more noise is accumulated in the process. Additionally, more memory has to be used, which limits the executability of the MCRY-implementation.

All in all, we need more and more memory to generate the circuits for increasing image sizes. In the end, this is the limiting factor. The image sizes that can currently be handled with 64 GB of RAM are shown in Table 2. For larger images, the available memory is exceeded and the job aborts. If we focus on the outcomes, the finite sampling shot noise increases the relative difference for the qasm_simulator. The additional noise from gates and the environment further restricts the maximal image size for the real backend. In total, in spite of all suggested improvements, the reconstruction of the image with FRQI is only possible for \(2\times 2\) images on the real backends.

Table 2 Current maximum executable and usable image sizes for MCRY- and MARY-implementation on qasm_simulator with 8192 shots and IBM’s backend ibmq_toronto, limited to 64 GB of memory

There are also significant differences in the execution times of the MCRY- and the MARY-implementation, as shown in Fig. 17. The higher circuit depth and number of gates in the MCRY-implementation result in higher execution times for both the qasm_simulator and IBM’s backend ibmq_toronto. This difference is more pronounced for larger image sizes.

Fig. 17

Execution time for varying image sizes and MCRY-/MARY-implementation. Values are mean values from 10 observations and shown in logarithmic scale. Used backends: a qasm_simulator with 8192 shots, b ibmq_toronto

8 Conclusion

In this paper, we investigate the practical use of one of the basic quantum image representations, namely FRQI. In particular, we determine the manageable image sizes both in terms of usable results and memory requirements. With exponentially increasing image size, the number of qubits increases only linearly. However, the number of possible states increases exponentially. As a result, even with the noise-free qasm_simulator, a fixed number of shots can yield output pixel values that cannot be distinguished from noise and therefore do not match those of the input image.

Furthermore, the number of gates required for encoding an image increases exponentially with the image size, too. Thus, the influence of per-gate errors increases at the same speed. Our simplified implementation saves a large number of gates, reduces errors that way, and enables faster computation. The basic idea is to replace the error-prone multi-controlled gates by simplified versions needing fewer CX-gates. With increasing image size, this strategy becomes more and more important, first to obtain less confounded results and second to push the feasible image sizes to new frontiers.

All quantum image representations and all quantum algorithms involving multi-controlled operations can benefit from the presented simplified implementation.