1 Introduction

Quantum computing is an emerging technology that is expected to provide speedups for various classically time-consuming problems. Following recent advances, researchers are now investigating its practical applications. Some fundamental quantum algorithms have evolved into solvers for more concrete numerical problems. For example, the Harrow–Hassidim–Lloyd algorithm [1] for solving systems of linear equations and its extensions [2,3,4,5,6,7,8,9,10] led to quantum solvers for ordinary differential equations (ODEs) [11,12,13,14] and partial differential equations (PDEs) [3, 15,16,17,18,19], including even nonlinear systems [20,21,22,23,24,25]. The practical use of these algorithms has been proposed in various fields such as epidemiology [20, 26], fluid dynamics [20], and financial derivative pricing [27]. Compared with existing classical methods, these quantum algorithms achieve exponential speedup with respect to the size of the problem. For an ODE, the problem size is the number of equations in the system; for a PDE, it is the dimension d of the domain of the solution function f, that is, the number of variables of f. Thus, they are among the most promising applications of quantum computing. In addition to the above algorithms, which run on fault-tolerant quantum computers (FTQCs), some algorithms for noisy intermediate-scale quantum (NISQ) devices also solve ODEs and PDEs [28,29,30,31,32,33,34,35,36,37,38,39,40,41,42].

However, when we consider these and other quantum solvers and the speedup they provide, we should pay attention to the meaning of the words “solve” and “speedup.” That is, we must resolve the issue of how to extract the function values from the quantum state, which might ruin the quantum speedup.

The details are as follows. As pointed out in [43], many quantum algorithms output the solution not as the classical numerical values we eventually want, but as a quantum state in which the solution is encoded in the amplitudes. For example, quantum algorithms for solving a PDE output the quantum state in the form of

$$\begin{aligned} |f\rangle :=\frac{1}{C}\sum _{i=0}^{N_{\textrm{gr}}-1} f(\vec {x}_i)|i\rangle ~, \end{aligned}$$
(1)

where f is the solution function, \(\vec {x}_0,\vec {x}_1,\ldots ,\vec {x}_{N_{\textrm{gr}}-1}\) are the \(N_\textrm{gr}\) grid points in the domain of f, \(|0\rangle ,|1\rangle ,\ldots ,|N_{\textrm{gr}}-1\rangle \) are the computational basis states, and C is the normalization factor. Reading out the values of the function as classical data from such a quantum state can often be a bottleneck. In the example of solving a PDE, the number of grid points \(N_{\textrm{gr}}\) is exponentially large with respect to the dimension d: naively, taking \(n_{\textrm{gr}}\) grid points in each dimension leads to \(N_{\textrm{gr}}=n_{\textrm{gr}}^d\) points in total. This makes the amplitude \(\frac{1}{C}f(\vec {x}_i)\) of each basis state in \(|f\rangle \) exponentially small unless the amplitude is localized on a few grid points. Consequently, when we try to retrieve the amplitude, and thus the function value \(f(\vec {x}_i)\) at the point \(\vec {x}_i\), using methods such as quantum amplitude estimation [44, 45], an exponentially large time overhead is incurred. Therefore, even if a quantum algorithm exists that generates the state \(|f\rangle \) exponentially faster than a classical algorithm outputs the value of \(f(\vec {x}_i)\), obtaining \(f(\vec {x}_i)\) through the quantum algorithm might not be faster than the classical method.

If we want to obtain the function values at many points, for example, to plot the values as a graph, the situation worsens. To obtain the function values at M points, the quantum algorithm must be repeated M times.

Motivated by this background, this study focuses on how to efficiently extract the function encoded in the amplitudes of the quantum state \(|f\rangle \), given an oracle \(O_f\) that generates it. In some cases, we can resolve this issue using the specific nature of the problem; for example, in the derivative pricing problem considered in [27], the martingale property of the derivative price is used to calculate the function value at a point. Nevertheless, it is desirable to devise methods that are generally applicable.

The method proposed in this study has two building blocks. First, we use an orthogonal function system. Orthogonal functions, such as trigonometric functions, Legendre polynomials, and Chebyshev polynomials, are sequences of functions orthogonal to each other with respect to some inner product, defined as the weighted integral of the product of two functions over the domain or as the sum of the values of the product at specific nodes. Orthogonal functions are often used for function approximation. Any function f satisfying some smoothness conditions is approximated as \(f\approx \sum _l a_l P_l\), a series in the orthogonal functions \(\{P_l\}_l\), with the coefficient \(a_l\) given by the inner product of f and \(P_l\). We expect that orthogonal function expansion can also be used in the quantum setting because, as explained later, the coefficient \(a_{l}\) is given by \(\langle P_l|f\rangle \) multiplied by a known factor, with \(|P_l\rangle \) being the quantum state in which \(P_l\) is encoded like \(|f\rangle \). Thus, by estimating \(\langle P_l|f\rangle \) for every l up to a sufficiently high order, we obtain the orthogonal function expansion \({\tilde{f}}\) of f, and then the approximate values of f at arbitrary points by evaluating \({\tilde{f}}\). This approach seems promising because we expect that \(\langle P_l|f\rangle \) is not exponentially small, unlike the amplitudes of the computational basis states in \(|f\rangle \) (see Sect. 2.3).

However, in the high-dimensional case, the above approach still suffers from high complexity. If D orthogonal functions \(\{P_l\}_{l\in [D]_0}\) (Footnote 1) suffice for an accurate approximation in the one-dimensional case, the naive way to achieve similar accuracy in the d-dimensional case is to use the tensorized functions \(\{P_{\vec {l}}\}_{\vec {l}\in [D]_0^d}\), where \(P_{\vec {l}}(x_1,\ldots ,x_d)=\prod _{i=1}^dP_{l_i}(x_i)\) for \(\vec {l}=(l_1,\ldots ,l_d)\). Because the total number of \(P_{\vec {l}}\)’s is then \(D^d\), obtaining the coefficients \(a_{\vec {l}}\) for all \(P_{\vec {l}}\)’s, and thus the orthogonal function expansion of f, has exponential complexity, and so does evaluating the resultant expansion. Although this is less serious than reading out the amplitudes in \(|f\rangle \) because we take \(D<n_{\textrm{gr}}\), the exponential dependence of the complexity on d remains.

This is where the second building block of our method comes in: tensor networks, especially the type called the matrix product state (MPS), which is simple and widely used (for reviews, see [46, 47]). A tensor network is an approximation scheme that represents a high-order tensor as a contraction of lower-order tensors. In some situations, it approximates the original tensor well while reducing the degrees of freedom (DOF) and data volume. Tensor networks were originally invented in quantum many-body physics to approximate wave functions in intractably high-dimensional Hilbert spaces; they are currently used to reduce complexity in various fields including function approximation [48,49,50,51,52,53,54]. A recent study [54] showed that, under certain conditions, the complexity of the tensor network approximation of a d-dimensional function does not scale exponentially with d, which indicates the powerful approximation ability of tensor networks.

Another advantage of tensor networks is their compatibility with quantum computing. That is, we can generate a quantum state in which a type of tensor network is amplitude-encoded using a simple quantum circuit [55]. Moreover, there is a general procedure for optimizing such a tensor network circuit to maximize the fidelity between the resulting state and a given state [56]. Therefore, we arrive at the following idea: given \(O_f\), we find a tensor network approximation of the coefficients \(a_{\vec {l}}\) in the orthogonal function expansion of f, and thus an approximate function of f, through quantum circuit optimization.

In the remainder of this paper, we show how this idea is concretely realized. We present the tensor-network-based quantum circuit and the optimization procedure for making the generated state close to \(|f\rangle \), based on [56]. The parameters in the tensor network are easily read out from the circuit. Note that this method can be used on both FTQCs and NISQ devices.

The remainder of this paper is organized as follows. Section 2 presents our method. Beginning with an explanation of the problem setting under consideration, we present the quantum circuit we use and the procedure to optimize it. Section 3 describes the numerical experiment we conducted to validate our method, which demonstrates that the method successfully approximates a finance-motivated multivariate function. Section 4 summarizes this paper.

2 Our method

2.1 Overview

Fig. 1

Overview of the quantum circuit used in the proposed method

Before providing a detailed explanation of the proposed method, we present a high-level overview. For the given function f, our goal is to obtain an approximation that is efficiently computable on a classical computer. For this purpose, we use a quantum circuit, an overview of which is given in Fig. 1. It consists of two parts. In the first half, we encode the coefficients \({\tilde{a}}_{\vec {l}}\) of the orthogonal function series, written in the form of an MPS, into the amplitudes of the quantum state. In the latter half, we apply the quantum circuit that encodes the values of the orthogonal functions into the amplitudes. These two operations yield the quantum state \(|\tilde{{\tilde{f}}}\rangle \), in which the approximation function \(\tilde{{\tilde{f}}}\), based on the MPS and the orthogonal function expansion, is encoded in the amplitudes. Then, we optimize the first half of the circuit such that \(|\tilde{{\tilde{f}}}\rangle \) approaches \(|f\rangle \), the state encoding f. As a result, we obtain the coefficients in the form of an MPS optimized to approximate f, and thus the approximation function \(\tilde{{\tilde{f}}}\), which is efficiently computable by virtue of the MPS.

The approximation problem addressed in this study is explained in more detail in Sect. 2.2, and the setting of the available oracles is described in Sect. 2.3. The MPS approximation of the coefficients is explained in Sect. 2.4, and encoding it into the quantum state, that is, the first half of the circuit, is described in Sect. 2.5. The entire circuit, including the latter half, and the optimization of the MPS are explained in Sect. 2.6.

2.2 Problem description

As the approximation target, we consider a real-valued function f on the d-dimensional hyperrectangle \(\Omega :=[L_1,U_1]\times \cdots \times [L_d,U_d]\), where, for \(i\in [d]\) (Footnote 2), \(L_i\) and \(U_i\) are real values such that \(L_i<U_i\).

For any \(i\in [d]\), we also consider orthogonal functions \(\{P^i_l\}_l\) on \([L_i,U_i]\) labeled by \(l\in {\mathbb {N}}_0:=\{0\}\cup {\mathbb {N}}\). These functions are characterized by the orthogonal relation that, for any \(l,l^\prime \in {\mathbb {N}}_0\),

$$\begin{aligned} \int _{L_i}^{U_i} P^i_{l}(x)P^i_{l^\prime }(x)w^i(x)\textrm{d}x=\delta _{l,l^\prime }, \end{aligned}$$
(2)

where the weight function \(w^i\) is defined on \([L_i,U_i]\) and takes nonnegative values, and \(\delta _{l,l^\prime }\) is the Kronecker delta. In addition, we hereafter assume that \(\{ P^i_l \}_l\) satisfy the following discrete orthogonal relation: for any \(D\in {\mathbb {N}}\), there exist \(n_{\textrm{gr}}\) points \(x_{i,0}<x_{i,1}<\cdots <x_{i,n_\textrm{gr}-1}\) in \([L_i,U_i]\), where \(n_{\textrm{gr}}\ge D\), such that, for any \(l,l^\prime \in [D]_0\),

$$\begin{aligned} \sum _{j=0}^{n_{\textrm{gr}}-1} P^i_{l}(x_{i,j})P^i_{l^\prime }(x_{i,j})=c^i_l\delta _{l,l^\prime } \end{aligned}$$
(3)

holds for some \(c^i_l>0\). We can consider Eq. (3) as the discrete approximation of Eq. (2), with \(w^i\) absorbed into the spacing of the grid points \(x_{i,j}\). For some orthogonal functions such as trigonometric functions and Chebyshev polynomials, the relationship in Eq. (3) holds strictly.

We define the tensorized orthogonal functions as follows: for any \(\vec {l}=(l_1,\ldots ,l_d)\in {\mathbb {N}}_0^d\) and \(\vec {x}=(x_1,\ldots ,x_d)\in {\mathbb {R}}^d\),

$$\begin{aligned} P_{\vec {l}}(\vec {x}):=\prod _{i=1}^d P^i_{l_i}(x_i). \end{aligned}$$
(4)

It follows from Eq. (3) that, with \(\vec {x}_{\vec {j}}\) defined as \(\vec {x}_{\vec {j}}:=(x_{1,j_1},\ldots ,x_{d,j_d})\in {\mathbb {R}}^d\) for any \(\vec {j}=(j_1,\ldots ,j_d)\in [n_{\textrm{gr}}]_0^d\), they satisfy the orthogonal relation

$$\begin{aligned} \sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d}P_{\vec {l}}\left( \vec {x}_{\vec {j}}\right) P_{\vec {l}^\prime }\left( \vec {x}_{\vec {j}}\right) =c_{\vec {l}}\delta _{\vec {l},\vec {l}^\prime } \end{aligned}$$
(5)

for any \(\vec {l}=(l_1,\ldots ,l_d)\) and \(\vec {l}^\prime =(l_1^\prime ,\ldots ,l_d^\prime )\) in \([D]_0^d\), where \(\delta _{\vec {l},\vec {l}^\prime }:=\prod _{i=1}^d \delta _{l_i,l_i^\prime }\) and \(c_{\vec {l}}:=\prod _{i=1}^d c_{l_i}^i\).

We can use these \(\{P_{\vec {l}}\}_{\vec {l}}\) to approximate f by the orthogonal function series

$$\begin{aligned} {\tilde{f}}(\vec {x}):= \sum _{\vec {l}\in [D]_0^d} a_{\vec {l}}P_{\vec {l}}(\vec {x}) \end{aligned}$$
(6)

with D such that \(\max _{\vec {x}\in \Omega } |f(\vec {x})-{\tilde{f}}(\vec {x})| \) is less than the desired threshold \(\epsilon \). Here, for any \(\vec {l}\in [D]_0^d\), the coefficient \(a_{\vec {l}}\) is given by

$$\begin{aligned} a_{\vec {l}} = \frac{1}{c_{\vec {l}}}\sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d} f\left( \vec {x}_{\vec {j}}\right) P_{\vec {l}}\left( \vec {x}_{\vec {j}}\right) , \end{aligned}$$
(7)

which follows from Eq. (3). It is known that this kind of series converges to f in the large D limit for some orthogonal functions (see [57] for the trigonometric series and [58] for the Chebyshev series).
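To make Eqs. (6) and (7) concrete in one dimension, here is a minimal sketch (our illustration, assuming NumPy; the Chebyshev system and the test function are our own choices) that computes the coefficients from the discrete orthogonal relation and evaluates the truncated series off the grid:

```python
import numpy as np

# One-dimensional instance of Eqs. (6)-(7) with Chebyshev polynomials:
# the coefficients a_l follow from the discrete orthogonal relation (3),
# which is exact at the Chebyshev nodes; f is a smooth test function.
f = lambda x: np.exp(x) * np.sin(3 * x)

n_gr, D = 32, 16
j = np.arange(n_gr)
x = np.cos(np.pi * (j + 0.5) / n_gr)                 # Chebyshev nodes
P = np.cos(np.outer(np.arange(D), np.arccos(x)))     # P[l, j] = T_l(x_j)

c = np.sum(P**2, axis=1)                             # c_l in Eq. (3)
a = (P @ f(x)) / c                                   # a_l as in Eq. (7)

# Evaluate the truncated series (6) off the grid and check the error
x_new = np.linspace(-1, 1, 101)
P_new = np.cos(np.outer(np.arange(D), np.arccos(x_new)))
f_tilde = a @ P_new
print(np.max(np.abs(f_tilde - f(x_new))))            # rapid (spectral) convergence
```

For this smooth f, the maximum error is already far below single precision at \(D=16\), illustrating the convergence mentioned above.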

2.3 Oracles to generate the function-encoding states

In this study, we assume the availability of the oracle \(O_f\) mentioned in the introduction, which we now define formally. Hereafter, we assume that the grid number \(n_{\textrm{gr}}\) satisfies \(n_{\textrm{gr}}=2^{m_{\textrm{gr}}}\) for some integer \(m_{\textrm{gr}}\). Then, we consider the system S consisting of d \(m_{\textrm{gr}}\)-qubit registers and the unitary operator \(O_f\) that acts on S as

$$\begin{aligned} O_f |0\rangle ^{\otimes d}=|f\rangle :=\frac{1}{C} \sum _{\vec {j}=(j_1,\ldots ,j_d)\in [n_{\textrm{gr}}]_0^d} f\left( \vec {x}_{\vec {j}}\right) |j_1\rangle \cdots |j_d\rangle . \end{aligned}$$
(8)

For any \(n\in {\mathbb {N}}_0\), we denote by \(|n\rangle \) the computational basis state, on a quantum register with a sufficient number of qubits, whose bit string corresponds to the binary representation of n. In Eq. (8), \(|j\rangle \) with \(j\in [n_{\textrm{gr}}]_0\) is the computational basis state on an \(m_\textrm{gr}\)-qubit register. Furthermore, C in Eq. (8) is defined as

$$\begin{aligned} C:= \sqrt{\sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d} \left( f\left( \vec {x}_{\vec {j}}\right) \right) ^2}. \end{aligned}$$
(9)

We also assume the availability of the oracles \(V^1_{\textrm{OF}},\ldots ,V^d_{\textrm{OF}}\), each of which acts on an \(m_{\textrm{gr}}\)-qubit register as

$$\begin{aligned} V^i_{\textrm{OF}}|l\rangle = |P^i_l\rangle := \frac{1}{\sqrt{c^i_l}}\sum _{j=0}^{n_{\textrm{gr}}-1} P^i_{l}(x_{i,j})|j\rangle \end{aligned}$$
(10)

for any \(l\in [D]_0\). \(V^i_{\textrm{OF}}|D\rangle ,\ldots ,V^i_\textrm{OF}|n_{\textrm{gr}}-1\rangle \) may be any states as long as \(V^i_{\textrm{OF}}\) is unitary. In principle, these oracles are constructible because of the orthogonal relation (3). We discuss their implementation in “Appendix A.”
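Although the quantum implementation is deferred to Appendix A, the content of \(V^i_{\textrm{OF}}\) as a matrix can be illustrated classically. The sketch below (ours, assuming NumPy and the Chebyshev system) assembles a unitary whose first D columns are the states \(|P^i_l\rangle \) of Eq. (10); the remaining columns are one arbitrary choice of completion, obtained here by QR orthogonalization:

```python
import numpy as np

# Classical matrix picture of V_OF in Eq. (10): the first D columns are
# the normalized states |P_l>, the rest an arbitrary unitary completion.
n_gr, D = 8, 4
j = np.arange(n_gr)
x = np.cos(np.pi * (j + 0.5) / n_gr)                  # Chebyshev nodes
P = np.cos(np.outer(np.arange(D), np.arccos(x)))      # P[l, j] = T_l(x_{i,j})

# Columns |P_l> = (1/sqrt(c_l)) sum_j P_l(x_j) |j>; they are orthonormal
# by the discrete orthogonal relation (3).
cols = (P / np.sqrt(np.sum(P**2, axis=1, keepdims=True))).T

# Complete to a full n_gr x n_gr unitary: QR-orthogonalize [cols | random],
# fixing column signs so the first D columns are exactly |P_0>, ..., |P_{D-1}>.
rng = np.random.default_rng(0)
M = np.hstack([cols, rng.normal(size=(n_gr, n_gr - D))])
Q, R = np.linalg.qr(M)
V_OF = Q * np.sign(np.diag(R))

assert np.allclose(V_OF.T @ V_OF, np.eye(n_gr))       # V_OF is unitary (orthogonal)
assert np.allclose(V_OF[:, :D], cols)                 # V_OF |l> = |P_l> for l < D
```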

Now, let us elaborate on why reading out \(f\left( \vec {x}_{\vec {j}}\right) \) for \(\vec {j}\in [n_{\textrm{gr}}]_0^d\) from \(|f\rangle \) is difficult, whereas obtaining it through the orthogonal function expansion is more promising, as mentioned briefly in the introduction. We rewrite the amplitude in \(|f\rangle \) as

$$\begin{aligned} \frac{f\left( \vec {x}_{\vec {j}}\right) }{C}=\frac{1}{\sqrt{N_{\textrm{gr}}}}\frac{f\left( \vec {x}_{\vec {j}}\right) }{{\bar{f}}}, \end{aligned}$$
(11)

where \({\bar{f}}:=\sqrt{\frac{1}{N_{\textrm{gr}}}\sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d} \left( f\left( \vec {x}_{\vec {j}}\right) \right) ^2}\) is the root mean square of f over the grid points. The amplitude is suppressed by the factor \(1/\sqrt{N_{\textrm{gr}}}\), which is exponentially small with respect to d; thus, the amplitude is exponentially small unless f is so localized at \(\vec {x}_{\vec {j}}\) that \(f\left( \vec {x}_{\vec {j}}\right) /{\bar{f}}\) is comparable to \(\sqrt{N_{\textrm{gr}}}\). However, we note that, for any \(\vec {l}\in [D]_0^d\), we have

$$\begin{aligned} \langle P_{\vec {l}} | f\rangle = \frac{1}{C\sqrt{c_{\vec {l}}}}\sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d} f\left( \vec {x}_{\vec {j}}\right) P_{\vec {l}}\left( \vec {x}_{\vec {j}}\right) = \frac{\sqrt{c_{\vec {l}}}}{C}a_{\vec {l}}, \end{aligned}$$
(12)

where \(|P_{\vec {l}}\rangle :=|P^1_{l_1}\rangle \cdots |P^d_{l_d}\rangle \). Because both C and \(\sqrt{c_{\vec {l}}}\) are the root mean squares of some functions over the grid points, up to the common factor \(\sqrt{N_{\textrm{gr}}}\), we expect that their ratio is O(1) and that we can efficiently obtain the expansion coefficient \(a_{\vec {l}}\), and then the approximation \({\tilde{f}}\) of f in Eq. (6), by estimating \(\langle P_{\vec {l}} | f\rangle \), without suffering from an exponential suppression factor such as \(1/\sqrt{N_{\textrm{gr}}}\). However, we do not pursue this direction directly in this study, because finding all the expansion coefficients suffers from the exponential increase in the number of coefficients in the high-dimensional case, as explained in the introduction.
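The contrast between Eqs. (11) and (12) can be seen numerically. In the following sketch (our illustration; the two-dimensional Gaussian-like test function and the Chebyshev grid are our own choices), the largest single amplitude of \(|f\rangle \) already sits at the \(1/\sqrt{N_{\textrm{gr}}}\) scale for \(N_{\textrm{gr}}=64^2\), whereas the overlap with the lowest tensorized basis state is O(1):

```python
import numpy as np

# For a smooth, non-localized 2D function, each amplitude of |f> carries
# the 1/sqrt(N_gr) suppression of Eq. (11), while the overlap <P_l|f> of
# Eq. (12) stays O(1) for low orders l.
n_gr = 64                                     # grid points per dimension; N_gr = n_gr^2
j = np.arange(n_gr)
x = np.cos(np.pi * (j + 0.5) / n_gr)          # Chebyshev nodes on [-1, 1]

X1, X2 = np.meshgrid(x, x, indexing="ij")
fvals = np.exp(-(X1**2 + X2**2))              # smooth test function

C = np.linalg.norm(fvals)                     # normalization factor, Eq. (9)
amp = fvals / C                               # amplitudes of |f>
max_amp = np.max(np.abs(amp))                 # suppressed at the 1/sqrt(N_gr) scale

# Overlap with the lowest tensorized Chebyshev state |P_(0,0)>:
P0 = np.ones(n_gr) / np.sqrt(n_gr)            # normalized T_0 on the grid
overlap = P0 @ amp @ P0                       # <P_(0,0)|f>, no 1/sqrt(N_gr) factor

print(max_amp, abs(overlap))
```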

2.4 Matrix product state

Thus, we consider using a tensor network. Among the various types of tensor networks, we use the MPS, also known as the tensor train, which is simple but powerful and therefore widely used in various fields; a basic introduction to the MPS is presented in “Appendix B.” In the MPS scheme, the order-d tensor \(a_{\vec {l}}\in {\mathbb {R}}^{\overbrace{D\times \cdots \times D}^d}\) is approximated by

$$\begin{aligned} {\tilde{a}}_{\vec {l}}:= \sum _{k_1=1}^{r}\cdots \sum _{k_{d-2}=1}^{r} U^{1}_{l_1,k_1} U^{2}_{k_1,l_2,k_2} \cdots U^{d-2}_{k_{d-3},l_{d-2},k_{d-2}} U^{d-1}_{k_{d-2},l_{d-1},l_d}, \end{aligned}$$
(13)

where \(r\in {\mathbb {N}}\) is called the bond dimension, \(U^1\in {\mathbb {R}}^{D \times r}\), \(U^i\in {\mathbb {R}}^{r\times D \times r}\) for \(i\in \{2,\ldots ,d-2\}\), and \(U^{d-1}\in {\mathbb {R}}^{r \times D\times D}\) (Footnote 3). We also impose the following conditions: for any \(i\in \{2,\ldots ,d-2\}\) and \(k_{i-1},k_{i-1}^\prime \in [r]\),

$$\begin{aligned} \sum _{l_i=0}^{D-1}\sum _{k_i=1}^r U^i_{k_{i-1},l_i,k_i}U^i_{k_{i-1}^\prime ,l_i,k_i}=\delta _{k_{i-1},k_{i-1}^\prime } \end{aligned}$$
(15)

holds, and, for any \(k_{d-2},k_{d-2}^\prime \in [r]\),

$$\begin{aligned} \sum _{l_{d-1}=0}^{D-1}\sum _{l_d=0}^{D-1} U^{d-1}_{k_{d-2},l_{d-1},l_d}U^{d-1}_{k_{d-2}^{\prime },l_{d-1},l_d}=\delta _{k_{d-2},k_{d-2}^\prime } \end{aligned}$$
(16)

holds. We call this form of the MPS the right canonical form; it can always be imposed by the procedure explained in “Appendix B.”
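The right canonical form has a simple classical realization: a right-to-left sweep of QR decompositions, in the spirit of the procedure in Appendix B. The sketch below (ours, assuming NumPy and random cores of our own choosing) performs the sweep and checks both the right-orthogonality conditions above and that the represented tensor is unchanged:

```python
import numpy as np

# Bring a random MPS of the form (13) into right-canonical form by a
# right-to-left sweep of QR decompositions, then verify (i) the
# right-orthogonality of U^2, ..., U^{d-1} and (ii) that the represented
# tensor is unchanged (only the gauge changes).
rng = np.random.default_rng(1)
d, D, r = 5, 4, 2

Us = [rng.normal(size=(D, r))]                            # U^1
Us += [rng.normal(size=(r, D, r)) for _ in range(d - 3)]  # U^2 ... U^{d-2}
Us += [rng.normal(size=(r, D, D))]                        # U^{d-1}

def full_tensor(cores):
    """Contract the cores into the full order-d coefficient tensor."""
    t = cores[0]
    for U in cores[1:]:
        t = np.tensordot(t, U, axes=([-1], [0]))          # contract one bond index
    return t

ref = full_tensor(Us)                                     # tensor before the sweep

for i in range(d - 2, 0, -1):                             # sweep U^{d-1} -> U^2
    shape = Us[i].shape
    Mat = Us[i].reshape(shape[0], -1)                     # group (l_i, k_i) together
    Q, R = np.linalg.qr(Mat.T)                            # Mat = R.T @ Q.T; Q.T has orthonormal rows
    Us[i] = Q.T.reshape(shape)
    Us[i - 1] = Us[i - 1] @ R.T                           # absorb the remainder leftward

for U in Us[1:]:
    Mat = U.reshape(U.shape[0], -1)
    assert np.allclose(Mat @ Mat.T, np.eye(U.shape[0]))   # right-orthogonality
assert np.allclose(full_tensor(Us), ref)                  # same tensor, new gauge
```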

The representation in Eq. (13) indeed reduces the DOF compared with the original order-d tensor \(a_{\vec {l}}\). The total number of components in \(U^1,\ldots ,U^{d-1}\) is

$$\begin{aligned} rD + (d-3)r^2D + rD^2, \end{aligned}$$
(17)

which is smaller than \(D^d\), the number of components in \(a_{\vec {l}}\), unless r is as large as \(D^{O(d)}\). Furthermore, with the coefficients \(a_{\vec {l}}\) in Eq. (6) represented as in Eq. (13), computing the approximation of the function f becomes efficient. We now approximate f by

$$\begin{aligned} \tilde{{\tilde{f}}}(\vec {x})= & {} \sum _{\vec {l}\in [D]_0^d} {\tilde{a}}_{\vec {l}} P_{\vec {l}}(\vec {x}) = \sum _{l_1=0}^{D-1}\cdots \sum _{l_{d}=0}^{D-1}\sum _{k_1=1}^{r}\cdots \sum _{k_{d-2}=1}^{r} U^{1}_{l_1,k_1} U^{2}_{k_1,l_2,k_2} \nonumber \\{} & {} \cdots U^{d-2}_{k_{d-3},l_{d-2},k_{d-2}} U^{d-1}_{k_{d-2},l_{d-1},l_d} P^1_{l_1}(x_1)\cdots P^d_{l_d}(x_d). \end{aligned}$$
(18)

This can be computed as

$$\begin{aligned} \tilde{{\tilde{f}}}(\vec {x})= & {} \sum _{k_1=1}^{r}\cdots \sum _{k_{d-2}=1}^{r} \left( \sum _{l_1=0}^{D-1}U^{1}_{l_1,k_1}P^1_{l_1}(x_1)\right) \times \left( \sum _{l_2=0}^{D-1}U^{2}_{k_1,l_2,k_2}P^2_{l_2}(x_2)\right) \nonumber \\{} & {} \times \cdots \times \left( \sum _{l_{d-2}=0}^{D-1}U^{d-2}_{k_{d-3},l_{d-2},k_{d-2}}P^{d-2}_{l_{d-2}}(x_{d-2})\right) \nonumber \\{} & {} \times \left( \sum _{l_{d-1},l_d=0}^{D-1}U^{d-1}_{k_{d-2},l_{d-1},l_{d}}P^{d-1}_{l_{d-1}}(x_{d-1})P^d_{l_d}(x_d)\right) , \end{aligned}$$
(19)

that is, we first contract \(U^i_{k_{i-1},l_i,k_i}\) and \(P^i_{l_i}\) with respect to \(l_i\) for each \(i\in [d]\), and then take contractions with respect to \(k_1,\ldots ,k_{d-2}\). In this procedure, the number of arithmetic operations is

$$\begin{aligned} O(dr^2D+rD^2), \end{aligned}$$
(20)

which is much smaller than the \(O(D^d)\) cost of computing Eq. (6) with a general \(a_{\vec {l}}\) that has no specific structure.
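The contraction order behind Eq. (19) can be sketched as follows (our illustration, assuming NumPy, random MPS cores, and the Chebyshev system); each evaluation touches every core once, in line with the \(O(dr^2D+rD^2)\) count in Eq. (20):

```python
import numpy as np

# Evaluate the MPS-form approximation at a point x by the sequential
# contractions of Eq. (19), touching each core exactly once.
rng = np.random.default_rng(2)
d, D, r = 6, 4, 3

U1 = rng.normal(size=(D, r))                                # U^1
Umid = [rng.normal(size=(r, D, r)) for _ in range(d - 3)]   # U^2 ... U^{d-2}
Ulast = rng.normal(size=(r, D, D))                          # U^{d-1}

def cheb(x, D):
    """(P_0(x), ..., P_{D-1}(x)) for the Chebyshev system."""
    return np.cos(np.arange(D) * np.arccos(x))

def f_tilde(xs):
    v = cheb(xs[0], D) @ U1                          # contract over l_1: a length-r vector
    for i, U in enumerate(Umid):                     # contract over l_i, then the bond k_{i-1}
        v = v @ np.tensordot(U, cheb(xs[i + 1], D), axes=([1], [0]))
    # the last core carries the two degree indices l_{d-1} and l_d
    last = np.einsum("kab,a,b->k", Ulast, cheb(xs[-2], D), cheb(xs[-1], D))
    return v @ last                                  # contract the final bond

value = f_tilde(np.full(d, 0.3))                     # O(d r^2 D + r D^2) work per point
```

For these small sizes, the result can be checked against the brute-force \(O(D^d)\) contraction of the full coefficient tensor.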

2.5 Quantum circuit to generate the tensor network state

Fig. 2

Diagram of the quantum circuit \(V_{\textrm{MPS}}\) that generates the MPS-encoded state (21) in the case of \(d=4\). All the \(dm_{\textrm{deg}}\) qubits are initialized to \(|0\rangle \). The subsystems consisting of portions of the qubits are indicated at the left end: \(S_{\textrm{deg},i}\) consists of the \(m_{\textrm{deg}}\) qubits from the \(((i-1)m_\textrm{deg}+1)\)-th to the \(im_{\textrm{deg}}\)-th, and \(S_{\textrm{deg},i}^\prime \) consists of the \(m_{\textrm{deg}}+m_{\textrm{BD}}\) qubits from the \(((i-1)m_{\textrm{deg}}+1)\)-th to the \((im_{\textrm{deg}}+m_{\textrm{BD}})\)-th, except that \(S_{\textrm{deg},d-1}^\prime \) consists of the last \(2m_{\textrm{deg}}\) qubits

The quantum circuit that generates an MPS-encoded quantum state is shown in Fig. 2. First, we prepare \(dm_{\textrm{deg}}\) qubits initialized to \(|0\rangle \), where we assume that \(D=2^{m_\textrm{deg}}\) holds for some \(m_{\textrm{deg}}\in {\mathbb {N}}\). Labeling them with the integers \(1,\ldots ,dm_{\textrm{deg}}\), for each \(i\in [d]\), we denote the system of the \(((i-1)m_{\textrm{deg}}+1)\)-th to \(im_\textrm{deg}\)-th qubits by \(S_{\textrm{deg},i}\). In addition, assuming that \(r=2^{m_{\textrm{BD}}}\) also holds for some \(m_{\textrm{BD}}\in {\mathbb {N}}\), for each \(i\in [d-2]\), we denote the system consisting of \(S_{\textrm{deg},i}\) and the first \(m_{\textrm{BD}}\) qubits in \(S_{\textrm{deg},i+1}\) by \(S_{\textrm{deg},i}^\prime \), and the system of the last \(2m_{\textrm{deg}}\) qubits by \(S_{\textrm{deg},d-1}^\prime \). Then, we apply the quantum gates \(V^1,\ldots ,V^{d-1}\) on \(S_{\textrm{deg},1}^\prime ,\ldots ,S_{\textrm{deg},d-1}^\prime \), respectively, in this order.

Let us denote by \(V_{\textrm{MPS}}\) the unitary that corresponds to the whole of the above quantum circuit, which is depicted in Fig. 2. Then, \(V_{\textrm{MPS}}\) generates the MPS-encoded state

$$\begin{aligned} V_{\textrm{MPS}}|0\rangle ^{\otimes d}= & {} |{\tilde{a}}\rangle :=\sum _{l_1=0}^{D-1}\cdots \sum _{l_{d}=0}^{D-1}\sum _{k_1=1}^{r}\cdots \sum _{k_{d-2}=1}^{r} U^{1}_{l_1,k_1} U^{2}_{k_1,l_2,k_2} \nonumber \\{} & {} \cdots U^{d-2}_{k_{d-3},l_{d-2},k_{d-2}} U^{d-1}_{k_{d-2},l_{d-1},l_d}|l_1\rangle \cdots |l_d\rangle , \end{aligned}$$
(21)

which is close to the state

$$\begin{aligned} |a\rangle :=\sqrt{\frac{1}{\sum _{\vec {l}\in [D]_0^d} |a_{\vec {l}}|^2}}\sum _{\vec {l}\in [D]_0^d} a_{\vec {l}}|l_1\rangle \cdots |l_d\rangle \end{aligned}$$
(22)

that encodes the true coefficients \(a_{\vec {l}}\) if \({\tilde{a}}_{\vec {l}}\) in Eq. (13) approximates \(a_{\vec {l}}\) well. Here, \(|l\rangle \) with \(l\in [D]_0\) denotes a computational basis state on each of \(S_{\textrm{deg},1},\ldots ,S_{\textrm{deg},d}\). In addition, we associate the entries of \(U^1,\ldots ,U^{d-1}\) with those of the unitaries \(V^1,\ldots ,V^{d-1}\), respectively, as follows:

$$\begin{aligned} U^1_{l_1,k_1}=\langle l_1+Dk_1|V^1|0\rangle \end{aligned}$$
(23)

for any \(l_1\in [D]_0\) and \(k_1\in [r]\),

$$\begin{aligned} U^i_{k_{i-1},l_i,k_i}=\langle l_i+Dk_i|V^i|k_{i-1}\rangle \end{aligned}$$
(24)

for any \(i\in \{2,\ldots ,d-2\},l_i\in [D]_0\) and \(k_{i-1},k_i\in [r]\), and

$$\begin{aligned} U^{d-1}_{k_{d-2},l_{d-1},l_{d}}=\langle l_{d-1}+Dl_{d}|V^{d-1}|k_{d-2}\rangle \end{aligned}$$
(25)

for any \(l_{d-1},l_d\in [D]_0\) and \(k_{d-2}\in [r]\), where \(|n\rangle \) with \(n\in {\mathbb {N}}_0\) denotes the computational basis state on the corresponding one of \(S_{\textrm{deg},1}^\prime ,\ldots ,S_{\textrm{deg},d-1}^\prime \). Note that each \(U^i\), which is not itself a unitary matrix, is realized as a block in \(V^i\), a component unitary in the circuit \(V_{\textrm{MPS}}\) in Fig. 2. Although \(V^1,\ldots ,V^{d-1}\) have additional entries that do not appear in Eqs. (23) to (25), these do not affect the state \(|{\tilde{a}}\rangle \) because all qubits are initialized to \(|0\rangle \).

Note that \(U^2,\ldots ,U^{d-1}\) in Eqs. (24) and (25) automatically satisfy the conditions (15) and (16) because of the unitarity of \(V^2,\ldots ,V^{d-1}\). However, the unitarity of \(V^1\) imposes the constraint

$$\begin{aligned} \sum _{l_1=0}^{D-1}\sum _{k_1=1}^r |U^1_{l_1,k_1}|^2 = 1 \end{aligned}$$
(26)

on \(U^1\) in Eq. (23). This implies that, with such a \(U^1\), the MPS-based approximation \(\tilde{{\tilde{f}}}\) in Eq. (18) lacks the DOF of an overall factor; therefore, although we can express the functional form of f by \(\tilde{{\tilde{f}}}\), we cannot adjust the magnitude of \(\tilde{{\tilde{f}}}\) to fit f. Conversely, if we have some estimate C of the ratio of f to \(\tilde{{\tilde{f}}}\), we can approximate f by \(C\tilde{{\tilde{f}}}\). This issue is addressed in Sect. 2.6.

2.6 Optimization of the quantum circuit

We now consider how to optimize the quantum circuit \(V_{\textrm{MPS}}\) and obtain an MPS-based function approximation.

First, we extend the circuit \(V_{\textrm{MPS}}\) in Sect. 2.5 using \(\{V^i_{\textrm{OF}}\}_i\) as follows. For each \(i\in [d]\), we add \(m_{\textrm{gr}}-m_{\textrm{deg}}\) qubits after the \(im_{\textrm{deg}}\)-th qubit of the original circuit and denote the system consisting of \(S_{\textrm{deg},i}\) and the added qubits by \(S_{\textrm{gr},i}\). Note that the resultant system is the same as the system S for \(O_f\), which consists of d \(m_{\textrm{gr}}\)-qubit registers. We then perform \(V^i_{\textrm{OF}}\) on \(S_{\textrm{gr},i}\). The resultant circuit \(V_\textrm{App}\) is shown in Fig. 3. Note that this circuit generates

$$\begin{aligned} V_{\textrm{App}}|0\rangle ^{\otimes d}=|\tilde{{\tilde{f}}}\rangle:= & {} \sum _{l_1=0}^{D-1}\cdots \sum _{l_{d}=0}^{D-1}\sum _{k_1=1}^{r}\cdots \sum _{k_{d-2}=1}^{r} \sum _{j_1=0}^{n_{\textrm{gr}}-1}\cdots \sum _{j_d=0}^{n_{\textrm{gr}}-1}U^{1}_{l_1,k_1} U^{2}_{k_1,l_2,k_2}\nonumber \\{} & {} \cdots U^{d-2}_{k_{d-3},l_{d-2},k_{d-2}} U^{d-1}_{k_{d-2},l_{d-1},l_d}P^1_{l_1}(x_{1,j_1})\cdots P^d_{l_d}(x_{d,j_d})|j_1\rangle \cdots |j_d\rangle , \nonumber \\ \end{aligned}$$
(27)

where \(|n\rangle \) with \(n\in {\mathbb {N}}_0\) now denotes the computational basis state on each of \(S_{\textrm{gr},1},\ldots ,S_{\textrm{gr},d}\). That is, this is the quantum state that amplitude-encodes \(\tilde{{\tilde{f}}}\) in Eq. (18), the approximation of f by the orthogonal function expansion with the MPS-approximated coefficients. Therefore, if we obtain a \(V_{\textrm{App}}\) that generates \(|\tilde{{\tilde{f}}}\rangle \) close to \(|f\rangle \), we also obtain \(\{U^i\}_i\) for which \(\tilde{{\tilde{f}}}\) in Eq. (18) approximates f well, at least on the grid points, up to the overall factor C, by reading out their entries from the quantum gates \(\{V^i\}_i\) in \(V_{\textrm{App}}\).

Next, we consider how to obtain such \(V_{\textrm{App}}\), especially \(\{V^i\}_i\) in it. We aim to maximize the fidelity

$$\begin{aligned} F=\langle f | \tilde{{\tilde{f}}}\rangle . \end{aligned}$$
(28)

Note that maximizing F is equivalent to minimizing the sum of the squared differences between the two normalized functions

$$\begin{aligned} \sum _{\vec {j}\in [n_{\textrm{gr}}]_0^d} \left( \frac{f(\vec {x}_{\vec {j}})}{C}-\tilde{{\tilde{f}}}(\vec {x}_{\vec {j}})\right) ^2, \end{aligned}$$
(29)

which, in the large \(n_{\textrm{gr}}\) limit, is equivalent to

$$\begin{aligned} \int _{\Omega } \left( \frac{f(\vec {x})}{C}-\tilde{{\tilde{f}}}(\vec {x})\right) ^2 w^1(x_1)\cdots w^d(x_d) \textrm{d}\vec {x}, \end{aligned}$$
(30)

that is, the squared \(L^2\) norm of \(\frac{f}{C}-\tilde{{\tilde{f}}}\), a common metric in function approximation.

Fig. 3

Diagram of the quantum circuit \(V_{\textrm{App}}\) that generates the state (27), in which the approximation \(\tilde{{\tilde{f}}}\) of f in Eq. (18) is amplitude-encoded. \(V_{\textrm{MPS}}\) is the circuit depicted in Fig. 2; the lines bypassing it are not acted on by it. All the \(dm_{\textrm{gr}}\) qubits are initialized to \(|0\rangle \). The subsystems consisting of portions of the qubits, indicated at the left end, are described in the body text

The procedure for maximizing F is similar to that presented in [56]. We optimize the gates \(V^i\) alternately; that is, we optimize each \(V^i\) with the others fixed, setting i to \(1,2,\ldots ,d-1\) in turn. This loop can be repeated an arbitrary number of times. The optimization step for \(V^i\) proceeds as follows. We define

$$\begin{aligned} F_i = \textrm{Tr}_{{\bar{S}}^\prime _{\textrm{deg},i}} \left[ |\Psi _{i+1}\rangle \langle \Phi _{i-1}| \right] . \end{aligned}$$
(31)

Here, \({\bar{S}}^\prime _{\textrm{deg},i}\) denotes the system consisting of the qubits except those in \(S^\prime _{\textrm{deg},i}\), and, for any subsystem s in S, \(\textrm{Tr}_{s}\) denotes the partial trace over the Hilbert space corresponding to s. \(|\Psi _{i+1}\rangle \) is defined as

$$\begin{aligned} |\Psi _{i+1}\rangle = {\left\{ \begin{array}{ll} \left( V^{i+1} \otimes I_{{\bar{S}}^\prime _{\textrm{deg},i+1}}\right) ^\dagger \cdots \left( V^{d-1} \otimes I_{{\bar{S}}^\prime _{\textrm{deg},d-1}}\right) ^\dagger \left( V^1_{\textrm{OF}}\otimes \cdots \otimes V^{d}_{\textrm{OF}}\right) ^\dagger |f\rangle &{} \text {for } i=1,\ldots ,d-2, \\ \left( V^1_{\textrm{OF}}\otimes \cdots \otimes V^{d}_{\textrm{OF}}\right) ^\dagger |f\rangle &{} \text {for } i=d-1, \end{array}\right. } \end{aligned}$$
(32)

where \(I_{{\bar{S}}^\prime _{\textrm{deg},i}}\) denotes the identity operator on \({\bar{S}}^\prime _{\textrm{deg},i}\), and thus \(V^{i} \otimes I_{{\bar{S}}^\prime _{\textrm{deg},i}}\) denotes the i-th block in \(V_{\textrm{MPS}}\) in Fig. 2. \(|\Phi _{i-1}\rangle \) is defined as

$$\begin{aligned} |\Phi _{i-1}\rangle = {\left\{ \begin{array}{ll} \left( V^{i-1} \otimes I_{{\bar{S}}^\prime _{\textrm{deg},i-1}} \right) \cdots \left( V^{1} \otimes I_{{\bar{S}}^\prime _{\textrm{deg},1}} \right) |0\rangle ^{\otimes d} &{} \text {for } i=2,\ldots ,d-1, \\ |0\rangle ^{\otimes d} &{} \text {for } i=1. \end{array}\right. } \end{aligned}$$
(33)

We can regard this \(F_i\) as an \(M\times M\) matrix, where \(M=rD\) for \(i\in [d-2]\) and \(M=D^2\) for \(i=d-1\). Its entries are given by \(\langle l|F_i|l^\prime \rangle \), where \(|l\rangle ,|l^\prime \rangle \in \{|0\rangle ,|1\rangle ,\ldots ,|M-1\rangle \}\) are now the computational basis states on \(S^\prime _{\textrm{deg},i}\). Supposing that we know \(F_i\), we perform its singular-value decomposition (SVD)

$$\begin{aligned} F_i=XDY, \end{aligned}$$
(34)

where X and Y are \(M\times M\) unitaries and D is the diagonal matrix having the singular values of \(F_i\) as its diagonal entries (this D should not be confused with the expansion degree D). Finally, we update \(V^i\) by

$$\begin{aligned} V^i = XY. \end{aligned}$$
(35)
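To illustrate the update (34)–(35), the following minimal NumPy sketch (with a random stand-in for \(F_i\); not the implementation used in our experiment) computes the SVD and forms \(V^i=XY\), which is the orthogonal matrix closest to \(F_i\) in Frobenius norm and hence maximizes \(\textrm{Tr}[(V^i)^\dagger F_i]\).

```python
import numpy as np

M = 8
rng = np.random.default_rng(0)
F = rng.normal(size=(M, M))    # random stand-in for the real matrix F_i

# Eq. (34): F = X @ diag(s) @ Y with X, Y orthogonal (real unitaries)
X, s, Y = np.linalg.svd(F)

# Eq. (35): the update V^i = XY
V = X @ Y

assert np.allclose(V @ V.T, np.eye(M))         # V is orthogonal
assert np.isclose(np.trace(V.T @ F), s.sum())  # overlap attains its maximum, the sum of singular values
```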

We obtain \(F_i\) through Pauli decomposition and the Hadamard test [56]. We set \(m=\log _2 M\), and, for any \(k\in [m]\), we denote the identity operator, the Pauli-X gate, the Pauli-Y gate, and the Pauli-Z gate on the k-th qubit in \(S^\prime _{\textrm{deg},i}\) by \({\hat{\sigma }}^k_0,{\hat{\sigma }}^k_1,{\hat{\sigma }}^k_2\) and \({\hat{\sigma }}^k_3\), respectively. We also define

$$\begin{aligned} {\hat{\sigma }}_{\vec {\alpha }}:= {\hat{\sigma }}^1_{\alpha _1}\otimes \cdots \otimes {\hat{\sigma }}^{m}_{\alpha _{m}} \end{aligned}$$
(36)

for any \(\vec {\alpha }=(\alpha _1,\ldots ,\alpha _{m})\in \{0,1,2,3\}^{m}\). Then, we can always decompose \(F_i\) as

$$\begin{aligned} F_i = \sum _{\vec {\alpha }\in {\mathcal {A}}}{\tilde{F}}_{i,\vec {\alpha }} {\hat{\sigma }}_{\vec {\alpha }}, \end{aligned}$$
(37)

with \({\tilde{F}}_{i,\vec {\alpha }}\in {\mathbb {R}}\), where, rather than \(\{0,1,2,3\}^{m}\), the index tuple \(\vec {\alpha }\) runs over the subset

$$\begin{aligned} {\mathcal {A}}:=\left\{ \vec {\alpha }\in \{0,1,2,3\}^{m} \ \big | \ \text {the number of 2's in } \alpha _1,\ldots ,\alpha _m \text { is even} \right\} \end{aligned}$$
(38)

because \(F_i\) is real. Since

$$\begin{aligned} \textrm{Tr}_{S^\prime _{\textrm{deg},i}} \left[ F_i {\hat{\sigma }}_{\vec {\alpha }}\right] =M{\tilde{F}}_{i,\vec {\alpha }} \end{aligned}$$
(39)

holds, we obtain \({\tilde{F}}_{i,\vec {\alpha }}\) by estimating \(\textrm{Tr}_{S^\prime _{\textrm{deg},i}} \left[ F_i {\hat{\sigma }}_{\vec {\alpha }}\right] \). Noting that

$$\begin{aligned} \textrm{Tr}_{S^\prime _{\textrm{deg},i}} \left[ F_i {\hat{\sigma }}_{\vec {\alpha }}\right] = \langle \Phi _{i-1}| {\hat{\sigma }}_{\vec {\alpha }} |\Psi _{i+1}\rangle , \end{aligned}$$
(40)

we can estimate this using the Hadamard test, with the circuit \(V_{\textrm{MPS}}\) in which \(V^i\) is replaced by \({\hat{\sigma }}_{\vec {\alpha }}\). In one update of each \(V^i\), the total number of quantities of the form (40) to be estimated is \(|{\mathcal {A}}|=M(M+1)/2\). We refer to [56] for the details of these estimations.
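To make Eqs. (36)–(39) concrete, the following sketch constructs a small real matrix as a combination of the Pauli strings in \({\mathcal {A}}\) (a random stand-in for \(F_i\), not the matrix arising in the algorithm) and recovers each coefficient via the trace formula (39).

```python
import itertools
import numpy as np

# single-qubit Paulis: sigma_0 = I, sigma_1 = X, sigma_2 = Y, sigma_3 = Z
paulis = [np.eye(2),
          np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]])]

m = 2
M = 2 ** m

# the index set A of Eq. (38): tuples with an even number of 2's (Pauli-Y)
A = [a for a in itertools.product(range(4), repeat=m) if a.count(2) % 2 == 0]
assert len(A) == M * (M + 1) // 2   # |A| = M(M+1)/2, as stated in the text

def sigma(alpha):
    """Tensor product of single-qubit Paulis, Eq. (36)."""
    op = np.array([[1.0]])
    for k in alpha:
        op = np.kron(op, paulis[k])
    return op

# build a real F as a combination over A, as in Eq. (37)
rng = np.random.default_rng(0)
coeffs = {a: rng.normal() for a in A}
F = sum(c * sigma(a) for a, c in coeffs.items())
assert np.allclose(F.imag, 0)       # F is real

# recover each coefficient via Eq. (39): Tr[F sigma_alpha] = M * tilde_F_alpha
for a, c in coeffs.items():
    assert np.isclose(np.trace(F @ sigma(a)).real / M, c)
```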

After we optimize the \(\{V^i\}_i\) and read out \(\{U^i\}_i\) from them, the remaining task is simply to multiply \(\tilde{{\tilde{f}}}\), constructed from \(\{U^i\}_i\), by the factor C. In this paper, we do not go into the details of estimating this factor but simply assume that we have some estimate for it. Some quantum algorithms for solving PDEs present a method for estimating this factor, the square root of the sum of the squared function values on the grid points [19].
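Once \(\{U^i\}\) and the factor C are in hand, evaluating \(\tilde{{\tilde{f}}}(\vec {s})\) at a point reduces to contracting the MPS cores with the per-axis vectors of basis-function values \(p_i[l]=P^i_l(s_i)\), at cost linear in d. The sketch below uses hypothetical core shapes and random data, and checks the contraction against the explicitly built full coefficient tensor (feasible only for small d and D).

```python
import numpy as np

# Hypothetical core shapes: U^1 of shape (D, r), U^i of shape (r, D, r) for
# 1 < i < d, and U^d of shape (r, D); random stand-ins for the optimized cores.
d, D, r = 4, 5, 3
rng = np.random.default_rng(1)
cores = ([rng.normal(size=(D, r))]
         + [rng.normal(size=(r, D, r)) for _ in range(d - 2)]
         + [rng.normal(size=(r, D))])

def mps_eval(cores, basis_vecs):
    """Evaluate sum_l a_l P_l(s) by contracting each core with the vector
    p_i[l] = P^i_l(s_i), never forming the full coefficient tensor."""
    v = basis_vecs[0] @ cores[0]                 # shape (r,)
    for core, p in zip(cores[1:-1], basis_vecs[1:-1]):
        v = v @ np.einsum("l,kln->kn", p, core)  # contract degree, then bond index
    return v @ (cores[-1] @ basis_vecs[-1])

# reference: build the full D^d tensor explicitly and contract it
full = cores[0]
for core in cores[1:-1]:
    full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
full = np.tensordot(full, cores[-1], axes=([full.ndim - 1], [0]))

p = [rng.normal(size=D) for _ in range(d)]       # stand-in basis values P^i_l(s_i)
ref = np.einsum("abcd,a,b,c,d->", full, *p)
assert np.isclose(mps_eval(cores, p), ref)
```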

The entire procedure to obtain an approximation of f is summarized as Algorithm 1.


2.7 Situation in which our method is useful

In the introduction, we mentioned solving PDEs with quantum algorithms as a main use case of our method. One might think that if the solution of the PDE can be approximated by an MPS as in Eq. (19), we can directly plug the ansatz (19) into the PDE and optimize the parameters \(\{U^i_{k_{i-1},l_i,k_i}\}\) on a classical computer. In fact, such a method has been studied in [59,60,61,62]. However, the scheme we are now considering, that is, generating the quantum state that encodes the solution in the amplitudes by a quantum algorithm and reading out the solution via MPS-based approximation, is applicable to a broader range of problems. For example, in an initial value problem, it is possible that the MPS-based approximation is valid at the terminal time but not at some intermediate time points. In the context of quantum many-body physics, this occurs for the wave function of a system that evolves into a low-entangled state via a highly entangled one, as in quantum annealing [63,64,65]. In such a case, finding the function at the terminal time using a quantum PDE solver and then reading it out by MPS-based approximation might work, whereas using MPS throughout would not.

3 Numerical experiment

We now confirm the feasibility of our method through a numerical experiment on approximating a finance-motivated function.

3.1 Problem setting

As a reasonable instance of the target function for the approximation, we take the price of a financial derivative (hereafter simply a derivative).

Specifically, we consider the worst-of put option written on d underlying assets that obey the Black–Scholes (BS) model and take its present price as a function \(f(\vec {s})\) of the asset prices \(\vec {s}=(s_1,\ldots ,s_d)\) as the approximation target. This is the solution of the so-called Black–Scholes PDE and thus fits the current situation in which a quantum PDE solver outputs the state \(|f(\vec {s})\rangle \). Details are provided in “Appendix C.”

We approximated \(f(\vec {s})\) on the hyperrectangle \([L_1,U_1]\times \cdots \times [L_d,U_d]\). We took cosine functions as orthogonal functions. Specifically, for any \(i\in [d]\) and \(l\in [D]_0\), we set

$$\begin{aligned} P^i_{l}(s_i) = \cos \left( l\frac{s_i-L_i}{U_i-L_i}\pi \right) . \end{aligned}$$
(41)

The settings for \(U_i\) and \(L_i\) are presented in “Appendix C.” The orthogonal relation (3) is satisfied with the grid points set as

$$\begin{aligned} s_{i,j}=\frac{j+\frac{1}{2}}{n_{\textrm{gr}}}(U_i-L_i)+L_i \end{aligned}$$
(42)

for \(j\in [n_{\textrm{gr}}]_0\) and \(c^i_l\) being

$$\begin{aligned} c^i_l= {\left\{ \begin{array}{ll} n_{\textrm{gr}} &{} \quad {\text {for }} l=0 \\ \dfrac{n_{\textrm{gr}}}{2} &{} \quad {\text {for }} l\in [D-1] \end{array}\right. }. \end{aligned}$$
(43)

Let us comment on the choice of cosines as the orthogonal functions. We consider the case where the grid points are equally spaced in the hyperrectangle, which we expect to be the simplest and most common setting. In this setting, the cosine functions (41) satisfy the orthogonality relation (5) exactly, so the choice is natural. In general, it is advisable to choose orthogonal functions such that the orthogonality relation holds on the given grid points.
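The discrete orthogonality behind Eqs. (41)–(43) can be checked directly; in the sketch below, the domain ends are placeholder values for one axis.

```python
import numpy as np

n_gr = 16           # number of grid points on the axis
D = 8               # degrees l = 0, ..., D-1 (here D <= n_gr)
L, U = 50.0, 150.0  # placeholder domain ends for one axis

# the midpoint grid of Eq. (42)
j = np.arange(n_gr)
s = (j + 0.5) / n_gr * (U - L) + L

# the cosine basis of Eq. (41): P[l, j] = P_l(s_j)
l = np.arange(D)
P = np.cos(np.outer(l, (s - L) / (U - L)) * np.pi)

# discrete orthogonality: sum_j P_l(s_j) P_{l'}(s_j) = c_l * delta_{l l'}
G = P @ P.T
c = np.full(D, n_gr / 2.0)
c[0] = n_gr          # the constants of Eq. (43)
assert np.allclose(G, np.diag(c))
```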

3.2 Algorithm modifications to run the numerical experiment

Because we performed all calculations on a classical computer and thus faced a memory limitation, we used Algorithm 1 with the following modification. Rather than F in Eq. (28), we attempted to maximize

$$\begin{aligned} F^\prime = \langle a | {\tilde{a}}\rangle \end{aligned}$$
(44)

Here, \(|{\tilde{a}}\rangle \) is given in Eq. (21) and

$$\begin{aligned} |a\rangle :=\sqrt{\frac{1}{\sum _{\vec {l}\in [D]_0^d} |a_{\vec {l}}|^2}}\sum _{\vec {l}\in [D]_0^d} a_{\vec {l}}|l_1\rangle \cdots |l_d\rangle , \end{aligned}$$
(45)

where we calculated the coefficients \(a_{\vec {l}}\) for each \(\vec {l}\in [D]_0^d\) using Eq. (7) and the values of \(f(\vec {s})\) on the grid points computed by Monte Carlo integration (see “Appendix C”). We took the minimal number of grid points needed to calculate the coefficients for a given D, that is, \(n_\textrm{gr}=D\), although \(n_{\textrm{gr}}>D\) is assumed when our algorithm is run on a future quantum computer with the oracle \(O_f\).

With this modification, the quantum circuit under consideration becomes smaller, and the calculation becomes feasible on a classical computer. The gates \(\{V^i_{\textrm{OF}}\}\) do not appear in our experiment; their role is absorbed into the aforementioned preprocessing that calculates \(\{a_{\vec {l}}\}\). Most importantly, the \(\{U^i\}\) calculated under this modification are the same as those output by our algorithm without it.
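As a toy illustration of this preprocessing, the following sketch computes the coefficients \(a_{\vec {l}}\) of a two-variable stand-in function from its grid values via the discrete orthogonality (playing the role of Eq. (7)) and normalizes them into the state \(|a\rangle \) of Eq. (45); with \(n_{\textrm{gr}}=D\), the expansion interpolates the grid values exactly.

```python
import numpy as np

d, D = 2, 4
n_gr = D            # minimal grid, as in the experiment
L, U = 0.0, 1.0     # placeholder domain ends, shared by both axes here

j = np.arange(n_gr)
s = (j + 0.5) / n_gr * (U - L) + L                              # Eq. (42)
P = np.cos(np.outer(np.arange(D), (s - L) / (U - L)) * np.pi)   # Eq. (41), P[l, j]
c = np.full(D, n_gr / 2.0)
c[0] = n_gr                                                     # Eq. (43)

def f(s1, s2):
    """Stand-in target function (not the option price)."""
    return np.exp(-s1) * np.cos(s2)

fvals = f(s[:, None], s[None, :])                  # f on the grid, shape (n_gr, n_gr)

# coefficients a_{l1 l2} = (1 / (c_{l1} c_{l2})) sum_j f(s_j) P_{l1} P_{l2}
T = P / c[:, None]
a = T @ fvals @ T.T

# the normalized coefficient state |a> of Eq. (45)
a_state = a.ravel() / np.linalg.norm(a)
assert np.isclose(np.linalg.norm(a_state), 1.0)

# with n_gr = D the expansion reproduces f exactly on the grid
recon = P.T @ a @ P
assert np.allclose(recon, fvals)
```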

3.3 Result of the approximation

Next, we attempted to obtain an approximation of \(f(\vec {s})\). We set the parameters as follows: \(d=5,D=r=16,n_{\textrm{iter}}=5\). For C, we used the square root of the sum of the squared values of \(f(\vec {s})\) on the grid points, calculated using Monte Carlo integration. We used the ITensor library [66] for the tensor calculations.

Fig. 4 Five-asset worst-of put option prices \(f(\vec {s})\) for \(s_1=\cdots =s_5=s\) calculated in three different ways. The horizontal and vertical axes correspond to s and \(f(\vec {s})\), respectively. The \(+\), \(\times \), and black line indicate results obtained from the MPS-based approximation, the cosine expansion approximation, and Monte Carlo integration, respectively

We now demonstrate the accuracy of the obtained MPS-based approximation of \(f(\vec {s})\). Because we are considering a high-dimensional space, it is difficult to display the accuracy over the entire space. Therefore, we show the accuracy on the following two sets of selected points: (i) the “diagonal” line \(s_1=\cdots =s_d\) and (ii) \(10^4\) random points sampled according to the BS model (see “Appendix C” for details).

Figure 4 displays results for the diagonal line, depicting the MPS-based approximation \(\tilde{{\tilde{f}}}(\vec {s})\) of \(f(\vec {s})\) along with Monte Carlo integration values and the cosine expansion approximation as per Eq. (6). The MPS-based approximation fits the Monte Carlo integration values well within an error of about 0.5 or less, except for the region near the lower end of the domain of the approximations. Note that the cosine expansion approximation already has an error from the Monte Carlo values, and this acts as a lower bound on the error of the MPS-based approximation. The MPS-based approximation almost overlaps the cosine expansion approximation, which means that the MPS-based approximation worked well.

The results for the random sample points are summarized in Table 1. On these points, the maximum differences among the values of \(f(\vec {s})\) calculated by Monte Carlo integration, MPS-based approximation, and cosine expansion using the full coefficients without MPS approximation are about 0.5 or less, similar to most regions in the diagonal line case. This result supports the proposed method.

Table 1 Maximum differences among the values of the five-asset worst-of put option prices \(f(\vec {s})\) calculated by Monte Carlo integration (MC), MPS-based approximation (TN), and cosine expansion using the full coefficients (COS), on the random sample points

In the current MPS-based approximation, the number of DOF is 12544, which is smaller than the number of cosine expansion coefficients (\(16^5=1048576\)) by roughly two orders of magnitude. Therefore, we achieved a large parameter reduction while maintaining the approximation accuracy.

3.4 Relationship between approximation accuracy and degrees of freedom

Fig. 5 Diagram of the quantum circuit \(V^\prime _{\textrm{MPS}}\) used in the additional numerical experiment instead of \(V_{\textrm{MPS}}\), for \(m_{\textrm{deg}}=4\) and \(m_{\textrm{bl}}=3\)

To investigate the relationship between the DOF and the approximation accuracy, we performed the following additional experiment. We replaced the circuit \(V_{\textrm{MPS}}\) in Fig. 2 by a different circuit \(V_{\textrm{MPS}}^\prime \) having a different DOF. Here, with \(m_{\textrm{bl}}\in \{2,\ldots ,dm_{\textrm{deg}}-1\}\), \(V_{\textrm{MPS}}^\prime \) consists of the gates \({\tilde{V}}^1,\ldots ,{\tilde{V}}^{dm_{\textrm{deg}}-m_{\textrm{bl}}+1}\), each acting on a set of \(m_{\textrm{bl}}\) qubits shifted by one qubit from the previous one, as shown in Fig. 5. The DOF of this circuit is

$$\begin{aligned} 2^{m_{\textrm{bl}}}+2^{2m_{\textrm{bl}}-1}(dm_{\textrm{deg}}-m_{\textrm{bl}}). \end{aligned}$$
(46)
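For the parameters used below (\(d=5\), \(m_{\textrm{deg}}=4\), so \(D=16\)), Eq. (46) evaluates as follows; the full cosine-coefficient count \(D^d\) is printed for comparison.

```python
# Evaluate the DOF formula (46) for the circuit V'_MPS and compare with the
# full cosine-coefficient count D^d (parameters as in our experiment).
d, m_deg = 5, 4                     # so D = 2**m_deg = 16

def dof_vmps_prime(m_bl):
    return 2 ** m_bl + 2 ** (2 * m_bl - 1) * (d * m_deg - m_bl)

for m_bl in range(2, 7):
    print(m_bl, dof_vmps_prime(m_bl))   # 148, 552, 2064, 7712, 28736
print((2 ** m_deg) ** d)                # 1048576 full coefficients
```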

We alternately optimized the blocks \(\{{\tilde{V}}^i\}\) in a manner similar to Algorithm 1 so that \(\langle a | {\tilde{a}}^\prime \rangle \) is maximized, where

$$\begin{aligned} |{\tilde{a}}^\prime \rangle :=\sum _{\vec {l}\in [D]_0^d}{\tilde{a}}^\prime _{\vec {l}}|l_1\rangle \cdots |l_d\rangle \end{aligned}$$
(47)

is the state generated by \(V_{\textrm{MPS}}^\prime \). Then, we obtained the approximation of \(f(\vec {s})\) as

$$\begin{aligned} \tilde{{\tilde{f}}}^\prime (\vec {s}):=C\sum _{\vec {l}\in [D]_0^d}{\tilde{a}}^\prime _{\vec {l}}P_{\vec {l}}(\vec {s})~. \end{aligned}$$
(48)
Fig. 6 Maximum difference \(\Delta _{\textrm{max}}\) of the various approximations for the five-asset worst-of put option price from the cosine expansion approximation of degree \(D=16\). The circles indicate the approximations based on the circuit \(V^\prime _\textrm{MPS}\) in Fig. 5; from left to right, the points correspond to \(m_{\textrm{bl}}=2,\ldots ,6\). The dotted lines represent the function \(\Delta _{\textrm{max}}(x)=ax^b\) of the degrees of freedom x, with a and b fitted to the data for \(m_{\textrm{bl}}\in [4,6]\) in (a) and \(m_{\textrm{bl}}\in [2,6]\) in (b). The square indicates the maximum difference of the MPS-based approximation, and the triangles indicate those of the cosine expansion approximations of degree \(D=3,\ldots ,10\), from left to right. The vertical and horizontal axes correspond to the maximum difference and the DOF of the approximations, respectively

In Fig. 6a, we display the maximum difference of \(\tilde{{\tilde{f}}}^\prime \) from the cosine expansion approximation on the diagonal line \(s_1=\cdots =s_d\), taking \(m_{\textrm{bl}}=2,\ldots ,6\), along with that of the MPS-based approximation. This figure shows a power-law relationship between the approximation accuracy and the DOF in the region of large DOF. This behavior is similar to that often observed when critical MPS are applied to one-dimensional quantum systems or two-dimensional classical systems [67,68,69,70,71,72,73]. If this behavior also appears in the case of \(d \gg 1\), one might make an argument similar to the finite-bond-dimension (entanglement) scaling developed in the study of MPS and evaluate \({\tilde{f}}\) with an appropriate extrapolation with respect to the DOF.

We then show the relationship between the accuracy of the cosine expansion approximation and its DOF in Fig. 6a. We plot the maximum error of the approximated worst-of put option price by the cosine expansions of low degree \(D=3,\ldots ,10\) on the line \(s_1=\cdots =s_d\), taking that of degree \(D=16\) as the reference value. The error of the low-degree cosine expansion is much larger than that of \(\tilde{{\tilde{f}}}^\prime \) with comparable DOF. Figure 6b is similar to Fig. 6a but for the random sample points. Again, the maximum difference of \(\tilde{{\tilde{f}}}^\prime \) from the cosine expansion with \(D=16\) displays a power law with respect to DOF and is much smaller than the error of the low-degree cosine expansion.

These results provide additional evidence of the advantages of our MPS-based approximation and the approximation by the circuit \(V^\prime _{\textrm{MPS}}\) over the simple cosine expansion.

4 Summary

In this study, we have considered how to extract a function encoded in the amplitudes of a quantum state as classical data. Such a task necessarily accompanies quantum algorithms such as PDE solvers, but to date there has been no proposal of a general method to accomplish it, even though it risks ruining the quantum speedup. We proposed a method based on orthogonal function expansion and a tensor network. Orthogonal function expansion is widely used for function approximation but suffers from an exponential increase in the number of parameters, the expansion coefficients, with respect to the dimension, that is, the number of variables of the function. We therefore use an MPS, a type of tensor network, to approximate the coefficients, viewed as a high-order tensor, and reduce the DOF. We presented a quantum circuit that produces the state corresponding to such a function approximation. Such a circuit is, in fact, constructible because an MPS can be encoded in a quantum state by a simple circuit, and so can the orthogonal functions, owing to their orthogonality relation. We also presented a procedure to optimize the quantum circuit and the MPS-based approximation, based on the alternating method proposed in [56]. Finally, we conducted a numerical experiment to approximate a finance-motivated multivariate function and found that our method works in this case.

This study has scope for further investigation. For example, we did not explicitly present a bound on the total query complexity for the oracles \(O_f\) and \(V^i_{\textrm{OF}}\) to obtain an approximating function with a given accuracy \(\epsilon \). Presenting such a complexity estimate is desirable but difficult at present. In general, function approximation in a high-dimensional space is a long-standing problem in computational mathematics. Although a recent study showed that, for any function with a certain property, there is an MPS-based approximation that achieves a given accuracy with DOF subexponential in d [54], to the best of the authors’ knowledge, there is no known algorithm that is guaranteed to output such an approximation. The convergence of the alternating optimization method explained above is also not well understood theoretically, even though its good performance is known empirically. Given this situation, we consider a quantitative analysis of the accuracy and complexity of our method to be beyond the scope of this study and leave it for future work, presenting some numerical evidence instead.

We expect that the proposed method can be used widely in combination with various quantum algorithms that output function-encoding states, whether for FTQC or NISQ devices. In future work, we will attempt to combine this method with concrete function-finding quantum algorithms on concrete problems and present a complete set of algorithms, along with a more quantitative analysis of complexity.