Optimal approximation to unitary quantum operators with linear optics

Linear optical systems acting on photon number states produce many interesting evolutions, but cannot give all the allowed quantum operations on the input state. Using Toponogov's theorem from differential geometry, we propose an iterative method that, for any arbitrary quantum operator $U$ acting on $n$ photons in $m$ modes, returns an operator $\widetilde{U}$ which can be implemented with linear optics. The approximation method is locally optimal and converges. The resulting operator $\widetilde{U}$ can be translated into an experimental optical setup using previous results.


INTRODUCTION
Linear optical devices under quantum light show a rich behaviour and have different applications in experiments on the foundations of quantum optics and quantum information [1,2,3]. While they can be built with relatively simple optical elements like beam splitters and phase shifters [4,5,6,7,8], their behaviour for photon number states cannot be accurately reproduced by any classical system. One clear example is the boson sampling problem, for which quantum systems can give efficient solutions which cannot be produced by any classical method [9].
Classically, the evolution of the electrical field in m orthogonal modes going through a linear optical system is perfectly described by a unitary m × m matrix, S, called the scattering matrix of the system [10]. The evolution of n photons distributed through these m possible modes is given by an M × M unitary evolution matrix U acting on the M = m+n−1 n states of the resulting Hilbert space. The photonic homomorphism ϕ : U (m) → U (M) gives the evolution matrix U which corresponds to a scattering matrix S describing the linear optical system. U = ϕ(S) can be computed from different equivalent methods [11,12,13].
Any unitary matrix can be written as an exponential U = e iH for a Hermitian matrix H. In linear optical devices, we will call this matrix the effective Hamiltonian H U of the linear system. Similarly, S = e iH S .
The image of the photonic homomorphism, im(ϕ), is a subgroup of U (M) which contains all the quantum evolutions that are allowed for n photons a linear optical system with The first author has been funded by Junta de Castilla y León (project VA296P18). The second author has been partially supported by the Research Program of the University Jaume I-Project UJI-B2018-3, as well as by the Spanish Government Ministerio de Economía y Competitividad (MINECO-FEDER) grant MTM2017-84851-C2-2. The third author was partially supported by the Spanish Government, Ministerios de Ciencia e Innovación y de Universidades, grant PGC2018-096446-B-C22, as well as by Universitat Jaume I, grant UJI-B2018- 10. m modes. The image subgroup is a representation of U (m) in U (M) and maps each possible classical scattering matrix S describing a linear system into the quantum evolution U = ϕ(S) it induces for n photons.
The evolution in the corresponding unitary algebras from iH S to iH U is given by the differential of ϕ, dϕ : u(m) → u(M), for which there are also explicit expressions [14,15,16,17,18,19].
From the point of view of system design, a natural question is whether any given quantum evolution U ∈ U (M) can be realized using only linear optics. From a simple dimensional argument, it is clear that, except when m = 1 or n = 1, there must be some impossible operations [20].
In a previous work, we have given an explicit inverse method to find the S corresponding to any U ∈ im(ϕ) which can be implemented using linear optics [21].
Here, we address the problem of approximating U ∈ im(ϕ). We give a method to find the linear optics system with an evolution matrix U ∈ im(ϕ) which minimizes the distance to U locally. The result is based on Toponogov's comparison theorem [22] from differential geometry. Section 2 describes the structure of the image algebra and its complement and states two theorems that will become useful later. Section 3 introduces the basic concepts from differential geometry used in the proof and the notation for the rest of the paper. Section 4 defines the bi-invariant Riemannian metric in which the results are given. Section 5 shows how to apply Toponogov's theorem to reduce the problem of approximating a unitary to finding a geodesic in the correct manifold. Section 6 discusses some tricks related to the generation of random unitaries which are needed to explore the image group and to be able to compute a valid matrix logarithm for any desired U . Section 7 describes the iterative method that produces the desired approximation and gives some examples. Finally, Section 8, gives a general overview of the method and comments on some practical problems and possible improvements for the approximation algorithm.

THE IMAGE ALGEBRA AND ITS ORTHOGONAL COMPLEMENT
If we study the induced map dϕ : u(m) → u(M), we can decompose the Lie algebra u(M) orthogonally so that where (im dϕ) ⊥ is the orthogonal complement of im dϕ with respect to the metric For this metric, we can prove a couple of useful facts.
be the orthogonal decomposition of v, with a tangent component v T ∈ im dϕ and a normal Therefore, for any normalized |ψ with ψ|ψ = 1, we have The proof is given by introducting a bi-invariant metric and reducing the issue to a problem in plane geometry thanks to Toponogov's comparison theorem [22]. Later, with this theorem, we can give a recursive method to find a locally optimal approximation and show it converges.

PRERREQUISITES AND NOTATION
For the very basic notions in differential geometry, such as manifold, curve, tangent space, etc., the reader is referred to the books of Do Carmo [23] or Sakai [24].
A Riemannian metric on a differentiable manifold M is a correspondence which associates to each point p on M an inner product , p on the tangent space T p M which varies differentiably in the sense that, for any pair of vector fields X and Y which are differentiable in a neighborhood V of M, the function X ,Y is differentiable on V . The metric with which a Riemannian manifold M is endowed may come from a distance. Given two points p, q ∈ M, the distance d(p, q) between them is defined to be the infimum of the lengths of all curves joining p and q which are piecewise differentiable.
Two fundamental concepts of Riemannian geometry are those of geodesic and curvature. Roughly speaking, a geodesic is a curve minimizing the distance between two nearby points. More precisely, let I be a closed interval in R; a parametrized curve γ : I → M is called a geodesic at t 0 if the covariant derivative D dt dγ dt vanishes at the point t 0 (see i.e. [23], Definition 2.1); if γ is a geodesic at t for all t ∈ I, then γ is called a geodesic. If [a, b] ⊆ I and γ : I → M is a geodesic, the restriction of γ to [a, b] is called a geodesic segment joining γ(a) to γ(b). By abuse of language it is often referred to the image γ(I) of a geodesic γ as a geodesic.
A minimal geodesic between p and q is the shortest one joining p and q. It is easily seen that if there exists a minimal geodesic γ joining p to q, then d(p, q) equals the length ℓ(γ) of γ. This conditional if holds under the hypothesis of completeness: a Riemannian manifold M is said to be (geodesically) complete if for every p ∈ M the exponential map exp p is defined for the whole tangent space T p M, i.e., if any geodesic γ(t) starting from p is defined for all t ∈ R, and the statement is: An important consequence of Theorem 3.1 is the following, see [24], Corollary 1.4 and Problem 1 of Chapter III:

Corollary 3.2. A C ∞ manifold M is compact if and only if any Riemannian metric on M is complete.
On the other hand, the concept of curvature we will refer to is that of sectional curvature. According to Milnor [25] (p. 295), the sectional curvature of the tangential 2-plane spanned by some orthogonal unit vectors u and v can be described geometrically as the Gaussian curvature, at the point, of the surface swept out by all geodesics having a linear combination of u and v as tangent vector.
We are interested in Riemannian manifolds with additional algebraic structure: Lie groups. A Lie group is a group G with a differentiable structure such that the mapping G × G → G given by (x, y) → xy −1 , x, y ∈ G, is differentiable. It follows that translations from the left L x resp. translations from the right R x given by A Riemannian metric on G is said to be left invariant resp. right invariant if for all p, g ∈ G and for all u, v ∈ T p G it holds that

A Riemannian metric is called bi-invariant if it is both left and right invariant. Any compact Lie group can be endowed with a bi-invariant metric [23, Exercise 7].
We also consider the Lie algebra G of G, which consists of the vectors in T e G with e the neutral element of G and with a well-known additional structure provided by a commutator (or Lie bracket) in the usual way.
We will focus on the Lie group U (M) as a differentiable manifold, for a positive integer M; its Lie algebra will be denoted by u(M). Identity matrices of any size will be denoted by Id.

A BI-INVARIANT RIEMANNIAN METRIC
In this section, we endowe U (M) with a Riemannian structure which will be useful later on.
For u, v ∈ u(M), we define an inner product This definition does actually correspond to a positive definite symmetric bilinear form: The bilinearity is an easy exercise, and moreover: ( (2) The symmetry of (4.1) is clear, The positive definiteness follows from the fact that where ||u|| = ∑ i, j |u i j | 2 is the Frobenius norm of u, which is nonnegative and has all the required norm properties [26]. The metric defined above is Riemannian and bi-invariant. Bi-invariant metrics are useful for us because of the following.  [25]). Every compact Lie group admits a bi-invariant metric, which has nonnegative sectional curvature.
In the case of a bi-invariant metric, the sectional curvature admits an easier formula, see [25, p. 323 Furthermore, in a Lie group admitting a bi-invariant metric, geodesic curves have an easy description: they coincide with the exponential map. More precisely, for p ∈ U (M), a geodesic curve γ : In fact, for p, q ∈ U (M), there exists a geodesic γ joining p and q with γ(0) = p such that ).
The concept of a minimal geodesic is related with the concept of principal logarithm in the following way is a minimal geodesic segment joining p and q.
Recall iK is called a principal logarithm of a unitary matrix M ∈ U (M) if There are efficient algorithms that can compute the principal logarithm of a unitary matrix [27]. We choose this definition of principal logartihm over the one for the interval (−π, π) so that there is always a principal matrix logarithm, even for matrices with real negative eigenvalues (-1). Some properties, like infinite differentiability, are lost with this definition, but they are not used in our result.

Proof of Lemma 4.2. The length of a geodesic segment γ(t)
This expression only depends on the list of eigenvalues of pq −1 and their logarithms.
with a minimum when l i = 0 for i = 1, · · · , M, which corresponds to a principal logarithm.

AN APPLICATION OF TOPONOGOV'S COMPARISON THEOREM
Riemannian manifolds whose curvature is bounded below may be investigated by applying Toponogov's comparison theorem. We first need to define triangles on the Riemannian manifold.

RANDOM UNITARY MATRICES AND A BASIS FOR THE IMAGE
Before looking into the iterative process that gives the locally optimal approximation to any desired U ∈ imϕ, we need to consider a few useful tricks.
While in the previous section we have considered the identity matrix as our starting point in the image group, the results hold for any U 0 ∈ imϕ. The identity has some advantages: it is always in the image, for any values of n and m, and, in terms of computation, it is trivial to generate the M × M identity matrix.
However, in general, the landscape of the image group is unknown and numerical experiments show there are multiple local minima. In a 2D space, we can picture the image as an irregular profile with mountains and valleys. For any given starting point, the iterative process ends in a local minimum, but, as we do not know how many minima exist, we need a series of random starting points to sample multiple approximations, each the closest to the intended U in its local neighbourhood.
In order to generate a random unitary U r ∈ imϕ, we choose a random S r ∈ U (m) uniformly from all the possible matrices and compute U r = ϕ(S r ). The random unitaries in U (m) can be generated from samples of a normal distribution [28,29] and the photonic homomorphism is known [13,12].
Finally, we need a way to project the logarithm of any U ∈ U (M) to the image algebra imdϕ. The method is similar to the generation of random unitaries. We start by working on the algebra corresponding to the scattering matrices, u(m), where we can write down a known basis and, using the differential map dϕ, explicitly given in [19], we can obtain a basis for imdϕ, which is a subalgebra of u(M). This basis is not necessarily orthonormal, but it can be orthogonalized and normalized. The details of the whole procedure can be found in [21]. This basis, together with the inner product of Eq. (4.1) are enough to obtain the desired projection on the image algebra.

AN ITERATIVE PROCESS FOR THE APPROXIMATION
In Section 5 we have constructed an approximation matrix U 1 := U a ∈ im(ϕ), starting from U = exp (v) as Now, we can repeat this by taking a new approximation U 2 ∈ im(ϕ) by considering a geodesic triangle of vertices Id, U −1 1. Convergence. The method described above converges. Consider the sequence {U i } with i ≥ 0 and U 0 = Id. For the first step we know that d( T . In general, we have: Furthermore, the equality holds if and only if U i = U i+1 . Proof. By a successive application of the inequalities (2) and (1) above we obtain . Now, if the equality (7.1) holds, then v i N = v i and so v i T = 0; inequality (3) above allows us to conclude that U i = U i+1 . The converse is trivial.
Proof. Assume that U i = U i+1 for every i (otherwise, there would exist n ∈ N such that U n = U n+1 , and therefore d n = d n+1 = · · · and the sequence converges to d n ). By Proposition 7.1 the sequence {d i } is decreasing; in particular d i < d 1 for all i. This together with the fact that d i ≥ 0 for all i (i.e., the sequence {d i } is bounded) implies the convergence.
In fact, the approximation given by the described method is the best possible one in the following sense. Since {d i } is a convergent sequence, then it is a Cauchy-sequence. This means that for every ε > 0 there exists n ε ∈ N with Since, by (1) and (2) we have that We first observe the following inequality: Now, inequality (7.2) together with Lemma 7.3 imply that On the other hand, v i − v i N 2 = v T 2 and, together with inequality (3), this yields Therefore the sequence {U i } is a Cauchy-sequence itself. This allows us to apply Theorem 3.1 (Hopf-Rinow) in the complete metric space (M, g), and so the sequence {U i } converges to a matrixŨ ∈ M, and v T ∞ = 0. Hence the geodesic joiningŨ with U is normal.
7.2. Application example. We can see an example of the procedure with the Quantum Fourier Transform We choose the QFT not only for being a useful transformation in quantum information and quantum optics, but also because it can be considered one of the most difficult transformations for linear optics. The QFT is impossible to achieve with linear systems with m > 1 and more than one photon and has a strong structure.
We start from the QFT matrix for M = 3 (n = m = 2) For the implementation of U , we consider the states in the ordered basis {|20 , |02 , |11 }. All the results are given with 5 significant digits for matrix entries and 10 significant digits for distances, rounding the imaginary or real parts to zero when they are much smaller than the surrounding terms.
The starting point is the identity matrix, for which U − Id = 2.449489743. In the first step of our procedure, we take the projection of the principal logarithm of U into the image algebra: with U −U 20 = 1.732050808.
7.3. Random initial matrices. In fact, the results hold for any arbitrary U 0 ∈ imϕ. Now the procedure becomes computationally more involved, as we need an explicit evaluation of U r = ϕ(S r ) for random S r matrices, with a complexity which grows combinatorily in n and m. However, as we see in this second example, we can explore different local optima. For the QFT matrix in Eq. (7.2), if we start at random points, the approximation gravitates towards three solutions. Two of them are at the same distance from the QFT matrix than the approximation we found starting with the identity matrix, 1.7320, with There is a third solution with a distance 0.85675 to the QFT for the matrix: These solutions seem to be found with equal probability. For a run of 1000 experiments we found U 1 a 311 times, U 2 a 374 times and U 3 a 315 times. Further experiments showed a similar behaviour.
The best approximation to the 3 × 3 QFT matrix U 3 a comes from a scattering matrix: which is, up to a 5π 12 global phase, a balanced two input beam splitter with a scattering matrix preceded by a −π 3 phase shifter in the first port and followed by a π 3 in the second port. Depending on the starting local point, there are different local optima. Sometimes, two different approximation matrices in the image group give the same distance to the target unitary.

SUMMARY, RECOMMENDATIONS AND FUTURE IMPROVEMENTS
We have given an iterative method that finds a linear optical setup that approximates any arbitrary quantum evolution for n photons in m modes. The approximation is optimal in the local neighbourhood of the initial guess. Once we have the closest unitary that can be implemented, U, we can use a previous method [21] to obtain a scattering matrix S that gives the approximated evolution and there are multiple algorithms that give the physical setup corresponding to S using only beam splitters and phase shifters [4,7].
The proposed algorithm is based on results from differential geometry, in particular, Toponogov's theorem. They show the method will converge and find local optima.
There are a few practical details worth mentioning. First, numerically, we find that if we try the method on an evolution which is possible to obtain from a linear optics system, U ∈ imϕ, sometimes it will converge to a matrix close the actual solution, but, depending on the local landscape, it might fall into a variety of different matrices all at a similar large distance. In practical applications, the recommendation would be, first, check whether an exact implementation exists (with our previous algorithm [21]) and, if there is none, look for an approximation.
There are also some open problems. From the structure of the involved groups and algebras, it is not clear how many local optima exist for any given transformation ϕ. We have proposed a randomized way of exploring the state space of the potential approximation matrices, and, for the limited dimensions that can be numerically explored, it seems to work well, but there is no guarantee the method finds a global minimum. Any further knowledge of the structure of the image group would help in the search for a global minimum or, at least, in finding a probabilistic bound on the optimal approximation. Additionally, a lower bound on U − U a , as opposed to our upper norm, would help to determine the global optimum.
Even in its present form, the proposed algorithm, when combined with previous results, can assist in the design of quantum optical operations and has applications to quantum information and quantum optics experiment design.