Abstract
We analyze a quantum version of the Monge–Kantorovich optimal transport problem. The quantum transport cost related to a Hermitian cost matrix C is minimized over the set of all bipartite coupling states \(\rho ^{AB}\) with fixed reduced density matrices \(\rho ^A\) and \(\rho ^B\) of size m and n. The minimum quantum optimal transport cost \(\textrm{T}^Q_{C}(\rho ^A,\rho ^B)\) can be efficiently computed using semidefinite programming. In the case \(m=n\) the cost \(\textrm{T}^Q_{C}\) gives a semidistance if and only if C is positive semidefinite and vanishes exactly on the subspace of symmetric matrices. Furthermore, if C satisfies the above conditions, then \(\sqrt{\textrm{T}^Q_{C}}\) induces a quantum analogue of the Wasserstein-2 distance. Taking the quantum cost matrix \(C^Q\) to be the projector on the antisymmetric subspace, we provide a semi-analytic expression for \(\textrm{T}^Q_{C^Q}\) for any pair of single-qubit states and show that its square root yields a transport distance on the Bloch ball. Numerical simulations suggest that this property holds also in higher dimensions. Assuming that the cost matrix suffers decoherence and that the density matrices become diagonal, we study the quantum-to-classical transition of the Monge–Kantorovich distance, propose a continuous family of interpolating distances, and demonstrate that the quantum transport is cheaper than the classical one. Furthermore, we introduce a related quantity—the SWAP-fidelity—and compare its properties with the standard Uhlmann–Jozsa fidelity. We also discuss the quantum optimal transport for general d-partite systems.
References
Agredo, J., Fagnola, F.: On quantum versions of the classical Wasserstein distance. Stochastics 89, 910 (2017)
Altschuler, J., Weed, J., Rigollet, P.: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. In: NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1961–1971 (2017)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning, PMLR, vol. 70, p. 214 (2017)
Bengtsson, I., Życzkowski, K.: Geometry of Quantum States, 2nd edn. Cambridge University Press, Cambridge (2017)
Bhatia, R., Gaubert, S., Jain, T.: Matrix versions of the Hellinger distance. Lett. Math. Phys. 109, 1777–1804 (2019)
Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., Lloyd, S.: Quantum machine learning. Nature 549, 195 (2017)
Biane, P., Voiculescu, D.: A free probability analogue of the Wasserstein distance on the trace-state space. Geom. Funct. Anal. 11, 1125 (2001)
Bigot, J., Gouet, R., Klein, T., López, A.: Geodesic PCA in the Wasserstein space by convex PCA. Ann. Inst. H. Poincaré Probab. Stat. 53, 1–26 (2017)
Bistroń, R., Eckstein, M., Życzkowski, K.: Monotonicity of the quantum 2-Wasserstein distance. J. Phys. A 56, 095301 (2023)
Bonneel, N., van de Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. 30, 158 (2011)
Brandão, F.G.S.L., Svore, K.: Quantum speed-ups for solving semidefinite programs. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp. 415–426 (2017)
Braunstein, S.L., Caves, C.M.: Statistical distance and the geometry of quantum states. Phys. Rev. Lett. 72, 3439 (1994)
Caglioti, E., Golse, F., Paul, T.: Quantum optimal transport is cheaper. J. Stat. Phys. 181, 149 (2020)
Carlen, E.A., Maas, J.: Non-commutative calculus, optimal transport and functional inequalities in dissipative quantum systems. J. Stat. Phys. 178, 319 (2020)
Chakrabarti, S., Huang, Y., Li, T., Feizi, S., Wu, X.: Quantum Wasserstein generative adversarial networks. In: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, arXiv:1911.00111
Chen, Y., Gangbo, W., Georgiou, T.T., Tannenbaum, A.: On the matrix Monge-Kantorovich problem. Eur. J. Appl. Math. 31, 574 (2020)
Cook, W.J., Cunningham, W.H., Pulleyblank, W.R., Schrijver, A.: Combinatorial Optimization. Wiley, New York (1998)
Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 26, pp. 2292–2300. Curran Associates Inc., New York (2013)
Datta, N., Rouzé, C.: Relating relative entropy, optimal transport and Fisher information: a quantum HWI inequality. Ann. H. Poincaré 21, 2115 (2020)
De Palma, G., Marvian, M., Trevisan, D., Lloyd, S.: The quantum Wasserstein distance of order 1. IEEE Trans. Inf. Theory 67, 6627–6643 (2021). https://doi.org/10.1109/TIT.2021.3076442
De Palma, G., Trevisan, D.: Quantum optimal transport with quantum channels. Ann. Henri Poincaré 22, 3199–3234 (2021)
Duvenhage, R.: Quadratic Wasserstein metrics for von Neumann algebras via transport plans. J. Operator Theory 88, 289–308 (2022)
Filipiak, K., Klein, D., Vojtková, E.: The properties of partial trace and block trace operators of partitioned matrices. Electron. J. Linear Algebra 33, 3–15 (2018)
Flamary, R., Cuturi, M., Courty, N., Rakotomamonjy, A.: Wasserstein discriminant analysis. Mach. Learn. 107, 1923–1945 (2018)
Friedland, S.: Matrices: Algebra, Analysis and Applications, p. 596. World Scientific, Singapore (2016)
Friedland, S.: Notes on semidefinite programming, Fall 2017, http://homepages.math.uic.edu/~friedlan/SDPNov17.pdf
Friedland, S.: Tensor optimal transport, distance between sets of measures and tensor scaling, arXiv:2005.00945
Friedland, S., Eckstein, M., Cole, S., Życzkowski, K.: Quantum Monge-Kantorovich problem and transport distance between density matrices. Phys. Rev. Lett. 129, 110402 (2022)
Friedland, S., Ge, J., Zhi, L.: Quantum Strassen’s theorem. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 23, 2050020 (2020)
Friesecke, G., Vögler, D.: Breaking the curse of dimension in multi-marginal Kantorovich optimal transport on finite state spaces. SIAM J. Math. Anal. 50(4), 3996–4019 (2018)
Gilchrist, A., Langford, N.K., Nielsen, M.A.: Distance measures to compare real and ideal quantum processes. Phys. Rev. A 71, 062310 (2005)
Golse, F., Mouhot, C., Paul, T.: On the mean field and classical limits of quantum mechanics. Commun. Math. Phys. 343, 165–205 (2016)
Golse, F., Paul, T.: Wave packets and the quadratic Monge-Kantorovich distance in quantum mechanics. Comptes Rendus Math. 356, 177–197 (2018)
Hitchcock, F.L.: The distribution of a product from several sources to numerous localities. J. Math. Phys. Mass. Inst. Tech. 20, 224–230 (1941)
Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, Cambridge (2013)
Horodecki, M., Horodecki, P., Horodecki, R.: Separability of mixed states: necessary and sufficient conditions. Phys. Lett. A 223, 1–8 (1996)
Ikeda, K.: Foundation of quantum optimal transport and applications. Quantum Inform. Process. 19, 25 (2020)
Jozsa, R.: Fidelity for mixed quantum states. J. Mod. Opt. 41, 2315–23 (1994)
Kantorovich, L.V.: Mathematical methods of organizing and planning production. Manag. Sci. 6, 366–422 (1959/60)
Keyl, M.: Fundamentals of quantum information theory. Phys. Rep. 369, 431–548 (2002)
Kiani, B.T., De Palma, G., Marvian, M., Liu, Z.-W., Lloyd, S.: Learning quantum data with the quantum earth mover’s distance. Quantum Sci. Technol. 7, 045002 (2022)
Liu, J., Yuan, H., Lu, X.-M., Wang, X.: Quantum Fisher information matrix and multiparameter estimation. J. Phys. A 53, 023001 (2020)
Lloyd, J.R., Ghahramani, Z.: Statistical model criticism using kernel two sample tests. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 829–837. MIT Press, Cambridge (2015)
Lloyd, S., Weedbrook, C.: Quantum generative adversarial learning. Phys. Rev. Lett. 121, 040502 (2018)
Miszczak, J.A., Puchala, Z., Horodecki, P., Uhlmann, A., Życzkowski, K.: Sub- and super-fidelity as bounds for quantum fidelity. Quantum Inf. Comp. 9, 0103–0130 (2009)
Monge, G.: Mémoire sur la théorie des déblais et des remblais, Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année, pp. 666–704 (1781)
Mueller, J., Jaakkola, T.: Principal differences analysis: interpretable characterization of differences between distributions, In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 1702–1710. MIT Press, Cambridge (2015)
Müller-Hermes, A.: On the monotonicity of a quantum optimal transport cost, preprint arXiv:2211.11713 (2022)
Panaretos, V.M., Zemel, Y.: Amplitude and phase variation of point processes. Ann. Stat. 44, 771–812 (2016)
Peres, A.: Separability criterion for density matrices. Phys. Rev. Lett. 77, 1413–1415 (1996)
Renner, R.: Quantum Information Theory, Exercise Sheet 9, http://edu.itp.phys.ethz.ch/hs15/QIT/ex09.pdf
Riera, M.H.: A transport approach to distances in quantum systems, Bachelor’s thesis for the degree in Physics, Universitat Autònoma de Barcelona (2018)
Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a distance for image retrieval. Int. J. Comput. Vis. 40, 99–121 (2000)
Solomon, J., de Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A., Du, T., Guibas, L.: Convolutional Wasserstein distances: efficient optimal transportation on geometric domains. ACM Trans. Graph. 34, 66 (2015)
Sandler, R., Lindenbaum, M.: Nonnegative matrix factorization with earth mover's distance metric for image analysis. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1590–1602 (2011)
Šafránek, D.: Discontinuities of the quantum Fisher information and the Bures distance. Phys. Rev. A 95, 052320 (2017)
Székely, G.J., Rizzo, M.L.: Testing for equal distributions in high dimension. InterStat 11, 1–16 (2004)
Uhlmann, A.: The ‘transition probability’ in the state space of a *-algebra. Rep. Math. Phys. 9, 273 (1976)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Vasershtein, L.N.: Markov processes over denumerable products of spaces describing large system of automata. Probl. Inf. Transmission 5, 47–52 (1969)
Villani, C.: Optimal Transport, Old and New, Grundlehren der Mathematischen Wissenschaften, 338. Springer, Berlin (2009)
Werner, R.F.: Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Phys. Rev. A 40, 4277 (1989)
Winter, A.: Tight uniform continuity bounds for quantum entropies: conditional entropy, relative entropy distance and energy constraints. Commun. Math. Phys. 347, 291–313 (2016)
Wolfram Research, Inc., Mathematica, Version 12.2, Champaign, IL, USA (2020), https://www.wolfram.com/mathematica
Zhou, L., Yu, N., Ying, S., Ying, M.: Quantum earth mover’s distance, no-go quantum Kantorovich-Rubinstein theorem, and quantum marginal problem. J. Math. Phys. 63, 102201 (2022)
Życzkowski, K., Słomczyński, W.: Monge distance between quantum states. J. Phys. A 31, 9095–9104 (1998)
Życzkowski, K., Słomczyński, W.: The Monge distance on the sphere and geometry of quantum states. J. Phys. A 34, 6689 (2001)
Acknowledgements
It is a pleasure to thank Rafał Bistroń, John Calsamiglia, Matt Hoogsteder, Tomasz Miller, Wojciech Słomczyński and Andreas Winter for numerous inspiring discussions and helpful remarks. Financial support by a Simons Collaboration Grant for Mathematicians, by Narodowe Centrum Nauki under the Maestro Grant number DEC-2015/18/A/ST2/00274, and by the Foundation for Polish Science under the Team-Net Project no. POIR.04.04.00-00-17C1/18-00 is gratefully acknowledged.
Appendices
Appendix A: Basic Properties of Partial Traces
In order to understand the partial traces on \( \textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\) it is convenient to view this space as a 4-mode tensor space [29] and use Dirac notation. Denote by \({\mathcal {H}}_m^\vee \) the space of linear functionals on \({\mathcal {H}}_m\), i.e., the dual space. Then \({\textbf{y}}^\vee =\langle {\textbf{y}}|\in {\mathcal {H}}_m^\vee \) acts on \({\textbf{z}}\in {\mathcal {H}}_m\) as follows: \({\textbf{y}}^\vee ({\textbf{z}})=\langle {\textbf{y}},{\textbf{z}}\rangle =\langle {\textbf{y}}|{\textbf{z}}\rangle \). Hence a rank-one operator in \(\textrm{B}({\mathcal {H}}_m)\) is of the form \({\textbf{x}}\otimes {\textbf{y}}^\vee =|{\textbf{x}}\rangle \langle {\textbf{y}}|\), where \((|{\textbf{x}}\rangle \langle {\textbf{y}}|)({\textbf{z}})=\langle {\textbf{y}}|{\textbf{z}}\rangle |{\textbf{x}}\rangle \). So \(|{\textbf{x}}\rangle \langle {\textbf{y}}|\) can be viewed as a matrix \(\rho ={\textbf{x}}{\textbf{y}}^\dagger \in {\mathbb {C}}^{m\times m}\). Assume that \(V_1,V_2\) are linear transformations from \({\mathcal {H}}_m\) to itself. Then \(V_1\otimes V_2\) is a sesquilinear transformation from \({\mathcal {H}}_m\otimes {\mathcal {H}}_m^\vee \) to itself, which acts on rank-one operators as follows:
Assume now that \(W_1,W_2\) are linear transformations from \({\mathcal {H}}_n\) to itself. Then
A tensor product of two rank-one operators is identified with a 4-tensor:
Thus
Observe next that \(V_1\otimes W_1\otimes V_2\otimes W_2\) is a multi-sesquilinear transformation of \(\textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\) to itself, which acts on a rank-one product operator as follows:
(In the last equality we view \(|{\textbf{x}}\rangle |{\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|\) as an \((mn)\times (mn)\) matrix.) As \({{\,\textrm{Tr}\,}}|{\textbf{x}}\rangle \langle {\textbf{y}}|=\langle {\textbf{y}}|{\textbf{x}}\rangle \) we deduce the following lemma:
Lemma A.1
Let
Then
In particular, if \(V_1=V_2=V\) and \(W_1=W_2=W\) are unitary then
Corollary A.2
Let \(\rho ^A\in \Omega _m,\rho ^B\in \Omega _n\), \(V\in \textrm{B}({\mathcal {H}}_m),W\in \textrm{B}({\mathcal {H}}_n)\) be unitary and \(C\in \textrm{S}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\). Then
Proof
View \(\rho ^A\in \Omega _m\) as an element in \({\mathcal {H}}_m\otimes {\mathcal {H}}_m^\vee \) to deduce \(V\rho ^AV^\dagger =(V\otimes V) \rho ^A\). Suppose that
Let \({\tilde{\rho }}^{AB}=(V\otimes W\otimes V\otimes W) \rho ^{AB}\). Observe that
Similarly \({{\,\textrm{Tr}\,}}_B {\tilde{\rho }}^{AB}=V \rho ^AV^\dagger \). Hence
and
Hence we deduce the first part of the corollary. The second part of the corollary follows from the identity
\(\square \)
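The key covariance step in the proof above, \({{\,\textrm{Tr}\,}}_B\big [(V\otimes W)\rho ^{AB}(V\otimes W)^\dagger \big ]=V\,({{\,\textrm{Tr}\,}}_B\rho ^{AB})\,V^\dagger \), is easy to test numerically. The NumPy sketch below is ours (all helper names are our own, not from the text): it draws a random bipartite state and random local unitaries and checks the identity.

```python
import numpy as np

def partial_trace_B(rho, m, n):
    # Tr_B of an (mn) x (mn) matrix: view rho as an m x m array of
    # n x n blocks and take the trace of each block.
    return np.einsum('ikjk->ij', rho.reshape(m, n, m, n))

def random_state(d, rng):
    # Random density matrix: normalized Wishart-type positive matrix.
    a = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho)

def random_unitary(d, rng):
    # Unitary from the QR decomposition of a random complex matrix,
    # with the column phases fixed.
    a = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    q, r = np.linalg.qr(a)
    return q * (np.diag(r) / np.abs(np.diag(r)))

rng = np.random.default_rng(0)
m, n = 2, 3
rho_AB = random_state(m * n, rng)
V, W = random_unitary(m, rng), random_unitary(n, rng)

# Covariance of the partial trace under local unitaries:
# Tr_B[(V (x) W) rho (V (x) W)^dag] = V (Tr_B rho) V^dag.
U = np.kron(V, W)
lhs = partial_trace_B(U @ rho_AB @ U.conj().T, m, n)
rhs = V @ partial_trace_B(rho_AB, m, n) @ V.conj().T
print(np.allclose(lhs, rhs))  # True
```

The analogous check with \({{\,\textrm{Tr}\,}}_A\) and \(W\) works the same way after reshaping with the roles of the two factors exchanged.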
The following result appeared in the literature [29] and we state it here for completeness. For \(\rho ^A\in \textrm{B}({\mathcal {H}}_m)\) denote by range\(\, \rho ^A\subseteq {\mathcal {H}}_m\) the range of \(\rho ^A\).
Lemma A.3
Let \(\rho ^A\in \Omega _m,\rho ^B\in \Omega _n\). Then
In particular if either \(\rho ^A\) or \(\rho ^B\) is a pure state then \(\Gamma ^Q( \rho ^A, \rho ^B)=\{ \rho ^A\otimes \rho ^B\}\).
Proof
It is enough to show that \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}({\mathcal {H}}_n)\). To show this condition we can assume that range\(\,\rho ^A\) is a nonzero strict subspace of \({\mathcal {H}}_m\). By choosing a corresponding orthonormal basis consisting of eigenvectors of \(\rho ^A\) we can assume that \(\rho ^A\) is a diagonal matrix whose first \(1\le \ell <m\) diagonal entries are positive, and whose last \(m-\ell \) diagonal entries are zero. Write down \(\rho ^{AB}\) as a block matrix \([R_{pq}] \in {\mathbb {C}}^{(mn)\times (mn)}\), where \(R_{pq}\in {\mathbb {C}}^{m\times m}, p,q\in [n]\). Then \({{\,\textrm{Tr}\,}}_B \rho ^{AB}=\sum _{p=1}^n R_{pp}= \rho ^A\). As \(R_{pp}\ge 0\) we deduce that \( \rho ^A=[a_{ij}]\ge R_{pp}\ge 0\). As \(a_{ii}=0\) for \(i>\ell \) it follows that the (i, i) entry of each \(R_{pp}\) is zero. As \(\rho ^{AB}\) is positive semidefinite it follows that the \(((p-1)m+i)\)th row and column of \(\rho ^{AB}\) are zero for every \(i>\ell \) and \(p\in [n]\). This proves \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}({\mathcal {H}}_n)\). Apply the same argument for \(\rho ^B\) to deduce \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}(\text {range}\, \rho ^B)\).
Assume that \( \rho ^A=|1\rangle \langle 1|\) and \(\rho ^{AB}\in \Gamma ^Q(\rho ^A,\rho ^B)\). Then \(\rho ^{AB}=\rho ^A\otimes \rho ^B\). \(\square \)
More information concerning the partial trace and its properties can be found in a recent work [23].
The following results are used in the proof of Proposition 2.6:
Lemma A.4
Denote by \(S_N\) the SWAP operator on \({\mathcal {H}}_{N^2}:={\mathcal {H}}_N\otimes {\mathcal {H}}_N\), and by \(S_{n,m}\) and \(R_{n,m}\) the following SWAP operators on \({\mathcal {H}}_{(nm)^2}:={\mathcal {H}}_n\otimes {\mathcal {H}}_m\otimes {\mathcal {H}}_n\otimes {\mathcal {H}}_m\):
(a) Assume that \(|i\rangle \), with \(i\in [N]\), is an orthonormal basis in \({\mathcal {H}}_N\). Suppose that
$$\begin{aligned} \rho =\sum _{i,j,p,q\in [N]} \rho _{(i,p)(j,q)}|i\rangle |p\rangle \langle j|\langle q|\in \textrm{B}({\mathcal {H}}_N\otimes {\mathcal {H}}_N). \end{aligned}$$
Then \({{\,\textrm{Tr}\,}}S_N\rho =\sum _{i,p\in [N]} \rho _{(p,i)(i,p)}\).
(b) Assume that
$$\begin{aligned}&\rho ^{AB}\in \textrm{B}({\mathcal {H}}_n\otimes {\mathcal {H}}_n),{} & {} {{\,\textrm{Tr}\,}}_B\rho ^{AB}=\rho ^A \in \textrm{B}({\mathcal {H}}_n),{} & {} {{\,\textrm{Tr}\,}}_A\rho ^{AB}=\rho ^B\in \textrm{B}({\mathcal {H}}_n), \\&\sigma ^{CD}\!\in \textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_m),{} & {} {{\,\textrm{Tr}\,}}_D\sigma ^{CD}\!=\sigma ^C \in \textrm{B}({\mathcal {H}}_m),{} & {} {{\,\textrm{Tr}\,}}_C\sigma ^{CD}\!=\sigma ^D\in \textrm{B}({\mathcal {H}}_m). \end{aligned}$$
Then \(\tau ^{ACBD}:=R_{n,m}(\rho ^{AB}\otimes \sigma ^{CD})R_{n,m}\) is in \(\textrm{B}({\mathcal {H}}_{(nm)^2})\). Furthermore
$$\begin{aligned} \begin{aligned}&{{\,\textrm{Tr}\,}}_{BD} \tau ^{ACBD}=\rho ^A\otimes \sigma ^C, \quad {{\,\textrm{Tr}\,}}_{AC} \tau ^{ACBD}=\rho ^B\otimes \sigma ^D,\\&{{\,\textrm{Tr}\,}}S_{n,m}\tau ^{ACBD}=\big ({{\,\textrm{Tr}\,}}S_n\rho ^{AB}\big )\big ({{\,\textrm{Tr}\,}}S_m \sigma ^{CD}). \end{aligned} \end{aligned}$$(A.2)
Proof
(a) View \(S_N\) and \(\rho \) as \(N^2\times N^2\) matrices with entries indexed by the row (i, p) and the column (j, q). Observe that \(S_N\) is a symmetric permutation matrix. Then \((S_N\rho )_{(i,p),(j,q)}=\rho _{(p,i)(j,q)}\). The trace of \(S_N\rho \) is obtained by summation on the entries \(p=q\) and \(i=j\).
(b) Clearly, \(\tau ^{ACBD}\in \textrm{B}({\mathcal {H}}_{(nm)^2})\). Assume that
Then
Observe next that \({{\,\textrm{Tr}\,}}_{BD}\tau ^{ACBD}\) is obtained when we sum on \(i_B=j_B\) and \(p_D=q_D\). Hence
Similarly \({{\,\textrm{Tr}\,}}_{AC}\tau ^{ACBD}=\rho ^B\otimes \sigma ^D\). This proves the first line in (A.2).
We now use (a) to compute \({{\,\textrm{Tr}\,}}S_{n,m}\tau ^{ACBD}\):
This proves the second line in (A.2). \(\square \)
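Part (a) of Lemma A.4 can be checked numerically. The sketch below is ours (variable names are our own): it builds \(S_N\) entrywise as the symmetric permutation matrix with \((S_N)_{(i,p),(j,q)}=\delta _{iq}\delta _{pj}\) and compares \({{\,\textrm{Tr}\,}}S_N\rho \) with the index formula \(\sum _{i,p}\rho _{(p,i)(i,p)}\).

```python
import numpy as np

N = 3
rng = np.random.default_rng(1)
rho = (rng.standard_normal((N * N, N * N))
       + 1j * rng.standard_normal((N * N, N * N)))

# SWAP operator on H_N (x) H_N: S_N |i>|p> = |p>|i>, i.e. the symmetric
# permutation matrix with entries (S_N)_{(i,p),(j,q)} = delta_{iq} delta_{pj}.
S = np.zeros((N * N, N * N))
for i in range(N):
    for p in range(N):
        S[i * N + p, p * N + i] = 1.0

# Lemma A.4(a): Tr(S_N rho) = sum_{i,p} rho_{(p,i),(i,p)}.
lhs = np.trace(S @ rho)
rhs = sum(rho[p * N + i, i * N + p] for i in range(N) for p in range(N))
print(np.isclose(lhs, rhs))  # True
```

Here the pair index \((i,p)\) is flattened as \(iN+p\) (0-based), matching the Kronecker-product ordering of \({\mathcal {H}}_N\otimes {\mathcal {H}}_N\).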
Appendix B: Quantum States of a Single Qubit System
In this Appendix we discuss additional properties of the quantum optimal transport for qubits. Section B.1 provides a closed formula (Theorem B.1) for \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)\) in terms of solutions of the trigonometric equation (B.1). Lemma B.2 shows that this trigonometric equation is equivalent to a polynomial equation of degree at most 6. Section B.2 gives a closed formula for the value of QOT between two isospectral qubit density matrices. In Section B.3 we present a simple example where the supremum of the dual SDP problem to QOT is not achieved.
1.1 B.1: A Semi-analytic Formula for the Single-Qubit Optimal Transport
We begin by introducing a convenient notation for qubits in the \(y=0\) section of the Bloch ball \(\Omega _2\)—see [4, Sect. 5.2]. Let O denote the orthogonal rotation matrix,
and define, for \(r\in [0,1]\),
Because of unitary invariance (2.14), the quantum transport problem between two arbitrary qubits \(\rho ^A, \rho ^B \in \Omega _2\) can be reduced to the case \(\rho ^A = \rho (s,0)\) and \(\rho ^B = \rho (r,\theta )\), with three parameters, \(s,r \in [0,1]\) and \(\theta \in [0,2\pi )\). The parameter \(\theta \) is the angle between the Bloch vectors associated with \(\rho ^A\) and \(\rho ^B\). With such a parametrization we can further simplify the single-qubit transport problem.
Observe first that if \(s \in \{0,1\}\) then \(\rho ^A\) is pure, and if \(r \in \{0,1\}\) then \(\rho ^B\) is pure. In any such case an explicit solution of the qubit transport problem is given by (6.2).
Theorem B.1
Let \(\rho ^A = \rho (s,0), \rho ^B = \rho (r,\theta )\) and assume that \(0<r,s<1\). Then
where \(\Phi (s,r,\theta )\) is the set of all \(\phi \in [0,2\pi )\) satisfying the equation
Proof
A unitary \(2 \times 2\) matrix U can be parametrized, up to a global phase, with three angles \(\alpha , \beta , \phi \in [0,2\pi )\),
Thus, setting \(f(r,\theta ;\alpha ,\phi ) = (U^{\dagger }\rho (r,\theta ) U)_{11}\), we have
This quantity does not depend on the parameter \(\beta \), so we can set \(\beta = 0\). Note also that \(f(s,0;\alpha ,\phi )\) does not depend on \(\alpha \). With \(\rho ^A = \rho (s,0), \rho ^B = \rho (r,\theta )\), Theorem 5.1 yields
Now, note that the equation \(\partial _\alpha f(r,\theta ;\alpha ,\phi ) = 0\) yields the extreme points \(\alpha _0 = k \pi /2\), with \(k \in {\mathbb {Z}}\). Since \(f(r,\theta ;\alpha + \pi ,\phi ) = f(r,\theta ;\alpha ,\phi )\) we can take just \(\alpha _0 \in \{0,\pi /2\}\). Consequently,
where we introduce the auxiliary functions
But since \(g_-(s,r,\theta ;2\pi - \phi ) = g_+(s,r,\theta ;\phi )\) we can actually drop the ± index in the above formula. In conclusion, we have shown that it is sufficient to take \(U = O(\phi )\) for \(\phi \in [0,2\pi )\) in Formula (5.2).
Finally, it is straightforward to show that the equation \(\partial _\phi g(s,r,\theta ;\phi ) = 0\) is equivalent to (B.1). Hence, \(\Phi (s,r,\theta )\) is the set of extreme points, and the theorem follows. \(\square \)
Lemma B.2
Equation (B.1) has at most six solutions \(\phi \in [0,2\pi )\) for given \(r,s\in (0,1)\) and \(\theta \in [0,2\pi )\). Moreover, there is an open set of parameters \(s,r\in (0,1),\theta \in [0,2\pi )\) for which there are exactly six distinct solutions.
Proof
Write \(z=e^{{\textbf{i}}\phi }, \zeta =e^{{\textbf{i}}\theta }\). Then
Thus (B.1) is equivalent to
This is a 6th-order polynomial equation in the variable z, so it has at most 6 solutions. Since we must have \(\vert z \vert = 1\), not every complex root of (B.3) yields a real solution of the original equation (B.1). Nevertheless, it can be shown that there exist open sets in the parameter space \(s,r \in (0,1)\), \(\theta \in [0,2\pi )\) on which (B.1) does have 6 distinct solutions.
Observe that if \(\theta =0\), \(s,r\in (0,1)\) and \(s\ne r\), then two solutions of equation (B.1) are \(\phi \in \{0,\pi \}\), which means that \(z=\pm 1\). In this case equation (B.1) reads
As \(\sin ^2\phi =-(1/4)z^{-2}(z^2-1)^2\) we see that \(z=\pm 1\) is a double root.
Another solution \(\phi \notin \{0,\pi \}\) is given by
Assume that \(r+s=1\). Then \(\cos \phi =0\), so \(\phi \in \{\pi /2, 3\pi /2\}\). Thus, if \(r+s\) is close to 1, then \(\phi \) takes two values, close to \(\pi /2\) and \(3\pi /2\) respectively. Hence in this case we have 6 solutions, counted with multiplicities.
We now take a small \(|\theta |>0\). The two simple solutions \(\phi \) are close to \(\pi /2\) and \(3\pi /2\). We now need to show that the double roots \(\pm 1\) split into two pairs of solutions on the unit circle: one pair close to 1 and the other pair close to \(-1\). Let us consider the pair close to 1, i.e., \(\phi \) close to zero. Then equation (B.1) can be written in the form
Replacing \(\sin \phi , \sin (\theta +\phi )\) by \(\phi , \theta +\phi \) respectively we see that the first term gives the equation: \((2s-1)^2(2r)\phi ^2-(2r-1)^2 2s(\theta +\phi )^2=0\). Then we obtain two possible Taylor series of \(\phi \) in terms of \(\theta \):
Use the implicit function theorem to show that \(E_1(\theta )\) and \(E_2(\theta )\) are analytic in \(\theta \) in the neighborhood of 0. Hence in this case we have 6 different solutions. \(\square \)
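For the reader's convenience we sketch how the leading coefficients of the two branches can be recovered; the shorthands \(a,b\) below are ours, and the computation only reproduces the lowest order of the series \(E_1,E_2\) under the leading-order substitution made above. Taking square roots in the quadratic equation gives

$$\begin{aligned} (2s-1)^2(2r)\,\phi ^2=(2r-1)^2(2s)(\theta +\phi )^2 \;\Longleftrightarrow \; a\phi =\pm \, b(\theta +\phi ), \end{aligned}$$

with \(a=\sqrt{2r}\,|2s-1|\) and \(b=\sqrt{2s}\,|2r-1|\). Solving for \(\phi \) yields the two branches

$$\begin{aligned} \phi = \frac{b\,\theta }{a-b}+O(\theta ^2), \qquad \phi = \frac{-b\,\theta }{a+b}+O(\theta ^2), \end{aligned}$$

which are distinct for generic \(s,r\in (0,1)\), consistent with the splitting of the double root at \(z=1\); the pair near \(z=-1\) is treated analogously.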
We have thus shown that the general solution of the quantum transport problem for a single qubit with cost matrix \(C^Q = \tfrac{1}{2} \big ({\mathbb {I}}_{4} - S\big )\) reduces to solving a 6th-degree polynomial equation depending on certain parameters. For some specific values of these parameters an explicit analytic solution can be given. This is discussed in the next subsection.
1.2 B.2: Two Isospectral Density Matrices of a Single Qubit
In view of unitary invariance (2.14) and the results of the previous section we can assume that two isospectral qubits have the following form: \(\rho ^A = \rho (s,0)\) and \(\rho ^B = \rho (s,\theta )\) for some \(s \in [0,1]\) and \(\theta \in [0,2\pi )\).
Theorem B.3
For any \(s \in [0,1]\) and \(\theta \in [0,2\pi )\) we have
Proof
Note first that if the states \(\rho ^A,\rho ^B\) are pure, i.e. \(s = 0\) or \(s=1\), formula (B.4) gives \(\textrm{T}^Q_{C^Q} \big (\rho (s,0),\rho (s,\theta ) \big ) = \tfrac{1}{2} \sin ^2 (\theta / 2)\), which agrees with (6.2).
From now on we assume that \(\rho ^A, \rho ^B\) are not pure. When \(r = s\), (B.3) simplifies to the following:
Equation (B.5) is satisfied when \(z = \pm \zeta ^{-1/2}\). This corresponds to \(\phi _0 = -\theta /2\) or \(\phi _0' = \pi - \theta /2\). Observe, however, that we have \(g(s,s,\theta ;\phi _0) = g(s,s,\theta ;\phi _0') = 0\), so we can safely ignore \(\phi _0, \phi _0' \in \Phi (s,s,\theta )\) when taking the maximum in Theorem B.1.
Hence, we are left with a 4th order equation
which reads
Now, observe that if \(\phi \) satisfies (B.7), then so does \(\phi ' = -\phi - \theta \). This translates to the fact that if z satisfies (B.6), then so does \((z \zeta )^{-1}\). Furthermore, \(g(s,s,\theta ;\phi ) = g(s,s,\theta ;\phi ')\). Hence, in the isospectral case we are effectively taking the maximum over just two values of \(\phi \).
Let us now seek an angle \(\phi _1 \in [0,2\pi )\) such that \(g(s,s,\theta ;\phi _1)\) equals the right-hand side of (B.4). The latter equation reads
In terms of z and \(\zeta \), the above is equivalent to a 4th order polynomial equation in z, which can be recast in the following form:
Hence, (B.8) has two double roots:
Furthermore, one can check that \(z_1^{-} = (\zeta z_1^{+})^{-1}\).
Now, it turns out that \(z_1^{\pm }\) are also solutions to (B.6), as one can quickly verify using Mathematica [64]. We thus conclude that \(\phi _1, \phi _1' \in \Phi (s,s,\theta )\).
We now divide the polynomial in (B.6) by \((z-z_1^{+})(z-z_1^{-})\). We are left with the following quadratic equation
Its solutions are
Again, we have \(z_2^{-} = (\zeta z_2^{+})^{-1}\), in agreement with the symmetry argument. Setting \(z_2^+ =:e^{{\textbf{i}}\phi _2}\) and \(z_2^- =:e^{{\textbf{i}}\phi _2'}\) we have \(\phi _2, \phi _2' \in \Phi (s,s,\theta )\). Then we deduce that
Finally, we observe that
This shows that, for any \(s \in (0,1)\), \(\theta \in [0,2\pi )\),
and (B.4) follows. \(\square \)
Note that \(g(s,s,\theta ;\phi _2)\) can become negative for certain values of s and \(\theta \). This means that for such values the set \(\Phi \) of phases defined in Theorem B.1 reads \(\Phi (s,s,\theta ) = \{\phi _0,\phi _0',\phi _1,\phi _1'\}\).
1.3 B.3: An Example Where the Supremum (3.1) is not Achieved
Assume that \(m=n=2\), \( C=C^Q\), \( \rho ^A=\vert 0 \rangle \langle 0 \vert = \left[ {\begin{matrix} 1 &{}\quad 0 \\ 0 &{}\quad 0 \end{matrix}} \right] \) and \( \rho ^B={\mathbb {I}}_2/2\). Recall that in such a case, \(\Gamma ^Q(\rho ^A,\rho ^B)=\{\rho ^A\otimes \rho ^B\}\) and
Hence \(\textrm{T}^Q_{C^Q}(\rho ^A,\rho ^B)=1/4\). We can easily see that the supremum in (3.1) is not attained in this case. Let F be of the form (5.12). Suppose that there exist \(\sigma ^A,\sigma ^B\in \textrm{S}({\mathcal {H}}_2)\) such that \(F\ge 0\) and \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)={{\,\textrm{Tr}\,}}(\sigma ^A\rho ^A+\sigma ^B\rho ^B)\). As in the proof of Theorem 3.2 we deduce that \({{\,\textrm{Tr}\,}}F( \rho ^A\otimes \rho ^B)=0\). Hence the (1, 1) and (2, 2) entries of F are zero. Since \(F\ge 0\) it follows that the first and the second row and column of F are zero. Observe next that the (2, 3) and (3, 2) entries of F are \(-1/2\), which contradicts the vanishing of the second row and column. Hence such \(\sigma ^A,\sigma ^B\) do not exist.
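The primal value 1/4 can be checked directly in a few lines of NumPy: since the coupling set here is the singleton \(\{\rho ^A\otimes \rho ^B\}\) (Lemma A.3), the transport cost is just a trace against the cost matrix. The sketch below is ours (variable names are our own).

```python
import numpy as np

# The unique coupling of rho^A = |0><0| and rho^B = I/2 is rho^A (x) rho^B,
# so the transport cost is Tr[C^Q (rho^A (x) rho^B)] with
# C^Q = (I_4 - S)/2, the projector onto the antisymmetric subspace.
S = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=float)
C_Q = (np.eye(4) - S) / 2

rho_A = np.array([[1, 0], [0, 0]], dtype=float)
rho_B = np.eye(2) / 2
T = np.trace(C_Q @ np.kron(rho_A, rho_B))
print(T)  # 0.25
```

Equivalently, \(C^Q\) is the rank-one projector onto the singlet \((|01\rangle -|10\rangle )/\sqrt{2}\), and the trace above is the overlap of the product state with that singlet.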
Since \(\rho ^A\) is not positive definite and \(\rho ^B\) is positive definite, as pointed out in the proof of Proposition 2.4, one can replace \(\rho ^{A}\) by \(\rho ^{A'}=[1]\in \Omega _1\). Then the dual problem for \(\rho ^{A'},\rho ^B\) boils down to
Then the above maximum is 1/4, achieved for \(a'=-1/2 +t, e=1/2-t, g=-t\) for each \(t\in {\mathbb {R}}\).
To summarize: the supremum of the dual problem to \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)\) is achieved at infinity, while the supremum of the dual problem to \(\textrm{T}_{C^Q}^Q(\rho ^{A'},\rho ^B)\) is achieved on an unbounded set.
Appendix C: Diagonal States of a Qutrit
In this section we provide a closed formula for \(\textrm{T}_{C^{Q}}^Q(\rho ^A,\rho ^B)\) for a large class of classical qutrits, i.e. diagonal matrices \(\rho ^A,\rho ^B\in \Omega _3\).
Theorem C.1
Let \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,t_3)^\top \in {\mathbb {R}}^3\) be probability vectors. Then the quantum optimal transport cost between diagonal qutrits is given by the formulas below in the following cases:
(a)
$$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}\max _{p\in [3]}(\sqrt{s_p}-\sqrt{t_p})^2 \end{aligned}$$
if and only if the conditions (5.9) hold for \(n=3\).
(b) Suppose that there exists a renaming of 1, 2, 3 by p, q, r such that
$$\begin{aligned} \begin{aligned}&t_r\ge s_p+s_q \text { and}\\&\text {either } s_p\ge t_p>0, s_q\ge t_q>0 \; \text { or } \; t_p\ge s_p>0, t_q\ge s_q >0. \end{aligned} \end{aligned}$$(C.1)
Then
$$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}\Big ((\sqrt{s_p}-\sqrt{t_p})^2 +(\sqrt{s_q}-\sqrt{t_q})^2\Big ). \end{aligned}$$(C.2)
(c) Suppose that there exists \(\{p,q,r\}=\{1,2,3\}\) such that
$$\begin{aligned} s_p>t_q>0, \quad t_p>s_q>0, \quad s_q+s_r\ge t_p, \end{aligned}$$(C.3)
and
$$\begin{aligned} \begin{aligned}&1 +\frac{\sqrt{t_q}}{\sqrt{s_q}}-\sqrt{\frac{s_p-t_q}{t_p-s_q}}\ge 0, \qquad 1+\frac{\sqrt{s_q}}{\sqrt{t_q}}-\sqrt{\frac{t_p-s_q}{s_p-t_q}}\ge 0,\\&\left( 1+\frac{\sqrt{t_q}}{\sqrt{s_q}}-\sqrt{\frac{s_p-t_q}{t_p-s_q}} \,\right) \left( 1+\frac{\sqrt{s_q}}{\sqrt{t_q}}-\sqrt{\frac{t_p-s_q}{s_p-t_q}} \, \right) \ge 1, \\&\max \Big (\frac{s_q}{t_q},\frac{t_q}{s_q}\Big )\ge \max \Big (\frac{s_p-t_q}{t_p-s_q},\frac{t_p-s_q}{s_p-t_q} \Big ). \end{aligned} \end{aligned}$$(C.4)
Then
$$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))= \frac{1}{2}\Big ((\sqrt{s_q}-\sqrt{t_q})^2 +(\sqrt{s_p-t_q}-\sqrt{t_p-s_q})^2\Big ). \end{aligned}$$(C.5)
(d) Assume that \({\textbf{s}}=(s_1,s_2,0)^\top ,{\textbf{t}}=(t_1,t_2,t_3)^\top \) are probability vectors. Then
$$\begin{aligned} \textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))= {\left\{ \begin{array}{ll} \frac{1}{2}\big ((\sqrt{t_1}-\sqrt{t_2})^2+t_3\big ), &{} \text { if } s_1\ge t_2 \text { and } s_2\ge t_1,\\ \frac{1}{2}\big ((\sqrt{t_1}-\sqrt{s_1})^2+t_3\big ), &{} \text { if } s_1< t_2,\\ \frac{1}{2}\big ((\sqrt{t_2}-\sqrt{s_2})^2+t_3 \big ), &{} \text { if } s_2< t_1. \end{array}\right. } \end{aligned}$$(C.6)
If \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,0)^\top \), then formula (C.6) holds after swapping \(s_i \leftrightarrow t_i\).
Proof
(a) This follows from Theorem 5.3.
(b) Suppose that the condition (C.1) holds. By relabeling the coordinates and interchanging \({\textbf{s}}\) and \({\textbf{t}}\) if needed we can assume the conditions (C.1) are satisfied with \(p=1,q=2, r=3\):
Hence
We claim that the conditions (C.1) yield that \(X^\star \) is a minimizing matrix for \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) as given in (4.5). To show this we use the complementary conditions in Lemma 5.2. Let \(R^\star \in \Gamma ^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) be the matrix induced by \(X^\star \) of the form described in part (a) of Lemma 4.2. That is, the diagonal entries of \(R^\star \) are \(R^\star _{(i,j)(i,j)}=x^\star _{ij}\) with additional nonnegative entries: \(R^\star _{(i,j)(j,i)}=\sqrt{x^\star _{ij}x^\star _{ji}}\) for \(i\ne j\). Clearly, \(R^\star \) is a direct sum of three submatrices of order 1 and three of order 2, as above. Let \(F^\star \) be defined as in Lemma 5.2 with the following parameters:
We claim that the conditions (C.1) yield that \(F^\star \) is positive semidefinite. We verify that the three blocks of size one and the three blocks of size two of \(F^\star \) are positive semidefinite. The condition \(a_i^\star +b_i^\star \ge 0\) for \(i\in [3]\) is straightforward. The conditions for \(M_{13}^\star \) and \(M_{23}^\star \) are straightforward. We now show that \(M_{12}^\star \) is positive semidefinite. First note that as \(s_1\ge t_1\) and \(s_2\ge t_2\) we get that \(b_1^\star \ge 0\) and \(b_2^\star \ge 0\). Clearly \(a_1^\star >-1/2\) and \(a_2^\star >-1/2\). Hence the diagonal entries of \(M_{12}^\star \) are positive. It remains to show that \(\det M_{12}^\star \ge 0\). Set \(u=\sqrt{t_1}/{\sqrt{s_1}}\le 1\) and \(v=\sqrt{s_2}/{\sqrt{t_2}}\ge 1\). Then
We next observe that the equalities (5.3) hold. The first three equalities hold as \(x_{11}^\star =x_{22}^\star =(a_3^\star +b_3^\star )=0\). The equality for \(i=1,j=2\) holds as \(x_{12}^\star =x_{21}^\star =0\). The equalities for \(i=1, j=3\) and \(i=2,j=3\) follow from the following equalities:
Hence \({{\,\textrm{Tr}\,}}R^\star F^\star =0\) and \(X^\star \) is a minimizing matrix. Therefore (C.2) holds for \(p=1\), \(q=2\).
(c) Suppose that the condition (C.3) holds. By relabeling the coordinates we can assume the conditions (C.3) are satisfied with \(p=1,q=2, r=3\):
Hence
We claim that the conditions (C.4) yield that \(X^\star \) is a minimizing matrix for
\(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) as given in (4.5). To show this we use the complementary conditions in Lemma 5.2. Let \(R^\star \in \Gamma ^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) be the matrix induced by \(X^\star \) of the form described in part (a) of Lemma 4.2. Recall that \(R^\star \) is a direct sum of 3 submatrices of order 1 and 3 of order 2 as above. Let \(F^\star \) correspond to
We claim that the conditions (C.4) yield that \(F^\star \) is positive semidefinite. We verify that the three blocks of size one and the three blocks of size two of \(F^\star \) are positive semidefinite. The condition \(a_1^\star +b_1^\star \ge 0\) is straightforward. To show the condition \(a_2^\star +b_2^\star \ge 0\) we argue as follows. Let
Then \(2(a_2^\star +b_2^\star )=u+1/u-(v+1/v)\). The fourth condition of (C.4) is \(\max (u,1/u)\ge \max (v,1/v)\). As \(w+1/w\) increases on \([1,\infty )\) we deduce that \(a_2^\star +b_2^\star \ge 0\). Clearly \(a_3^\star +b_3^\star =0\). We now show that the matrices (5.5) are positive semidefinite, where the last three inequalities follow from the first three inequalities of (C.4):
Moreover, the conditions (5.3) hold: as \(x_{11}^\star =x_{22}^\star = a_3^\star +b^\star _3=0\) the first three conditions of (5.3) hold, and as \(x_{23}^\star =x_{32}^\star =0\) the second conditions of (5.3) for \(i=2,j=3\) trivially hold. The other two conditions follow from the following equalities:
Hence \({{\,\textrm{Tr}\,}}F^\star R^\star =0\). Therefore
This proves (C.5).
(d) Observe that the third row of every matrix in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) is a zero row. Let \({\textbf{s}}'=(s_1,s_2)^\top \). Thus \(\Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) is obtained from \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) by deleting the third row in each matrix in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). Proposition 2.4 yields that
(See Lemma 4.3 for the definition of \(C_{2, 3}^Q\).) We now use the minimum characterization of \(\textrm{T}_{C^Q_{2,3}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}'),{{\,\textrm{diag}\,}}({\textbf{t}}))\) given in (4.5). Assume that the minimum is achieved for \(X^\star =[x_{il}^\star ]\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}}), i\in [2],l\in [3]\). We claim that either \(x_{11}^\star =0\) or \(x_{22}^\star =0\).
Let \(Y=[x_{il}^\star ], i,l\in [2]\). Suppose first that \(Y=0\). Then \(t_1=t_2=0\) and \(t_3=1\). So \({{\,\textrm{diag}\,}}({\textbf{t}})\) is a rank-one matrix and \({{\,\textrm{Tr}\,}}\big ( {{\,\textrm{diag}\,}}({\textbf{s}}){{\,\textrm{diag}\,}}({\textbf{t}}) \big )=0\). The equality (6.2) yields that \(\textrm{T}_{C^Q}^Q \big ({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}) \big )=1\). Clearly, \(s_1\ge t_2=0, s_2\ge t_1=0\). Hence (C.6) holds.
Suppose second that \(Y\ne 0\). Then \(t_1+t_2\), the sum of the entries of Y, is positive. Using continuity arguments it is enough to consider the case \(t_1,t_2,t_3>0\). Denote by \(\Gamma '\) the set of all matrices \(X=[x_{il}] \in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) such that \(x_{i3}=x_{i3}^\star \) for \(i=1,2\). Let f be defined by (4.8). Clearly \(\min _{A\in \Gamma '} f(A)=f(Y)\). We now translate this minimum to the minimum problem we studied above.
Let \(Z=\frac{1}{t_1+t_2} Y\). The vectors corresponding to the row sums and the column sums of Z are the probability vectors \({\hat{{\textbf{s}}}} = ({\hat{s}}_1,{\hat{s}}_2)^\top \) and \({\hat{{\textbf{t}}}}=\frac{1}{t_1+t_2}(t_1,t_2)^\top \) respectively. Consider the minimum problem \(\min _{W\in \Gamma ^{cl}({\hat{{\textbf{s}}}},{\hat{{\textbf{t}}}})} f(W)\). The proof of Lemma 4.5 yields that this minimum is achieved at \(W^\star \) which has at least one zero diagonal element. Hence Y has at least one zero diagonal element.
Assume first that Y has two zero diagonal elements. Then \(X^\star =\begin{bmatrix}0&{}t_2&{}s_1-t_2\\ t_1&{}0&{}s_2-t_1\end{bmatrix}\). This corresponds to the first case of (C.6). It is left to show that \(X^\star \) is a minimizing matrix. By a continuity argument we may assume that \(s_1>t_2, s_2>t_1\). Let \(B\in {\mathbb {R}}^{2\times 3}\) be a nonzero matrix such that \(X^\star +cB\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) for \(c\in [0,\varepsilon ]\) for some small positive \(\varepsilon \). Then \(B=\begin{bmatrix}a&{}-b&{}-a+b\\ -a&{}b&{}a-b\end{bmatrix}\), where \(a,b\ge 0\) and \(a^2+b^2>0\). It is clear that \(f(X^\star )<f(X^\star +cB)\) for each \(c\in (0,\varepsilon ]\). This proves the first case of (C.6).
Assume second that \(x_{11}^\star =0\) and \(x_{22}^\star >0\). Observe that \(x_{21}^\star =t_1>0\). We claim that \(x_{13}^\star =0\). Indeed, suppose that it is not the case. Let \(B=\begin{bmatrix}0&{}1&{}-1\\ 0&{}-1&{}1\end{bmatrix}\). Then \(X^\star + cB\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) for \(c\in [0,\varepsilon ]\) for some positive \(\varepsilon \). Clearly \(f(X^\star +cB)<f(X^\star )\) for \(c\in (0,\varepsilon ]\). This contradicts the minimality of \(X^\star \). Hence \(x_{13}^\star =0\). Therefore \(X^\star =\begin{bmatrix}0&{}s_1&{}0\\ t_1&{}t_2-s_1&{}t_3\end{bmatrix}\). This corresponds to the second case of (C.6).
The third case is when \(x_{11}^\star >0\) and \(x_{22}^\star =0\). We show, as in the second case, that \(x_{23}^\star =0\). Then \(X^\star =\begin{bmatrix}t_1-s_2&{}t_2&{}t_3\\ s_2&{}0&{}0\end{bmatrix}\). This corresponds to the third case of (C.6).
The case \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,0)^\top \) is completely analogous, hence the proof is complete. \(\square \)
Based on our numerical studies we conjecture that the cases (a)–(d) exhaust the parameter space \(\Pi _3 \times \Pi _3\). Nevertheless, we include for completeness an analysis of the quantum optimal transport \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) under the assumption that this is not the case. The employed techniques might prove useful when studying more general qutrit states or diagonal ququarts.
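The numerical studies mentioned above can be reproduced in miniature. The sketch below assumes that, for diagonal states, the objective of (4.5) takes the form \(f(X)=\frac{1}{2}\sum _{i<j}(\sqrt{x_{ij}}-\sqrt{x_{ji}})^2\); this expression is not quoted from (4.8) but is consistent, term by term, with the closed forms (C.2), (C.5) and (C.6). Under that assumption a coarse grid search over \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) recovers the case (b) value:

```python
from itertools import product
from math import sqrt

def f(x):
    # Assumed objective for diagonal states (see lead-in; not quoted from (4.8)):
    # f(X) = 1/2 * sum_{i<j} (sqrt(x_ij) - sqrt(x_ji))^2.
    return 0.5 * sum((sqrt(x[i][j]) - sqrt(x[j][i])) ** 2
                     for i in range(3) for j in range(i + 1, 3))

def grid_min(s, t, n=20):
    # Brute-force f over Gamma^cl(s, t): a 3x3 coupling is determined by the
    # four entries x11, x12, x21, x22 together with the row sums s and the
    # column sums t; scan those four entries over a grid.
    best = float("inf")
    rng = [k / n for k in range(n + 1)]
    for x11, x12, x21, x22 in product(rng, repeat=4):
        x13 = s[0] - x11 - x12
        x23 = s[1] - x21 - x22
        x31 = t[0] - x11 - x21
        x32 = t[1] - x12 - x22
        x33 = s[2] - x31 - x32
        X = [[x11, x12, x13], [x21, x22, x23], [x31, x32, x33]]
        if all(e >= 0.0 for row in X for e in row):
            best = min(best, f(X))
    return best

# Case (b): s, t satisfy (C.1) with (p, q, r) = (1, 2, 3), and the optimal
# coupling (x11 = x12 = x21 = x22 = 0) lies exactly on the grid.
s, t = (0.2, 0.2, 0.6), (0.1, 0.1, 0.8)
closed = 0.5 * ((sqrt(s[0]) - sqrt(t[0])) ** 2 + (sqrt(s[1]) - sqrt(t[1])) ** 2)
gm = grid_min(s, t)
print(gm, closed)
```

Since the optimal coupling lies on the grid, the grid minimum coincides with the closed-form value (C.2) up to floating-point error.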
Proposition C.2
Let \(O\subset \Pi _3\times \Pi _3\) be the set of pairs \(({\textbf{s}},{\textbf{t}})\) which satisfy none of the conditions (a)–(d) from Theorem C.1. Suppose that O is nonempty. Then each minimizing \(X^\star \) in the characterization (4.5) of \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) has zero diagonal. Let \(O'\subset O\) be an open dense subset of O such that for each \(({\textbf{s}},{\textbf{t}})\in O'\) and each triple \(\{p,q,r\}=[3]\) the inequalities \(s_p\ne t_q\) and \(s_p+s_q\ne t_r\) hold. Assume that \(({\textbf{s}},{\textbf{t}})\in O'\). The set of matrices in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) with zero diagonal is an interval spanned by two distinct extreme points \(E_1,E_2\), which have exactly five positive off-diagonal elements. Let \(Z(u)=uE_1+(1-u)E_2\) for \(u\in [0,1]\). Then the minimum of the function \(f(Z(u)), u\in [0,1]\), where f is defined by (4.8), is attained at a unique point \(u^\star \in (0,1)\). The point \(u^\star \) is the unique solution in the interval (0, 1) of a polynomial equation of degree at most 12. The matrix \(X^\star =Z(u^\star )\) is the minimizing matrix for the second minimum problem in (4.5), and \(\textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=f(X^\star )\).
Proof
Assume first that the set \(O\subset \Pi _3\times \Pi _3\) is nonempty. Combine Theorem 5.3 with part (a) of the theorem to deduce that if the conditions (5.9) do not hold for \(n=3\) then
In view of our assumption the above inequality holds. We first observe that \(s_p\ne t_p\) for each \(p\in [3]\). Assume to the contrary that \(s_p=t_p\). Without loss of generality we can assume that \(s_3=t_3\). Assume that in addition \(s_q=t_q\) for some \(q\in [2]\). Then \({\textbf{s}}={\textbf{t}}\) and
This contradicts (C.11). Hence there exists \(q\in [2]\) such that \(s_q>t_q\). Without loss of generality we can assume that \(s_2>t_2\), therefore \(s_1<t_1\), as \(s_1+s_2=t_1+t_2=1-s_3=1-t_3\). Hence for \(Y=\begin{bmatrix}s_1&{}0\\ t_1-s_1&{}t_2\end{bmatrix}\) we have \(X=Y\oplus [s_3]\in \Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). Recall that \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\). We replace Y by \(Y^\star =Y+u^\star \begin{bmatrix}-1&{}1\\ 1&{}-1\end{bmatrix}\) such that \(u^\star >0, Y^\star \ge 0\) and one of the diagonal elements of \(Y^\star \) is zero. By relabeling \(\{1,2\}\) if necessary we can assume that \(Y^\star =\begin{bmatrix}0&{} s_1\\ t_1&{}t_2-s_1\end{bmatrix}\). So \(t_2\ge s_1\) and \(X^\star =Y^\star \oplus [s_3]\in \Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). The minimum characterization (4.5) of \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) yields
This contradicts (C.11).
As \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\) there exists a maximizing matrix \(F^\star \) to the dual problem of the form given by Lemma 5.2. Let \(X^\star \) be the corresponding minimizing matrix. We claim that \(X^\star \) has zero diagonal. Assume first that \(X^\star \) has a positive diagonal. Then the arguments in part (b) of Lemma 5.2 yield that \(X^\star \) is a symmetric matrix. Thus \({\textbf{s}}={\textbf{t}}\), and this contradicts (C.11).
Assume second that \(X^\star \) has two positive diagonal entries. By renaming the indices we can assume that \(x_{11}^\star =0\), \(x_{22}^\star , x_{33}^\star >0\). Part (b) of Lemma 5.2 and the arguments of its proof yield that we can assume that \(a_2^\star =a_3^\star =b_2^\star =0\). Let \(u^\star =a_1^\star +1/2, v^\star =b_1^\star +1/2\). As \(M_{12}^\star \) is positive semidefinite we have the inequalities: \(u^\star \ge 0, v^\star \ge 0, u^\star v^\star \ge 1/4\). Hence \(u^\star>0, v^\star >0\). Recall that \(F^\star \) is a maximizing matrix for the dual problem (3.1). Hence
This contradicts (C.11).
We now assume that \(X^\star \) has one positive diagonal entry. By renaming the indices 1, 2, 3 we can assume that \(x_{11}^\star =x_{22}^\star =0, x_{33}^\star >0\). The conditions (5.3) yield that \(a_3^\star +b_3^\star =0\). Since we can choose \(b_3^\star =0\) we assume that \(a_3^\star =b_3^\star =0\).
Let us assume, case (A1), that \(X^\star \) has six positive off-diagonal entries. We first claim that either \(x^\star _{13}=x^\star _{31}\) or \(x^\star _{23}=x^\star _{32}\). (Those are equivalent conditions if we interchange the indices 1 and 2.) We deduce these conditions and an extra condition using the second conditions of (5.4). First we consider \(x^\star _{12}, x_{13}^\star , x_{32}^\star ,x_{33}^\star \), that is \(i=p=3\), \(j=1, q=2\). By replacing these entries by \(x^\star _{12}-v, x_{13}^\star +v, x_{32}^\star +v,x_{33}^\star -v\) we obtain the equalities
Second we consider \(x^\star _{21}, x_{23}^\star , x_{31}^\star ,x_{33}^\star \). By replacing these entries by \(x^\star _{21}-v, x_{23}^\star +v, x_{31}^\star +v,x_{33}^\star -v\) we obtain the equality:
Multiply the first and the second equality to deduce
Assume first that \(x=u=y/z\). Substitute that into the first equality to deduce that \(z=1\), which implies that \(x_{23}^\star =x_{32}^\star \). Similarly, if \(x=1/u\) we deduce that \(y=1\), which implies that \(x_{13}^\star =x_{31}^\star \). Let us assume for simplicity of exposition that \(x_{23}^\star =x_{32}^\star \). Let X(w) be obtained from \(X^\star \) by replacing \(x_{22}^\star =0,x_{23}^\star , x_{32}^\star , x_{33}^\star \) with \(x_{22}^\star +w,x_{23}^\star -w, x_{32}^\star -w, x_{33}^\star +w\) for \(0<w<x_{23}^\star \). Then X(w) is a minimizing matrix and has two positive diagonal entries. This contradicts our assumption that \(X^\star \) has only one positive diagonal entry.
We now consider the case (A2) that \(x^\star _{ij}=0\) for some \(i\ne j\). Part (a) of Lemma 5.2 yields that \(x^\star _{ji}=0\). We claim that all four remaining off-diagonal entries are positive. Assume to the contrary that \(x^\star _{pq}=0\) for some \(p\ne q\) and \(\{p,q\}\ne \{i,j\}\). Then \(x_{qp}^\star =0\). As \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\) we must have that \(x^\star _{12}x^\star _{21}>0\) and all four other off-diagonal entries are zero. But then \(s_1=t_2, t_1=s_2, s_3=t_3\). This is impossible since we showed that \(s_3\ne t_3\). Hence \(X^\star \) has exactly four positive off-diagonal entries.
Let us assume first that \(x_{12}^\star =x_{21}^\star =0\). Then \(X^\star \) is of the form given by (C.7), where \(t_3>s_1+s_2\). We now recall again the conditions (5.3). As we already showed, we can assume that \(a_3^\star =b_3^\star =0\). As \(x_{11}^\star =x_{22}^\star =0\) all of the first three conditions of (5.3) hold. As \(x_{12}^\star =x_{21}^\star =0\) the second condition of (5.3) holds trivially for \(i=1,j=2\). The conditions for \(i=1,j=3\) and \(i=2, j=3\) are
We claim that (C.8) holds. Using the assumption that \(\det M_{13}^\star \ge 1/4\) and the inequality of arithmetic and geometric means we deduce that \(\det M_{13}^\star = 1/4\). Hence
Equality holds if and only if \(u=\sqrt{t_1}/(2\sqrt{s_1})\). This shows the first equality in (C.8). The second equality in (C.8) is deduced similarly. We now show that the conditions (C.1) hold for \(p=1,q=2,r=3\). As \(t_3>s_1+s_2\) the first condition of (C.1) holds. We use the condition that \(M_{12}^\star \) is positive semidefinite. Let \(u=\sqrt{t_1}/{\sqrt{s_1}}, v=\sqrt{s_2}/{\sqrt{t_2}}\). Then the arguments of the proof of part (b) yield
So either \(u\ge 1\) and \(v\le 1\), or \(u\le 1\) and \(v\ge 1\). Hence (C.1) holds for \(p=1,q=2,r=3\). This contradicts our assumption that (C.1) does not hold.
Let us assume second that \(x_{12}^\star>0, x_{21}^\star >0\). Then either \(x_{13}^\star =x_{31}^\star =0\) or \(x_{23}^\star =x_{32}^\star =0\). By relabeling 1, 2 we can assume that \(x_{23}^\star =x_{32}^\star =0\). Hence \(X^\star \) is of the form (C.9), where \(s_1>t_2>0, t_1>s_2>0, s_2+s_3>t_1\). Hence the conditions (C.3) are satisfied with \(p=1,q=2, r=3\). We now obtain a contradiction by showing that the conditions (C.4) are satisfied. This is done using the same arguments as in the previous case as follows. First observe that the second nontrivial conditions of (5.3) are:
As in the previous case we deduce that
Hence (C.10) holds. We now recall the proof of part (c) of the theorem to obtain a contradiction. We have thus shown that the minimizing matrix \(X^\star \) has zero diagonal.
We now show that O is an open set. Clearly, the set of all pairs of probability vectors \(O_1\subset \Pi _3\times \Pi _3\) such that at least one of them has a zero coordinate is a closed set. Let \(O_2, O_3, O_4\subset \Pi _3\times \Pi _3\) be the sets which satisfy the conditions (a), (b), (c) of the theorem, respectively. It is straightforward to show that \(O_2\) is a closed set and that Closure\((O_3)\subset (O_3\cup O_1)\). We now show that Closure\((O_4)\subset O_4\cup O_1\cup O_2\). Indeed, assume that we have a sequence \(({\textbf{s}}_l,{\textbf{t}}_l)\in O_4, l\in {\mathbb {N}}\) that converges to \(({\textbf{s}},{\textbf{t}})\). It is enough to consider the case where \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\). Again we can assume for simplicity that each \(({\textbf{s}}_l,{\textbf{t}}_l)\) satisfies the conditions (C.3) and (C.4) for \(p=1, q=2, r=3\). Then the minimizing matrices \(X^\star _l\) converge to a limit \(X^\star =\lim _{l\rightarrow \infty }X^\star _l\) of the form (C.9). Moreover, \(X^\star \) is a minimizing matrix for \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\). Recall that \(s_2,t_2>0\). If \(s_1-t_2>0, t_1-s_2>0\), then \(({\textbf{s}},{\textbf{t}})\in O_4\). So assume that \((s_1-t_2)(t_1-s_2)=0\). As \(X^\star \) minimizes \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) and \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\), part (a) of Lemma 5.2 yields that \(s_1=t_2, t_1=s_2\). Hence \(s_3=t_3\). As \(X^\star \) minimizes \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) we get that \(\textrm{T}^Q_{C^Q}=\frac{1}{2}(\sqrt{s_2}-\sqrt{t_2})^2\). Hence \(({\textbf{s}},{\textbf{t}})\in O_2\). This shows that \(O_1\cup O_2\cup O_3\cup O_4\) is a closed set. Therefore \(O=\Pi _3\times \Pi _3{\setminus }(O_1\cup O_2\cup O_3\cup O_4)\) is an open set.
If O is empty, then the proof is concluded.
Assume that O is a nonempty set. Let \(O'\subset O\) be an open dense subset of O such that for each \(({\textbf{s}},{\textbf{t}})\in O'\) and each triple \(\{p,q,r\}=[3]\) the inequality \(s_p\ne t_q\) and \(s_p+s_q\ne t_r\) hold.
Assume that \(({\textbf{s}},{\textbf{t}})\in O'\). Let \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) be the convex subset of \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) of matrices with zero diagonal. We claim that any \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has at least 5 nonzero entries. Indeed, suppose that \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has two zero off-diagonal entries. As \({\textbf{s}},{\textbf{t}}>0\) they cannot be in the same row or column. By relabeling the rows we can assume that the two zero elements are in the first and the second row. Suppose first that \(x_{12}=x_{23}=0\). Then \(X=\begin{bmatrix}0&{}0&{}s_1\\ s_2&{}0&{}0\\ t_1-s_2&{}t_2&{}0\end{bmatrix}\). Thus \(s_1=t_3\), which is impossible. Assume now that \(x_{12}=x_{21}=0\). Then \(s_1+s_2=t_3\), which is impossible. All other choices are similarly impossible.
We claim that \( \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) is spanned by two distinct extreme points \(E_1,E_2\), which have exactly five positive off-diagonal elements. Suppose first that there exists \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) which has six positive off-diagonal elements. Let
Then all matrices in \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) are of the form \(X+uB, u\in [u_1,u_2]\) for some \(u_1<u_2\). Consider the matrix \(E_1=X+u_1B\). It has at least one zero off-diagonal entry, hence we conclude that \(E_1\) has exactly five positive off-diagonal elements. Similarly \(E_2=X+u_2B\) has five positive off-diagonal elements. Assume now that \(E\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has five positive off-diagonal elements. Hence there exists a small \(u>0\) such that either \(E+uB\) or \(E-uB\) has six positive off-diagonal elements. Hence \( \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) contains a matrix with six positive off-diagonal elements. Therefore \( \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) is an interval spanned by \(E_1\ne E_2\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\), where \(E_1\) and \(E_2\) have five positive off-diagonal elements. Part (a) of Lemma 5.2 yields that \(X^\star \) has six positive off-diagonal elements. Consider \(E_1\) and assume that the (1, 2) entry of \(E_1\) is zero. Then
As \(f(E_1+uB)\) is strictly convex on \([0,u_3]\), there exists a unique \(u^\star \in (0, u_3)\) which satisfies the equation
It is not difficult to show that the above equation is equivalent to a polynomial equation of degree at most 12 in u. Indeed, group the six terms into three groups, multiply by the common denominator, and pass the last group to the other side of the equality to obtain the equality:
Raise this equality to the second power. Put all polynomial terms of degree 6 on the left-hand side, and the one term with a square radical on the other side. Raise to the second power to obtain a polynomial equation in u of degree at most 12. Hence \(X^\star =E_1+u^\star B\). This completes the proof. \(\square \)
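In the situation of Proposition C.2, one can also locate \(u^\star \) numerically rather than through the degree-\(\le 12\) polynomial equation, since \(f(Z(u))\) is strictly convex on the interval. The sketch below is purely illustrative: it again assumes \(f(X)=\frac{1}{2}\sum _{i<j}(\sqrt{x_{ij}}-\sqrt{x_{ji}})^2\) (an assumption consistent with the closed forms of this appendix, but not quoted from (4.8)), and the sample vectors \({\textbf{s}},{\textbf{t}}\) are chosen by us only so that the zero-diagonal set \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) is a nondegenerate interval; they are not claimed to lie in O, which is conjecturally empty.

```python
from math import sqrt

def f(x):
    # Assumed objective (see lead-in): 1/2 * sum_{i<j} (sqrt(x_ij) - sqrt(x_ji))^2
    return 0.5 * sum((sqrt(x[i][j]) - sqrt(x[j][i])) ** 2
                     for i in range(3) for j in range(i + 1, 3))

def z(u):
    # Zero-diagonal couplings of s = (0.3, 0.3, 0.4), t = (0.2, 0.45, 0.35)
    # form an interval; parameterize it by u in [0, 1] via a = 0.05 + 0.2*u,
    # so u = 0 and u = 1 give the two extreme points E1, E2 (each with five
    # positive off-diagonal entries).
    a = 0.05 + 0.2 * u
    b = 0.25 - a
    return [[0.0, a, 0.3 - a],
            [b, 0.0, 0.3 - b],
            [0.2 - b, 0.45 - a, 0.0]]

def ternary_min(g, lo=0.0, hi=1.0, iters=200):
    # Ternary search: minimizes a strictly convex function on [lo, hi].
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

u_star = ternary_min(lambda u: f(z(u)))
print(u_star, f(z(u_star)))
```

For these illustrative parameters the search localizes the minimizer strictly inside \((0,1)\), in line with the proposition.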
Cole, S., Eckstein, M., Friedland, S. et al. On Quantum Optimal Transport. Math Phys Anal Geom 26, 14 (2023). https://doi.org/10.1007/s11040-023-09456-7
Keywords
- Quantum optimal transport
- Classical optimal transport
- Coupling of density matrices
- Semidefinite programming
- Wasserstein-2 distance