On Quantum Optimal Transport

  • Published in: Mathematical Physics, Analysis and Geometry

Abstract

We analyze a quantum version of the Monge–Kantorovich optimal transport problem. The quantum transport cost related to a Hermitian cost matrix C is minimized over the set of all bipartite coupling states \(\rho ^{AB}\) with fixed reduced density matrices \(\rho ^A\) and \(\rho ^B\) of size m and n. The minimum quantum optimal transport cost \(\textrm{T}^Q_{C}(\rho ^A,\rho ^B)\) can be efficiently computed using semidefinite programming. In the case \(m=n\) the cost \(\textrm{T}^Q_{C}\) gives a semidistance if and only if C is positive semidefinite and vanishes exactly on the subspace of symmetric matrices. Furthermore, if C satisfies the above conditions, then \(\sqrt{\textrm{T}^Q_{C}}\) induces a quantum analogue of the Wasserstein-2 distance. Taking the quantum cost matrix \(C^Q\) to be the projector on the antisymmetric subspace, we provide a semi-analytic expression for \(\textrm{T}^Q_{C^Q}\) for any pair of single-qubit states and show that its square root yields a transport distance on the Bloch ball. Numerical simulations suggest that this property holds also in higher dimensions. Assuming that the cost matrix suffers decoherence and that the density matrices become diagonal, we study the quantum-to-classical transition of the Monge–Kantorovich distance, propose a continuous family of interpolating distances, and demonstrate that the quantum transport is cheaper than the classical one. Furthermore, we introduce a related quantity—the SWAP-fidelity—and compare its properties with the standard Uhlmann–Jozsa fidelity. We also discuss the quantum optimal transport for general d-partite systems.


References

  1. Agredo, J., Fagnola, F.: On quantum versions of the classical Wasserstein distance. Stochastics 89, 910 (2017)

  2. Altschuler, J., Weed, J., Rigollet, P.: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. In: NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1961–1971 (2017)

  3. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning, PMLR, vol. 70, p. 214 (2017)

  4. Bengtsson, I., Życzkowski, K.: Geometry of Quantum States, 2nd edn. Cambridge University Press, Cambridge (2017)

  5. Bhatia, R., Gaubert, S., Jain, T.: Matrix versions of the Hellinger distance. Lett. Math. Phys. 109, 1777–1804 (2019)

  6. Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., Lloyd, S.: Quantum machine learning. Nature 549, 195 (2017)

  7. Biane, P., Voiculescu, D.: A free probability analogue of the Wasserstein distance on the trace-state space. Geom. Funct. Anal. 11, 1125 (2001)

  8. Bigot, J., Gouet, R., Klein, T., López, A.: Geodesic PCA in the Wasserstein space by convex PCA. Ann. Inst. H. Poincaré Probab. Stat. 53, 1–26 (2017)

  9. Bistroń, R., Eckstein, M., Życzkowski, K.: Monotonicity of the quantum 2-Wasserstein distance. J. Phys. A 56, 095301 (2023)

  10. Bonneel, N., van de Panne, M., Paris, S., Heidrich, W.: Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. 30, 158 (2011)

  11. Brandão, F.G.S.L., Svore, K.: Quantum speed-ups for solving semidefinite programs. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pp. 415–426 (2017)

  12. Braunstein, S.L., Caves, C.M.: Statistical distance and the geometry of quantum states. Phys. Rev. Lett. 72, 3439 (1994)

  13. Caglioti, E., Golse, F., Paul, T.: Quantum optimal transport is cheaper. J. Stat. Phys. 181, 149 (2020)

  14. Carlen, E.A., Maas, J.: Non-commutative calculus, optimal transport and functional inequalities in dissipative quantum systems. J. Stat. Phys. 178, 319 (2020)

  15. Chakrabarti, S., Huang, Y., Li, T., Feizi, S., Wu, X.: Quantum Wasserstein generative adversarial networks. In: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, arXiv:1911.00111

  16. Chen, Y., Gangbo, W., Georgiou, T.T., Tannenbaum, A.: On the matrix Monge-Kantorovich problem. Eur. J. Appl. Math. 31, 574 (2020)

  17. Cook, W.J., Cunningham, W.H., Pulleyblank, W.R., Schrijver, A.: Combinatorial Optimization. Wiley, New York (1998)

  18. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 26, pp. 2292–2300. Curran Associates Inc., New York (2013)

  19. Datta, N., Rouzé, C.: Relating relative entropy, optimal transport and Fisher information: a quantum HWI inequality. Ann. H. Poincaré 21, 2115 (2020)

  20. De Palma, G., Marvian, M., Trevisan, D., Lloyd, S.: The quantum Wasserstein distance of order 1. IEEE Trans. Inf. Theory 67, 6627–6643 (2021). https://doi.org/10.1109/TIT.2021.3076442

  21. De Palma, G., Trevisan, D.: Quantum optimal transport with quantum channels. Ann. Henri Poincaré 22, 3199–3234 (2021)

  22. Duvenhage, R.: Quadratic Wasserstein metrics for von Neumann algebras via transport plans. J. Operator Theory 88, 289–308 (2022)

  23. Filipiak, K., Klein, D., Vojtková, E.: The properties of partial trace and block trace operators of partitioned matrices. Electron. J. Linear Algebra 33, 3–15 (2018)

  24. Flamary, R., Cuturi, M., Courty, N., Rakotomamonjy, A.: Wasserstein discriminant analysis. Mach. Learn. 107, 1923–1945 (2018)

  25. Friedland, S.: Matrices: Algebra, Analysis and Applications, p. 596. World Scientific, Singapore (2016)

  26. Friedland, S.: Notes on semidefinite programming, Fall 2017, http://homepages.math.uic.edu/~friedlan/SDPNov17.pdf

  27. Friedland, S.: Tensor optimal transport, distance between sets of measures and tensor scaling, arXiv:2005.00945

  28. Friedland, S., Eckstein, M., Cole, S., Życzkowski, K.: Quantum Monge-Kantorovich problem and transport distance between density matrices. Phys. Rev. Lett. 129, 110402 (2022)

  29. Friedland, S., Ge, J., Zhi, L.: Quantum Strassen’s theorem. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 23, 2050020 (2020)

  30. Friesecke, G., Vögler, D.: Breaking the curse of dimension in multi-marginal Kantorovich optimal transport on finite state spaces. SIAM J. Math. Anal. 50(4), 3996–4019 (2018)

  31. Gilchrist, A., Langford, N.K., Nielsen, M.A.: Distance measures to compare real and ideal quantum processes. Phys. Rev. A 71, 062310 (2005)

  32. Golse, F., Mouhot, C., Paul, T.: On the mean field and classical limits of quantum mechanics. Commun. Math. Phys. 343, 165–205 (2016)

  33. Golse, F., Paul, T.: Wave packets and the quadratic Monge-Kantorovich distance in quantum mechanics. Comptes Rendus Math. 356, 177–197 (2018)

  34. Hitchcock, F.L.: The distribution of a product from several sources to numerous localities. J. Math. Phys. Mass. Inst. Tech. 20, 224–230 (1941)

  35. Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, Cambridge (2013)

  36. Horodecki, M., Horodecki, P., Horodecki, R.: Separability of mixed states: necessary and sufficient conditions. Phys. Lett. A 223, 1–8 (1996)

  37. Ikeda, K.: Foundation of quantum optimal transport and applications. Quantum Inform. Process. 19, 25 (2020)

  38. Jozsa, R.: Fidelity for mixed quantum states. J. Mod. Opt. 41, 2315–23 (1994)

  39. Kantorovich, L.V.: Mathematical methods of organizing and planning production. Manag. Sci. 6, 366–422 (1959/60)

  40. Keyl, M.: Fundamentals of quantum information theory. Phys. Rep. 369, 431–548 (2002)

  41. Kiani, B.T., De Palma, G., Marvian, M., Liu, Z.-W., Lloyd, S.: Learning quantum data with the quantum earth mover’s distance. Quantum Sci. Technol. 7, 045002 (2022)

  42. Liu, J., Yuan, H., Lu, X.-M., Wang, X.: Quantum Fisher information matrix and multiparameter estimation. J. Phys. A 53, 023001 (2020)

  43. Lloyd, J.R., Ghahramani, Z.: Statistical model criticism using kernel two sample tests. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 829–837. MIT Press, Cambridge (2015)

  44. Lloyd, S., Weedbrook, C.: Quantum generative adversarial learning. Phys. Rev. Lett. 121, 040502 (2018)

  45. Miszczak, J.A., Puchala, Z., Horodecki, P., Uhlmann, A., Życzkowski, K.: Sub- and super-fidelity as bounds for quantum fidelity. Quantum Inf. Comp. 9, 0103–0130 (2009)

  46. Monge, G.: Mémoire sur la théorie des déblais et des remblais, Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année, pp. 666–704 (1781)

  47. Mueller, J., Jaakkola, T.: Principal differences analysis: interpretable characterization of differences between distributions, In: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pp. 1702–1710. MIT Press, Cambridge (2015)

  48. Müller-Hermes, A.: On the monotonicity of a quantum optimal transport cost, preprint arXiv:2211.11713 (2022)

  49. Panaretos, V.M., Zemel, Y.: Amplitude and phase variation of point processes. Ann. Stat. 44, 771–812 (2016)

  50. Peres, A.: Separability criterion for density matrices. Phys. Rev. Lett. 77, 1413–1415 (1996)

  51. Renner, R.: Quantum Information Theory, Exercise Sheet 9, http://edu.itp.phys.ethz.ch/hs15/QIT/ex09.pdf

  52. Riera, M.H.: A transport approach to distances in quantum systems, Bachelor’s thesis for the degree in Physics, Universitat Autònoma de Barcelona (2018)

  53. Rubner, Y., Tomasi, C., Guibas, L.J.: The earth mover’s distance as a distance for image retrieval. Int. J. Comput. Vis. 40, 99–121 (2000)

  54. Solomon, J., de Goes, F., Peyré, G., Cuturi, M., Butscher, A., Nguyen, A., Du, T., Guibas, L.: Convolutional Wasserstein distances: efficient optimal transportation on geometric domains. ACM Trans. Graph. 34, 66 (2015)

  55. Sandler, R., Lindenbaum, M.: Nonnegative matrix factorization with earth mover’s distance metric for image analysis. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1590–1602 (2011)

  56. Šafránek, D.: Discontinuities of the quantum Fisher information and the Bures distance. Phys. Rev. A 95, 052320 (2017)

  57. Székely, G.J., Rizzo, M.L.: Testing for equal distributions in high dimension. Inter-Stat. 11, 1–16 (2004)

  58. Uhlmann, A.: The ‘transition probability’ in the state space of a *-algebra. Rep. Math. Phys. 9, 273 (1976)

  59. Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)

  60. Vasershtein, L.N.: Markov processes over denumerable products of spaces describing large system of automata. Probl. Inf. Transmission 5, 47–52 (1969)

  61. Villani, C.: Optimal Transport, Old and New, Grundlehren der Mathematischen Wissenschaften, 338. Springer, Berlin (2009)

  62. Werner, R.F.: Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Phys. Rev. A 40, 4277 (1989)

  63. Winter, A.: Tight uniform continuity bounds for quantum entropies: conditional entropy, relative entropy distance and energy constraints. Commun. Math. Phys. 347, 291–313 (2016)

  64. Wolfram Research, Inc., Mathematica, Version 12.2, Champaign, IL, USA (2020), https://www.wolfram.com/mathematica

  65. Zhou, L., Yu, N., Ying, S., Ying, M.: Quantum earth mover’s distance, no-go quantum Kantorovich-Rubinstein theorem, and quantum marginal problem. J. Math. Phys. 63, 102201 (2022)

  66. Życzkowski, K., Słomczyński, W.: Monge distance between quantum states. J. Phys. A 31, 9095–9104 (1998)

  67. Życzkowski, K., Słomczyński, W.: The Monge distance on the sphere and geometry of quantum states. J. Phys. A 34, 6689 (2001)

Acknowledgements

It is a pleasure to thank Rafał Bistroń, John Calsamiglia, Matt Hoogsteder, Tomasz Miller, Wojciech Słomczyński and Andreas Winter for numerous inspiring discussions and helpful remarks. Financial support by Simons collaboration Grant for mathematicians, Narodowe Centrum Nauki under the Maestro Grant number DEC-2015/18/A/ST2/00274 and by the Foundation for Polish Science under the Team-Net Project no. POIR.04.04.00-00-17C1/18-00 is gratefully acknowledged.

Author information

Correspondence to Shmuel Friedland.

Appendices

Appendix A: Basic Properties of Partial Traces

In order to understand the partial traces on \( \textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\) it is convenient to view this space as a 4-mode tensor space [29] and to use Dirac notation. Denote by \({\mathcal {H}}_m^\vee \) the space of linear functionals on \({\mathcal {H}}_m\), i.e., the dual space. Then \({\textbf{y}}^\vee =\langle {\textbf{y}}|\in {\mathcal {H}}_m^\vee \) acts on \({\textbf{z}}\in {\mathcal {H}}_m\) as follows: \({\textbf{y}}^\vee ({\textbf{z}})=\langle {\textbf{y}},{\textbf{z}}\rangle =\langle {\textbf{y}}|{\textbf{z}}\rangle \). Hence a rank-one operator in \(\textrm{B}({\mathcal {H}}_m)\) is of the form \({\textbf{x}}\otimes {\textbf{y}}^\vee =|{\textbf{x}}\rangle \langle {\textbf{y}}|\), where \((|{\textbf{x}}\rangle \langle {\textbf{y}}|)({\textbf{z}})=\langle {\textbf{y}}|{\textbf{z}}\rangle |{\textbf{x}}\rangle \). So \(|{\textbf{x}}\rangle \langle {\textbf{y}}|\) can be viewed as the matrix \(\rho ={\textbf{x}}{\textbf{y}}^\dagger \in {\mathbb {C}}^{m\times m}\). Assume that \(V_1,V_2\) are linear transformations from \({\mathcal {H}}_m\) to itself. Then \(V_1\otimes V_2\) is a sesquilinear transformation from \({\mathcal {H}}_m\otimes {\mathcal {H}}_m^\vee \) to itself, which acts on rank-one operators as follows:

$$\begin{aligned} (V_1\otimes V_2)( |{\textbf{x}}\rangle \langle {\textbf{y}}|)= |V_1 {\textbf{x}}\rangle \langle V_2{\textbf{y}}|=V_1( |{\textbf{x}}\rangle \langle {\textbf{y}}|)V_2^\dagger , \quad {\textbf{x}},{\textbf{y}}\in {\mathcal {H}}_m. \end{aligned}$$

Assume now that \(W_1,W_2\) are linear transformations from \({\mathcal {H}}_n\) to itself. Then

$$\begin{aligned} (V_1\otimes W_1)|{\textbf{x}}\rangle |{\textbf{v}}\rangle = |V_1{\textbf{x}}\rangle |W_1{\textbf{v}}\rangle , \quad {\textbf{x}}\in {\mathcal {H}}_m,\,{\textbf{v}}\in {\mathcal {H}}_n. \end{aligned}$$

A tensor product of two rank-one operators is identified with a 4-tensor:

$$\begin{aligned} |{\textbf{x}}\rangle \langle {\textbf{y}}|\otimes |{\textbf{u}}\rangle \langle {\textbf{v}}|=|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|, \quad {\textbf{x}},{\textbf{y}}\in {\mathcal {H}}_m,{\textbf{u}},{\textbf{v}}\in {\mathcal {H}}_n. \end{aligned}$$
(A.1)

Thus

$$\begin{aligned} (|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|)(|{\textbf{z}}\rangle |{\textbf{w}}\rangle )= \langle {\textbf{y}}|{\textbf{z}}\rangle \langle {\textbf{v}}|{\textbf{w}}\rangle |{\textbf{x}}\rangle | {\textbf{u}}\rangle , \quad {\textbf{x}},{\textbf{y}},{\textbf{z}}\in {\mathcal {H}}_m, {\textbf{u}},{\textbf{v}},{\textbf{w}}\in {\mathcal {H}}_n. \end{aligned}$$

Observe next that \(V_1\otimes W_1\otimes V_2\otimes W_2\) is a multi-sesquilinear transformation of \(\textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\) to itself, which acts on a rank-one product operator as follows:

$$\begin{aligned} (V_1\otimes W_1\otimes V_2\otimes W_2) (|{\textbf{x}}\rangle |{\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|)&= |V_1 {\textbf{x}}\rangle |W_1{\textbf{u}}\rangle \langle V_2{\textbf{y}}|\langle W_2{\textbf{v}}|\\&= (V_1\otimes W_1)(|{\textbf{x}}\rangle |{\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|)(V_2^\dagger \otimes W_2^\dagger ). \end{aligned}$$

(In the last equality we view \(|{\textbf{x}}\rangle |{\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|\) as an \((mn)\times (mn)\) matrix.) As \({{\,\textrm{Tr}\,}}|{\textbf{x}}\rangle \langle {\textbf{y}}|=\langle {\textbf{y}}|{\textbf{x}}\rangle \) we deduce the following lemma:

Lemma A.1

Let

$$\begin{aligned} {\textbf{x}},{\textbf{y}}\in {\mathcal {H}}_m, {\textbf{u}},{\textbf{v}}\in {\mathcal {H}}_n, \quad V_1,V_2\in \textrm{B}({\mathcal {H}}_m), \; W_1,W_2\in \textrm{B}({\mathcal {H}}_n). \end{aligned}$$

Then

$$\begin{aligned}{} & {} {{\,\textrm{Tr}\,}}_A |{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|=\langle {\textbf{y}}|{\textbf{x}}\rangle |{\textbf{u}}\rangle \langle {\textbf{v}}|,\\{} & {} {{\,\textrm{Tr}\,}}_B |{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|=\langle {\textbf{v}}|{\textbf{u}}\rangle |{\textbf{x}}\rangle \langle {\textbf{y}}|, \\{} & {} {{\,\textrm{Tr}\,}}_A (V_1\otimes W_1\otimes V_2\otimes W_2)(|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|)=\langle V_2{\textbf{y}}|V_1{\textbf{x}}\rangle |W_1{\textbf{u}}\rangle \langle W_2{\textbf{v}}|,\\{} & {} {{\,\textrm{Tr}\,}}_B (V_1\otimes W_1\otimes V_2\otimes W_2)(|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|)=\langle W_2{\textbf{v}}|W_1{\textbf{u}}\rangle |V_1{\textbf{x}}\rangle \langle V_2{\textbf{y}}|. \end{aligned}$$

In particular, if \(V_1=V_2=V\) and \(W_1=W_2=W\) are unitary then

$$\begin{aligned}&{{\,\textrm{Tr}\,}}_A (V\otimes W\otimes V\otimes W)(|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|) =\langle {\textbf{y}}|{\textbf{x}}\rangle |W{\textbf{u}}\rangle \langle W{\textbf{v}}|,\\&{{\,\textrm{Tr}\,}}_B (V\otimes W\otimes V\otimes W)(|{\textbf{x}}\rangle | {\textbf{u}}\rangle \langle {\textbf{y}}|\langle {\textbf{v}}|) =\langle {\textbf{v}}|{\textbf{u}}\rangle |V{\textbf{x}}\rangle \langle V{\textbf{y}}|. \end{aligned}$$
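The partial-trace identities of Lemma A.1 are easy to verify numerically. The following minimal sketch (the dimensions, random vectors, and the `einsum`-based partial traces are our own illustrative choices, not part of the text) checks the first two identities, using the 4-mode reshape of an \((mn)\times (mn)\) matrix that mirrors the viewpoint of this appendix:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
x, y = rng.normal(size=(2, m)) + 1j*rng.normal(size=(2, m))
u, v = rng.normal(size=(2, n)) + 1j*rng.normal(size=(2, n))

# the rank-one operator |x>|u><y|<v| as an (mn) x (mn) matrix
rho = np.outer(np.kron(x, u), np.kron(y, v).conj())

# partial traces via the 4-mode reshape rho[(i,p),(j,q)] -> R[i,p,j,q]
R = rho.reshape(m, n, m, n)
tr_A = np.einsum('ipiq->pq', R)   # trace out the first (A) factor
tr_B = np.einsum('ipjp->ij', R)   # trace out the second (B) factor

assert np.allclose(tr_A, np.vdot(y, x)*np.outer(u, v.conj()))   # <y|x> |u><v|
assert np.allclose(tr_B, np.vdot(v, u)*np.outer(x, y.conj()))   # <v|u> |x><y|
```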

Corollary A.2

Let \(\rho ^A\in \Omega _m,\rho ^B\in \Omega _n\), \(V\in \textrm{B}({\mathcal {H}}_m),W\in \textrm{B}({\mathcal {H}}_n)\) be unitary and \(C\in \textrm{S}({\mathcal {H}}_m\otimes {\mathcal {H}}_n)\). Then

$$\begin{aligned}&\Gamma ^Q(V\rho ^AV^\dagger , W\rho ^BW^\dagger )=(V\otimes W)\Gamma ^Q(\rho ^A,\rho ^B)(V^\dagger \otimes W^\dagger ),\\&\textrm{T}_{C}^Q(\rho ^A,\rho ^B)=\textrm{T}_{(V\otimes W)C(V^\dagger \otimes W^\dagger )}( V\rho ^AV^\dagger ,W\rho ^B W^\dagger ). \end{aligned}$$

Proof

View \(\rho ^A\in \Omega _m\) as an element in \({\mathcal {H}}_m\otimes {\mathcal {H}}_m^\vee \) to deduce \(V\rho ^AV^\dagger =(V\otimes V) \rho ^A\). Suppose that

$$\begin{aligned} \rho ^{AB}=\sum _{\begin{array}{c} i,j\in [m]\\ p,q\in [n] \end{array}} r_{(i,p)(j,q)} |i\rangle |p\rangle \langle j| \langle q| \in \Gamma ^Q( \rho ^A, \rho ^B). \end{aligned}$$

Let \({\tilde{\rho }}^{AB}=(V\otimes W\otimes V\otimes W) \rho ^{AB}\). Observe that

$$\begin{aligned}{} & {} {{\,\textrm{Tr}\,}}_A \rho ^{AB}=\sum _{p,q\in [n]}\Big ( \sum _{i\in [m]} r_{(i,p)(i,q)} \Big )|p\rangle \langle q|= \rho ^B,\\{} & {} {{\,\textrm{Tr}\,}}_A {\tilde{\rho }}^{AB}=\sum _{p,q\in [n]} \Big ( \sum _{i\in [m]} r_{(i,p)(i,q)} \Big ) \big (W|p\rangle \big )\big ( \langle q|W^\dagger \big ) =W\rho ^{B}W^\dagger . \end{aligned}$$

Similarly \({{\,\textrm{Tr}\,}}_B {\tilde{\rho }}^{AB}=V \rho ^AV^\dagger \). Hence

$$\begin{aligned} (V\otimes W\otimes V\otimes W)\Gamma ^Q( \rho ^A, \rho ^B)\subseteq \Gamma ^Q(V \rho ^AV^\dagger , W \rho ^BW^\dagger ). \end{aligned}$$

and

$$\begin{aligned} (V^\dagger \otimes W^\dagger \otimes V^\dagger \otimes W^\dagger )\Gamma ^Q(V\rho ^AV^\dagger ,W\rho ^BW^\dagger )\subseteq \Gamma ^Q(\rho ^A, \rho ^B). \end{aligned}$$

Hence we deduce the first part of the corollary. The second part of the corollary follows from the identity

$$\begin{aligned} {{\,\textrm{Tr}\,}}C\rho ^{AB}={{\,\textrm{Tr}\,}}(V\otimes W) C(V^\dagger \otimes W^\dagger )(V\otimes W) \rho ^{AB}(V^\dagger \otimes W^\dagger ). \end{aligned}$$

\(\square \)
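The covariance stated in Corollary A.2 can be spot-checked numerically. A minimal sketch, assuming numpy, with random unitaries drawn via QR and a random coupling state (all names and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 3

def rand_unitary(d):
    z = rng.normal(size=(d, d)) + 1j*rng.normal(size=(d, d))
    return np.linalg.qr(z)[0]          # Q factor of a random matrix is unitary

def tr_B(rho):  # partial trace over the second factor
    return np.einsum('ipjp->ij', rho.reshape(m, n, m, n))

def tr_A(rho):  # partial trace over the first factor
    return np.einsum('ipiq->pq', rho.reshape(m, n, m, n))

Z = rng.normal(size=(m*n, m*n)) + 1j*rng.normal(size=(m*n, m*n))
rho_AB = Z @ Z.conj().T
rho_AB /= np.trace(rho_AB)             # a random coupling state
V, W = rand_unitary(m), rand_unitary(n)

U = np.kron(V, W)
rho_t = U @ rho_AB @ U.conj().T        # the rotated coupling of Corollary A.2

# its marginals are the rotated marginals, as the corollary asserts
assert np.allclose(tr_B(rho_t), V @ tr_B(rho_AB) @ V.conj().T)
assert np.allclose(tr_A(rho_t), W @ tr_A(rho_AB) @ W.conj().T)
```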

The following result appeared in the literature [29] and we state it here for completeness. For \(\rho ^A\in \textrm{B}({\mathcal {H}}_m)\) denote by range\(\, \rho ^A\subseteq {\mathcal {H}}_m\) the range of \(\rho ^A\).

Lemma A.3

Let \(\rho ^A\in \Omega _m,\rho ^B\in \Omega _n\). Then

$$\begin{aligned} \Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\textrm{range}\,\rho ^A)\otimes \textrm{B}(\textrm{range}\,\rho ^B). \end{aligned}$$

In particular if either \(\rho ^A\) or \(\rho ^B\) is a pure state then \(\Gamma ^Q( \rho ^A, \rho ^B)=\{ \rho ^A\otimes \rho ^B\}\).

Proof

It is enough to show that \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}({\mathcal {H}}_n)\). To show this condition we can assume that range\(\,\rho ^A\) is a nonzero strict subspace of \({\mathcal {H}}_m\). By choosing a corresponding orthonormal basis consisting of eigenvectors of \(\rho ^A\) we can assume that \(\rho ^A\) is a diagonal matrix whose first \(1\le \ell <m\) diagonal entries are positive, and whose last \(m-\ell \) diagonal entries are zero. Write down \(\rho ^{AB}\) as a block matrix \([R_{pq}] \in {\mathbb {C}}^{(mn)\times (mn)}\), where \(R_{pq}\in {\mathbb {C}}^{m\times m}, p,q\in [n]\). Then \({{\,\textrm{Tr}\,}}_B \rho ^{AB}=\sum _{p=1}^n R_{pp}= \rho ^A\). As \(R_{pp}\ge 0\) we deduce that \( \rho ^A=[a_{ij}]\ge R_{pp}\ge 0\). As \(a_{ii}=0\) for \(i>\ell \) it follows that the \((i,i)\) entry of each \(R_{pp}\) is zero. As \(\rho ^{AB}\) is positive semidefinite, it follows that the \(((p-1)m+i)\)th row and column of \(\rho ^{AB}\) are zero for each \(p\in [n]\) and \(i>\ell \). This proves \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}({\mathcal {H}}_n)\). Apply the same argument to \(\rho ^B\) to deduce \(\Gamma ^Q(\rho ^A,\rho ^B)\subseteq \textrm{B}(\text {range}\,\rho ^A)\otimes \textrm{B}(\text {range}\, \rho ^B)\).

Assume that \( \rho ^A=|1\rangle \langle 1|\) and \(\rho ^{AB}\in \Gamma ^Q(\rho ^A,\rho ^B)\). By the first part of the lemma \(\rho ^{AB}=|1\rangle \langle 1|\otimes \sigma \) for some \(\sigma \), and taking \({{\,\textrm{Tr}\,}}_A\) gives \(\sigma =\rho ^B\). Hence \(\rho ^{AB}=\rho ^A\otimes \rho ^B\). \(\square \)

More information concerning the partial trace and its properties can be found in a recent work [23].

The following results are used in the proof of Proposition 2.6:

Lemma A.4

Denote by \(S_N\) the SWAP operator on \({\mathcal {H}}_{N^2}:={\mathcal {H}}_N\otimes {\mathcal {H}}_N\), and by \(S_{n,m}\) and \(R_{n,m}\) the following SWAP operators on \({\mathcal {H}}_{(nm)^2}:={\mathcal {H}}_n\otimes {\mathcal {H}}_m\otimes {\mathcal {H}}_n\otimes {\mathcal {H}}_m\):

$$\begin{aligned} S_{n,m}(|{\textbf{x}}\rangle |{\textbf{u}}\rangle |{\textbf{y}}\rangle |{\textbf{v}}\rangle )=|{\textbf{y}}\rangle |{\textbf{v}}\rangle |{\textbf{x}}\rangle |{\textbf{u}}\rangle ,\, R_{n,m}(|{\textbf{x}}\rangle |{\textbf{u}}\rangle |{\textbf{y}}\rangle |{\textbf{v}}\rangle )=|{\textbf{x}}\rangle |{\textbf{y}}\rangle |{\textbf{u}}\rangle |{\textbf{v}}\rangle . \end{aligned}$$
  1. (a)

    Assume that \(|i\rangle \), with \(i\in [N]\), is an orthonormal basis in \({\mathcal {H}}_N\). Suppose that

    $$\begin{aligned} \rho =\sum _{i,j,p,q\in [N]} \rho _{(i,p)(j,q)}|i\rangle |p\rangle \langle j|\langle q|\in \textrm{B}({\mathcal {H}}_N\otimes {\mathcal {H}}_N). \end{aligned}$$

    Then \({{\,\textrm{Tr}\,}}S_N\rho =\sum _{i,p\in [N]} \rho _{(p,i)(i,p)}\).

  2. (b)

    Assume that

    $$\begin{aligned}&\rho ^{AB}\in \textrm{B}({\mathcal {H}}_n\otimes {\mathcal {H}}_n),{} & {} {{\,\textrm{Tr}\,}}_B\rho ^{AB}=\rho ^A \in \textrm{B}({\mathcal {H}}_n),{} & {} {{\,\textrm{Tr}\,}}_A\rho ^{AB}=\rho ^B\in \textrm{B}({\mathcal {H}}_n), \\&\sigma ^{CD}\!\in \textrm{B}({\mathcal {H}}_m\otimes {\mathcal {H}}_m),{} & {} {{\,\textrm{Tr}\,}}_D\sigma ^{CD}\!=\sigma ^C \in \textrm{B}({\mathcal {H}}_m),{} & {} {{\,\textrm{Tr}\,}}_C\sigma ^{CD}\!=\sigma ^D\in \textrm{B}({\mathcal {H}}_m). \end{aligned}$$

    Then \(\tau ^{ACBD}:=R_{n,m}(\rho ^{AB}\otimes \sigma ^{CD})R_{n,m}\) is in \(\textrm{B}({\mathcal {H}}_{(nm)^2})\). Furthermore

    $$\begin{aligned} \begin{aligned}&{{\,\textrm{Tr}\,}}_{BD} \tau ^{ACBD}=\rho ^A\otimes \sigma ^C, \quad {{\,\textrm{Tr}\,}}_{AC} \tau ^{ACBD}=\rho ^B\otimes \sigma ^D,\\&{{\,\textrm{Tr}\,}}S_{n,m}\tau ^{ACBD}=\big ({{\,\textrm{Tr}\,}}S_n\rho ^{AB}\big )\big ({{\,\textrm{Tr}\,}}S_m \sigma ^{CD}). \end{aligned} \end{aligned}$$
    (A.2)

Proof

(a) View \(S_N\) and \(\rho \) as \(N^2\times N^2\) matrices with entries indexed by the row \((i,p)\) and the column \((j,q)\). Observe that \(S_N\) is a symmetric permutation matrix. Then \((S_N\rho )_{(i,p),(j,q)}=\rho _{(p,i)(j,q)}\). The trace of \(S_N\rho \) is obtained by summing the diagonal entries, i.e., those with \(j=i\) and \(q=p\).

(b) Clearly, \(\tau ^{ACBD}\in \textrm{B}({\mathcal {H}}_{(nm)^2})\). Assume that

$$\begin{aligned} \rho ^{AB}&=\sum _{i_A,i_B,j_A,j_B \in [n]} \rho _{(i_A,i_B)(j_A,j_B)}|i_A\rangle |i_B\rangle \langle j_A|\langle j_B|,\\ \sigma ^{CD}&=\sum _{p_C,p_D,q_C,q_D \in [m]} \sigma _{(p_C,p_D)(q_C,q_D)}|p_C\rangle |p_D\rangle \langle q_C|\langle q_D|. \end{aligned}$$

Then

$$\begin{aligned} \tau ^{ACBD} = \!\!\!\!\! \sum _{\begin{array}{c} i_A,i_B,j_A,j_B \in [n] \\ p_C,p_D,q_C,q_D \in [m] \end{array}} \!\!\!\!\! \rho _{(i_A,i_B)(j_A,j_B)} \sigma _{(p_C,p_D)(q_C,q_D)} |i_A\rangle |p_C\rangle |i_B\rangle |p_D\rangle \langle j_A|\langle q_C| \langle j_B| \langle q_D|. \end{aligned}$$

Observe next that \({{\,\textrm{Tr}\,}}_{BD}\tau ^{ACBD}\) is obtained when we sum on \(i_B=j_B\) and \(p_D=q_D\). Hence

$$\begin{aligned} {{\,\textrm{Tr}\,}}_{BD}\tau ^{ACBD}&= \!\! \sum _{\begin{array}{c} i_A,j_A \in [n] \\ p_C,q_C \in [m] \end{array}} (\sum _{i_B=1}^n \rho _{(i_A, i_B)(j_A,i_B)})(\sum _{p_D=1}^m \sigma _{(p_C,p_D)(q_C,p_D)})|i_A\rangle |p_C\rangle \langle j_A|\langle q_C|\\&= \!\! \sum _{\begin{array}{c} i_A,j_A \in [n] \\ p_C,q_C \in [m] \end{array}} \rho ^A_{i_Aj_A}\sigma ^C_{p_Cq_C}|i_A\rangle |p_C\rangle \langle j_A|\langle q_C|=\rho ^A\otimes \sigma ^C. \end{aligned}$$

Similarly \({{\,\textrm{Tr}\,}}_{AC}\tau ^{ACBD}=\rho ^B\otimes \sigma ^D\). This proves the first line in (A.2).

We now use (a) to compute \({{\,\textrm{Tr}\,}}S_{n,m}\tau ^{ACBD}\):

$$\begin{aligned} {{\,\textrm{Tr}\,}}S_{n,m}\tau ^{ACBD}= \!\! \sum _{\begin{array}{c} i_A,i_B \in [n] \\ p_C,p_D \in [m] \end{array}} \rho _{(i_B,i_A)(i_A,i_B)}\sigma _{(p_D,p_C)(p_C,p_D)}= ({{\,\textrm{Tr}\,}}S_n\rho ^{AB})({{\,\textrm{Tr}\,}}S_m\sigma ^{CD}). \end{aligned}$$

This proves the second line in (A.2). \(\square \)
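The SWAP-trace identity in (A.2) can be checked numerically. A small sketch, assuming numpy; the random test matrices need not be positive, since the identity is linear, and the interleaving \(R_{n,m}(\rho ^{AB}\otimes \sigma ^{CD})R_{n,m}\) is implemented directly on the 8-index tensor of the proof:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 2, 3

def swap_trace(rho, N):
    # part (a): Tr S_N rho = sum_{i,p} rho_{(p,i)(i,p)}
    return np.einsum('piip->', rho.reshape(N, N, N, N))

# random (not necessarily positive) matrices on H_n x H_n and H_m x H_m
rho_AB = rng.normal(size=(n*n, n*n)) + 1j*rng.normal(size=(n*n, n*n))
sig_CD = rng.normal(size=(m*m, m*m)) + 1j*rng.normal(size=(m*m, m*m))

# tau^{ACBD} = R_{n,m} (rho^{AB} x sigma^{CD}) R_{n,m}: interleave the
# tensor factors so that each register of H_{nm} x H_{nm} is H_n x H_m
T4 = np.einsum('abcd,pqrs->apbqcrds',
               rho_AB.reshape(n, n, n, n), sig_CD.reshape(m, m, m, m))
tau = T4.reshape(n*m*n*m, n*m*n*m)

lhs = swap_trace(tau, n*m)
rhs = swap_trace(rho_AB, n)*swap_trace(sig_CD, m)
assert np.allclose(lhs, rhs)   # Tr S_{n,m} tau = (Tr S_n rho)(Tr S_m sigma)
```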

Appendix B: Quantum States of a Single Qubit System

In this Appendix we discuss additional properties of the quantum optimal transport for qubits. Section B.1 provides a closed formula (Theorem B.1) for \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)\) in terms of the solutions of the trigonometric equation (B.1). Lemma B.2 shows that this trigonometric equation is equivalent to a polynomial equation of degree at most 6. Section B.2 gives a closed formula for the value of QOT for two isospectral qubit density matrices. In Section B.3 we present a simple example where the supremum of the dual SDP problem to QOT is not achieved.

1.1 B.1: A Semi-analytic Formula for the Single-Qubit Optimal Transport

We begin by introducing a convenient notation for qubits in the \(y=0\) section of the Bloch ball \(\Omega _2\)—see [4, Sect. 5.2]. Let O denote the orthogonal rotation matrix,

$$\begin{aligned} O(\theta )=\begin{bmatrix}\cos (\theta /2)&{}-\sin (\theta /2)\\ \sin (\theta /2)&{}\cos (\theta /2) \end{bmatrix}, \quad \text {for } \theta \in [0,2\pi ), \end{aligned}$$

and define, for \(r\in [0,1]\),

$$\begin{aligned} \rho (r,\theta )&= O(\theta ) \begin{bmatrix}r&{}0\\ 0&{}1-r\end{bmatrix} O(\theta )^\top . \end{aligned}$$

Because of unitary invariance (2.14), the quantum transport problem between two arbitrary qubits \(\rho ^A, \rho ^B \in \Omega _2\) can be reduced to the case \(\rho ^A = \rho (s,0)\) and \(\rho ^B = \rho (r,\theta )\), with three parameters, \(s,r \in [0,1]\) and \(\theta \in [0,2\pi )\). The parameter \(\theta \) is the angle between the Bloch vectors associated with \(\rho ^A\) and \(\rho ^B\). With such a parametrization we can further simplify the single-qubit transport problem.
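As a quick numerical sanity check of this parametrization (the particular values \(r=0.8\), \(\theta =0.9\) are our own illustrative choices), one can confirm that \(\rho (r,\theta )\) is a unit-trace state with spectrum \(\{r,1-r\}\) whose Bloch vector lies in the \(y=0\) plane, at angle \(\theta \) from the \(z\)-axis and with radius \(|2r-1|\):

```python
import numpy as np

def O(theta):
    c, s = np.cos(theta/2), np.sin(theta/2)
    return np.array([[c, -s], [s, c]])

def rho(r, theta):
    # rho(r, theta) = O(theta) diag(r, 1-r) O(theta)^T
    return O(theta) @ np.diag([r, 1.0 - r]) @ O(theta).T

R = rho(0.8, 0.9)
assert np.isclose(np.trace(R), 1.0)
assert np.allclose(np.sort(np.linalg.eigvalsh(R)), [0.2, 0.8])

# Bloch vector (n_x, n_y, n_z); the matrix is real, so n_y = 0
nx, nz = 2*R[0, 1], R[0, 0] - R[1, 1]
assert np.isclose(np.hypot(nx, nz), 0.6)      # radius |2r - 1|
assert np.isclose(np.arctan2(nx, nz), 0.9)    # angle theta from the z-axis
```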

Observe first that if \(s \in \{0,1\}\) then \(\rho ^A\) is pure, and if \(r \in \{0,1\}\) then \(\rho ^B\) is pure. In any such case an explicit solution of the qubit transport problem is given by (6.2).

Theorem B.1

Let \(\rho ^A = \rho (s,0), \rho ^B = \rho (r,\theta )\) and assume that \(0<r,s<1\). Then

$$\begin{aligned} \textrm{T}^Q_{C^Q}(\rho ^A,\rho ^B) = \max _{\phi \in \Phi (s,r,\theta )} \frac{1}{4}\Big (\sqrt{1+ (2s-1)\cos \phi } - \sqrt{1+(2r-1)\cos (\theta +\phi )}\Big )^2, \end{aligned}$$

where \(\Phi (s,r,\theta )\) is the set of all \(\phi \in [0,2\pi )\) satisfying the equation

$$\begin{aligned} \frac{(2s-1)^2\sin ^2\phi }{1+(2s-1)\cos \phi }=\frac{(2r-1)^2\sin ^2(\theta +\phi )}{1+(2r-1)\cos (\theta +\phi )}. \end{aligned}$$
(B.1)

Proof

A unitary \(2 \times 2\) matrix U can be parametrized, up to a global phase, with three angles \(\alpha , \beta , \phi \in [0,2\pi )\),

$$\begin{aligned} U = \begin{bmatrix} e^{{\textbf{i}}\alpha } &{} 0\\ 0 &{} e^{-{\textbf{i}}\alpha } \end{bmatrix} O(\phi ) \begin{bmatrix} e^{{\textbf{i}}\beta } &{} 0\\ 0 &{} e^{-{\textbf{i}}\beta } \end{bmatrix}. \end{aligned}$$

Thus, setting \(f(r,\theta ;\alpha ,\phi ) = (U^{\dagger }\rho (r,\theta ) U)_{11}\), we have

$$\begin{aligned} f(r,\theta ;\alpha ,\phi ) = \frac{1}{2} \Big ( 1+ (2 r-1) \big ( \cos (\theta ) \cos (\phi ) + \cos (2 \alpha ) \sin (\theta ) \sin (\phi ) \big ) \Big ). \end{aligned}$$

This quantity does not depend on the parameter \(\beta \), so we can set \(\beta = 0\). Note also that \(f(s,0;\alpha ,\phi )\) does not depend on \(\alpha \). With \(\rho ^A = \rho (s,0), \rho ^B = \rho (r,\theta )\), Theorem 5.1 yields

$$\begin{aligned} \textrm{T}^Q_{C^Q}(\rho ^A,\rho ^B) = \frac{1}{2}\max _{\alpha ,\phi \in [0,2\pi )} \Big ( \sqrt{f(s,0;0,\phi )} - \sqrt{f(r,\theta ;\alpha ,\phi )} \Big )^2. \end{aligned}$$

Now, note that the equation \(\partial _\alpha f(r,\theta ;\alpha ,\phi ) = 0\) yields the extreme points \(\alpha _0 = k \pi /2\), with \(k \in {\mathbb {Z}}\). Since \(f(r,\theta ;\alpha + \pi ,\phi ) = f(r,\theta ;\alpha ,\phi )\) we can take just \(\alpha _0 \in \{0,\pi /2\}\). Consequently,

$$\begin{aligned} \textrm{T}^Q_{C^Q}(\rho ^A,\rho ^B) = \max _{\phi \in [0,2\pi )} \{ g_-(s,r,\theta ;\phi ), g_+(s,r,\theta ;\phi ) \}, \end{aligned}$$

where we introduce the auxiliary functions

$$\begin{aligned} g_\pm (s,r,\theta ;\phi ) = \frac{1}{4}\Big (\sqrt{1+ (2s-1)\cos \phi } - \sqrt{1+(2r-1)\cos (\theta \pm \phi )}\Big )^2. \end{aligned}$$
(B.2)

But since \(g_-(s,r,\theta ;2\pi - \phi ) = g_+(s,r,\theta ;\phi )\) we can actually drop the ± index in the above formula. In conclusion, we have shown that it is sufficient to take \(U = O(\phi )\) for \(\phi \in [0,2\pi )\) in Formula (5.2).

Finally, it is straightforward to show that the equation \(\partial _\phi g(s,r,\theta ;\phi ) = 0\) is equivalent to (B.1). Hence, \(\Phi (s,r,\theta )\) is the set of extreme points and the theorem follows. \(\square \)

Lemma B.2

The Eq. (B.1) has at most six solutions \(\phi \in [0,2\pi )\) for given \(r,s\in (0,1), \theta \in [0,2\pi )\). Moreover, there is an open set of parameters \(s,r\in (0,1),\theta \in [0,2\pi )\) on which there are exactly six distinct solutions.

Proof

Write \(z=e^{{\textbf{i}}\phi }, \zeta =e^{{\textbf{i}}\theta }\). Then

$$\begin{aligned}&2\cos \phi =z+\frac{1}{z},{} & {} 2{\textbf{i}}\sin \phi =z-\frac{1}{z}, \\&2\cos (\theta +\phi )=\zeta z+\frac{1}{\zeta z},{} & {} 2{\textbf{i}}\sin (\theta +\phi )=\zeta z-\frac{1}{\zeta z}. \end{aligned}$$

Thus (B.1) is equivalent to

$$\begin{aligned}{} & {} (1-2 r)^2 \left[ (2 s-1) \left( z^2+1\right) +2 z\right] \left( \zeta ^2 z^2-1\right) ^2 \nonumber \\{} & {} \quad -\zeta (1-2 s)^2 \left( z^2-1\right) ^2 \left[ (2 r-1) \left( \zeta ^2 z^2+1\right) +2 \zeta z\right] = 0. \end{aligned}$$
(B.3)

This is a 6th order polynomial equation in the variable z, so it has at most 6 solutions. Since we must have \(\vert z \vert = 1\), not every complex root of (B.3) yields a real solution to the original (B.1). Nevertheless, it can be shown that there exist open sets in the parameter space \(s,r \in (0,1)\), \(\theta \in [0,2\pi )\) on which (B.1) does have 6 distinct solutions.

Observe that if \(\theta =0\), \(s,r\in (0,1)\), and \(s\ne r\), then two solutions to Eq. (B.1) are \(\phi \in \{0,\pi \}\), which means that \(z=\pm 1\). In this case Eq. (B.1) reads

$$\begin{aligned} \sin ^2\phi \, \bigg (\frac{(2s-1)^2}{1+(2s-1)\cos \phi }-\frac{(2r-1)^2}{1+(2r-1)\cos (\phi )}\bigg )=0. \end{aligned}$$

As \(\sin ^2\phi =-(1/4)z^{-2}(z^2-1)^2\) we see that \(z=\pm 1\) is a double root.

Another solution \(\phi \notin \{0,\pi \}\) is given by

$$\begin{aligned} \cos \phi =\frac{(2s-1)^2-(2r-1)^2}{(2r-1)^2(2s-1)-(2r-1)(2s-1)^2}=\frac{2(1-r-s)}{(2r-1)(2s-1)}. \end{aligned}$$

Assume that \(r+s=1\). Then \(\cos \phi =0\), so \(\phi \in \{\pi /2, 3\pi /2\}\). Thus if \(r+s\) is close to 1, then \(\phi \) takes two values, close to \(\pi /2\) and \(3\pi /2\) respectively. Hence in this case we have 6 solutions, counting with multiplicities.

We now take a small \(|\theta |>0\). The two simple solutions \(\phi \) are close to \(\pi /2\) and \(3\pi /2\). We now need to show that the double roots \(\pm 1\) split into two pairs of solutions on the unit circle: one pair close to 1 and the other pair close to \(-1\). Let us consider the pair close to 1, i.e., \(\phi \) close to zero. Then Eq. (B.1) can be written in the form

$$\begin{aligned}{} & {} (2s-1)^2\big (1+(2r-1)\cos (\theta +\phi )\big )\sin ^2\phi \\{} & {} \quad - (2r-1)^2\big (1+(2s-1)\cos \phi \big )\sin ^2(\theta +\phi )=0. \end{aligned}$$

Replacing \(\sin \phi , \sin (\theta +\phi )\) by \(\phi , \theta +\phi \) respectively, we see that to leading order the equation becomes \((2s-1)^2(2r)\phi ^2-(2r-1)^2 2s(\theta +\phi )^2=0\). Solving it, we obtain two possible Taylor series of \(\phi \) in terms of \(\theta \):

$$\begin{aligned} \phi _1(\theta )&=\frac{(2r-1)\sqrt{s}\theta }{(2s-1)\sqrt{r}-(2r-1)\sqrt{s}} + \theta ^2 E_1(\theta ), \\ \phi _2(\theta )&=-\frac{(2r-1)\sqrt{s}\theta }{(2s-1)\sqrt{r}+(2r-1)\sqrt{s}}+\theta ^2E_2(\theta ). \end{aligned}$$

The implicit function theorem shows that \(E_1(\theta )\) and \(E_2(\theta )\) are analytic in \(\theta \) in a neighborhood of 0. Hence in this case we have 6 different solutions. \(\square \)

We have thus shown that the general solution of the quantum transport problem of a single qubit with cost matrix \(C^Q = \tfrac{1}{2} \big ({\mathbb {I}}_{4} - S\big )\) is equivalent to solving a 6th degree polynomial equation with certain parameters. For some specific values of these parameters an explicit analytic solution can be given. This is discussed in the next subsection.
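The reduction to a sextic can also be checked numerically. The sketch below (an illustration with arbitrarily chosen parameters, not from the paper) assembles the coefficients of (B.3) with numpy, extracts the roots of unit modulus, and confirms that each corresponding phase solves (B.1):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def sextic_coeffs(s, r, theta):
    # coefficient arrays (lowest degree first) of the factors in Eq. (B.3)
    zeta = np.exp(1j*theta)
    A = np.array([2*s - 1, 2.0, 2*s - 1], dtype=complex)           # (2s-1)(z^2+1)+2z
    B = np.array([1.0, 0, -2*zeta**2, 0, zeta**4], dtype=complex)  # (zeta^2 z^2-1)^2
    C = np.array([1.0, 0, -2.0, 0, 1.0], dtype=complex)            # (z^2-1)^2
    D = np.array([2*r - 1, 2*zeta, (2*r - 1)*zeta**2], dtype=complex)
    return (1 - 2*r)**2 * P.polymul(A, B) - zeta*(1 - 2*s)**2 * P.polymul(C, D)

s, r, theta = 0.3, 0.7, 1.0
roots = P.polyroots(sextic_coeffs(s, r, theta))
unit = roots[np.abs(np.abs(roots) - 1) < 1e-6]   # keep roots on the unit circle
assert 2 <= len(unit) <= 6   # at most six phases; at least a maximum and a minimum
for z in unit:
    phi = np.angle(z) % (2*np.pi)
    lhs = (2*s - 1)**2*np.sin(phi)**2/(1 + (2*s - 1)*np.cos(phi))
    rhs = (2*r - 1)**2*np.sin(theta + phi)**2/(1 + (2*r - 1)*np.cos(theta + phi))
    assert abs(lhs - rhs) < 1e-6   # the phase solves Eq. (B.1)
```

For the chosen parameters the denominators in (B.1) stay away from zero, so the equivalence of (B.1) and (B.3) on the unit circle is exact.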

1.2 B.2: Two Isospectral Density Matrices of a Single Qubit

In view of unitary invariance (2.14) and the results of the previous section we can assume that two isospectral qubits have the following form: \(\rho ^A = \rho (s,0)\) and \(\rho ^B = \rho (s,\theta )\) for some \(s \in [0,1]\) and \(\theta \in [0,2\pi )\).

Theorem B.3

For any \(s \in [0,1]\) and \(\theta \in [0,2\pi )\) we have

$$\begin{aligned} \textrm{T}^Q_{C^Q} \big (\rho (s,0),\rho (s,\theta ) \big ) = \Big ( \tfrac{1}{2} -\sqrt{s(1-s)} \Big ) \sin ^2 (\theta /2). \end{aligned}$$
(B.4)

Proof

Note first that if the states \(\rho ^A,\rho ^B\) are pure, i.e. \(s = 0\) or \(s=1\), formula (B.4) gives \(\textrm{T}^Q_{C^Q} \big (\rho (s,0),\rho (s,\theta ) \big ) = \tfrac{1}{2} \sin ^2 (\theta / 2)\), which agrees with (6.2).

From now on we assume that \(\rho ^A, \rho ^B\) are not pure. When \(r = s\), (B.3) simplifies to the following:

$$\begin{aligned}{} & {} (\zeta -1) (1-2 s)^2 \left( \zeta z^2-1\right) \times \nonumber \\{} & {} \quad \times \left[ 4 s (\zeta +1) \left( \zeta z^2+1\right) z +(2 s-1) (z-1)^2 (\zeta z-1)^2 \right] = 0. \end{aligned}$$
(B.5)

Equation (B.5) is satisfied when \(z = \pm \zeta ^{-1/2}\). This corresponds to \(\phi _0 = -\theta /2\) or \(\phi _0' = \pi - \theta /2\). Observe, however, that \(g(s,s,\theta ;\phi _0) = g(s,s,\theta ;\phi _0') = 0\), so we can safely ignore \(\phi _0, \phi _0' \in \Phi (s,s,\theta )\) when taking the maximum in Theorem B.1.

Hence, we are left with a 4th order equation

$$\begin{aligned} 4 s (\zeta +1) \left( \zeta z^2+1\right) z +(2 s-1) (z-1)^2 (\zeta z-1)^2 = 0, \end{aligned}$$
(B.6)

which reads

$$\begin{aligned} (2 s-1) \big [ 2 + \cos (\theta +2 \phi )+ \cos (\theta ) \big ] +2 \big [ \cos (\theta +\phi )+ \cos (\phi ) \big ] = 0. \end{aligned}$$
(B.7)

Now, observe that if \(\phi \) satisfies (B.7), then so does \(\phi ' = -\phi - \theta \). This translates to the fact that if z satisfies (B.6), then so does \((z \zeta )^{-1}\). Furthermore, \(g(s,s,\theta ;\phi ) = g(s,s,\theta ;\phi ')\). Hence, in the isospectral case we are effectively taking the maximum over just two values of \(\phi \).

Let us now seek an angle \(\phi _1 \in [0,2\pi )\) such that \(g(s,s,\theta ;\phi _1)\) equals the right-hand side of (B.4). The latter equation reads

$$\begin{aligned}&\Big \{ (2 s-1) \big [\cos \left( \theta +\phi _1\right) +\cos \left( \phi _1\right) \big ] -\big (2 \sqrt{s(1-s)}-1\big ) \big (\cos (\theta )-1\big )+2\Big \}^2 \\&\quad = 4 \big [(2 s-1) \cos \left( \phi _1\right) +1\big ] \big [(2 s-1) \cos \left( \theta +\phi _1\right) +1\big ]. \end{aligned}$$

In terms of z and \(\zeta \), the above is equivalent to a 4th order polynomial equation in z, which can be recast in the following form:

$$\begin{aligned} \Big [ \zeta (1-2 s) z^2+(\zeta +1) \big (2 \sqrt{s(1-s)}-1\big ) z-2 s+1 \Big ]^2 = 0. \end{aligned}$$
(B.8)

Hence, (B.8) has two double roots:

$$\begin{aligned}{} & {} z_1^{\pm } = \big [ 2 \zeta (1-2 s) \big ]^{-1} \bigg \{ (\zeta +1) \big ( 1-2 \sqrt{s(1-s)} \, \big ) \\{} & {} \qquad \quad \pm \sqrt{(\zeta +1)^2 \big (1-2 \sqrt{s(1-s)} \,\big )^2-4 \zeta (1-2s)^2} \bigg \}. \end{aligned}$$

Furthermore, one can check that \(z_1^{-} = (\zeta z_1^{+})^{-1}\).

Now, it turns out that \(z_1^{\pm }\) are also solutions to (B.6), as one can quickly verify using Mathematica [64]. We thus conclude that \(\phi _1, \phi _1' \in \Phi (s,s,\theta )\).

We now divide the polynomial in (B.6) by \((z-z_1^{+})(z-z_1^{-})\). We are left with the following quadratic equation

$$\begin{aligned} \zeta \Big [ (2 s-1) \left( \zeta z^2+1\right) +(\zeta +1) \big (2 \sqrt{(1-s) s}+1\big ) z\Big ] = 0. \end{aligned}$$

Its solutions are

$$\begin{aligned}{} & {} z_2^{\pm } = \big [ 2 \zeta (1-2 s) \big ]^{-1} \bigg \{ (\zeta +1) \big ( 1+2 \sqrt{s(1-s)} \, \big ) \\{} & {} \qquad \quad \pm \sqrt{(\zeta +1)^2 \big (1+2 \sqrt{s(1-s)} \,\big )^2-4 \zeta (1-2s)^2} \bigg \}. \end{aligned}$$

Again, we have \(z_2^{-} = (\zeta z_2^{+})^{-1}\), in agreement with the symmetry argument. Setting \(z_2^+ =:e^{{\textbf{i}}\phi _2}\) and \(z_2^- =:e^{{\textbf{i}}\phi _2'}\) we have \(\phi _2, \phi _2' \in \Phi (s,s,\theta )\). Then we deduce that

$$\begin{aligned} g(s,s,\theta ;\phi _2)&= g(s,s,\theta ;\phi _2') = \tfrac{1}{4} \Big [ 1-6 \sqrt{(1-s) s} - \big (1+2 \sqrt{(1-s) s} \, \big ) \cos (\theta ) \Big ]. \end{aligned}$$

Finally, we observe that

$$\begin{aligned} g(s,s,\theta ;\phi _1) - g(s,s,\theta ;\phi _2) = \sqrt{(1-s) s} \, \big (1+\cos (\theta ) \big ) \ge 0. \end{aligned}$$

This shows that, for any \(s \in (0,1)\), \(\theta \in [0,2\pi )\),

$$\begin{aligned} \textrm{T}^Q_{C^Q} \big (\rho (s,0),\rho (s,\theta ) \big ) = g(s,s,\theta ;\phi _1), \end{aligned}$$

and (B.4) follows. \(\square \)

Note that the expression for \(g(s,s,\theta ;\phi _2)\) can become negative for certain values of s and \(\theta \); for such values \(\phi _2, \phi _2'\) are not real angles and the set \(\Phi \) of phases defined in Theorem B.1 reads \(\Phi (s,s,\theta ) = \{\phi _0,\phi _0',\phi _1,\phi _1'\}\).
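As a sanity check (not part of the proof), the closed formula (B.4) can be compared with a direct grid maximization, and the double root \(z_1^{+}\) can be confirmed to lie on the unit circle and to reproduce the optimal value; the parameter values below are arbitrary:

```python
import numpy as np

def g(s, r, theta, phi):
    # the function maximized in Theorem B.1
    return 0.25*(np.sqrt(1 + (2*s - 1)*np.cos(phi))
                 - np.sqrt(1 + (2*r - 1)*np.cos(theta + phi)))**2

s, theta = 0.3, 1.2
# closed form (B.4)
closed = (0.5 - np.sqrt(s*(1 - s))) * np.sin(theta/2)**2
# brute-force maximization over phi
phi = np.linspace(0.0, 2*np.pi, 400001, endpoint=False)
brute = g(s, s, theta, phi).max()
assert abs(closed - brute) < 1e-6
# the double root z_1^+ of (B.8) reproduces the maximizing phase
zeta = np.exp(1j*theta)
disc = np.sqrt((zeta + 1)**2*(1 - 2*np.sqrt(s*(1 - s)))**2 - 4*zeta*(1 - 2*s)**2)
z1p = ((zeta + 1)*(1 - 2*np.sqrt(s*(1 - s))) + disc) / (2*zeta*(1 - 2*s))
assert abs(abs(z1p) - 1) < 1e-8            # z_1^+ lies on the unit circle
assert abs(g(s, s, theta, np.angle(z1p)) - closed) < 1e-8
```

Either branch of the complex square root yields one of the pair \(z_1^{\pm}\); both give the same value of g.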

1.3 B.3: An Example Where the Supremum (3.1) is not Achieved

Assume that \(m=n=2\), \( C=C^Q\), \( \rho ^A=\vert 0 \rangle \langle 0 \vert = \left[ {\begin{matrix} 1 &{}\quad 0 \\ 0 &{}\quad 0 \end{matrix}} \right] \) and \( \rho ^B={\mathbb {I}}_2/2\). Recall that in such a case, \(\Gamma ^Q(\rho ^A,\rho ^B)=\{\rho ^A\otimes \rho ^B\}\) and

$$\begin{aligned} \rho ^A\otimes \rho ^B=\left[ \begin{array}{rrrr}\frac{1}{2}&{}\quad 0&{}\quad 0&{}\quad 0\\ 0&{}\quad \frac{1}{2}&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad 0 \end{array}\right] . \end{aligned}$$

Hence \(\textrm{T}^Q_{C^Q}(\rho ^A,\rho ^B)=1/4\). We can easily see that the supremum in (3.1) is not attained in this case. Let F be of the form (5.12). Suppose that there exist \(\sigma ^A,\sigma ^B\in \textrm{S}({\mathcal {H}}_2)\) such that \(F\ge 0\) and \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)={{\,\textrm{Tr}\,}}(\sigma ^A\rho ^A+\sigma ^B\rho ^B)\). As in the proof of Theorem 3.2 we deduce that \({{\,\textrm{Tr}\,}}F( \rho ^A\otimes \rho ^B)=0\). Hence the (1, 1) and (2, 2) entries of F are zero. Since \(F\ge 0\) it follows that the first and the second row and column of F are zero. Observe next that the (2, 3) and (3, 2) entries of F are \(-1/2\). Hence such \(\sigma ^A,\sigma ^B\) do not exist.
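The value \(1/4\) follows from a one-line trace computation, sketched below for illustration:

```python
import numpy as np

# SWAP operator S on C^2 x C^2, in the product basis |ab>
S = np.zeros((4, 4))
for a in range(2):
    for b in range(2):
        S[2*a + b, 2*b + a] = 1.0
# quantum cost matrix: projector onto the antisymmetric (singlet) subspace
CQ = (np.eye(4) - S) / 2
# the unique coupling rho^A (x) rho^B for rho^A = |0><0|, rho^B = I/2
rho_AB = np.diag([0.5, 0.5, 0.0, 0.0])
TQ = np.trace(CQ @ rho_AB)
assert abs(TQ - 0.25) < 1e-12
```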

Since \(\rho ^A\) is not positive definite and \(\rho ^B\) is positive definite, as pointed out in the proof of Proposition 2.4, one can replace \(\rho ^{A}\) by \(\rho ^{A'}=[1]\in \Omega _1\). Then the dual problem for \(\rho ^{A'},\rho ^B\) boils down to

$$\begin{aligned}&\sigma ^{A'}=-a',\quad \sigma ^B={{\,\textrm{diag}\,}}(-e,-g),\quad F={{\,\textrm{diag}\,}}(a'+e,a'+g+1/2)\ge 0,\\&\max _{a'+e\ge 0, a'+g+1/2\ge 0} \Big ( -a'-\frac{e+g}{2} \Big ). \end{aligned}$$

Then the above maximum is 1/4, achieved for \(a'=-1/2 +t, e=1/2-t, g=-t\) for each \(t\in {\mathbb {R}}\).
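A quick numerical sweep (illustrative only) confirms that the dual objective equals 1/4 along the whole unbounded line of optimal solutions:

```python
import numpy as np

for t in np.linspace(-5.0, 5.0, 11):
    a, e, g = -0.5 + t, 0.5 - t, -t
    # feasibility: F = diag(a'+e, a'+g+1/2) >= 0
    assert a + e >= -1e-12 and a + g + 0.5 >= -1e-12
    # the objective -a' - (e+g)/2 equals 1/4 along the whole line
    assert abs(-a - (e + g)/2 - 0.25) < 1e-12
```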

To summarize: the supremum of the dual problem to \(\textrm{T}_{C^Q}^Q(\rho ^A,\rho ^B)\) is achieved at infinity, while the supremum of the dual problem to \(\textrm{T}_{C^Q}^Q(\rho ^{A'},\rho ^B)\) is achieved on an unbounded set.

Appendix C: Diagonal States of a Qutrit

In this section we provide a closed formula for \(\textrm{T}_{C^{Q}}^Q(\rho ^A,\rho ^B)\) for a large class of classical qutrits, i.e. diagonal matrices \(\rho ^A,\rho ^B\in \Omega _3\).

Theorem C.1

Let \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,t_3)^\top \in {\mathbb {R}}^3\) be probability vectors. Then the quantum optimal transport cost for diagonal qutrits is given by the following formulas:

  1. (a)
    $$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}\max _{p\in [3]}(\sqrt{s_p}-\sqrt{t_p})^2 \end{aligned}$$

    if and only if the conditions (5.9) hold for \(n=3\).

  2. (b)

Suppose that there exists a renaming of 1, 2, 3 by \(p,q,r\) such that

    $$\begin{aligned} \begin{aligned}&t_r\ge s_p+s_q \text { and}\\&\text {either } s_p\ge t_p>0, s_q\ge t_q>0 \; \text { or } \; t_p\ge s_p>0, t_q\ge s_q >0. \end{aligned} \end{aligned}$$
    (C.1)

    Then

    $$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}\Big ((\sqrt{s_p}-\sqrt{t_p})^2 +(\sqrt{s_q}-\sqrt{t_q})^2\Big ). \end{aligned}$$
    (C.2)
  3. (c)

    Suppose that there exists \(\{p,q,r\}=\{1,2,3\}\) such that

    $$\begin{aligned} s_p>t_q>0, \quad t_p>s_q>0, \quad s_q+s_r\ge t_p, \end{aligned}$$
    (C.3)

    and

    $$\begin{aligned} \begin{aligned}&1 +\frac{\sqrt{t_q}}{\sqrt{s_q}}-\sqrt{\frac{s_p-t_q}{t_p-s_q}}\ge 0, \qquad 1+\frac{\sqrt{s_q}}{\sqrt{t_q}}-\sqrt{\frac{t_p-s_q}{s_p-t_q}}\ge 0,\\&\left( 1+\frac{\sqrt{t_q}}{\sqrt{s_q}}-\sqrt{\frac{s_p-t_q}{t_p-s_q}} \,\right) \left( 1+\frac{\sqrt{s_q}}{\sqrt{t_q}}-\sqrt{\frac{t_p-s_q}{s_p-t_q}} \, \right) \ge 1, \\&\max \Big (\frac{s_q}{t_q},\frac{t_q}{s_q}\Big )\ge \max \Big (\frac{s_p-t_q}{t_p-s_q},\frac{t_p-s_q}{s_p-t_q} \Big ). \end{aligned} \end{aligned}$$
    (C.4)

    Then

    $$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))= \frac{1}{2}\Big ((\sqrt{s_q}-\sqrt{t_q})^2 +(\sqrt{s_p-t_q}-\sqrt{t_p-s_q})^2\Big ).\qquad \quad \end{aligned}$$
    (C.5)
  4. (d)

    Assume that \({\textbf{s}}=(s_1,s_2,0)^\top ,{\textbf{t}}=(t_1,t_2,t_3)^\top \) are probability vectors. Then

    $$\begin{aligned} \textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))= {\left\{ \begin{array}{ll} \frac{1}{2}\big ((\sqrt{t_1}-\sqrt{t_2})^2+t_3\big ), &{} \text { if } s_1\ge t_2 \text { and } s_2\ge t_1,\\ \frac{1}{2}\big ((\sqrt{t_1}-\sqrt{s_1})^2+t_3\big ), &{} \text { if } s_1< t_2,\\ \frac{1}{2}\big ((\sqrt{t_2}-\sqrt{s_2})^2+t_3 \big ), &{} \text { if } s_2< t_1. \end{array}\right. }\nonumber \\ \end{aligned}$$
    (C.6)

    If \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,0)^\top \), then formula (C.6) holds after the swapping \(s_i \leftrightarrow t_i\).

Proof

(a) This follows from Theorem 5.3.

(b) Suppose that the condition (C.1) holds. By relabeling the coordinates and interchanging \({\textbf{s}}\) and \({\textbf{t}}\) if needed we can assume the conditions (C.1) are satisfied with \(p=1,q=2, r=3\):

$$\begin{aligned} s_1\ge t_1>0,\quad s_2\ge t_2>0, \quad t_3\ge s_1+s_2. \end{aligned}$$

Hence

$$\begin{aligned} X^\star =\begin{bmatrix}0&{}0&{}s_1\\ 0&{}0&{}s_2\\ t_1&{}t_2&{}t_3-(s_1+s_2)\end{bmatrix} \in \Gamma ^{cl}({\textbf{s}},{\textbf{t}}). \end{aligned}$$
(C.7)

We claim that the conditions (C.1) yield that \(X^\star \) is a minimizing matrix for \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) as given in (4.5). To show this we use the complementary conditions in Lemma 5.2. Let \(R^\star \in \Gamma ^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) be the matrix induced by \(X^\star \) of the form described in part (a) of Lemma 4.2. That is, the diagonal entries of \(R^\star \) are \(R^\star _{(i,j)(i,j)}=x^\star _{ij}\), with additional nonnegative entries \(R^\star _{(i,j)(j,i)}=\sqrt{x^\star _{ij}x^\star _{ji}}\) for \(i\ne j\). Clearly, \(R^\star \) is a direct sum of 3 submatrices of order 1 and 3 submatrices of order 2, as above. Let \(F^\star \) be defined as in Lemma 5.2 with the following parameters:

$$\begin{aligned}&a_{1}^\star =\frac{1}{2}\Big (\frac{\sqrt{t_1}}{\sqrt{s_1}}-1\Big ), \qquad b_{1}^\star =\frac{1}{2}\Big (\frac{\sqrt{s_1}}{\sqrt{t_1}}-1\Big ),\nonumber \\&a_{2}^\star =\frac{1}{2}\Big (\frac{\sqrt{t_2}}{\sqrt{s_2}}-1\Big ), \qquad b_{2}^\star =\frac{1}{2}\Big (\frac{\sqrt{s_2}}{\sqrt{t_2}}-1\Big ),\nonumber \\&a_3^\star =b_3^\star =0. \end{aligned}$$
(C.8)

We claim that the conditions (C.1) yield that \(F^\star \) is positive semidefinite. We verify that the three blocks of size one and the three blocks of size two of \(F^\star \) are positive semidefinite. The condition \(a_i^\star +b_i^\star \ge 0\) for \(i\in [3]\) is straightforward. The conditions for \(M_{13}^\star \) and \(M_{23}^\star \) are straightforward. We now show that \(M_{12}^\star \) is positive semidefinite. First note that as \(s_1\ge t_1\) and \(s_2\ge t_2\) we get that \(b_1^\star \ge 0\) and \(b_2^\star \ge 0\). Clearly \(a_1^\star >-1/2\) and \(a_2^\star >-1/2\). Hence the diagonal entries of \(M_{12}^\star \) are positive. It is left to show that \(\det M_{12}^\star \ge 0\). Set \(u=\sqrt{t_1}/{\sqrt{s_1}}\le 1\) and \(v=\sqrt{s_2}/{\sqrt{t_2}}\ge 1\). Then

$$\begin{aligned} 2(a_1^\star +b_2^\star +1/2)=u+v-1,{} & {} 2(a_2^\star +b_1^\star +1/2)=1/u+1/v-1, \end{aligned}$$
$$\begin{aligned} 4\det M_{12}^\star&=(u+v-1)(1/u+1/v-1)-1 \\&=\frac{1}{uv}\big ((u+v-1)(u+v-uv)-uv\big ) \\&=\frac{1}{uv}\, (u+v)(1-u)(v-1)\ge 0. \end{aligned}$$
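The determinant identity above is purely algebraic and can be spot-checked numerically for random \(u\le 1\le v\) (an illustration, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.uniform(0.01, 1.0)    # u = sqrt(t1/s1) <= 1
    v = rng.uniform(1.0, 100.0)   # v = sqrt(s2/t2) >= 1
    lhs = (u + v - 1)*(1/u + 1/v - 1) - 1
    rhs = (u + v)*(1 - u)*(v - 1)/(u*v)
    assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(rhs))
    assert rhs >= 0.0             # hence det M_12 >= 0
```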

We next observe that the equalities (5.3) hold. The first three equalities hold as \(x_{11}^\star =x_{22}^\star =a_3^\star +b_3^\star =0\). The equality for \(i=1,j=2\) holds as \(x_{12}^\star =x_{21}^\star =0\). The equalities for \(i=1, j=3\) and \(i=2,j=3\) follow from the following equalities:

$$\begin{aligned}&x_{13}^\star (a_1^\star +b_3^\star +1/2)+x_{31}^\star (a_3^\star +b_1^\star +1/2)= \tfrac{1}{2}\big (s_1\tfrac{\sqrt{t_1}}{\sqrt{s_1}}+t_1\tfrac{\sqrt{s_1}}{\sqrt{t_1}}\big )=\sqrt{s_1t_1}=\sqrt{x_{13}^\star x_{31}^\star },\\&x_{23}^\star (a_2^\star +b_3^\star +1/2)+x_{32}^\star (a_3^\star +b_2^\star +1/2)= \tfrac{1}{2}\big (s_2\tfrac{\sqrt{t_2}}{\sqrt{s_2}}+t_2\tfrac{\sqrt{s_2}}{\sqrt{t_2}}\big )=\sqrt{s_2t_2}=\sqrt{x_{23}^\star x_{32}^\star }. \end{aligned}$$

Hence \({{\,\textrm{Tr}\,}}R^\star F^\star =0\) and \(X^\star \) is a minimizing matrix. Therefore (C.2) holds for \(p=1\), \(q=2\).

(c) Suppose that the condition (C.3) holds. By relabeling the coordinates we can assume the conditions (C.3) are satisfied with \(p=1,q=2, r=3\):

$$\begin{aligned} s_1>t_2,\quad t_1> s_2, \quad s_2+s_3-t_1\ge 0. \end{aligned}$$

Hence

$$\begin{aligned} X^\star =\begin{bmatrix}0&{}t_2&{}s_1-t_2\\ s_2&{}0&{}0\\ t_1-s_2&{}0&{}s_2+s_3-t_1\end{bmatrix}\in \Gamma ^{cl}({\textbf{s}},{\textbf{t}}). \end{aligned}$$
(C.9)

We claim that the conditions (C.4) yield that \(X^\star \) is a minimizing matrix for \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) as given in (4.5). To show this we use the complementary conditions in Lemma 5.2. Let \(R^\star \in \Gamma ^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) be the matrix induced by \(X^\star \) of the form described in part (a) of Lemma 4.2. Recall that \(R^\star \) is a direct sum of 3 submatrices of order 1 and 3 submatrices of order 2, as above. Let \(F^\star \) correspond to

$$\begin{aligned} \begin{aligned} a^\star _1=\frac{1}{2}\Big (\frac{\sqrt{t_1-s_2}}{\sqrt{s_1-t_2}}-1\Big ),{} & {} a_2^\star = \frac{1}{2}\Big (\frac{\sqrt{t_2}}{\sqrt{s_2}}-\sqrt{\frac{s_1-t_2}{t_1-s_2}} \, \Big ),{} & {} a_3^\star =0,\\ b^\star _1=\frac{1}{2}\Big (\frac{\sqrt{s_1-t_2}}{\sqrt{t_1-s_2}}-1\Big ),{} & {} b_2^\star =\frac{1}{2}\Big (\frac{\sqrt{s_2}}{\sqrt{t_2}}-\sqrt{\frac{t_1-s_2}{s_1-t_2}} \,\Big ),{} & {} b_3^\star =0. \end{aligned} \end{aligned}$$
(C.10)

We claim that (C.4) yield that \(F^\star \) is positive semidefinite. We verify that the three blocks of size one and the three blocks of size two matrices of \(F^\star \) are positive semidefinite. The condition \(a_1^\star +b_1^\star \ge 0\) is straightforward. To show the condition \(a_2^\star +b_2^\star \ge 0\) we argue as follows. Let

$$\begin{aligned} u=\frac{\sqrt{t_2}}{\sqrt{s_2}}, \quad v=\sqrt{\frac{s_1-t_2}{t_1-s_2}}. \end{aligned}$$

Then \(2(a_2^\star +b_2^\star )=u+1/u-(v+1/v)\). The fourth condition of (C.4) is \(\max (u,1/u)\ge \max (v,1/v)\). As \(w+1/w\) increases on \([1,\infty )\) we deduce that \(a_2^\star +b_2^\star \ge 0\). Clearly \(a_3^\star +b_3^\star =0\). We now show that the matrices (5.5) are positive semidefinite, where the last three inequalities follow from the first three inequalities of (C.4):

$$\begin{aligned}&2(a_1^\star +b_2^\star +1/2)=\frac{\sqrt{s_2}}{\sqrt{t_2}}>0, \qquad \qquad 2(a_2^\star +b_1^\star +1/2)=\frac{\sqrt{t_2}}{\sqrt{s_2}}>0,\\&(a_1^\star +b_2^\star +1/2)(a_2^\star +b_1^\star +1/2)-1/4=0,\\&2(a_1^\star +b_3^\star +1/2)=\frac{\sqrt{t_1-s_2}}{\sqrt{s_1-t_2}}>0, \qquad 2(a_3^\star +b_1^\star +1/2)=\frac{\sqrt{s_1-t_2}}{\sqrt{t_1-s_2}}>0,\\&(a_1^\star +b_3^\star +1/2)(a_3^\star +b_1^\star +1/2)-1/4=0,\\&2(a_2^\star +b_3^\star +1/2)=\frac{\sqrt{s_2}}{\sqrt{t_2}}-\sqrt{\frac{t_1-s_2}{s_1-t_2}}+1\ge 0,\\&2(a_3^\star +b_2^\star +1/2)=\frac{\sqrt{t_2}}{\sqrt{s_2}}-\sqrt{\frac{s_1-t_2}{t_1-s_2}}+1\ge 0,\\&(a_2^\star +b_3^\star +1/2)(a_3^\star +b_2^\star +1/2)-1/4\ge 0. \end{aligned}$$

Moreover, the conditions (5.3) hold: as \(x_{11}^\star =x_{22}^\star = a_3^\star +b^\star _3=0\) the first three conditions of (5.3) hold, and as \(x_{23}^\star =x_{32}^\star =0\) the second conditions of (5.3) for \(p=2,q=3\) trivially hold. The other two conditions follow from the following equalities:

$$\begin{aligned}&x_{12}^\star (a_{1}^\star +b_{2}^\star +1/2) + x_{21}^\star (a_{2}^\star +b_{1}^\star +1/2) -\sqrt{x_{12}^\star x_{21}^\star } \\&\quad = t_2\frac{\sqrt{s_2}}{2\sqrt{t_2}}+s_2\frac{\sqrt{t_2}}{2\sqrt{s_2}}-\sqrt{t_2 s_2}=0,\\&x_{13}^\star (a_{1}^\star +b_{3}^\star +1/2) + x_{31}^\star (a_{3}^\star +b_{1}^\star +1/2) -\sqrt{x_{13}^\star x_{31}^\star }\\&\quad = (s_1-t_2)\frac{\sqrt{t_1-s_2}}{2\sqrt{s_1-t_2}}+(t_1-s_2)\frac{\sqrt{s_1-t_2}}{2\sqrt{t_1-s_2}}-\sqrt{(s_1-t_2)(t_1- s_2)}=0. \end{aligned}$$

Hence \({{\,\textrm{Tr}\,}}F^\star R^\star =0\), and therefore

$$\begin{aligned}{} & {} \textrm{T}^Q_{C^Q}\big ({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}})\big )={{\,\textrm{Tr}\,}}C^Q R^\star \\{} & {} \quad = \frac{1}{2}\big (t_2+s_2+(s_1-t_2)+(t_1-s_2)\big )-\sqrt{t_2 s_2}-\sqrt{(s_1-t_2)(t_1-s_2)}. \end{aligned}$$

This proves (C.5).
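The final step uses the algebraic identity \(\frac{1}{2}(x+y)-\sqrt{xy}=\frac{1}{2}(\sqrt{x}-\sqrt{y})^2\) applied twice; a numerical spot check (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    s2, t2 = rng.uniform(0.01, 1.0, 2)
    a, b = rng.uniform(0.0, 1.0, 2)   # a stands for s1 - t2, b for t1 - s2 (both >= 0)
    lhs = 0.5*(t2 + s2 + a + b) - np.sqrt(t2*s2) - np.sqrt(a*b)
    rhs = 0.5*((np.sqrt(s2) - np.sqrt(t2))**2 + (np.sqrt(a) - np.sqrt(b))**2)
    assert abs(lhs - rhs) < 1e-12
```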

(d) Observe that the third row of every matrix in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) is a zero row. Let \({\textbf{s}}'=(s_1,s_2)^\top \). Thus \(\Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) is obtained from \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) by deleting the third row in each matrix in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). Proposition 2.4 yields that

$$\begin{aligned} \textrm{T}_{C^Q}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\textrm{T}_{C^Q_{2,3}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}'),{{\,\textrm{diag}\,}}({\textbf{t}})). \end{aligned}$$

(See Lemma 4.3 for the definition of \(C_{2, 3}^Q\).) We use now the minimum characterization of \(\textrm{T}_{C^Q_{2,3}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}'),{{\,\textrm{diag}\,}}({\textbf{t}}))\) given in (4.5). Assume that the minimum is achieved for \(X^\star =[x_{il}^\star ]\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}}), i\in [2],l\in [3]\). We claim that either \(x_{11}^\star =0\) or \(x_{22}^\star =0\).

Let \(Y=[x_{il}^\star ], i,l\in [2]\). Suppose first that \(Y=0\). Then \(t_1=t_2=0\) and \(t_3=1\). So \({{\,\textrm{diag}\,}}({\textbf{t}})\) is a rank-one matrix and \({{\,\textrm{Tr}\,}}\big ( {{\,\textrm{diag}\,}}({\textbf{s}}){{\,\textrm{diag}\,}}({\textbf{t}}) \big )=0\). The equality (6.2) yields that \(\textrm{T}_{C^Q}^Q \big ({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}) \big )=1/2\). Clearly, \(s_1\ge t_2=0, s_2\ge t_1=0\). Hence (C.6) holds.

Suppose second that \(Y\ne 0\). Then \(t_1+t_2\), the sum of the entries of Y, is positive. Using continuity arguments it is enough to consider the case \(t_1,t_2,t_3>0\). Denote by \(\Gamma '\) the set of all matrices \(X=[x_{il}] \in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) such that \(x_{i3}=x_{i3}^\star \) for \(i=1,2\). Let f be defined by (4.8). Clearly \(\min _{A\in \Gamma '} f(A)=f(Y)\). We now translate this minimum to the minimum problem we studied above.

Let \(Z=\frac{1}{t_1+t_2} Y\). The vectors corresponding to the row sums and the column sums of Z are the probability vectors \({\hat{{\textbf{s}}}} = ({\hat{s}}_1,{\hat{s}}_2)^\top \) and \({\hat{{\textbf{t}}}}=\frac{1}{t_1+t_2}(t_1,t_2)^\top \) respectively. Consider the minimum problem \(\min _{W\in \Gamma ^{cl}({\hat{{\textbf{s}}}},{\hat{{\textbf{t}}}})} f(W)\). The proof of Lemma 4.5 yields that this minimum is achieved at \(W^\star \) which has at least one zero diagonal element. Hence Y has at least one zero diagonal element.

Assume first that Y has two zero diagonal elements. Then \(X^\star =\begin{bmatrix}0&{}t_2&{}s_1-t_2\\ t_1&{}0&{}s_2-t_1\end{bmatrix}\). This corresponds to the first case of (C.6). It is left to show that \(X^\star \) is a minimizing matrix. By a continuity argument we may assume that \(s_1>t_2, s_2>t_1\). Let \(B\in {\mathbb {R}}^{2\times 3}\) be a nonzero matrix such that \(X^\star +cB\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) for \(c\in [0,\varepsilon ]\) for some small positive \(\varepsilon \). Then \(B=\begin{bmatrix}a&{}-b&{}-a+b\\ -a&{}b&{}a-b\end{bmatrix}\), where \(a,b\ge 0\) and \(a^2+b^2>0\). It is clear that \(f(X^\star )<f(X^\star +cB)\) for each \(c\in (0,\varepsilon ]\). This proves the first case of (C.6).

Assume second that \(x_{11}^\star =0\) and \(x_{22}^\star >0\). Observe that \(x_{21}^\star =t_1>0\). We claim that \(x_{13}^\star =0\). Indeed, suppose that it is not the case. Let \(B=\begin{bmatrix}0&{}1&{}-1\\ 0&{}-1&{}1\end{bmatrix}\). Then \(X^\star + cB\in \Gamma ^{cl}({\textbf{s}}',{\textbf{t}})\) for \(c\in [0,\varepsilon ]\) for some positive \(\varepsilon \). Clearly \(f(X^\star +cB)<f(X^\star )\) for \(c\in (0,\varepsilon ]\). This contradicts the minimality of \(X^\star \). Hence \(x_{13}^\star =0\). Therefore \(X^\star =\begin{bmatrix}0&{}s_1&{}0\\ t_1&{}t_2-s_1&{}t_3\end{bmatrix}\). This corresponds to the second case of (C.6).

The third case is when \(x_{11}^\star >0\) and \(x_{22}^\star =0\). We show, as in the second case, that \(x_{23}^\star =0\). Then \(X^\star =\begin{bmatrix}t_1-s_2&{}t_2&{}t_3\\ s_2&{}0&{}0\end{bmatrix}\). This corresponds to the third case of (C.6).

The case \({\textbf{s}}=(s_1,s_2,s_3)^\top ,{\textbf{t}}=(t_1,t_2,0)^\top \) is completely analogous, hence the proof is complete. \(\square \)

Based on numerical studies, we conjecture that the cases (a)–(d) exhaust the parameter space \(\Pi _3 \times \Pi _3\). Nevertheless, we include for completeness an analysis of the quantum optimal transport \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) under the assumption that this is not the case. The employed techniques might prove useful when studying more general qutrit states or diagonal ququarts.

Proposition C.2

Let \(O\subset \Pi _3\times \Pi _3\) be the set of pairs \({\textbf{s}},{\textbf{t}}\) which meet none of the conditions (a)–(d) from Theorem C.1. Suppose that O is nonempty. Then each minimizing \(X^\star \) in the characterization (4.5) of \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) has zero diagonal. Let \(O'\subset O\) be an open dense subset of O such that for each \(({\textbf{s}},{\textbf{t}})\in O'\) and each triple \(\{i,j,k\}=[3]\) the inequalities \(s_p\ne t_q\) and \(s_p+s_q\ne t_r\) hold. Assume that \(({\textbf{s}},{\textbf{t}})\in O'\). The set of matrices in \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) with zero diagonal is an interval spanned by two distinct extreme points \(E_1,E_2\), which have exactly five positive off-diagonal elements. Let \(Z(u)=uE_1+(1-u)E_2\) for \(u\in [0,1]\). Then the minimum of the function \(f(Z(u)), u\in [0,1]\), where f is defined by (4.8), is attained at a unique point \(u^\star \in (0,1)\). The point \(u^\star \) is the unique solution in the interval (0, 1) to a polynomial equation of degree at most 12. The matrix \(X^\star =Z(u^\star )\) is the minimizing matrix for the second minimum problem in (4.5), and \(\textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=f(X^\star )\).

Proof

Assume that the set \(O\subset \Pi _3\times \Pi _3\) is nonempty. Combine Theorem 5.3 with part (a) of Theorem C.1 to deduce that if the conditions (5.9) do not hold for \(n=3\), then

$$\begin{aligned} \textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))>\max _{p\in [3]} \frac{1}{2}(\sqrt{s_p}-\sqrt{t_p})^2. \end{aligned}$$
(C.11)

In view of our assumption the above inequality holds. We first observe that \(s_p\ne t_p\) for each \(p\in [3]\). Assume to the contrary that \(s_p=t_p\) for some \(p\in [3]\); without loss of generality \(s_3=t_3\). Suppose that in addition \(s_q=t_q\) for some \(q\in [2]\). Then \({\textbf{s}}={\textbf{t}}\) and

$$\begin{aligned} \textrm{T}_{C^{Q}}^Q({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}\max _{p\in [3]}(\sqrt{s_p}-\sqrt{t_p})^2=0. \end{aligned}$$

This contradicts (C.11). Hence there exists \(q\in [2]\) such that \(s_q>t_q\). Without loss of generality we can assume that \(s_2>t_2\); therefore \(s_1<t_1\), as \(s_1+s_2=t_1+t_2=1-s_3=1-t_3\). Hence for \(Y=\begin{bmatrix}s_1&{}0\\ t_1-s_1&{}t_2\end{bmatrix}\) we have \(X=Y\oplus [s_3]\in \Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). Recall that \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\). We replace Y by \(Y^\star =Y+u^\star \begin{bmatrix}-1&{}1\\ 1&{}-1\end{bmatrix}\) such that \(u^\star >0, Y^\star \ge 0\), and one of the diagonal elements of \(Y^\star \) is zero. By relabeling \(\{1,2\}\) if necessary we can assume that \(Y^\star =\begin{bmatrix}0&{} s_1\\ t_1&{}t_2-s_1\end{bmatrix}\). So \(t_2\ge s_1\) and \(X^\star =Y^\star \oplus [s_3]\in \Gamma ^{cl}({\textbf{s}},{\textbf{t}})\). The minimal characterization (4.5) of \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) yields

$$\begin{aligned} \textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\le f(X^\star )=\frac{1}{2}(\sqrt{s_1}-\sqrt{t_1})^2. \end{aligned}$$

This contradicts (C.11).

As \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\) there exists a maximizing matrix \(F^\star \) to the dual problem of the form given by Lemma 5.2. Let \(X^\star \) be the corresponding minimizing matrix. We claim that \(X^\star \) has zero diagonal. Assume first that \(X^\star \) has a positive diagonal. Then the arguments in part (b) of Lemma 5.2 yield that \(X^\star \) is a symmetric matrix. Thus \({\textbf{s}}={\textbf{t}}\), and this contradicts (C.11).

Assume second that \(X^\star \) has two positive diagonal entries. By renaming the indices we can assume that \(x_{11}^\star =0\), \(x_{22}^\star , x_{33}^\star >0\). Part (b) of Lemma 5.2 and the arguments of its proof yield that we can assume that \(a_2^\star =a_3^\star =b_2^\star =0\). Let \(u^\star =a_1^\star +1/2, v^\star =b_1^\star +1/2\). As \(M_{12}^\star \) is positive semidefinite we have the inequalities \(u^\star \ge 0, v^\star \ge 0, u^\star v^\star \ge 1/4\). Hence \(u^\star>0, v^\star >0\). Recall that \(F^\star \) is a maximizing matrix for the dual problem (3.1). Hence

$$\begin{aligned} \textrm{T}^Q_{C^Q}\big ({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}})\big )&= -(u^\star -1/2)s_1-(v^\star -1/2)t_1 \\&= -u^\star s_1-v^\star t_1+(s_1+t_1)/2 \\&\le -u^\star s_1-t_1/(4u^\star ) +(s_1+t_1)/2\\&\le -\sqrt{s_1t_1} +(s_1+t_1)/2=(\sqrt{s_1}-\sqrt{t_1})^2/2. \end{aligned}$$

This contradicts (C.11).
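The chain of inequalities above rests on two elementary facts: \(v^\star \ge 1/(4u^\star )\) from \(u^\star v^\star \ge 1/4\), and the AM–GM bound \(us_1+t_1/(4u)\ge \sqrt{s_1t_1}\) for all \(u>0\), with equality at \(u=\sqrt{t_1}/(2\sqrt{s_1})\). A numerical sketch with illustrative values of \(s_1,t_1\):

```python
import numpy as np

s1, t1 = 0.3, 0.6  # illustrative positive marginal entries

# AM-GM: u*s1 + t1/(4u) >= sqrt(s1*t1) for all u > 0 ...
u = np.linspace(1e-3, 5.0, 100001)
g = u * s1 + t1 / (4 * u)
assert g.min() >= np.sqrt(s1 * t1) - 1e-12

# ... with equality at u = sqrt(t1)/(2*sqrt(s1))
u_eq = np.sqrt(t1) / (2 * np.sqrt(s1))
print(g.min(), u_eq * s1 + t1 / (4 * u_eq), np.sqrt(s1 * t1))
```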

We now assume that \(X^\star \) has one positive diagonal entry. By renaming the indices 1, 2, 3 we can assume that \(x_{11}^\star =x_{22}^\star =0, x_{33}^\star >0\). The conditions (5.3) yield that \(a_3^\star +b_3^\star =0\). Since we can choose \(b_3^\star =0\) we assume that \(a_3^\star =b_3^\star =0\).

Let us assume, case (A1), that \(X^\star \) has six positive off-diagonal entries. We first claim that either \(x^\star _{13}=x^\star _{31}\) or \(x^\star _{23}=x^\star _{32}\). (Those are equivalent conditions if we interchange the indices 1 and 2.) We deduce these conditions and an extra condition using the second conditions of (5.4). First we consider \(x^\star _{12}, x_{13}^\star , x_{32}^\star ,x_{33}^\star \), that is, \(i=p=3\), \(j=1\), \(q=2\). By replacing these entries by \(x^\star _{12}-v, x_{13}^\star +v, x_{32}^\star +v,x_{33}^\star -v\) we obtain the equalities

$$\begin{aligned} 1 +x=y+z, \qquad x=\frac{\sqrt{x_{21}^\star }}{\sqrt{x_{12}^\star }}, \quad y=\frac{\sqrt{x_{31}^\star }}{\sqrt{x_{13}^\star }}, \quad z=\frac{\sqrt{x_{23}^\star }}{\sqrt{x_{32}^\star }}. \end{aligned}$$

Second we consider \(x^\star _{21}, x_{23}^\star , x_{31}^\star ,x_{33}^\star \). By replacing these entries by \(x^\star _{21}-v, x_{23}^\star +v, x_{31}^\star +v,x_{33}^\star -v\) we obtain the equality:

$$\begin{aligned} 1+\frac{1}{x}=\frac{1}{z} +\frac{1}{y}. \end{aligned}$$

Multiplying the first equality by the second we deduce

$$\begin{aligned} x+\frac{1}{x}=u+\frac{1}{u}, \quad u=\frac{y}{z}\Rightarrow \text { either } x=u \text { or } x=\frac{1}{u}. \end{aligned}$$

Assume first that \(x=u=y/z\). Substitute that into the first equality to deduce that \(z=1\), which implies that \(x_{23}^\star =x_{32}^\star \). Similarly, if \(x=1/u\) we deduce that \(y=1\), which implies that \(x_{13}^\star =x_{31}^\star \). Let us assume for simplicity of exposition that \(x_{23}^\star =x_{32}^\star \). Let X(w) be obtained from \(X^\star \) by replacing \(x_{22}^\star =0,x_{23}^\star , x_{32}^\star , x_{33}^\star \) with \(x_{22}^\star +w,x_{23}^\star -w, x_{32}^\star -w, x_{33}^\star +w\) for \(0<w<x_{23}^\star \). Then X(w) is a minimizing matrix and has two positive diagonal entries. This contradicts our assumption that \(X^\star \) has only one positive diagonal entry.
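The dichotomy used above, \(x+1/x=u+1/u\Rightarrow x=u\) or \(x=1/u\), follows from the factorization \(x^2-(u+1/u)x+1=(x-u)(x-1/u)\): the quadratic is monic with constant term 1, so its roots are \(u\) and \(1/u\). A numerical sketch over random values of \(u\):

```python
import numpy as np

# x + 1/x = u + 1/u  <=>  x^2 - (u + 1/u)x + 1 = 0, whose roots are u and 1/u
rng = np.random.default_rng(0)
for u in rng.uniform(1.5, 10.0, size=100):  # u bounded away from 1, so roots are distinct
    roots = np.roots([1.0, -(u + 1.0 / u), 1.0])
    assert np.max(np.abs(roots.imag)) < 1e-9          # both roots are real
    assert np.allclose(np.sort(roots.real), np.sort([1.0 / u, u]))
print("dichotomy verified on 100 samples")
```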

We now consider the case (A2) that \(x^\star _{ij}=0\) for some \(i\ne j\). Part (a) of Lemma 5.2 yields that \(x^\star _{ji}=0\). We claim that all four off-diagonal entries are positive. Assume to the contrary that \(x^\star _{pq}=0\) for some \(p\ne q\) and \(\{p,q\}\ne \{i,j\}\). Then \(x_{qp}^\star =0\). As \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\) we must have that \(x^\star _{12}x^\star _{21}>0\) and all four other off-diagonal entries are zero. But then \(s_1=t_2, t_1=s_2, s_3=t_3\). This is impossible since we showed that \(s_3\ne t_3\). Hence \(X^\star \) has exactly four positive off-diagonal entries.

Let us assume first that \(x_{12}^\star =x_{21}^\star =0\). Then \(X^\star \) is of the form given by (C.7), where \(t_3>s_1+s_2\). We now recall again the conditions (5.3). As we already showed, we can assume that \(a_3^\star =b_3^\star =0\). As \(x_{11}^\star =x_{22}^\star =0\) all of the first three conditions of (5.3) hold. As \(x_{12}^\star =x_{21}^\star =0\) the second condition of (5.3) holds trivially for \(i=1,j=2\). The conditions for \(i=1,j=3\) and \(i=2, j=3\) are

$$\begin{aligned}&s_1(a_1^\star +1/2)+t_1(b_1^\star +1/2)=\sqrt{s_1t_1},\\&s_2(a_2^\star +1/2)+t_2(b_2^\star +1/2)=\sqrt{s_2t_2}. \end{aligned}$$

We claim that (C.8) holds. As \(M_{13}^\star \) is positive semidefinite we have \((a_1^\star +1/2)(b_1^\star +1/2)\ge 1/4\); combining this with the inequality of arithmetic and geometric means we deduce that \((a_1^\star +1/2)(b_1^\star +1/2)= 1/4\). Hence

$$\begin{aligned}&a_1^\star +1/2=u, \quad b_1^\star +1/2=1/(4u), \qquad \text { for some }u>0, \\&s_1u+t_1/(4u)\ge \sqrt{s_1t_1}. \end{aligned}$$

Equality holds if and only if \(u=\sqrt{t_1}/(2\sqrt{s_1})\). This shows the first equality in (C.8). The second equality in (C.8) is deduced similarly. We now show that the conditions (C.1) hold for \(i=1,j=2,k=3\). As \(t_3>s_1+s_2\) the first condition of (C.1) holds. We use the conditions that \(M_{12}^\star \) is positive semidefinite. Let \(u=\sqrt{t_1}/\sqrt{s_1}, v=\sqrt{s_2}/\sqrt{t_2}\). Then the arguments of the proof of part (b) yield

$$\begin{aligned}&2(a_1^\star +b_2^\star +1/2)=u+v-1>0, \quad 2(a_2^\star +b_1^\star +1/2)=1/u+1/v-1>0,\\&4\det M_{12}^\star =\big ((u+v)/(uv)\big )(1-u)(v-1). \end{aligned}$$

So either \(u\ge 1\) and \(v\le 1\), or \(u\le 1\) and \(v\ge 1\). Hence (C.1) holds for \(i=1,j=2,k=3\). This contradicts our assumption that (C.1) does not hold.

Let us assume second that \(x_{12}^\star>0, x_{21}^\star >0\). Then either \(x_{13}^\star =x_{31}^\star =0\) or \(x_{23}^\star =x_{32}^\star =0\). By relabeling 1, 2 we can assume that \(x_{23}^\star =x_{32}^\star =0\). Hence \(X^\star \) is of the form (C.9), where \(s_1>t_2>0, t_1>s_2>0, s_2+s_3>t_1\). Hence the conditions (C.3) are satisfied with \(i=1,j=2, k=3\). We now obtain a contradiction by showing that the conditions (C.4) are satisfied. This is done using the same arguments as in the previous case as follows. First observe that the second nontrivial conditions of (5.3) are:

$$\begin{aligned}&t_2(a_1^\star +b_2^\star +1/2)+s_2(a_2^\star +b_1^\star +1/2)=\sqrt{s_2t_2},\\&(s_1-t_2)(a_1^\star +1/2)+(t_1-s_2)(b_1^\star +1/2)=\sqrt{(s_1-t_2)(t_1-s_2)}. \end{aligned}$$

As in the previous case we deduce that

$$\begin{aligned}&a_1^\star +b_2^\star +1/2=\sqrt{s_2}/(2\sqrt{t_2}),{} & {} b_1^\star +a_2^\star +1/2=\sqrt{t_2}/(2\sqrt{s_2}),\\&a_1^\star +1/2=\sqrt{t_1-s_2}/(2\sqrt{s_1-t_2}),{} & {} b_1^\star +1/2=\sqrt{s_1-t_2}/(2\sqrt{t_1-s_2}). \end{aligned}$$

Hence (C.10) holds, and arguing as in the proof of part (c) of the theorem we obtain a contradiction. We have thus shown that the minimizing matrix \(X^\star \) has zero diagonal.

We now show that O is an open set. Clearly, the set \(O_1\subset \Pi _3\times \Pi _3\) of all pairs of probability vectors such that at least one of them has a zero coordinate is a closed set. Let \(O_2, O_3, O_4\subset \Pi _3\times \Pi _3\) be the sets which satisfy the conditions (a), (b), (c) of the theorem, respectively. It is straightforward to show that \(O_2\) is a closed set and that Closure\((O_3)\subset O_3\cup O_1\).

We now show that Closure\((O_4)\subset O_4\cup O_1\cup O_2\). Indeed, assume that we have a sequence \(({\textbf{s}}_l,{\textbf{t}}_l)\in O_4, l\in {\mathbb {N}}\), that converges to \(({\textbf{s}},{\textbf{t}})\). It is enough to consider the case where \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\). Again we can assume for simplicity that each \(({\textbf{s}}_l,{\textbf{t}}_l)\) satisfies the conditions (C.3) and (C.4) for \(i=1, j=2,k=3\). Then the corresponding minimizing matrices \(X^\star _l\) are of the form (C.9), and, passing to a subsequence if necessary, \(\lim _{l\rightarrow \infty }X^\star _l=X^\star \), where \(X^\star \) is of the form (C.9). Moreover, \(X^\star \) is a minimizing matrix for \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\). Recall that \(s_2,t_2>0\). If \(s_1-t_2>0\) and \(t_1-s_2>0\) then \(({\textbf{s}},{\textbf{t}})\in O_4\). So assume that \((s_1-t_2)(t_1-s_2)=0\). As \(X^\star \) minimizes \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))\) and \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\), part (a) of Lemma 5.2 yields that \(s_1=t_2\) and \(t_1=s_2\). Hence \(s_3=t_3\) and \(\textrm{T}^Q_{C^Q}({{\,\textrm{diag}\,}}({\textbf{s}}),{{\,\textrm{diag}\,}}({\textbf{t}}))=\frac{1}{2}(\sqrt{s_2}-\sqrt{t_2})^2\). Hence \(({\textbf{s}},{\textbf{t}})\in O_2\). This shows that \(O_1\cup O_2\cup O_3\cup O_4\) is a closed set. Therefore \(O=\Pi _3\times \Pi _3{\setminus }(O_1\cup O_2\cup O_3\cup O_4)\) is an open set. If O is an empty set, the proof of the theorem is concluded.

Assume that O is a nonempty set. Let \(O'\subset O\) be an open dense subset of O such that for each \(({\textbf{s}},{\textbf{t}})\in O'\) and each triple \(\{p,q,r\}=[3]\) the conditions \(s_p\ne t_q\) and \(s_p+s_q\ne t_r\) hold.

Assume that \(({\textbf{s}},{\textbf{t}})\in O'\). Let \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) be the convex subset of \(\Gamma ^{cl}({\textbf{s}},{\textbf{t}})\) of matrices with zero diagonal. We claim that any \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has at least 5 nonzero entries. Indeed, suppose that \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has two zero off-diagonal entries. As \({\textbf{s}},{\textbf{t}}>{\textbf{0}}\) they cannot be in the same row or column. By relabeling the rows we can assume that the two zero entries are in the first and the second row. Suppose first that \(x_{12}=x_{23}=0\). Then \(X=\begin{bmatrix}0&{}0&{}s_1\\ s_2&{}0&{}0\\ t_1-s_2&{}t_2&{}0\end{bmatrix}\). Thus \(s_1=t_3\), which is impossible. Assume now that \(x_{12}=x_{21}=0\). Then \(s_1+s_2=t_3\), which is impossible. Similarly, all other choices are impossible.

We claim that \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) is spanned by two distinct extreme points \(E_1,E_2\), which have exactly five positive off-diagonal elements. Suppose first that there exists \(X\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) which has six positive off-diagonal elements. Let

$$\begin{aligned} B=\begin{bmatrix}0&{}1&{}-1\\ -1&{}0&{}1\\ 1&{}-1&{}0\end{bmatrix}. \end{aligned}$$

Then all matrices in \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) are of the form \(X+uB\), \(u\in [u_1,u_2]\), for some \(u_1<u_2\). Consider the matrix \(E_1=X+u_1B\). It has at least one zero off-diagonal entry; since every matrix in \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has at least five nonzero entries, we conclude that \(E_1\) has exactly five positive off-diagonal elements. Similarly \(E_2=X+u_2B\) has five positive off-diagonal elements. Assume now that \(E\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) has five positive off-diagonal elements. Then there exists a small \(u>0\) such that either \(E+uB\) or \(E-uB\) has six positive off-diagonal elements. Hence \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) contains a matrix with six positive off-diagonal elements. Therefore \(\Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\) is an interval spanned by \(E_1\ne E_2\in \Gamma ^{cl}_0({\textbf{s}},{\textbf{t}})\), where \(E_1\) and \(E_2\) have five positive off-diagonal elements. Part (a) of Lemma 5.2 yields that the minimizing matrix \(X^\star \) has six positive off-diagonal elements. Consider \(E_1\) and assume that the (1, 2) entry of \(E_1\) is zero. Then

$$\begin{aligned} E_1=\begin{bmatrix}0&{}0&{}s_1\\ s_1+s_2-t_3&{}0&{}t_3-s_1\\ s_3-t_2&{}t_2&{}0 \end{bmatrix}. \end{aligned}$$

As \(f(E_1+uB)\) is strictly convex on \([0,u_3]\), where \(u_3>0\) is such that \(E_2=E_1+u_3B\), there exists a unique \(u^\star \in (0, u_3)\) which satisfies the equation

$$\begin{aligned}{} & {} -\frac{\sqrt{s_1+s_2-t_3-u}}{\sqrt{u}}+\frac{\sqrt{u}}{\sqrt{s_1+s_2-t_3-u}} -\frac{\sqrt{s_1-u}}{\sqrt{s_3-t_2+u}} \\{} & {} \quad +\frac{\sqrt{s_3-t_2+u}}{\sqrt{s_1-u}}- \frac{\sqrt{t_2-u}}{\sqrt{t_3-s_1+u}}+\frac{\sqrt{t_3-s_1+u}}{\sqrt{t_2-u}}=0. \end{aligned}$$

It is not difficult to show that the above equation is equivalent to a polynomial equation of degree at most 12 in u. Indeed, group the six terms into three groups, multiply by the common denominator, and pass the last group to the other side of the equality to obtain the equality:

$$\begin{aligned}&\sqrt{(s_1-u)(s_3-t_2+u)(t_3-s_1+u)(t_2-u)}\,(2u+t_3-s_1-s_2)\\&\qquad +\sqrt{u(s_1+s_2-t_3-u)(t_3-s_1+u)(t_2-u)}\,(2u+s_3-s_1-t_2)\\&\quad =\sqrt{u(s_1+s_2-t_3-u)(s_1-u)(s_3-t_2+u)}\,(-2u+s_1+t_2-t_3). \end{aligned}$$

Raise this equality to the second power. Put all polynomial terms of degree 6 on the left hand side, and the one term with a square radical on the other side. Raise to the second power to obtain a polynomial equation in u of degree at most 12. Hence \(X^\star =E_1+u^\star B\). This completes the proof of (e). \(\square \)
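As a numerical sanity check of this last step (illustrative, not part of the proof): the sketch below uses marginals satisfying the constraints of this case (\(s_1+s_2>t_3>s_1\), \(s_3>t_2>0\)), assumes the objective \(f(X)=\frac{1}{2}\sum _{i<j}(\sqrt{x_{ij}}-\sqrt{x_{ji}})^2\) whose stationarity condition along \(E_1+uB\) is the six-term equation displayed above, and finds the unique interior root \(u^\star \) by bisection.

```python
import numpy as np

# illustrative marginals with s1+s2 > t3 > s1 and s3 > t2 > 0 (assumed case)
s1, s2, s3 = 0.45, 0.35, 0.20
t1, t2, t3 = 0.30, 0.15, 0.55

E1 = np.array([[0.0, 0.0, s1],
               [s1 + s2 - t3, 0.0, t3 - s1],
               [s3 - t2, t2, 0.0]])
B = np.array([[0.0, 1.0, -1.0], [-1.0, 0.0, 1.0], [1.0, -1.0, 0.0]])

def f(X):
    # assumed objective: f(X) = 1/2 * sum_{i<j} (sqrt(x_ij) - sqrt(x_ji))^2
    return 0.5 * sum((np.sqrt(X[i, j]) - np.sqrt(X[j, i])) ** 2
                     for i in range(3) for j in range(i + 1, 3))

def g(u):
    # the six-term stationarity equation for f(E1 + u*B)
    a, c, d = s1 + s2 - t3, s3 - t2, t3 - s1
    return (-np.sqrt(a - u) / np.sqrt(u) + np.sqrt(u) / np.sqrt(a - u)
            - np.sqrt(s1 - u) / np.sqrt(c + u) + np.sqrt(c + u) / np.sqrt(s1 - u)
            - np.sqrt(t2 - u) / np.sqrt(d + u) + np.sqrt(d + u) / np.sqrt(t2 - u))

u3 = min(s1, s1 + s2 - t3, t2)  # E1 + u*B is entrywise nonnegative on [0, u3]
lo, hi = 1e-12, u3 - 1e-12
assert g(lo) < 0 < g(hi)        # sign change: the root is interior
for _ in range(200):            # bisection; g is monotone as f is strictly convex
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
u_star = 0.5 * (lo + hi)

# the stationary point minimizes f along the segment E1 + u*B
grid = np.linspace(1e-6, u3 - 1e-6, 1001)
assert f(E1 + u_star * B) <= min(f(E1 + u * B) for u in grid) + 1e-9
print(u_star)
```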


Cite this article

Cole, S., Eckstein, M., Friedland, S. et al. On Quantum Optimal Transport. Math Phys Anal Geom 26, 14 (2023). https://doi.org/10.1007/s11040-023-09456-7

