Machine Learning, 78:63

Bayesian generalized probability calculus for density matrices

Open Access


One of the main concepts in quantum physics is the density matrix, a symmetric positive semidefinite matrix of trace one. Finite probability distributions arise as the special case in which the density matrix is restricted to be diagonal.
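The definition can be made concrete with a small numerical check. The following is a minimal sketch, assuming NumPy; the helper names `is_density_matrix` and `from_probability_vector` are hypothetical and chosen only for illustration. It allows zero eigenvalues (an assumption, so that probability vectors with zero entries still qualify) and shows both the diagonal special case and a non-diagonal mixture of outer products of unit vectors.

```python
import numpy as np

def is_density_matrix(W, tol=1e-10):
    """Return True if W is symmetric, has no negative eigenvalues, and has trace one."""
    W = np.asarray(W, dtype=float)
    if W.ndim != 2 or W.shape[0] != W.shape[1]:
        return False
    if not np.allclose(W, W.T, atol=tol):
        return False
    eigenvalues = np.linalg.eigvalsh(W)  # eigenvalues of a symmetric matrix
    return bool(eigenvalues.min() >= -tol and abs(np.trace(W) - 1.0) <= tol)

def from_probability_vector(p):
    """Embed a finite probability distribution as a diagonal density matrix."""
    return np.diag(np.asarray(p, dtype=float))

# Diagonal special case: an ordinary probability vector.
p = [0.5, 0.3, 0.2]
W_diag = from_probability_vector(p)

# Non-diagonal case: a mixture of outer products u u^T of unit vectors.
u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, 1.0]) / np.sqrt(2.0)
W_mix = 0.25 * np.outer(u1, u1) + 0.75 * np.outer(u2, u2)

# Symmetric with trace one, but one eigenvalue is negative.
W_bad = np.array([[0.5, 0.6], [0.6, 0.5]])

print(is_density_matrix(W_diag), is_density_matrix(W_mix), is_density_matrix(W_bad))
```

The mixture weights 0.25 and 0.75 sum to one, which is what keeps the trace of `W_mix` at one; the eigenvectors of `W_mix` are in general not the directions `u1` and `u2` used to build it.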

We develop a probability calculus based on these more general distributions. It includes definitions of joints and conditionals and the formulas that relate them, including analogs of the Theorem of Total Probability and various Bayes rules for calculating posterior density matrices. The resulting calculus parallels the familiar “conventional” probability calculus and always retains it as a special case when all matrices are diagonal. We motivate both the conventional and the generalized Bayes rule with a minimum relative entropy principle: the Kullback-Leibler version gives the conventional Bayes rule, and Umegaki’s quantum relative entropy gives the new Bayes rule for density matrices.
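To illustrate how a Bayes rule for density matrices can reduce to the conventional one, here is a hedged sketch in NumPy. The abstract does not spell out the update, so the form used below, the normalized matrix exponential of the sum of the matrix logarithms of prior and likelihood, is an assumption taken from the way such updates are usually written in the matrix exponentiated gradient literature. The function names are hypothetical, and both inputs are assumed to be strictly positive definite so that the logarithm exists.

```python
import numpy as np

def sym_log(A):
    """Matrix logarithm of a symmetric, strictly positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.log(vals)) @ vecs.T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.exp(vals)) @ vecs.T

def bayes_conventional(prior, likelihood):
    """Conventional Bayes rule: posterior_i is proportional to prior_i * likelihood_i."""
    post = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    return post / post.sum()

def bayes_density(prior_W, likelihood_D):
    """Assumed generalized rule: exp(log prior + log likelihood), normalized to trace one."""
    unnormalized = sym_exp(sym_log(prior_W) + sym_log(likelihood_D))
    return unnormalized / np.trace(unnormalized)

# Diagonal case: the matrix rule should agree with the conventional rule.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.9, 0.5, 0.1])
post_vec = bayes_conventional(prior, likelihood)
post_diag = bayes_density(np.diag(prior), np.diag(likelihood))

# Non-diagonal case: prior and likelihood matrices that do not commute.
prior_W = np.array([[0.625, 0.375], [0.375, 0.375]])
likelihood_D = np.array([[0.8, 0.1], [0.1, 0.3]])
post_W = bayes_density(prior_W, likelihood_D)
```

When both matrices are diagonal, the logarithms and the exponential act entrywise, so the update collapses to multiplying prior and likelihood and renormalizing. When the matrices do not commute, the result generally differs from the plain matrix product, which is not even symmetric.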

Whereas conventional Bayesian methods maintain uncertainty about which model has the highest data likelihood, the generalization maintains uncertainty about which unit direction has the largest variance. Surprisingly, the bounds also generalize: as in the conventional setting, we upper bound the negative log likelihood of the data by the negative log likelihood of the MAP estimator.
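Both ideas in this paragraph can be checked numerically in small cases. The sketch below assumes NumPy and uses hypothetical helper names. It reads "variance along a unit direction u" as the quadratic form u^T W u, and it reads "the negative log likelihood of the MAP estimator" as the negative log of prior times likelihood for the best single model; both readings are assumptions. Only the conventional (diagonal) version of the bound is shown.

```python
import numpy as np

def variance_along(W, u):
    """Quadratic form u^T W u for the normalized direction u."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return float(u @ W @ u)

# A non-diagonal density matrix (symmetric, trace one).
W = np.array([[0.625, 0.375], [0.375, 0.375]])
eigenvalues, eigenvectors = np.linalg.eigh(W)  # eigenvalues in ascending order
top_direction = eigenvectors[:, -1]            # direction of largest variance
max_variance = float(eigenvalues[-1])

def neg_log_likelihood(prior, likelihood):
    """Negative log of the Bayesian mixture likelihood sum_i prior_i * likelihood_i."""
    return float(-np.log(np.dot(prior, likelihood)))

def map_bound(prior, likelihood):
    """Negative log of the largest single term prior_i * likelihood_i."""
    terms = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    return float(-np.log(terms.max()))

prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.9, 0.5, 0.1])
nll = neg_log_likelihood(prior, likelihood)
bound = map_bound(prior, likelihood)
print(nll <= bound, variance_along(W, [1.0, 0.0]) <= max_variance)
```

The bound holds because the mixture likelihood is a sum of nonnegative terms and is therefore at least as large as any single term; taking negative logarithms reverses the inequality.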


Keywords

Generalized probability, Probability calculus, Density matrix, Quantum Bayes rule



Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. University of California, Santa Cruz, Santa Cruz, USA
