A mini-introduction to information theory

Witten, Edward

doi:10.1007/s40766-020-00004-5

Review Paper
Published: 23 March 2020

Volume 43, pages 187–227 (2020)

La Rivista del Nuovo Cimento

Edward Witten¹

Abstract

This article consists of a very short introduction to classical and quantum information theory. Basic properties of the classical Shannon entropy and the quantum von Neumann entropy are described, along with related concepts such as classical and quantum relative entropy, conditional entropy, and mutual information. A few more detailed topics are considered in the quantum case.

Notes

The article is based on a lecture at the 2018 summer program Prospects in Theoretical Physics at the Institute for Advanced Study.
Generically, a random variable will be denoted X, Y, Z, etc. The probability to observe \(X=x\) is denoted \(P_X(x)\), so if \(x_i\), \(i=1,\ldots ,n\) are the possible values of X, then \(\sum _i P_X(x_i)=1\). Similarly, if X, Y are two random variables, the probability to observe \(X=x\), \(Y=y\) will be denoted \(P_{X,Y}(x,y)\).
Here \(\frac{N!}{\prod _{j=1}^s(p_jN)!}\) is the number of sequences in which outcome \(x_i\) occurs \(p_iN\) times, and \(\prod _{i=1}^s q_i^{p_iN}\) is the probability of any specific such sequence, assuming that the initial hypothesis \(Q_X\) is correct.
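As a rough numerical illustration of the count in this note, the sketch below evaluates the logarithm of the multinomial factor times the probability of one specific sequence, and compares minus that logarithm divided by N with the relative entropy \(\sum _i p_i\log (p_i/q_i)\). By Stirling's formula the two should agree up to corrections of order \((\log N)/N\). The function names, the sample distributions, and the use of natural logarithms are assumptions made for the example and are not taken from the article.

```python
import math

def log_prob_of_frequencies(p, q, N):
    """Log of  N! / prod_i (p_i N)!  *  prod_i q_i^(p_i N)."""
    counts = [round(pi * N) for pi in p]
    assert sum(counts) == N, "each p_i * N must be an integer and they must sum to N"
    # log of the number of sequences with the prescribed frequencies
    log_multinomial = math.lgamma(N + 1) - sum(math.lgamma(c + 1) for c in counts)
    # log of the probability of one such sequence under the hypothesis Q
    log_sequence_prob = sum(c * math.log(qi) for c, qi in zip(counts, q))
    return log_multinomial + log_sequence_prob

def relative_entropy(p, q):
    """Classical relative entropy sum_i p_i log(p_i / q_i), natural log."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative distributions (assumed, not from the article).
p = [0.5, 0.25, 0.25]   # observed frequencies
q = [0.25, 0.25, 0.5]   # initial hypothesis Q_X

for N in (40, 400, 4000, 40000):
    rate = -log_prob_of_frequencies(p, q, N) / N
    print(N, rate, relative_entropy(p, q))
```

The printed rate approaches the relative entropy from above as N grows, which is the sense in which the probability of seeing frequencies \(p_i\) under hypothesis \(Q_X\) decays exponentially in N.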
What we have described is not the most general statement of monotonicity of relative entropy in classical information theory. More generally, relative entropy is monotonic under an arbitrary stochastic map. We will not explain this here, though later we will explain the quantum analog (quantum relative entropy is monotonic in any quantum channel).
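The more general statement in this note, monotonicity under an arbitrary stochastic map, can be spot-checked numerically. The sketch below draws random distributions P, Q and random column-stochastic matrices T, and records the smallest value of \(S(P||Q)-S(TP||TQ)\) it encounters. All names, the matrix sizes, and the random seed are hypothetical choices for illustration; a random test of this kind is a sanity check, not a proof.

```python
import math
import random

def relative_entropy(p, q):
    """Classical relative entropy sum_i p_i log(p_i / q_i), natural log."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def random_distribution(n, rng):
    """A strictly positive probability vector of length n."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def random_stochastic_map(n_out, n_in, rng):
    """Matrix T with T[j][i] = Prob(output j | input i); each column sums to 1."""
    columns = [random_distribution(n_out, rng) for _ in range(n_in)]
    return [[columns[i][j] for i in range(n_in)] for j in range(n_out)]

def apply_map(T, p):
    """Push the distribution p forward through the stochastic map T."""
    return [sum(row[i] * p[i] for i in range(len(p))) for row in T]

rng = random.Random(0)
worst_gap = math.inf
for _ in range(1000):
    p = random_distribution(4, rng)
    q = random_distribution(4, rng)
    T = random_stochastic_map(3, 4, rng)
    gap = relative_entropy(p, q) - relative_entropy(apply_map(T, p), apply_map(T, q))
    worst_gap = min(worst_gap, gap)

print("smallest S(P||Q) - S(TP||TQ) found:", worst_gap)
```

Up to floating-point rounding, the smallest gap should never be negative.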
See, however, [6] for a partial substitute.
The von Neumann entropy is the most important quantum entropy, but generalizations such as the Rényi entropies \(S_\alpha (\rho _A)=\frac{1}{1-\alpha }\log \mathrm{Tr}\, \rho _A^\alpha \) can also be useful.
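A minimal sketch of the Rényi entropies defined in this note, assuming the density matrix has already been diagonalized so that only its eigenvalues are needed. The function names and the sample spectrum are illustrative assumptions. The loop shows numerically that \(S_\alpha \) approaches the von Neumann entropy as \(\alpha \rightarrow 1\).

```python
import math

def renyi_entropy(eigenvalues, alpha):
    """S_alpha = log(Tr rho^alpha) / (1 - alpha), from the eigenvalues of rho."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the von Neumann limit; use von_neumann_entropy")
    trace_power = sum(lam ** alpha for lam in eigenvalues if lam > 0)
    return math.log(trace_power) / (1 - alpha)

def von_neumann_entropy(eigenvalues):
    """S = -Tr rho log rho, from the eigenvalues of rho."""
    return -sum(lam * math.log(lam) for lam in eigenvalues if lam > 0)

# Illustrative spectrum of a 3-level density matrix (assumed, not from the article).
spectrum = [0.5, 0.3, 0.2]

for alpha in (0.5, 0.9, 0.999, 1.001, 2.0):
    print(alpha, renyi_entropy(spectrum, alpha))
print("von Neumann:", von_neumann_entropy(spectrum))
```

For a maximally mixed state every \(S_\alpha \) equals the logarithm of the dimension, which is a convenient check on any implementation.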
For this, consider an arbitrary density matrix \(\rho \) and a first order perturbation \(\rho \rightarrow \rho +\delta \rho \). After diagonalizing \(\rho \), one observes that to first order in \(\delta \rho \), the off-diagonal part of \(\delta \rho \) does not contribute to the trace in the definition of \(S(\rho +\delta \rho )\). Therefore, \(S(\rho (t))\) can be differentiated assuming that \(\rho \) and \({{\dot{\rho }}}\) commute. So it suffices to check (3.35) for a diagonal family of density matrices \(\rho (t)={\mathrm {diag}}(\lambda _1(t),\lambda _2(t),\ldots ,\lambda _n(t))\), with \(\sum _i \lambda _i(t)=1\). Another approach is to use (3.36) to substitute for \(\log \rho (t)\) in the definition \(S(\rho (t))=-\mathrm{Tr}\,\rho (t)\log \rho (t)\). Differentiating with respect to t, observing that \(\rho (t)\) commutes with \(1/(s+\rho (t))\), and then integrating over s, one arrives at (3.35). In either approach, one uses that \(\mathrm{Tr}\,{{\dot{\rho }}}=0\) since \(\mathrm{Tr}\,\rho (t)=1\).
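The diagonal computation mentioned in this note can be written out in a few lines. This is a sketch under the assumption that (3.35), which is not reproduced here, is the statement \(\frac{d}{dt}S(\rho (t))=-\mathrm{Tr}\,{{\dot{\rho }}}\log \rho \); the steps use only \(\rho (t)={\mathrm {diag}}(\lambda _1(t),\ldots ,\lambda _n(t))\) and \(\sum _i{\dot{\lambda }}_i=0\).

```latex
\begin{align}
S(\rho(t)) &= -\sum_i \lambda_i(t)\,\log\lambda_i(t), \\
\frac{d}{dt}S(\rho(t))
  &= -\sum_i \dot\lambda_i\,\log\lambda_i \;-\; \sum_i \dot\lambda_i \\
  &= -\sum_i \dot\lambda_i\,\log\lambda_i
   \qquad\text{since } \sum_i\dot\lambda_i = \mathrm{Tr}\,\dot\rho = 0, \\
  &= -\mathrm{Tr}\,\dot\rho\,\log\rho .
\end{align}
```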
The following paragraph may be omitted on first reading. It is included to make possible a more general statement in Sect. 3.7.
In the most general case, a quantum channel is a “completely positive trace-preserving” (CPTP) map from density matrices on one Hilbert space \({{\mathcal {H}}}\) to density matrices on another Hilbert space \({{\mathcal {H}}}'\).
See Eq. (6.16) of [17]. One approach to this upper bound is as follows. In general, the highest weight of an irreducible representation of the group SU(k) is a linear combination of certain fundamental weights with nonnegative integer coefficients \(a_i\), \(i=1,\ldots ,k-1\). In the case of a representation associated to a Young diagram with N boxes, the \(a_i\) are bounded by N. The dimension of an irreducible representation with highest weights \((a_1,a_2,\ldots ,a_{k-1})\) is a polynomial in the \(a_i\) of total degree \(k(k-1)/2\), so if all \(a_i\) are bounded by N, the dimension is bounded by a constant times \(N^{k(k-1)/2}\). One way to prove that the dimension is a polynomial in the \(a_i\) of the stated degree is to use the Borel-Weil-Bott theorem. According to this theorem, a representation with highest weights \((a_1,a_2,\ldots ,a_{k-1})\) can be realized as \(H^0(F,\otimes _{i=1}^{k-1} {{\mathcal {L}}}_i^{a_i})\), where \(F=SU(k)/U(1)^{k-1}\) is the flag manifold of the group SU(k) and \({{\mathcal {L}}}_i\rightarrow F\) are certain holomorphic line bundles. Because F has complex dimension \(k(k-1)/2\), the Riemann-Roch theorem says that the dimension of \(H^0(F,\otimes _{i=1}^{k-1} {{\mathcal {L}}}_i^{a_i})\) is a polynomial in the \(a_i\) of that degree.
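The polynomial bound in this note can also be seen from the Weyl dimension formula, which is a different route from the Borel-Weil-Bott argument above and is used here only as an illustration. For SU(k) with highest-weight coefficients \((a_1,\ldots ,a_{k-1})\), the dimension is \(\prod _{1\le i<j\le k}\frac{\sum _{m=i}^{j-1}(a_m+1)}{j-i}\), a product of \(k(k-1)/2\) factors each linear in the \(a_i\); if every \(a_i\le N\), each factor is at most \(N+1\). The function name below is a hypothetical choice.

```python
from fractions import Fraction
from itertools import combinations

def su_k_dimension(dynkin_labels):
    """Dimension of the SU(k) irrep with highest weight (a_1, ..., a_{k-1}),
    computed from the Weyl dimension formula with exact rational arithmetic."""
    k = len(dynkin_labels) + 1
    dim = Fraction(1)
    # One factor per positive root, i.e. per pair i < j (0-based indices here).
    for i, j in combinations(range(k), 2):
        numerator = sum(a + 1 for a in dynkin_labels[i:j])
        dim *= Fraction(numerator, j - i)
    assert dim.denominator == 1  # the product is always an integer
    return int(dim)

print(su_k_dimension([3]))        # spin-3/2 representation of SU(2)
print(su_k_dimension([1, 1]))     # adjoint of SU(3)
print(su_k_dimension([0, 1, 0]))  # antisymmetric tensor of SU(4)

# Check the bound dim <= (N + 1)^(k(k-1)/2) for k = 3 and all a_i <= N.
N = 6
assert all(su_k_dimension([a, b]) <= (N + 1) ** 3
           for a in range(N + 1) for b in range(N + 1))
```

For k = 3 the bound is saturated at \(a_1=a_2=N\), where the dimension is exactly \((N+1)^3\).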
The right hand side is actually positive because of the inequality (3.42).

References

M.A. Nielsen, I.L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2000)
T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd edn. (Wiley, New York, 2006)
M.M. Wilde, Quantum Information Theory, 2nd edn. (Cambridge University Press, Cambridge, 2017)
J. Preskill, Lecture notes (2019). http://www.theory.caltech.edu/~preskill/ph219/index.html#lecture
C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
M.F. Leifer, R.W. Spekkens, Towards a formulation of quantum theory as a causally neutral theory of Bayesian inference. Phys. Rev. A 88, 052130 (2013). arXiv:1107.5849
A.S. Holevo, Bounds for the quantity of information transmitted by a quantum communication channel. Probl. Inf. Transm. 9, 177–83 (1973)
H. Araki, E.H. Lieb, Entropy inequalities. Commun. Math. Phys. 18, 160–70 (1970)
H. Umegaki, Conditional expectation in an operator algebra. Kodai Math. Sem. Rep. 14, 59–85 (1962)
E.H. Lieb, M.B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy. J. Math. Phys. 14, 1938 (1973)
E.H. Lieb, Convex trace functions and the Wigner–Yanase–Dyson conjecture. Adv. Math. 11, 267–88 (1973)
E. Witten, Notes on some entanglement properties of quantum field theory. Rev. Mod. Phys. 90, 045003 (2018). arXiv:1803.04993
C.H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, W.K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels. Phys. Rev. Lett. 70, 1895–9 (1993)
M. Horodecki, J. Oppenheim, A. Winter, Quantum state merging and negative information. Commun. Math. Phys. 269, 107–36 (2007). arXiv:quant-ph/0512247
F. Hiai, D. Petz, The proper formula for relative entropy and its asymptotics in quantum probability. Commun. Math. Phys. 143, 99–114 (1991)
M. Hayashi, Asymptotics of quantum relative entropy from representation theoretical viewpoint. J. Phys. A 34, 3413–20 (2001)
M. Hayashi, A Group Theoretic Approach to Quantum Information (Springer, New York, 2017)
I. Bjelakovic, R. Siegmund-Schultze, Quantum Stein’s Lemma Revisited, Inequalities For Quantum Entropies, and a Concavity Theorem of Lieb (2012). arXiv:quant-ph/0307170

Acknowledgements

Research supported in part by NSF Grant PHY-1606531. I thank N. Arkani-Hamed, J. Cotler, B. Czech, M. Headrick, and R. Witten for discussions. I also thank M. Hayashi, as well as the referees, for some explanations and helpful criticisms and for a careful reading of the manuscript.

Author information

Authors and Affiliations

School of Natural Sciences, Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA
Edward Witten

Corresponding author

Correspondence to Edward Witten.

About this article

Cite this article

Witten, E. A mini-introduction to information theory. Riv. Nuovo Cim. 43, 187–227 (2020). https://doi.org/10.1007/s40766-020-00004-5

Received: 01 February 2020
Accepted: 05 February 2020
Published: 23 March 2020
Issue Date: April 2020
DOI: https://doi.org/10.1007/s40766-020-00004-5
