Exact Maximum-Entropy Estimation with Feynman Diagrams

Abstract

A longstanding open problem in statistics is finding an explicit expression for the probability measure which maximizes entropy with respect to given constraints. In this paper a solution to this problem is found using perturbative Feynman calculus: the explicit expression is given as a sum over weighted trees.

Notes

  1. See, for example, the proof in Joel Feldman’s lecture notes, http://www.math.ubc.ca/~feldman/m425/impFnThm.pdf. To apply the argument we need \((\partial f)(x,y) \in \textit{Hom}(V,V^*)\) to be invertible for all \((x,y) \in U \times W\). Since \(\partial f = B({\text {id}} - \partial g)\), this follows from our assumption that \((\partial g)(x,y) \in \textit{Hom}(V,V)\) is contracting.
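
To make the contraction hypothesis concrete, here is a minimal numerical sketch (ours, not from the paper): when \(y \mapsto g(x,y)\) is contracting, the implicit equation \(y = g(x,y)\) has a unique solution \(y(x)\), obtained by Banach fixed-point iteration; this is the mechanism behind the quoted implicit function theorem argument. The map `g` below is a hypothetical toy example.

```python
import numpy as np

# Toy contraction: y -> g(x, y) has Lipschitz constant <= 0.5 < 1 in y,
# so y = g(x, y) has a unique fixed point y(x) for each x (Banach).
def g(x, y):
    return 0.5 * np.tanh(y) + x

def implicit_solve(x, y0=0.0, tol=1e-12, max_iter=200):
    """Solve y = g(x, y) by fixed-point iteration."""
    y = y0
    for _ in range(max_iter):
        y_next = g(x, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("fixed-point iteration did not converge")

for x in [0.0, 0.3, 1.0]:
    y = implicit_solve(x)
    print(f"x={x}: y(x)={y:.9f}, residual={abs(y - g(x, y)):.2e}")
```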

Acknowledgements

We thank O. Bozo, B. Gomberg, R.S. Melzer, A. Moscovitch-Eiger, R. Schweiger, A. Solomon and D. Zernik for discussions related to the work presented here. R.T. was partially supported by Dr. Max Rössler, the Walter Haefner Foundation and the ETH Zurich Foundation.

Author information

Corresponding author

Correspondence to Ran J. Tessler.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Appendix A: Proof of Lemma 3

Recall that by Lemma 24

$$\begin{aligned} \log \left( \sum _{\sigma }q_{\sigma }\exp \left( -\sum _{i=1}^{k}\lambda _{i}r_{i}(\sigma )\right) \right) =1+\lambda _{0}=1+\lambda _{0}(\lambda _{1},\ldots ,\lambda _{k}). \end{aligned}$$
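
For concreteness, here is a toy numerical sketch (ours, not part of the paper) of this relation: \(\lambda _{0}(\lambda _{1},\ldots ,\lambda _{k})=\log \left( \sum _{\sigma }q_{\sigma }\exp \left( -\sum _{l}\lambda _{l}r_{l}(\sigma )\right) \right) -1\), and with this choice the tilted measure is properly normalized. The finite state space, reference measure `q` and constraint functions `r` below are hypothetical toy data.

```python
import numpy as np

q = np.array([0.2, 0.5, 0.3])      # reference measure q_sigma (sums to 1)
r = np.array([[1.0, -1.0, 0.5],    # r_1(sigma)
              [0.0,  2.0, -1.0]])  # r_2(sigma)

def lambda0(lam):
    # 1 + lambda_0 = log sum_sigma q_sigma exp(-sum_l lambda_l r_l(sigma))
    return np.log(np.sum(q * np.exp(-(np.asarray(lam) @ r)))) - 1.0

# The measure q_sigma * exp(-1 - lambda_0 - sum_l lambda_l r_l(sigma)) sums to 1:
lam = np.array([0.1, -0.2])
p = q * np.exp(-1.0 - lambda0(lam) - lam @ r)
print(p.sum())  # 1.0 up to rounding
```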

Hence \(\lambda _{0}\) is an analytic function of \(\lambda _{1},\ldots ,\lambda _{k}\) around \(\lambda _{1}=\dots =\lambda _{k}=0\). Now,

$$\begin{aligned}&\rho _{i}(\lambda _{1},\ldots ,\lambda _{k})=\sum _{\sigma }r_{i}(\sigma )q_{\sigma }\exp \left( -1-\lambda _{0}(\lambda _{1},\ldots ,\lambda _{k})-\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \\&\quad =\frac{\sum _{\sigma }r_{i}(\sigma )q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) }{\exp (1+\lambda _{0}(\lambda _{1},\ldots ,\lambda _{k}))}=\frac{\sum _{\sigma }r_{i}(\sigma )q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) }{\sum _{\sigma }q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) } \end{aligned}$$

so that \(\rho _{i}(\lambda _{1},\ldots ,\lambda _{k})\) is an analytic function of \(\lambda _{1},\ldots ,\lambda _{k}\).
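
Continuing the toy sketch above (same hypothetical `q` and `r`, treated as one running Python session), \(\rho _{i}(\lambda )\) is exactly the ratio of tilted sums in the last display:

```python
def rho(lam):
    lam = np.asarray(lam)
    weights = q * np.exp(-(lam @ r))      # q_sigma * exp(-sum_l lambda_l r_l(sigma))
    return (r @ weights) / weights.sum()  # numerator / denominator of the display

print(rho([0.0, 0.0]))   # at lambda = 0 this is just E_Q[r_i]
print(rho([0.1, -0.2]))
```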

The proof that \(\lambda _{i}=\lambda _{i}(\rho _{1},\ldots ,\rho _{k})\) is an analytic function of \(\rho _{1},\ldots ,\rho _{k}\) around

$$\begin{aligned} \lambda _{1}=\dots =\lambda _{k}=\rho _{1}=\dots =\rho _{k}=0, \end{aligned}$$

uses the analytic inverse function theorem. It is enough to show that the Jacobian \(\frac{\partial (\rho _{1},\ldots ,\rho _{k})}{\partial (\lambda _{1},\ldots ,\lambda _{k})}\) is invertible for \(\lambda _{1}=\dots =\lambda _{k}=0\).

Indeed, differentiating the expression for \(\rho _{i}\) above with respect to \(\lambda _{j}\) (each derivative of \(\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \) brings down a factor \(-r_{j}(\sigma )\)) gives

$$\begin{aligned} \frac{\partial \rho _{i}}{\partial \lambda _{j}}&= -\frac{\left( \sum _{\sigma }r_{i}(\sigma )r_{j}(\sigma )q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) \left( \sum _{\sigma }q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) }{\left( \sum _{\sigma }q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) ^{2}}\nonumber \\&\quad +\frac{\left( \sum _{\sigma }r_{i}(\sigma )q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) \left( \sum _{\sigma }r_{j}(\sigma )q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) }{\left( \sum _{\sigma }q_{\sigma }\exp \left( -\sum _{l=1}^{k}\lambda _{l}r_{l}(\sigma )\right) \right) ^{2}}. \end{aligned}$$
(35)

Evaluation at \(\lambda _{1}=\dots =\lambda _{k}=0\) gives

$$\begin{aligned} \left. \frac{\partial \rho _{i}}{\partial \lambda _{j}}\right| _{\lambda _{1}=\cdots =\lambda _{k}=0}&= -\frac{\left( \sum _{\sigma }r_{i}(\sigma )r_{j}(\sigma )q_{\sigma }\right) \left( \sum _{\sigma }q_{\sigma }\right) -\left( \sum _{\sigma }r_{i}(\sigma )q_{\sigma }\right) \left( \sum _{\sigma }r_{j}(\sigma )q_{\sigma }\right) }{\left( \sum _{\sigma }q_{\sigma }\right) ^{2}} \\&= -\frac{\mathbb {E}_{Q}(r_{i}r_{j})\cdot 1-\mathbb {E}_{Q}(r_{i})\mathbb {E}_{Q}(r_{j})}{1^{2}}=-\mathrm {Cov}_{Q}(r_{i},r_{j}). \end{aligned}$$

By assumption the KL constraint problem is normalized, hence

$$\begin{aligned} \left. \frac{\partial \rho _{i}}{\partial \lambda _{j}}\right| _{\lambda _{1}=\cdots =\lambda _{k}=0}=-\mathrm {Cov}_{Q}(r_{i},r_{j})=-\delta _{i,j}. \end{aligned}$$

The Jacobian \(\frac{\partial (\rho _{1},\ldots ,\rho _{k})}{\partial (\lambda _{1},\ldots ,\lambda _{k})}=-I_{k}\) is thus invertible. \(\square \)
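
As a numerical sanity check (our addition, continuing the same toy sketch), a central finite difference around \(\lambda =0\) reproduces \(-\mathrm {Cov}_{Q}(r_{i},r_{j})\); in the paper’s normalized setting this matrix is \(-I_{k}\), which is invertible either way:

```python
eps, k = 1e-6, r.shape[0]
jac = np.empty((k, k))
for j in range(k):
    e = np.zeros(k); e[j] = eps
    jac[:, j] = (rho(e) - rho(-e)) / (2 * eps)  # central difference in lambda_j

mean = r @ q                                # E_Q[r_i]  (q sums to 1)
cov = (r * q) @ r.T - np.outer(mean, mean)  # Cov_Q(r_i, r_j)
print(np.allclose(jac, -cov, atol=1e-6))    # True: the Jacobian at 0 is -Cov_Q
```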

Cite this article

Netser Zernik, A., Schlank, T.M. & Tessler, R.J. Exact Maximum-Entropy Estimation with Feynman Diagrams. J Stat Phys 170, 731–747 (2018). https://doi.org/10.1007/s10955-018-1960-x
