The Bayesian maximum entropy method for lognormal variables

Orton, T. G.; Lark, R. M.

doi:10.1007/s00477-008-0217-7

Original Paper
Published: 06 February 2008

Volume 23, pages 319–328, (2009)

Stochastic Environmental Research and Risk Assessment

T. G. Orton¹ &
R. M. Lark¹

Abstract

The Bayesian maximum entropy (BME) method can be used to predict the value of a spatial random field at an unsampled location given precise (hard) and imprecise (soft) data. It has mainly been used when the data are non-skewed. When the data are skewed, the method has been applied after transforming the data (usually with the logarithmic transform) to remove the skew: the BME method is applied to the transformed variable, and the resulting posterior distribution is transformed back to give a prediction of the primary variable. In this paper, we show that an implementation of the BME method that avoids the transform, by including the logarithmic statistical moments in the general knowledge base, gives more appropriate results, as expected from the maximum entropy principle. We use a simple illustration to show that this approach gives more intuitive results, and use simulations to compare the approaches in terms of their prediction errors. The simulations show that the BME method with the logarithmic moments in the general knowledge base reduces the errors, and we conclude that this approach is more suitable for incorporating soft data in a spatial analysis of lognormal data.
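To make the back-transformation step concrete, the following minimal sketch assumes, purely for illustration, that the posterior of the log-transformed variable Y = ln Z at the prediction location is Gaussian with mean `mu` and variance `sigma2`. With soft data the BME posterior need not be Gaussian, so this is not the authors' implementation. The function name `lognormal_summaries` is hypothetical. The sketch shows why the choice of predictor matters after back-transformation: the mode, median and mean of the resulting lognormal distribution all differ.

```python
import math


def lognormal_summaries(mu, sigma2):
    """Summaries of Z = exp(Y) when Y ~ N(mu, sigma2).

    Illustrative assumption: the log-scale posterior is exactly Gaussian.
    """
    return {
        "mode": math.exp(mu - sigma2),          # most probable value of Z
        "median": math.exp(mu),                 # back-transformed log-scale mean
        "mean": math.exp(mu + 0.5 * sigma2),    # expectation of Z
    }


summaries = lognormal_summaries(mu=0.0, sigma2=1.0)
```

For any positive log-scale variance the three summaries are ordered mode < median < mean, so simply exponentiating the log-scale prediction gives the median of Z, not its mean.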

Acknowledgments

This research was funded by the Biotechnology and Biological Sciences Research Council of the United Kingdom through its Core Strategic Grant to Rothamsted Research. We are grateful for the comments of the reviewers, through which the paper has been greatly improved.

Author information

Authors and Affiliations

Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK
T. G. Orton & R. M. Lark

Corresponding author

Correspondence to T. G. Orton.

Appendix A

In this appendix, we show that the multivariate lognormal distribution, Eq. (7), is the maximum entropy pdf for a vector of variables, $\mathbf{Z} = (Z_1, Z_2, \ldots, Z_N)^{\mathrm{T}}$, given the general knowledge base stated in Eq. (6).

For a transformation $\mathbf{Y} = \varphi(\mathbf{Z})$, the pdf of $\mathbf{Z}$, $f(\mathbf{z})$, and the pdf of $\mathbf{Y}$, $g(\mathbf{y})$, are linked by:

$$ f(\mathbf{z}) = \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| g(\mathbf{y}). $$

(A1)
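As a quick sanity check of Eq. (A1), the sketch below takes the univariate case with the logarithmic transform y = ln z, so that |dy/dz| = 1/z, and a Gaussian g(y). The parameter values, the integration limits and all function names are illustrative assumptions, not taken from the paper. It confirms numerically that the density f(z) built from Eq. (A1) integrates to one.

```python
import math


def normal_pdf(y, mu, sigma):
    """Gaussian pdf g(y) with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def pdf_z_from_pdf_y(z, mu, sigma):
    """Eq. (A1) for y = ln z: f(z) = |dy/dz| g(y) = (1 / z) g(ln z)."""
    return normal_pdf(math.log(z), mu, sigma) / z


def integrate_midpoint(func, lower, upper, n):
    """Composite midpoint rule with n equal sub-intervals."""
    h = (upper - lower) / n
    return h * sum(func(lower + (k + 0.5) * h) for k in range(n))


# Illustrative values: mu = 0, sigma = 0.5; the mass above z = 20 is negligible.
total_mass = integrate_midpoint(lambda z: pdf_z_from_pdf_y(z, 0.0, 0.5), 0.0, 20.0, 200_000)
```

The midpoint rule never evaluates the integrand at z = 0, which avoids taking the logarithm of zero at the lower limit.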

So the entropy for $f(\mathbf{z})$ is given by:

$$ \begin{aligned}
H_{\mathbf{Z}}[f(\mathbf{z})] & = -\int f(\mathbf{z}) \ln f(\mathbf{z}) \,\mathrm{d}\mathbf{z} \\
& = -\int \left\{ \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| g(\mathbf{y}) \ln \left[ \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| g(\mathbf{y}) \right] \Big/ \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| \right\} \mathrm{d}\mathbf{y} \\
& = -\int g(\mathbf{y}) \ln g(\mathbf{y}) \,\mathrm{d}\mathbf{y} - \int g(\mathbf{y}) \ln \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| \mathrm{d}\mathbf{y} \\
& = H_{\mathbf{Y}}[g(\mathbf{y})] - \mathrm{E}\left[ \ln \left| \frac{\mathrm{d}\mathbf{y}}{\mathrm{d}\mathbf{z}} \right| \right] \\
& = H_{\mathbf{Y}}[g(\mathbf{y})] - \mathrm{E}[\ln J],
\end{aligned} $$

(A2)

where $J$ is the Jacobian of the transformation. When $\varphi(\mathbf{Z})$ is the logarithmic transform we get $J = \prod_{i=1}^{N} 1/z_i$, so that $\ln J = -\sum_{i=1}^{N} \ln z_i = -\sum_{i=1}^{N} y_i$. The expectation on the right-hand side of Eq. (A2) is therefore minus the sum of the expectations of the $y_i$, which are known because of our constraints on the logarithmic means, $\int_{-\infty}^{\infty} y_i \, g(\mathbf{y}) \,\mathrm{d}\mathbf{y} = \mu_i$. Therefore we have:

$$ H_{\mathbf{Z}}[f(\mathbf{z})] = H_{\mathbf{Y}}[g(\mathbf{y})] + \sum_{i=1}^{N} \mu_i. $$

(A3)
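Eq. (A3) can be checked numerically in the univariate case (N = 1). The sketch below is an illustration under assumed values (mu = 0.3, sigma = 0.5, an upper integration limit of 40) with hypothetical function names. It integrates -f ln f directly on the z scale for a lognormal density and compares the result with the closed-form Gaussian entropy of Y plus the logarithmic mean.

```python
import math


def lognormal_pdf(z, mu, sigma):
    """Density of Z = exp(Y) with Y ~ N(mu, sigma^2), as obtained from Eq. (A1)."""
    y = math.log(z)
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (z * sigma * math.sqrt(2.0 * math.pi))


def entropy_z_numeric(mu, sigma, upper, n):
    """Midpoint-rule estimate of H_Z = -integral of f ln f over (0, upper)."""
    h = upper / n
    total = 0.0
    for k in range(n):
        f = lognormal_pdf((k + 0.5) * h, mu, sigma)
        if f > 0.0:  # skip points where the density underflows to zero
            total -= f * math.log(f) * h
    return total


def entropy_y_gaussian(sigma):
    """Closed-form differential entropy of a univariate Gaussian."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


mu, sigma = 0.3, 0.5  # illustrative values only
lhs = entropy_z_numeric(mu, sigma, upper=40.0, n=200_000)  # H_Z computed on the z scale
rhs = entropy_y_gaussian(sigma) + mu                       # H_Y + mu, as in Eq. (A3)
```

The two quantities agree to within the quadrature error, which is the univariate statement of Eq. (A3).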

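Thus, because the sum of the $\mu_i$ in Eq. (A3) is fixed by the constraints, if we choose $g(\mathbf{y})$ such that it maximizes $H_{\mathbf{Y}}[g(\mathbf{y})]$ over all distributions for $\mathbf{Y}$ that satisfy the constraints on the mean and covariance, then we know that this distribution gives the maximum value of the entropy, $H_{\mathbf{Z}}[f(\mathbf{z})]$, over all corresponding distributions for $\mathbf{Z}$. Since the Gaussian distribution for $g(\mathbf{y})$ maximizes $H_{\mathbf{Y}}[g(\mathbf{y})]$, the back-transform of this distribution through Eq. (A1) (i.e. the multivariate lognormal distribution) is the maximum entropy pdf for the SRF, $\mathbf{Z}$, given the mean and covariance for $\mathbf{Y}$.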
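The final step of the argument can be illustrated in the univariate case by comparing three candidate distributions for Y that share the same mean and variance: Gaussian, uniform and Laplace. The sketch below uses their standard closed-form differential entropies together with Eq. (A3). The choice of the two alternative families, the parameter values and the function names are illustrative assumptions rather than part of the paper.

```python
import math


def entropy_y(family, sigma):
    """Differential entropy of Y for a given family with standard deviation sigma."""
    if family == "gaussian":
        return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)
    if family == "uniform":
        # A uniform distribution of width w has variance w^2 / 12 and entropy ln w.
        return math.log(sigma * math.sqrt(12.0))
    if family == "laplace":
        # A Laplace distribution of scale b has variance 2 b^2 and entropy 1 + ln(2 b).
        return 1.0 + math.log(2.0 * sigma / math.sqrt(2.0))
    raise ValueError(f"unknown family: {family}")


def entropy_z(family, mu, sigma):
    """Eq. (A3) with N = 1: H_Z = H_Y + mu for Z = exp(Y)."""
    return entropy_y(family, sigma) + mu


mu, sigma = 0.3, 0.5  # illustrative values only
entropies = {name: entropy_z(name, mu, sigma) for name in ("gaussian", "uniform", "laplace")}
best = max(entropies, key=entropies.get)
```

Because the term added in Eq. (A3) is the same for all three candidates, the ranking of H_Z matches the ranking of H_Y, and the Gaussian choice (lognormal on the z scale) comes out highest.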

About this article

Cite this article

Orton, T.G., Lark, R.M. The Bayesian maximum entropy method for lognormal variables. Stoch Environ Res Risk Assess 23, 319–328 (2009). https://doi.org/10.1007/s00477-008-0217-7

Issue Date: March 2009