Bayesian maximum entropy and data fusion for processing qualitative data: theory and application for crowdsourced cropland occurrences in Ethiopia

  • Original Paper
  • Stochastic Environmental Research and Risk Assessment

Abstract

Categorical data play an important role in a wide variety of spatial applications, yet modeling and predicting this type of variable has proved complex in many cases. Among other possible approaches, the Bayesian maximum entropy methodology has been developed and advocated for this goal and has been successfully applied to various spatial prediction problems. This approach builds a multivariate probability table from bivariate probability functions used as constraints that need to be fulfilled, in order to compute a posterior conditional distribution that accounts for hard or soft information sources. In this paper, our goal is to further generalize the theoretical results in order to account for a much wider range of information sources, such as probability inequalities. We first show how the maximum entropy principle can be implemented efficiently using a linear iterative approximation based on a minimum norm criterion, where the minimum norm solution is obtained at each step from simple matrix operations and the iterations converge to the requested maximum entropy solution. Based on this result, we then show how the maximum entropy problem can be related to the more general minimum divergence problem, which might involve equality and inequality constraints and which can be solved based on iterated minimum norm solutions. This allows us to account for a much larger panel of information types, where more qualitative information, such as probability inequalities, can be used. When combined with a Bayesian data fusion approach, this approach deals with the case of potentially conflicting information. Although the theoretical results presented in this paper can be applied to any study (spatial or non-spatial) involving categorical data in general, the results are illustrated in a spatial context where the goal is to best predict the occurrence of cultivated land in Ethiopia based on crowdsourced information. The results emphasize the benefit of the methodology, which integrates conflicting information and provides a spatially exhaustive map of these occurrence classes over the whole country.

References

  • Abramov R (2007) A practical computational framework for the multidimensional moment-constrained maximum entropy principle. J Comput Phys 211:198–209

  • Abramov R (2010) The multidimensional maximum entropy moment problem: a review on numerical methods. Commun Math Sci 8(2):377–392

  • Agresti A (2013) Categorical data analysis, 3rd edn. Wiley, Hoboken

  • Ali AL, Schmid F, Al-Salman R, Kauppinen T (2014) Ambiguity and plausibility: managing classification quality in volunteered geographic information. In: Proceedings of the 22nd ACM SIGSPATIAL international conference on advances in geographic information systems, pp 143–152

  • Allard D, D’Or D, Froidevaux R (2011) An efficient maximum entropy approach for categorical variable prediction. Eur J Soil Sci 62(3):381–393

  • Andersen EB (1980) Discrete statistical models with social science applications. North Holland, Amsterdam

  • Bandyopadhyay K, Bhattacharya A, Biswas P, Drabold D (2005) Maximum entropy and the problem of moments: a stable algorithm. Phys Rev E 71(5):057701

  • Bayat B, Nasseri M, Zahraie B (2015) Identification of long-term annual pattern of meteorological drought based on spatiotemporal methods: evaluation of different geostatistical approaches. Nat Hazards 76:515–541

  • Bierkens MFP, Burrough PA (1993) The indicator approach to categorical soil data, I. Theory. Eur J Soil Sci 44(2):361–368

  • Bishop YMM, Fienberg SE, Holland PW (2007) Discrete multivariate analysis: theory and practice. Springer, Berlin

  • BMELib: a MATLAB numerical toolbox of modern spatiotemporal geostatistics implementing the Bayesian maximum entropy theory. http://www.unc.edu/depts/case/BMElab/

  • Bogaert P (2002) Spatial prediction of categorical variables: the Bayesian maximum entropy approach. Stoch Environ Res Risk Assess 16(6):425–448

  • Bogaert P, Gengler S (2014) MinNorm approximation of MaxEnt/MinDiv problems for probability tables. In MaxEnt 2014—Bayesian inference and maximum entropy methods in science and engineering, Amboise, France, 21–26 September 2014, pp 287–296

  • Brus DJ, Bogaert P, Heuvelink GBM (2008) Bayesian maximum entropy prediction of soil categories using a traditional soil map as soft information. Eur J Soil Sci 59(2):166–177

  • Canosa N, Miller HG, Plastino A, Rossignoli R (1995) Maximum entropy-minimum norm method for the determination of level densities. Physica A 220:611–617

  • Cao C, Kyriakidis PC, Goodchild MF (2011) A multinomial logistic mixed model for the prediction of categorical spatial data. Int J Geogr Inf Sci 25(12):2017–2086

  • Cao G, Yoo EH, Wang S (2014) A statistical framework of data fusion for spatial prediction of categorical variables. Stoch Environ Res Risk Assess 28:1785–1799

  • Cardille JA, Clayton MK (2007) A regression tree-based method for integrating land-cover and land-use data collected at multiple scales. Environ Ecol Stat 14:161–179

  • Christakos G (2000) Modern spatiotemporal geostatistics. Oxford University Press, Oxford

  • Christakos G, Bogaert P, Serre M (2002) Temporal geographical information systems: advanced functions for field-based applications. Springer, Berlin

  • Christensen R (1997) Log-linear models and logistic regression, 2nd edn. Springer, Berlin

  • Comber A, See L, Fritz S, Van der Velde M, Perger C, Foody G (2013) Using control data to determine the reliability of volunteered geographic information about land cover. Int J Appl Earth Obs Geoinf 23:37–48

  • Comber A, Mooney P, Purves R, Rocchini D, Walz A (2015) Comparing national differences in what the people perceive to be there: mapping variations in crowd sourced land cover. Int Arch Photogramm Remote Sens Spat Inf Sci: ISPRS 1:71–75

  • Comber A, Fonte C, Foody G, Fritz S, Harris P, Olteanu-Raimond AM, See L (2016) Geographically weighted evidence combination approaches for combining discordant and inconsistent volunteered geographical information. Geoinformatica 20:503–527

  • Cressie N (2015) Statistics for spatial data, 2nd edn. Wiley-Interscience, Hoboken

  • Cressie N, Wikle CK (2011) Statistics for spatial-temporal Data. Wiley, Hoboken

  • D’Or D, Bogaert P (2004) Spatial prediction of categorical variables with the Bayesian maximum entropy approach: the Ooypolder case study. Eur J Soil Sci 55(4):763–775

  • Fienberg SE (1970) An iterative procedure for estimation in contingency tables. Ann Math Stat 41(3):907–917

  • Fienberg SE, Rinaldo A (2012) Maximum likelihood estimation in log-linear models. Ann Stat 40(2):996–1023

  • Foody GM, See L, Fritz S, Van der Velde M, Perger C, Schill C, Boyd DS, Comber A (2015) Accurate attribute mapping from volunteered geographic information: issues of volunteer quantity and quality. Cartogr J 52:336–344

  • Fritz S, MacCallum I, Schill C, Perger C, Grillmayer R, Achard F, Kraxner F, Obersteiner M (2009) Geo-Wiki.Org: the use of crowdsourcing to improve global land cover. Remote Sens 1:345–354

  • Fritz S, See LM, Rembold F (2010) Comparison of global and regional land cover maps with statistical information for the agricultural domain in Africa. Int J Remote Sens 25:1527–1532

  • Fritz S, You L, Bun A, See L, McCallum I, Schill C, Perger C, Liu J, Hansen M, Obersteiner M (2011) Cropland for sub-Saharan Africa: a synergistic approach using five land cover data sets. Geophys Res Lett 38. doi:10.1029/2010GL046213

  • Gengler S, Bogaert P (2015) Bayesian data fusion applied to soil drainage classes spatial mapping. Math Geosci 48:79–88

  • Gengler S, Bogaert P (2016) Integrating crowdsourced data with a land cover product: a Bayesian data fusion approach. Remote Sens 8:545

  • Goodchild MF, Li L (2012) Assuring the quality of volunteered geographic information. Spat Stat 1:110–120

  • Goovaerts P (1997) Geostatistics for natural resources evaluation (applied geostatistics). Oxford University Press, Oxford

  • Huang X, Li J, Liang Y, Wang Z, Guo J, Jiao P (2017) Spatial hidden Markov chain models for estimation of petroleum reservoir categorical variable. J Pet Explor Prod Technol 7(1):11–22

  • Hunter J, Alabri A, Ingen CV (2013) Assessing the quality and trustworthiness of citizen science data. Concurr Comput Pract Exp 25:454–466

  • Hurtt GC, Rosentrater L, Frolking S, Moore B (2001) Linking remote-sensing estimates of land cover and census statistics on land use to produce maps of land use of the conterminous United States. Glob Biogeochem Cycles 15:673–685

  • Jafari A, Khademi H, Finke PA, Van de Wauw J, Ayoubi S (2014) Spatial prediction of soil great groups by boosted regression trees using a limited point dataset in an arid region, southeastern Iran. Geoderma 232–234:148–163

  • Jaynes ET (2003) Probability theory: the logic of science. Cambridge University Press, Cambridge

  • Jin C, Zhu J, Steen-Adams MM, Sain SR, Gangnon RE (2013) Spatial multinomial regression models for nominal categorical data: a study of land cover in Northern Wisconsin, USA. Environmetrics 24(2):98–108

  • Johnson BA, Iizuka K (2016) Integrating OpenStreetMap crowdsourced data and landsat time-series imagery for rapid land use/land cover (LULC) mapping: case study of the laguna Bay area of the Philippines. Appl Geogr 67:140–149

  • Kapur JN (2009) Maximum entropy models in science and engineering. New Age, New Delhi

  • Kou X, Jiang L, Bo Y, Yan S, Chai L (2016) Estimation of land surface temperature through blending MODIS and AMSR-E data with Bayesian maximum entropy. Remote Sens 8:105

  • Messier KP, Campbell T, Bradley PJ, Serre M (2015) Estimation of groundwater Radon in North Carolina using land use regression and Bayesian maximum entropy. Environ Sci Technol 49:9817–9825

  • Muller C, Chapman L, Johnston S, Kidd C, Illingworth S, Foody G, Overeem A, Leigh R (2015) Crowdsourcing for climate and atmospheric sciences: current status and future potential. Int J Climatol 35:3185–3203

  • Pérez-Hoyos A, García-Haro F, San-Miguel-Ayanz J (2012) A methodology to generate a synergetic land-cover map by fusion of different land-cover products. Int J Appl Earth Obs Geoinf 19:72–87

  • Poser K, Dransch D (2010) Volunteered geographic information for disaster management with application to rapid flood damage estimation. Geomatica 64:89–98

  • See L, McCallum I, Fritz S, Perger C, Kraxner F, Obersteiner M, Baruah UD, Mili N, Kalita NR (2013) Mapping cropland in Ethiopia using crowdsourcing. Int J Geosci 4:6–13

  • See L, Fritz S, You L, Ramankutty N, Herrero M, Justice C, Becker-Reshef I, Thornton P, Erb K, Gong P, Tang H, van der Velde M, Ericksen P, McCallum I, Kraxner F, Obersteiner M (2015) Improved global cropland data as an essential ingredient for food security. Glob Food Secur 4:37–45

  • See L, Mooney P, Foody G, Bastin L, Comber A, Estima J, Fritz S, Kerle N, Jiang B, Laakso M, Liu HY, Milčinski G, Nikšic M, Painho M, Pödör A, Olteanu-Raimond AM, Rutzinger M (2016) Crowdsourcing, citizen science or volunteered geographic information? The current state of crowdsourced geographic information. Int J Geo Inf 5:55

  • Thenkabail PS (ed) (2015) Remotely sensed data characterization, classification, and accuracies (remote sensing handbook). CRC Press, Boca Raton

  • Wahyudi A, Bartzke M, Kuster E, Bogaert P (2013) Maximum entropy estimation of a Benzene contaminated plume using ecotoxicological assays. Environ Pollut 172:170–179

  • Waller LA (2005) Spatial models for categorical data. In: John Wiley and sons (ed) Encyclopedia of biostatistics. Wiley, Hoboken

  • Werner H, Hanke M, Neubauer A (2000) Regularization of inverse problems. Kluwer, Berlin

  • Whittaker J, McLennan B, Handmer J (2015) A review of informal volunteerism in emergencies and disasters: definition, opportunities and challenges. Int J Disaster Risk Reduct 13:358–368

  • Wrigley N (2002) Categorical data analysis for geographers and environmental scientists. Blackburn Press, Caldwell

  • Wu X (2003) Calculation of maximum entropy densities with application to income distribution. J Econom 115(2):347–354

  • Xu Y, Serre M, Reyes J, Vizuete W (2016) Bayesian maximum entropy integration of ozone observation and model prediction: a national application. Environ Sci Technol 50:4393–4400

  • Zook M, Graham M, Shelton T, Gorman S (2010) Volunteered geographic information and crowdsourcing disaster relief: a case study of the Haitian Earthquake. World Med Health Policy 2:6–32

Acknowledgements

We are indebted to two anonymous reviewers for their detailed and numerous comments that greatly helped improve the manuscript.

Author information

Corresponding author

Correspondence to Patrick Bogaert.

Appendices

Appendix A1: Proof for the concavity of the entropy

Let us consider the entropy \(H({\mathbf {p}})\) defined as in Eq. (2). It is easy to prove that \(H({\mathbf {p}})\) is concave everywhere with respect to \({\mathbf {p}}\), and general proofs can be found in the literature (see e.g. Jaynes 2003; Kapur 2009). We reproduce the proof here for the sake of completeness, in order to discuss the corresponding Hessian matrix in our specific context.

Let us rewrite \(H({\mathbf {p}})\) as a function of the vector of the first \(n-1\) probabilities \({\mathbf {p}}_0=(p_1,\ldots ,p_{n-1})'\), where the last one is recovered from the condition \(\mathbf {1'p}_0+p_n=1\), so that

$$ H({\mathbf {p}}_0)=-{\mathbf {p}}'_0\ln {\mathbf {p}}_0-(1-\mathbf {1'p}_0)\ln (1-\mathbf {1'p}_0). $$

Taking the derivatives with respect to \({\mathbf {p}}_0\) yields

$$ \frac{\partial H({\mathbf {p}}_0)}{\partial {\mathbf {p}}_0}=-\ln {\mathbf {p}}_0+{\mathbf {1}}\ln (1-\mathbf {1'p}_0), $$

with derivatives equal to \(\mathbf {0}\) when \({\mathbf {p}}_0={\mathbf {1}}(1-\mathbf {1'p}_0)\); i.e., when

$$ {\mathbf {p}}_0=(\mathbf {I+11'})^{-1}{\mathbf {1}}. $$

Using the fact that \((\mathbf {I+11'})^{-1}={\mathbf {I}}-(1/n)\mathbf {11'}\), it thus comes that

$$ {\mathbf {p}}_0={\mathbf {1}}-\frac{1}{n}\mathbf {11'1}={\mathbf {1}}-\frac{n-1}{n}{\mathbf {1}}=\frac{1}{n}{\mathbf {1}}, $$
(10)

where \({\mathbf {1}}\) is the vector of ones of length \(n-1\), so that \(\mathbf {1'1}=n-1\). Differentiating once more with respect to \({\mathbf {p}}'_0\) yields

$$ \frac{\partial ^2H({\mathbf {p}}_0)}{\partial {\mathbf {p}}_0\partial {\mathbf {p}}'_0}=-\left[ (diag({\mathbf {p}}_0))^{-1}+\frac{1}{1-\mathbf {1'p}_0}\mathbf {11'}\right] , $$

where \(1-\mathbf {1'p}_0>0\) and where \(\mathbf {11'}\) and \((diag({\mathbf {p}}_0))^{-1}\) are positive semidefinite and positive definite matrices, respectively. The matrix between brackets is thus positive definite, so the Hessian matrix is negative definite and the entropy is strictly concave with respect to the probabilities over the interior of the unit simplex. In particular, at the absolute maximum where \({\mathbf {p}}_0=(1/n){\mathbf {1}}\), the Hessian matrix is equal to

$$ \frac{\partial ^2H({\mathbf {p}}_0)}{\partial {\mathbf {p}}_0\partial {\mathbf {p}}'_0}\big \vert _{{\mathbf {p}}_0=\frac{1}{n}{\mathbf {1}}}=-n\big ({\mathbf {I}}+\mathbf {11'}\big ). $$
(11)
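As a numerical sanity check of Eq. (11), the following minimal sketch compares the analytic Hessian above with a central finite-difference approximation at the uniform distribution. The function names and the choice \(n=4\) are illustrative assumptions, not part of the original derivation; the last lines also check concavity on one arbitrary pair of probability vectors.

```python
import math

def entropy(p0):
    """Entropy H(p0), with the last probability recovered as 1 - sum(p0)."""
    pn = 1.0 - sum(p0)
    return -sum(p * math.log(p) for p in p0) - pn * math.log(pn)

def analytic_hessian(p0):
    """Hessian -[diag(p0)^{-1} + 11'/(1 - 1'p0)] as derived in the text."""
    pn = 1.0 - sum(p0)
    k = len(p0)
    return [[-((1.0 / p0[i] if i == j else 0.0) + 1.0 / pn) for j in range(k)]
            for i in range(k)]

def numeric_hessian(f, x, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    k = len(x)
    hess = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess

n = 4  # illustrative number of categories
uniform = [1.0 / n] * (n - 1)
H_analytic = analytic_hessian(uniform)
H_numeric = numeric_hessian(entropy, uniform)
# Right-hand side of Eq. (11): -n (I + 11')
H_eq11 = [[-n * ((1.0 if i == j else 0.0) + 1.0) for j in range(n - 1)]
          for i in range(n - 1)]

# Concavity along one arbitrary segment: H(midpoint) exceeds the average of H.
a, b = [0.1, 0.2, 0.3], [0.4, 0.1, 0.2]
mid = [(u + v) / 2 for u, v in zip(a, b)]
midpoint_gap = entropy(mid) - 0.5 * (entropy(a) + entropy(b))
```

A single segment does not prove concavity, of course; it only illustrates the property that the negative definite Hessian guarantees.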

Appendix A2: Proof for the convexity of the squared norm

It can be proven that the squared norm \(||{\mathbf {p}}||^2={\mathbf {p'p}}\) subject to the constraint \(\mathbf {1'p}=1\) is convex everywhere, and so \(||{\mathbf {p}}||^2\) is also convex over the space restricted by the additional set of linear equations \(\mathbf {Ap=b}\). Thus, provided that these constraints are compatible, a minimum norm solution always exists and is unique. To prove this, let us rewrite \(||{\mathbf {p}}||^2\) as a function of the vector of the first \(n-1\) probabilities \({\mathbf {p}}_0=(p_1,\ldots ,p_{n-1})'\), where the last one is recovered from the condition \(\mathbf {1'p}_0+p_n=1\), so that

$$ ||{\mathbf {p}}_0||^2=\mathbf {p'}_0{\mathbf {p}}_0+(1-\mathbf {1'p}_0)(1-\mathbf {1'p}_0). $$

Taking the derivatives with respect to \({\mathbf {p}}_0\) yields

$$ \frac{\partial ||{\mathbf {p}}_0||^2}{\partial {\mathbf {p}}_0}=2\left[ ({\mathbf {I}}+\mathbf {11'}){\mathbf {p}}_0-{\mathbf {1}}\right] , $$

with derivatives equal to \(\mathbf {0}\) when

$$ {\mathbf {p}}_0=(\mathbf {I+11'})^{-1}{\mathbf {1}}\,\Longleftrightarrow \,{\mathbf {p}}_0=\frac{1}{n}{\mathbf {1}}. $$
(12)

Differentiating once more with respect to \({\mathbf {p}}'_0\) yields

$$ \frac{\partial ^2||{\mathbf {p}}_0||^2}{\partial {\mathbf {p}}_0\partial {\mathbf {p}}'_0}=2(\mathbf {I+11'}), $$

showing that the Hessian matrix does not depend on \({\mathbf {p}}_0\) and is equal, up to a multiplicative constant (namely \(-n/2\)), to the MaxEnt Hessian matrix at \({\mathbf {p}}_0=(1/n){\mathbf {1}}\), as seen from Eq. (11).
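The uniqueness result can be illustrated with a short sketch. It assumes the standard closed form for the minimum norm solution of a full-row-rank linear system \(\mathbf {Cp=d}\), namely \({\mathbf {p}}={\mathbf {C}}'({\mathbf {CC}}')^{-1}{\mathbf {d}}\), where \({\mathbf {C}}\) stacks \({\mathbf {1'}}\) and \({\mathbf {A}}\). The helper names and the extra constraint \(p_1+p_2=0.6\) are hypothetical and only serve the example. The sketch ignores the non-negativity bounds, which happen to be inactive here.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    k = len(M)
    aug = [list(map(float, M[i])) + [float(rhs[i])] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (aug[r][k] - tail) / aug[r][r]
    return x

def min_norm(C, d):
    """Minimum norm p with C p = d, via p = C'(CC')^{-1} d (full row rank)."""
    rows = len(C)
    gram = [[sum(u * v for u, v in zip(C[i], C[j])) for j in range(rows)]
            for i in range(rows)]
    mult = solve(gram, d)  # Lagrange-type multipliers (CC')^{-1} d
    return [sum(C[i][k] * mult[i] for i in range(rows))
            for k in range(len(C[0]))]

n = 4  # illustrative number of categories

# Only the sum-to-one constraint: the minimizer is the uniform table, Eq. (12).
p_sum_only = min_norm([[1.0] * n], [1.0])

# Sum-to-one plus one hypothetical equality constraint p1 + p2 = 0.6.
p_constrained = min_norm([[1, 1, 1, 1], [1, 1, 0, 0]], [1.0, 0.6])
```

With the extra constraint, the mass 0.6 is spread evenly over the first two cells and the remaining 0.4 over the last two, which is the behaviour one would expect from a squared norm criterion.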

Appendix A3: Polytopes and simplices

Let us define a convex polytope \(\overline{P}({\mathbf {V}})\) in \({\mathbb {R}}^n\) as

$$\begin{aligned} \overline{P}({\mathbf {V}})=\{{\mathbf {x}}:{\mathbf {x}}={\mathbf {V}}\varvec{\lambda },{\mathbf {1'}}\varvec{\lambda }=1,\varvec{\lambda }\in {\mathbb {R}}_{\ge 0}^m,{\mathbf {V}}\in {\mathbb {R}}^{n\times m},{\mathbf {x}}\in {\mathbb {R}}^n\} \end{aligned}$$

(with \(m<\infty \)) where \({\mathbf {V}}\) is the matrix specifying the m vertices of the polytope, and the coordinates in \({\mathbb {R}}^n\) of the ith vertex are given by the ith column of \({\mathbf {V}}\). \({\mathbf {V}}\varvec{\lambda }\) is a convex linear combination of these vertices, and thus, \(\overline{P}({\mathbf {V}})\) is the convex hull (i.e., a convex faceted solid – or convex bounded polyhedron – defined over \({\mathbb {R}}^n\)) generated by these linear combinations over \({\mathbb {R}}^n\). Alternatively, let us define

$$ P({\mathbf {A}},{\mathbf {b}})=\{{\mathbf {x}}\in {\mathbb {R}}^n:{\mathbf {A}}{\mathbf {x}}\le {\mathbf {b}}\} $$

as the (possibly unbounded) convex polyhedron generated by the intersection of the set of half-spaces \({\mathbf {A}}{\mathbf {x}}\le {\mathbf {b}}\). If this polyhedron is bounded, then \(P({\mathbf {A}},{\mathbf {b}})\) is called the half-space (or H-) representation of the corresponding polytope \(\overline{P}({\mathbf {V}})\). Identifying the vertices \({\mathbf {V}}\) generated by this intersection of half-spaces is part of the so-called enumeration problem, for which an efficient algorithm is available. From the above, it is clear that any polytope can be uniquely defined either from its half-space definition or from its vertices. From topological properties, the intersection of two polytopes \(\overline{P}({\mathbf {V}}_1)\) and \(\overline{P}({\mathbf {V}}_2)\) is a new polytope, where each polytope can possibly be specified by its H-representation if needed.
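The equivalence between the two representations can be illustrated on a toy case. The sketch below uses the unit square in \({\mathbb {R}}^2\) (an assumption chosen purely for illustration), draws random convex combinations \({\mathbf {V}}\varvec{\lambda }\) of its vertices, and checks that each of them satisfies the corresponding half-space inequalities \({\mathbf {A}}{\mathbf {x}}\le {\mathbf {b}}\). It does not perform vertex enumeration.

```python
import random

def convex_combination(V, lam):
    """Return x = V lam, where the columns of V are the vertices."""
    return [sum(V[r][c] * lam[c] for c in range(len(lam)))
            for r in range(len(V))]

def in_halfspaces(A, b, x, tol=1e-12):
    """Check the H-representation A x <= b row by row."""
    return all(sum(a * xi for a, xi in zip(row, x)) <= bi + tol
               for row, bi in zip(A, b))

# V-representation of the unit square: one vertex per column.
V = [[0.0, 1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]

# H-representation of the same square: x1 <= 1, -x1 <= 0, x2 <= 1, -x2 <= 0.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]

rng = random.Random(0)
points = []
for _ in range(1000):
    # Normalized positive weights: lam >= 0 and 1'lam = 1.
    w = [rng.expovariate(1.0) for _ in range(4)]
    total = sum(w)
    points.append(convex_combination(V, [wi / total for wi in w]))

all_inside = all(in_halfspaces(A, b, x) for x in points)
outside_point_accepted = in_halfspaces(A, b, [1.5, 0.5])
```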

A specific polytope of interest here is the simplex \(S({\mathbf {W}})\equiv \overline{P}({\mathbf {W}})\) when \(m=n\), so that it corresponds to a polytope that has n vertices in \({\mathbb {R}}^n\); i.e., these vertices lie on the same hyperplane in \({\mathbb {R}}^n\) so the polytope is, in fact, \((n-1)\) dimensional. In particular, the unit simplex is defined as \(S({\mathbf {I}})\), where the columns of the identity matrix \({\mathbf {I}}\) form the canonical orthonormal basis of \({\mathbb {R}}^n\). In light of the general results given above, it follows that the intersection of the unit simplex \(S({\mathbf {I}})\) (i.e., a polytope) with another polytope \(P({\mathbf {A}},{\mathbf {b}})\) yields another polytope, denoted \(S({\mathbf {W}})\), that is a subset of \(S({\mathbf {I}})\). From the topological properties again, it follows that

  1. any such polytope \(S({\mathbf {W}})\) can be represented as a simplicial complex, i.e. as the union of a finite set of simplices lying on the same hyperplane in \({\mathbb {R}}^n\), with

    $$ S({\mathbf {W}})=\bigcup _i S({\mathbf {W}}_i) \quad {\text{where }}\dim S({\mathbf {W}}_i)\cap S({\mathbf {W}}_j)<n \quad \forall i\ne j $$

    the condition on the intersection meaning that two distinct simplices \(S({\mathbf {W}}_i)\) and \(S({\mathbf {W}}_j)\) can, at most, share a common face (noting that the empty set is a face of every simplex, so that for simplices built on a disjoint set of vertices, their intersection, which is the empty set, obeys the previous definition);

  2. for any arbitrary simplex \(S({\mathbf {W}}_i)\), there is always an affine transformation that maps the vertices of the unit simplex \(S({\mathbf {I}})\) of the same dimension n to the vertices of \(S({\mathbf {W}}_i)\). Indeed, there exist infinitely many pairs \(({\mathbf {C}},{\mathbf {D}})\) such that

    $$ {\mathbf {W}}_i=\mathbf {CI}+{\mathbf {D}}. $$

    In particular, choosing arbitrarily, e.g., the first column \(\mathbf {w}_{i1}\) of \({\mathbf {W}}_i\),

    $$ \mathbf {C=W}_i-{\mathbf {D}}\quad \mathbf {D=w}_{i1}{\mathbf {1'}}, $$

    where \({\mathbf {W}}_i-{\mathbf {D}}\) is a translation of the simplex so that the first vertex is now at the origin of the orthonormal basis.

Considering the particular case of \(P({\mathbf {A}},{\mathbf {b}})\) and \(S({\mathbf {I}})\), their intersection \(\varOmega \) is thus a simplicial complex, with

$$ \varOmega =P({\mathbf {A}},{\mathbf {b}})\cap S({\mathbf {I}})=\bigcup _i S({\mathbf {W}}_i), $$

where (i) all vertices of all simplices \(S({\mathbf {W}}_i)\) lie on the same hyperplane, which is the hyperplane containing the vertices of \(S({\mathbf {I}})\), and (ii) each \(S({\mathbf {W}}_i)\) is an affine transform of \(S({\mathbf {I}})\) itself. Because all vertices lie on the same hyperplane, their projection \({\mathbf {W}}_{i,p}\) onto the same \((n-1)\)-dimensional subspace, obtained by dropping one row of \({\mathbf {W}}_i\) (the same row for all \({\mathbf {W}}_i\), of course), yields a set of n points in an \((n-1)\)-dimensional space, with the corresponding enclosed volumes \(v_i\) given by

$$ v_i=\frac{1}{(n-1)!}\left| \det \left( \begin{array}{c} {\mathbf {W}}_{i,p} \\ {\mathbf {1'}} \end{array} \right) \right| , $$

where the constant reduces to \(1/2\) when \(n=3\) and cancels out in the ratios used below.

These volumes are in the same ratios as the corresponding surfaces of the simplices over the hyperplane. In other words,

$$ w_i=\frac{v_i}{\sum _i v_i} $$

is the fraction of the surface of the simplicial complex over the hyperplane that is covered by the ith simplex.
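The computation of the weights \(w_i\) can be sketched on a small hypothetical example with \(n=3\): the unit simplex is split into two triangles by the point \((0.25,0.75,0)'\) on one of its edges. The sketch assumes the general normalizing constant \(1/(n-1)!\) for the volume of a simplex with n vertices in an \((n-1)\)-dimensional space, which equals \(1/2\) when \(n=3\) and cancels in the weights anyway. The helper names and the chosen split are illustrative only.

```python
import math

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [list(map(float, row)) for row in M]
    k = len(A)
    sign, result = 1.0, 1.0
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if A[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            sign = -sign
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
        result *= A[col][col]
    return sign * result

def projected_volume(W, drop_row=-1):
    """Volume enclosed by the vertex columns of W after dropping one row."""
    n = len(W)
    rows = [row for i, row in enumerate(W) if i != drop_row % n]
    rows.append([1.0] * n)  # append the row of ones 1'
    return abs(det(rows)) / math.factorial(n - 1)

# Vertices are stored column-wise, as in the text.
W1 = [[1.0, 0.0, 0.25],   # columns: e1, e3, split point
      [0.0, 0.0, 0.75],
      [0.0, 1.0, 0.0]]
W2 = [[0.0, 0.0, 0.25],   # columns: e2, e3, split point
      [1.0, 0.0, 0.75],
      [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

volumes = [projected_volume(W1), projected_volume(W2)]
weights = [v / sum(volumes) for v in volumes]
unit_volume = projected_volume(identity)
```

Here the two projected triangles together cover the projection of the whole unit simplex, so their volumes add up to `unit_volume`, and the weights reflect where the edge was split.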


Cite this article

Bogaert, P., Gengler, S. Bayesian maximum entropy and data fusion for processing qualitative data: theory and application for crowdsourced cropland occurrences in Ethiopia. Stoch Environ Res Risk Assess 32, 815–831 (2018). https://doi.org/10.1007/s00477-017-1426-8
