The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

  • Research Article
  • Frontiers of Earth Science

Abstract

The objective of prospectivity modeling is to predict the conditional probability of the presence (T = 1) or absence (T = 0) of a target T given favorable or prohibitive predictors B, or to construct a two-class {0, 1} classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists’ favorite method of prospectivity modeling because of its apparent simplicity. However, the numerical simplicity is deceptive, as it follows from the severe modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are introduced explicitly; they are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view of prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers; from the regression point of view, it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the conditional independence assumption, even though relaxing modeling assumptions about weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE, a fabricated dataset was used to “validate” the approach. Using the same fabricated dataset, it is shown that BoostWofE cannot in general compensate for a lack of conditional independence, whatever the order in which the predictors are processed consecutively. Thus the alleged features of BoostWofE are disproved by counterexample, while the theoretical finding is confirmed that logistic regression including interaction terms can exactly compensate for violations of joint conditional independence if the predictors are indicators.
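
The contrast between WofE and logistic regression with interaction terms can be illustrated with a minimal sketch for two indicator predictors. The count table below is invented for illustration and is not the fabricated dataset discussed in the article; the names `counts`, `p_wofe`, and `p_lr` are likewise assumptions. The sketch uses the fact that a logistic regression with all interaction terms on indicator predictors is saturated, so its maximum-likelihood fit reproduces the observed cell proportions; the coefficients are therefore obtained by solving a small linear system rather than by running an optimizer.

```python
import numpy as np

# Hypothetical counts indexed as counts[t, b1, b2]; B1 and B2 are
# deliberately dependent given T, so conditional independence is violated.
counts = np.array([
    [[770.0, 80.0], [80.0, 20.0]],  # T = 0
    [[10.0, 5.0], [5.0, 30.0]],     # T = 1
])


def logit(p):
    return np.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


# Reference: P(T = 1 | b1, b2) obtained directly by counting.
p_true = counts[1] / counts.sum(axis=0)

# Weights-of-evidence: prior logit plus one weight per predictor,
# W_i(b) = ln[ P(B_i = b | T = 1) / P(B_i = b | T = 0) ].
n_t = counts.sum(axis=(1, 2))                  # number of cells with T = 0, T = 1
prior = n_t[1] / n_t.sum()
p_b1_given_t = counts.sum(axis=2) / n_t[:, None]   # indexed [t, b1]
p_b2_given_t = counts.sum(axis=1) / n_t[:, None]   # indexed [t, b2]
w1 = np.log(p_b1_given_t[1] / p_b1_given_t[0])
w2 = np.log(p_b2_given_t[1] / p_b2_given_t[0])
p_wofe = expit(logit(prior) + w1[:, None] + w2[None, :])

# Logistic regression with interaction term:
# logit P = beta0 + beta1*b1 + beta2*b2 + beta12*b1*b2.
# Four cells and four parameters: solve for the exact (saturated) fit.
cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
X = np.array([[1.0, b1, b2, b1 * b2] for b1, b2 in cells])
y = np.array([logit(p_true[b1, b2]) for b1, b2 in cells])
beta = np.linalg.solve(X, y)

p_lr = np.empty((2, 2))
for (b1, b2), row in zip(cells, X):
    p_lr[b1, b2] = expit(row @ beta)

print("counting :", np.round(p_true, 4))
print("WofE     :", np.round(p_wofe, 4))
print("LR + int.:", np.round(p_lr, 4))
```

With these counts the WofE probabilities deviate from the probabilities obtained by counting, whereas the logistic regression with the interaction term reproduces them. In practice the coefficients would be estimated from data by maximum likelihood; the linear solve is only a shortcut that is valid for this saturated case.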

Author information

Corresponding author

Correspondence to Helmut Schaeben.

Cite this article

Schaeben, H., Semmler, G. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression. Front. Earth Sci. 10, 389–408 (2016). https://doi.org/10.1007/s11707-016-0595-y

  • DOI: https://doi.org/10.1007/s11707-016-0595-y
