Mathematical Geosciences, Volume 41, Issue 2, pp 193–213

Simpson’s Paradox in Natural Resource Evaluation

Article

Abstract

Reversals of statistical relationships when two or more groups of data in a cross tabulation are aggregated were first reported more than a century ago. The phenomenon was later named Simpson’s paradox, after the reversal examples in Simpson’s seminal paper drew the attention of the statistical community. However, almost all published cases have come from sociology and biomedical statistics. Does Simpson’s reversal occur in the geosciences? This paper presents various examples from petroleum geology and reservoir modeling. Boundary conditions for such a reversal are discussed within a broader framework of sampling analysis. Ecological inference bias, the change of support problem, the modifiable areal unit problem, and the reference class problem are discussed in relation to Simpson’s paradox in the framework of spatial statistics. It is demonstrated that the traditional interpretation of the paradox as a result of disproportional sampling based on a contingency table is not always true in the framework of spatial statistics, and that the reversal, while theoretically benign, is inferentially treacherous. The emphasis is therefore on combining statistical and scientific inferences in geologic modeling and hydrocarbon resource evaluation under various sampling schemes or support effects, with or without a Simpson’s reversal.
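
To make the contingency-table form of the reversal concrete, the following minimal Python sketch checks for a Simpson’s reversal in a two-group cross tabulation. It is not taken from the paper: the block and facies names and all well counts are hypothetical, chosen only so that disproportional sampling across the two blocks produces a reversal. It illustrates the traditional interpretation only and does not cover the spatial cases the paper argues fall outside it.

```python
from fractions import Fraction

# Hypothetical counts (not from the paper): (productive wells, total wells)
# for two facies, sampled very unevenly across two blocks.
counts = {
    "block_1": {"facies_A": (81, 87), "facies_B": (234, 270)},
    "block_2": {"facies_A": (192, 263), "facies_B": (55, 80)},
}


def rate(successes, total):
    """Exact success proportion, avoiding floating-point ties."""
    return Fraction(successes, total)


def pooled_rate(table, category):
    """Proportion for one category after aggregating all groups."""
    successes = sum(table[group][category][0] for group in table)
    total = sum(table[group][category][1] for group in table)
    return rate(successes, total)


def has_reversal(table, a="facies_A", b="facies_B"):
    """True if a beats b (or b beats a) in every group, yet the
    aggregated comparison points strictly the other way."""
    a_wins = all(rate(*table[g][a]) > rate(*table[g][b]) for g in table)
    b_wins = all(rate(*table[g][a]) < rate(*table[g][b]) for g in table)
    pooled_a, pooled_b = pooled_rate(table, a), pooled_rate(table, b)
    return (a_wins and pooled_a < pooled_b) or (b_wins and pooled_a > pooled_b)


if __name__ == "__main__":
    for group, row in counts.items():
        for category, (s, n) in row.items():
            print(f"{group} {category}: {float(rate(s, n)):.3f}")
    for category in ("facies_A", "facies_B"):
        print(f"pooled {category}: {float(pooled_rate(counts, category)):.3f}")
    print("Simpson's reversal:", has_reversal(counts))
```

In this hypothetical table, facies A has the higher proportion of productive wells within each block, but most facies A wells sit in the poorer block, so the aggregated comparison favors facies B. Rebalancing the well counts so that both facies are sampled in the same proportions across blocks removes the reversal.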

Keywords

Petroleum resource; Reservoir modeling; Sampling; Ecological inference problem; Categorical variable change of support problem; Reference class problem; Geostatistics

Copyright information

© International Association for Mathematical Geology 2008

Authors and Affiliations

  1. Greenwood Village, USA
