pp 1–22

Using models to correct data: paleodiversity and the fossil record

  • Alisa Bokulich
S.I.: Abstraction and Idealization in Scientific Modelling


Despite an enormous philosophical literature on models in science, surprisingly little has been written about data models and how they are constructed. In this paper, I examine the case of how paleodiversity data models are constructed from the fossil data. In particular, I show how paleontologists are using various model-based techniques to correct the data. Drawing on this research, I argue for the following related theses: first, the ‘purity’ of a data model is not a measure of its epistemic reliability. Instead it is the fidelity of the data that matters. Second, the fidelity of a data model in capturing the signal of interest is a matter of degree. Third, the fidelity of a data model can be improved ‘vicariously’, such as through the use of post hoc model-based correction techniques. And, fourth, data models, like theoretical models, should be assessed as adequate (or inadequate) for particular purposes.


Paleontology; Paleobiology; Evolution; Data; Model; Suppes; Fossil; Biodiversity; Representation; Simulations; Climate science; Sepkoski; Data models



I am grateful to Wendy Parker, Adrian Currie, Mike Benton, and two anonymous referees for helpful comments on an earlier version of this paper. I also thank Demetris Portides for first encouraging me to write this paper and for his patience seeing it through to completion. I gratefully acknowledge the support of the Institute of Advanced Study at Durham University, COFUND Senior Research Fellowship, under EU grant agreement number 609412.



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Department of Philosophy, Boston University, Boston, USA
