The Importance of Ontological Structure: Why Validation by ‘Fit-to-Data’ Is Insufficient

  • Gary Polhill
  • Doug Salt
Part of the Understanding Complex Systems book series (UCS)


This chapter briefly describes some common methods for making quantitative estimates of how well empirical models can be expected to predict. Its main argument, however, is that fit-to-data, the traditional yardstick for establishing confidence in models, is less solid ground for such confidence than is often supposed, especially for the kinds of system to which agent-based modelling is usually applied. Further, the chapter shows that the amount of data required to establish confidence in an arbitrary model by fit-to-data is often infeasible to obtain, unless appropriate ‘big data’ are available. This arbitrariness can be reduced by constraining the choice of model. In agent-based models, these constraints are introduced by their descriptiveness rather than by removing variables from consideration or making assumptions for the sake of simplicity. By comparing with neural networks, we show that agent-based models have a richer ontological structure. For agent-based models in particular, this richness means that the ontological structure has greater significance, yet it is all too commonly taken for granted or assumed to be ‘common sense’. The chapter therefore also discusses some approaches to validating ontologies.
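
The point about fit-to-data can be illustrated with a minimal sketch, which is not taken from the chapter. It assumes a synthetic linear process with a deliberately simple alternating disturbance, and compares an unconstrained model that memorises its calibration data with a model constrained to be a straight line. All names (`true_process`, `fit_memoriser`, `fit_line`, `mse`) and the data are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the data-generating process, the disturbance and
# both model families are assumptions made for this example.

def true_process(x):
    """Hypothetical underlying process the models try to capture."""
    return 2.0 * x

# Calibration data: the true process plus an alternating +1/-1 disturbance.
train = [(float(i), true_process(i) + (1.0 if i % 2 == 0 else -1.0))
         for i in range(10)]
# Unseen points between the calibration points, without disturbance.
test = [(i + 0.25, true_process(i + 0.25)) for i in range(10)]

def fit_memoriser(data):
    """Unconstrained model: return the output of the nearest calibration point."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def fit_line(data):
    """Constrained model: ordinary least-squares straight line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mse(model, data):
    """Mean squared error of a model's predictions on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

memoriser = fit_memoriser(train)
line = fit_line(train)

memoriser_fit, memoriser_pred = mse(memoriser, train), mse(memoriser, test)
line_fit, line_pred = mse(line, train), mse(line, test)

print(f"memoriser: fit-to-data error {memoriser_fit:.3f}, prediction error {memoriser_pred:.3f}")
print(f"line:      fit-to-data error {line_fit:.3f}, prediction error {line_pred:.3f}")
```

In this toy setting the memorising model reproduces the calibration data exactly yet predicts the unseen points worse than the straight line, so ranking the two models by fit-to-data alone would pick the poorer predictor. The sketch says nothing about agent-based models specifically; it only shows why a good fit is not, by itself, grounds for confidence.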


Validation; Fit-to-data; Ontology; Ontological structure; Neural net; Machine learning; Calibration; Generalization; Model bias; Variance; Ockham’s razor; Vapnik-Chervonenkis dimension; Knowledge elicitation; Interoperability; Validation measures; Description logic



We acknowledge funding from the Engineering and Physical Sciences Research Council (award no. 91310127), the European Commission Framework Programme 7 ‘GLAMURS’ project (grant agreement no. 613420) and the Scottish Government Rural Affairs, Food and the Environment Strategic Research Programme, Theme 2: Productive and Sustainable Land Management and Rural Economies. We are also grateful to Bruce Edmonds and Mark Brewer for useful comments on earlier drafts of this chapter; any mistakes are of course our own.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. The James Hutton Institute, Aberdeen, UK
