Scientific Discovery, Process Models, and the Social Sciences

Scientific Discovery in the Social Sciences

Part of the book series: Synthese Library (SYLI, volume 413)

Abstract

In this chapter, we review research on computational approaches to scientific discovery, starting with early work on the induction of numeric laws before turning to the construction of models that explain observations in terms of domain knowledge. We focus especially on inductive process modeling, which involves finding a set of linked differential equations, organized into processes, that reproduce, predict, and explain multivariate time series. We review the notion of quantitative process models, present two approaches to their construction that search through a space of model structures and associated parameters, and report their successful application to the explanation of ecological data. After this, we explore the relevance of process models to the social sciences, including the reasons they seem appropriate and some challenges to discovering them. In closing, we discuss other causal frameworks, including structural equation models and agent-based accounts, that researchers have developed to construct models of social phenomena.
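As a rough illustration of the idea summarized above, the following minimal sketch treats a model as a set of processes, each contributing a rate term to one or more differential equations, and searches over subsets of candidate processes for the structure that best reproduces a time series. Everything in it is an assumption made for illustration: the predator-prey variables, the candidate processes, the fixed rate constants, the Euler integrator, and the exhaustive search are hypothetical and are not taken from the chapter. In particular, parameters are held fixed here, whereas the approaches the abstract describes also search over associated parameters.

```python
from itertools import combinations

# Hypothetical candidate processes: (name, rate function, effects on variables).
# Rate constants are fixed for simplicity; real systems would also fit them.
CANDIDATES = [
    ("prey_growth", lambda s: 0.5 * s["prey"], {"prey": 1.0}),
    ("predation", lambda s: 0.02 * s["prey"] * s["pred"], {"prey": -1.0, "pred": 0.5}),
    ("pred_loss", lambda s: 0.3 * s["pred"], {"pred": -1.0}),
    ("prey_loss", lambda s: 0.1 * s["prey"], {"prey": -1.0}),
]


def derivatives(processes, state):
    """Sum each process's contribution into one derivative per variable."""
    d = {var: 0.0 for var in state}
    for _name, rate, effects in processes:
        r = rate(state)
        for var, coeff in effects.items():
            d[var] += coeff * r
    return d


def simulate(processes, init, steps=200, dt=0.05):
    """Integrate the linked equations with a simple forward Euler scheme."""
    state = dict(init)
    trajectory = [dict(state)]
    for _ in range(steps):
        d = derivatives(processes, state)
        state = {var: state[var] + dt * d[var] for var in state}
        trajectory.append(dict(state))
    return trajectory


def squared_error(predicted, observed):
    """Sum of squared differences over all time points and variables."""
    return sum((p[v] - o[v]) * (p[v] - o[v])
               for p, o in zip(predicted, observed) for v in p)


def search(candidates, init, observed):
    """Exhaustively score every non-empty subset of candidate processes."""
    best = None
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            err = squared_error(simulate(subset, init), observed)
            if best is None or err < best[0]:
                best = (err, [name for name, _rate, _effects in subset])
    return best


# Synthetic "observations" generated from a known three-process structure.
init = {"prey": 40.0, "pred": 9.0}
observed = simulate(CANDIDATES[:3], init)
best_error, best_structure = search(CANDIDATES, init, observed)
print(best_structure, best_error)
```

Because the observations here are generated by one of the candidate structures, the search recovers that structure exactly. With real, noisy data one would also need parameter estimation and some guard against overfitting.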

Notes

  1. Some variants (e.g., Bridewell & Langley 2010) ensure that candidate structures are consistent with constraints on relations among processes, for instance that an organism cannot take part in two distinct growth elements. Such constraints offer another form of theoretical knowledge about the domain.

References

  • Asgharbeygi, N., Bay, S., Langley, P., & Arrigo, K. (2006). Inductive revision of quantitative process models. Ecological Modelling, 194, 70–79.

  • Arvay, A., & Langley, P. (2016). Selective induction of rate-based process models. In Proceedings of the Fourth Annual Conference on Cognitive Systems. Evanston, IL.

  • Bradley, E., Easley, M., & Stolle, R. (2001). Reasoning about nonlinear system identification. Artificial Intelligence, 133, 139–188.

  • Bridewell, W., Bani Asadi, N., Langley, P., & Todorovski, L. (2005). Reducing overfitting in process model induction. In Proceedings of the Twenty-Second International Conference on Machine Learning (pp. 81–88). Bonn, Germany.

  • Bridewell, W., Langley, P., Racunas, S., & Borrett, S. R. (2006). Learning process models with missing data. In Proceedings of the Seventeenth European Conference on Machine Learning (pp. 557–565). Berlin: Springer.

  • Bridewell, W., Langley, P., Todorovski, L., & Džeroski, S. (2008). Inductive process modeling. Machine Learning, 71, 1–32.

  • Bridewell, W., & Langley, P. (2010). Two kinds of knowledge in scientific discovery. Topics in Cognitive Science, 2, 36–52.

  • Bruk, L. G., Gorodskii, S. N., Zeigarnik, A. V., Valdés-Pérez, R. E., & Temkin, O. N. (1998). Oxidative carbonylation of phenylacetylene catalyzed by Pd(II) and Cu(I): Experimental tests of forty-one computer-generated mechanistic hypotheses. Journal of Molecular Catalysis A: Chemical, 130, 29–40.

  • Colton, S., Bundy, A., & Walsh, T. (2000). Automatic identification of mathematical concepts. In Proceedings of the Seventeenth International Conference on Machine Learning (pp. 183–190). Stanford, CA: Morgan Kaufmann.

  • Džeroski, S., & Todorovski, L. (1995). Discovering dynamics: From inductive logic programming to machine discovery. Journal of Intelligent Information Systems, 4, 89–108.

  • Džeroski, S., & Todorovski, L. (Eds.). (2007). Computational discovery of communicable scientific knowledge. Berlin: Springer.

  • Džeroski, S., & Todorovski, L. (2008). Equation discovery for systems biology: Finding the structure and dynamics of biological networks from time course data. Current Opinion in Biotechnology, 19, 360–368.

  • Epstein, J. M., & Axtell, R. (1996). Growing artificial societies: Social science from the bottom up. Cambridge, MA: MIT Press.

  • Fajtlowicz, S. (1988). On conjectures of GRAFFITI. Discrete Mathematics, 72, 113–118.

  • Falkenhainer, B. C., & Michalski, R. S. (1986). Integrating quantitative and qualitative discovery: The ABACUS system. Machine Learning, 1, 367–401.

  • Fayyad, U., Haussler, D., & Stolorz, P. (1996). KDD for science data analysis: Issues and examples. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (pp. 50–56). Portland, OR: AAAI Press.

  • Forbus, K. D. (1984). Qualitative process theory. Artificial Intelligence, 24, 85–168.

  • Glymour, C., Scheines, R., Spirtes, P., & Kelly, K. (1987). Discovering causal structure: Artificial intelligence, philosophy of science, and statistical modeling. San Diego: Academic.

  • Goldberger, A. S. (1972). Structural equation models in the social sciences. Econometrica, 40, 979–1001.

  • Gordon, A., Edwards, P., Sleeman, D., & Kodratoff, Y. (1994). Scientific discovery in a space of structural models: An example from the history of solution chemistry. In Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society (pp. 381–386). Atlanta: Lawrence Erlbaum.

  • Hempel, C. G. (1965). Aspects of scientific explanation and other essays. New York: Free Press.

  • Hempel, C. G. (1966). Philosophy of natural science. Englewood Cliffs, NJ: Prentice-Hall.

  • King, R. D., & Srinivasan, A. (1996). Prediction of rodent carcinogenicity bioassays from molecular structure using inductive logic programming. Environmental Health Perspectives, 104(Supplement 5), 1031–1040.

  • King, R. D., Whelan, K. E., Jones, F. M., Reiser, P. G. K., Bryant, C. H., Muggleton, S. H., Kell, D. B., & Oliver, S. G. (2004). Functional genomic hypothesis generation and experimentation by a robot scientist. Nature, 427, 247–252.

  • Kokar, M. M. (1986). Determining arguments of invariant functional descriptions. Machine Learning, 1, 403–422.

  • Koza, J. R., Mydlowec, W., Lanza, G., Yu, J., & Keane, M. A. (2001). Reverse engineering of metabolic pathways from observed data using genetic programming. Pacific Symposium on Biocomputing, 6, 434–445.

  • Langley, P. (1981). Data-driven discovery of physical laws. Cognitive Science, 5, 31–54.

  • Langley, P. (2000). The computational support of scientific discovery. International Journal of Human-Computer Studies, 53, 393–410.

  • Langley, P., & Arvay, A. (2015). Heuristic induction of rate-based process models. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (pp. 537–543). Austin, TX: AAAI Press.

  • Langley, P., & Arvay, A. (2017). Flexible model induction through heuristic process discovery. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (pp. 4415–4421). San Francisco: AAAI Press.

  • Langley, P., Simon, H. A., Bradshaw, G. L., & Żytkow, J. M. (1987). Scientific discovery: Computational explorations of the creative processes. Cambridge, MA: MIT Press.

  • Langley, P., Sanchez, J., Todorovski, L., & Džeroski, S. (2002). Inducing process models from continuous data. In Proceedings of the Nineteenth International Conference on Machine Learning (pp. 347–354). Sydney: Morgan Kaufmann.

  • Langley, P., Shiran, O., Shrager, J., Todorovski, L., & Pohorille, A. (2006). Constructing explanatory process models from biological data and knowledge. Artificial Intelligence in Medicine, 37, 191–201.

  • Maier, M., Taylor, B., Oktay, H., & Jensen, D. (2010). Learning causal models of relational domains. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (pp. 531–538). Atlanta: AAAI Press.

  • Mitchell, F., Sleeman, D., Duffy, J. A., Ingram, M. D., & Young, R. W. (1997). Optical basicity of metallurgical slags: A new computer-based system for data visualisation and analysis. Ironmaking and Steelmaking, 24, 306–320.

  • Moulet, M. (1992). ARC.2: Linear regression in ABACUS. In Proceedings of the ML 92 Workshop on Machine Discovery (pp. 137–146). Aberdeen, Scotland.

  • Murata, T., Mizutani, M., & Shimura, M. (1994). A discovery system for trigonometric functions. In Proceedings of the Twelfth National Conference on Artificial Intelligence (pp. 645–650). Seattle: AAAI Press.

  • Nordhausen, B., & Langley, P. (1990). A robust approach to numeric discovery. In Proceedings of the Seventh International Conference on Machine Learning (pp. 411–418). Austin, TX: Morgan Kaufmann.

  • Park, C., Bridewell, W., & Langley, P. (2010). Integrated systems for inducing spatio-temporal process models. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (pp. 1555–1560). Atlanta: AAAI Press.

  • Popper, K. R. (1961). The logic of scientific discovery. New York: Science Editions.

  • Saito, K., & Nakano, R. (1997). Law discovery using neural networks. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (pp. 1078–1083). Yokohama: Morgan Kaufmann.

  • Schaffer, C. (1990). Bivariate scientific function finding in a sampled, real-data testbed. Machine Learning, 12, 167–183.

  • Schmidt, M., & Lipson, H. (2009). Distilling free-form natural laws from experimental data. Science, 324, 81–85.

  • Shrager, J., & Langley, P. (Eds.). (1990). Computational models of scientific discovery and theory formation. San Francisco: Morgan Kaufmann.

  • Simon, H. A. (1966). Scientific discovery and the psychology of problem solving. In R. Colodny (Ed.), Mind and cosmos. Pittsburgh, PA: University of Pittsburgh Press.

  • Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. New York: Springer.

  • Todorovski, L., Bridewell, W., Shiran, O., & Langley, P. (2005). Inducing hierarchical process models in dynamic domains. In Proceedings of the Twentieth National Conference on Artificial Intelligence (pp. 892–897). Pittsburgh, PA: AAAI Press.

  • Todorovski, L., Džeroski, S., & Kompare, B. (1998). Modeling and prediction of phytoplankton growth with equation discovery. Ecological Modelling, 113, 71–81.

  • Valdés-Pérez, R. E. (1994). Human/computer interactive elucidation of reaction mechanisms: Application to catalyzed hydrogenolysis of ethane. Catalysis Letters, 28, 79–87.

  • Washio, T., & Motoda, H. (1997). Discovering admissible models of complex systems based on scale types and identity constraints. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (pp. 810–817). Yokohama: Morgan Kaufmann.

  • Żytkow, J. M., Zhu, J., & Hussam, A. (1990). Automated discovery in a chemistry laboratory. In Proceedings of the Eighth National Conference on Artificial Intelligence (pp. 889–894). Boston: AAAI Press.

Acknowledgements

The research reported in this chapter was supported by Grant No. N00014-11-1-0107 from the US Office of Naval Research, which is not responsible for its contents. We thank Will Bridewell, Sašo Džeroski, Ruolin Jia, and Ljupčo Todorovski for useful discussions that led to the results reported.

Author information

Correspondence to Pat Langley.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Langley, P., Arvay, A. (2019). Scientific Discovery, Process Models, and the Social Sciences. In: Addis, M., Lane, P.C.R., Sozou, P.D., Gobet, F. (eds) Scientific Discovery in the Social Sciences. Synthese Library, vol 413. Springer, Cham. https://doi.org/10.1007/978-3-030-23769-1_11
