Learning Process Models with Missing Data

  • Will Bridewell
  • Pat Langley
  • Steve Racunas
  • Stuart Borrett
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


In this paper, we review the task of inductive process modeling, which uses domain knowledge to compose explanatory models of continuous dynamic systems. Next we discuss approaches to learning with missing values in time series, noting that these efforts are typically applied to descriptive modeling tasks that use little background knowledge. We also point out that these methods assume that data are missing at random, a condition that may not hold in scientific domains. Using experiments with synthetic and natural data, we compare an expectation maximization approach with one that simply ignores the missing data. Results indicate that expectation maximization leads to more accurate models in most cases, even though its basic assumptions are unmet. We conclude by discussing the implications of our findings along with directions for future work.
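The contrast between ignoring missing values and an expectation maximization approach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy model x[t+1] = a * x[t], the helper names `make_series`, `fit_ignore`, and `fit_em_style`, and the simplified loop, which alternates between filling gaps with their conditional means and refitting the parameter. A full EM treatment would also carry the conditional variances of the filled values, so this is only an EM-style approximation, and it treats the gaps as missing at random.

```python
import random

def make_series(a, n, noise, missing, seed=0):
    """Simulate x[t+1] = a*x[t] + Gaussian noise; mark chosen indices as None."""
    rng = random.Random(seed)
    xs = [1.0]
    for _ in range(n - 1):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise))
    return [None if t in missing else x for t, x in enumerate(xs)]

def fit_ignore(xs):
    """Least-squares estimate of a using only fully observed (x[t], x[t+1]) pairs."""
    num = den = 0.0
    for cur, nxt in zip(xs, xs[1:]):
        if cur is not None and nxt is not None:
            num += cur * nxt
            den += cur * cur
    return num / den

def fit_em_style(xs, iterations=20, sweeps=5):
    """Alternate between filling gaps (E-like step) and refitting a (M-like step).

    Assumes the first value is observed. Uses conditional means only,
    so it approximates EM rather than implementing it in full.
    """
    assert xs[0] is not None, "this sketch needs an observed first value"
    missing = [t for t, x in enumerate(xs) if x is None]
    a = fit_ignore(xs)          # start from the ignore-missing estimate
    filled = list(xs)
    for t in missing:           # initial fill by forward prediction
        filled[t] = a * filled[t - 1]
    last = len(filled) - 1
    for _ in range(iterations):
        # E-like step: replace each gap with its conditional mean given neighbors.
        for _ in range(sweeps):
            for t in missing:
                if t == last:
                    filled[t] = a * filled[t - 1]
                else:
                    filled[t] = a * (filled[t - 1] + filled[t + 1]) / (1 + a * a)
        # M-like step: ordinary least squares on the completed series.
        num = sum(c * n for c, n in zip(filled, filled[1:]))
        den = sum(c * c for c in filled[:-1])
        a = num / den
    return a

# Hypothetical demo: a noisy series with a few gaps, two of them adjacent.
noisy = make_series(0.9, 50, 0.05, {7, 8, 21, 33}, seed=0)
print("ignore missing:", fit_ignore(noisy))
print("EM-style:      ", fit_em_style(noisy))
```

On noise-free data both estimators recover the true parameter; they differ only when noise makes the filled values informative about the parameter, which is the situation the experiments in the paper examine for much richer process models.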


Keywords: Expectation Maximization; Expectation Maximization Algorithm; Inductive Logic Programming; Natural Data; Continuous Dynamic System



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Will Bridewell (1)
  • Pat Langley (1)
  • Steve Racunas (1)
  • Stuart Borrett (1)
  1. Computational Learning Laboratory, CSLI, Stanford University, Stanford, USA
