Learning Ensembles of Process-Based Models by Bagging of Random Library Samples

Part of the Lecture Notes in Computer Science book series (LNAI, volume 9956)

Abstract

We propose a new method for learning ensembles of process-based models for predictive modeling of dynamic systems from data and knowledge. Previous work has shown that ensembles based on sampling data (i.e., bagging) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of substantial computational overhead during learning. On the other hand, methods for constructing ensembles based on sampling knowledge (i.e., random library samples, RLS) allow for efficient learning of ensembles of process-based models while maintaining their long-term predictive performance. This paper tests the conjecture that combining these two methods can improve performance further. The proposed method, bagging of random library samples for learning ensembles of process-based models, combines the aforementioned approaches by sampling both data and knowledge. We apply the method to, and evaluate its performance on, a set of automated predictive modeling tasks in two lake ecosystems, using data and a library of knowledge for modeling population dynamics. The experimental results serve both to identify the optimal design decisions for the proposed method and to assess its predictive ability. The results show that such ensembles outperform not only a single process-based model but also each of the two methods for learning ensembles from data samples (bagging) and knowledge samples (RLS).
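
The idea of sampling both data and knowledge can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `fraction` parameter, the toy one-parameter learner, and the averaging of member predictions are all assumptions made for illustration. Real process-based models are differential-equation models fitted to time series, which this sketch replaces with simple scaled functions.

```python
import random
from statistics import mean

def bootstrap_sample(data, rng):
    """Bagging step: draw len(data) instances with replacement."""
    return [rng.choice(data) for _ in data]

def random_library_sample(library, fraction, rng):
    """RLS step: draw a random subset of the library components."""
    k = max(1, round(fraction * len(library)))
    return rng.sample(library, k)

def fit_best_component(data_sample, library_sample):
    """Toy stand-in for a process-based learner (assumption): fit y = a * f(x)
    by least squares for each available component f and keep the best one."""
    best = None
    for _name, f in library_sample:
        denom = sum(f(x) ** 2 for x, _ in data_sample)
        if denom == 0:
            continue  # component is identically zero on this sample
        a = sum(f(x) * y for x, y in data_sample) / denom
        error = sum((y - a * f(x)) ** 2 for x, y in data_sample)
        if best is None or error < best[0]:
            best = (error, a, f)
    _, a, f = best
    return lambda x: a * f(x)

def learn_ensemble(data, library, learn_model, n_models=10, fraction=0.5, seed=0):
    """Each ensemble member sees its own data sample and library sample."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        data_sample = bootstrap_sample(data, rng)
        library_sample = random_library_sample(library, fraction, rng)
        ensemble.append(learn_model(data_sample, library_sample))
    return ensemble

def predict(ensemble, x):
    """Combine member predictions by averaging (an assumed combining scheme)."""
    return mean(model(x) for model in ensemble)

# Hypothetical toy data and library of candidate components.
data = [(x, 2.0 * x) for x in range(1, 11)]
library = [
    ("constant", lambda x: 1.0),
    ("linear", lambda x: x),
    ("quadratic", lambda x: x * x),
]
ensemble = learn_ensemble(data, library, fit_best_component, n_models=5, fraction=1.0)
```

With `fraction=1.0` every member sees the whole library, so the sketch reduces to plain bagging. Lowering `fraction` restricts each member to a smaller part of the library, which is where the efficiency gain of library sampling would come from.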

Keywords

  • Differential Evolution
  • Domain Knowledge
  • Predictive Performance
  • Modeling Population Dynamic
  • Incomplete Model

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. Note that the sampling procedure does not require generating the power set \(\mathbb {P}(L)\).

  2. Predictions are obtained by simulating the model m on the test (and not the validation) set.
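
Note 1 can be illustrated with a short sketch. A subset of a library L can be drawn uniformly at random from the power set of L by deciding independently, with probability 1/2, whether to keep each element, so the power set never has to be enumerated. This is only one way to sample a subset and is an assumption here; the sampling procedure used in the paper may differ (for instance, it may fix the subset size). The names `sample_subset` and `component_i` are hypothetical.

```python
import random

def sample_subset(library, rng):
    """Draw one subset of `library` uniformly at random from its power set.

    Each element is kept independently with probability 1/2, so all
    2**len(library) subsets are equally likely, and no other subset
    is ever constructed.
    """
    return [item for item in library if rng.random() < 0.5]

rng = random.Random(42)
# 40 components give 2**40 possible subsets, far too many to enumerate.
library = [f"component_{i}" for i in range(40)]
subset = sample_subset(library, rng)
```

The cost of one draw is linear in the size of the library, whereas building the power set first would be exponential.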

Author information

Corresponding author

Correspondence to Nikola Simidjievski.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Simidjievski, N., Todorovski, L., Džeroski, S. (2016). Learning Ensembles of Process-Based Models by Bagging of Random Library Samples. In: Calders, T., Ceci, M., Malerba, D. (eds) Discovery Science. DS 2016. Lecture Notes in Computer Science, vol 9956. Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_16

  • DOI: https://doi.org/10.1007/978-3-319-46307-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46306-3

  • Online ISBN: 978-3-319-46307-0

  • eBook Packages: Computer Science, Computer Science (R0)