Designing and Evaluating an Interpretable Predictive Modeling Technique for Business Processes

  • Dominic Breuker
  • Patrick Delfmann
  • Martin Matzner
  • Jörg Becker
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 202)


Process mining is a field traditionally concerned with retrospective analysis of event logs, yet interest in applying it online to running process instances is increasing. In this paper, we design a predictive modeling technique that quantifies the probabilities of how a running process instance will behave, based on the events observed so far. To this end, we study the field of grammatical inference and identify probabilistic modeling techniques suitable for event log data. After tailoring one of these techniques to the domain of business process management, we derive a learning algorithm. By combining our predictive model with an established process discovery technique, we are able to visualize the significant parts of predictive models in the form of Petri nets. A preliminary evaluation demonstrates the effectiveness of our approach.


Data mining, Process mining, Grammatical inference, Predictive modeling



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Dominic Breuker (1, corresponding author)
  • Patrick Delfmann (1)
  • Martin Matzner (1)
  • Jörg Becker (1)
  1. Department for Information Systems, Muenster, Germany
