Business & Information Systems Engineering, Volume 61, Issue 6, pp 713–728

Towards Confirmatory Process Discovery: Making Assertions About the Underlying System

  • Gert Janssenswillen
  • Benoît Depaire
Research Paper


The focus in the field of process mining, and process discovery in particular, has thus far been on exploring and describing event data by means of models. Since the obtained models are often based directly on a sample of event data, the question of whether they also apply to the real process typically remains unanswered. As the underlying process is unknown in real life, there is a need for unbiased estimators to assess the system-quality of a discovered model and subsequently make assertions about the process. In this paper, an experiment is described and discussed to analyze whether existing fitness, precision, and generalization metrics can be used as unbiased estimators of system fitness and system precision. The results show that important biases exist, which currently makes it nearly impossible to objectively measure the ability of a model to represent the system.


Process mining, Process discovery, Process quality, Fitness, Precision, Generalization, Exploratory data analysis, Confirmatory data analysis



The computational resources and services used in this work for both process discovery and process conformance tasks were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government.



Copyright information

© Springer Fachmedien Wiesbaden GmbH, ein Teil von Springer Nature 2018

Authors and Affiliations

  1. Hasselt University, Diepenbeek, Belgium
  2. Research Foundation Flanders (FWO), Brussels, Belgium
