Data-Driven Process Discovery - Revealing Conditional Infrequent Behavior from Event Logs

  • Felix Mannhardt
  • Massimiliano de Leoni
  • Hajo A. Reijers
  • Wil M. P. van der Aalst
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10253)


Process discovery methods automatically infer process models from event logs. Often, event logs contain so-called noise, e.g., infrequent outliers or recording errors, which obscure the main behavior of the process. Existing methods filter this noise based on the frequency of event labels: infrequent paths and activities are excluded. However, infrequent behavior may reveal important insights into the process. Thus, not all infrequent behavior should be considered as noise. This paper proposes the Data-aware Heuristic Miner (DHM), a process discovery method that uses the data attributes to distinguish infrequent paths from random noise by using classification techniques. Data- and control-flow of the process are discovered together. We show that the DHM is, to some degree, robust against random noise and reveals data-driven decisions, which are filtered by other discovery methods. The DHM has been successfully tested on several real-life event logs, two of which we present in this paper.
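The core idea, separating infrequent but data-dependent paths from random noise, can be illustrated with a minimal sketch. This is not the DHM itself, only a toy under stated assumptions: the observations, the attribute names `amount` and `channel`, the single "attribute equals value" rule standing in for a real classifier, the use of Cohen's kappa as the score, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    a_counts, p_counts = Counter(actual), Counter(predicted)
    labels = set(actual) | set(predicted)
    expected = sum(a_counts[l] * p_counts[l] for l in labels) / (n * n)
    if expected == 1.0:  # both sequences constant: no information
        return 0.0
    return (observed - expected) / (1 - expected)


def best_rule_kappa(observations):
    """Find the 'attribute == value' rule that best predicts the relation.

    observations: list of (attributes dict, bool) pairs, where the bool says
    whether the infrequent relation was observed in that case.
    Returns (kappa, rule); rule is None if no rule beats chance.
    """
    labels = [followed for _, followed in observations]
    candidates = {(k, v) for attrs, _ in observations for k, v in attrs.items()}
    best = (0.0, None)
    for key, value in sorted(candidates, key=repr):
        predicted = [attrs.get(key) == value for attrs, _ in observations]
        kappa = cohen_kappa(labels, predicted)
        if kappa > best[0]:
            best = (kappa, (key, value))
    return best


def keep_relation(observations, threshold=0.5):
    """Keep an infrequent relation only if the data explains it (hypothetical threshold)."""
    return best_rule_kappa(observations)[0] >= threshold


# Toy data (hypothetical): the relation occurs in 4 of 20 cases in both logs.
# In `conditional` it occurs exactly when amount == "high".
conditional = (
    [({"amount": "high", "channel": "web"}, True)] * 4
    + [({"amount": "low", "channel": "web"}, False)] * 8
    + [({"amount": "low", "channel": "post"}, False)] * 8
)

# In `noise` it occurs once in every attribute combination, so no rule explains it.
noise = [
    ({"amount": amount, "channel": channel}, i == 0)
    for amount in ("high", "low")
    for channel in ("web", "post")
    for i in range(5)
]

print(best_rule_kappa(conditional), keep_relation(conditional))
print(best_rule_kappa(noise), keep_relation(noise))
```

In this toy setting both relations are equally infrequent, so a purely frequency-based filter would treat them alike. The rule `amount == "high"` predicts the first relation perfectly (kappa 1.0), whereas no rule does better than chance for the second (kappa 0.0), so only the first would be kept.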


Keywords: Process mining · Process discovery · Event logs · Noise · Rules



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Felix Mannhardt (1)
  • Massimiliano de Leoni (1)
  • Hajo A. Reijers (1, 2)
  • Wil M. P. van der Aalst (1)

  1. Eindhoven University of Technology, Eindhoven, The Netherlands
  2. Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
