
AI-Empowered Process Mining for Complex Application Scenarios: Survey and Discussion

A Correction to this article was published on 24 June 2021

This article has been updated


The growing attention of process mining (PM) research to the logs of low-structured processes and of non-process-aware systems (e.g., ERP and IoT systems) poses a number of challenges. In such cases, the risk of obtaining low-quality results is high, and carrying out a PM project requires great effort, most of which is usually spent trying different ways to select and prepare the input data for PM tasks. This paper discusses two general AI-based strategies that can improve and ease the execution of PM tasks in such settings: (a) using explicit domain knowledge and (b) exploiting auxiliary AI tasks. After introducing some specific data quality issues that complicate the application of PM techniques in these settings, the paper illustrates the two strategies and the results of a systematic review of the relevant literature. Finally, it presents a taxonomy of the reviewed works and discusses major trends, open issues and opportunities in this field of research.





  2. Performing separate searches for the two AI exploitation strategies mainly serves to ease the interpretation, analysis and presentation of the two classes of works considered in our review.

  3. The cutoff value of 1 citation per year acts as a reasonable “survival threshold” for purging both obsolete and low-impact studies. For the sake of fairness, this threshold is not applied to works published in the current year (2020), since such works may still have no citations simply because they are recent, regardless of their quality and relevance.

  4. In fact, in a scenario affected by a high level of uncertainty (e.g., owing to the combined presence of flexible processes and ambiguous event-activity mappings), selecting just one optimal interpretation for a trace loses information whenever the trace can be explained by several similarly plausible interpretations and the analyst’s expertise does not suffice to identify the “right interpretation” and definitively discard the others.
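
The “survival threshold” described in note 3 can be illustrated with a short sketch. The note does not say how the citation rate is computed, so the code below assumes it is the citation count divided by the number of years since publication, and that a rate exactly equal to the cutoff passes. The `Study` record and the `survives` function are hypothetical names introduced only for this illustration; they are not part of the review protocol.

```python
from dataclasses import dataclass

REVIEW_YEAR = 2020            # "current year" stated in note 3
MIN_CITATIONS_PER_YEAR = 1.0  # cutoff value stated in note 3


@dataclass
class Study:
    """Hypothetical record for a candidate work (illustrative only)."""
    title: str
    year: int
    citations: int


def survives(study: Study,
             review_year: int = REVIEW_YEAR,
             min_rate: float = MIN_CITATIONS_PER_YEAR) -> bool:
    """Return True if the study passes the citation-rate filter."""
    # Works from the review year are exempt: they may have no citations
    # yet simply because they are recent.
    if study.year >= review_year:
        return True
    # Assumption: rate = citations / years elapsed since publication.
    years_elapsed = review_year - study.year
    return study.citations / years_elapsed >= min_rate


candidates = [
    Study("recent work", 2020, 0),
    Study("low-impact work", 2015, 4),
    Study("borderline work", 2018, 2),
]
kept = [s.title for s in candidates if survives(s)]
```

Under these assumptions the recent work is kept by the exemption, the low-impact work is dropped, and the borderline work is kept because its rate equals the cutoff. A different definition of the rate, such as counting the publication year as a full year, would shift the borderline cases.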




This work was partially supported by the European Commission funded project “Humane AI: Toward AI Systems That Augment and Empower Humans by Understanding Us, our Society and the World Around Us” (grant # 820437). The support is gratefully acknowledged.

Author information



Corresponding author

Correspondence to Luigi Pontieri.

Additional information


The original online version of this article was revised: The Acknowledgements section has been included.


About this article


Cite this article

Folino, F., Pontieri, L. AI-Empowered Process Mining for Complex Application Scenarios: Survey and Discussion. J Data Semant 10, 77–106 (2021).



Keywords

  • Process mining
  • Artificial intelligence
  • Data quality
  • Augmented analytics
  • Informed machine learning
  • Structured literature review