Machine Learning-Based Framework for Log-Lifting in Business Process Mining Applications

  • Conference paper
Business Process Management (BPM 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11675)

Abstract

Real-life event logs are typically much less structured and more complex than the predefined business activities they refer to. Most existing process mining techniques assume a one-to-one mapping between process model activities and the events recorded during process execution. In practice, however, event logs and process model activities are defined at different levels of granularity. The challenges posed by this discrepancy can be addressed by means of log-lifting. In this work we develop a machine-learning-based framework aimed at bridging the abstraction-level gap between logs and process models. The proposed framework operates in two main phases: log segmentation and machine-learning-based classification. The purpose of the segmentation phase is to identify potential segment separators in a flow of low-level events, where each segment corresponds to an unknown high-level activity. For this, we propose a segmentation algorithm based on maximum likelihood with n-gram analysis. In the second phase, event segments are mapped to their corresponding high-level activities using a supervised machine learning technique. Several machine learning classification methods are explored, including artificial neural networks (ANNs), support vector machines (SVMs), and random forests. We demonstrate the applicability of our framework using a real-life event log provided by SAP. The results show that the approach based on the random forest algorithm outperforms the other methods, with an accuracy of 96.4%. The testing time was found to be around 0.01 s, which makes the algorithm a good candidate for real-time deployment scenarios.
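
To make the segmentation phase more concrete, the following is a minimal sketch of one way n-gram statistics could be used to place segment separators. It is not the algorithm of the paper, whose details are not given in the abstract: it assumes bigrams only, estimates transition probabilities by simple counting, and cuts wherever a transition is unlikely. The function names, the toy event names, and the threshold of 0.6 are all hypothetical choices made for illustration.

```python
from collections import Counter


def bigram_probs(traces):
    """Estimate P(next | current) by counting bigrams over all traces."""
    pair_counts = Counter()
    first_counts = Counter()
    for trace in traces:
        for cur, nxt in zip(trace, trace[1:]):
            pair_counts[(cur, nxt)] += 1
            first_counts[cur] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(trace, probs, threshold=0.6):
    """Split a trace wherever the transition probability is below threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if not trace:
        return []
    segments = [[trace[0]]]
    for cur, nxt in zip(trace, trace[1:]):
        if probs.get((cur, nxt), 0.0) < threshold:
            segments.append([nxt])    # unlikely transition: start a new segment
        else:
            segments[-1].append(nxt)  # likely transition: extend current segment
    return segments


# Hypothetical toy log of low-level events (not taken from the SAP dataset).
traces = [
    ["open", "edit", "save", "login", "query"],
    ["login", "query", "open", "edit", "save"],
    ["open", "edit", "save", "open", "edit", "save"],
    ["login", "query", "login", "query"],
]

probs = bigram_probs(traces)
print(segment(traces[0], probs))
# [['open', 'edit', 'save'], ['login', 'query']]
```

In this toy log, transitions inside a group (for example open to edit) always occur, whereas transitions across groups (for example save to login) occur only half of the time, so the cut falls between the two groups. In the second phase described above, each resulting segment would then be passed to a supervised classifier to obtain its high-level activity label; that step is not shown here.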

Notes

  1. SAP dataset: https://doi.org/10.5281/zenodo.2566022

Author information

Correspondence to Ghalia Tello, Gabriele Gianini, Rabeb Mizouni or Ernesto Damiani.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Tello, G., Gianini, G., Mizouni, R., Damiani, E. (2019). Machine Learning-Based Framework for Log-Lifting in Business Process Mining Applications. In: Hildebrandt, T., van Dongen, B., Röglinger, M., Mendling, J. (eds) Business Process Management. BPM 2019. Lecture Notes in Computer Science, vol 11675. Springer, Cham. https://doi.org/10.1007/978-3-030-26619-6_16

  • DOI: https://doi.org/10.1007/978-3-030-26619-6_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26618-9

  • Online ISBN: 978-3-030-26619-6

  • eBook Packages: Computer Science, Computer Science (R0)
