Extracting Event Data from Document-Driven Enterprise Systems

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2023)

Abstract

The preparation of input event data is one of the most critical phases in process mining projects. Different frameworks have been developed to offer methodologies and/or supporting toolkits for data preparation. One of these frameworks, called OnProm, relies on sophisticated semantic technologies to extract event logs from relational databases. The toolkit consists of a series of general steps, meant to work on arbitrary, legacy databases. However, in many settings, the input database is not a legacy one but is structured with conceptually understandable object types and relationships that can be effectively employed to support business users in the extraction process. This is, for example, the case for document-driven enterprise systems. In this paper, we focus on this class of systems and propose a guided approach, erprep, to support a group of business and technical users in setting up OnProm with minimal effort. We demonstrate the approach in a real-life use case.
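To make the notion of extracting an event log from a document-driven database concrete, the following is a minimal sketch in plain SQL driven from Python. It does not use OnProm or any ontology-based machinery. The tables `purchase_order` and `invoice`, their columns, and the activity names are hypothetical and chosen only for illustration. Each document row is read as one event, and the purchase order identifier serves as the case identifier.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with two hypothetical document tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE purchase_order (
            id INTEGER PRIMARY KEY,
            created_at TEXT
        );
        CREATE TABLE invoice (
            id INTEGER PRIMARY KEY,
            po_id INTEGER REFERENCES purchase_order(id),
            created_at TEXT
        );
        INSERT INTO purchase_order VALUES (1, '2023-01-02'), (2, '2023-01-03');
        INSERT INTO invoice VALUES (10, 1, '2023-01-05'), (11, 2, '2023-01-04');
        """
    )
    return conn


def extract_event_log(conn):
    """Return (case_id, activity, timestamp) rows, one per document creation.

    Assumption: the purchase order is the case notion, and every invoice
    is attached to its purchase order through the po_id foreign key.
    """
    query = """
        SELECT id AS case_id, 'Create Purchase Order' AS activity, created_at AS ts
        FROM purchase_order
        UNION ALL
        SELECT po_id, 'Create Invoice', created_at
        FROM invoice
        ORDER BY 1, 3  -- sort by case identifier, then by timestamp
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for event in extract_event_log(build_demo_db()):
        print(event)
```

In this sketch the choice of case notion and activity labels is hard-coded in the query. Guiding business and technical users through such choices is the kind of decision the abstract attributes to the proposed approach.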

Notes

  1. The DL-Lite family provides the formal counterpart of the lightweight ontology language OWL 2 QL, standardized by the W3C [15].

  2. See, e.g., the tools developed by Ontopic, https://ontopic.ai/.

  3. History-tables closely relate to the notion of redo logs in databases, previously studied within process mining in [8].

References

  1. Aalst, W.M.P.: Object-centric process mining: dealing with divergence and convergence in event data. In: Ölveczky, P.C., Salaün, G. (eds.) SEFM 2019. LNCS, vol. 11724, pp. 3–25. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30446-1_1

  2. Calvanese, D., et al.: Ontologies and databases: the DL-Lite approach. In: Tessaris, S., et al. (eds.) Reasoning Web 2009. LNCS, vol. 5689, pp. 255–356. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03754-2_7

  3. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: the DL-Lite family. J. Autom. Reason. 39(3), 385–429 (2007)

  4. Calvanese, D., Kalayci, T.E., Montali, M., Santoso, A.: OBDA for Log extraction in process mining. In: Ianni, G., Lembo, D., Bertossi, L., Faber, W., Glimm, B., Gottlob, G., Staab, S. (eds.) Reasoning Web 2017. LNCS, vol. 10370, pp. 292–345. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61033-7_9

  5. Calvanese, D., Kalayci, T.E., Montali, M., Tinella, S.: Ontology-based data access for extracting event logs from legacy data: the onprom tool and methodology. In: Abramowicz, W. (ed.) BIS 2017. LNBIP, vol. 288, pp. 220–236. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59336-4_16

  6. Cohn, D., Hull, R.: Business artifacts: a data-centric approach to modeling business operations and processes. IEEE Bull. Data Eng. 32(3) (2009)

  7. van Eck, M.L., Lu, X., Leemans, S.J.J., van der Aalst, W.M.P.: PM\(^2\): a process mining project methodology. In: Zdravkovic, J., Kirikova, M., Johannesson, P. (eds.) CAiSE 2015. LNCS, vol. 9097, pp. 297–313. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19069-3_19

  8. González López de Murillas, E., Reijers, H.A., van der Aalst, W.M.P.: Connecting databases with process mining: a meta model and toolset. Softw. Syst. Model. 18(2) (2019)

  9. Guarino, N., Welty, C.A.: An overview of OntoClean. In: Staab, S., Studer, R. (eds.) Handbook on Ontologies. International Handbooks on Information Systems, pp. 151–171. Springer, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24750-0_8

  10. Ingvaldsen, J.E., Gulla, J.A.: Preprocessing support for large scale process mining of SAP transactions. In: ter Hofstede, A., Benatallah, B., Paik, H.-Y. (eds.) BPM 2007. LNCS, vol. 4928, pp. 30–41. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78238-4_5

  11. Jans, M., Soffer, P.: From relational database to event log: decisions with quality impact. In: Teniente, E., Weidlich, M. (eds.) BPM 2017. LNBIP, vol. 308, pp. 588–599. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-74030-0_46

  12. Jans, M., Soffer, P., Jouck, T.: Building a valuable event log for process mining: an experimental exploration of a guided process. Enterp. Inf. Syst. 13(5) (2019)

  13. Li, G., de Murillas, E.G.L., de Carvalho, R.M., van der Aalst, W.M.P.: Extracting object-centric event logs to support process mining on databases. In: Mendling, J., Mouratidis, H. (eds.) CAiSE 2018. LNBIP, vol. 317, pp. 182–199. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-92901-9_16

  14. Lu, X., Nagelkerke, M., v. d. Wiel, D., Fahland, D.: Discovering interacting artifacts from ERP systems. IEEE Trans. Serv. Comput. 8(6) (2015)

  15. Motik, B., Cuenca Grau, B., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: OWL 2 Web Ontology Language profiles (second edition). W3C Recommendation, W3C (2012). https://www.w3.org/TR/owl2-profiles/

  16. Mueller-Wickop, N., Schultz, M.: ERP event log preprocessing: timestamps vs. accounting logic. In: vom Brocke, J., Hekkala, R., Ram, S., Rossi, M. (eds.) DESRIST 2013. LNCS, vol. 7939, pp. 105–119. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38827-9_8

  17. Nooijen, E.H.J., van Dongen, B.F., Fahland, D.: Automatic discovery of data-centric and artifact-centric processes. In: La Rosa, M., Soffer, P. (eds.) BPM 2012. LNBIP, vol. 132, pp. 316–327. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36285-9_36

  18. Runeson, P., Höst, M.: Guidelines for conducting and reporting case study research in software engineering. Emp. Softw. Eng. 14(2) (2008)

  19. Shull, F., et al.: Replicating software engineering experiments: addressing the tacit knowledge problem. In: Proceedings of International Symposium on Empirical Software Engineering (2002)

  20. Venable, J., Pries-Heje, J., Baskerville, R.: FEDS: a framework for evaluation in design science research. Eur. J. Inf. Syst. 25(1) (2016)

  21. Xiao, G., et al.: Ontology-based data access: a survey. In: Proceedings of IJCAI (2018)

  22. Xiong, J., Xiao, G., Kalayci, T.E., Montali, M., Gu, Z., Calvanese, D.: Extraction of object-centric event logs through virtual knowledge graphs (extended abstract). In: Proceedings of DL. CEUR, vol. 3263. CEUR-WS.org (2022)

Author information

Corresponding author

Correspondence to Mieke Jans.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Calvanese, D., Jans, M., Kalayci, T.E., Montali, M. (2023). Extracting Event Data from Document-Driven Enterprise Systems. In: Indulska, M., Reinhartz-Berger, I., Cetina, C., Pastor, O. (eds) Advanced Information Systems Engineering. CAiSE 2023. Lecture Notes in Computer Science, vol 13901. Springer, Cham. https://doi.org/10.1007/978-3-031-34560-9_12

  • DOI: https://doi.org/10.1007/978-3-031-34560-9_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34559-3

  • Online ISBN: 978-3-031-34560-9

  • eBook Packages: Computer Science, Computer Science (R0)
