Abstract
The preparation of input event data is one of the most critical phases in process mining projects. Different frameworks have been developed to offer methodologies and/or supporting toolkits for data preparation. One of these frameworks, called OnProm, relies on sophisticated semantic technologies to extract event logs from relational databases. The toolkit consists of a series of general steps, meant to work on arbitrary, legacy databases. However, in many settings, the input database is not a legacy one but is structured with conceptually understandable object types and relationships that can be effectively employed to support business users in the extraction process. This is, for example, the case for document-driven enterprise systems. In this paper, we focus on this class of systems and propose a guided approach, erprep, to support a group of business and technical users in setting up OnProm with minimal effort. We demonstrate the approach in a real-life use case.
Notes
1. The DL-Lite family provides the formal counterpart of the lightweight ontology language OWL 2 QL, standardized by the W3C [15].
2. See, e.g., the tools developed by Ontopic, https://ontopic.ai/.
3. History-tables closely relate to the notion of redo logs in databases, previously studied within process mining in [8].
References
van der Aalst, W.M.P.: Object-centric process mining: dealing with divergence and convergence in event data. In: Ölveczky, P.C., Salaün, G. (eds.) SEFM 2019. LNCS, vol. 11724, pp. 3–25. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30446-1_1
Calvanese, D., et al.: Ontologies and databases: the DL-Lite approach. In: Tessaris, S., et al. (eds.) Reasoning Web 2009. LNCS, vol. 5689, pp. 255–356. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03754-2_7
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: the DL-Lite family. J. Autom. Reason. 39(3), 385–429 (2007)
Calvanese, D., Kalayci, T.E., Montali, M., Santoso, A.: OBDA for Log extraction in process mining. In: Ianni, G., Lembo, D., Bertossi, L., Faber, W., Glimm, B., Gottlob, G., Staab, S. (eds.) Reasoning Web 2017. LNCS, vol. 10370, pp. 292–345. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61033-7_9
Calvanese, D., Kalayci, T.E., Montali, M., Tinella, S.: Ontology-based data access for extracting event logs from legacy data: the onprom tool and methodology. In: Abramowicz, W. (ed.) BIS 2017. LNBIP, vol. 288, pp. 220–236. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59336-4_16
Cohn, D., Hull, R.: Business artifacts: a data-centric approach to modeling business operations and processes. IEEE Bull. Data Eng. 32(3) (2009)
van Eck, M.L., Lu, X., Leemans, S.J.J., van der Aalst, W.M.P.: PM\(^2\): a process mining project methodology. In: Zdravkovic, J., Kirikova, M., Johannesson, P. (eds.) CAiSE 2015. LNCS, vol. 9097, pp. 297–313. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-19069-3_19
González López de Murillas, E., Reijers, H.A., van der Aalst, W.M.P.: Connecting databases with process mining: a meta model and toolset. Softw. Syst. Model. 18(2) (2019)
Guarino, N., Welty, C.A.: An overview of OntoClean. In: Staab, S., Studer, R. (eds.) Handbook on Ontologies. International Handbooks on Information Systems, pp. 151–171. Springer, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24750-0_8
Ingvaldsen, J.E., Gulla, J.A.: Preprocessing support for large scale process mining of SAP transactions. In: ter Hofstede, A., Benatallah, B., Paik, H.-Y. (eds.) BPM 2007. LNCS, vol. 4928, pp. 30–41. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78238-4_5
Jans, M., Soffer, P.: From relational database to event log: decisions with quality impact. In: Teniente, E., Weidlich, M. (eds.) BPM 2017. LNBIP, vol. 308, pp. 588–599. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-74030-0_46
Jans, M., Soffer, P., Jouck, T.: Building a valuable event log for process mining: an experimental exploration of a guided process. Enterp. Inf. Syst. 13(5) (2019)
Li, G., de Murillas, E.G.L., de Carvalho, R.M., van der Aalst, W.M.P.: Extracting object-centric event logs to support process mining on databases. In: Mendling, J., Mouratidis, H. (eds.) CAiSE 2018. LNBIP, vol. 317, pp. 182–199. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-92901-9_16
Lu, X., Nagelkerke, M., v. d. Wiel, D., Fahland, D.: Discovering interacting artifacts from ERP systems. IEEE Trans. Serv. Comput. 8(6) (2015)
Motik, B., Cuenca Grau, B., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: OWL 2 Web Ontology Language profiles (second edition). W3C Recommendation, W3C (2012). https://www.w3.org/TR/owl2-profiles/
Mueller-Wickop, N., Schultz, M.: ERP event log preprocessing: timestamps vs. accounting logic. In: vom Brocke, J., Hekkala, R., Ram, S., Rossi, M. (eds.) DESRIST 2013. LNCS, vol. 7939, pp. 105–119. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38827-9_8
Nooijen, E.H.J., van Dongen, B.F., Fahland, D.: Automatic discovery of data-centric and artifact-centric processes. In: La Rosa, M., Soffer, P. (eds.) BPM 2012. LNBIP, vol. 132, pp. 316–327. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36285-9_36
Runeson, P., Höst, M.: Guidelines for conducting and reporting case study research in software engineering. Emp. Softw. Eng. 14(2) (2008)
Shull, F., et al.: Replicating software engineering experiments: addressing the tacit knowledge problem. In: Proceedings of International Symposium on Empirical Software Engineering (2002)
Venable, J., Pries-Heje, J., Baskerville, R.: FEDS: a framework for evaluation in design science research. Eur. J. Inf. Syst. 25(1) (2016)
Xiao, G., et al.: Ontology-based data access: a survey. In: Proceedings of IJCAI (2018)
Xiong, J., Xiao, G., Kalayci, T.E., Montali, M., Gu, Z., Calvanese, D.: Extraction of object-centric event logs through virtual knowledge graphs (extended abstract). In: Proceedings of DL. CEUR, vol. 3263. CEUR-WS.org (2022)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Calvanese, D., Jans, M., Kalayci, T.E., Montali, M. (2023). Extracting Event Data from Document-Driven Enterprise Systems. In: Indulska, M., Reinhartz-Berger, I., Cetina, C., Pastor, O. (eds) Advanced Information Systems Engineering. CAiSE 2023. Lecture Notes in Computer Science, vol 13901. Springer, Cham. https://doi.org/10.1007/978-3-031-34560-9_12
DOI: https://doi.org/10.1007/978-3-031-34560-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34559-3
Online ISBN: 978-3-031-34560-9
eBook Packages: Computer Science, Computer Science (R0)