OBDA for Log Extraction in Process Mining

  • Diego Calvanese
  • Tahir Emre Kalayci
  • Marco Montali
  • Ario Santoso
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10370)


Process mining is an emerging area that synergistically combines model-based and data-oriented analysis techniques to obtain useful insights into how business processes are executed within an organization. Through process mining, decision makers can discover process models from data, compare expected and actual behaviors, and enrich models with key information about their actual execution. To be applicable, process mining techniques require the input data to be explicitly structured in the form of an event log, which lists when and by whom different case objects (i.e., process instances) have been subject to the execution of tasks. Unfortunately, in many real-world settings, such event logs are not explicitly given, but are instead implicitly represented in legacy information systems. To apply process mining in this widespread setting, there is a pressing need for techniques that support the various process stakeholders in data preparation and log extraction from legacy information systems. The purpose of this paper is to single out this challenging, open issue, and to introduce didactically how techniques from intelligent data management, in particular ontology-based data access, provide a viable solution with a solid theoretical basis.
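To make the notion of an event log concrete, the following minimal Python sketch shows how events that are only implicit in a legacy relational table can be lifted into cases, activities, resources, and timestamps. It is an illustration under stated assumptions, not the authors' method or tooling: the `orders` table, its columns, the activity names, and the helper functions are all hypothetical, and a plain SQL query stands in for the ontology and mappings that an ontology-based data access setting would use.

```python
import sqlite3
from collections import defaultdict


def build_legacy_db():
    """Create a toy in-memory 'legacy' database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id TEXT PRIMARY KEY, created_by TEXT, created_at TEXT, "
        "shipped_by TEXT, shipped_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            ("o1", "alice", "2017-01-02T09:00:00", "bob", "2017-01-03T15:30:00"),
            ("o2", "carol", "2017-01-02T10:15:00", None, None),  # not shipped yet
        ],
    )
    return conn


# Each row of `orders` implicitly records up to two events. This query makes
# them explicit as (case, activity, resource, timestamp) tuples. It is a
# hand-written stand-in for a declarative mapping, chosen for illustration.
EVENT_QUERY = """
SELECT order_id AS case_id, 'create order' AS activity,
       created_by AS resource, created_at AS ts
FROM orders
UNION ALL
SELECT order_id, 'ship order', shipped_by, shipped_at
FROM orders
WHERE shipped_at IS NOT NULL
ORDER BY case_id, ts
"""


def extract_event_log(conn):
    """Group the extracted events into one time-ordered trace per case."""
    log = defaultdict(list)
    for case_id, activity, resource, ts in conn.execute(EVENT_QUERY):
        log[case_id].append(
            {"activity": activity, "resource": resource, "timestamp": ts}
        )
    return dict(log)


if __name__ == "__main__":
    for case_id, trace in extract_event_log(build_legacy_db()).items():
        print(case_id, [(e["activity"], e["resource"]) for e in trace])
```

In this sketch the extraction logic is tied directly to the table layout; the approach described in the abstract instead places a conceptual layer between the stakeholder and the legacy schema, so that the log is specified in domain terms rather than in terms of tables and columns.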


Process mining, Ontology-based data access, Event log extraction, Relational database management systems



This research has been partially supported by the Euregio IPN12 “KAOS: Knowledge-Aware Operational Support” project, which is funded by the “European Region Tyrol-South Tyrol-Trentino” (EGTC) under the first call for basic research projects, and by the UNIBZ internal project “OnProm (ONtology-driven PROcess Mining)”. We thank Wil van der Aalst for the interesting discussions and insights on the problem of extracting event logs from legacy information systems.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Diego Calvanese (1)
  • Tahir Emre Kalayci (1)
  • Marco Montali (1)
  • Ario Santoso (1)
  1. KRDB Research Centre for Knowledge and Data, Free University of Bozen-Bolzano, Bolzano, Italy
