Semantics-based event log aggregation for process mining and analytics

Information Systems Frontiers

Abstract

In highly complex and flexible environments, event logs tend to exhibit high levels of heterogeneity, and clustering-based methods are candidate techniques for simplifying the process models mined from process observations. To compensate for the information loss that occurs during clustering, semantic information may be extracted from event logs and organized into knowledge structures such as process ontologies using ontology learning methods. In this article, we propose an overall computational framework for event log pre-processing, and then focus on a specific component of the framework, namely event log aggregation. We develop a detailed system architecture for this component, along with an implemented and evaluated research prototype, SemAgg. We use phrase-based semantic similarity between normalized event names to aggregate event logs in a hierarchical form. We discuss the practical implications of this work for learning lower-level process ontology classes as well as for performing further process mining and analytics.
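
The aggregation idea summarized in the abstract (normalize event names, score pairwise similarity, merge hierarchically) can be illustrated with a minimal sketch. This is not the SemAgg implementation: the function names `normalize`, `similarity`, and `aggregate` are hypothetical, the normalization rules are assumed, and a plain Jaccard token overlap stands in for the phrase-based semantic similarity measure, which the abstract does not specify. Average-link merging and the threshold value are likewise assumptions.

```python
import re
from itertools import combinations


def normalize(name):
    """Split an event name into lower-case word tokens (assumed rules)."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return tuple(t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t)


def similarity(a, b):
    """Jaccard overlap of token sets; a stand-in for a semantic measure."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def aggregate(names, threshold=0.3):
    """Average-link agglomerative merging; returns clusters and merge history."""
    tokens = {n: normalize(n) for n in names}
    # Start with one singleton cluster per distinct event name.
    clusters = [frozenset([n]) for n in dict.fromkeys(names)]
    merges = []

    def link(pair):
        # Mean pairwise similarity between the members of two clusters.
        x, y = pair
        total = sum(similarity(tokens[p], tokens[q]) for p in x for q in y)
        return total / (len(x) * len(y))

    while len(clusters) > 1:
        best = max(combinations(clusters, 2), key=link)
        score = link(best)
        if score < threshold:
            break  # Remaining clusters are too dissimilar to merge.
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
        merges.append((sorted(best[0]), sorted(best[1]), round(score, 3)))
    return clusters, merges


names = ["Create_Order", "createOrder", "Approve Order", "Send Invoice"]
clusters, merges = aggregate(names)
for left, right, score in merges:
    print(left, "+", right, "at", score)
```

The merge history records each level of the hierarchy, so higher-level groups can be read off by replaying the merges in order. Swapping `similarity` for a lexical-database measure would change the scores but not the merging loop.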

Author information

Correspondence to Amit V. Deokar.

Appendix

Fig. 10: Event name normalization algorithm

Fig. 11: Similarities between event names from the experiment output

Fig. 12: Similarities between event names generated by domain experts

About this article

Cite this article

Deokar, A.V., Tao, J. Semantics-based event log aggregation for process mining and analytics. Inf Syst Front 17, 1209–1226 (2015). https://doi.org/10.1007/s10796-015-9563-4

  • DOI: https://doi.org/10.1007/s10796-015-9563-4
