Data Transformation and Semantic Log Purging for Process Mining

  • Linh Thao Ly
  • Conrad Indiono
  • Jürgen Mangler
  • Stefanie Rinderle-Ma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7328)


Existing process mining approaches can tolerate a certain degree of noise in the process log. However, processes that contain infrequent paths or multiple (nested) parallel branches, or that have been changed in an ad-hoc manner, still pose major challenges. For such cases, process mining typically returns “spaghetti models” that are hardly usable even as a starting point for process (re-)design. In this paper, we address these challenges by introducing data transformation and pre-processing steps that improve and ensure the quality of mined models for existing process mining approaches. We propose the concept of semantic log purging, that is, the cleaning of logs based on domain-specific constraints, utilizing the semantic knowledge that typically complements processes. Furthermore, we demonstrate the feasibility and effectiveness of the approach in a case study in the higher education domain. We think that semantic log purging will enable process mining to yield better results, thus giving process (re-)designers a valuable tool.
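
The idea of purging a log against domain-specific constraints can be illustrated with a minimal sketch. It is not the authors' implementation. The log layout (a mapping from case ID to an ordered list of activity names), the helper names `precedes` and `purge_log`, and the example activities from a course setting are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Trace = List[str]                     # ordered activity names of one case
Constraint = Callable[[Trace], bool]  # True if the trace satisfies the rule


def precedes(first: str, then: str) -> Constraint:
    """Hypothetical constraint: each `then` needs an earlier `first`."""
    def check(trace: Trace) -> bool:
        seen_first = False
        for activity in trace:
            if activity == first:
                seen_first = True
            elif activity == then and not seen_first:
                return False
        return True
    return check


def purge_log(log: Dict[str, Trace],
              constraints: List[Constraint]) -> Dict[str, Trace]:
    """Keep only the traces that satisfy every constraint."""
    return {case: trace for case, trace in log.items()
            if all(constraint(trace) for constraint in constraints)}


# Illustrative data, not taken from the case study.
log = {
    "case-1": ["register", "submit_exercise", "take_exam"],
    "case-2": ["take_exam", "register"],   # exam before registration
    "case-3": ["register", "take_exam"],
}
purged = purge_log(log, [precedes("register", "take_exam")])
```

Under these assumptions, the purged log would then be handed to an existing mining algorithm in place of the raw log. Whether violating traces should be dropped entirely or handled in another way is a design choice that this sketch does not settle.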


Process mining, Data transformation, Log purging, Process constraints



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Linh Thao Ly (1)
  • Conrad Indiono (2)
  • Jürgen Mangler (2)
  • Stefanie Rinderle-Ma (2)

  1. Institute of Databases and Information Systems, Ulm University, Germany
  2. Faculty of Computer Science, University of Vienna, Austria
