Multi-objective Trace Clustering: Finding More Balanced Solutions

  • Pieter De Koninck
  • Jochen De Weerdt
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 281)


In recent years, a multitude of techniques has been proposed for the task of clustering traces. In general, these techniques optimize their solution with respect to one of three criteria: a direct similarity between traces, such as the number of insertions and deletions needed to transform one trace into another; a similarity in a vector space model onto which the traces are mapped, based on certain patterns in each trace; or the quality of a process model discovered from each cluster. Currently, the main technique of the last category, ActiTraC, constructs its clusters based on a single objective: fitness. However, a typical view in process discovery is that one needs to balance fitness, generalization, precision and simplicity. A multi-objective approach to trace clustering is therefore deemed more appropriate. This paper gives a thorough overview of current trace clustering techniques and of potential approaches to multi-objective trace clustering, and it proposes a multi-objective trace clustering technique. The proposed technique is shown to provide unique results on a number of real-life event logs, which validates the need for it.
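
The first similarity notion mentioned above, the number of insertions and deletions needed to transform one trace into another, can be illustrated with a minimal Python sketch. It assumes that a trace is a plain list of activity labels. The function name indel_distance and the example traces are hypothetical; they are not taken from the paper or from any of the techniques it surveys.

```python
def indel_distance(trace_a, trace_b):
    """Minimum number of insertions and deletions turning trace_a into trace_b.

    Assumption: a trace is a sequence of activity labels. The distance equals
    len(trace_a) + len(trace_b) - 2 * LCS, where LCS is the length of the
    longest common subsequence of the two traces.
    """
    # prev[j] holds the LCS length of the processed prefix of trace_a
    # and the first j events of trace_b.
    prev = [0] * (len(trace_b) + 1)
    for event_a in trace_a:
        curr = [0]
        for j, event_b in enumerate(trace_b, start=1):
            if event_a == event_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    lcs_length = prev[-1]
    return len(trace_a) + len(trace_b) - 2 * lcs_length


# Hypothetical traces: the second one skips the "check" activity.
t1 = ["register", "check", "approve", "pay"]
t2 = ["register", "approve", "pay"]
print(indel_distance(t1, t2))
```

A distance of this kind compares traces directly and says nothing about the quality of the process model discovered from a cluster, which is the contrast the abstract draws with model-driven techniques such as ActiTraC.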


Trace clustering · Process mining · Process model quality · Multi-objective learning


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Economics and Business, Research Center for Management Informatics, KU Leuven, Leuven, Belgium
