An Approach for Incorporating Expert Knowledge in Trace Clustering

  • Pieter De Koninck
  • Klaas Nelissen
  • Bart Baesens
  • Seppe vanden Broucke
  • Monique Snoeck
  • Jochen De Weerdt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10253)


Trace clustering techniques are a set of approaches for partitioning traces, or process instances, into groups of similar traces. Typically, this partitioning is based on certain patterns or similarities between the traces, or is done by discovering a process model for each cluster of traces. In general, however, the clustering solutions obtained by these approaches are likely to be hard to understand or difficult to validate against an expert's domain knowledge. Therefore, we propose a novel semi-supervised trace clustering technique based on expert knowledge. Our approach is validated using a case in tablet reading behaviour, but is widely applicable in other contexts. In an experimental evaluation, the technique is shown to provide a beneficial trade-off between performance and understandability.


  • Trace clustering
  • Process mining
  • Domain knowledge
  • Semi-supervised learning



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Pieter De Koninck (1)
  • Klaas Nelissen (1)
  • Bart Baesens (1)
  • Seppe vanden Broucke (1)
  • Monique Snoeck (1)
  • Jochen De Weerdt (1)
  1. Faculty of Economics and Business, Research Center for Management Informatics, KU Leuven, Leuven, Belgium
