Alignment-Based Trace Clustering

  • Thomas Chatain
  • Josep Carmona
  • Boudewijn van Dongen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10650)

Abstract

This paper presents a novel method for clustering event log traces. In contrast to the approaches in the literature, it assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as the centroids of the detected clusters, computed from a generalization of the notion of alignment. In this way, model explanations of observed behavior drive the computation of the clusters, in contrast to current model-agnostic approaches, which, for example, group log traces merely by their vector-space similarity. We believe alignment-based trace clustering provides results that are more useful for stakeholders. Moreover, in the presence of log incompleteness, noisy logs, or concept drift, it can be more robust in dealing with highly deviating traces. The technique can also be combined with any clustering technique to provide model explanations for the clusters computed. It relies on encoding the individual alignment problems into the (pseudo-)Boolean domain and has been implemented in our tool DarkSider, which uses an open-source solver.
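The idea of using model traces as centroids can be illustrated with a small sketch. This is not the pseudo-Boolean encoding used by the paper or by DarkSider. It is a brute-force illustration that assumes the model's behavior is given as a small, explicit set of traces, that log moves and model moves each cost 1 while synchronous moves cost 0, and that a hypothetical threshold `max_cost` decides when a trace deviates too much to be clustered. All function and parameter names are illustrative.

```python
from typing import Dict, List, Sequence, Tuple

Trace = Tuple[str, ...]


def alignment_cost(log_trace: Sequence[str], model_trace: Sequence[str]) -> int:
    """Cost of an optimal alignment under an assumed unit cost function:
    a log move or a model move costs 1, a synchronous move costs 0."""
    # prev[j] = cost of aligning the empty log prefix with model_trace[:j]
    prev = list(range(len(model_trace) + 1))
    for i, x in enumerate(log_trace, start=1):
        curr = [i]  # aligning log_trace[:i] with the empty model prefix
        for j, y in enumerate(model_trace, start=1):
            best = min(prev[j] + 1,      # log move on x
                       curr[j - 1] + 1)  # model move on y
            if x == y:
                best = min(best, prev[j - 1])  # synchronous move
            curr.append(best)
        prev = curr
    return prev[-1]


def cluster_by_model_trace(
    log: List[Trace], model_traces: List[Trace], max_cost: int
) -> Tuple[Dict[Trace, List[Trace]], List[Trace]]:
    """Assign each log trace to its closest model trace (the centroid).
    Traces whose best cost exceeds max_cost are left unclustered."""
    clusters: Dict[Trace, List[Trace]] = {m: [] for m in model_traces}
    unclustered: List[Trace] = []
    for trace in log:
        best = min(model_traces, key=lambda m: alignment_cost(trace, m))
        if alignment_cost(trace, best) <= max_cost:
            clusters[best].append(trace)
        else:
            unclustered.append(trace)
    return clusters, unclustered


# Hypothetical model behavior and log, for illustration only.
model_traces = [("a", "b", "c"), ("a", "d")]
log = [("a", "b", "c"), ("a", "c"), ("a", "d", "d"), ("x", "y", "z", "w")]
clusters, unclustered = cluster_by_model_trace(log, model_traces, max_cost=2)
```

In this toy setting each cluster is explained by the model trace it is keyed on, and the highly deviating trace is left out rather than forced into a cluster. Enumerating model traces explicitly does not scale to models with loops or heavy concurrency, which is one reason a symbolic encoding is attractive.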

Notes

Acknowledgements

We thank Bart Hompes for providing the clustering results of his tool for the example used in the experiments. This work has been partially supported by funds from the Spanish Ministry for Economy and Competitiveness (MINECO) and the European Union (FEDER funds) under grant COMMAS (ref. TIN2013-46181-C2-1-R).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Thomas Chatain (1)
  • Josep Carmona (2)
  • Boudewijn van Dongen (3)

  1. LSV, ENS Paris-Saclay, CNRS, Inria, Cachan, France
  2. Universitat Politècnica de Catalunya, Barcelona, Spain
  3. Eindhoven University of Technology, Eindhoven, The Netherlands
