Slice, Mine and Dice: Complexity-Aware Automated Discovery of Business Process Models

  • Chathura C. Ekanayake
  • Marlon Dumas
  • Luciano García-Bañuelos
  • Marcello La Rosa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8094)


Automated process discovery techniques aim to extract models from information system logs in order to shed light on the business processes supported by these systems. Existing techniques in this space are effective when applied to relatively small or regular logs, but otherwise generate large, spaghetti-like models. In previous work, trace clustering has been applied in an attempt to reduce the size and complexity of automatically discovered process models. The idea is to split the log into clusters and to discover one model per cluster. The result is a collection of process models, each representing a variant of the business process, as opposed to a single all-encompassing model. Still, models produced in this way may exhibit unacceptably high complexity. In this setting, this paper presents a two-way divide-and-conquer process discovery technique, in which the discovered process models are split on the one hand by variants and on the other hand hierarchically by means of subprocess extraction. The proposed technique allows users to set a desired bound on the complexity of the produced models. Experiments on real-life logs show that the technique produces collections of models that are up to 64% smaller than those extracted under the same complexity bounds by applying existing trace clustering techniques.
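The idea of splitting a log by variants until every resulting model respects a user-defined complexity bound can be illustrated with a minimal sketch. This is not the authors' algorithm: the size proxy (`model_size`, the number of distinct activities in a set of traces) and the splitting rule (halving the sorted list of distinct trace variants) are hypothetical stand-ins chosen for brevity, and the hierarchical subprocess extraction step is not covered at all.

```python
from typing import List, Tuple

Trace = Tuple[str, ...]


def model_size(traces: List[Trace]) -> int:
    """Assumed complexity proxy: number of distinct activities in the traces.

    A real implementation would discover a model and measure, e.g., its node count.
    """
    return len({activity for trace in traces for activity in trace})


def slice_log(traces: List[Trace], max_size: int) -> List[List[Trace]]:
    """Recursively split a log until each cluster meets the size bound.

    Hypothetical splitting rule: sort the distinct variants and cut the list
    in half. A cluster with a single variant cannot be split any further and
    is returned as is, even if it still exceeds the bound.
    """
    variants = sorted(set(traces))
    if model_size(traces) <= max_size or len(variants) <= 1:
        return [traces]
    left_variants = set(variants[: len(variants) // 2])
    left = [t for t in traces if t in left_variants]
    right = [t for t in traces if t not in left_variants]
    return slice_log(left, max_size) + slice_log(right, max_size)


# Toy log with three variants; the activity names are made up.
log = [
    ("a", "b", "c"),
    ("a", "b", "c"),
    ("a", "d", "e"),
    ("x", "y", "z"),
]
clusters = slice_log(log, max_size=3)
for cluster in clusters:
    print(model_size(cluster), cluster)
```

In this sketch the bound only drives the splitting by variants. Lowering `max_size` yields more, smaller clusters, which mirrors the trade-off between the number of models and their individual complexity described above.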





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Chathura C. Ekanayake (1)
  • Marlon Dumas (2)
  • Luciano García-Bañuelos (2)
  • Marcello La Rosa (1)
  1. Queensland University of Technology, Australia
  2. University of Tartu, Estonia
