Discovering Duplicate Tasks in Transition Systems for the Simplification of Process Models

  • Javier de San Pedro
  • Jordi Cortadella
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9850)


This work presents a set of methods to improve the understandability of process models. Traditionally, simplification methods gain simplicity at the expense of quality metrics such as fitness or precision. In contrast, the methods proposed in this paper produce simplified models while preserving or even increasing fidelity metrics. The first problem addressed in the paper is the discovery of duplicate tasks. A new method is proposed that avoids overfitting by working on the transition system generated from the log. The method is able to discover duplicate tasks even in the presence of concurrency and choice. The second problem is the structural simplification of the model by identifying optional and repetitive tasks. These tasks are replaced by annotated events that allow the removal of silent tasks and reduce the complexity of the model. An important feature of the methods proposed in this paper is that they are independent of the actual miner used for process discovery.
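
As a rough illustration of what a transition system generated from a log can look like, the sketch below builds one from a toy log. The state abstraction (the set of activities seen so far), the function names and the log itself are assumptions made for this example and are not taken from the paper; the paper's actual criterion for deciding which occurrences of a label are duplicates is not reproduced here.

```python
from collections import defaultdict


def build_transition_system(log):
    """Return the set of transitions (source, label, target) induced by a log.

    Assumption for this sketch: a state is the set of activities
    observed so far in the trace (a common, but not the only, abstraction).
    """
    transitions = set()
    for trace in log:
        state = frozenset()
        for activity in trace:
            next_state = state | {activity}
            transitions.add((state, activity, next_state))
            state = next_state
    return transitions


def source_states_by_label(transitions):
    """Group the source states of the transitions by their label."""
    sources = defaultdict(set)
    for source, label, _target in transitions:
        sources[label].add(source)
    return dict(sources)


# Hypothetical toy log: 'b' and 'c' are concurrent, and 'a' occurs
# both at the start and at the end of every trace.
log = [
    ["a", "b", "c", "a"],
    ["a", "c", "b", "a"],
]

transitions = build_transition_system(log)
sources = source_states_by_label(transitions)

for label in sorted(sources):
    print(label, sorted(sorted(s) for s in sources[label]))
```

In this toy log every label fires from two different states: `a` because it genuinely occurs in two different contexts, and `b` and `c` only because they interleave. Counting source states alone therefore cannot separate duplicate tasks from concurrency, which hints at why a method that works in the presence of concurrency and choice needs a more careful analysis of the transition system.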



This work has been partially supported by funds from the Spanish Ministry for Economy and Competitiveness and the European Union (FEDER funds) under grant TIN2013-46181-C2-1-R, and the Generalitat de Catalunya (2014 SGR 1034 and FI-DGR 2015).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Computer Science, Universitat Politècnica de Catalunya, Barcelona, Spain
