Process Model Comparison Based on Cophenetic Distance

  • Conference paper
Business Process Management Forum (BPM 2016)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 260)

Abstract

The automated comparison of process models has received increasing attention in the last decade, due to the growing number of process models and repositories and the consequent need to assess similarities between the underlying processes. Current techniques for process model comparison are either structural (based on graph edit distances) or behavioural (based on activity profiles or on the analysis of the execution semantics). There is a gap between the quality of the information provided by these two families: structural techniques may be fast but inaccurate, whilst behavioural techniques are accurate but complex. In this paper we present a novel technique based on a well-known method for comparing labeled trees through the notion of Cophenetic distance. The technique lies between the two families of methods for comparing process models: it has a structural nature, but can provide accurate information on the differences and similarities of two process models. An experimental evaluation on various benchmark sets is reported, which positions the proposed technique as a valuable tool for process model comparison.
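To make the underlying notion concrete, the following is a minimal sketch of the classical cophenetic distance between two leaf-labeled rooted trees, in the sense of reference [5]: each pair of leaves is mapped to the depth of its lowest common ancestor, each single leaf to its own depth, and the distance is the L^p norm of the difference between the two resulting vectors. It is not the adaptation to process models developed in the paper, whose details are not reproduced on this page. The child-to-parent dictionary encoding and all function names are assumptions made for illustration, and the sketch assumes both trees have the same leaf labels and at least two nodes.

```python
from itertools import combinations


def root_and_leaves(parent):
    """Derive the root and the leaf set from a child -> parent mapping."""
    children = set(parent)
    parents = set(parent.values())
    (root,) = parents - children      # the only node without a parent
    leaves = children - parents       # nodes that are nobody's parent
    return root, leaves


def path_to_root(parent, node):
    """Return [node, parent(node), ..., root]."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def cophenetic_vector(parent):
    """Map each leaf pair to the depth of its lowest common ancestor (LCA).

    The entry for (x, x) is the depth of leaf x itself.
    """
    _, leaves = root_and_leaves(parent)

    def depth(node):
        return len(path_to_root(parent, node)) - 1

    vector = {(x, x): depth(x) for x in leaves}
    for a, b in combinations(sorted(leaves), 2):
        ancestors_a = set(path_to_root(parent, a))
        # Walking up from b, the first node that is also above a is the LCA.
        lca = next(n for n in path_to_root(parent, b) if n in ancestors_a)
        vector[(a, b)] = depth(lca)
    return vector


def cophenetic_distance(tree1, tree2, p=1):
    """L^p norm of the difference between the two cophenetic vectors."""
    v1, v2 = cophenetic_vector(tree1), cophenetic_vector(tree2)
    if v1.keys() != v2.keys():
        raise ValueError("both trees must have the same leaf labels")
    return sum(abs(v1[k] - v2[k]) ** p for k in v1) ** (1 / p)


# Hypothetical example trees over the leaves a, b, c (internal names are arbitrary).
t1 = {"x": "r", "c": "r", "a": "x", "b": "x"}   # ((a, b), c)
t2 = {"a": "r", "y": "r", "b": "y", "c": "y"}   # (a, (b, c))

print(cophenetic_distance(t1, t2, p=1))  # 4.0
print(cophenetic_distance(t1, t2, p=2))  # 2.0
```

In this toy case the two trees differ in four vector entries (the depths of a and c, and the LCA depths of the pairs (a, b) and (b, c)), each by one, which gives the values shown. The sketch recomputes root paths on every call for readability; a practical implementation would cache depths.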

Notes

  1. We assume that the problem of matching real activity labels, e.g., when the names of an activity in the models do not match perfectly, is resolved before the techniques of this paper are applied.

  2. We discovered process trees with "Discover a Process Tree using Inductive Miner" (ProM 6.5) and then converted them to Petri nets.

  3. The most common sequence of commands in the dataset is svn-options, svn update, svn -options, which indicates that the users work with an IDE that overwrites the SVN options just to perform an update and then restores the previous state.

  4. Following the semantics of block-structured models in [4], only exclusive ORs are modeled.

  5. Here a strict subtree of T is any subtree that does not contain the root of T.

  6. In all three cases, the Pearson correlation coefficient is above 0.85, with a p-value for the test of non-correlation below \(10^{-12}\).

  7. Recall that the metrics \(d_{\varphi }\), \(d_{ES}\) and \(d_{GED}\) have different scales, which explains the differences in the absolute values each one provides.

References

  1. Armas-Cervantes, A., Baldan, P., Dumas, M., García-Bañuelos, L.: Behavioral comparison of process models based on canonically reduced event structures. In: Sadiq, S., Soffer, P., Völzer, H. (eds.) BPM 2014. LNCS, vol. 8659, pp. 267–282. Springer, Heidelberg (2014)

  2. Arvind, V., Köbler, J., Kuhnert, S., Vasudev, Y.: Approximate graph isomorphism. In: Rovan, B., Sassone, V., Widmayer, P. (eds.) MFCS 2012. LNCS, vol. 7464, pp. 100–111. Springer, Heidelberg (2012)

  3. Becker, M., Laue, R.: A comparative survey of business process similarity measures. Comput. Ind. 63(2), 148–167 (2012)

  4. Buijs, J., van Dongen, B.F., van der Aalst, W.M.P.: A genetic algorithm for discovering process trees. In: 2012 IEEE Congress on Evolutionary Computation (2012)

  5. Cardona, G., Mir, A., Rosselló, F., Rotger, L., Sánchez, D.: Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf. BMC Bioinform. 14(1), 1–13 (2013)

  6. Curran, T., Keller, G., Ladd, A.: SAP R/3 business blueprint: understanding the business process reference model. Prentice-Hall Inc., Upper Saddle River (1997)

  7. Dijkman, R.: Diagnosing differences between business process models. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 261–277. Springer, Heidelberg (2008)

  8. Dijkman, R., Dumas, M., García-Bañuelos, L.: Graph matching algorithms for business process model similarity search. In: Dayal, U., Eder, J., Koehler, J., Reijers, H.A. (eds.) BPM 2009. LNCS, vol. 5701, pp. 48–63. Springer, Heidelberg (2009)

  9. Dijkman, R.M., Dumas, M., García-Bañuelos, L., Käärik, R.: Aligning business process models. In: EDOC 2009, Auckland, New Zealand, 1–4 September 2009, pp. 45–53 (2009)

  10. Dijkman, R.M., Dumas, M., van Dongen, B.F., Käärik, R., Mendling, J.: Similarity of business process models: metrics and evaluation. Inf. Syst. 36(2), 498–516 (2011)

  11. Kumar, R., Talton, J.O., Ahmad, S., Roughgarden, T., Klemmer, S.R.: Flexible tree matching. In: Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011) (2011)

  12. Mena, A.A., Rosselló, F.: Ternary graph isomorphism in polynomial time, after Luks. CoRR, abs/1209.0871 (2012)

  13. Polyvyanyy, A., Weidlich, M., Conforti, R., La Rosa, M., ter Hofstede, A.H.M.: The 4C spectrum of fundamental behavioral relations for concurrent systems. In: Ciardo, G., Kindler, E. (eds.) PETRI NETS 2014. LNCS, vol. 8489, pp. 210–232. Springer, Heidelberg (2014)

  14. Sokal, R.R., Rohlf, F.J.: The comparison of dendrograms by objective methods. Taxon 11(2), 33–40 (1962)

  15. Sun, L., Boztas, S., Horadam, K., Rao, A., Versteeg, S.: Analysis of user behaviour in accessing a source code repository. Technical report, RMIT University and CA Technologies (2013)

  16. Weidlich, M., Mendling, J., Weske, M.: Efficient consistency measurement based on behavioral profiles of process models. IEEE Trans. Softw. Eng. 37(3), 410–429 (2011)

  17. Weidlich, M., Polyvyanyy, A., Mendling, J., Weske, M.: Causal behavioural profiles - efficient computation, applications, and evaluation. Fundam. Inform. 113(3–4), 399–435 (2011)

  18. Yan, Z., Dijkman, R.M., Grefen, P.W.P.J.: Fast business process similarity search. Distrib. Parallel Databases 30(2), 105–144 (2012)


Acknowledgements

This work is funded by the Secretaria de Universitats i Recerca of the Generalitat de Catalunya under the Industrial Doctorate Program 2013DI062, by the European Commission’s 7th Framework Programme project LeanBigData (Agreement 619606), and by the Spanish Ministry for Economy and Competitiveness and the European Union (FEDER funds) under grant COMMAS (ref. TIN2013-46181-C2-1-R).

Author information

Corresponding author

Correspondence to David Sánchez-Charles.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Sánchez-Charles, D., Muntés-Mulero, V., Carmona, J., Solé, M. (2016). Process Model Comparison Based on Cophenetic Distance. In: La Rosa, M., Loos, P., Pastor, O. (eds) Business Process Management Forum. BPM 2016. Lecture Notes in Business Information Processing, vol 260. Springer, Cham. https://doi.org/10.1007/978-3-319-45468-9_9
