
Understanding Spaghetti Models with Sequence Clustering for ProM

  • Conference paper
Business Process Management Workshops (BPM 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 43)


The goal of process mining is to discover process models from event logs. However, for processes that are not well structured and exhibit a lot of diverse behavior, existing process mining techniques generate highly complex models that are often difficult to understand; these are called spaghetti models. One way to understand these models is to divide the log into clusters so that reduced sets of cases can be analyzed separately. However, the amount of noise and ad-hoc behavior present in real-world logs still poses a problem, as this type of behavior interferes with the clustering and complicates the models of the generated clusters, hindering the discovery of patterns. In this paper we present an approach that aims at overcoming these difficulties by extracting only the useful data and presenting it in an understandable manner. The solution has been implemented in ProM and is divided into two stages: preprocessing and sequence clustering. We illustrate the approach in a case study where it becomes possible to identify behavioral patterns even in the presence of very diverse and confusing behavior.
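The two stages named in the abstract can be illustrated with a minimal sketch. It is not the ProM plug-in: the abstract does not specify the filtering rules or the clustering model, so the sketch assumes (a) preprocessing that drops activities below a relative frequency threshold, and (b) clusters modelled as first-order Markov chains, fitted with a simplified hard-assignment loop rather than full EM. All function names, the `min_event_freq` and `alpha` parameters, and the toy log are hypothetical.

```python
from collections import Counter, defaultdict
import math

START, END = "<start>", "<end>"  # artificial boundary states (assumption)


def preprocess(log, min_event_freq=0.05):
    """Drop rare activities, then drop traces left empty (assumed filter)."""
    counts = Counter(a for trace in log for a in trace)
    total = sum(counts.values())
    keep = {a for a, c in counts.items() if c / total >= min_event_freq}
    filtered = [[a for a in trace if a in keep] for trace in log]
    return [t for t in filtered if t]


def fit_chain(traces, states, alpha=0.1):
    """Estimate smoothed first-order transition probabilities."""
    counts = defaultdict(Counter)
    for t in traces:
        path = [START] + t + [END]
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1
    targets = states + [END]
    chain = {}
    for a in [START] + states:
        denom = sum(counts[a].values()) + alpha * len(targets)
        chain[a] = {b: (counts[a][b] + alpha) / denom for b in targets}
    return chain


def log_likelihood(trace, chain):
    """Log-probability of a trace under one cluster's Markov chain."""
    path = [START] + trace + [END]
    return sum(math.log(chain[a][b]) for a, b in zip(path, path[1:]))


def cluster(log, k=2, iterations=20):
    """Assign each trace to the chain that explains it best, then refit."""
    states = sorted({a for t in log for a in t})
    assign = [i % k for i in range(len(log))]  # deterministic round-robin start
    for _ in range(iterations):
        chains = [
            fit_chain([t for t, c in zip(log, assign) if c == j], states)
            for j in range(k)
        ]
        new = [
            max(range(k), key=lambda j: log_likelihood(t, chains[j]))
            for t in log
        ]
        if new == assign:  # converged: no trace changed cluster
            break
        assign = new
    return assign


# Toy log: two opposite behaviors plus one rare, noisy activity "x".
raw = [list("abc"), list("axbc"), list("cba"),
       list("cba"), list("abc"), list("cba")]
clean = preprocess(raw, min_event_freq=0.1)
labels = cluster(clean, k=2)
```

On this toy log the filter removes the single occurrence of `x`, and the loop separates the `a, b, c` traces from the `c, b, a` traces. Each cluster's transition table can then be read as a small, readable model of one behavioral pattern, which is the kind of result the abstract describes.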




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Veiga, G.M., Ferreira, D.R. (2010). Understanding Spaghetti Models with Sequence Clustering for ProM. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds) Business Process Management Workshops. BPM 2009. Lecture Notes in Business Information Processing, vol 43. Springer, Berlin, Heidelberg.


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12185-2

  • Online ISBN: 978-3-642-12186-9

  • eBook Packages: Computer Science, Computer Science (R0)
