Advanced Process Discovery Techniques

Abstract

The α-algorithm nicely illustrates some of the main ideas behind process discovery. However, this simple algorithm is unable to manage the trade-offs involving the four quality dimensions described in Chap. 5 (fitness, simplicity, precision, and generalization). To successfully apply process mining in practice, one needs to deal with noise and incompleteness. This chapter focuses on more advanced process discovery techniques. The goal is not to present one particular technique in detail, but to provide an overview of the most relevant approaches. This will assist the reader in selecting the appropriate process discovery technique. Moreover, insights into the strengths and weaknesses of the various approaches support the correct interpretation and effective use of the discovered models.

Notes

  1. Note that in this chapter we again assume that the event log is simple (like in Chap. 5), because at this stage we still abstract from the other perspectives.

  2. Note that we overload the term “fitness” in this book. On the one hand, we use it to refer to the ability to replay the event log (see Sects. 5.4.3 and 7.2). On the other hand, we use it for the selection of individuals in genetic process mining. Note that the latter interpretation includes the former, but also adds other elements of the four criteria mentioned in Sect. 5.4.3.

Author information

Corresponding author

Correspondence to Wil M. P. van der Aalst.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

van der Aalst, W.M.P. (2011). Advanced Process Discovery Techniques. In: Process Mining. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19345-3_6

  • DOI: https://doi.org/10.1007/978-3-642-19345-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19344-6

  • Online ISBN: 978-3-642-19345-3

  • eBook Packages: Computer Science, Computer Science (R0)
