Data Mining and Knowledge Discovery, Volume 13, Issue 1, pp 67–87

A Rule-Based Approach for Process Discovery: Dealing with Noise and Imbalance in Process Logs

  • Laura Măruşter
  • A. J. M. M. (TON) Weijters
  • Wil M. P. Van Der Aalst
  • Antal Van Den Bosch


Effective information systems require the existence of explicit process models. A completely specified process design needs to be developed in order to enact a given business process. This development is time-consuming and often subjective and incomplete. We propose a method that constructs the process model from process log data by determining the relations between process tasks. To predict these relations, we employ a machine learning technique to induce rule sets. These rule sets are induced from simulated process log data generated by varying process characteristics such as noise and log size. Tests reveal that the induced rule sets have a high predictive accuracy on new data. The effects of noise and of imbalance in execution priorities on the discovery of the relations between process tasks are also discussed. Once the causal, exclusive, and parallel relations are known, a process model expressed in the Petri net formalism can be built. We illustrate our approach with real-world data in a case study.
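
To make the idea of "determining the relations between process tasks" concrete, the following is a minimal Python sketch. It is not the method of the paper: instead of induced rule sets it uses a plain direct-succession test in the spirit of the α-algorithm, and it has no noise handling at all, so a single spurious event can flip a relation (which is the weakness the learned rules are meant to address). The function names, the relation labels, and the example log are all hypothetical. Pairs of tasks that never follow each other directly are labelled `no_direct_succession`; exclusive tasks fall into this class, but so do tasks that are simply far apart in the process.

```python
from collections import Counter
from itertools import combinations


def direct_succession_counts(log):
    """Count how often task a is immediately followed by task b over all traces."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def classify_relations(log):
    """Label each pair of tasks from direct-succession evidence only.

    Assumption: the log is noise-free, so one observation is enough.
    """
    counts = direct_succession_counts(log)
    tasks = sorted({task for trace in log for task in trace})
    relations = {}
    for a, b in combinations(tasks, 2):
        a_then_b = counts[(a, b)] > 0
        b_then_a = counts[(b, a)] > 0
        if a_then_b and b_then_a:
            relations[(a, b)] = "parallel"   # seen in both orders
        elif a_then_b:
            relations[(a, b)] = "causal"     # a directly precedes b
        elif b_then_a:
            relations[(b, a)] = "causal"     # b directly precedes a
        else:
            relations[(a, b)] = "no_direct_succession"
    return relations


# Hypothetical log: B and C run in parallel; E is an alternative to both.
example_log = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "E", "D"],
]

if __name__ == "__main__":
    for pair, label in sorted(classify_relations(example_log).items()):
        print(pair, label)
```

On a real log, the hard thresholds `> 0` would typically be replaced by frequency-based measures or a learned classifier, which is where noise and imbalance become relevant.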


rule induction, process mining, knowledge discovery, Petri nets



We would like to thank Dr. Christine Pelletier (University of Groningen) for her valuable comments and remarks during the review of our paper.



Copyright information

© Springer Science + Business Media, LLC 2006

Authors and Affiliations

  • Laura Măruşter (1)
  • A. J. M. M. (TON) Weijters (2)
  • Wil M. P. Van Der Aalst (2)
  • Antal Van Den Bosch (3)

  1. University of Groningen, Groningen, Netherlands
  2. Eindhoven University of Technology, Eindhoven, Netherlands
  3. Tilburg University, Tilburg, Netherlands
