Repairing Outlier Behaviour in Event Logs

  • Mohammadreza Fani Sani
  • Sebastiaan J. van Zelst
  • Wil M. P. van der Aalst
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 320)


One of the main challenges in applying process mining to real event data is the presence of noise and rare behaviour. Applying process mining algorithms directly to raw event data typically results in complex, incomprehensible and, in some cases, even inaccurate analyses. As a result, correct and/or important behaviour may be concealed. In this paper, we propose an event data repair method that tries to detect and repair outlier behaviour within the given event data. The method is probabilistic and is based on the occurrence frequency of activities in specific contexts. Our approach allows for the removal of infrequent behaviour, which enables us to obtain a more global view of the process. The proposed method has been implemented in both the ProM and the RapidProM frameworks. Using these implementations, we conduct a collection of experiments that show that we are able to detect and modify most types of outlier behaviour in the event data. Our evaluation demonstrates that repairing event logs upfront helps to improve process discovery results.
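To make the idea of context-based repair concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes that a context is simply the pair of directly neighbouring activities, that an activity counts as an outlier when its relative frequency within that context falls below a threshold, and that repair means substituting the most frequent activity seen in the same context. The function names, the `<start>`/`<end>` padding markers, and the threshold value are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical padding markers


def context_counts(log):
    """Count how often each activity occurs between a given pair of neighbours."""
    counts = defaultdict(Counter)
    for trace in log:
        padded = [START] + list(trace) + [END]
        for i in range(1, len(padded) - 1):
            context = (padded[i - 1], padded[i + 1])
            counts[context][padded[i]] += 1
    return counts


def repair_log(log, threshold=0.1):
    """Replace activities that are rare in their context (assumed rule).

    An activity is treated as an outlier when its relative frequency in
    its (left neighbour, right neighbour) context is below `threshold`;
    it is then replaced by the most frequent activity for that context.
    Contexts are always read from the original, unrepaired trace.
    """
    counts = context_counts(log)
    repaired = []
    for trace in log:
        padded = [START] + list(trace) + [END]
        new_trace = []
        for i in range(1, len(padded) - 1):
            context = (padded[i - 1], padded[i + 1])
            total = sum(counts[context].values())
            probability = counts[context][padded[i]] / total
            if probability < threshold:
                new_trace.append(counts[context].most_common(1)[0][0])
            else:
                new_trace.append(padded[i])
        repaired.append(new_trace)
    return repaired


# Toy log: 19 traces follow a-b-c, one trace contains the rare activity x.
toy_log = [["a", "b", "c"]] * 19 + [["a", "x", "c"]]
repaired_log = repair_log(toy_log, threshold=0.1)
print(repaired_log[-1])
```

In this toy log, `x` occurs in only 1 of the 20 traces that share the context (`a`, `c`), so under the assumed threshold it is replaced by `b`. The trace is repaired rather than discarded, which is the distinction between log repair and log filtering.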


Process mining, Data cleansing, Log repair, Event log preprocessing, Conditional probability, Outlier detection



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Mohammadreza Fani Sani (1)
  • Sebastiaan J. van Zelst (2)
  • Wil M. P. van der Aalst (1)

  1. Process and Data Science Chair, RWTH Aachen University, Aachen, Germany
  2. Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
