A repairing missing activities approach with succession relation for event logs

Abstract

In process mining, techniques typically assume that event logs continuously record the occurrence of events and contain all event data. However, in settings such as IoT systems, data transmission may fail because of weak signals or resource competition, leaving the company’s information system unable to keep a complete event log. A process model discovered from an incomplete event log with existing process mining techniques deviates from the actual business process to some degree. In this paper, we propose a method for repairing missing activities based on the succession relation of activities in event logs. We represent the event log as an activity relation matrix and cluster it. The number of traces in each cluster is used as a measure when calculating the similarity between incomplete traces and the clustering results. Parallel activities are taken into account when selecting the pre-occurrence and post-occurrence activities of missing activities from incomplete traces. Experimental results on real-life event logs show that our approach performs better than the previous method in repairing missing activities.
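To make the idea of a succession-based activity relation matrix concrete, the following minimal Python sketch may help. It is not the method of the paper, whose exact matrix definition, clustering step, and similarity measure are not given in this abstract. It assumes that the relation is the plain "directly follows" count between activities, and the names `succession_matrix` and `candidate_fills` are hypothetical helpers introduced only for illustration.

```python
def succession_matrix(traces):
    """Count how often activity a is directly followed by activity b.

    Assumption: "succession relation" is read here as the directly-follows
    relation; the paper may define its matrix differently.
    """
    activities = sorted({act for trace in traces for act in trace})
    index = {act: i for i, act in enumerate(activities)}
    matrix = [[0] * len(activities) for _ in activities]
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            matrix[index[a]][index[b]] += 1
    return activities, matrix


def candidate_fills(prev_act, next_act, activities, matrix):
    """Rank activities x that could sit between prev_act and next_act.

    Hypothetical scoring: the weaker of the two counts prev -> x and
    x -> next. Activities with no support on either side are dropped.
    """
    index = {act: i for i, act in enumerate(activities)}
    scored = []
    for x in activities:
        score = min(matrix[index[prev_act]][index[x]],
                    matrix[index[x]][index[next_act]])
        if score > 0:
            scored.append((x, score))
    # Highest support first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy log (made up for illustration): three complete traces.
log = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "b", "c", "d"]]
acts, m = succession_matrix(log)
# Which activity most plausibly went missing between "a" and "c"?
print(candidate_fills("a", "c", acts, m))
```

In this sketch, parallel activities such as `b` and `c` in the toy log show up as nonzero counts in both directions of the matrix, which is one reason a repair method has to treat them with care when choosing the activities before and after a gap.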

Corresponding author

Correspondence to Jiuyun Xu.

Cite this article

Liu, J., Xu, J., Zhang, R. et al. A repairing missing activities approach with succession relation for event logs. Knowl Inf Syst 63, 477–495 (2021). https://doi.org/10.1007/s10115-020-01524-6

Keywords

  • Process mining
  • Information system
  • Activity relation matrix
  • Incomplete event logs