LogRank: An Approach to Sample Business Process Event Log for Efficient Discovery

  • Cong Liu
  • Yulong Pei
  • Qingtian Zeng
  • Hua Duan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11061)

Abstract

Modern information systems can collect considerable amounts of business process event logs. Process discovery aims to uncover a process model from an event log. Many process discovery approaches have been proposed; however, most of them have difficulty handling large-scale event logs. Motivated by PageRank, in this paper we propose LogRank, a graph-based ranking model for event log sampling. Using LogRank, a large-scale event log can be sampled down to a smaller size that existing discovery approaches can handle efficiently. Moreover, we introduce an approach to measure the quality of a sample log with respect to the original one from a discovery perspective. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Experimental analyses with both synthetic and real-life event logs demonstrate that the proposed sampling approach provides an effective solution for improving process discovery efficiency while ensuring high quality of the discovered model.

Keywords

LogRank, Log sampling, Process discovery, Quality measure

Notes

Acknowledgement

This work was supported in part by the NSFC under Grant 61472229, Grant 61602279, Grant 71704096, and Grant 31671588, in part by the Science and Technology Development Fund of Shandong Province of China under Grant 2016ZDJS02A11, Grant 2014GGX101035, and Grant ZR2017MF027, in part by the Taishan Scholar Climbing Program of Shandong Province, and in part by the SDUST Research Fund under Grant 2015TDJH102.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Shandong University of Science and Technology, Qingdao, China
  2. Eindhoven University of Technology, Eindhoven, The Netherlands
