Managing Variability of Large Public Administration Event Log Collections: Dealing with Concept Drift

  • Conference paper
Perspectives in Business Informatics Research (BIR 2023)

Abstract

The analysis of large event log collections aimed at variability management requires an intensive pre-processing phase. Obsolete behavior that may be present in the logs must be removed in order to gain insight into the collection. Changes in the information system may generate obsolete behavior; in public administration specifically, a change in the law may imply a change in the process, which must then be updated in the information system. The logs containing the updated behavior can then be used in variability management practices, such as the creation of configurable models. This type of analysis poses numerous challenges, one of which is the difficulty of obtaining an effective representation of the process without producing an excessively complex model. Obsolete behavior results in an unnecessary increase in complexity and should therefore be removed. This paper introduces an event log analysis and visualization technique based on the notion of complexity introduced by Lempel and Ziv. The visualization enables process analysts to identify concept drift in the logs, thereby facilitating the removal of outdated behavior. Furthermore, reaching equilibrium indicates that the behavior is representative of the entire log. Consequently, during variability analysis, it becomes possible to prune the log, reducing computational complexity.
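To make the notion of Lempel-Ziv complexity more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that an event log has been flattened into a single sequence of activity labels (one character per activity here), counts the phrases of the Lempel-Ziv (1976) exhaustive parsing, and evaluates that count on growing prefixes. The function names `lz76_complexity` and `complexity_curve`, the `step` parameter, and the toy sequence are all hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Sequence


def lz76_complexity(seq: Sequence[Hashable]) -> int:
    """Count the phrases in the Lempel-Ziv (1976) exhaustive parsing of seq."""
    s = tuple(seq)
    n = len(s)
    i = 0        # start index of the phrase currently being built
    phrases = 0
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from a position
        # that starts strictly before i (the copy may overlap the phrase).
        while i + length <= n and any(
            s[j:j + length] == s[i:i + length] for j in range(i)
        ):
            length += 1
        phrases += 1
        i += length
    return phrases


def complexity_curve(seq: Sequence[Hashable], step: int = 1) -> List[int]:
    """Complexity of growing prefixes of seq, sampled every `step` symbols."""
    return [lz76_complexity(seq[:k]) for k in range(step, len(seq) + 1, step)]


if __name__ == "__main__":
    # Toy sequence (an assumption, not real log data): a repetitive regime
    # "ab" followed by a different repetitive regime "cd".
    toy = "ab" * 5 + "cd" * 5
    print(complexity_curve(toy, step=5))
```

In this toy setting the curve stays flat while the sequence keeps repeating known patterns and rises when previously unseen patterns appear. That is one plausible way to read a plateau as equilibrium and a jump as a candidate drift point; the paper's actual visualization may differ.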


Change history

  • 30 October 2023

    A correction has been published.

Notes

  1. When talking about obsolete behavior, we mean behavioral patterns that were once part of the process under analysis but can no longer be found in more recent event logs describing that same process.

Acknowledgements

Funded by the European Union - NextGenerationEU - Piano Nazionale di Ripresa e Resilienza, Missione 4 Istruzione e Ricerca - Componente 2 Dalla ricerca all’impresa - Investimento 1.5, ECS_00000041 VITALITY - Innovation, digitalisation and sustainability for the diffused economy in Central Italy. Caterina Luciani’s work has been funded by Maggioli Spa.

Author information

Corresponding author

Correspondence to Marco Piangerelli.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Corradini, F., Luciani, C., Morichetta, A., Piangerelli, M. (2023). Managing Variability of Large Public Administration Event Log Collections: Dealing with Concept Drift. In: Hinkelmann, K., López-Pellicer, F.J., Polini, A. (eds) Perspectives in Business Informatics Research. BIR 2023. Lecture Notes in Business Information Processing, vol 493. Springer, Cham. https://doi.org/10.1007/978-3-031-43126-5_3

  • DOI: https://doi.org/10.1007/978-3-031-43126-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43125-8

  • Online ISBN: 978-3-031-43126-5

  • eBook Packages: Computer Science, Computer Science (R0)
