A Distance Measure for Privacy-Preserving Process Mining Based on Feature Learning

  • Conference paper
Business Process Management Workshops (BPM 2021)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 436)

Abstract

To enable process analysis based on an event log without compromising the privacy of individuals involved in process execution, a log may be anonymized. Such anonymization strives to transform a log so that it satisfies provable privacy guarantees, while largely maintaining its utility for process analysis. Existing techniques perform anonymization using simple, syntactic measures to identify suitable transformation operations. This way, the semantics of the activities referenced by the events in a trace are neglected, potentially leading to transformations in which events of unrelated activities are merged. To avoid this and account for the semantics of activities during anonymization, we propose to instead use a distance measure based on feature learning. Specifically, we show how embeddings of events enable the definition of a distance measure for traces to guide event log anonymization. Our experiments with real-world data indicate that anonymization using this measure, compared to a syntactic one, yields logs that are closer to the original log in various dimensions and, hence, have higher utility for process analysis.
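
To make the idea of an embedding-based trace distance more concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a weighted edit distance in which the cost of substituting one activity for another is the cosine distance between their embedding vectors. The activity names, the two-dimensional vectors in `EMBEDDINGS`, and the `indel_cost` parameter are all hypothetical; in practice the vectors would be learned from the event log.

```python
import math

# Hypothetical activity embeddings (assumption: hand-set 2-d vectors for
# illustration; real embeddings would be learned from the event log).
EMBEDDINGS = {
    "register": (1.0, 0.0),
    "triage": (0.9, 0.1),
    "invoice": (0.0, 1.0),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def substitution_cost(a, b):
    """Cost of replacing activity a by activity b (0 if identical)."""
    if a == b:
        return 0.0
    return cosine_distance(EMBEDDINGS[a], EMBEDDINGS[b])


def trace_distance(t1, t2, indel_cost=1.0):
    """Weighted edit distance between two traces (lists of activity names).

    Insertions and deletions cost indel_cost (an assumed constant);
    substitutions cost the embedding distance of the two activities.
    """
    n, m = len(t1), len(t2)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,  # delete t1[i-1]
                d[i][j - 1] + indel_cost,  # insert t2[j-1]
                d[i - 1][j - 1] + substitution_cost(t1[i - 1], t2[j - 1]),
            )
    return d[n][m]


base = ["register", "triage"]
near = trace_distance(base, ["register", "register"])
far = trace_distance(base, ["register", "invoice"])
print(near < far)
```

Under these toy vectors, a purely syntactic edit distance would rate both alternative traces as equally far from `base` (one substitution each), whereas the embedding-weighted variant rates the trace that swaps in a similar activity as closer.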

Notes

  1. https://github.com/roeselfa/FeatureLearningBasedDistanceMetrics

Author information

Corresponding author

Correspondence to Stephan A. Fahrenkrog-Petersen.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Rösel, F., Fahrenkrog-Petersen, S.A., van der Aa, H., Weidlich, M. (2022). A Distance Measure for Privacy-Preserving Process Mining Based on Feature Learning. In: Marrella, A., Weber, B. (eds) Business Process Management Workshops. BPM 2021. Lecture Notes in Business Information Processing, vol 436. Springer, Cham. https://doi.org/10.1007/978-3-030-94343-1_6

  • DOI: https://doi.org/10.1007/978-3-030-94343-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94342-4

  • Online ISBN: 978-3-030-94343-1

  • eBook Packages: Computer Science, Computer Science (R0)
