TraVaG: Differentially Private Trace Variant Generation Using GANs

Rafiei, Majid; Wangelik, Frederik; Pourbafrani, Mahsa; van der Aalst, Wil M. P.

doi:10.1007/978-3-031-33080-3_25

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 476)

Included in the following conference series:

International Conference on Research Challenges in Information Science

Abstract

Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on Generative Adversarial Networks (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation.
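To make the terminology concrete: a trace variant is the ordered sequence of activities observed for a case, and the abstract's "ratio of infrequent variants" concerns how many distinct variants occur only rarely. The following minimal Python sketch only illustrates these two notions. It is not part of TraVaG and provides no privacy guarantee; the function names, the `min_count` threshold, and the toy log are hypothetical choices made for this illustration.

```python
from collections import Counter

def variant_counts(event_log):
    """Map each trace variant (tuple of activities) to its number of cases."""
    return Counter(tuple(trace) for trace in event_log)

def infrequent_variant_ratio(counts, min_count=2):
    """Fraction of distinct variants observed fewer than `min_count` times.

    `min_count` is an illustrative threshold, not a value from the paper.
    """
    if not counts:
        return 0.0
    rare = sum(1 for c in counts.values() if c < min_count)
    return rare / len(counts)

# Toy event log: one list of activity labels per case (invented data).
toy_log = [
    ["register", "triage", "discharge"],
    ["register", "triage", "discharge"],
    ["register", "triage", "admit", "discharge"],
    ["register", "admit", "triage", "discharge"],
]

counts = variant_counts(toy_log)
ratio = infrequent_variant_ratio(counts)
```

In this toy log, two of the three distinct variants occur only once. Logs dominated by such singleton variants are the setting the abstract identifies as hard for conventional differentially private release techniques.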

Notes

1. https://github.com/wangelik/TraVaG/blob/main/supplementary/TraVaG.pdf.
2. Note that other clipping strategies also exist, as highlighted in [22].
3. Note that in [25], TraVaS was already compared with SaCoFa [11] and the benchmark [21] and showed better performance. Here, the benchmark method is included for easier comparison. Moreover, Libra [8] does not take \(\epsilon\) as an input parameter but computes it based on \(\alpha\) as an RDP parameter and its sampling strategy. This makes a comparison based on exact \(\epsilon\) and \(\delta\) parameters very difficult. Nevertheless, an important observation in contrast to TraVaG is that Libra returns an empty log for event logs with many infrequent variants, such as Sepsis when \(\delta \le 10^{-3}\).
4. https://github.com/wangelik/TraVaG/blob/main/supplementary/TraVaG.pdf.
5. https://github.com/wangelik/TraVaG/blob/main/supplementary/metrics.pdf.
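
For orientation, the parameters mentioned in Note 3 can be read against the standard definitions below. These are stated from the general differential privacy literature rather than quoted from the chapter, and the symbol \(\rho\) for the Rényi bound is chosen here for illustration only. A randomized mechanism \(M\) is \((\epsilon, \delta)\)-differentially private if, for all neighbouring inputs \(D, D'\) and all measurable output sets \(S\),

\[ \Pr[M(D) \in S] \le e^{\epsilon} \, \Pr[M(D') \in S] + \delta . \]

A mechanism satisfying Rényi differential privacy of order \(\alpha > 1\) with bound \(\rho\) also satisfies \((\epsilon, \delta)\)-differential privacy for any \(0 < \delta < 1\) with

\[ \epsilon = \rho + \frac{\log(1/\delta)}{\alpha - 1} , \]

as shown in [23]. Under this conversion, a method parameterized by \(\alpha\) produces \(\epsilon\) as a derived quantity instead of accepting it as an input, which is consistent with the comparison difficulty described in Note 3.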

References

1. van der Aalst, W.M.P.: Process Mining - Data Science in Action, 2nd edn. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49851-4
2. Abadi, M., et al.: Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016, pp. 308–318. ACM (2016)
3. Ács, G., Melis, L., Castelluccia, C., Cristofaro, E.D.: Differentially private mixture of generative neural networks. IEEE Trans. Knowl. Data Eng. 31(6), 1109–1121 (2019)
4. Chen, Q., et al.: Differentially private data generative models. CoRR abs/1812.02274 (2018)
5. Cohen, A., Nissim, K.: Towards formalizing the GDPR’s notion of singling out. Proc. Natl. Acad. Sci. USA 117(15), 8344–8352 (2020)
6. van Dongen, B.F., Weber, B., Ferreira, D.R., Weerdt, J.D.: BPI challenge 2013. In: Proceedings of the 3rd Business Process Intelligence Challenge (2013)
7. Dwork, C.: Differential privacy: a survey of results. In: Agrawal, M., Du, D., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 1–19. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-79228-4_1
8. Elkoumy, G., Dumas, M.: Libra: high-utility anonymization of event logs for process mining via subsampling. In: 4th International Conference on Process Mining, ICPM. IEEE (2022)
9. Elkoumy, G., Pankova, A., Dumas, M.: Mine me but don’t single me out: differentially private event logs for process mining. In: 3rd International Conference on Process Mining, ICPM 2021, pp. 80–87. IEEE (2021)
10. EU: EU General Data Protection. OJ L 119(1) (2016)
11. Fahrenkrog-Petersen, S.A., Kabierski, M., Rösel, F., van der Aa, H., Weidlich, M.: SaCoFa: semantics-aware control-flow anonymization for process mining. In: 3rd International Conference on Process Mining, ICPM 2021, Eindhoven, The Netherlands, 31 October–4 November 2021, pp. 72–79. IEEE (2021)
12. Frigerio, L., de Oliveira, A.S., Gomez, L., Duverger, P.: Differentially private generative adversarial networks for time series, continuous, and discrete open data. In: Dhillon, G., Karlsson, F., Hedström, K., Zúquete, A. (eds.) SEC 2019. IAICT, vol. 562, pp. 151–164. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-22312-0_11
13. Goodfellow, I.J., et al.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020)
14. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of Wasserstein GANs. In: Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems (2017)
15. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: 2nd International Conference on Learning Representations, Conference Track Proceedings (2014)
16. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from incomplete event logs. In: Ciardo, G., Kindler, E. (eds.) PETRI NETS 2014. LNCS, vol. 8489, pp. 91–110. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07734-5_6
17. Li, K., Yang, S., Sullivan, T.M., Burd, R.S., Marsic, I.: Generating privacy-preserving process data with deep generative models. CoRR abs/2203.07949 (2022)
18. Liashchynskyi, P., Liashchynskyi, P.: Grid search, random search, genetic algorithm: a big comparison for NAS. CoRR abs/1912.06059 (2019)
19. Lu, Y., Chen, Q., Poon, S.K.: A deep learning approach for repairing missing activity labels in event logs for process mining. Information 13(5), 234 (2022)
20. Mannhardt, F.: Sepsis cases (2016). https://doi.org/10.4121/UUID:915D2BFB-7E84-49AD-A286-DC35F063A460
21. Mannhardt, F., Koschmider, A., Baracaldo, N., Weidlich, M., Michael, J.: Privacy-preserving process mining - differential privacy for event logs. Bus. Inf. Syst. Eng. 61(5), 595–614 (2019)
22. McMahan, H.B., Andrew, G.: A general approach to adding differential privacy to iterative training procedures. CoRR abs/1812.06210 (2018)
23. Mironov, I.: Rényi differential privacy. In: 30th IEEE Computer Security Foundations Symposium, CSF 2017, pp. 263–275. IEEE Computer Society (2017)
24. Rafiei, M., van der Aalst, W.M.P.: Towards quantifying privacy in process mining. In: Leemans, S., Leopold, H. (eds.) ICPM 2020. LNBIP, vol. 406, pp. 385–397. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72693-5_29
25. Rafiei, M., Wangelik, F., van der Aalst, W.M.P.: TraVaS: differentially private trace variant selection for process mining. In: Montali, M., Senderovich, A., Weidlich, M. (eds.) ICPM 2022. LNBIP, vol. 468. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-27815-0_9
26. Tang, J., Korolova, A., Bai, X., Wang, X., Wang, X.: Privacy loss in Apple’s implementation of differential privacy on macOS 10.12. CoRR abs/1709.02753 (2017)
27. Tantipongpipat, U.T., Waites, C., Boob, D., Siva, A.A., Cummings, R.: Differentially private synthetic mixed-type data generation for unsupervised learning. Intell. Decis. Technol. 15(4), 779–807 (2021)

Author information

Authors and Affiliations

Chair of Process and Data Science, RWTH Aachen University, Aachen, Germany
Majid Rafiei, Frederik Wangelik, Mahsa Pourbafrani & Wil M. P. van der Aalst

Corresponding author

Correspondence to Majid Rafiei.

Editor information

Editors and Affiliations

Université Paris 1 Panthéon-Sorbonne, Paris, France
Selmin Nurcan
University of Bergen, Bergen, Norway
Andreas L. Opdahl
University of Essex, Colchester, UK
Haralambos Mouratidis
Ionian University, Corfu, Greece
Aggeliki Tsohou

About this paper

Cite this paper

Rafiei, M., Wangelik, F., Pourbafrani, M., van der Aalst, W.M.P. (2023). TraVaG: Differentially Private Trace Variant Generation Using GANs. In: Nurcan, S., Opdahl, A.L., Mouratidis, H., Tsohou, A. (eds) Research Challenges in Information Science: Information Science and the Connected World. RCIS 2023. Lecture Notes in Business Information Processing, vol 476. Springer, Cham. https://doi.org/10.1007/978-3-031-33080-3_25

DOI: https://doi.org/10.1007/978-3-031-33080-3_25
Published: 23 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33079-7
Online ISBN: 978-3-031-33080-3
eBook Packages: Computer Science, Computer Science (R0)
