Analysis of Language Inspired Trace Representation for Anomaly Detection

Tavares, Gabriel Marques; Barbon, Sylvio

doi:10.1007/978-3-030-55814-7_25

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1260)

Abstract

Detecting anomalous process instances within business processes is a major concern for organizations. Conformance checking addresses this with model-aware analysis: it compares process logs to business models to detect anomalous process executions. In several scenarios, however, a model is either unavailable or costly to generate, so alternative methods are needed to represent traces with confidence. This work analyzes a language-inspired trace representation grounded in the word2vec encoding algorithm. We argue that natural language encodings correctly model the behavior of business processes, supporting a proper distinction between common and anomalous behavior. In the experiments, we compared accuracy and time cost across different word2vec setups and classic encoding methods (token-based replay and alignment features), addressing seven different anomaly scenarios. We also evaluated feature importance values and the impact of the different anomalies in seven event logs to gain insight into trace representation. Results show that the proposed encoding surpasses the representational capability of traditional conformance metrics for the anomaly detection task.
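
To make the language analogy concrete, the following is a minimal sketch in which each trace is treated as a sentence and each activity as a word. It is not the authors' implementation and it does not use word2vec: as a plainly named stand-in, activity vectors are built from context co-occurrence counts within a window, and a trace is encoded as the mean of its activity vectors. The event log, the activity names, and the helper functions `cooccurrence_vectors` and `encode_trace` are hypothetical and exist only for illustration.

```python
def cooccurrence_vectors(traces, window=2):
    """Map each activity to a vector of context counts.

    Assumption: this is a crude stand-in for a learned word2vec
    embedding; it only counts neighbours within `window` positions.
    """
    vocab = sorted({activity for trace in traces for activity in trace})
    index = {activity: i for i, activity in enumerate(vocab)}
    vectors = {activity: [0.0] * len(vocab) for activity in vocab}
    for trace in traces:
        for pos, activity in enumerate(trace):
            lo = max(0, pos - window)
            hi = min(len(trace), pos + window + 1)
            for ctx in range(lo, hi):
                if ctx != pos:
                    vectors[activity][index[trace[ctx]]] += 1.0
    return vocab, vectors


def encode_trace(trace, vectors, dim):
    """Average the activity vectors of a trace into one fixed-length vector."""
    total = [0.0] * dim
    for activity in trace:
        for i, value in enumerate(vectors[activity]):
            total[i] += value
    return [value / len(trace) for value in total]


# Hypothetical event log: each trace is a "sentence" of activity "words".
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "approve", "pay"],  # hypothetical anomaly: "check" skipped
]

vocab, vectors = cooccurrence_vectors(log, window=2)
encoded = [encode_trace(trace, vectors, len(vocab)) for trace in log]

for trace, vector in zip(log, encoded):
    print(trace, [round(v, 2) for v in vector])
```

Every trace maps to a vector of the same length regardless of how many events it contains, which is what allows the encodings to be passed to a downstream classifier. Note that averaging discards the order of activities within a trace, so this simplified stand-in cannot separate two traces that contain the same activities in a different order.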

Notes

1. https://pm4py.fit.fraunhofer.de/documentation

Author information

Authors and Affiliations

Università degli Studi di Milano (UNIMI), Milan, Italy
Gabriel Marques Tavares
Londrina State University (UEL), Londrina, Brazil
Sylvio Barbon Jr.

Corresponding author

Correspondence to Gabriel Marques Tavares.

Editor information

Editors and Affiliations

ISAE-ENSMA, Poitiers, France
Ladjel Bellatreche
Slovak University of Technology, Bratislava, Slovakia
Mária Bieliková
Université Lumière Lyon 2, Lyon, France
Omar Boussaïd
University of Genova, Genova, Italy
Barbara Catania
Université Lumière Lyon 2, Lyon, France
Jérôme Darmont
Leibniz University of Hannover, Hannover, Niedersachsen, Germany
Elena Demidova
Université Claude Bernard Lyon 1, Lyon, France
Fabien Duchateau
The Open University, Milton Keynes, UK
Mark Hall
University of Ljubljana, Ljubljana, Slovenia
Tanja Merčun
National Research University Higher School of Economics, St. Petersburg, Russia
Boris Novikov
Ionian University, Corfu, Greece
Christos Papatheodorou
Goethe University Frankfurt, Frankfurt am Main, Hessen, Germany
Thomas Risse
Universitat Politècnica de Catalunya, Barcelona, Spain
Oscar Romero
AgroParisTech, Montpellier, France
Lucile Sautot
University of Lyon, Lyon, France
Guilaine Talens
Poznań University of Technology, Poznań, Poland
Robert Wrembel
University of Ljubljana, Ljubljana, Slovenia
Maja Žumer

About this paper

Cite this paper

Tavares, G.M., Barbon, S. (2020). Analysis of Language Inspired Trace Representation for Anomaly Detection. In: Bellatreche, L., et al. ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium. TPDL ADBIS 2020 2020. Communications in Computer and Information Science, vol 1260. Springer, Cham. https://doi.org/10.1007/978-3-030-55814-7_25

DOI: https://doi.org/10.1007/978-3-030-55814-7_25
Published: 18 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55813-0
Online ISBN: 978-3-030-55814-7
eBook Packages: Computer Science, Computer Science (R0)