Analysis of Language Inspired Trace Representation for Anomaly Detection

  • Conference paper
ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium (TPDL 2020, ADBIS 2020)

Abstract

A major concern for organizations is detecting anomalous process instances within their business processes. Conformance checking addresses this with model-aware analysis, comparing process logs to business models to detect anomalous process executions. However, in several scenarios a model is either unavailable or costly to generate, which calls for alternative methods that represent traces reliably. This work analyzes a language-inspired trace representation grounded in the word2vec encoding algorithm. We argue that natural language encodings correctly model the behavior of business processes, supporting a proper distinction between common and anomalous behavior. In the experiments, we compared accuracy and time cost among different word2vec setups and classic encoding methods (token-based replay and alignment features), addressing seven different anomaly scenarios. Feature importance values and the impact of different anomalies in seven event logs were also evaluated to provide insights into trace representation. Results show that the proposed encoding surpasses the representational capability of traditional conformance metrics for the anomaly detection task.
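To make the language analogy concrete, the following minimal sketch treats each activity as a word and each trace as a sentence. It shows the (center, context) pairs a skip-gram style word2vec model would be trained on, and one possible way to turn activity vectors into a trace vector. The toy log, the embedding values, the window size, and the averaging step are all assumptions made for illustration; the abstract does not state how the authors aggregate activity vectors, and no model is actually trained here.

```python
from typing import Dict, List, Sequence, Tuple

Trace = Sequence[str]


def skipgram_pairs(trace: Trace, window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) activity pairs within a sliding window."""
    pairs = []
    for i, center in enumerate(trace):
        lo = max(0, i - window)
        hi = min(len(trace), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, trace[j]))
    return pairs


def trace_vector(trace: Trace, embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the activities in a trace.

    Averaging is an assumed aggregation, not necessarily the paper's.
    """
    vectors = [embeddings[a] for a in trace if a in embeddings]
    if not vectors:
        raise ValueError("trace contains no known activities")
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


# Hypothetical toy event log: one list of activity names per trace.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
]

# Training pairs a skip-gram model would consume (window of 1 for brevity).
pairs = [p for t in log for p in skipgram_pairs(t, window=1)]

# Hand-written stand-in for learned activity embeddings (illustrative values).
toy_embeddings = {
    "register": [1.0, 0.0],
    "check": [0.0, 1.0],
    "approve": [1.0, 1.0],
    "pay": [0.0, 0.0],
    "reject": [2.0, 2.0],
}

vec = trace_vector(log[0], toy_embeddings)
```

In practice the embeddings would come from a trained word2vec model rather than a hand-written dictionary, and the resulting trace vectors would be passed to a classifier that separates common from anomalous traces.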

Notes

  1. https://pm4py.fit.fraunhofer.de/documentation

Author information

Correspondence to Gabriel Marques Tavares.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tavares, G.M., Barbon, S. (2020). Analysis of Language Inspired Trace Representation for Anomaly Detection. In: Bellatreche, L., et al. ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium. TPDL 2020, ADBIS 2020. Communications in Computer and Information Science, vol 1260. Springer, Cham. https://doi.org/10.1007/978-3-030-55814-7_25

  • DOI: https://doi.org/10.1007/978-3-030-55814-7_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55813-0

  • Online ISBN: 978-3-030-55814-7

  • eBook Packages: Computer Science, Computer Science (R0)
