Neural Approach to the Discovery Problem in Process Mining

  • Timofey Shunin
  • Natalia Zubkova
  • Sergey Shershakov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11179)


Process mining deals with various types of formal models. Some of them are used at intermediate stages of synthesis and analysis, whereas others are the desired goals themselves. Transition systems (TS) are widely used in both scenarios. Process discovery, a special case of the synthesis problem, tries to find patterns in event logs. In this paper, we propose a new approach to the discovery problem based on recurrent neural networks (RNN). Here, an event log serves as the training sample for a neural network, and the algorithm extracts the RNN's internal states as the desired TS that describes the behavior present in the log. Models derived by this approach contain all behaviors from the event log (i.e., they are perfectly fit) and vary in simplicity and precision, the key model quality metrics. One of the main advantages of the neural method is its natural ability to detect and merge common behavioral parts that are scattered across the log. The paper studies the proposed method, its properties, and the cases where applying this approach is sensible compared to other methods of TS synthesis.
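To make the notions of a transition system and perfect fitness concrete, the sketch below builds a TS from a toy event log and replays the log on it. It is not the neural method of the paper: it uses the plain prefix-tree construction, a common baseline for TS synthesis, and all names (`build_prefix_tree_ts`, `replays`, `fitness`) and the sample log are assumptions made for illustration only. Fitness is taken here simply as the fraction of traces the model can replay.

```python
def build_prefix_tree_ts(log):
    """Build a prefix-tree transition system (illustrative baseline only).

    Each distinct trace prefix becomes one state; state 0 is the initial state.
    Returns the transition map and the set of accepting (final) states.
    """
    transitions = {}   # (state, activity) -> next state
    accepting = set()  # states in which some trace of the log ends
    next_id = 1
    for trace in log:
        state = 0
        for activity in trace:
            key = (state, activity)
            if key not in transitions:
                transitions[key] = next_id
                next_id += 1
            state = transitions[key]
        accepting.add(state)
    return transitions, accepting


def replays(trace, transitions, accepting):
    """Return True if the trace can be replayed from state 0 to an accepting state."""
    state = 0
    for activity in trace:
        key = (state, activity)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


def fitness(log, transitions, accepting):
    """Fraction of traces in the log that the transition system replays."""
    return sum(replays(t, transitions, accepting) for t in log) / len(log)


# Hypothetical toy log: each trace is a sequence of activity labels.
log = [("a", "b", "d"), ("a", "c", "d"), ("a", "b", "d")]
transitions, accepting = build_prefix_tree_ts(log)

print(fitness(log, transitions, accepting))           # 1.0: the model is perfectly fit
print(replays(("a", "d"), transitions, accepting))    # False: behavior not in the log
```

In this sketch the two branches ending in `d` are kept as separate states, because a prefix tree only shares common prefixes. Merging such repeated behavioral parts is the kind of simplification that the abstract attributes to the neural method.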


Keywords: Process mining, Transition systems, Quality metrics, Recurrent neural networks, Process models synthesis, FSA/FSM



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Timofey Shunin (1)
  • Natalia Zubkova (1)
  • Sergey Shershakov (1)
  1. Laboratory of Process-Aware Information Systems (PAIS Lab), Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
