Decomposed Process Mining: The ILP Case

  • H. M. W. Verbeek
  • Wil M. P. van der Aalst
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 202)

Abstract

Over the last decade, process mining techniques have matured, and more and more organizations have started to use process mining to analyze their operational processes. The current hype around “big data” illustrates the desire to analyze ever-growing data sets. Process mining starts from event logs, which are multisets of traces (sequences of events), and for the widespread application of process mining it is vital to be able to handle “big event logs”. Some event logs are “big” because they contain many traces. Others are big because they contain many different activities. Most of the more advanced process mining algorithms (both for process discovery and conformance checking) scale very badly in the number of activities. For these algorithms, it could help to split the big event log (containing many activities) into a collection of smaller event logs (each containing fewer activities), run the algorithm on each of these smaller logs, and merge the results into a single result. This paper introduces a generic framework for doing exactly that, and makes this concrete by implementing algorithms for decomposed process discovery and decomposed conformance checking using Integer Linear Programming (ILP) based algorithms. ILP-based process mining techniques provide precise results and formal guarantees (e.g., perfect fitness), but are known to scale badly in the number of activities. A small case study shows that we can gain orders of magnitude in run-time. However, in some cases there is a trade-off between run-time and quality.
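
The splitting step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `project_log` and `decompose_log`, the example log, and the two activity clusters are all hypothetical, and the way the paper actually chooses the activity sets is not stated in the abstract. The sketch only shows the basic idea of projecting an event log (a multiset of traces) onto smaller sets of activities, so that each sublog contains fewer activities.

```python
from collections import Counter


def project_log(log, activities):
    """Project every trace onto the given activity set.

    The log is a multiset of traces: a Counter mapping a trace
    (tuple of activity names) to the number of times it occurs.
    Events whose activity is not in `activities` are dropped;
    trace multiplicities are preserved.
    """
    projected = Counter()
    for trace, count in log.items():
        sub_trace = tuple(a for a in trace if a in activities)
        projected[sub_trace] += count
    return projected


def decompose_log(log, clusters):
    """Return one projected sublog per activity cluster."""
    return [project_log(log, cluster) for cluster in clusters]


# Hypothetical event log with five activities (a..e) and six traces.
log = Counter({
    ("a", "b", "c", "d"): 3,
    ("a", "c", "b", "d"): 2,
    ("a", "e", "d"): 1,
})

# Hypothetical activity clusters; how to pick them is an assumption here.
clusters = [{"a", "b", "c"}, {"b", "c", "d", "e"}]

sublogs = decompose_log(log, clusters)
for cluster, sublog in zip(clusters, sublogs):
    print(sorted(cluster), dict(sublog))
```

Each sublog keeps the same number of traces as the original log but mentions fewer activities, which is the property that matters for algorithms whose run-time grows quickly with the number of activities. A discovery or conformance algorithm would then be run per sublog and the partial results merged; that merging step is not sketched here.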

Keywords

Process discovery, Conformance analysis, Big data, Decomposition

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • H. M. W. Verbeek (1)
  • Wil M. P. van der Aalst (1)
  1. Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands