A composite machine-learning-based framework for supporting low-level event logs to high-level business process model activities mappings enhanced by flexible BPMN model translation

  • Methodologies and Application
Soft Computing

Abstract

Process mining is an emerging discipline that aims to analyze business processes using event data logged by IT systems. In process mining, the focus is on how to effectively and efficiently predict the next process/trace to be activated among all the possible processes/traces available in the process schema (usually modeled as a graph). Most existing process mining techniques assume a one-to-one mapping between process model activities and the events recorded during process execution. However, event logs and process model activities are at different levels of granularity. In this paper, we present a machine-learning-based approach that maps low-level event logs to high-level activities. This approach bridges the abstraction levels when the high-level labels of the low-level events are not available. The proposed approach consists of two main phases: automatic labeling and machine-learning-based classification. In the automatic labeling phase, a modified k-prototypes clustering approach is used to obtain labeled examples. In the second phase, different ML classifiers are trained on the labeled examples obtained. Since, in real-life applications and systems, business processes are expressed according to the Business Process Model and Notation (BPMN) format, we enhance the proposed framework with an innovative, flexible BPMN model translation methodology that acts in the first phase. We demonstrate the applicability of the proposed framework in two case studies with real-world event logs, and provide its experimental assessment and analysis.
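
The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the plain k-prototypes dissimilarity of Huang (1997) rather than the modified variant the paper proposes, and a 1-nearest-neighbour rule (Cover and Hart 1967) stands in for the ML classifiers. The event attributes, the toy data, the `gamma` weight, and all function names are assumptions made for illustration only.

```python
from collections import Counter

def mixed_distance(a, b, num_idx, cat_idx, gamma):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of categorical mismatches."""
    numeric = sum((a[i] - b[i]) ** 2 for i in num_idx)
    categorical = sum(a[i] != b[i] for i in cat_idx)
    return numeric + gamma * categorical

def k_prototypes(events, k, num_idx, cat_idx, gamma=1.0, max_iter=20):
    """Phase 1 (automatic labeling): cluster mixed-type events.
    Returns (labels, prototypes)."""
    # Assumption: the first k events seed the prototypes.
    prototypes = [list(e) for e in events[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k),
                key=lambda c: mixed_distance(e, prototypes[c], num_idx, cat_idx, gamma))
            for e in events
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [e for e, lab in zip(events, labels) if lab == c]
            if not members:
                continue  # keep the previous prototype for an empty cluster
            for i in num_idx:  # numeric attributes: mean
                prototypes[c][i] = sum(m[i] for m in members) / len(members)
            for i in cat_idx:  # categorical attributes: mode
                prototypes[c][i] = Counter(m[i] for m in members).most_common(1)[0][0]
    return labels, prototypes

def nearest_neighbor_predict(train, train_labels, event, num_idx, cat_idx, gamma=1.0):
    """Phase 2 (classification): label a new event with the cluster label
    of its closest training event."""
    best = min(range(len(train)),
               key=lambda j: mixed_distance(event, train[j], num_idx, cat_idx, gamma))
    return train_labels[best]

# Hypothetical low-level events: (duration in seconds, originating component).
events = [(1.0, "db"), (9.0, "ui"), (1.2, "db"), (8.7, "ui"), (0.9, "db"), (9.4, "ui")]
NUM_IDX, CAT_IDX = [0], [1]

labels, prototypes = k_prototypes(events, k=2, num_idx=NUM_IDX, cat_idx=CAT_IDX)
predicted = nearest_neighbor_predict(events, labels, (1.1, "db"), NUM_IDX, CAT_IDX)
print(labels, predicted)
```

On this toy log the short "db" events and the long "ui" events fall into separate clusters, and the cluster indices then serve as the labels on which the classifier is trained. In practice the cluster indices would still need to be associated with high-level activity names, a step this sketch leaves out.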

Notes

  1. Some definitions used in our work are taken verbatim from their paper.

  2. The full definitions can be found in Ouyang et al. (2006).

References

  • Alfadhel S, Liu S, Oderanti FO (2017) Business process modelling and visualisation to support e-government decision making: business/IS alignment. In: Proceedings third international conference decision support systems VII. Data, information and knowledge visualization in decision support systems, ICDSST 2017, Namur, Belgium, May 29–31, 2017, pp 45–57

  • Altendrof J, Brende P, Lessard L (2005) Fraud detection for online retail using random forests. Technical Report

  • Awad A, Decker G, Weske M (2008) Efficient compliance checking using BPMN-Q and temporal logic. In: Proceedings 6th international conference business process management, BPM 2008, Milan, Italy, September 2–4, 2008, pp 326–341

  • Baier T, Mendling J (2013) Bridging abstraction layers in process mining: event to activity mapping. In: Nurcan S (eds) Enterprise, business-process and information systems modeling. BPMDS 2013 EMMSAD 2013, vol 147. Lecture Notes in Business Information Processing. Springer, Berlin, Heidelberg

  • Bernardi ML, Cimitile M, Francescomarino CD, Maggi FM (2016) Do activity lifecycles affect the validity of a business rule in a business process? Inf Syst 62:42–59

  • Boinee P, De Angelis A, Foresti GL (2005) Ensembling classifiers—an application to image data classification from Cherenkov telescope experiment. In: IEC (Prague), pp 394–398

  • Bose RJ, Verbeek EH, van der Aalst WM (2011) Discovering hierarchical process models using ProM. In: Forum at the conference on advanced information systems engineering (CAiSE). Springer, pp 33–48

  • Braun P, Cameron JJ, Cuzzocrea A, Jiang F, Leung Carson K-S (2014) Effectively and efficiently mining frequent patterns from dense graph streams on disk. In: KES, volume 35 of procedia computer science. Elsevier, pp 338–347

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Burattin A, Cimitile M, Maggi FM, Sperduti A (2015) Online discovery of declarative process models from event streams. IEEE Trans Serv Comput 8(6):833–846

  • Ceravolo P, Damiani E, Torabi M, Barbon S (2017) Toward a new generation of log pre-processing methods for process mining. In: International conference on business process management. Springer, pp 55–70

  • Ciccio CD, Mecella M (2015) On the discovery of declarative control flows for artful processes. ACM Trans Manag Inf Syst 5(4):24:1–24:37

  • Costa R, Garcia O, Nuñez Maria J, Maló Pedro MN, Jardim-Gonçalves R (2007) Integrated solution to support enterprise interoperability at the business process level on e-procurement. In: Proceedings of the 3rd international conference on interoperability for enterprise software and applications, enterprise interoperability II: new challenges and industrial approaches, IESA 2007, Funchal, Madeira Island, Portugal, March 27–30, 2007, pp 89–100

  • Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27

  • Cuzzocrea A (2006) Improving range-sum query evaluation on data cubes via polynomial approximation. Data Knowl Eng 56(2):85–121

  • Cuzzocrea A, Bertino E (2011) Privacy preserving OLAP over distributed XML data: a theoretically-sound secure-multiparty-computation approach. J Comput Syst Sci 77(6):965–987

  • Cuzzocrea A, Moussa R, Xu G (2013) Olap*: effectively and efficiently supporting parallel OLAP over big data. In: MEDI, volume 8216 of lecture notes in computer science. Springer, pp 38–49

  • Cuzzocrea A, Russo V (2009) Privacy preserving OLAP and OLAP security. In: Encyclopedia of data warehousing and mining. IGI Global, pp 1575–1581

  • Damiani E, Mulazzani F, Russo B, Succi G (2008) SAF: strategic alignment framework for monitoring organizations. In: BIS, volume 7 of Lecture notes in business information processing. Springer, pp 213–226

  • Debois S, Hildebrandt TT, Laursen P, Ulrik KR (2017) Declarative process mining for DCR graphs. In: Proceedings of the symposium on applied computing, SAC 2017, Marrakech, Morocco, April 3–7, 2017, pp 759–764

  • Dezi L, Santoro G, Gabteni H, Pellicelli AC (2018) The role of big data in shaping ambidextrous business process management: case studies from the service industry. Bus Proc Manag J 24(5):1163–1175

  • Dixit PM, Verbeek HMW, van der Aalst Wil MP (2018) Fast conformance analysis based on activity log abstraction. In: 22nd IEEE International enterprise distributed object computing conference, EDOC 2018, Stockholm, Sweden, October 16–19, 2018, pp 135–144

  • Dumas M, Van der Aalst WM, Ter Hofstede AH (2005) Process-aware information systems: bridging people and software through process technology. Wiley, Hoboken

  • Festa G, Safraou I, Cuomo MT, Solima L (2018) Big data for big pharma: harmonizing business process management to enhance ambidexterity. Bus Proc Manag J 24(5):1110–1123

  • Folleco A, Khoshgoftaar TM, Van Hulse J, Bullard L (2008) Software quality modeling: the impact of class noise on the random forest classifier. In: IEEE congress on IEEE world congress on computational intelligence evolutionary computation, 2008, CEC 2008. IEEE, pp 3853–3859

  • Günther CW, van der Aalst WMP (2006) Mining activity clusters from low-level event logs. (BETApublicatie : working papers; vol 165). Eindhoven: Technische Universiteit Eindhoven

  • Günther CW, Rozinat A, Van Der Aalst WMP (2009) Activity mining by global trace segmentation. In: International conference on business process management. Springer, pp 128–139

  • Huang Z (1997) Clustering large data sets with mixed numeric and categorical values. In: Proceedings of the 1st Pacific-Asia conference on knowledge discovery and data mining,(PAKDD). Singapore, pp 21–34

  • Huang Z (1998) Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304

  • Huurros M (2007) The emergence and scope of complex system/service innovation: the case of the mobile payment service market in Finland. PhD thesis, Aalto University

  • Kalenkova AA, Ageev AA, Lomazova IA, van der Aalst WMP (2017) E-government services: comparing real and expected user behavior. In: Business process management workshops—BPM 2017 international workshops, Barcelona, Spain, September 10–11, 2017, Revised Papers, pp 484–496

  • Kluza K, Maslanka T, Nalepa Grzegorz J, Ligeza A (2011) Proposal of representing BPMN diagrams with XTT2-based business rules. In: Intelligent distributed computing V—proceedings of the 5th international symposium on intelligent distributed computing—IDC 2011, Delft, The Netherlands, October 2011, pp 243–248

  • Leemans Sander JJ, Fahland D, van der Aalst Wil MP (2013) Discovering block-structured process models from event logs-a constructive approach. In: International conference on applications and theory of Petri nets and concurrency. Springer, pp 311–329

  • Li J, Bose RP Jagadeesh C, van der Aalst WMP (2010) Mining context-dependent and interactive business process maps using execution patterns. In: International conference on business process management. Springer, pp 109–121

  • Li K-C, Jiang H, Yang LT, Cuzzocrea A (eds) (2015) Big data—algorithms, analytics, and applications. Chapman and Hall/CRC, Boca Raton

  • Liu S, d’Aquin M (2017) Unsupervised learning for understanding student achievement in a distance learning setting. In: Global engineering education conference (EDUCON), 2017 IEEE. IEEE, pp 1373–1377

  • Ly LT, Maggi FM, Montali M, Rinderle-Ma S, van der Aalst WMP (2015) Compliance monitoring in business processes: functionalities, application, and tool-support. Inf Syst 54:209–234

  • Ma Y, Guo L, Cukic B (2007) A statistical framework for the prediction of fault-proneness. In: Advances in machine learning applications in software engineering. IGI Global, pp 237–263

  • Malik S, Bajwa IS (2012) A rule based approach for business rule generation from business process models. In: Proceedings 6th international symposium rules on the web: research and applications, RuleML 2012, Montpellier, France, August 27–29, 2012, pp 92–99

  • Mannhardt F, De Leoni M, Reijers HA, Van Der Aalst WMP, Toussaint PJ (2016) From low-level events to activities-a pattern-based approach. In: International conference on business process management, pp 125–141. Springer

  • Mannhardt F, Tax N (2017) Unsupervised event abstraction using pattern abstraction and local process models. arXiv preprint arXiv:1704.03520

  • Ordónez FJ, de Toledo P, Sanchis A (2013) Activity recognition using hybrid generative/discriminative models on home environments using binary sensors. Sensors 13(5):5460–5477

  • Ouyang C, van der Aalst WMP, Dumas M, ter Hofstede Arthur HM (2006) Translating BPMN to BPEL. BPM Center, Brisbane, QLD, Australia

  • Park H-S, Jun C-H (2009) A simple and fast algorithm for k-medoids clustering. Expert Syst Appl 36(2):3336–3341

  • Pérez-Castillo R, Weber B, de Guzmán IG-R, Piattini M, Pinggera J (2014) Assessing event correlation in non-process-aware information systems. Softw Syst Model 13(3):1117–1139

  • Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106

  • Rachdi A, En-Nouaary A, Dahchour M (2016) Analysis of common business rules in BPMN process models using business rule language. In: 11th International conference on intelligent systems: theories and applications (SITA), pp 1–6

  • Rialti R, Marzi G, Silic M, Ciappei C (2018) Ambidextrous organization and agility in big data era: the role of business process management systems. Bus Proc Manag J 24(5):1091–1109

  • Schölkopf B, Smola A (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge, MA, USA

  • Sujatha J, Rajagopalan SP (2017) Performance evaluation of machine learning algorithms in the classification of Parkinson disease using voice attributes. Int J Appl Eng Res 12(21):10669–10675

  • Tax N, Sidorova N, Haakma R, van der Aalst WMP (2016) Mining process model descriptions of daily life through event abstraction. In: Proceedings of SAI intelligent systems conference. Springer, pp 83–104

  • Tax N, Sidorova N, Haakma R, van der Aalst WMP (2016) Event abstraction for process mining using supervised learning techniques. In: Proceedings of SAI intelligent systems conference. Springer, pp 251–269

  • Türetken O, Elgammal A, van den Heuvel W-J, Papazoglou MP (2012) Capturing compliance requirements: a pattern-based approach. IEEE Softw 29(3):28–36

  • van der Aalst WMP (2011) Process mining—discovery, conformance and enhancement of business processes. Springer, Berlin

  • van der Aalst WMP, Damiani E (2015) Processes meet big data: connecting data science with process science. IEEE Trans Serv Comput 8(6):810–819

  • van der Aalst WMP, Reijers HA, Weijters AJMM, van Dongen BF, de Medeiros AKA, Song M, Verbeek HMW (2007) Business process mining: an industrial application. Inf Syst 32(5):713–732

  • von Halle B, Goldberg L (eds) (2006) The business rule revolution. Happy About, Dalston

  • Wamba SF (2017) Big data analytics and business process innovation. Bus Proc Manag J 23(3):470–476

  • Yang C-T, Liu J-C, Hsu C-H, Chou W-L (2014) On improvement of cloud virtual machine availability with virtualization fault tolerance mechanism. J Supercomput 69(3):1103–1122

  • Zhang J, Zulkernine M (2005) Network intrusion detection using random forests. In: PST. Citeseer

  • Zurada J (1992) Introduction to artificial neural systems. West Publishing Company, St. Paul

Acknowledgements

This work has been supported by the project ICT Fund 3093 - Cloud-enabled Business Process Management and e-Government as a service.

Author information

Corresponding author

Correspondence to A. Cuzzocrea.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Al-Ali, H., Cuzzocrea, A., Damiani, E. et al. A composite machine-learning-based framework for supporting low-level event logs to high-level business process model activities mappings enhanced by flexible BPMN model translation. Soft Comput 24, 7557–7578 (2020). https://doi.org/10.1007/s00500-019-04385-6

  • DOI: https://doi.org/10.1007/s00500-019-04385-6
