Abstract
Process mining is an emerging discipline that aims to analyze business processes using event data logged by IT systems. In process mining, the focus is on how to effectively and efficiently predict the next process/trace to be activated among all the possible processes/traces available in the process schema (usually modeled as a graph). Most existing process mining techniques assume a one-to-one mapping between process model activities and the events recorded during process execution. However, event logs and process model activities are at different levels of granularity. In this paper, we present a machine-learning-based approach that maps low-level event logs to high-level activities. This approach bridges the abstraction levels when the high-level labels of the low-level events are not available. The proposed approach consists of two main phases: automatic labeling and machine-learning-based classification. In the automatic labeling phase, a modified k-prototypes clustering approach is used to obtain labeled examples. In the second phase, we train different ML classifiers on the obtained labeled examples. Since, in real-life applications and systems, business processes are expressed in the Business Process Model and Notation (BPMN) format, we improve the proposed framework by means of an innovative, flexible BPMN model translation methodology that acts in the first phase. We demonstrate the applicability of the proposed framework using two case studies with real-world event logs, and provide its experimental assessment and analysis.
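The two phases named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses a modified k-prototypes algorithm and several ML classifiers, whereas this sketch assumes the standard k-prototypes of Huang (1997) and a 1-nearest-neighbor classifier (Cover and Hart 1967) as stand-ins. The event attributes (a duration, a resource, a source system), the `gamma` weight and the fixed initial prototypes are all hypothetical choices made for illustration.

```python
from collections import Counter

def dissimilarity(event, proto, gamma):
    """Mixed measure: squared Euclidean distance on the numeric part
    plus gamma times the number of categorical mismatches."""
    (num_e, cat_e), (num_p, cat_p) = event, proto
    numeric = sum((a - b) ** 2 for a, b in zip(num_e, num_p))
    categorical = sum(a != b for a, b in zip(cat_e, cat_p))
    return numeric + gamma * categorical

def k_prototypes(events, init_indices, gamma=1.0, max_iter=20):
    """Phase 1 (automatic labeling): cluster low-level events.
    Each cluster id plays the role of a candidate high-level activity."""
    protos = [events[i] for i in init_indices]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(protos)),
                key=lambda j: dissimilarity(e, protos[j], gamma))
            for e in events
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(protos)):
            members = [e for e, lab in zip(events, labels) if lab == j]
            if not members:
                continue  # keep the old prototype for an empty cluster
            nums = [m[0] for m in members]
            cats = [m[1] for m in members]
            mean = tuple(sum(col) / len(col) for col in zip(*nums))
            mode = tuple(Counter(col).most_common(1)[0][0]
                         for col in zip(*cats))
            protos[j] = (mean, mode)
    return labels, protos

def classify_1nn(train_events, train_labels, event, gamma=1.0):
    """Phase 2 (classification): label an unseen event with the
    cluster label of its nearest labeled training event."""
    nearest = min(range(len(train_events)),
                  key=lambda i: dissimilarity(event, train_events[i], gamma))
    return train_labels[nearest]

# Hypothetical low-level events: ((duration,), (resource, system))
events = [
    ((1.0,), ("clerk", "crm")),
    ((1.2,), ("clerk", "crm")),
    ((0.9,), ("clerk", "crm")),
    ((8.0,), ("manager", "erp")),
    ((8.5,), ("manager", "erp")),
    ((7.8,), ("manager", "erp")),
]
labels, protos = k_prototypes(events, init_indices=[0, 3])
predicted = classify_1nn(events, labels, ((7.5,), ("manager", "erp")))
```

Here `labels` holds the cluster assigned to each event in phase one, and `predicted` is the cluster assigned to a new event in phase two. In practice the initial prototypes would be chosen by a seeding strategy rather than fixed by hand, and the numeric attributes would normally be scaled before clustering.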
Notes
Some definitions used in our work are taken verbatim from their paper.
The full definitions can be found in Ouyang et al. (2006).
References
Alfadhel S, Liu S, Oderanti FO (2017) Business process modelling and visualisation to support e-government decision making: business/IS alignment. In: Proceedings third international conference decision support systems VII. Data, information and knowledge visualization in decision support systems, ICDSST 2017, Namur, Belgium, May 29–31, 2017, pp 45–57
Altendorf J, Brende P, Lessard L (2005) Fraud detection for online retail using random forests. Technical Report
Awad A, Decker G, Weske M (2008) Efficient compliance checking using BPMN-Q and temporal logic. In: Proceedings 6th international conference business process management, BPM 2008, Milan, Italy, September 2–4, 2008, pp 326–341
Baier T, Mendling J (2013) Bridging abstraction layers in process mining: event to activity mapping. In: Nurcan S (eds) Enterprise, business-process and information systems modeling. BPMDS 2013 EMMSAD 2013, vol 147. Lecture Notes in Business Information Processing. Springer, Berlin, Heidelberg
Bernardi ML, Cimitile M, Francescomarino CD, Maggi FM (2016) Do activity lifecycles affect the validity of a business rule in a business process? Inf Syst 62:42–59
Boinee P, De Angelis A, Foresti GL (2005) Ensembling classifiers—an application to image data classification from Cherenkov telescope experiment. In: IEC (Prague), pp 394–398
Bose RJ, Verbeek EH, van der Aalst WM (2011) Discovering hierarchical process models using ProM. In: Forum at the conference on advanced information systems engineering (CAiSE). Springer, pp 33–48
Braun P, Cameron JJ, Cuzzocrea A, Jiang F, Leung Carson K-S (2014) Effectively and efficiently mining frequent patterns from dense graph streams on disk. In: KES, volume 35 of procedia computer science. Elsevier, pp 338–347
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Burattin A, Cimitile M, Maggi FM, Sperduti A (2015) Online discovery of declarative process models from event streams. IEEE Trans Serv Comput 8(6):833–846
Ceravolo P, Damiani E, Torabi M, Barbon S (2017) Toward a new generation of log pre-processing methods for process mining. In: International conference on business process management. Springer, pp 55–70
Ciccio CD, Mecella M (2015) On the discovery of declarative control flows for artful processes. ACM Trans Manag Inf Syst 5(4):24:1–24:37
Costa R, Garcia O, Nuñez Maria J, Maló Pedro MN, Jardim-Gonçalves R (2007) Integrated solution to support enterprise interoperability at the business process level on e-procurement. In: Proceedings of the 3rd international conference on interoperability for enterprise software and applications, enterprise interoperability II–new challenges and industrial approaches, IESA 2007, Funchal, Madeira Island, Portugal, March 27–30, 2007, pp 89–100
Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
Cuzzocrea A (2006) Improving range-sum query evaluation on data cubes via polynomial approximation. Data Knowl Eng 56(2):85–121
Cuzzocrea A, Bertino E (2011) Privacy preserving OLAP over distributed XML data: a theoretically-sound secure-multiparty-computation approach. J Comput Syst Sci 77(6):965–987
Cuzzocrea A, Moussa R, Xu G (2013) Olap*: effectively and efficiently supporting parallel OLAP over big data. In: MEDI, volume 8216 of lecture notes in computer science. Springer, pp 38–49
Cuzzocrea A, Russo V (2009) Privacy preserving OLAP and OLAP security. In: Encyclopedia of data warehousing and mining. IGI Global, pp 1575–1581
Damiani E, Mulazzani F, Russo B, Succi G (2008) SAF: strategic alignment framework for monitoring organizations. In: BIS, volume 7 of Lecture notes in business information processing. Springer, pp 213–226
Debois S, Hildebrandt TT, Laursen P, Ulrik KR (2017) Declarative process mining for DCR graphs. In: Proceedings of the symposium on applied computing, SAC 2017, Marrakech, Morocco, April 3–7, 2017, pp 759–764
Dezi L, Santoro G, Gabteni H, Pellicelli AC (2018) The role of big data in shaping ambidextrous business process management: case studies from the service industry. Bus Proc Manag J 24(5):1163–1175
Dixit PM, Verbeek HMW, van der Aalst Wil MP (2018) Fast conformance analysis based on activity log abstraction. In: 22nd IEEE International enterprise distributed object computing conference, EDOC 2018, Stockholm, Sweden, October 16–19, 2018, pp 135–144
Dumas M, Van der Aalst WM, Ter Hofstede AH (2005) Process-aware information systems: bridging people and software through process technology. Wiley, Hoboken
Festa G, Safraou I, Cuomo MT, Solima L (2018) Big data for big pharma: harmonizing business process management to enhance ambidexterity. Bus Proc Manag J 24(5):1110–1123
Folleco A, Khoshgoftaar TM, Van Hulse J, Bullard L (2008) Software quality modeling: the impact of class noise on the random forest classifier. In: IEEE congress on evolutionary computation (IEEE world congress on computational intelligence), CEC 2008. IEEE, pp 3853–3859
Günther CW, van der Aalst WMP (2006) Mining activity clusters from low-level event logs. (BETApublicatie : working papers; vol 165). Eindhoven: Technische Universiteit Eindhoven
Günther CW, Rozinat A, Van Der Aalst WMP (2009) Activity mining by global trace segmentation. In: International conference on business process management. Springer, pp 128–139
Huang Z (1997) Clustering large data sets with mixed numeric and categorical values. In: Proceedings of the 1st Pacific-Asia conference on knowledge discovery and data mining (PAKDD). Singapore, pp 21–34
Huang Z (1998) Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304
Huurros M (2007) The emergence and scope of complex system/service innovation: the case of the mobile payment service market in Finland. PhD thesis, Aalto University
Kalenkova AA, Ageev AA, Lomazova IA, van der Aalst WMP (2017) E-government services: comparing real and expected user behavior. In: Business process management workshops—BPM 2017 international workshops, Barcelona, Spain, September 10–11, 2017, Revised Papers, pp 484–496
Kluza K, Maslanka T, Nalepa Grzegorz J, Ligeza A (2011) Proposal of representing BPMN diagrams with XTT2-based business rules. In: Intelligent distributed computing V—proceedings of the 5th international symposium on intelligent distributed computing—IDC 2011, Delft, The Netherlands, October 2011, pp 243–248
Leemans Sander JJ, Fahland D, van der Aalst Wil MP (2013) Discovering block-structured process models from event logs-a constructive approach. In: International conference on applications and theory of Petri nets and concurrency. Springer, pp 311–329
Li J, Bose RP Jagadeesh C, van der Aalst WMP (2010) Mining context-dependent and interactive business process maps using execution patterns. In: International conference on business process management. Springer, pp 109–121
Li K-C, Jiang H, Yang LT, Cuzzocrea A (eds) (2015) Big data—algorithms, analytics, and applications. Chapman and Hall/CRC, Boca Raton
Liu S, d’Aquin M (2017) Unsupervised learning for understanding student achievement in a distance learning setting. In: Global engineering education conference (EDUCON), 2017 IEEE. IEEE, pp 1373–1377
Ly LT, Maggi FM, Montali M, Rinderle-Ma S, van der Aalst WMP (2015) Compliance monitoring in business processes: functionalities, application, and tool-support. Inf Syst 54:209–234
Ma Y, Guo L, Cukic B (2007) A statistical framework for the prediction of fault-proneness. In: Advances in machine learning applications in software engineering. IGI Global, pp 237–263
Malik S, Bajwa IS (2012) A rule based approach for business rule generation from business process models. In: Proceedings 6th international symposium rules on the web: research and applications, RuleML 2012, Montpellier, France, August 27–29, 2012, pp 92–99
Mannhardt F, De Leoni M, Reijers HA, Van Der Aalst WMP, Toussaint PJ (2016) From low-level events to activities: a pattern-based approach. In: International conference on business process management. Springer, pp 125–141
Mannhardt F, Tax N (2017) Unsupervised event abstraction using pattern abstraction and local process models. arXiv preprint arXiv:1704.03520
Ordónez FJ, de Toledo P, Sanchis A (2013) Activity recognition using hybrid generative/discriminative models on home environments using binary sensors. Sensors 13(5):5460–5477
Ouyang C, van der Aalst WMP, Dumas M, ter Hofstede Arthur HM (2006) Translating BPMN to BPEL. BPM Center, Brisbane, QLD, Australia
Park H-S, Jun C-H (2009) A simple and fast algorithm for k-medoids clustering. Expert Syst Appl 36(2):3336–3341
Pérez-Castillo R, Weber B, de Guzmán IG-R, Piattini M, Pinggera J (2014) Assessing event correlation in non-process-aware information systems. Softw Syst Model 13(3):1117–1139
Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106
Rachdi A, En-Nouaary A, Dahchour M (2016) Analysis of common business rules in BPMN process models using business rule language. In: 11th International conference on intelligent systems: theories and applications (SITA), pp 1–6
Rialti R, Marzi G, Silic M, Ciappei C (2018) Ambidextrous organization and agility in big data era: the role of business process management systems. Bus Proc Manag J 24(5):1091–1109
Schölkopf B, Smola A (2001) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge, MA, USA
Sujatha J, Rajagopalan SP (2017) Performance evaluation of machine learning algorithms in the classification of Parkinson disease using voice attributes. Int J Appl Eng Res 12(21):10669–10675
Tax N, Sidorova N, Haakma R, van der Aalst WMP (2016) Mining process model descriptions of daily life through event abstraction. In: Proceedings of SAI intelligent systems conference. Springer, pp 83–104
Tax N, Sidorova N, Haakma R, van der Aalst WMP (2016) Event abstraction for process mining using supervised learning techniques. In: Proceedings of SAI intelligent systems conference. Springer, pp 251–269
Türetken O, Elgammal A, van den Heuvel W-J, Papazoglou MP (2012) Capturing compliance requirements: a pattern-based approach. IEEE Softw 29(3):28–36
van der Aalst WMP (2011) Process mining—discovery, conformance and enhancement of business processes. Springer, Berlin
van der Aalst WMP, Damiani E (2015) Processes meet big data: connecting data science with process science. IEEE Trans Serv Comput 8(6):810–819
van der Aalst WMP, Reijers HA, Weijters AJMM, van Dongen BF, de Medeiros AKA, Song M, Verbeek HMW (2007) Business process mining: an industrial application. Inf Syst 32(5):713–732
von Halle B, Goldberg L (eds) (2006) The business rule revolution. Happy About, Dalston
Wamba SF (2017) Big data analytics and business process innovation. Bus Proc Manag J 23(3):470–476
Yang C-T, Liu J-C, Hsu C-H, Chou W-L (2014) On improvement of cloud virtual machine availability with virtualization fault tolerance mechanism. J Supercomput 69(3):1103–1122
Zhang J, Zulkernine M (2005) Network intrusion detection using random forests. In: PST. Citeseer
Zurada J (1992) Introduction to artificial neural systems. West Publishing Company, St. Paul
Acknowledgements
This work has been supported by the project ICT Fund 3093 - Cloud-enabled Business Process Management and e-Government as a service.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Al-Ali, H., Cuzzocrea, A., Damiani, E. et al. A composite machine-learning-based framework for supporting low-level event logs to high-level business process model activities mappings enhanced by flexible BPMN model translation. Soft Comput 24, 7557–7578 (2020). https://doi.org/10.1007/s00500-019-04385-6
DOI: https://doi.org/10.1007/s00500-019-04385-6