Abstract
The most prominent use case of Business Process Model Abstraction (BPMA) is the construction of a process “quick view” that lets readers rapidly comprehend a complex process. Several researchers have proposed process abstraction methods that aggregate activities on the basis of their semantic similarity. An important clustering technique used in these methods is traditional k-means cluster analysis, which so far has been an unsupervised procedure that uses no a priori information. Moreover, most of these techniques aggregate activities according to business semantics alone, without considering the requirement of an order-preserving model transformation. This paper proposes a BPMA method based on semi-supervised clustering. The method chooses the initial clusters based on the refined process structure tree and guides the clustering process with constraints that combine the control flow consistency of the process with the semantic similarity of the activities. More precisely, the constraint function is mined from a process model collection enriched with subprocess relations. The proposed method is validated by applying it to a process model repository in use. In an experimental validation, the method is compared with traditional k-means clustering (parameterized with randomly chosen initial clusters and a purely semantics-based distance measure); the results show that the approach closely approximates the decisions of the involved modelers to cluster activities. As such, the paper contributes to the development of modeling support for effective process model abstraction, facilitating the use of business process models in practice.
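The contrast drawn in the abstract between randomly initialized, semantics-only k-means and a seeded, structure-aware variant can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' algorithm: the function names `cosine_distance` and `seeded_kmeans`, the bag-of-words label representation, the normalized `positions` used as a crude stand-in for control flow information, the fixed `weight` that mixes the two distances, and the hand-picked `seeds` that stand in for fragments of a refined process structure tree. In particular, the paper mines its constraint function from a process model collection, which this sketch does not attempt.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def seeded_kmeans(labels, positions, seeds, weight=0.5, max_iter=20):
    """Seeded k-means over activities (illustrative assumptions only).

    labels:    activity label strings
    positions: hypothetical control-flow positions scaled to [0, 1]
    seeds:     initial clusters as lists of activity indices
    weight:    share of the semantic distance in the combined distance
    """
    bags = [Counter(label.lower().split()) for label in labels]
    assign = [None] * len(labels)
    for cluster, members in enumerate(seeds):
        for i in members:
            assign[i] = cluster

    for _ in range(max_iter):
        # Centroid = summed bag of words plus mean position of the members.
        centroids = []
        for cluster in range(len(seeds)):
            members = [i for i, a in enumerate(assign) if a == cluster]
            bag = Counter()
            for i in members:
                bag.update(bags[i])
            pos = (sum(positions[i] for i in members) / len(members)
                   if members else 0.0)
            centroids.append((bag, pos))

        # Reassign every activity to the centroid with the smallest
        # combined semantic and positional distance.
        new_assign = []
        for i in range(len(labels)):
            dists = [weight * cosine_distance(bags[i], bag)
                     + (1 - weight) * abs(positions[i] - pos)
                     for bag, pos in centroids]
            new_assign.append(dists.index(min(dists)))

        if new_assign == assign:
            break
        assign = new_assign
    return assign


if __name__ == "__main__":
    labels = ["receive order", "check order", "pack goods", "ship goods"]
    positions = [0.0, 0.3, 0.7, 1.0]
    print(seeded_kmeans(labels, positions, seeds=[[0], [3]]))
```

Starting from seeds rather than random centroids makes the result deterministic, and the positional term discourages grouping activities that are far apart in the flow, which is the intuition behind an order-preserving aggregation.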
Acknowledgements
This work is supported in part by NSFC under Grant Nos. 61402193, 61272208, 61133011, 60973089, 61003101, and 61170092, by the Jilin Province Science and Technology Development Plan under Grant No. 20130522177JH, by the Jilin Provincial Department of Education “Twelfth/Thirteenth Five Year Plan” Science and Technology Development Plan under Grant Nos. 2014160 and 2016105, and by the Jilin Province Education Science “Twelfth Five Year Plan” under Grant Nos. GH150285 and GH16249.
Additional information
Accepted after three revisions by Prof. Dr. Becker.
About this article
Cite this article
Wang, N., Sun, S. & OuYang, D. Business Process Modeling Abstraction Based on Semi-Supervised Clustering Analysis. Bus Inf Syst Eng 60, 525–542 (2018). https://doi.org/10.1007/s12599-016-0457-x
DOI: https://doi.org/10.1007/s12599-016-0457-x