Discovering configuration workflows from existing logs using process mining

Ramos-Gutiérrez, Belén; Varela-Vaca, Ángel Jesús; Galindo, José A.; Gómez-López, María Teresa; Benavides, David

doi:10.1007/s10664-020-09911-x

Published: 26 January 2021

Empirical Software Engineering, volume 26, article number 11 (2021)

Belén Ramos-Gutiérrez¹,
Ángel Jesús Varela-Vaca ORCID: orcid.org/0000-0001-9953-6005¹,
José A. Galindo¹,
María Teresa Gómez-López¹ &
…
David Benavides¹

Abstract

Variability models are used to build configurators, which guide users through the configuration process until they reach a setting that fulfils their requirements. The same variability model can be used to design different configurators employing different techniques. One of the design options that can change in a configurator is the configuration workflow, i.e., the order in which the different configuration elements are presented to the configuration stakeholders. When developing a configurator, one challenge is to decide which configuration workflow best suits stakeholders according to previous configurations. For example, when configuring a Linux distribution, the configuration process starts by choosing the network or the graphics card and then continues with other packages in a given sequence. In this paper, we present COnfiguration workfLOw proceSS mIning (COLOSSI), a framework that, given a set of logs of previous configurations and a variability model, automatically assists in determining the configuration workflow that best fits the configuration logs generated by user activities. COLOSSI is based on process discovery, commonly used in the process mining area, with an adaptation to configuration contexts. Because both the logs and the discovered processes can be complex, it is often necessary to divide the traces into smaller sets. This yields a configuration workflow that is easier for the user to understand and follow during the configuration process. We apply and compare four different techniques for trace clustering: greedy, backtracking, genetic and hierarchical algorithms. To show its feasibility, our proposal is validated in three different scenarios: an ERP configuration, a Smart Farming configuration, and a Computer Configuration. Furthermore, we open the door to new applications of process mining techniques in different areas of software product line engineering, along with the need to apply clustering techniques for trace preparation in the context of configuration workflows.
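As a rough illustration of the process discovery idea mentioned in the abstract, the sketch below counts how often one feature is configured immediately after another in a small configuration log (a directly-follows relation, a common starting point in process mining). The feature names, the toy log and the `directly_follows` helper are assumptions made for illustration only; this is not the COLOSSI implementation.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often feature b is configured immediately after feature a."""
    counts = Counter()
    for trace in traces:
        # Pair each configured feature with the one selected right after it.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical configuration log: each trace is the ordered list of
# features one user selected (invented data, not from the paper).
toy_log = [
    ["network", "graphics", "packages"],
    ["network", "packages"],
    ["graphics", "network", "packages"],
]

dfg = directly_follows(toy_log)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

Edges with high counts suggest an order in which a configurator could present features; a log that mixes very different user behaviours produces many low-count edges, which is the situation that motivates clustering the traces first.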


Notes

BPMN: Business Process Model and Notation
Dendrogram: a branching diagram that represents the arrangement of the clusters produced by the corresponding analyses
https://fluxicon.com/disco/
https://www.celonis.com/
https://doi.org/10.5281/zenodo.3574053
https://wwwiti.cs.uni-magdeburg.de/~jualves/PROFilE/datasets-download/Dell-Laptop_readme.txt

Acknowledgements

This work has been partially supported by the Ministry of Science and Technology of Spain through the ECLIPSE (RTI2018-094283-B-C33) and OPHELIA (RTI2018-101204-B-C22) projects; the TASOVA network (MCIU-AEI TIN2017-90644-REDT); and the Junta de Andalucía via METAMORFOSIS projects, the European Regional Development Fund (ERDF/FEDER), and the MINECO Juan de la Cierva postdoctoral program.

Author information

Authors and Affiliations

Data-Centric Computing Research Hub (IDEA), Universidad de Sevilla, Seville, Spain
Belén Ramos-Gutiérrez, Ángel Jesús Varela-Vaca, José A. Galindo, María Teresa Gómez-López & David Benavides

Corresponding author

Correspondence to Belén Ramos-Gutiérrez.

Additional information

Communicated by: Laurence Duchien, Thomas Thüm and Paul Grünbacher

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Configurable Systems

Appendix: Quality metrics results

Tables 9, 10, 11, 12, and 13 in this appendix contain the metric data represented in Figs. 10, 11, and 12. To ease interpretation, the values of each metric have been normalised so that all results lie between 0 and 1, which allows comparisons to be made. In addition, Table 8 shows the metric values for the original logs. In most cases, the values are closer to 0 after clustering, meaning that the resulting configuration workflows are also less complex. Still, it is very difficult to generalise from these data, since they are too domain-specific.
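The appendix does not say which normalisation scheme was used. As a minimal sketch, assuming per-metric min-max scaling, the following shows how raw metric values could be mapped into the interval from 0 to 1. The metric names and numbers are invented for illustration and are not taken from the paper's tables.

```python
def min_max_normalise(values):
    """Scale a list of numbers to [0, 1]; assumes min-max scaling (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so there is no spread to scale; map them to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw values of two metrics across three clusterings.
raw = {
    "metric_a": [12.0, 30.0, 21.0],
    "metric_b": [0.5, 0.5, 0.5],
}

# Normalise each metric independently so that metrics become comparable.
normalised = {name: min_max_normalise(vals) for name, vals in raw.items()}
print(normalised)
```

Normalising each metric on its own keeps metrics with very different ranges comparable, at the cost of losing their absolute magnitudes.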

Table 8 Metrics for the initial logs of each case study

Table 9 Metrics for ERP case with entropy-features

Table 10 Metrics for ERP case with entropy-transitions

Table 11 Metrics for Smart Farm case with entropy-features

Table 12 Metrics for Computer Configuration case with entropy-features

Table 13 Metrics for Computer Configuration case with entropy-transitions

About this article

Cite this article

Ramos-Gutiérrez, B., Varela-Vaca, Á.J., Galindo, J.A. et al. Discovering configuration workflows from existing logs using process mining. Empir Software Eng 26, 11 (2021). https://doi.org/10.1007/s10664-020-09911-x

Accepted: 30 October 2020
