Abstract
Process-oriented systems have increasingly attracted the attention of the data mining community, because applying inductive process mining techniques to log data opens opportunities for both the analysis of complex processes and the design of new process models. Currently, these techniques focus on structural aspects of the process and disregard data that many real systems record, such as information about activity executors, parameter values, and timestamps.
This paper presents an enhanced process mining approach in which different process variants (use cases) are discovered by clustering log traces on the basis of both structural aspects and performance measures. To this end, an information-theoretic framework is used, in which structural information and performance measures are represented by suitable domains, each correlated with the “central domain” of logged process instances. The clustering of log traces is then performed jointly with that of the correlated domains. Finally, each cluster is equipped with a specific model, which provides the analyst with a compact and handy description of the execution paths characterizing each process variant.
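The abstract does not spell out the objective being optimized. As a rough illustration only, the sketch below assumes the mutual-information-loss criterion of information-theoretic co-clustering (Dhillon, Mallela, and Modha, KDD 2003) and applies it to a single correlated domain, namely the activities occurring in each trace. The toy log, the trace and activity names, and the helper functions `mutual_information`, `coarsen`, and `information_loss` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution given as {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def coarsen(joint, row_cluster, col_cluster):
    """Aggregate the joint distribution over trace clusters and feature clusters."""
    coarse = defaultdict(float)
    for (x, y), p in joint.items():
        coarse[(row_cluster[x], col_cluster[y])] += p
    return dict(coarse)

def information_loss(joint, row_cluster, col_cluster):
    """Mutual information lost by replacing traces and features with their clusters."""
    return (mutual_information(joint)
            - mutual_information(coarsen(joint, row_cluster, col_cluster)))

# Hypothetical toy log: how often each activity occurs in each trace.
counts = {
    ("t1", "a"): 2, ("t1", "b"): 2,
    ("t2", "a"): 2, ("t2", "b"): 2,
    ("t3", "c"): 2, ("t3", "d"): 2,
    ("t4", "c"): 2, ("t4", "d"): 2,
}
total = sum(counts.values())
joint = {key: value / total for key, value in counts.items()}

# Two candidate trace clusterings, evaluated against one activity clustering.
activity_clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
matching_traces = {"t1": 0, "t2": 0, "t3": 1, "t4": 1}
mixed_traces = {"t1": 0, "t2": 1, "t3": 0, "t4": 1}

loss_matching = information_loss(joint, matching_traces, activity_clusters)
loss_mixed = information_loss(joint, mixed_traces, activity_clusters)
print(loss_matching, loss_mixed)
```

Under this assumed criterion, a trace clustering that groups traces with the same activity profile loses less mutual information than one that mixes them, so a search over clusterings would prefer the former. Handling several correlated domains at once (for example structure plus a performance measure) would require combining one such loss term per domain; how the paper does this is not stated in the abstract.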
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chiaravalloti, A.D., Greco, G., Guzzo, A., Pontieri, L. (2006). An Information-Theoretic Framework for Process Structure and Data Mining. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_24
DOI: https://doi.org/10.1007/11823728_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)