An Information-Theoretic Framework for Process Structure and Data Mining

Chiaravalloti, Antonio D.; Greco, Gianluigi; Guzzo, Antonella; Pontieri, Luigi

doi:10.1007/11823728_24


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Process-oriented systems have increasingly attracted the attention of the data mining community, because applying inductive process mining techniques to log data opens opportunities for both the analysis of complex processes and the design of new process models. Currently, these techniques focus on structural aspects of the process and disregard data that many real systems record, such as information about activity executors, parameter values, and time-stamps.

In this paper, an enhanced process mining approach is presented, in which different process variants (use cases) are discovered by clustering log traces on the basis of both structural aspects and performance measures. To this end, an information-theoretic framework is used in which structural information and performance measures are each represented by a suitable domain that is correlated with the “central domain” of logged process instances. The log traces are then clustered jointly with the correlated domains. Finally, each cluster is equipped with a specific model, giving the analyst a compact and handy description of the execution paths that characterize each process variant.
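
The abstract does not spell out the objective function, so the following is only a minimal sketch of one common information-theoretic clustering criterion, assumed here for illustration: a grouping of traces is scored by how much mutual information between the trace domain and a correlated domain is lost when traces are merged into clusters. The toy log, the activity names, and the helper functions `joint_distribution`, `mutual_information`, and `coarsen` are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def joint_distribution(counts):
    """Normalize a {(row, col): count} table into a joint distribution."""
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items() if c > 0}


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y])) for (x, y), p in joint.items())


def coarsen(joint, row_cluster, col_cluster):
    """Aggregate the joint distribution over row and column clusters."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[(row_cluster[x], col_cluster[y])] += p
    return dict(out)


# Hypothetical toy log: trace id -> executed activities (structural domain).
traces = {
    "t1": ["a", "b", "d"],
    "t2": ["a", "b", "d"],
    "t3": ["a", "c", "d"],
    "t4": ["a", "c", "d"],
}

# Co-occurrence counts between the central domain (traces) and activities.
counts = defaultdict(int)
for trace_id, activities in traces.items():
    for activity in activities:
        counts[(trace_id, activity)] += 1
joint = joint_distribution(counts)

# Two candidate trace clusterings; activities are left unclustered here.
by_variant = {"t1": 0, "t2": 0, "t3": 1, "t4": 1}
mixed = {"t1": 0, "t2": 1, "t3": 0, "t4": 1}
identity = {a: a for a in "abcd"}

full = mutual_information(joint)
loss_by_variant = full - mutual_information(coarsen(joint, by_variant, identity))
loss_mixed = full - mutual_information(coarsen(joint, mixed, identity))

print(f"loss when grouping by variant: {loss_by_variant:.4f} bits")
print(f"loss when mixing variants:     {loss_mixed:.4f} bits")
```

In this toy setting, grouping the traces that follow the same path loses no information about which activities occur, whereas mixing the two paths loses all of it. With several correlated domains (for example, one for structure and one per performance measure), one plausible extension, again an assumption, would be to sum the per-domain losses and search for the trace clustering that minimizes the total.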

Author information

Authors and Affiliations

Dept. of Mathematics, UNICAL, Via P. Bucci 30B, 87036, Rende, Italy
Gianluigi Greco
ICAR, CNR, Via Pietro Bucci 41C, 87036, Rende, Italy
Antonio D. Chiaravalloti, Antonella Guzzo & Luigi Pontieri

Editor information

Editors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040, Wien, Austria
A Min Tjoa
Department of Software and Computing Systems, University of Alicante, Spain
Juan Trujillo

About this paper

Cite this paper

Chiaravalloti, A.D., Greco, G., Guzzo, A., Pontieri, L. (2006). An Information-Theoretic Framework for Process Structure and Data Mining. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_24

DOI: https://doi.org/10.1007/11823728_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)