Abstract
Classical outlier detection approaches are often ill-suited to process mining applications, because in these settings anomalies emerge not only as deviations from the sequences of events most often registered in the log, but also as deviations from the behavior prescribed by some (possibly unknown) process model. This paper addresses these issues with an approach for singling out anomalous evolutions within a set of process traces, which takes into account both the statistical properties of the log and the constraints associated with the process model. The approach combines the discovery of frequent execution patterns with a cluster-based anomaly detection procedure. Notably, this procedure is suited to categorical data and is hence of interest in its own right, given that outlier detection has mainly been studied on numerical domains in the literature. All the algorithms presented in the paper have been implemented and integrated into a system prototype, which has been thoroughly tested to assess its scalability and effectiveness.
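The general idea of pairing frequent patterns with cluster-based detection can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the details, so the sketch makes simplifying assumptions. Frequent "patterns" are taken to be directly-follows activity pairs, traces are compared by the Jaccard similarity of the patterns they contain, clustering is a greedy leader scheme, and traces falling into very small clusters are flagged. All function names and thresholds (`min_support`, `min_similarity`, `max_cluster_fraction`) are hypothetical.

```python
from collections import Counter


def frequent_pairs(traces, min_support):
    """Directly-follows pairs found in at least min_support (a fraction) of traces."""
    counts = Counter()
    for trace in traces:
        # Count each pair at most once per trace.
        counts.update(set(zip(trace, trace[1:])))
    threshold = min_support * len(traces)
    return {pair for pair, n in counts.items() if n >= threshold}


def pattern_profile(trace, patterns):
    """Categorical description of a trace: the frequent pairs it contains."""
    return frozenset(p for p in zip(trace, trace[1:]) if p in patterns)


def jaccard(a, b):
    """Jaccard similarity of two sets (two empty sets count as identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_traces(profiles, min_similarity):
    """Greedy leader clustering: join the first cluster whose leader is similar enough."""
    clusters = []  # list of (leader_profile, member_indices)
    for index, profile in enumerate(profiles):
        for leader, members in clusters:
            if jaccard(profile, leader) >= min_similarity:
                members.append(index)
                break
        else:
            clusters.append((profile, [index]))
    return [members for _, members in clusters]


def find_outliers(traces, min_support=0.3, min_similarity=0.5,
                  max_cluster_fraction=0.2):
    """Indices of traces that end up in clusters holding few of the traces."""
    patterns = frequent_pairs(traces, min_support)
    profiles = [pattern_profile(t, patterns) for t in traces]
    clusters = cluster_traces(profiles, min_similarity)
    limit = max_cluster_fraction * len(traces)
    return sorted(i for members in clusters if len(members) <= limit
                  for i in members)


# Toy log (invented for illustration): two common variants and one odd trace.
example_log = (
    [["a", "b", "c", "d"]] * 3
    + [["a", "c", "b", "d"]] * 3
    + [["d", "c", "a"]]
)
print(find_outliers(example_log))  # prints [6]
```

In this toy log the last trace shares no frequent pair with either common variant, so it forms a singleton cluster and is the only one reported. Both variants are frequent enough to be treated as normal behavior, which a purely "most common sequence" criterion would not capture.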
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ghionna, L., Greco, G., Guzzo, A., Pontieri, L. (2008). Outlier Detection Techniques for Process Mining Applications. In: An, A., Matwin, S., Raś, Z.W., Ślęzak, D. (eds) Foundations of Intelligent Systems. ISMIS 2008. Lecture Notes in Computer Science, vol 4994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68123-6_17
DOI: https://doi.org/10.1007/978-3-540-68123-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68122-9
Online ISBN: 978-3-540-68123-6
eBook Packages: Computer Science, Computer Science (R0)