Abstract
Frequent Episode Mining (FEM) techniques play an important role in data mining and have multiple applications, ranging from identifying user marketing habits to detecting anomalies in computer networks. Most FEM approaches search exhaustively for frequent patterns, using a frequency threshold to prune the search space. While this approach is efficient on small datasets, the heavy processing it requires leads to poor performance on large ones. This paper proposes a fast frequent episode mining method that utilizes Finite-State Machines (FSMs). Initially, an FSM is created from a subset of the data in order to approximate the type and frequency of the most dominant episodes. Instead of applying traditional exhaustive search procedures, the parsing of the dataset is then guided by the proposed FSM approach. Experimental results show that the proposed approach runs faster than traditional FEM algorithms while still maintaining high accuracy.
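The general idea of sample-guided mining can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details the abstract does not give. It assumes a simplified setting: a single sequence of symbolic events, a first-order transition automaton built from a prefix of the data, and non-overlapping counting of serial episodes with gaps allowed. All function names and parameters (`build_transition_fsm`, `candidate_episodes`, `count_occurrences`, `min_count`, `length`) are hypothetical.

```python
from collections import Counter


def build_transition_fsm(sample):
    """Count symbol-to-symbol transitions in a sample (assumed first-order automaton)."""
    return Counter(zip(sample, sample[1:]))


def candidate_episodes(transitions, min_count, length):
    """Chain transitions seen at least min_count times into serial episodes.

    Intended for length >= 2; only these candidates are later checked
    against the full data, instead of enumerating every possible pattern.
    """
    strong = {}
    for (a, b), n in transitions.items():
        if n >= min_count:
            strong.setdefault(a, []).append(b)
    paths = [(a,) for a in strong]
    for _ in range(length - 1):
        paths = [p + (b,) for p in paths for b in strong.get(p[-1], [])]
    return sorted(paths)


def count_occurrences(sequence, episode):
    """Count non-overlapping occurrences of a serial episode (gaps allowed).

    The integer `state` is the state of a small finite-state machine:
    it advances on the next expected event and resets after a full match.
    """
    state = 0
    count = 0
    for event in sequence:
        if event == episode[state]:
            state += 1
            if state == len(episode):
                count += 1
                state = 0
    return count


# Hypothetical toy data: the first six events serve as the sample.
sequence = list("ABCABCABDABC")
fsm = build_transition_fsm(sequence[:6])
candidates = candidate_episodes(fsm, min_count=2, length=3)
counts = {ep: count_occurrences(sequence, ep) for ep in candidates}
```

In this sketch the full sequence is scanned once per candidate, so the cost depends on how many candidates the sample suggests rather than on the size of the full pattern space. The price is that an episode which is rare in the sample but frequent overall would be missed, which is why such an approach is approximate.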
Notes
1. This data is available thanks to msnbc.com.
Acknowledgments
This work has been partially supported by the European Commission through project FP7-ICT-317888-NEMESYS funded by the 7th framework program. The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the European Commission.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Papadopoulos, S., Drosou, A., Tzovaras, D. (2016). Fast Frequent Episode Mining Based on Finite-State Machines. In: Abdelrahman, O., Gelenbe, E., Gorbil, G., Lent, R. (eds) Information Sciences and Systems 2015. Lecture Notes in Electrical Engineering, vol 363. Springer, Cham. https://doi.org/10.1007/978-3-319-22635-4_18
DOI: https://doi.org/10.1007/978-3-319-22635-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22634-7
Online ISBN: 978-3-319-22635-4
eBook Packages: Engineering, Engineering (R0)