Fast Frequent Episode Mining Based on Finite-State Machines

  • Conference paper
Information Sciences and Systems 2015

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 363)

Abstract

Frequent Episode Mining (FEM) techniques play an important role in data mining and have many applications, ranging from identifying user marketing habits to detecting anomalies in computer networks. Most FEM approaches search exhaustively for frequent patterns and use a frequency threshold to reduce the search space. This works efficiently on small datasets, but on large datasets the processing load becomes heavy and performance drops. This paper proposes a fast frequent episode mining method that uses Finite-State Machines (FSMs). First, an FSM is built from a subset of the data in order to approximate the type and frequency of the most dominant episodes. The parsing of the full dataset is then guided by this FSM instead of by a traditional exhaustive search. Experimental results show that the proposed approach has better time performance than traditional FEM algorithms while still maintaining high accuracy.
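
The abstract gives only the outline of the method, so the following is a minimal Python sketch of the general idea rather than the authors' algorithm. It assumes a single event sequence, a first-order FSM whose states are event types and whose edges count direct successions in a prefix sample, and contiguous (gap-free) episode occurrences. The function names (`build_fsm`, `candidate_episodes`, `count_occurrences`, `mine`) and the parameters (`sample_size`, `min_edge_count`, `min_support`, `max_len`) are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def build_fsm(sample):
    """Count transitions between consecutive events in the sample.

    Assumption: states are event types and an edge a -> b is weighted by
    how often b directly follows a. The paper's FSM may be built differently.
    """
    fsm = defaultdict(Counter)
    for a, b in zip(sample, sample[1:]):
        fsm[a][b] += 1
    return fsm


def candidate_episodes(fsm, min_edge_count, max_len):
    """Enumerate paths of 2..max_len states that use only frequent edges."""
    candidates = set()
    stack = [(state,) for state in fsm]
    while stack:
        path = stack.pop()
        if len(path) >= 2:
            candidates.add(path)
        if len(path) == max_len:
            continue  # do not extend paths beyond the length limit
        for nxt, count in fsm.get(path[-1], {}).items():
            if count >= min_edge_count:
                stack.append(path + (nxt,))
    return candidates


def count_occurrences(sequence, episode):
    """Count contiguous, possibly overlapping occurrences of an episode."""
    k = len(episode)
    return sum(
        1
        for i in range(len(sequence) - k + 1)
        if tuple(sequence[i:i + k]) == episode
    )


def mine(sequence, sample_size, min_edge_count, min_support, max_len=3):
    """Use an FSM built on a prefix sample to decide which episodes to count."""
    fsm = build_fsm(sequence[:sample_size])
    result = {}
    for episode in candidate_episodes(fsm, min_edge_count, max_len):
        support = count_occurrences(sequence, episode)
        if support >= min_support:
            result[episode] = support
    return result


if __name__ == "__main__":
    events = list("ABCABCABDABC")
    frequent = mine(events, sample_size=6, min_edge_count=2, min_support=3)
    for episode, support in sorted(frequent.items()):
        print("".join(episode), support)
```

In this sketch only the episodes suggested by the sampled FSM are counted in the full sequence, which is where a time saving over exhaustive candidate generation would come from. The price is that an episode absent from the sample is never considered, which is one way to read the abstract's trade-off between speed and accuracy.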


Notes

  1. This data is available thanks to msnbc.com.

Acknowledgments

This work has been partially supported by the European Commission through project FP7-ICT-317888-NEMESYS funded by the 7th framework program. The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the European Commission.

Author information

Corresponding author

Correspondence to Stavros Papadopoulos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Papadopoulos, S., Drosou, A., Tzovaras, D. (2016). Fast Frequent Episode Mining Based on Finite-State Machines. In: Abdelrahman, O., Gelenbe, E., Gorbil, G., Lent, R. (eds) Information Sciences and Systems 2015. Lecture Notes in Electrical Engineering, vol 363. Springer, Cham. https://doi.org/10.1007/978-3-319-22635-4_18

  • DOI: https://doi.org/10.1007/978-3-319-22635-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22634-7

  • Online ISBN: 978-3-319-22635-4

  • eBook Packages: Engineering, Engineering (R0)
