Mining frequent closed inter-sequence patterns efficiently using dynamic bit vectors

Applied Intelligence

Abstract

Mining frequent sequences is a critical step before rule generation in sequence databases. There are currently two main approaches to mining frequent sequences: intra-sequence mining and inter-sequence mining. Inter-sequence mining is the more attractive of the two because it considers the relationships between sequences in transactions. However, mining all possible frequent inter-sequences takes a long time and requires a lot of memory. Mining only the frequent closed inter-sequences is more efficient because closed sequences are compact and retain only the necessary information. CISP-Miner was proposed for mining frequent closed inter-sequence patterns, but it consumes a lot of memory since many closed patterns are mined. This paper proposes an algorithm called ClosedISP for mining frequent closed inter-sequence patterns. The algorithm uses a checking scheme that eliminates candidates early and checks for closed patterns without candidate maintenance. ClosedISP uses a dynamic bit vector that combines transaction information to compress data. In addition, it adopts a prefix tree and a depth-first search order to reduce the search space and to generate non-redundant sequential rules efficiently. Experiments were conducted to compare ClosedISP with CISP-Miner and to demonstrate its effectiveness in terms of runtime and memory usage.
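The abstract does not spell out the layout of the dynamic bit vector, so the following is only a minimal sketch of the general idea, assuming the vector is stored without its leading and trailing zero bytes together with the offset of the first non-zero byte. The class name `DynamicBitVector` and its methods are hypothetical and are not taken from the paper. The sketch shows how two such vectors could be intersected over their overlapping byte range only, and how support could be read off as the number of set bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicBitVector:
    """Bit vector kept without leading/trailing zero bytes (assumed layout)."""
    start: int   # index of the first non-zero byte in the full-length vector
    data: bytes  # bytes from the first to the last non-zero byte

    @classmethod
    def from_ids(cls, ids):
        """Set bit i for every transaction id i in ids."""
        ids = sorted(set(ids))
        if not ids:
            return cls(0, b"")
        first, last = ids[0] // 8, ids[-1] // 8
        buf = bytearray(last - first + 1)
        for i in ids:
            buf[i // 8 - first] |= 1 << (i % 8)
        return cls(first, bytes(buf))

    def intersect(self, other):
        """AND two vectors, touching only the byte range where they overlap."""
        lo = max(self.start, other.start)
        hi = min(self.start + len(self.data), other.start + len(other.data))
        if lo >= hi:
            return DynamicBitVector(0, b"")
        buf = bytes(
            self.data[k - self.start] & other.data[k - other.start]
            for k in range(lo, hi)
        )
        # Re-trim: the AND can produce new zero bytes at either end.
        stripped = buf.lstrip(b"\x00")
        lo += len(buf) - len(stripped)
        stripped = stripped.rstrip(b"\x00")
        if not stripped:
            return DynamicBitVector(0, b"")
        return DynamicBitVector(lo, stripped)

    def support(self):
        """Number of transactions covered, i.e. the number of set bits."""
        return sum(bin(b).count("1") for b in self.data)

    def ids(self):
        """Transaction ids whose bits are set, in increasing order."""
        return [
            (self.start + k) * 8 + bit
            for k, byte in enumerate(self.data)
            for bit in range(8)
            if byte >> bit & 1
        ]


a = DynamicBitVector.from_ids([3, 70, 71, 200])
b = DynamicBitVector.from_ids([70, 200, 201])
common = a.intersect(b)
print(common.ids(), common.support())
```

Under this assumed layout, a pattern that occurs only in a narrow band of transactions costs a few bytes instead of one bit per transaction in the whole database, which is the kind of saving the abstract attributes to the dynamic bit vector.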

Acknowledgments

This work was funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant no. 102.05-2013.20.

Author information

Corresponding author

Correspondence to Bay Vo.

About this article

Cite this article

Le, B., Tran, MT. & Vo, B. Mining frequent closed inter-sequence patterns efficiently using dynamic bit vectors. Appl Intell 43, 74–84 (2015). https://doi.org/10.1007/s10489-014-0630-1

  • DOI: https://doi.org/10.1007/s10489-014-0630-1
