General Algorithms for Mining Closed Flexible Patterns under Various Equivalence Relations

  • Tomohiro I
  • Yuki Enokuma
  • Hideo Bannai
  • Masayuki Takeda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7524)

Abstract

We address the closed pattern discovery problem in sequential databases for the class of flexible patterns. We propose two techniques for coarsening existing equivalence relations on the set of patterns to obtain new equivalence relations. Our new algorithm, GenCloFlex, generalizes MaxFlex, which was proposed by Arimura and Uno (2007) and designed for one particular equivalence relation. GenCloFlex can cope with both existing and new equivalence relations, and we investigate the computational complexity of the algorithm for each of them. We then present an improved algorithm, GenCloFlex+, based on new pruning techniques that improve the delay time per output for some of the equivalence relations. Computational experiments on synthetic data show that the proposed equivalence relations remove most of the redundancies in the mined patterns.
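
As a rough illustration of how an equivalence relation on patterns groups redundant results, the following Python sketch may help. It is not GenCloFlex or MaxFlex. It assumes that a flexible pattern is a sequence of constant strings separated by gaps of arbitrary length, and it assumes one simple relation in which two patterns are equivalent when they match at the same set of start positions in a single text. The function names `occurrences` and `equivalence_classes` are hypothetical and do not come from the paper.

```python
import re
from collections import defaultdict


def occurrences(pattern, text):
    """Return the start positions at which a flexible pattern matches.

    Assumption: `pattern` is a list of constant strings, and consecutive
    strings may be separated by a gap of any length (including zero).
    """
    # Join the escaped segments with a lazy "any characters" gap.
    regex = re.compile(".*?".join(re.escape(seg) for seg in pattern), re.DOTALL)
    # Pattern.match(text, i) only succeeds if a match begins exactly at i.
    return frozenset(i for i in range(len(text)) if regex.match(text, i))


def equivalence_classes(patterns, text):
    """Group patterns whose sets of start positions are identical.

    Assumption: this start-position relation is only an example of an
    equivalence relation, not necessarily one studied in the paper.
    """
    classes = defaultdict(list)
    for pattern in patterns:
        classes[occurrences(pattern, text)].append(tuple(pattern))
    return dict(classes)


if __name__ == "__main__":
    text = "abcab"
    patterns = [["a"], ["ab"], ["a", "b"], ["b"], ["a", "c"], ["abc"]]
    for positions, members in equivalence_classes(patterns, text).items():
        print(sorted(positions), members)
```

Reporting one representative per class instead of every member is the sense in which such a relation removes redundancy. Which member counts as closed depends on the relation and the ordering on patterns defined in the paper itself.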

Keywords

Equivalence Relation; Binary Relation; Sequential Pattern; Mining Algorithm; Pattern Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Agrawal, R., Srikant, R.: Mining sequential patterns. In: ICDE, pp. 3–14 (1995)
  2. Arimura, H., Uno, T.: Mining Maximal Flexible Patterns in a Sequence. In: Satoh, K., Inokuchi, A., Nagao, K., Kawamura, T. (eds.) JSAI 2007. LNCS (LNAI), vol. 4914, pp. 307–317. Springer, Heidelberg (2008)
  3. Ayres, J., Flannick, J., Gehrke, J., Yiu, T.: Sequential pattern mining using a bitmap representation. In: KDD, pp. 429–435 (2002)
  4. Blumer, A., Blumer, J., Haussler, D., McConnell, R., Ehrenfeucht, A.: Complete inverted files for efficient text retrieval and analysis. J. ACM 34(3), 578–595 (1987)
  5. Ding, B., Lo, D., Han, J., Khoo, S.-C.: Efficient mining of closed repetitive gapped subsequences from a sequence database. In: ICDE, pp. 1024–1035 (2009)
  6. Lo, D., Cheng, H., Lucia: Mining closed discriminative dyadic sequential patterns. In: EDBT, pp. 21–32 (2011)
  7. Lo, D., Ding, B., Lucia, Han, J.: Bidirectional mining of non-redundant recurrent rules from a sequence database. In: ICDE, pp. 1043–1054 (2011)
  8. Lo, D., Khoo, S.C., Li, J.: Mining and ranking generators of sequential patterns. In: SDM, pp. 553–564 (2008)
  9. Mannila, H., Toivonen, H., Verkamo, I.A.: Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery 1(3), 259–289 (1997)
  10. Parida, L., Rigoutsos, I., Floratos, A., Platt, D.E., Gao, Y.: Pattern discovery on character sets and real-valued data: linear bound on irredundant motifs and an efficient polynomial time algorithm. In: Proc. SODA, pp. 297–308 (2000)
  11. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: Prefixspan: Mining sequential patterns by prefix-projected growth. In: ICDE, pp. 215–224 (2001)
  12. Pisanti, N., Crochemore, M., Grossi, R., Sagot, M.-F.: A basis of tiling motifs for generating repeated patterns and its complexity for higher quorum. In: Rovan, B., Vojtáš, P. (eds.) MFCS 2003. LNCS, vol. 2747, pp. 622–631. Springer, Heidelberg (2003)
  13. Raïssi, C., Calders, T., Poncelet, P.: Mining Conjunctive Sequential Patterns. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part I. LNCS (LNAI), vol. 5211, p. 19. Springer, Heidelberg (2008)
  14. Srikant, R., Agrawal, R.: Mining Sequential Patterns: Generalizations and Performance Improvements. In: Apers, P.M.G., Bouzeghoub, M., Gardarin, G. (eds.) EDBT 1996. LNCS, vol. 1057, pp. 3–17. Springer, Heidelberg (1996)
  15. Tatti, N., Cule, B.: Mining closed strict episodes. In: ICDM, pp. 501–510 (2010)
  16. Wang, J., Han, J.: BIDE: Efficient mining of frequent closed sequences. In: ICDE, pp. 79–90 (2004)
  17. Wang, J., Han, J., Li, C.: Frequent closed sequence mining without candidate maintenance. IEEE Transactions on Knowledge and Data Engineering 19(8), 1042–1056 (2007)
  18. Wang, K., Xu, Y., Yu, J.X.: Scalable sequential pattern mining for biological sequences. In: CIKM, pp. 178–187 (2004)
  19. Wu, H.-W., Lee, A.J.T.: Mining closed flexible patterns in time-series databases. Expert Systems with Applications 37(3), 2098 (2010)
  20. Yan, X., Han, J., Afshar, R.: CloSpan: Mining closed sequential patterns in large databases. In: SDM (2003)
  21. Zaki, M.J.: Spade: An efficient algorithm for mining frequent sequences. Machine Learning 42(1/2), 31–60 (2001)
  22. Zhou, W., Liu, H., Cheng, H.: Mining Closed Episodes from Event Sequences Efficiently. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds.) PAKDD 2010, Part I. LNCS, vol. 6118, pp. 310–318. Springer, Heidelberg (2010)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tomohiro I (1)
  • Yuki Enokuma (1)
  • Hideo Bannai (1)
  • Masayuki Takeda (1)

  1. Department of Informatics, Kyushu University, Fukuoka, Japan