A Global Constraint for Mining Sequential Patterns with GAP Constraint

  • Amina Kemmar
  • Samir Loudni
  • Yahia Lebbah
  • Patrice Boizumault
  • Thierry Charnois
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9676)

Abstract

Sequential pattern mining (SPM) under gap constraints is a challenging task. Many efficient specialized methods have been developed, but they all suffer from a lack of genericity. Constraint Programming (CP) approaches are less effective because of the size of their encodings. In [7], we proposed the global constraint Prefix-Projection for SPM, which remedies this drawback. However, this global constraint cannot be directly extended to support gap constraints. In this paper, we propose the global constraint GAP-SEQ, which handles SPM with or without gap constraints. GAP-SEQ relies on the principle of right pattern extensions. Experiments show that our approach clearly outperforms both CP approaches and the state-of-the-art cSpade method on large datasets.
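
To make the notion of a gap constraint concrete, the following is a minimal sketch, not the GAP-SEQ propagator described in the paper. It assumes the common convention that a gap constraint with bounds [M, N] requires the number of items skipped between two consecutive matched pattern items to lie between M and N. The function names (`occurs_with_gap`, `support`) and the toy database are hypothetical and chosen only for illustration; the matching is a brute-force backtracking check.

```python
from typing import List, Sequence


def occurs_with_gap(pattern: Sequence[str], sequence: Sequence[str],
                    min_gap: int, max_gap: int) -> bool:
    """True if `pattern` is a subsequence of `sequence` in which the number
    of items skipped between consecutive matches lies in [min_gap, max_gap].
    (Assumed definition of the gap constraint; illustrative only.)"""
    if not pattern:
        return True

    def extend(k: int, prev: int) -> bool:
        # k: index of the next pattern item; prev: position of the last match.
        if k == len(pattern):
            return True
        lo = prev + 1 + min_gap
        hi = min(prev + 1 + max_gap, len(sequence) - 1)
        for pos in range(lo, hi + 1):
            # Try each admissible position and backtrack on failure.
            if sequence[pos] == pattern[k] and extend(k + 1, pos):
                return True
        return False

    # The first pattern item may be matched anywhere in the sequence.
    return any(sequence[start] == pattern[0] and extend(1, start)
               for start in range(len(sequence)))


def support(pattern: Sequence[str], database: List[Sequence[str]],
            min_gap: int, max_gap: int) -> int:
    """Number of database sequences containing `pattern` under the gap bounds."""
    return sum(occurs_with_gap(pattern, s, min_gap, max_gap) for s in database)


toy_db = [list("ABCD"), list("ACBD"), list("ADB")]
print(support(["A", "B"], toy_db, 0, 0))  # 1: only ABCD has B right after A
print(support(["A", "B"], toy_db, 0, 1))  # 3: one skipped item is now allowed
```

A pattern is then frequent if its support reaches a user-chosen threshold. Note that tightening the gap bounds can make a pattern infrequent even though it occurs as an ordinary subsequence in every sequence, as the toy database shows.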

Keywords

Sequential Pattern, Constraint Programming, Constraint Satisfaction Problem, Global Constraint, Frequent Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Agrawal, R., Srikant, R.: Mining sequential patterns. In: Yu, P.S., Chen, A.L.P. (eds.) ICDE, pp. 3–14. IEEE Computer Society (1995)
  2. Ayres, J., Flannick, J., Gehrke, J., Yiu, T.: Sequential pattern mining using a bitmap representation. In: KDD 2002, pp. 429–435. ACM (2002)
  3. Béchet, N., Cellier, P., Charnois, T., Crémilleux, B.: Sequential pattern mining to discover relations between genes and rare diseases. In: CBMS (2012)
  4. Coquery, E., Jabbour, S., Saïs, L., Salhi, Y.: A SAT-based approach for discovering frequent, closed and maximal patterns in a sequence. In: ECAI, pp. 258–263 (2012)
  5. Fournier-Viger, P., Gomariz, A., Gueniche, T., Soltani, A., Wu, C., Tseng, V.: SPMF: a Java open-source pattern mining library. J. Mach. Learn. Res. 15, 3389–3393 (2014). http://jmlr.org/papers/v15/fournierviger14a.html
  6. Ji, X., Bailey, J., Dong, G.: Mining minimal distinguishing subsequence patterns with gap constraints. In: ICDM 2005, pp. 194–201 (2005)
  7. Kemmar, A., Loudni, S., Lebbah, Y., Boizumault, P., Charnois, T.: PREFIX-PROJECTION global constraint for sequential pattern mining. In: Pesant, G. (ed.) CP 2015. LNCS, vol. 9255, pp. 226–243. Springer, Heidelberg (2015)
  8. Kemmar, A., Ugarte, W., Loudni, S., Charnois, T., Lebbah, Y., Boizumault, P., Crémilleux, B.: Mining relevant sequence patterns with CP-based framework. In: ICTAI, pp. 552–559 (2014)
  9. Li, C., Wang, J.: Efficiently mining closed subsequences with gap constraints. In: Proceedings of the 2008 SIAM International Conference on Data Mining, pp. 313–322 (2008)
  10. Li, C., Yang, Q., Wang, J., Li, M.: Efficient mining of gap-constrained subsequences and its various applications. Trans. Knowl. Discov. Data 6(1), 2:1–2:39 (2012)
  11. Métivier, J.P., Loudni, S., Charnois, T.: A constraint programming approach for mining sequential patterns in a sequence database. In: ECML/PKDD Workshop on Languages for Data Mining and Machine Learning (2013)
  12. Negrevergne, B., Guns, T.: Constraint-based sequence mining using constraint programming. In: Michel, L. (ed.) CPAIOR 2015. LNCS, vol. 9075, pp. 288–305. Springer, Heidelberg (2015)
  13. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: PrefixSpan: Mining sequential patterns by prefix-projected growth. In: ICDE, pp. 215–224. IEEE Computer Society (2001)
  14. Pei, J., Han, J., Wang, W.: Mining sequential patterns with constraints in large databases. In: CIKM 2002, pp. 18–25. ACM (2002)
  15. Wu, X., Zhu, X., He, Y., Arslan, A.N.: PMBC: pattern mining from biological sequences with wildcard constraints. Comput. Biol. Med. 43(5), 481–492 (2013)
  16. Yang, G.: Computational aspects of mining maximal frequent patterns. Theoret. Comput. Sci. 362(1–3), 63–85 (2006)
  17. Zaki, M.J.: Sequence mining in categorical domains: incorporating constraints. In: CIKM 2000, pp. 422–429 (2000)
  18. Zaki, M.J.: SPADE: an efficient algorithm for mining frequent sequences. Mach. Learn. 42(1/2), 31–60 (2001)
  19. Zhang, M., Kao, B., Cheung, D.W., Yip, K.Y.: Mining periodic patterns with gap requirement from sequences. TKDD 1(2) (2007)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Amina Kemmar (1)
  • Samir Loudni (2)
  • Yahia Lebbah (1)
  • Patrice Boizumault (2)
  • Thierry Charnois (3)
  1. LITIO, University of Oran 1, EPSECG of Oran, Oran, Algeria
  2. GREYC (CNRS UMR 6072), University of Caen, Caen, France
  3. LIPN (CNRS UMR 7030), University Paris 13, Paris, France
