
A Proposition for Sequence Mining Using Pattern Structures

  • Victor Codocedo
  • Guillaume Bosc
  • Mehdi Kaytoue
  • Jean-François Boulicaut
  • Amedeo Napoli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10308)

Abstract

In this article we present a novel approach to rare sequence mining using pattern structures. In particular, we are interested in mining closed sequences, a type of maximal sub-element that provides a succinct description of the patterns in a sequence database. We present and describe a sequence pattern structure model in which rare closed subsequences can be easily encoded. We also discuss and characterize the search space of closed sequences and, through the notion of sequence alignments, provide an intuitive implementation of a similarity operator for the sequence pattern structure based on directed acyclic graphs. Finally, we provide an experimental evaluation of our approach against state-of-the-art closed sequence mining algorithms, showing that our approach can largely outperform them when dealing with large regions of the search space.
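To make the notion of a closed sequence concrete, the following is a minimal brute-force sketch, not the method proposed in the paper. It assumes sequences of single items written as strings, and the helper names (`is_subsequence`, `all_subsequences`, `support`, `closed_subsequences`) are hypothetical. A subsequence is treated as closed when no strictly longer supersequence occurs in the same number of database sequences.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order, gaps allowed."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is respected.
    return all(item in it for item in pattern)


def all_subsequences(sequence):
    """Enumerate every non-empty subsequence (exponential; toy inputs only)."""
    return {
        "".join(sequence[i] for i in idx)
        for r in range(1, len(sequence) + 1)
        for idx in combinations(range(len(sequence)), r)
    }


def support(pattern, database):
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database)


def closed_subsequences(database):
    """Map each closed subsequence to its support.

    Illustrative assumption: a pattern is closed if no longer candidate
    containing it has the same support.
    """
    candidates = set().union(*(all_subsequences(s) for s in database))
    supp = {p: support(p, database) for p in candidates}
    closed = {}
    for p, sp in supp.items():
        has_equal_super = any(
            len(q) > len(p) and supp[q] == sp and is_subsequence(p, q)
            for q in candidates
        )
        if not has_equal_super:
            closed[p] = sp
    return closed


database = ["abc", "abd", "ab"]
result = closed_subsequences(database)
```

On this toy database the closed subsequences are `ab` (support 3), `abc` and `abd` (support 1 each); the last two are the rare ones under a support threshold of 1. The exhaustive enumeration is only meant to illustrate the definition and does not scale beyond very short sequences.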

Keywords

Search Space · Directed Acyclic Graph · Sequence Pattern · Sink Node · Pattern Structure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Victor Codocedo (1, 3)
  • Guillaume Bosc (2)
  • Mehdi Kaytoue (2)
  • Jean-François Boulicaut (2)
  • Amedeo Napoli (3)

  1. Inria Chile, Las Condes, Chile
  2. Université de Lyon, CNRS, INSA-Lyon, LIRIS, Lyon, France
  3. LORIA (CNRS – INRIA Nancy Grand-Est – Université de Lorraine), Nancy, France
