Abstract
In this article we present a novel approach to rare sequence mining using pattern structures. In particular, we are interested in mining closed sequences, a type of maximal sub-element that provides a succinct description of the patterns in a sequence database. We present a sequence pattern structure model in which rare closed subsequences can be easily encoded. We also discuss and characterize the search space of closed sequences and, through the notion of sequence alignments, provide an intuitive implementation of a similarity operator for the sequence pattern structure based on directed acyclic graphs. Finally, we provide an experimental evaluation of our approach against state-of-the-art closed sequence mining algorithms, showing that our approach can substantially outperform them when dealing with large regions of the search space.
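To make the notion of a closed sequence concrete, the following is a minimal brute-force sketch in Python. It is not the pattern-structure method of the paper. It assumes the usual definition from the closed sequence mining literature: a sequence is closed if no proper supersequence has the same support. It also simplifies sequences to ordered tuples of single items. The function names and the `max_support` parameter (used here to stand in for "rare") are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` can be obtained from `sequence` by deleting items."""
    it = iter(sequence)
    # `item in it` advances the iterator, so item order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in database)


def all_subsequences(sequence):
    """All non-empty subsequences of `sequence` (exponential, toy data only)."""
    subs = set()
    for k in range(1, len(sequence) + 1):
        for idx in combinations(range(len(sequence)), k):
            subs.add(tuple(sequence[i] for i in idx))
    return subs


def closed_sequences(database, max_support=None):
    """Map each closed sequence to its support.

    A pattern is kept if no strictly longer supersequence has the same
    support. If `max_support` (a hypothetical rarity threshold) is given,
    only patterns with support <= max_support are returned.
    """
    candidates = set()
    for s in database:
        candidates |= all_subsequences(tuple(s))
    supports = {p: support(p, database) for p in candidates}

    closed = {}
    for p, sup in supports.items():
        if max_support is not None and sup > max_support:
            continue
        has_equal_super = any(
            len(q) > len(p) and supports[q] == sup and is_subsequence(p, q)
            for q in candidates
        )
        if not has_equal_super:
            closed[p] = sup
    return closed


if __name__ == "__main__":
    db = [("a", "b", "c"), ("a", "b", "d"), ("a", "b")]
    print(closed_sequences(db))
    print(closed_sequences(db, max_support=1))
```

On this toy database, `("a",)` is not closed because `("a", "b")` occurs in exactly the same sequences, which is how closed sequences give a more succinct description than listing every subsequence. The enumeration is exponential in the sequence length, so the sketch is only meant for very small examples.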
Notes
- 2. Open-source data mining library - http://www.philippe-fournier-viger.com/spmf/.
- 4. Version 0.97d / 0.97e - 2015-12-06.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Codocedo, V., Bosc, G., Kaytoue, M., Boulicaut, J.-F., Napoli, A. (2017). A Proposition for Sequence Mining Using Pattern Structures. In: Bertet, K., Borchmann, D., Cellier, P., Ferré, S. (eds) Formal Concept Analysis. ICFCA 2017. Lecture Notes in Computer Science, vol 10308. Springer, Cham. https://doi.org/10.1007/978-3-319-59271-8_7
DOI: https://doi.org/10.1007/978-3-319-59271-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59270-1
Online ISBN: 978-3-319-59271-8
eBook Packages: Computer Science, Computer Science (R0)