A Similarity Measure for Sequences of Categorical Data Based on the Ordering of Common Elements

Gómez-Alonso, Cristina; Valls, Aida

doi:10.1007/978-3-540-88269-5_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5285)

Included in the following conference series: International Conference on Modeling Decisions for Artificial Intelligence

Abstract

Similarity measures are commonly used to compare items and to identify pairs or groups of similar individuals. The choice of similarity measure depends strongly on the type of values being compared. We address the case in which each individual is described by a sequence of events (e.g., the sequence of web pages visited by a user, or a person's daily schedule). Some measures exist for numerical sequences, but very few methods consider sequences of categorical data. In this paper, we present a new similarity measure for sequences of categorical labels and compare it with previous approaches.

Author information

Authors and Affiliations

iTAKA Research Group - Intelligent Tech. for Advanced Knowledge Acquisition, Department of Computer Science and Mathematics, Universitat Rovira i Virgili, 43007, Tarragona, Catalonia, Spain
Cristina Gómez-Alonso & Aida Valls

Editor information

Editors and Affiliations

IIIA, Artificial Intelligence Research Institute, CSIC, Spanish National Research Council, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra
Toho Gakuen, 3-1-10 Naka, Kunitachi, 186-0004, Tokyo, Japan
Yasuo Narukawa

About this paper

Cite this paper

Gómez-Alonso, C., Valls, A. (2008). A Similarity Measure for Sequences of Categorical Data Based on the Ordering of Common Elements. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2008. Lecture Notes in Computer Science, vol 5285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88269-5_13

DOI: https://doi.org/10.1007/978-3-540-88269-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88268-8
Online ISBN: 978-3-540-88269-5
eBook Packages: Computer Science, Computer Science (R0)