Abstract
This work addresses conversion rate optimization through a better user experience during navigation on e-commerce web sites. The requirement is to segment visitors into meaningful clusters, which can then be targeted with specific calls to action in order to increase web site turnover. This paper presents an original approach that equally combines global- and local-alignment techniques (Needleman-Wunsch and Smith-Waterman) in order to automatically segment visitors according to the sequence of pages they visit. Experimental results on synthetic datasets show that our approach outperforms other commonly used alignment metrics, such as hybrid approaches or Dynamic Time Warping.
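The abstract does not spell out how the two alignment scores are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It computes a Needleman-Wunsch (global) score and a Smith-Waterman (local) score for two page sequences and averages their normalized values. The scoring constants, the normalization, and the names `glocal_similarity` and `weight` are assumptions made for illustration.

```python
from typing import Sequence

# Assumed scoring scheme (not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -1


def needleman_wunsch(a: Sequence[str], b: Sequence[str]) -> int:
    """Global alignment score with a linear gap penalty."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * GAP]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            curr.append(max(prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def smith_waterman(a: Sequence[str], b: Sequence[str]) -> int:
    """Best local alignment score; cells are floored at zero."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cell = max(0, prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def glocal_similarity(a: Sequence[str], b: Sequence[str], weight: float = 0.5) -> float:
    """Hypothetical combination: weighted mean of normalized global and local scores.

    weight=0.5 gives both scores equal influence. The normalization below is an
    assumption chosen so that each score falls in [0, 1] under the constants above.
    """
    if not a or not b:
        return 0.0
    longest, shortest = max(len(a), len(b)), min(len(a), len(b))
    global_norm = (needleman_wunsch(a, b) + longest) / (2 * longest)
    local_norm = smith_waterman(a, b) / shortest
    return weight * global_norm + (1 - weight) * local_norm


short_visit = ["product", "cart"]
long_visit = ["home", "search", "product", "cart", "checkout", "home", "search"]
print(glocal_similarity(short_visit, long_visit))
```

With sequences of very different lengths, as in the usage lines above, the global score is pulled down by the gaps needed to cover the length difference, while the local score still rewards the shared `product`, `cart` run; mixing the two is one way to keep both signals.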
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Luu, VT., Ripken, M., Forestier, G., Fondement, F., Muller, PA. (2016). Using Glocal Event Alignment for Comparing Sequences of Significantly Different Lengths. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_5
DOI: https://doi.org/10.1007/978-3-319-41920-6_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)