Clustering Web Page Sessions Using Sequence Alignment Method
This paper illustrates the clustering of web page sessions in order to identify users’ navigation patterns. In the approach presented here, user sessions of variable length are compared pairwise, the number of alignments between each pair is found, and the distances are measured. The web page sessions are then clustered using a modified k-means algorithm. A couple of web access logs, including the well-known NASA data set, are used to illustrate the effectiveness of the clustering. The R-squared measure is applied to determine the optimal number of clusters, and a chi-squared test is carried out to assess the association between the clustered web page sessions. These two measures indicate the goodness of the clusters formed.
Keywords: clustering, sequence alignment, web usage mining, R-squared measure, dynamic programming
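The pairwise comparison of variable-length sessions described in the abstract can be sketched with a standard dynamic-programming alignment. The abstract does not give the paper's scoring scheme, so this is only an illustration under assumed unit costs for gaps and mismatches (a Levenshtein-style distance over page identifiers). The function names `alignment_distance` and `distance_matrix` and the sample page paths are hypothetical.

```python
from typing import List, Sequence


def alignment_distance(a: Sequence[str], b: Sequence[str],
                       gap_cost: int = 1, mismatch_cost: int = 1) -> int:
    """Cost of the cheapest global alignment of two sessions.

    Assumption: unit gap and mismatch costs; the paper's actual
    scoring scheme is not stated in the abstract.
    """
    n = len(b)
    # Row 0 of the DP table: align the empty prefix of `a` with b[:j].
    prev = [j * gap_cost for j in range(n + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: align a[:i] with the empty prefix of `b`.
        curr = [i * gap_cost] + [0] * n
        for j in range(1, n + 1):
            match = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch_cost)
            gap_in_b = prev[j] + gap_cost      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap_cost  # b[j-1] aligned to a gap
            curr[j] = min(match, gap_in_b, gap_in_a)
        prev = curr
    return prev[n]


def distance_matrix(sessions: List[Sequence[str]]) -> List[List[int]]:
    """Symmetric matrix of pairwise alignment distances."""
    k = len(sessions)
    dist = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = alignment_distance(sessions[i], sessions[j])
    return dist


# Hypothetical sessions, each an ordered list of requested pages.
s1 = ["/home", "/news", "/sports"]
s2 = ["/home", "/sports"]
s3 = ["/shop", "/cart", "/checkout", "/thanks"]
print(distance_matrix([s1, s2, s3]))
```

A matrix like this could then be handed to a distance-based clustering step. The modified k-means algorithm mentioned in the abstract is not reproduced here.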