Web User Session Clustering Using Modified K-Means Algorithm

  • G. Poornalatha
  • Prakash S. Raghavendra
Part of the Communications in Computer and Information Science book series (CCIS, volume 191)


The proliferation of the internet, along with the growing attractiveness of the web in recent years, has made web mining a research area of great importance. Web mining has many advantages, which make it attractive to researchers. Analysis of web users' navigational patterns within a web site can provide useful information for applications such as server performance enhancement, web site restructuring, and direct marketing in e-commerce. Navigation paths may be explored using a similarity criterion in order to draw useful inferences about web usage. The objective of this paper is to propose an effective clustering technique that groups users' sessions by modifying the K-means algorithm, and to suggest a method for computing the distance between sessions based on the similarity of their web access paths, which handles user sessions of variable length.
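To make the idea of a session distance that tolerates variable-length sessions concrete, the following is a minimal sketch and not the authors' method, whose details are not given in this abstract. It assumes each session is a list of page URLs, uses the plain Jaccard index over the sets of visited pages (which ignores visit order), and replaces the K-means mean with a medoid, since a mean of page sets is not defined. The names `jaccard_distance` and `cluster_sessions`, the seeding from the first `k` sessions, and the sample URLs are all hypothetical.

```python
from typing import List, Sequence


def jaccard_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| over the sets of pages in two sessions.

    Sessions may have different lengths; only the pages visited matter.
    """
    pages_a, pages_b = set(a), set(b)
    union = pages_a | pages_b
    if not union:
        return 0.0  # two empty sessions are treated as identical
    return 1.0 - len(pages_a & pages_b) / len(union)


def cluster_sessions(sessions: List[Sequence[str]], k: int,
                     max_iter: int = 20) -> List[int]:
    """K-means-style loop that uses medoids in place of means.

    Assumption: the first k sessions seed the clusters.
    """
    medoids = list(range(k))
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each session joins its nearest medoid.
        labels = [
            min(range(k),
                key=lambda c: jaccard_distance(s, sessions[medoids[c]]))
            for s in sessions
        ]
        # Update step: the new medoid minimises total distance to members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep an empty cluster's seed
                continue
            best = min(
                members,
                key=lambda i: sum(
                    jaccard_distance(sessions[i], sessions[j])
                    for j in members
                ),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # assignments can no longer change
        medoids = new_medoids
    return labels


sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/blog"],
    ["/home", "/products"],
    ["/blog", "/about", "/home"],
]
labels = cluster_sessions(sessions, k=2)
```

With these four sample sessions, the shopping-oriented sessions and the blog-oriented sessions end up in separate clusters even though their lengths differ. A distance that also accounts for the order of pages in the access path would need a different similarity measure than the set-based one assumed here.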


web mining, clustering, K-means, Jaccard Index





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • G. Poornalatha (1)
  • Prakash S. Raghavendra (1)
  1. Department of Information Technology, National Institute of Technology Karnataka (NITK), Mangalore, India
