Mining Significant Usage Patterns from Clickstream Data

  • Lin Lu
  • Margaret Dunham
  • Yu Meng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4198)


Discovering usage patterns from Web data is one of the primary goals of Web Usage Mining. This paper proposes a technique for generating Significant Usage Patterns (SUPs) and uses it to extract significant “user preferred navigational trails”. The technique runs as a pipeline of processing phases: sub-abstraction of sessionized Web clickstreams, clustering of the abstracted Web sessions, concept-based abstraction of the clustered sessions, and SUP generation. With this technique, Web site practitioners can extract valuable information about customer behavior. Experiments on Web log data provided by J.C.Penney show that the SUPs of different types of customers are distinguishable and interpretable. The technique is particularly suited to the analysis of dynamic websites.
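The abstract does not give the details of each phase, so the following is only a minimal sketch of how the first two phases might be approached, not the authors' implementation. It assumes that abstraction means replacing each URL with a page category, and that sessions could be compared by a normalized longest common subsequence (LCS), a term that appears in the keyword list. The URL-to-category mapping, the function names, and the normalization are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical mapping from concrete URLs to abstract page categories.
# These URLs and category names are invented for illustration only.
URL_TO_CATEGORY: Dict[str, str] = {
    "/home": "Home",
    "/dept/shoes": "Department",
    "/dept/mens": "Department",
    "/item/123": "Item",
    "/item/456": "Item",
    "/cart": "Cart",
    "/checkout": "Checkout",
}


def abstract_session(urls: Sequence[str]) -> List[str]:
    """Replace each URL with its category; unknown URLs become 'Other'."""
    return [URL_TO_CATEGORY.get(url, "Other") for url in urls]


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """LCS length divided by the longer session length (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Two hypothetical sessions, abstracted and then compared.
s1 = abstract_session(["/home", "/dept/shoes", "/item/123", "/cart", "/checkout"])
s2 = abstract_session(["/home", "/dept/mens", "/item/456", "/item/123", "/cart"])
print(lcs_similarity(s1, s2))
```

A similarity of this kind could feed any clustering method that accepts pairwise similarities. Abstracting before comparing means two sessions that visit different items in the same order of page types still count as similar, which may be one reason abstraction helps on dynamic sites where concrete URLs rarely repeat.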


Keywords: User Session; Longest Common Subsequence; Abstraction Hierarchy; General Page; Clickstream Data





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lin Lu (1)
  • Margaret Dunham (1)
  • Yu Meng (1)
  1. Department of Computer Science and Engineering, Southern Methodist University, Dallas, USA
