A Pattern Restore Method for Restoring Missing Patterns in Server Side Clickstream Data

  • I-Hsien Ting
  • Chris Kimble
  • Daniel Kudenko
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3399)


When analyzing patterns in server side data, it quickly becomes apparent that some of the data originating from the client is lost, mainly due to the caching of web pages. Missing data is an important issue when using server side data to analyze a user’s browsing behavior, since the quality of the browsing patterns that can be identified depends on the quality of the data. In this paper, we present a series of experiments to demonstrate the extent of the data loss in different browsing environments and illustrate the difference this makes in the resulting browsing patterns when visualized as footstep graphs. We propose an algorithm, called the Pattern Restore Method (PRM), for restoring some of the data that has been lost, and evaluate the efficiency and accuracy of this algorithm.
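The abstract does not describe how PRM works, so the following is not the authors' algorithm. It is a minimal sketch of a generic referrer-based path-completion heuristic, included only to illustrate the kind of gap that caching leaves in a server log and how such a gap might be filled. The function name `complete_path` and the `(referrer, url)` input format are assumptions made for this illustration.

```python
from typing import List, Tuple


def complete_path(requests: List[Tuple[str, str]]) -> List[str]:
    """Re-insert inferred back-steps into one session's page sequence.

    `requests` holds (referrer, url) pairs in time order, as they might be
    read from a server log (hypothetical input format). When a request's
    referrer is not the page visited last but does appear earlier in the
    path, we assume the user pressed "back" through cached pages, which
    the server never saw, and we add those pages back in.
    """
    path: List[str] = []
    for referrer, url in requests:
        if path and referrer and referrer != path[-1] and referrer in path:
            # Index of the most recent visit to the referrer page.
            last_seen = len(path) - 1 - path[::-1].index(referrer)
            # Pages walked through on the way back, ending at the referrer.
            backtrack = path[last_seen:len(path) - 1][::-1]
            path.extend(backtrack)
        path.append(url)
    return path


# The server logged A, B, C, D, but D was reached from A, so the user
# presumably went back from C to B to A using cached copies.
logged = [("", "A"), ("A", "B"), ("B", "C"), ("A", "D")]
restored = complete_path(logged)
```

In this example the restored sequence contains the two cached back-steps (C to B, B to A) that are absent from the logged requests. A heuristic of this kind can only guess at the route taken; it cannot recover timing or confirm that the user actually viewed the intermediate pages.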


Keywords: Server Side; Proxy Server; User Session; Miss Data Problem; Side Data
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • I-Hsien Ting (1)
  • Chris Kimble (1)
  • Daniel Kudenko (1)
  1. Department of Computer Science, The University of York, Heslington, York, United Kingdom
