A Pattern Restore Method for Restoring Missing Patterns in Server Side Clickstream Data
When analyzing patterns in server side data, it quickly becomes apparent that some of the data originating from the client is lost, mainly because web pages are cached. Missing data is an important issue when using server side data to analyze a user’s browsing behavior, since the quality of the browsing patterns that can be identified depends on the quality of the data. In this paper, we present a series of experiments to demonstrate the extent of the data loss in different browsing environments and illustrate the difference this makes in the resulting browsing patterns when visualized as footstep graphs. We propose an algorithm, called the Pattern Restore Method (PRM), for restoring some of the data that has been lost, and evaluate the efficiency and accuracy of this algorithm.
Keywords: Server Side, Proxy Server, User Session, Miss Data Problem, Side Data