User Modeling and User-Adapted Interaction, Volume 25, Issue 4, pp 331–369

Methods for web revisitation prediction: survey and experimentation

  • George Papadakis
  • Ricardo Kawase
  • Eelco Herder
  • Wolfgang Nejdl

Abstract

More than 45% of the pages that we visit on the Web are pages that we have visited before. Browsers support revisits with various tools, including bookmarks, history views and URL auto-completion. However, these tools only support revisits to a small number of frequently and recently visited pages. Several browser plugins and extensions have been proposed to better support the long tail of less frequently visited pages, using recommendation and prediction techniques. In this article, we present a systematic overview of revisitation prediction techniques, classifying them into two main types and several subtypes. We also explain how the individual prediction techniques can be combined into comprehensive revisitation workflows that achieve higher accuracy. We investigate the performance of the most important workflows and provide a statistical analysis of the factors that affect their predictive accuracy. Further, we provide an upper bound for the accuracy of revisitation prediction using an ‘oracle’ that discards non-revisited pages.

Keywords

Web behavior · Navigation entropy · Revisitation prediction · Revisitation evaluation


Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  • George Papadakis (1)
  • Ricardo Kawase (2)
  • Eelco Herder (2)
  • Wolfgang Nejdl (2)

  1. Department of Informatics and Telecommunications, University of Athens, Panepistimiopolis, Ilissia, Greece
  2. Leibniz University of Hanover & L3S Research Center, Hanover, Germany
