Usage Data in Web Search: Benefits and Limitations
Web search, which has its roots in the mature field of information retrieval, has evolved tremendously over the last 20 years. The field underwent its first revolution when it started to deal with huge numbers of Web pages. A second major step came when engines started to consider the structure of the Web graph, and link analysis became a differentiator in both crawling and ranking. Finally, a more discreet but no less critical step was made when search engines started to monitor and mine the numerous (mostly implicit) signals that users provide while interacting with the search engine. We focus here on this third “revolution” of large-scale usage data. We detail the different forms it takes and illustrate its benefits through a review of some successful search features that would not have been possible without it. We also discuss its limitations and how, in some cases, it even conflicts with natural user aspirations such as personalization and privacy. We conclude by discussing how some of these conflicts can be circumvented by using adequate aggregation principles to create “ad hoc” crowds.
Keywords: Web search, usage data, wisdom of crowds, large scale data mining, privacy, personalization, long tail