A Versatile Tool for Privacy-Enhanced Web Search
We consider the problem of privacy leaks suffered by Internet users when they perform web searches, and propose a framework to mitigate them. Our approach, which builds upon and improves recent work on search privacy, approximates the target search results by replacing the private user query with a set of blurred or scrambled queries. The results of the scrambled queries are then used to cover the original user interest. We model the problem theoretically, define a set of privacy objectives with respect to web search, and investigate the effectiveness of the proposed solution with a set of real queries on a large web collection. Experiments show substantial improvements in retrieval effectiveness over a baseline previously reported in the literature. Furthermore, the methods are more versatile, behave more predictably, and apply to a wider range of information needs, and the privacy they provide is more comprehensible to the end-user.
Keywords: Document Sample, Semantic Approach, Retrieval Effectiveness, Target Document, Pointwise Mutual Information
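The query-replacement idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scrambled queries are supplied by hand rather than generated, the names `private_search`, `overlap_score`, and `toy_search` are hypothetical, and re-ranking the pooled results locally by term overlap is an assumption made only to show how results of scrambled queries might be used to cover the original interest without sending the private query to the engine.

```python
from collections import Counter
from typing import Callable, Iterable


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()


def overlap_score(query: str, document: str) -> int:
    """Hypothetical local scoring: count document tokens that match a query term."""
    query_terms = set(tokenize(query))
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in query_terms)


def private_search(
    private_query: str,
    scrambled_queries: Iterable[str],
    search: Callable[[str], list[str]],
    top_k: int = 3,
) -> list[str]:
    """Send only the scrambled queries to `search`, pool the results,
    and re-rank them on the client against the private query."""
    pooled: dict[str, None] = {}
    for query in scrambled_queries:
        for doc in search(query):
            pooled.setdefault(doc, None)  # de-duplicate, keep first-seen order
    ranked = sorted(
        pooled,
        key=lambda doc: overlap_score(private_query, doc),
        reverse=True,
    )
    return ranked[:top_k]


# Toy stand-in for a remote search engine (assumption: returns any document
# sharing at least one term with the query).
CORPUS = [
    "treatment options for chronic migraine",
    "migraine triggers and diet",
    "headache clinics in the city",
    "city parking regulations",
]


def toy_search(query: str) -> list[str]:
    terms = set(tokenize(query))
    return [doc for doc in CORPUS if terms & set(tokenize(doc))]


results = private_search(
    "chronic migraine treatment",
    ["headache", "diet triggers", "treatment options"],
    toy_search,
    top_k=2,
)
print(results)
```

In this toy setting the engine only ever sees the three scrambled queries; the private query is used solely on the client to order the pooled documents. How well such pooling covers the target results depends on how the scrambled queries are chosen, which this sketch does not address.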