On the Privacy of Web Search Based on Query Obfuscation: A Case Study of TrackMeNot

  • Sai Teja Peddinti
  • Nitesh Saxena
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6205)

Abstract

Web search is one of the most rapidly growing applications on the Internet today. However, the current practice of most search engines, namely logging and analyzing users’ queries, raises serious privacy concerns. One viable solution to search privacy is query obfuscation, whereby client-side software attempts to mask real user queries by injecting noisy queries. In contrast to other privacy-preserving search mechanisms, query obfuscation does not require server-side modifications or a third-party infrastructure, and can therefore be deployed readily at the discretion of privacy-conscious users. In this paper, our higher-level goal is to analyze whether query obfuscation can preserve users’ privacy in practice against an adversarial search engine. We focus on TrackMeNot (TMN) [10,20], a popular search privacy tool based on the principle of query obfuscation. We demonstrate that a search engine equipped with only a short-term history of a user’s search queries can break the privacy guarantees of TMN using only off-the-shelf machine learning classifiers.
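The attack described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract says only that off-the-shelf classifiers were used, and a small hand-written naive Bayes classifier stands in for them here. The queries, the labels "user" and "noise", and the function names `train` and `classify` are all hypothetical, chosen for illustration. The sketch assumes the search engine holds a short labelled history and wants to label new queries.

```python
import math
from collections import Counter

def train(labelled_queries):
    """Count word frequencies per label ('user' or 'noise')."""
    word_counts = {"user": Counter(), "noise": Counter()}
    label_counts = Counter()
    for text, label in labelled_queries:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(query, word_counts, label_counts):
    """Return the label with the higher Laplace-smoothed log-probability."""
    vocab = set(word_counts["user"]) | set(word_counts["noise"])
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("user", "noise"):
        # Log prior of the label, then add one smoothed log-likelihood per word.
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in query.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical short-term history: real user queries mixed with injected noise.
history = [
    ("cheap flights new york", "user"),
    ("new york hotel deals", "user"),
    ("flights to boston", "user"),
    ("senate vote results", "noise"),
    ("stock market headlines", "noise"),
    ("celebrity news headlines", "noise"),
]
model = train(history)
print(classify("boston hotel deals", *model))
print(classify("market news results", *model))
```

On this toy data the two vocabularies barely overlap, so separating them is trivial. The sketch shows only the shape of the attack: if noise queries are drawn from a different word distribution than a user's real queries, a generic classifier trained on a short history can tell them apart.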

Keywords

Web Search Privacy, Query Obfuscation, Noisy Queries

References

  1. AOL Search Log Mirrors, http://www.gregsadetsky.com/aol-data/
  2. Barbaro, M., Zeller, T.J.: A Face Is Exposed for AOL Searcher No. 4417749. The New York Times (August 9, 2006)
  3. Schneier, B.: Schneier on Security: TrackMeNot (2006), http://www.schneier.com/blog/archives/2006/08/trackmenot_1.html
  4. Chow, R., Golle, P.: Faking contextual data for fun, profit, and privacy. In: ACM Workshop on Privacy in the Electronic Society, WPES (2009)
  5. DTREG - Software For Predictive Modeling and Forecasting: Logistic regression (February 2010), http://www.dtreg.com/logistic.htm
  6. Evans, R.: Clustering for Classification. Master’s thesis, Computer Science, University of Waikato (2007), http://adt.waikato.ac.nz/uploads/approved/adt-uow20070730.091151/public/02whole.pdf
  7.
  8. Frank, E., Kirkby, R.: Random tree (February 2010), http://weka.sourceforge.net/doc/weka/classifiers/trees/RandomTree.html
  9. Hansell, S.: Marketers Trace Paths Users Leave on Internet. The New York Times (September 15, 2006)
  10. Howe, D., Nissenbaum, H.: TrackMeNot: Resisting Surveillance in Web Search. In: Kerr, I., Lucock, C., Steeves, V. (eds.) On the Identity Trail: Privacy, Anonymity and Identity in a Networked Society (2008)
  11. Jones, R., Kumar, R., Pang, B., Tomkins, A.: I know what you did last summer: query logs and user privacy. In: Conference on Information and Knowledge Management, CIKM (2007)
  12. Jones, R., Kumar, R., Pang, B., Tomkins, A.: Vanity fair: privacy in querylog bundles. In: Conference on Information and Knowledge Management, CIKM (2008)
  13.
  14. Kushilevitz, E., Ostrovsky, R.: Replication is not needed: single database, computationally-private information retrieval. In: Symposium on Foundations of Computer Science, FOCS (1997)
  15. NYTimes: Google Resists U.S. Subpoena of Search Data, http://www.nytimes.com/2006/01/20/technology/20google.html?_r=1
  16. PlanetLab: An open platform for developing, deploying, and accessing planetary-scale services, http://www.planet-lab.org/
  17. Saint-Jean, F., Johnson, A., Boneh, D., Feigenbaum, J.: Private web search. In: ACM Workshop on Privacy in the Electronic Society, WPES (2007)
  18. Scroogle.org, http://scroogle.org/
  19. Tor Anonymizing Network, http://www.torproject.org/
  20. TrackMeNot: Browser Plugin, http://www.mrl.nyu.edu/~dhowe/trackmenot/
  21. Tancer, B.: Click: What millions of people are doing online and why it matters. Hyperion (2008)
  22. Wikipedia: Alternating decision tree (February 2010), http://en.wikipedia.org/wiki/Alternating_decision_tree
  23. Witten, I., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Elsevier, Amsterdam (2005)
  24. Ye, S., Wu, S.F., Pandey, R., Chen, H.: Noise injection for search privacy protection. In: Conference on Computational Science and Engineering, CSE (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sai Teja Peddinti (1)
  • Nitesh Saxena (1)
  1. Computer Science and Engineering, Polytechnic Institute of New York University, USA