Volume 98, Issue 8, pp 827–846

Privacy-preserving distributed collaborative filtering

  • Antoine Boutet
  • Davide Frey
  • Rachid Guerraoui
  • Arnaud Jégou
  • Anne-Marie Kermarrec


We propose a new mechanism to preserve privacy while leveraging user profiles in distributed recommender systems. Our mechanism relies on two contributions: (i) an original obfuscation scheme, and (ii) a randomized dissemination protocol. We show that our obfuscation scheme hides the exact profiles of users without significantly decreasing their utility for recommendation. In addition, we precisely characterize the conditions under which our randomized dissemination protocol is differentially private. We compare our mechanism with both a non-private and a fully private alternative. We consider a real dataset from a user survey and report on simulations as well as PlanetLab experiments. We analyze our results in terms of the trade-off between accuracy and privacy, bandwidth consumption, and resilience to a censorship attack. In short, our extensive evaluation shows that our twofold mechanism provides a good trade-off between privacy and accuracy, with little overhead and high resilience.
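As a rough illustration of how randomization can make a reported opinion differentially private, the sketch below uses classic randomized response (Warner, 1965) on a single like/dislike bit. It is not the protocol of this article, whose details are not given in the abstract; the function names, the flip probability `flip_prob`, and the single-bit setting are all assumptions made for the example. Under these assumptions, flipping a bit with probability p < 0.5 gives ε = ln((1 − p)/p), and the true rate of likes can still be estimated from the noisy reports.

```python
import math
import random


def randomized_response(true_bit: bool, flip_prob: float, rng: random.Random) -> bool:
    """Report the true bit, flipped with probability flip_prob (hypothetical parameter)."""
    if not 0.0 < flip_prob < 0.5:
        raise ValueError("flip_prob must lie strictly between 0 and 0.5")
    return (not true_bit) if rng.random() < flip_prob else true_bit


def epsilon(flip_prob: float) -> float:
    """Differential-privacy parameter of binary randomized response."""
    return math.log((1.0 - flip_prob) / flip_prob)


def estimate_true_rate(reports: list[bool], flip_prob: float) -> float:
    """Unbiased estimate of the fraction of true 'like' bits from noisy reports."""
    observed = sum(reports) / len(reports)
    return (observed - flip_prob) / (1.0 - 2.0 * flip_prob)


if __name__ == "__main__":
    rng = random.Random(42)
    flip_prob = 0.25
    # Synthetic opinions (illustrative only): about 30% of users like the item.
    opinions = [rng.random() < 0.3 for _ in range(20000)]
    reports = [randomized_response(o, flip_prob, rng) for o in opinions]
    print("epsilon:", epsilon(flip_prob))
    print("estimated like rate:", estimate_true_rate(reports, flip_prob))
```

A smaller `flip_prob` yields a larger ε (weaker privacy) but less noisy estimates, which is one simple instance of the accuracy and privacy trade-off discussed above.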


Keywords

Privacy · Collaborative filtering · Obfuscation · Distributed system · Differential privacy

Mathematics Subject Classification

68W15 68M14 



Copyright information

© Springer-Verlag Wien 2015

Authors and Affiliations

  • Antoine Boutet (1)
  • Davide Frey (1)
  • Rachid Guerraoui (2)
  • Arnaud Jégou (1)
  • Anne-Marie Kermarrec (1)

  1. INRIA Rennes, Rennes, France
  2. EPFL, Lausanne, Switzerland
