Knowledge and Information Systems, Volume 46, Issue 1, pp 33–58

Privacy-preserving topic model for tagging recommender systems

  • Tianqing Zhu
  • Gang Li
  • Wanlei Zhou
  • Ping Xiong
  • Cao Yuan
Regular Paper


Tagging recommender systems give users the freedom to explore tags and obtain recommendations. Releasing and sharing tagging datasets would accelerate both commercial and research work on recommender systems. However, releasing the original tagging datasets raises serious privacy concerns, because an adversary with only a little background information may re-identify a user and that user's sensitive information. Several privacy techniques have recently been proposed to address this problem, but most of them lack a strict privacy notion and rarely prevent individuals from being re-identified from the dataset. This paper proposes a privacy-preserving tag release algorithm, PriTop. The algorithm is designed to satisfy differential privacy, a strict privacy notion whose goal here is to protect the users in a tagging dataset. PriTop comprises three privacy-preserving operations: private topic model generation structures the uncontrolled tags; private weight perturbation adds Laplace noise to the weights to hide the numbers of tags; and private tag selection finds the most suitable replacement tags for the original tags, so that the exact tags are hidden. We present extensive experimental results on four real-world datasets, including Delicious, MovieLens, and BibSonomy. The recommendation algorithm remains effective on every dataset, and the results suggest that PriTop retains the utility of the datasets while preserving privacy.
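The weight perturbation step described in the abstract can be illustrated with a minimal sketch of the standard Laplace mechanism. This is not the authors' implementation: the function names, the dictionary layout of per-topic tag counts, and the default sensitivity of 1 are all assumptions made for illustration, and the paper itself defines how the privacy budget is split across the three operations.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_topic_weights(weights, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace noise to hypothetical per-topic tag counts.

    weights:     dict mapping a topic id to the number of tags a user
                 has in that topic (an assumed representation).
    epsilon:     privacy budget spent on this step; must be positive.
    sensitivity: assumed maximum change in one count when a single
                 record changes (1.0 is an illustrative default).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # standard Laplace mechanism scale
    noisy = {}
    for topic, count in weights.items():
        # Clamping at zero is post-processing, so it does not weaken
        # the differential privacy guarantee of the noisy count.
        noisy[topic] = max(0.0, count + laplace_noise(scale, rng))
    return noisy


if __name__ == "__main__":
    user_weights = {"topic_0": 5, "topic_1": 0, "topic_2": 12}
    print(perturb_topic_weights(user_weights, epsilon=1.0, seed=42))
```

A smaller `epsilon` gives a larger noise scale, so the released counts reveal less about the true number of tags in each topic at the cost of utility.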


Keywords: Privacy preserving; Differential privacy; Topic model; Recommender system; Tagging system



Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  • Tianqing Zhu (1)
  • Gang Li (1), Email author
  • Wanlei Zhou (1)
  • Ping Xiong (2)
  • Cao Yuan (3)

  1. School of Information Technology, Deakin University, Melbourne, Australia
  2. School of Information and Security Engineering, Zhongnan University of Economics and Law, Wuhan, China
  3. School of Information Technology, Wuhan Polytechnic University, Wuhan, China
