
A differential privacy framework for matrix factorization recommender systems

Published in User Modeling and User-Adapted Interaction

Abstract

Recommender systems rely on personal information about user behavior to generate recommendations. Thus, they inherently have the potential to hamper user privacy and disclose sensitive information. Several works have studied how neighborhood-based recommendation methods can incorporate user privacy protection. However, privacy-preserving latent factor models, in particular those represented by matrix factorization techniques, the state of the art in recommender systems, have received little attention. In this paper, we address the problem of privacy-preserving matrix factorization by utilizing differential privacy, a rigorous and provable approach to privacy in statistical databases. We propose a generic framework and evaluate several ways in which differential privacy can be applied to matrix factorization. In doing so, we specifically address the privacy-accuracy trade-off offered by each of the algorithms. We show that, of all the algorithms considered, input perturbation results in the best recommendation accuracy, while guaranteeing a solid level of privacy protection against attacks that aim to gain knowledge about either specific user ratings or even the existence of these ratings. Our analysis additionally highlights the system aspects that should be addressed when applying differential privacy in practice and when considering potential privacy-preserving solutions.
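The input-perturbation approach highlighted above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, the clamping step, and the fixed seed are assumptions. Each observed rating is perturbed with Laplace noise calibrated to the rating range and the privacy budget epsilon before any factorization takes place.

```python
import numpy as np

def perturb_ratings(ratings, epsilon, r_min=1.0, r_max=5.0, seed=0):
    """Input perturbation: add Laplace noise to each observed rating.

    The sensitivity of a single rating is the rating range (r_max - r_min),
    so noise is drawn from Laplace(0, (r_max - r_min) / epsilon).
    Noisy values are clamped back into the valid rating range.
    """
    rng = np.random.default_rng(seed)
    scale = (r_max - r_min) / epsilon
    noisy = ratings + rng.laplace(0.0, scale, size=ratings.shape)
    return np.clip(noisy, r_min, r_max)

# Example: perturb a small vector of 1-5 star ratings with epsilon = 1.
ratings = np.array([4.0, 2.0, 5.0, 3.0])
noisy = perturb_ratings(ratings, epsilon=1.0)
```

The factorization itself then runs unchanged on the noisy matrix, which is what makes this variant attractive: the privacy mechanism is decoupled from the learning algorithm.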



Notes

  1. http://grouplens.org/datasets/movielens/.

  2. http://www.netflixprize.com/.

3. We used Matlab (specifically, the crossvalind function) to split the data.

  4. Results obtained for the MovieLens-100K dataset exhibit a similar trend and are not shown, but only summarized in Table 4.

  5. Due to technical limitations (computational time and memory requirements), the experiments with the input perturbation approach could not be conducted on the MovieLens-10M and Netflix datasets. Therefore, ISGD results are not shown in Table 5 and ISGD curves are missing from Fig. 4b, c.

6. We approached the authors, but the differentially private implementation of the kNN algorithm outlined in McSherry and Mironov (2009) was unfortunately not publicly available, so we were unable to reproduce the exact results reported therein.

7. Due to memory limitations, the kNN implementation was not feasible for the MovieLens-10M dataset.

References

  • Berkovsky, S., Eytani, Y., Kuflik, T., Ricci, F.: Hierarchical neighborhood topology for privacy enhanced collaborative filtering. In: Proceedings of Workshop on Privacy-Enhanced Personalization, PEP 2006, Montreal, Canada, pp. 6–13 (2006)

  • Berkovsky, S., Kuflik, T., Ricci, F.: The impact of data obfuscation on the accuracy of collaborative filtering. Expert Systems with Applications 39(5), 5033–5042 (2012)


  • Berlioz, A., Friedman, A., Kâafar, M.A., Boreli, R., Berkovsky, S.: Applying differential privacy to matrix factorization. In: Proceedings of the 9th ACM Conference on Recommender Systems, RecSys 2015, Vienna, Austria, pp. 107–114 (2015). doi:10.1145/2792838.2800173

  • Bhaskar, R., Laxman, S., Smith, A.D., Thakurta, A.: Discovering frequent patterns in sensitive data. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2010, Washington, DC, USA, pp. 503–512 (2010). doi:10.1145/1835804.1835869

  • Bilge, A., Gunes, I., Polat, H.: Robustness analysis of privacy-preserving model-based recommendation schemes. Expert Systems with Applications 41(8), 3671–3681 (2014)


  • Calandrino, J.A., Kilzer, A., Narayanan, A., Felten, E.W., Shmatikov, V.: “You Might Also Like”: Privacy risks of collaborative filtering. In: Proceedings of the 32nd IEEE Symposium on Security and Privacy, S&P 2011, Berkeley, CA, USA, pp. 231–246 (2011). doi:10.1109/SP.2011.40

  • Canny, J.F.: Collaborative filtering with privacy. In: Proceedings of the 23rd IEEE Symposium on Security and Privacy, S&P 2002, Berkeley, CA, USA, pp. 45–57 (2002). doi:10.1109/SECPRI.2002.1004361

  • Chaudhuri, K., Monteleoni, C., Sarwate, A.D.: Differentially private empirical risk minimization. Journal of Machine Learning Research 12, 1069–1109 (2011)


  • Cheng, Z., Hurley, N.: Trading robustness for privacy in decentralized recommender systems. In: Proceedings of the 21st Conference on Innovative Applications of Artificial Intelligence, IAAI 2009, Pasadena, CA, USA (2009)

  • Dwork, C.: Differential privacy: A survey of results. In: Proceedings of the 5th International Conference on Theory and Applications of Models of Computation, TAMC 2008, Xi’an, China, pp. 1–19 (2008). doi:10.1007/978-3-540-79228-4_1

  • Dwork, C., McSherry, F., Nissim, K., Smith, A.: Differential privacy – a primer for the perplexed. In: Joint UNECE/Eurostat work session on statistical data confidentiality. Tarragona, Spain (2011)

  • Dwork, C., McSherry, F., Nissim, K., Smith, A.D.: Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference, TCC 2006, New York, NY, USA, pp. 265–284 (2006). doi:10.1007/11681878_14

  • Erlingsson, Ú., Pihur, V., Korolova, A.: RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, CCS 2014, Scottsdale, AZ, USA, pp. 1054–1067 (2014). doi:10.1145/2660267.2660348

  • Friedman, A., Knijnenburg, B., Vanhecke, K., Martens, L., Berkovsky, S.: Privacy aspects of recommender systems. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 649–688. Springer (2015)

  • Friedman, A., Schuster, A.: Data mining with differential privacy. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2010, Washington, DC, USA, pp. 493–502 (2010). doi:10.1145/1835804.1835868

  • Hardt, M., Talwar, K.: On the geometry of differential privacy. In: Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, MA, USA, pp. 705–714 (2010). doi:10.1145/1806689.1806786

  • Harper, F.M., Konstan, J.A.: The movielens datasets: History and context. ACM Transactions on Interactive Intelligent Systems 5(4), 19 (2016)


  • Hay, M., Machanavajjhala, A., Miklau, G., Chen, Y., Zhang, D.: Principled evaluation of differentially private algorithms using DPBench. In: Proceedings of the International Conference on Management of Data, SIGMOD 2016, San Francisco, CA, USA, pp. 139–154 (2016). doi:10.1145/2882903.2882931

  • Jeckmans, A.J., Beye, M., Erkin, Z., Hartel, P., Lagendijk, R.L., Tang, Q.: Privacy in recommender systems. In: Ramzan, N., van Zwol, R., Lee, J.S., Clüver, K., Hua, X.S. (eds.) Social Media Retrieval, pp. 263–281. Springer (2013)

  • Kifer, D., Machanavajjhala, A.: No free lunch in data privacy. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, 2011, pp. 193–204 (2011). doi:10.1145/1989323.1989345

  • Klösgen, W.: Anonymization techniques for knowledge discovery in databases. In: Proceedings of the 1st International Conference on Knowledge Discovery and Data Mining, KDD 1995, Montreal, Canada, pp. 186–191 (1995)

  • Kobsa, A.: Privacy-enhanced web personalization. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) The Adaptive Web, pp. 628–670. Springer (2007)

  • Koren, Y., Bell, R.: Advances in collaborative filtering. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 77–118. Springer (2015)

  • Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer 42(8), 30–37 (2009)


  • Kosinski, M., Stillwell, D., Graepel, T.: Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences 110(15), 5802–5805 (2013)


  • Lam, S.K., Frankowski, D., Riedl, J.: Do you trust your recommendations? An exploration of security and privacy issues in recommender systems. In: Proceedings of the International Conference on Emerging Trends in Information and Communication Security, ETRICS 2006, Freiburg, Germany, pp. 14–29 (2006). doi:10.1007/11766155_2

  • Li, T., Unger, T.: Willing to pay for quality personalization? Trade-off between quality and privacy. European Journal of Information Systems 21(6), 621–642 (2012)


  • Machanavajjhala, A., Korolova, A., Sarma, A.D.: Personalized social recommendations - accurate or private? Proceedings of the VLDB Endowment 4(7), 440–450 (2011)


  • McSherry, F., Mironov, I.: Differentially private recommender systems: Building privacy into the Netflix Prize contenders. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009, pp. 627–636 (2009). doi:10.1145/1557019.1557090

  • Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: Proceedings of the 29th IEEE Symposium on Security and Privacy, (S&P 2008), Oakland, CA, USA, pp. 111–125 (2008). doi:10.1109/SP.2008.33

  • Netflix spilled your Brokeback Mountain secret. http://www.wired.com/threatlevel/2009/12/netflix-privacy-lawsuit/. Accessed: July 2016

  • Nikolaenko, V., Ioannidis, S., Weinsberg, U., Joye, M., Taft, N., Boneh, D.: Privacy-preserving matrix factorization. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, CCS 2013, Berlin, Germany, pp. 801–812 (2013). doi:10.1145/2508859.2516751

  • Ning, X., Desrosiers, C., Karypis, G.: A comprehensive survey of neighborhood-based recommendation methods. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 37–76. Springer (2015)

  • Parameswaran, R., Blough, D.M.: Privacy preserving collaborative filtering using data obfuscation. In: Proceedings of the IEEE International Conference on Granular Computing, GrC 2007, San Jose, CA, USA, pp. 380–386 (2007). doi:10.1109/GRC.2007.129

  • Polat, H., Du, W.: Achieving private recommendations using randomized response techniques. In: Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2006, Singapore, pp. 637–646 (2006). doi:10.1007/11731139_73

  • Ricci, F., Rokach, L., Shapira, B. (eds.): Recommender Systems Handbook, 2nd edn. Springer (2015)

  • Said, A., Berkovsky, S., De Luca, E.W., Hermanns, J.: Challenge on context-aware movie recommendation: Camra2011. In: Proceedings of the ACM Conference on Recommender Systems, RecSys 2011, Chicago, IL, USA, pp. 385–386 (2011). doi:10.1145/2043932.2044015

  • Sandhu, R.S., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role-based access control models. IEEE Computers 29(2), 38–47 (1996)


  • Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.: Analysis of recommendation algorithms for e-commerce. In: Proceedings of the ACM Conference on Electronic Commerce, Minneapolis, MN, USA, pp. 158–167 (2000). doi:10.1145/352871.352887

  • Sun, X., Kashima, H., Matsuzaki, T., Ueda, N.: Averaged stochastic gradient descent with feedback: An accurate, robust, and fast training method. In: Proceedings of the 10th IEEE International Conference on Data Mining, ICDM 2010, Sydney, Australia, pp. 1067–1072 (2010). doi:10.1109/ICDM.2010.26

  • Sweeney, L.: \(k\)-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)


  • Vallet, D., Friedman, A., Berkovsky, S.: Matrix factorization without user data retention. In: Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2014, Tainan, Taiwan, pp. 569–580 (2014). doi:10.1007/978-3-319-06608-0_47

  • Weinsberg, U., Bhagat, S., Ioannidis, S., Taft, N.: BlurMe: Inferring and obfuscating user gender based on ratings. In: Proceedings of the 6th ACM Conference on Recommender Systems, RecSys 2012, Dublin, Ireland, pp. 195–202 (2012). doi:10.1145/2365952.2365989

  • Zhou, Y., Wilkinson, D.M., Schreiber, R., Pan, R.: Large-scale parallel collaborative filtering for the Netflix Prize. In: Proceedings of the 4th International Conference on Algorithmic Aspects in Information and Management, AAIM 2008, Shanghai, China, pp. 337–348 (2008). doi:10.1007/978-3-540-68880-8_32


Author information

Corresponding author

Correspondence to Shlomo Berkovsky.

Appendix: Parameterisation

Here we briefly describe the parameterization of the differentially private algorithms. We detail the results obtained for the MovieLens-100K dataset and the bounded differential privacy case; the same methodology was also applied to the other datasets.

The goal of the parameterization was to set the most appropriate values of the MF and privacy parameters. To start with, the learning rate was set to \(\gamma =0.01\), as in other MF implementations (Koren and Bell 2015). Next, we optimized the regularization parameter \(\lambda \) and the number of latent factors d. For this, we defined a fixed test set of ratings and repeated the MF predictions for various combinations of values of \(\lambda \) and d. These combinations covered the exhaustive set of pairs within the ranges \(\lambda \in [0.01,0.15]\) and \(d \in [1,25]\). For each combination of the parameters, the RMSE of the predictions for the same test set was computed. Since a 3D plot of the RMSE is hard to read, Fig. 6a, b shows its 2D projections for fixed values of \(\lambda \) and d. The best-performing combination of \(\lambda =0.08\) and \(d=3\) was used in the unbounded experiments with the MovieLens-100K dataset.
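The exhaustive sweep over lambda and d can be sketched as below. This is a minimal illustration under assumed details, not the paper's Matlab code: the SGD trainer, function names, initialization, and the tiny example data are all illustrative; only the idea (train once per parameter pair, score each on a fixed test set, keep the pair with lowest RMSE) follows the text.

```python
import numpy as np

def train_mf(train, d, lam, gamma=0.01, iters=10, seed=0):
    """Plain SGD matrix factorization; `train` is a list of (user, item, rating)."""
    rng = np.random.default_rng(seed)
    n_users = max(u for u, _, _ in train) + 1
    n_items = max(i for _, i, _ in train) + 1
    P = rng.normal(0.0, 0.1, (n_users, d))   # user latent factors
    Q = rng.normal(0.0, 0.1, (n_items, d))   # item latent factors
    for _ in range(iters):
        for u, i, r in train:
            err = r - P[u] @ Q[i]
            P[u] += gamma * (err * Q[i] - lam * P[u])
            Q[i] += gamma * (err * P[u] - lam * Q[i])
    return P, Q

def rmse(test, P, Q):
    """Root mean squared error of the factor model on held-out ratings."""
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in test])))

def grid_search(train, test, lams, ds):
    """Exhaustive sweep; returns (best RMSE, best lambda, best d)."""
    return min((rmse(test, *train_mf(train, d, lam)), lam, d)
               for lam in lams for d in ds)

# Toy example with made-up ratings; the paper sweeps
# lambda in [0.01, 0.15] and d in [1, 25] on MovieLens-100K.
train = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0), (1, 1, 1.0), (2, 0, 3.0)]
test = [(2, 1, 2.0)]
best_rmse, best_lam, best_d = grid_search(train, test, [0.01, 0.08], [1, 2, 3])
```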

Having set the parameters \(\lambda \) and d, we turned to the number of SGD/ALS iterations, k. For this, we gradually increased the number of iterations from \(k=1\) to \(k=15\) and, for each value of k, computed the RMSE obtained for the fixed test set. The results of this experiment are shown in Fig. 6c. As expected, the RMSE values stabilise beyond a certain value of k; in this case, the RMSE is reasonably stable after \(k=7\), so we set the number of iterations to \(k=10\).
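The iteration sweep can be expressed as recording the test RMSE after every SGD epoch and picking the smallest k past which the curve stays flat. Again this is a sketch under assumptions: the trainer, the function names, the flatness tolerance, and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def rmse_curve(train, test, d=3, lam=0.08, gamma=0.01, k_max=15, seed=0):
    """Run SGD factorization for k_max epochs, recording test RMSE after each."""
    rng = np.random.default_rng(seed)
    ratings = train + test
    n_users = max(u for u, _, _ in ratings) + 1
    n_items = max(i for _, i, _ in ratings) + 1
    P = rng.normal(0.0, 0.1, (n_users, d))
    Q = rng.normal(0.0, 0.1, (n_items, d))
    curve = []
    for _ in range(k_max):
        for u, i, r in train:
            err = r - P[u] @ Q[i]
            P[u] += gamma * (err * Q[i] - lam * P[u])
            Q[i] += gamma * (err * P[u] - lam * Q[i])
        curve.append(float(np.sqrt(np.mean(
            [(r - P[u] @ Q[i]) ** 2 for u, i, r in test]))))
    return curve

def pick_k(curve, tol=0.01):
    """Smallest epoch count after which successive RMSE changes stay below tol."""
    for k in range(1, len(curve)):
        if all(abs(curve[j] - curve[j - 1]) < tol for j in range(k, len(curve))):
            return k
    return len(curve)

# Toy data; the paper uses the MovieLens-100K test split and k in [1, 15].
train = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0), (1, 1, 1.0), (2, 0, 3.0)]
test = [(2, 1, 2.0)]
curve = rmse_curve(train, test, d=2, k_max=15)
k = pick_k(curve)
```

Choosing a k slightly above the stabilisation point (10 rather than 7 in the appendix) gives some slack without wasting much computation.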

Cite this article

Friedman, A., Berkovsky, S. & Kaafar, M.A. A differential privacy framework for matrix factorization recommender systems. User Model User-Adap Inter 26, 425–458 (2016). https://doi.org/10.1007/s11257-016-9177-7
