Privacy-Preserving Ridge Regression with only Linearly-Homomorphic Encryption

  • Irene Giacomelli
  • Somesh Jha
  • Marc Joye
  • C. David Page
  • Kyonghwan Yoon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10892)


Linear regression with 2-norm regularization (i.e., ridge regression) is an important statistical technique that models the relationship between some explanatory values and an outcome value using a linear function. In many applications (e.g., predictive modeling in personalized healthcare), these values represent sensitive data owned by several different parties who are unwilling to share them. In this setting, training a linear regression model becomes challenging and requires specific cryptographic solutions. This problem was elegantly addressed by Nikolaenko et al. at S&P (Oakland) 2013, who proposed a two-server system that uses linearly-homomorphic encryption (LHE) and Yao’s two-party protocol (garbled circuits). In this work, we propose a novel system that can train a ridge regression model using only LHE (i.e., without using Yao’s protocol). This greatly improves the overall performance (both in computation and communication), as Yao’s protocol was the main bottleneck in the previous solution. The efficiency of the proposed system is validated on both synthetically generated and real-world datasets.
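
As a rough illustration of what "linearly-homomorphic" means in practice, the following sketch implements a toy version of Paillier's cryptosystem (Paillier, EUROCRYPT 1999) in Python. It is not the system proposed in the paper. The tiny primes, the function names (`keygen`, `encrypt`, `decrypt`, `add`, `scalar_mul`, `encrypted_dot`) and the example values are assumptions chosen only for illustration. The sketch shows that a party holding only ciphertexts can compute an encryption of a linear combination of the hidden values, which is the only kind of operation an LHE scheme supports.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key pair. The tiny default primes are illustrative only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)   # public key n, secret key (lam, mu)


def encrypt(n, m):
    """Encrypt m in [0, n) as (1 + n)^m * r^n mod n^2 with fresh random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, sk, c):
    """Recover m from c using L(u) = (u - 1) / n applied to c^lam mod n^2."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: Enc(a) * Enc(b) decrypts to a + b mod n."""
    return (c1 * c2) % (n * n)


def scalar_mul(n, c, k):
    """Multiplication by a public constant: Enc(a)^k decrypts to k * a mod n."""
    return pow(c, k, n * n)


def encrypted_dot(n, enc_x, w):
    """Encrypted inner product of hidden values enc_x with public weights w."""
    acc = encrypt(n, 0)
    for c, k in zip(enc_x, w):
        acc = add(n, acc, scalar_mul(n, c, k))
    return acc


n, sk = keygen()
enc_x = [encrypt(n, v) for v in [3, 5, 7]]   # values hidden from the evaluator
c = encrypted_dot(n, enc_x, [2, 4, 6])       # computed on ciphertexts only
print(decrypt(n, sk, c))                     # 3*2 + 5*4 + 7*6 = 68
```

Multiplying two encrypted values together is not possible with these operations alone, which is why protocols built on LHE need additional techniques for the non-linear steps of training.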


Keywords: Ridge regression, Linear regression, Privacy, Homomorphic encryption



This work was partially supported by the Clinical and Translational Science Award (CTSA) program, through the NIH National Center for Advancing Translational Sciences (NCATS) grant UL1TR002373, and by the NIH BD2K Initiative grant U54 AI117924.


References

  1. Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: 2000 ACM SIGMOD International Conference on Management of Data, pp. 439–450. ACM Press (2000)
  2. Aono, Y., Hayashi, T., Phong, L.T., Wang, L.: Fast and secure linear regression and biometric authentication with security update. Cryptology ePrint Archive, Report 2015/692 (2015)
  3. Bar-Ilan, J., Beaver, D.: Non-cryptographic fault-tolerant computing in constant number of rounds of interaction. In: Eighth Annual ACM Symposium on Principles of Distributed Computing, pp. 201–209. ACM Press (1989)
  4. Barbosa, M., Catalano, D., Fiore, D.: Labeled homomorphic encryption: scalable and privacy-preserving processing of outsourced data. In: Foley, S.N., Gollmann, D., Snekkenes, E. (eds.) ESORICS 2017. LNCS, vol. 10492, pp. 146–166. Springer, Cham (2017)
  5. Beaver, D.: Efficient multiparty protocols using circuit randomization. In: Feigenbaum, J. (ed.) CRYPTO 1991. LNCS, vol. 576, pp. 420–432. Springer, Heidelberg (1992)
  6. Ben-Or, M., Goldwasser, S., Wigderson, A.: Completeness theorems for non-cryptographic fault-tolerant distributed computation. In: 20th Annual ACM Symposium on Theory of Computing, STOC, pp. 1–10. ACM Press (1988)
  7. Cao, Z., Liu, L.: Comment on “harnessing the cloud for securely outsourcing large-scale systems of linear equations”. IEEE Trans. Parallel Distrib. Syst. 27(5), 1551–1552 (2016)
  8. Cock, M.D., Dowsley, R., Nascimento, A.C.A., Newman, S.C.: Fast, privacy preserving linear regression over distributed datasets based on pre-distributed data. In: 8th ACM Workshop on Artificial Intelligence and Security, pp. 3–14. ACM Press (2015)
  9. Damgård, I., Jurik, M.: A generalisation, a simplification and some applications of Paillier’s probabilistic public-key system. In: Kim, K. (ed.) PKC 2001. LNCS, vol. 1992, pp. 119–136. Springer, Heidelberg (2001)
  10. Du, W., Han, Y.S., Chen, S.: Privacy-preserving multivariate statistical analysis: linear regression and classification. In: Fourth SIAM International Conference on Data Mining, pp. 222–233. SIAM (2004)
  11. Fouque, P.-A., Stern, J., Wackers, G.-J.: CryptoComputing with rationals. In: Blaze, M. (ed.) FC 2002. LNCS, vol. 2357, pp. 136–146. Springer, Heidelberg (2003)
  12. Gascón, A., Schoppmann, P., Balle, B., Raykova, M., Doerner, J., Zahur, S., Evans, D.: Privacy-preserving distributed linear regression on high-dimensional data. PoPETS 2017(4), 248–267 (2017)
  13. Gentry, C.: Fully homomorphic encryption using ideal lattices. In: 41st Annual ACM Symposium on Theory of Computing, STOC, pp. 169–178. ACM Press (2009)
  14. Giacomelli, I., Jha, S., Joye, M., Page, C.D., Yoon, K.: Privacy-preserving ridge regression with only linearly-homomorphic encryption. Cryptology ePrint Archive, Report 2017/979 (2017)
  15. Hall, R., Fienberg, S.E., Nardi, Y.: Secure multiple linear regression based on homomorphic encryption. J. Off. Stat. 27(4), 669–691 (2011)
  16. Kamara, S., Mohassel, P., Raykova, M.: Outsourcing multi-party computation. Cryptology ePrint Archive, Report 2011/272 (2011)
  17. Karr, A.F., Lin, X., Sanil, A.P., Reiter, J.P.: Regression on distributed databases via secure multi-party computation. In: 2004 Annual National Conference on Digital Government Research, pp. 108:1–108:2 (2004)
  18. Karr, A.F., Lin, X., Sanil, A.P., Reiter, J.P.: Secure regression on distributed databases. J. Comput. Graph. Stat. 14(2), 263–279 (2005)
  19. Karr, A.F., Lin, X., Sanil, A.P., Reiter, J.P.: Privacy-preserving analysis of vertically partitioned data using secure matrix products. J. Off. Stat. 25(1), 125–138 (2009)
  20. Lindell, Y., Pinkas, B.: Privacy preserving data mining. In: Bellare, M. (ed.) CRYPTO 2000. LNCS, vol. 1880, pp. 36–54. Springer, Heidelberg (2000)
  21. McDonald, G.C.: Ridge regression. Wiley Interdiscip. Rev.: Comput. Stat. 1(1), 93–100 (2009)
  22. Mohassel, P., Zhang, Y.: SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy, pp. 19–38. IEEE Computer Society (2017)
  23. Nikolaenko, V., Weinsberg, U., Ioannidis, S., Joye, M., Boneh, D., Taft, N.: Privacy-preserving ridge regression on hundreds of millions of records. In: 2013 IEEE Symposium on Security and Privacy, pp. 334–348. IEEE Computer Society (2013)
  24. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 223–238. Springer, Heidelberg (1999)
  25. Sanil, A.P., Karr, A.F., Lin, X., Reiter, J.P.: Privacy preserving regression modelling via distributed computation. In: Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 677–682. ACM Press (2004)
  26. Wang, C., Ren, K., Wang, J., Wang, Q.: Harnessing the cloud for securely outsourcing large-scale systems of linear equations. IEEE Trans. Parallel Distrib. Syst. 24(6), 1172–1181 (2013)
  27. Wang, P.S., Guy, M.J.T., Davenport, J.H.: \(P\)-adic reconstruction of rational numbers. ACM SIGSAM Bull. 16(2), 2–3 (1982)
  28. The International Warfarin Pharmacogenetics Consortium: Estimation of the Warfarin dose with clinical and pharmacogenetic data. N. Engl. J. Med. 360(8), 753–764 (2009)
  29. Yao, A.C.C.: How to generate and exchange secrets. In: 27th Annual Symposium on Foundations of Computer Science, FOCS, pp. 162–167. IEEE Computer Society (1986)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Irene Giacomelli (1)
  • Somesh Jha (1)
  • Marc Joye (2)
  • C. David Page (1)
  • Kyonghwan Yoon (1)

  1. University of Wisconsin-Madison, Madison, USA
  2. NXP Semiconductors, San Jose, USA
