Privacy-Preserving Decision Trees Evaluation via Linear Functions

  • Raymond K. H. Tai
  • Jack P. K. Ma
  • Yongjun Zhao
  • Sherman S. M. Chow
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10493)

Abstract

The combination of the cloud-based computing paradigm and machine learning algorithms has enabled many complex analytic services, such as face recognition in a crowd or valuation of immovable properties. Companies can charge clients who lack the expertise or resources to build such complex models for a prediction or classification service. In this work, we focus on machine learning classification with decision trees (or random forests) as the analytic model, which are popular for their effectiveness and simplicity. We propose privacy-preserving decision tree evaluation protocols that hide the sensitive inputs (the model and the query) from the counterparty. Compared with the state of the art, we significantly improve efficiency by exploiting the structure of decision trees, which avoids a number of encryptions that is exponential in the depth of the decision tree. Our experimental results show that our protocols are especially efficient for deep but sparse decision trees, which are typical of classification models trained on real datasets, with applications ranging from cancer diagnosis to spam classification.
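The abstract does not spell out the construction, so the following is only a plaintext sketch of one way to read "evaluation via linear functions". It assumes each decision node yields a comparison bit b, the left edge costs b, and the right edge costs 1 - b. The cost of a root-to-leaf path is then a linear function of the bits, and exactly one leaf has cost zero. No encryption is performed here; the names `Leaf`, `Node`, `path_costs`, `classify`, `chain`, and `count_decision_nodes`, as well as the "go right when the feature exceeds the threshold" convention, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: int       # index into the query vector
    threshold: float   # assumed convention: go right when query[feature] > threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def path_costs(tree: Tree, query: Sequence[float], cost: int = 0) -> List[Tuple[str, int]]:
    """Return (label, path cost) for every leaf, in left-to-right order."""
    if isinstance(tree, Leaf):
        return [(tree.label, cost)]
    b = 1 if query[tree.feature] > tree.threshold else 0
    # Left edge costs b, right edge costs 1 - b: the edge actually taken costs 0.
    return (path_costs(tree.left, query, cost + b)
            + path_costs(tree.right, query, cost + (1 - b)))


def classify(tree: Tree, query: Sequence[float]) -> str:
    """The classification result is the label of the unique zero-cost leaf."""
    zero = [label for label, cost in path_costs(tree, query) if cost == 0]
    assert len(zero) == 1
    return zero[0]


def chain(depth: int) -> Tree:
    """Build a deep but sparse tree: one decision node per level."""
    tree: Tree = Leaf(f"class_{depth}")
    for level in reversed(range(depth)):
        tree = Node(feature=level, threshold=0.5,
                    left=Leaf(f"class_{level}"), right=tree)
    return tree


def count_decision_nodes(tree: Tree) -> int:
    if isinstance(tree, Leaf):
        return 0
    return 1 + count_decision_nodes(tree.left) + count_decision_nodes(tree.right)


if __name__ == "__main__":
    tree = chain(3)
    print(path_costs(tree, [1, 1, 0]))
    print(classify(tree, [1, 1, 0]))
    # A depth-10 chain has 10 decision nodes; a complete tree of the
    # same depth would have 2**10 - 1 of them.
    print(count_decision_nodes(chain(10)), 2 ** 10 - 1)
```

Under these assumptions the work grows with the number of nodes actually present in the tree rather than with the size of a complete tree of the same depth, which is consistent with the efficiency claim for deep but sparse trees. A real protocol would compute the comparison bits and path costs over ciphertexts and randomize the non-zero costs before revealing anything.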

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Raymond K. H. Tai (1)
  • Jack P. K. Ma (1)
  • Yongjun Zhao (1)
  • Sherman S. M. Chow (1)
  1. Information Engineering Department, Chinese University of Hong Kong, Shatin, Hong Kong
