Privacy-Preserving SVM Classification on Vertically Partitioned Data

  • Hwanjo Yu
  • Jaideep Vaidya
  • Xiaoqian Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


Classical data mining algorithms implicitly assume complete access to all data, in either centralized or federated form. However, privacy and security concerns often prevent data sharing, derailing data mining projects. Recently, there has been a growing focus on solving this problem: several algorithms have been proposed that perform distributed knowledge discovery while guaranteeing non-disclosure of the data. Classification is an important data mining problem applicable in many diverse domains. Its goal is to build a model that predicts one attribute (a binary attribute in this work) from the remaining attributes. We propose an efficient and secure privacy-preserving algorithm for support vector machine (SVM) classification over vertically partitioned data.
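
As a rough illustration of the vertically partitioned setting (a sketch under stated assumptions, not the protocol proposed in the paper): each party holds a different subset of attributes for the same records, and for a linear kernel the matrix of pairwise dot products over all attributes equals the sum of the matrices each party can compute from its own attributes. The data, the two-party split, and the helper names `local_gram` and `add_matrices` are hypothetical, and the plain addition below provides no privacy at all; a real scheme would have to combine the local matrices securely.

```python
# Illustrative only: no cryptography or privacy protection is applied here.

def local_gram(rows):
    """Matrix of dot products between all pairs of records,
    using only the attributes present in `rows`."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]

def add_matrices(m1, m2):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

# Hypothetical data: three records with four attributes in total.
full = [[1, 2, 0, 1],
        [0, 1, 3, 2],
        [2, 0, 1, 1]]

# Vertical partitioning: party A holds the first two attributes,
# party B holds the last two, for the same three records.
party_a = [row[:2] for row in full]
party_b = [row[2:] for row in full]

# Summing the local matrices reproduces the linear kernel matrix
# that would be computed from the pooled data.
global_gram = add_matrices(local_gram(party_a), local_gram(party_b))
assert global_gram == local_gram(full)
```

The resulting matrix is what a kernel-based SVM trainer needs for a linear kernel, which is why, in this sketch, no party has to see another party's raw attribute values to assemble it.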


Keywords: Support Vector Machine; Local Model; Support Vector Machine Model; Privacy Preserve; Local Matrix





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hwanjo Yu (1)
  • Jaideep Vaidya (2)
  • Xiaoqian Jiang (1)
  1. University of Iowa, Iowa City, USA
  2. Rutgers University, Newark, USA
