A Cryptographic Privacy Preserving Approach over Classification

  • G. Nageswara Rao
  • M. Sweta Harini
  • Ch. Ravi Kishore
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 249)

Abstract

We introduce a cryptography-based approach that protects data sets used by third parties to construct decision tree models with classification techniques, specifically the ID3 algorithm. The data sets need not be enlarged through perturbation, nor the samples sanitized, before they are forwarded to third parties for further processing. The proposed method does not affect the accuracy of the data mining results. Cryptographic techniques are applied after all of the data has been collected, so the data sets are encrypted before they are sent to third parties, which guards against inadvertent disclosure or theft. Because the information is in encrypted form, this prevents misuse by attackers or others who obtain the data. For classification we propose the ID3 algorithm, which is used extensively in machine learning and data mining to construct decision tree models.
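The abstract does not specify the cryptographic scheme, so the following is only a minimal sketch of why an ID3 result could survive such a transformation, not the authors' method. It assumes categorical attributes and swaps in a keyed, deterministic tokenization (HMAC-SHA256, which is keyed hashing rather than reversible encryption). The function names, the toy weather-style rows, and the key `b"demo-key"` are all hypothetical. ID3 chooses splits by information gain, which depends only on which values are equal to each other, so a one-to-one relabelling of values should leave the chosen attribute unchanged.

```python
import hashlib
import hmac
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """ID3 split criterion: entropy of the target minus the weighted
    entropy of the target within each value of `attr`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attrs, target):
    """Attribute ID3 would split on first (highest information gain)."""
    return max(attrs, key=lambda a: information_gain(rows, a, target))


def pseudonymize(rows, key):
    """ASSUMPTION, not the paper's scheme: replace every value with a keyed
    deterministic token. Attribute names are left readable in this sketch."""
    def token(name, value):
        msg = (name + "=" + value).encode("utf-8")
        return hmac.new(key, msg, hashlib.sha256).hexdigest()
    return [{name: token(name, value) for name, value in r.items()}
            for r in rows]


# Hypothetical toy data set, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

protected = pseudonymize(rows, b"demo-key")
plain_choice = best_attribute(rows, ["outlook", "windy"], "play")
protected_choice = best_attribute(protected, ["outlook", "windy"], "play")
```

Under these assumptions, a third party working only on `protected` would pick the same split attribute as the data owner working on `rows`. Deterministic tokens still reveal how often each value occurs, so this sketch illustrates the accuracy claim only and says nothing about the strength of the privacy guarantee.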

Keywords

Cryptography; Datasets; ID3 Algorithm; Decision tree models; Perturbation; Data mining

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • G. Nageswara Rao (1)
  • M. Sweta Harini (1)
  • Ch. Ravi Kishore (1)
  1. Department of Information Technology, Aditya Institute of Technology & Management, Tekkali, India
