ONN the Use of Neural Networks for Data Privacy

  • Jordi Pont-Tuset
  • Pau Medrano-Gracia
  • Jordi Nin
  • Josep-L. Larriba-Pey
  • Victor Muntés-Mulero
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4910)

Abstract

The need for data privacy motivates the development of new methods that protect data by minimizing disclosure risk without losing valuable statistical information. In this paper, we propose a new protection method for numerical data called Ordered Neural Networks (ONN). ONN protects data by means of Artificial Neural Networks (ANNs). Its main contribution is a new strategy for preprocessing the data so that the ANNs cannot learn the original data set accurately. Using the results obtained by the ANNs, ONN generates a new data set similar to the original one without disclosing the real sensitive values.

We compare our method to the best methods presented in the literature, using data provided by the US Census Bureau. Our experiments show that ONN outperforms these methods, indicating that ANNs are well suited to protecting data efficiently without losing the statistical properties of the data set.
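The general idea of releasing a network's imperfect approximation in place of the original values can be illustrated with a toy sketch. The abstract does not specify the preprocessing or the network architecture, so everything below is an assumption made for illustration and not the authors' algorithm: the hypothetical function `protect_column`, the choice to sort the values and feed the normalized rank as input, the three-unit hidden layer, and all training parameters.

```python
import numpy as np


def protect_column(values, hidden=3, epochs=3000, lr=0.05, seed=0):
    """Toy masking of one numerical attribute (illustrative assumption only).

    A deliberately small network is fitted to the sorted values, and its
    outputs are released instead of the originals.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)

    # Assumed preprocessing: sort the values and rescale them to [0, 1].
    order = np.argsort(values)
    y = values[order]
    lo, hi = y[0], y[-1]
    span = hi - lo if hi > lo else 1.0
    t = ((y - lo) / span).reshape(-1, 1)
    x = np.linspace(0.0, 1.0, n).reshape(-1, 1)  # network input: normalized rank

    # One tanh hidden layer with a linear output, trained by batch gradient descent.
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (1, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)

    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        err = h @ w2 + b2 - t                 # prediction error per record
        grad_w2 = h.T @ err / n
        grad_b2 = err.mean(axis=0)
        dh = (err @ w2.T) * (1.0 - h ** 2)    # backpropagate through tanh
        grad_w1 = x.T @ dh / n
        grad_b1 = dh.mean(axis=0)
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2

    # The network's approximation, mapped back to the original scale and
    # record order, is what would be released.
    pred = (np.tanh(x @ w1 + b1) @ w2 + b2).ravel()
    protected = np.empty(n)
    protected[order] = pred * span + lo
    return protected
```

Calling `protect_column(column)` on a one-dimensional array returns an array of the same length whose entries follow the overall shape of the original distribution without reproducing the individual values. The small hidden layer is what keeps the fit inexact in this sketch; the paper's actual preprocessing strategy may work quite differently.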

Keywords

Perturbative protection methods; Data preprocessing; Artificial Neural Networks; Privacy in statistical databases

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jordi Pont-Tuset (1)
  • Pau Medrano-Gracia (1)
  • Jordi Nin (2)
  • Josep-L. Larriba-Pey (1)
  • Victor Muntés-Mulero (1)

  1. DAMA-UPC, Computer Architecture Dept., Universitat Politècnica de Catalunya, Barcelona, Spain
  2. IIIA, Artificial Intelligence Research Institute, CSIC, Spanish National Research Council, Bellaterra, Spain
