Statistics and Computing, Volume 13, Issue 4, pp 329–335

A theoretical basis for perturbation methods

  • Krishnamurty Muralidhar
  • Rathindra Sarathy

Abstract

In this paper we discuss a new theoretical basis for perturbation methods. In developing this basis, we define ideal measures of data utility and disclosure risk. Maximum data utility is achieved when the statistical characteristics of the perturbed data are the same as those of the original data. Disclosure risk is minimized if providing users with microdata access does not give them any additional information. We show that when the perturbed values of the confidential variables are generated as independent realizations from the distribution of the confidential variables conditioned on the non-confidential variables, they satisfy both the data utility and the disclosure risk requirements. We also discuss the relationship between this theoretical basis and some commonly used methods for generating perturbed values of confidential numerical variables.

Keywords: confidentiality, data masking, disclosure risk, perturbation, statistical disclosure limitation
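
The generation step described in the abstract (draw each perturbed value independently from the distribution of the confidential variables conditioned on the non-confidential variables) can be illustrated with a minimal sketch. The sketch assumes that the confidential variables X and the non-confidential variables S are jointly multivariate normal; that assumption is made here only so the conditional distribution has a closed form, and the abstract itself does not restrict the distribution. The function name `perturb_conditional_normal` and the simulated data are hypothetical, and the moments are estimated from the sample rather than known.

```python
import numpy as np


def perturb_conditional_normal(X, S, rng):
    """Draw perturbed values Y ~ f(X | S), ASSUMING (X, S) is jointly normal.

    X: (n, p) confidential variables; S: (n, q) non-confidential variables.
    Hypothetical helper for illustration; moments are estimated from the data.
    """
    n, p = X.shape
    mu_x = X.mean(axis=0)
    mu_s = S.mean(axis=0)

    # Joint sample covariance, partitioned into XX, XS and SS blocks.
    cov = np.cov(np.hstack([X, S]), rowvar=False)
    cov_xx = cov[:p, :p]
    cov_xs = cov[:p, p:]
    cov_ss = cov[p:, p:]

    # B = cov_xs @ inv(cov_ss), computed with a linear solve.
    B = np.linalg.solve(cov_ss, cov_xs.T).T

    # Conditional mean for each record and the shared conditional covariance.
    cond_mean = mu_x + (S - mu_s) @ B.T
    cond_cov = cov_xx - B @ cov_xs.T

    # Independent draws: given S, the result does not depend on the
    # individual original values in X (only on the estimated moments).
    noise = rng.multivariate_normal(np.zeros(p), cond_cov, size=n)
    return cond_mean + noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    # Simulated data: two non-confidential variables, one confidential one.
    S = rng.normal(size=(n, 2))
    X = S @ np.array([[0.8], [-0.5]]) + rng.normal(scale=0.6, size=(n, 1))

    Y = perturb_conditional_normal(X, S, rng)

    # Compare the correlation structure of the original and perturbed data.
    print(np.corrcoef(np.hstack([X, S]), rowvar=False).round(2))
    print(np.corrcoef(np.hstack([Y, S]), rowvar=False).round(2))
```

Under the normality assumption, the perturbed values reproduce the means and covariances of the original data up to sampling error. Because each draw is conditionally independent of the original confidential values given S, a user who already sees S should gain little additional information about X from the perturbed values. For non-normal data the conditional distribution would need to be modelled differently, which this sketch does not attempt.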

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Krishnamurty Muralidhar, School of Management, Gatton College of Business & Economics, University of Kentucky, Lexington, USA
  • Rathindra Sarathy, Department of Management Science & Information Systems, Oklahoma State University, Stillwater, USA
