Towards a More Realistic Disclosure Risk Assessment

  • Jordi Nin
  • Javier Herranz
  • Vicenç Torra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5262)

Abstract

The score was introduced in 2001 to compare different perturbative methods for statistical database protection. It measures the trade-off between utility (information loss) and privacy (disclosure risk of the released data). Since its introduction, the score has been widely accepted and used in the statistical database community. In particular, some methods are sometimes preferred to others depending on the results obtained with the original computation of the score.

In this paper we argue that some aspects of the original score computation, especially those related to disclosure risk, should be revisited. Informally, the reason is that they do not consider the best possible situation for the intruder, and so they do not measure the real level of privacy. We add some experimental results which support our claims. More importantly, we propose some modifications which can, and should, lead in the future to a fairer, more realistic and more useful computation of the score.
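
As a rough illustration of the kind of score the abstract refers to, the sketch below combines an information loss value with a disclosure risk value, where the risk is estimated by a simple distance-based record linkage. It is not the authors' computation. The equal weighting, the use of Euclidean distance, the 0 to 100 scale, and the function names `linkage_risk` and `score` are all assumptions made for this example.

```python
import math

def linkage_risk(original, masked):
    """Percentage of masked records whose nearest original record is their source.

    Assumption: masked[i] was produced from original[i], and the intruder
    links each masked record to the closest original record (Euclidean distance).
    """
    hits = 0
    for i, m in enumerate(masked):
        # Index of the original record closest to this masked record.
        nearest = min(range(len(original)), key=lambda j: math.dist(m, original[j]))
        if nearest == i:
            hits += 1
    return 100.0 * hits / len(masked)

def score(information_loss, disclosure_risk, weight=0.5):
    """Weighted average of information loss and disclosure risk (lower is better).

    Assumption: both inputs are on a 0-100 scale and weighted equally by default.
    """
    return weight * information_loss + (1.0 - weight) * disclosure_risk

# Toy data, invented for illustration only.
original = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
masked = [(1.0, 1.0), (9.0, 9.0), (0.0, 1.0)]
risk = linkage_risk(original, masked)
combined = score(30.0, risk)
```

In this sketch the intruder uses a single linkage strategy. The abstract's argument is that a risk figure computed this way may understate what a better-informed intruder could achieve, so a lower value here should not be read as a guarantee of privacy.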

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jordi Nin (1)
  • Javier Herranz (1)
  • Vicenç Torra (1)
  1. IIIA, Artificial Intelligence Research Institute, CSIC, Spanish National Research Council, Bellaterra, Spain
