Statistical Disclosure Limitation for Health Data: A Statistical Agency Perspective

  • Chapter
Medical Data Privacy Handbook

Abstract

Statistical agencies release health data collected in surveys, censuses and registers. This chapter presents statistical disclosure limitation (SDL) from the perspective of statistical agencies. Traditional outputs, in the form of survey microdata and tabular data, are presented first with respect to quantifying disclosure risk, common SDL techniques for protecting the data, and measuring information loss. In recent years, however, demand for data has grown, including through government ‘open data’ initiatives, which has led statistical agencies to examine additional forms of disclosure risk related to the concept of differential privacy in the computer science literature. The chapter discusses whether SDL practices carried out at statistical agencies for traditional outputs are differentially private. It concludes by presenting some innovative data dissemination strategies that statistical agencies are currently assessing, where stricter privacy guarantees are necessary.



Author information

Corresponding author

Correspondence to Natalie Shlomo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Shlomo, N. (2015). Statistical Disclosure Limitation for Health Data: A Statistical Agency Perspective. In: Gkoulalas-Divanis, A., Loukides, G. (eds) Medical Data Privacy Handbook. Springer, Cham. https://doi.org/10.1007/978-3-319-23633-9_9

  • DOI: https://doi.org/10.1007/978-3-319-23633-9_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23632-2

  • Online ISBN: 978-3-319-23633-9

  • eBook Packages: Computer Science, Computer Science (R0)
