Abstract
Statistical agencies release health data collected in surveys, censuses and registers. This chapter presents statistical disclosure limitation (SDL) from the perspective of statistical agencies. Traditional outputs, in the form of survey microdata and tabular data, are presented first, with respect to quantifying disclosure risk, common SDL techniques for protecting the data, and measuring information loss. In recent years, however, demand for data has grown, including through government ‘open data’ initiatives, which has led statistical agencies to examine additional forms of disclosure risk related to the concept of differential privacy in the computer science literature. The chapter discusses whether SDL practices carried out at statistical agencies for traditional outputs are differentially private. It concludes by presenting some innovative data dissemination strategies that statistical agencies are currently assessing for cases where stricter privacy guarantees are necessary.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Shlomo, N. (2015). Statistical Disclosure Limitation for Health Data: A Statistical Agency Perspective. In: Gkoulalas-Divanis, A., Loukides, G. (eds) Medical Data Privacy Handbook. Springer, Cham. https://doi.org/10.1007/978-3-319-23633-9_9
DOI: https://doi.org/10.1007/978-3-319-23633-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23632-2
Online ISBN: 978-3-319-23633-9
eBook Packages: Computer Science, Computer Science (R0)