A new approach for disclosure control in the IAB establishment panel—multiple imputation for a better data access

  • Original Paper
  • AStA Advances in Statistical Analysis

Abstract

Statistical agencies that release micro-datasets as scientific or public use files face a dilemma: they must guarantee the confidentiality of survey respondents while still offering sufficiently detailed data. The literature therefore discusses a variety of methods for disclosure control. In this paper, we present an application of Rubin’s (J. Off. Stat. 9, 462–468, 1993) idea to generate synthetic datasets from existing confidential survey data for public release.

We use a set of variables from the 1997 wave of the German IAB Establishment Panel and evaluate the quality of the approach by comparing the results of an analysis by Zwick (Ger. Econ. Rev. 6(2), 155–184, 2005) based on the original data with the results of the same analysis run on the synthetic datasets produced by the imputation procedure. The comparison shows that valid inferences can be obtained using the synthetic datasets in this context, while confidentiality is guaranteed for the survey participants.
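
As a rough illustration of how an analyst might draw a single inference from several synthetic datasets, the sketch below combines per-dataset point estimates and variance estimates. It assumes fully synthetic data and the combining rules of Raghunathan, Reiter and Rubin (2003), listed in the references; the abstract does not state which rules the paper applies, so this is an assumption, not a description of the authors' procedure. The function name and the input numbers are hypothetical.

```python
from statistics import mean, variance


def combine_fully_synthetic(estimates, variances):
    """Combine results from m synthetic datasets (hypothetical helper).

    Assumed rules (Raghunathan, Reiter and Rubin 2003, fully synthetic data):
      q_bar = mean of the m point estimates
      b     = sample variance of the m point estimates (between-dataset)
      u_bar = mean of the m variance estimates (within-dataset)
      T_f   = (1 + 1/m) * b - u_bar
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two datasets and one variance per estimate")

    q_bar = mean(estimates)      # combined point estimate
    b = variance(estimates)      # between-dataset variance (divides by m - 1)
    u_bar = mean(variances)      # average within-dataset variance
    t_f = (1 + 1 / m) * b - u_bar

    # T_f can turn out negative for small m; this sketch returns it unchanged
    # and leaves the choice of an adjusted estimator to the caller.
    return q_bar, t_f


# Made-up numbers for illustration only, not results from the paper.
q_bar, t_f = combine_fully_synthetic([1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

The subtraction of the within-dataset term is what distinguishes these rules from the usual multiple-imputation formula for missing data, where the within-dataset variance is added instead.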

References

  • Abowd, J.M., Lane, J.: New approaches to confidentiality protection: synthetic data, remote access and research data centers. In: Privacy in Statistical Databases, pp. 282–289. Springer, New York (2004)

  • Abowd, J.M., Woodcock, S.D.: Disclosure limitation in longitudinal linked data. In: Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 215–277. North-Holland, Amsterdam (2001)

  • Abowd, J.M., Woodcock, S.D.: Multiply-imputing confidential characteristics and file links in longitudinal linked data. In: Privacy in Statistical Databases, pp. 290–297. Springer, New York (2004)

  • Barnard, J., Rubin, D.B.: Small-sample degrees of freedom with multiple imputation. Biometrika 86, 948–955 (1999)

  • Brand, R.: Anonymität von Betriebsdaten—Verfahren zur Erfassung und Maßnahmen zur Verringerung des Reidentifikationsrisikos. Beiträge zur Arbeitsmarkt- und Berufsforschung, Bd. 237 (2000)

  • Brand, R.: Masking through noise addition. In: Inference Control in Statistical Databases, pp. 97–116. Springer, Berlin (2002)

  • Brand, R., Bender, S., Kohaut, S.: Possibilities for the creation of a scientific-use file for the IAB-establishment-panel. In: Statistical Data Confidentiality: Proceedings of the Joint Eurostat/UN-ECE Work Session on Statistical Data Confidentiality Held in Thessaloniki in March 1999, pp. 57–74. Eurostat, Brussels (1999)

  • Fischer, G., Janik, F., Müller, D., Schmucker, A.: The IAB establishment panel—from sample to survey to projection. FDZ-Methodenreport, No. 1 (2008)

  • Gottschalk, S.: Unternehmensdaten zwischen Datenschutz und Analysepotenzial. ZEW Wirtschaftsanalysen, Bd. 76. Nomos Verlag, Baden-Baden (2005)

  • Karr, A.F., Kohnen, C.N., Oganian, A., Reiter, J.P., Sanil, A.P.: A framework for evaluating the utility of data altered to protect confidentiality. Am. Stat. 60, 224–232 (2006)

  • Kennickell, A.B.: Multiple imputation and disclosure protection: the case of the 1995 survey of consumer finances. In: Record Linkage Techniques, pp. 248–267. National Academy Press, Washington (1997)

  • Kölling, A.: The IAB-establishment panel. J. Appl. Soc. Sci. Stud. 120, 291–300 (2000)

  • Lane, J.: Optimizing the use of micro-data: an overview of the issues. Paper presented at the Joint Statistical Meetings. http://client.norc.org/jole/SOLEweb/Accesstomicrodata%5B1%5D.pdf. (2005)

  • Little, R.J.A.: Statistical analysis of masked data. J. Off. Stat. 9, 407–426 (1993)

  • Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, Hoboken (2002)

  • Meng, X.-L.: Multiple-imputation inferences with uncongenial sources of input. Stat. Sci. 9, 538–558 (1994)

  • Raghunathan, T.E., Lepkowski, J.M., van Hoewyk, J., Solenberger, P.: A multivariate technique for multiply imputing missing values using a series of regression models. Surv. Methodol. 27, 85–96 (2001)

  • Raghunathan, T.E., Reiter, J.P., Rubin, D.B.: Multiple imputation for statistical disclosure limitation. J. Off. Stat. 19, 1–16 (2003)

  • Reiter, J.P.: Satisfying disclosure restrictions with synthetic data sets. J. Off. Stat. 18, 531–544 (2002)

  • Reiter, J.P.: Inference for partially synthetic, public use microdata sets. Surv. Methodol. 29, 181–188 (2003)

  • Reiter, J.P.: Simultaneous use of multiple imputation for missing data and disclosure limitation. Surv. Methodol. 30, 235–242 (2004)

  • Reiter, J.P.: Releasing multiply-imputed, synthetic public use microdata: an illustration and empirical study. J. R. Stat. Soc. Ser. A 168, 185–205 (2005)

  • Reiter, J.P., Drechsler, J.: Releasing multiply-imputed, synthetic data generated in two stages to protect confidentiality. Tech. rep., IAB Discussion Paper, No. 20 (2007)

  • Ronning, G., Rosemann, M.: Estimation of the probit model from anonymized micro data. In: Work Session on Statistical Data Confidentiality, Geneva, 9–11 November 2005. Monograph of Official Statistics, pp. 207–216. Eurostat, Luxembourg (2006)

  • Ronning, G., Rosemann, M., Strotmann, H.: Post-randomization under test: estimation of a probit model. J. Econ. Stat. 225, 544–566 (2005)

  • Rosemann, M.: Auswirkungen datenverändernder Anonymisierungsverfahren auf die Analyse von Mikrodaten. IAW (2006)

  • Rubin, D.B.: Multiple imputation in sample surveys—a phenomenological Bayesian approach to nonresponse. In: American Statistical Association Proceedings of the Section on Survey Research Methods, pp. 20–40 (1978)

  • Rubin, D.B.: Multiple Imputation for Nonresponse in Surveys. Wiley, New York (1987)

  • Rubin, D.B.: Discussion: statistical disclosure limitation. J. Off. Stat. 9, 462–468 (1993)

  • Rubin, D.B.: The design of a general and flexible system for handling nonresponse in sample surveys. Am. Stat. 58, 298–302 (2004)

  • Rubin, D.B., Schenker, N.: Multiple imputation for interval estimation from simple random samples with ignorable nonresponse. J. Am. Stat. Assoc. 81, 366–374 (1986)

  • Willenborg, L., de Waal, T.: Elements of Statistical Disclosure Control. Springer, New York (2001)

  • Zwick, T.: Continuing vocational training forms and establishment productivity in Germany. Ger. Econ. Rev. 6(2), 155–184 (2005)

Author information

Corresponding author

Correspondence to Jörg Drechsler.

Additional information

The research presented in this paper is part of the project “Wirtschaftsstatistische Paneldaten und faktische Anonymisierung” financed by the Federal Ministry for Education and Research (BMBF) and conducted by the following institutes: Federal Statistical Office Germany, Statistical Offices of the Länder, Institute for Applied Economic Research (IAW), Centre for European Economic Research (ZEW), and Institute for Employment Research (IAB). For more information about this project, see for instance Ronning and Rosemann (2006) or Ronning et al. (2005). We thank our project partners and the participants of the “UNECE Conference on Data Editing and Imputation,” 25.09.2006–27.09.2006 in Bonn and “The Conference on Privacy in Statistical Databases ’06,” 13.12.2006–15.12.2006 in Rome, and especially J.M. Abowd, T.E. Raghunathan, D.B. Rubin, J.P. Reiter, and two anonymous referees for their helpful comments on the paper.

About this article

Cite this article

Drechsler, J., Dundler, A., Bender, S. et al. A new approach for disclosure control in the IAB establishment panel—multiple imputation for a better data access. AStA Adv Stat Anal 92, 439–458 (2008). https://doi.org/10.1007/s10182-008-0090-1

  • DOI: https://doi.org/10.1007/s10182-008-0090-1
