Model Based Disclosure Protection

Polettini, Silvia; Franconi, Luisa; Stander, Julian

doi:10.1007/3-540-47804-3_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2316)

Abstract

We argue that any microdata protection strategy is based on a formal reference model. Depending on the extent of model specification, strategies can be classed as “parametric”, “semiparametric”, or “nonparametric”. In the first class, a parametric probability model is specified, such as a normal regression model or a multivariate distribution for simulation. Matrix masking (Cox [2]), which covers local suppression, coarsening, microaggregation (Domingo-Ferrer and Mateo-Sanz [8]), noise injection and perturbation (e.g. Kim [15]; Fuller [12]), provides examples of the second and third classes of models. Finally, a nonparametric approach can be adopted, e.g. the use of bootstrap procedures for generating synthetic microdata (e.g. Dandekar et al. [4]).
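To make the matrix masking family concrete, the following is a minimal sketch, not taken from the chapter. It assumes the usual statement of matrix masking, in which the released file is Z = AXB + C, with A transforming records, B transforming attributes and C adding a displacement. The function names, the toy data and the group size k are illustrative assumptions.

```python
import numpy as np

def matrix_mask(X, A=None, B=None, C=None):
    """Return A @ X @ B + C; omitted operands default to identity (A, B) or zero (C)."""
    n, p = X.shape
    A = np.eye(n) if A is None else A
    B = np.eye(p) if B is None else B
    C = np.zeros((A.shape[0], B.shape[1])) if C is None else C
    return A @ X @ B + C

def microaggregation_matrix(order, k):
    """Record-transforming matrix that replaces each record by its group mean.

    Groups are consecutive runs of k records in the given order; the last
    group absorbs any remainder, so every group has at least k records
    whenever there are at least k records in total.
    """
    n = len(order)
    A = np.zeros((n, n))
    starts = list(range(0, n - n % k, k)) if n >= k else [0]
    for i, s in enumerate(starts):
        e = n if i == len(starts) - 1 else s + k
        idx = order[s:e]
        A[np.ix_(idx, idx)] = 1.0 / len(idx)
    return A

# Toy data (hypothetical): 7 records, 3 continuous attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(7, 3))

# Microaggregation: sort on the first attribute, average within groups of k = 3.
A = microaggregation_matrix(np.argsort(X[:, 0]), k=3)
Z_micro = matrix_mask(X, A=A)

# Noise injection: A and B are identities, C is random noise.
C = rng.normal(scale=0.1, size=X.shape)
Z_noise = matrix_mask(X, C=C)

# Attribute suppression: B keeps only the first and third attributes.
B = np.eye(3)[:, [0, 2]]
Z_drop = matrix_mask(X, B=B)
```

In this sketch the microaggregation matrix averages within groups, so column means of the masked file equal those of the original, while noise injection and attribute suppression reuse the same function with different operands.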

In this paper we discuss the application of a regression-based imputation procedure for business microdata to the Italian sample from the Community Innovation Survey. A set of regressions (Franconi and Stander [11]) is used to generate flexible perturbation, in which the protection varies according to the identifiability of the enterprise; a spatial aggregation strategy, based on principal components analysis, is also proposed. The inferential usefulness of the released data and the protection achieved by the strategy are evaluated.
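The idea of regression-based perturbation whose strength depends on identifiability can be illustrated with a small sketch. This is not the procedure of Franconi and Stander [11], whose details are not given in the abstract; it is a simplified stand-in under stated assumptions: a single ordinary least squares regression, released values equal to fitted values plus a shrunken residual, and leverage used as a hypothetical proxy for how identifiable a unit is. All names and the simulated data are illustrative.

```python
import numpy as np

def regression_perturb(X, y, weight):
    """Release fitted values plus a shrunken residual.

    weight[i] lies in [0, 1]: 1 releases the fitted value only (strongest
    perturbation), 0 releases the original value unchanged.
    Returns the released values and the OLS coefficients.
    """
    X1 = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # ordinary least squares
    fitted = X1 @ beta
    return fitted + (1.0 - weight) * (y - fitted), beta

def leverage_weights(X):
    """Hypothetical identifiability proxy: leverages rescaled to [0, 1]."""
    X1 = np.column_stack([np.ones(X.shape[0]), X])
    Q, _ = np.linalg.qr(X1)                         # reduced QR factorisation
    h = np.sum(Q**2, axis=1)                        # diagonal of the hat matrix
    span = h.max() - h.min()
    return (h - h.min()) / span if span > 0 else np.zeros_like(h)

# Simulated data (hypothetical): 50 enterprises, 2 covariates, 1 response.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=50)

w = leverage_weights(X)              # unusual units receive more perturbation
y_released, beta = regression_perturb(X, y, w)

# With a constant weight the released data reproduce the OLS coefficients.
y_half, _ = regression_perturb(X, y, np.full(50, 0.5))
_, beta_half = regression_perturb(X, y_half, np.zeros(50))
```

With a constant weight the residuals are only rescaled, so refitting the same regression on the released values returns the same coefficients; with unit-specific weights this holds only approximately, which is the kind of trade-off between usefulness and protection that the chapter evaluates.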

References

1. Brand, R.: Microdata protection through noise addition. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 97–116.
2. Cox, L.H.: Matrix masking methods for disclosure limitation in microdata. Surv. Method. 20 (1994) 165–169.
3. Cox, L.H.: Towards a Bayesian Perspective on Statistical Disclosure Limitation. Paper presented at ISBA 2000, The Sixth World Meeting of the International Society for Bayesian Analysis (2000).
4. Dandekar, R., Cohen, M., Kirkendall, N.: Applicability of Latin Hypercube Sampling to create multi variate synthetic micro data. In: ETK-NTTS 2001 Pre-proceedings of the Conference. European Communities Luxembourg (2001) 839–847.
5. Dandekar, R., Cohen, M., Kirkendall, N.: Sensitive micro data protection using Latin Hypercube Sampling technique. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 117–125.
6. Duncan, G.T., Mukherjee, S.: Optimal disclosure limitation strategy in statistical databases: deterring tracker attacks through additive noise. J. Am. Stat. Ass. 95 (2000) 720–729.
7. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B 39 (1977) 1–38.
8. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering, in press (2001).
9. Fienberg, S.E., Makov, U., Steele, R.J.: Disclosure limitation using perturbation and related methods for categorical data (with discussion). J. Off. Stat. 14 (1998) 485–502.
10. Franconi, L., Stander, J.: Model based disclosure limitation for business microdata. In: Proceedings of the International Conference on Establishment Surveys-II, June 17–21, 2000, Buffalo, New York (2000) 887–896.
11. Franconi, L., Stander, J.: A model based method for disclosure limitation of business microdata. J. Roy. Stat. Soc. D Statistician 51 (2002) 1–11.
12. Fuller, W.A.: Masking procedures for microdata disclosure limitation. J. Off. Stat. 9 (1993) 383–406.
13. Grim, J., Bocek, P., Pudil, P.: Safe dissemination of census results by means of Interactive Probabilistic Models. In: ETK-NTTS 2001 Pre-proceedings of the Conference. European Communities Luxembourg (2001) 849–856.
14. Kennickell, A.B.: Multiple imputation and disclosure protection. In: Proceedings of the Conference on Statistical Data Protection, March 25–27, 1998, Lisbon (1999) 381–400.
15. Kim, J.: A method for limiting disclosure of microdata based on random noise and transformation. In: Proceedings of the Survey Research Methods Section, American Statistical Association (1986) 370–374.
16. Little, R.J.A.: Statistical analysis of masked data. J. Off. Stat. 9 (1993) 407–426.
17. Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. John Wiley, New York (1987).
18. Raghunathan, T., Rubin, D.B.: Bayesian multiple imputation to Preserve Confidentiality in Public-Use Data Sets. In: Proceedings of ISBA 2000, The Sixth World Meeting of the International Society for Bayesian Analysis. European Communities Luxembourg (2000).
19. Rubin, D.B.: Discussion of “Statistical disclosure limitation”. J. Off. Stat. 9 (1993) 461–468.
20. Winkler, W.E., Yancey, W.E., Creecy, R.H.: Disclosure risk assessment in perturbative microdata protection via record linkage. In: “Inference Control in Statistical Databases”, LNCS 2316, Springer-Verlag (2002) 135–152.

Author information

Authors and Affiliations

ISTAT, Servizio della Metodologia di Base per la Produzione Statistica, Via Cesare Balbo 16, 00185 Roma, Italy
Silvia Polettini & Luisa Franconi
School of Mathematics and Statistics, University of Plymouth, Drake Circus, Plymouth PL4 8AA, UK
Julian Stander

Editor information

Editors and Affiliations

Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007, Tarragona, Spain
Josep Domingo-Ferrer

About this chapter

Cite this chapter

Polettini, S., Franconi, L., Stander, J. (2002). Model Based Disclosure Protection. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. Lecture Notes in Computer Science, vol 2316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47804-3_7

DOI: https://doi.org/10.1007/3-540-47804-3_7
Published: 23 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43614-0
Online ISBN: 978-3-540-47804-1
eBook Packages: Springer Book Archive
