Disclosure Risk and Data Utility

Abstract

As we have repeatedly argued, DSOs fulfill their stewardship responsibilities by resolving the tension between ensuring confidentiality and providing access (Duncan et al., 1993; Kooiman et al., 1999; Marsh et al., 1991). Data stewardship, therefore, requires disseminating data products that both (1) protect confidentiality, keeping disclosure risk R low by providing safe data, and (2) keep data utility U high by providing data that are analytically valid. In other words, the problem of protecting data is bi-criteria. This raises the question of how to balance the two criteria. Answering it requires knowing how R and U affect each other.

Notes

  1. Another possibility is the Sample knowledge state, in which the data snooper is taken to know that τ is one of the values in X (i.e., is in the sample). The Sample knowledge state is the appropriate assumption when the data snooper knows, for whatever reason, that the individual was surveyed in a sample survey. It certainly holds if the data are a census or a near census. Keller and Bethlehem (1992) call this state “response knowledge.” Alternatively, the data snooper might know that the target's record is in the sampling frame, but not necessarily in the actual sample.

References

  • Abowd, J.M., Woodcock, S.D.: Multiply-imputing confidential characteristics and file links in longitudinal linked data. In: Domingo-Ferrer, J., Torra, V. (eds.) Privacy in Statistical Databases 2004, pp. 290–297. Springer, New York, NY (2004)

  • Agrawal, R., Srikant, R.: Privacy-preserving data mining. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, 15–18 May (2000)

  • Brand, R.: Microdata protection through noise addition. In: Domingo-Ferrer, J. (ed.) Inference Control in Statistical Databases. Lecture Notes in Computer Science, vol. 2316, pp. 97–116. Springer, Berlin, Heidelberg (2002a)

  • Dalenius, T., Reiss, S.P.: Data-swapping: a technique for disclosure control. J. Stat. Plann. Inference 6, 73–85 (1982)

  • Domingo-Ferrer, J., Torra, V.: A quantitative comparison of disclosure control methods for microdata. In: Zayatz, L., Doyle, P., Theeuwes, J., Lane, J. (eds.) Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 111–133. North-Holland, Amsterdam (2001)

  • Duncan, G.T., Fienberg, S.E.: Obtaining information while preserving privacy: a Markov perturbation method for tabular data. Eurostat. Proceedings of Statistical Data Protection '98, Lisbon, pp. 351–362 (1999)

  • Duncan, G.T., Jabine, T.B., de Wolf, V.A. (eds.): Panel on Confidentiality and Data Access, Committee on National Statistics, Commission on Behavioral and Social Sciences and Education, National Research Council and the Social Science Research Council, Private Lives and Public Policies: Confidentiality and Accessibility of Government Statistics. National Academy of Sciences, Washington, DC (1993)

  • Duncan, G.T., Keller-McNulty, S.A., Stokes, S.L.: Disclosure risk vs. data utility: the R-U confidentiality map. Technical report LA-UR-01-6428, Los Alamos National Laboratory, Los Alamos, NM (2001)

  • Duncan, G.T., Lambert, D.: Disclosure-limited data dissemination (with discussion). J. Am. Stat. Assoc. 81(393), 10–28 (1986)

  • Duncan, G.T., Lambert, D.: The risk of disclosure for microdata. J. Bus. Econ. Stat. 7, 207–217 (1989)

  • Duncan, G.T., Mukherjee, S.: Optimal disclosure limitation strategy in statistical databases: deterring tracker attacks through additive noise. J. Am. Stat. Assoc. 95, 720–729 (2000)

  • Duncan, G.T., Stokes, S.L.: Disclosure risk vs. data utility: the R-U confidentiality map as applied to topcoding. Chance 17(3), 16–20 (2004)

  • Elliot, M.J.: Data citizenship: a 21st century solution to a 20th century problem. Keynote speech to Exploiting Existing Data for Health Research, St Andrews, September (2007)

  • Elliot, M.J., Dale, A.: Scenarios of attack: a data intruder’s perspective on statistical disclosure risk. Netherlands Official Stat. 14, 6–10 (1999)

  • Kamlet, M.S., Klepper, S., Frank, R.G.: Mixing micro and macro data: statistical issues and implication for data collection and reporting. Proceedings of the 1983 Public Health Conference on Records and Statistics, U.S. Department of Health and Human Services, Hyattsville, MD (1985)

  • Keller, W.J., Bethlehem, J.G.: Disclosure protection of microdata: problems and solutions. Stat. Neerl. 46, 5–19 (1992)

  • Kennickell, A.B., Lane, J.: Measuring the impact of data protection techniques on data utility: evidence from the survey of consumer finances. In: Domingo-Ferrer, J. (ed.) Privacy in Statistical Databases. Lecture Notes in Computer Science, pp. 291–303. Springer, New York, NY (2006)

  • Kim, J.J.: A method for limiting disclosure in microdata based on random noise and transformation. Proceedings of the Section on Survey Research Methods, American Statistical Association, Alexandria, VA, pp. 370–374 (1986)

  • Kim, J.J., Winkler, W.E.: Masking microdata files. Proceedings of the Section on Survey Research Methods, American Statistical Association, Alexandria, VA, pp. 114–119 (1995)

  • Kooiman, P., Nobel, J., Willenborg, L.: Statistical data protection at Statistics Netherlands. Netherlands Official Stat. 14, 21–25 (1999)

  • Kirkwood, C.W.: Strategic Decision Making: Multiobjective Decision Analysis with Spreadsheets. Duxbury Press, Belmont, CA (1996)

  • Lambert, D.: Measures of disclosure risk and harm. J. Official Stat. 9, 313–331 (1993)

  • Little, R.J.A.: Statistical analysis of masked data. J. Official Stat. 9(2), 407–426 (1993)

  • Mackey, E., Elliot, M.J.: An application of game theory to understanding disclosure events. Proceedings of Work Session on Statistical Data Confidentiality, Bilbao, December (2009)

  • Marsh, C., Skinner, C., Arber, S., Penhale, B., Openshaw, S., Hobcraft, J., Lievesley, D., Walford, N.: The case for samples of anonymized records from the 1991 census. J. R. Stat. Soc. Ser. A 154, 305–340 (1991)

  • Mokken, R.J., Kooiman, P., Pannekoek, J., Willenborg, L.C.R.J.: Disclosure risks for microdata. Stat. Neerl. 46, 49–67 (1992)

  • Paass, G.: Disclosure risk and disclosure avoidance for microdata. J. Bus. Econ. Stat. 6(4), 487–500 (1988)

  • Rubin, D.B.: Discussion of statistical disclosure limitation. J. Official Stat. 9(2), 461–468 (1993)

  • Shlomo, N.: Accessing microdata via the internet. Joint UN/ECE and Eurostat Work Session on Statistical Data Confidentiality, Working Paper No. 6. Luxembourg, April 7–9 (2003)

  • Skinner, C.J.: Statistical disclosure issues for census microdata. Paper presented at International Symposium on Statistical Disclosure Avoidance, Voorburg, The Netherlands, 13 December (1990)

  • Spruill, N.L.: The confidentiality and analytic usefulness of masked business microdata. Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 602–607. Alexandria, VA (1983)

  • Sullivan, G., Fuller, W.A.: The use of measurement error to avoid disclosure. Proceedings of the Section on Survey Research Methods, American Statistical Association, Alexandria, VA, pp. 802–807 (1989)

  • Trottini, M.: A decision-theoretic approach to data disclosure problems. Paper prepared for 2nd Joint ECE/Eurostat Work Session on Statistical Data Confidentiality, Skopje, Macedonia, 14–16 March (2001)

  • Trottini, M.: Decision models for data disclosure limitation. Ph.D. thesis, Department of Statistics, Carnegie Mellon University (2003)


Author information

Corresponding author

Correspondence to George T. Duncan.

Copyright information

© 2011 Springer New York

About this chapter

Cite this chapter

Duncan, G.T., Elliot, M., Salazar-González, J.J. (2011). Disclosure Risk and Data Utility. In: Statistical Confidentiality. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7802-8_6
