Using Partially Synthetic Data to Replace Suppression in the Business Dynamics Statistics: Early Results

  • Conference paper
Privacy in Statistical Databases (PSD 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8744)

Abstract

The Business Dynamics Statistics is a product of the U.S. Census Bureau that provides measures of business openings and closings, and job creation and destruction, by a variety of cross-classifications (firm and establishment age and size, industrial sector, and geography). Sensitive data are currently protected through suppression. However, as additional tabulations are being developed, at ever more detailed geographic levels, the number of suppressions increases dramatically. This paper explores the option of providing public-use data that are analytically valid and without suppressions, by leveraging synthetic data to replace observations in sensitive cells.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Miranda, J., Vilhuber, L. (2014). Using Partially Synthetic Data to Replace Suppression in the Business Dynamics Statistics: Early Results. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_18

  • DOI: https://doi.org/10.1007/978-3-319-11257-2_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11256-5

  • Online ISBN: 978-3-319-11257-2

  • eBook Packages: Computer Science, Computer Science (R0)
