Abstract
The Business Dynamics Statistics is a product of the U.S. Census Bureau that provides measures of business openings and closings, and of job creation and destruction, by a variety of cross-classifications (firm and establishment age and size, industrial sector, and geography). Sensitive data are currently protected through suppression. However, as additional tabulations are developed at ever more detailed geographic levels, the number of suppressions increases dramatically. This paper explores the option of providing public-use data that are analytically valid and free of suppressions, by using synthetic data to replace observations in sensitive cells.
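As a rough illustration of the idea in the abstract, the sketch below tabulates job creation by cell and, for cells flagged as sensitive, sums synthetic records in place of the confidential ones, so that no cell is suppressed. It is a minimal sketch under stated assumptions, not the procedure used in the chapter: the function name, the record layout, and the sensitivity rule (fewer than `min_firms` distinct firms in a cell) are all hypothetical, and the abstract does not describe the actual disclosure rules.

```python
from collections import defaultdict


def tabulate_without_suppression(confidential, synthetic, min_firms=3):
    """Sum job creation per cell, using synthetic records for sensitive cells.

    Each record is a (cell_key, firm_id, job_creation) tuple.
    ASSUMPTION: a cell is "sensitive" when fewer than `min_firms` distinct
    firms contribute to it. This threshold rule is a placeholder only.
    """
    firms = defaultdict(set)
    conf_total = defaultdict(int)
    for cell, firm_id, jobs in confidential:
        firms[cell].add(firm_id)
        conf_total[cell] += jobs

    synth_total = defaultdict(int)
    for cell, _firm_id, jobs in synthetic:
        synth_total[cell] += jobs

    table = {}
    for cell in conf_total:
        sensitive = len(firms[cell]) < min_firms
        table[cell] = {
            # A sensitive cell with no synthetic records is released as 0.
            "job_creation": synth_total[cell] if sensitive else conf_total[cell],
            "synthetic": sensitive,  # flag the cell so users know its source
        }
    return table


# Made-up records for illustration; cell keys are (state, sector).
confidential = [
    (("MD", "retail"), "f1", 500),
    (("MD", "retail"), "f2", 400),
    (("MD", "retail"), "f3", 300),
    (("MD", "mining"), "f4", 40),
]
synthetic = [
    (("MD", "retail"), "s1", 480),
    (("MD", "retail"), "s2", 390),
    (("MD", "retail"), "s3", 310),
    (("MD", "mining"), "s4", 37),
]
table = tabulate_without_suppression(confidential, synthetic)
```

In this toy input the retail cell has three contributing firms and is released from the confidential records, while the single-firm mining cell is released from the synthetic records and flagged as such. Every cell appears in the output, which is the property the abstract contrasts with suppression.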
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Miranda, J., Vilhuber, L. (2014). Using Partially Synthetic Data to Replace Suppression in the Business Dynamics Statistics: Early Results. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_18
DOI: https://doi.org/10.1007/978-3-319-11257-2_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11256-5
Online ISBN: 978-3-319-11257-2
eBook Packages: Computer Science (R0)