Balancing Quality and Confidentiality for Multivariate Tabular Data
- Lawrence H. Cox (National Center for Health Statistics, Centers for Disease Control and Prevention)
- James P. Kelly (OptTek Systems, Inc.)
- Rahul Patil (OptTek Systems, Inc.)
Absolute cell deviation has been used as a proxy for preserving data quality in statistical disclosure limitation for tabular data. However, users' primary interest is that the analytical properties of the data are largely preserved, meaning that the values of key statistics are nearly unchanged. Moreover, important relationships within (additivity) and between (correlation) the published tables should also be unaffected. Previous work demonstrated how to preserve additivity, mean and variance for univariate tabular data. In this paper, we bridge the gap between statistics and mathematical programming to propose nonlinear and linear models based on constraint satisfaction that preserve additivity and the covariance, correlation, and regression coefficient between data tables. Linear models are superior to nonlinear models owing to their simplicity, flexibility and computational speed. Simulations demonstrate that the models preserve key statistics with reasonable accuracy.
Keywords: Controlled tabular adjustment, linear programming, covariance
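One way to see why linear models can address covariance, as the abstract claims, is that when one table is held fixed, the change in covariance caused by adjusting the other table is a linear function of the cell adjustments. The sketch below checks this numerically. It is an illustration only, not the formulation used in the paper; the cell values, the adjustment vector, and the function names are all hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    # Population covariance between two equally sized lists of cell values.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def covariance_change(deltas, ys):
    # Change in Cov(x, y) when x is replaced by x + deltas and y stays fixed.
    # The expression is linear in deltas, so "covariance unchanged" can be
    # written as a single linear constraint on the adjustments.
    my = mean(ys)
    return sum(d * (y - my) for d, y in zip(deltas, ys)) / len(ys)


# Hypothetical cell values for two tables with matching cells.
x = [10.0, 20.0, 30.0, 40.0]
y = [1.0, 2.0, 3.0, 4.0]

# Hypothetical adjustments chosen to sum to zero (mean preserved) and to be
# orthogonal to the centered y values (covariance preserved).
deltas = [1.0, -1.0, -1.0, 1.0]
adjusted = [xi + di for xi, di in zip(x, deltas)]

print(covariance(x, y), covariance(adjusted, y))
print(covariance_change(deltas, y))
```

In this sketch, requiring `covariance_change(deltas, y)` to equal zero is the kind of linear condition that could sit alongside additivity constraints in a linear program; how the paper actually combines them is not stated in the abstract.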
- Book Title: Privacy in Statistical Databases
- Book Subtitle: CASC Project Final Conference, PSD 2004, Barcelona, Spain, June 9-11, 2004. Proceedings
- Pages: pp 87-98
- Series Title: Lecture Notes in Computer Science
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag Berlin Heidelberg
- Editor Affiliations
- 15. Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, UNESCO Chair in Data Privacy
- 16. IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council
- Author Affiliations
- 17. National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, USA
- 18. OptTek Systems, Inc., Boulder, CO, 80302, USA