Privacy in Statistical Databases

Volume 3050 of the series Lecture Notes in Computer Science, pp. 87-98

Balancing Quality and Confidentiality for Multivariate Tabular Data

  • Lawrence H. Cox (National Center for Health Statistics, Centers for Disease Control and Prevention)
  • James P. Kelly (OptTek Systems, Inc.)
  • Rahul Patil (OptTek Systems, Inc.)



Absolute cell deviation has been used as a proxy for preserving data quality in statistical disclosure limitation for tabular data. However, users’ primary interest is that analytical properties of the data are largely preserved, meaning that the values of key statistics are nearly unchanged. Moreover, important relationships within (additivity) and between (correlation) the published tables should also be unaffected. Previous work demonstrated how to preserve additivity, mean and variance for univariate tabular data. In this paper, we bridge the gap between statistics and mathematical programming to propose nonlinear and linear models based on constraint satisfaction to preserve additivity and covariance, correlation, and regression coefficient between data tables. Linear models are superior to nonlinear models owing to simplicity, flexibility and computational speed. Simulations demonstrate that the models perform well in terms of preserving key statistics with reasonable accuracy.
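To make the idea of preserving key statistics through linear constraints concrete, the following is a minimal sketch and not the models proposed in the paper. It assumes a single original table `x` and an adjusted table `y = x + d`, and it assumes two illustrative conditions on the adjustments `d`: they sum to zero, and they are orthogonal to the centered original values. Both conditions are linear in `d`; under them the mean is unchanged and the covariance between original and adjusted values equals the original variance. All names and numbers below are hypothetical.

```python
from math import sqrt

def mean(v):
    """Arithmetic mean of a list of numbers."""
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def linear_constraint_residuals(x, d):
    """Left-hand sides of the two assumed linear constraints on d.

    Both should be zero: sum(d) keeps the mean fixed, and the
    weighted sum keeps Cov(x, x + d) equal to Var(x).
    """
    mx = mean(x)
    return sum(d), sum((xi - mx) * di for xi, di in zip(x, d))

# Hypothetical original cell values and adjustments (illustration only).
x = [10.0, 20.0, 30.0, 40.0]
d = [1.0, -1.0, -1.0, 1.0]
y = [xi + di for xi, di in zip(x, d)]

res = linear_constraint_residuals(x, d)
corr = cov(x, y) / sqrt(cov(x, x) * cov(y, y))

print("constraint residuals:", res)
print("mean original / adjusted:", mean(x), mean(y))
print("Var(x) / Cov(x, y):", cov(x, x), cov(x, y))
print("correlation(x, y):", corr)
```

In this sketch the adjustments satisfy both constraints exactly, so the mean and the covariance with the original are preserved, while the variance of the adjusted values and hence the correlation change slightly. Because the conditions are linear in `d`, they could in principle be handed to a linear programming solver as constraints.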


Controlled tabular adjustment, linear programming, covariance