Abstract
Statistical database security focuses on protecting confidential individual values that are stored in so-called statistical databases and used for statistical purposes. Examples include patient records used by medical researchers, and detailed phone call records that phone companies analyze statistically to improve their services. The problem became apparent in the 1970s and has escalated in recent years due to massive data collection and growing social awareness of individual privacy.
The techniques used for preventing statistical database compromise fall into two categories: noise addition, where all data and/or statistics are available but only approximately rather than exactly, and restriction, where the system provides only those statistics and/or data that are considered safe. In either case, a technique is evaluated by measuring both the information loss it causes and the level of privacy it achieves. The goal of statistical data protection is to maximize privacy while minimizing information loss. To evaluate a particular technique, it is important to establish a theoretical lower bound on the information loss necessary to achieve a given level of privacy. In this chapter, we present an overview of the problem and the most important results in the area.
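The contrast between the two categories can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the sample values, the function names `restricted_mean`, `noisy_mean` and `information_loss`, the minimum query-set size used as the safety rule, the Gaussian noise model, and absolute error as the loss measure.

```python
import random
from statistics import mean

# Hypothetical confidential values (e.g. salaries); illustrative data only.
SALARIES = [52_000, 61_000, 58_500, 75_000, 49_000, 66_000]


def restricted_mean(values, query_set, min_size=3):
    """Restriction: answer exactly, but only when the query is deemed safe.

    Here "safe" is assumed to mean that the query set contains at least
    min_size records; otherwise the query is refused (None is returned).
    """
    if len(query_set) < min_size:
        return None
    return mean(values[i] for i in query_set)


def noisy_mean(values, query_set, scale=1_000.0, rng=None):
    """Noise addition: always answer, but release a perturbed statistic.

    Zero-mean Gaussian noise with standard deviation `scale` is added
    to the exact mean of the (non-empty) query set.
    """
    rng = rng or random.Random()
    exact = mean(values[i] for i in query_set)
    return exact + rng.gauss(0.0, scale)


def information_loss(exact, released):
    """Absolute error of a released statistic, one simple loss measure."""
    return abs(exact - released)


if __name__ == "__main__":
    query = [0, 1, 2]
    exact = mean(SALARIES[i] for i in query)
    released = noisy_mean(SALARIES, query, rng=random.Random(0))
    print("restricted, single record:", restricted_mean(SALARIES, [0]))
    print("restricted, three records:", restricted_mean(SALARIES, query))
    print("noisy answer:", released)
    print("information loss:", information_loss(exact, released))
```

In this sketch, restriction loses information by refusing some queries outright, while noise addition answers every query at the cost of accuracy. A larger `scale` or `min_size` would be expected to raise privacy and information loss together, which is the trade-off the abstract describes.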
© 2007 Springer-Verlag Berlin Heidelberg
Cite this chapter
Brankovic, L., Giggins, H. (2007). Statistical Database Security. In: Petković, M., Jonker, W. (eds) Security, Privacy, and Trust in Modern Data Management. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69861-6_12
DOI: https://doi.org/10.1007/978-3-540-69861-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69860-9
Online ISBN: 978-3-540-69861-6
eBook Packages: Computer Science (R0)