Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

Statistical database security focuses on the protection of confidential individual values stored in so-called statistical databases and used for statistical purposes. Examples include patient records used by medical researchers and detailed phone call records that phone companies analyze statistically in order to improve their services. This problem became apparent in the 1970s and has escalated in recent years due to massive data collection and growing social awareness of individual privacy.

The techniques used for preventing statistical database compromise fall into two categories: noise addition, where all data and/or statistics are available but are approximate rather than exact, and restriction, where the system provides only those statistics and/or data that are considered safe. In either case, a technique is evaluated by measuring both the information loss and the achieved level of privacy. The goal of statistical data protection is to maximize privacy while minimizing information loss. To evaluate a particular technique, it is important to establish a theoretical lower bound on the information loss necessary to achieve a given level of privacy. In this chapter, we present an overview of the problem and the most important results in the area.
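
As a rough illustration of the two categories, the following Python sketch contrasts them on a toy table. Everything in it is an assumption made for illustration and is not taken from the chapter: the records, the function names, the threshold k, the choice of query-set-size control as the restriction rule, and the choice of Gaussian noise as the perturbation.

```python
import random

# Hypothetical toy table of (age, salary); salary is the confidential value.
RECORDS = [
    (25, 40_000), (31, 52_000), (31, 58_000),
    (47, 75_000), (52, 81_000), (60, 90_000),
]

# Assumed threshold: query sets smaller than K (or larger than n - K) are unsafe.
K = 2


def restricted_sum(records, predicate, k=K):
    """Restriction: answer exactly, but only when the query is considered safe.

    Uses query-set-size control as one simple safety rule (an assumption).
    Returns None when the query is refused.
    """
    query_set = [salary for age, salary in records if predicate(age)]
    n = len(records)
    if len(query_set) < k or len(query_set) > n - k:
        return None  # refused: the answer could expose an individual value
    return sum(query_set)


def noisy_sum(records, predicate, scale, rng):
    """Noise addition: always answer, but only approximately.

    Adds zero-mean Gaussian noise with standard deviation `scale`.
    """
    exact = sum(salary for age, salary in records if predicate(age))
    return exact + rng.gauss(0.0, scale)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree

    def is_25(age):
        return age == 25

    def is_31(age):
        return age == 31

    # One matching record: the exact answer is refused, the noisy one is not.
    print(restricted_sum(RECORDS, is_25))
    print(noisy_sum(RECORDS, is_25, scale=5_000.0, rng=rng))

    # Two matching records: the exact answer is released.
    print(restricted_sum(RECORDS, is_31))
```

In this sketch the information loss shows up differently for each category: under restriction it is the set of queries that go unanswered, while under noise addition it is the error in every answer. Raising `k` or `scale` trades more information loss for more privacy.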

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Brankovic, L., Giggins, H. (2007). Statistical Database Security. In: Petković, M., Jonker, W. (eds) Security, Privacy, and Trust in Modern Data Management. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69861-6_12

  • DOI: https://doi.org/10.1007/978-3-540-69861-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69860-9

  • Online ISBN: 978-3-540-69861-6

  • eBook Packages: Computer Science, Computer Science (R0)
