We focus primarily on additive and matrix multiplicative data perturbation techniques in privacy-preserving data mining (PPDM). We survey a recent body of research aimed at better understanding the vulnerabilities of these techniques. In this research, the authors assume the role of an attacker and develop methods for estimating the original data from the perturbed data and any available prior knowledge. Finally, we briefly discuss research on attacking k-anonymization, another data perturbation technique in PPDM.
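To make the attacker's viewpoint concrete, the following is a minimal sketch, not a method taken from this chapter, of additive perturbation and one simple principal-component filtering attack against it. Everything specific here is an assumption chosen for illustration: the synthetic low-rank data, independent Gaussian noise with a standard deviation `sigma` known to the attacker, the heuristic eigenvalue threshold of `2 * sigma**2`, and the function names `additive_perturb`, `pca_filter_attack`, and `demo`.

```python
import numpy as np


def additive_perturb(X, sigma, rng):
    """Release Y = X + R, where R is i.i.d. Gaussian noise (assumed noise model)."""
    return X + rng.normal(0.0, sigma, size=X.shape)


def pca_filter_attack(Y, sigma):
    """Estimate X from Y by projecting onto the dominant principal components.

    Assumption: the attacker knows sigma. Directions whose sample variance
    is close to sigma**2 are treated as mostly noise and discarded.
    """
    mean = Y.mean(axis=0)
    Yc = Y - mean
    cov = np.cov(Yc, rowvar=False)           # d x d sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    keep = eigvals > 2.0 * sigma ** 2        # heuristic threshold (assumption)
    V = eigvecs[:, keep]
    return Yc @ V @ V.T + mean


def demo(seed=0):
    """Compare the error of the released data with the attacker's estimate."""
    rng = np.random.default_rng(seed)
    n, d, k = 2000, 20, 2
    # Synthetic, strongly correlated data: d attributes driven by k latent factors.
    X = 3.0 * rng.normal(size=(n, k)) @ rng.normal(size=(k, d))
    sigma = 1.0
    Y = additive_perturb(X, sigma, rng)
    X_hat = pca_filter_attack(Y, sigma)
    mse_perturbed = float(np.mean((Y - X) ** 2))
    mse_attack = float(np.mean((X_hat - X) ** 2))
    return mse_perturbed, mse_attack


if __name__ == "__main__":
    mse_perturbed, mse_attack = demo()
    print(f"MSE of released data:      {mse_perturbed:.3f}")
    print(f"MSE of attacker estimate:  {mse_attack:.3f}")
```

On strongly correlated data such as this, the attacker's estimate should sit closer to the original values than the released data does, which is the kind of vulnerability the surveyed work studies. The sketch relies on the data being concentrated in a few directions; on weakly correlated data the same filter would offer the attacker much less.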
© 2008 Springer Science+Business Media, LLC
Cite this chapter
Liu, K., Giannella, C., Kargupta, H. (2008). A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods. In: Aggarwal, C.C., Yu, P.S. (eds) Privacy-Preserving Data Mining. Advances in Database Systems, vol 34. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-70992-5_15
DOI: https://doi.org/10.1007/978-0-387-70992-5_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-70991-8
Online ISBN: 978-0-387-70992-5
eBook Packages: Computer Science