The Vulnerability of Multiplicative Noise Protection to Correlation-Attacks on Continuous Microdata

Sankhya B

Abstract

When multiplicative noise is used to protect the values of a sensitive attribute in microdata, it is frequently assumed that a data intruder uses the noise-multiplied value to estimate the corresponding unobservable original value of a target record. In this paper, we show that a data intruder could easily construct another estimate, rather than using the noise-multiplied value, to attack an original value. The new estimate, called the “correlation-attack” estimate, is obtained by exploiting the potentially high correlation between the noise-multiplied data and the original data. We compare the two estimates (the noise-multiplied value and the correlation-attack estimate) in detail through the mean squared errors of the two underlying estimators, and we propose that data providers always assess the disclosure risk from both estimators when generating noise-multiplied data. Correspondingly, we propose a disclosure risk measure that data providers could use to select the noise-generating variable during the data masking stage. A simulation study illustrates how the disclosure risk measure could help with selecting the noise-generating variable for masking a set of original data.
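
The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's correlation-attack estimator, whose construction is not given in the abstract. It swaps in a simple linear predictor that an intruder could build from the released values alone, assuming the intruder knows that the noise has mean 1 and knows its standard deviation. The log-normal original data, the uniform noise, and the names `simulate`, `linear_attack` and `mse` are all hypothetical choices made for illustration.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def simulate(n=20000, noise_sd=0.3, seed=0):
    """Return original values y and noise-multiplied values y_star = y * c."""
    rng = random.Random(seed)
    # Assumption: original data are log-normal (positive, right-skewed).
    y = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
    # Assumption: noise is uniform with mean 1 and standard deviation noise_sd.
    half_width = noise_sd * 3 ** 0.5
    c = [rng.uniform(1 - half_width, 1 + half_width) for _ in range(n)]
    y_star = [yi * ci for yi, ci in zip(y, c)]
    return y, y_star


def linear_attack(y_star, noise_sd):
    """Linear predictor of y built only from y_star and the noise variance."""
    mean_star = mean(y_star)  # estimates E[Y], because E[C] = 1
    second_moment_star = mean([v * v for v in y_star])
    var_star = second_moment_star - mean_star ** 2
    # Independence gives E[Y*^2] = E[Y^2] * E[C^2], with E[C^2] = 1 + noise_sd^2.
    var_y = second_moment_star / (1 + noise_sd ** 2) - mean_star ** 2
    # Cov(Y, Y*) = Var(Y) when E[C] = 1, so the slope is Var(Y) / Var(Y*).
    slope = var_y / var_star
    return [mean_star + slope * (v - mean_star) for v in y_star]


def mse(estimates, truth):
    """Mean squared error between estimates and the true original values."""
    return mean([(e - t) ** 2 for e, t in zip(estimates, truth)])


if __name__ == "__main__":
    y, y_star = simulate()
    attack = linear_attack(y_star, noise_sd=0.3)
    print("MSE of noise-multiplied value:", mse(y_star, y))
    print("MSE of linear attack estimate:", mse(attack, y))
```

Under these assumptions the linear predictor has the smaller mean squared error, which shows why a data provider who assesses risk only against the noise-multiplied value may understate it. How the paper's own estimator compares depends on its actual construction and on the chosen noise distribution.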

Acknowledgements

We thank all anonymous reviewers for their careful reading of the paper and their constructive comments. This research was conducted with the support of the Australian Government Research Training Program Scholarship.

Author information

Corresponding author

Correspondence to Yue Ma.

About this article

Cite this article

Ma, Y., Lin, YX. & Sarathy, R. The Vulnerability of Multiplicative Noise Protection to Correlation-Attacks on Continuous Microdata. Sankhya B 82, 305–327 (2020). https://doi.org/10.1007/s13571-019-00191-0

  • DOI: https://doi.org/10.1007/s13571-019-00191-0
