On the Lower Bound of Reconstruction Error for Spectral Filtering Based Privacy Preserving Data Mining
Additive randomization has been a primary tool for hiding sensitive private information in privacy-preserving data mining. Previous work based on Spectral Filtering showed empirically that individual data can be separated from the perturbed data, so privacy can be seriously compromised. Our previous work initiated the theoretical study of how the estimation error varies with the noise and gave an upper bound on the Frobenius norm of the reconstruction error using matrix perturbation theory. In this paper, we propose a Singular Value Decomposition (SVD) based reconstruction method and derive a lower bound on its reconstruction error. We then prove that the Spectral Filtering based approach and the proposed SVD approach are equivalent; as a result, the derived lower bound also holds for the Spectral Filtering based approach.
Keywords: Singular Value Decomposition, Reconstruction Error, Frobenius Norm, Data Owner, Spectral Filter
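To make the setting in the abstract concrete, the following is a minimal sketch, not the paper's actual procedure. It assumes low-rank original data, i.i.d. Gaussian additive noise, and a rank `k` that is known in advance. The function names (`perturb`, `svd_reconstruct`, `frobenius_error`, `demo`) and all parameter values are hypothetical and chosen only for illustration. It perturbs a data matrix, reconstructs it with a truncated SVD, and compares Frobenius-norm errors.

```python
import numpy as np


def perturb(U, sigma, rng):
    """Additive randomization: release U + V, with V i.i.d. N(0, sigma^2) noise."""
    return U + rng.normal(0.0, sigma, size=U.shape)


def svd_reconstruct(P, k):
    """Estimate the original data as the rank-k truncated SVD of the perturbed matrix P.

    Assumption: k is supplied by the caller; how to choose it is not covered here.
    """
    left, s, right_t = np.linalg.svd(P, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return (left[:, :k] * s[:k]) @ right_t[:k, :]


def frobenius_error(U, U_hat):
    """Frobenius norm of the reconstruction error U - U_hat."""
    return np.linalg.norm(U - U_hat, ord="fro")


def demo(n=500, m=10, k=2, sigma=0.5, seed=0):
    """Return (error of the perturbed data, error of the SVD reconstruction)."""
    rng = np.random.default_rng(seed)
    # Synthetic rank-k "original" data (an assumption made for this illustration).
    U = rng.normal(size=(n, k)) @ rng.normal(size=(k, m)) * 3.0
    P = perturb(U, sigma, rng)
    U_hat = svd_reconstruct(P, k)
    return frobenius_error(U, P), frobenius_error(U, U_hat)


if __name__ == "__main__":
    noise_err, rec_err = demo()
    print(f"||U - P||_F     = {noise_err:.3f}")
    print(f"||U - U_hat||_F = {rec_err:.3f}")
```

In this synthetic setup the reconstruction error is expected to be smaller than the error of the raw perturbed data, because the truncation discards the noise components outside the dominant singular subspace. The sketch only measures the error empirically and does not compute the lower bound derived in the paper.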