Discord Region Based Analysis to Improve Data Utility of Privately Published Time Series

  • Shuai Jin
  • Yubao Liu
  • Zhijie Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6440)

Abstract

Privacy-preserving data publishing is one of the most important issues in privacy-preserving data mining, but the problem of privately publishing time series data has not received enough attention. Random perturbation is an efficient method for privately publishing data: adding random noise introduces uncertainty into the published data, making the original values harder to infer. Existing Gaussian white noise addition distributes the same amount of noise to every attribute of each series, which greatly reduces the utility of the data for classification. By analyzing how different local regions affect the overall classification pattern, we formally define the concept of a discord region, a region that strongly influences classification performance. We perturb the original series differentially, according to whether each position lies in a discord region, to improve the classification utility of the published data. Experimental results on real and synthetic data verify the effectiveness of the proposed methods.
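The idea of position-dependent perturbation can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the formal definition of a discord region or the noise levels used, so the sketch assumes the region is already supplied as a boolean mask. The names `perturb_series`, `in_region`, `sigma_in` and `sigma_out` are hypothetical, and which of the two noise levels should be larger is left to the caller.

```python
import random


def perturb_series(series, in_region, sigma_in, sigma_out, rng=None):
    """Add zero-mean Gaussian noise to each point of a series.

    Points flagged in `in_region` receive noise with standard deviation
    `sigma_in`; all other points receive `sigma_out`. The mask itself is
    an assumed input: how discord regions are detected is not shown here.
    """
    if len(series) != len(in_region):
        raise ValueError("series and in_region must have the same length")
    rng = rng or random.Random()
    return [
        x + rng.gauss(0.0, sigma_in if flagged else sigma_out)
        for x, flagged in zip(series, in_region)
    ]


# Hypothetical data: positions 3 and 4 are assumed to form a discord region.
series = [0.0, 0.1, 0.2, 5.0, 5.1, 0.2, 0.1]
in_region = [False, False, False, True, True, False, False]

# Illustrative choice only: less noise inside the region than outside it.
published = perturb_series(series, in_region, sigma_in=0.1, sigma_out=0.5,
                           rng=random.Random(0))
```

By contrast, uniform Gaussian white noise addition corresponds to calling the function with `sigma_in == sigma_out`, so every point is perturbed to the same degree regardless of its position.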

Keywords

privacy preserving publishing; time series; discord region; random perturbation

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Shuai Jin (1)
  • Yubao Liu (1)
  • Zhijie Li (1)
  1. Department of Computer Science, Sun Yat-sen University, Guangzhou, China
