Achieving k-Anonymity for Associative Classification in Incremental-Data Scenarios
When a data mining model is developed, one of the most important issues is preserving the privacy of the input data. In this paper, we address the problem of transforming data to preserve privacy with regard to one data mining technique, associative classification, in an incremental-data scenario. We propose an incremental polynomial-time algorithm that transforms the data to meet a privacy standard, namely k-Anonymity, while still preserving the data quality needed to build the associative classification model. The computational complexity of the proposed incremental algorithm ranges from O(n log n) to O(Δn), depending on the characteristics of the incremental data. Experiments were conducted to evaluate the proposed algorithm against a non-incremental algorithm. In the experimental results, the proposed incremental algorithm is more efficient in every problem setting.
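To make the privacy standard concrete, the following minimal Python sketch checks the k-Anonymity condition: every combination of quasi-identifier values must be shared by at least k records. It is not the incremental algorithm proposed in the paper. The function names, the toy records, and the fixed-width age generalization are all assumptions made for illustration only.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Hypothetical generalization step: replace an exact age with a range."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Toy data (assumed): "age" and "zip" are quasi-identifiers, "label" is the class label.
records = [
    {"age": 23, "zip": "130", "label": "yes"},
    {"age": 27, "zip": "130", "label": "no"},
    {"age": 31, "zip": "131", "label": "yes"},
    {"age": 38, "zip": "131", "label": "yes"},
]

# The raw data has a unique (age, zip) pair per record, so it is not 2-anonymous.
raw_ok = is_k_anonymous(records, ["age", "zip"], k=2)

# After generalizing ages to decades, each (age, zip) group holds two records.
generalized = [generalize_age(r) for r in records]
gen_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)
```

The class label is left untouched by the generalization here, which loosely mirrors the goal stated above of keeping the data usable for building a classifier after the transformation.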
Keywords: Execution Time, Class Label, Generalization Level, Incremental Algorithm, Order Change