k-Anonymous Decision Tree Induction

Friedman, Arik; Schuster, Assaf; Wolff, Ran

doi:10.1007/11871637_18

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4213)

Abstract

In this paper we explore an approach to privacy-preserving data mining that relies on the k-anonymity model. The k-anonymity model guarantees that no private information in a table can be linked to a group of fewer than k individuals. We suggest extended definitions of k-anonymity that allow the k-anonymity of a data mining model to be determined. Using these definitions, we present decision tree induction algorithms that are guaranteed to maintain the k-anonymity of the learning examples. Experiments show that embedding anonymization within the decision tree induction process provides better accuracy than anonymizing the data first and inducing the tree afterwards.
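To make the table-level guarantee concrete, the following is a minimal sketch of the standard k-anonymity check on a table: every combination of quasi-identifier values that appears must be shared by at least k rows. It does not implement the extended, model-level definitions the abstract refers to. The function name `is_k_anonymous`, the attribute names, and the toy records are assumptions made for illustration only.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurring in
    `rows` is shared by at least k rows (table-level k-anonymity)."""
    # Group rows by their projection onto the quasi-identifier attributes.
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    # The table is k-anonymous when no group is smaller than k.
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized toy table (not data from the paper).
table = [
    {"zip": "130**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "130**", "age": "20-29", "diagnosis": "asthma"},
    {"zip": "148**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "148**", "age": "30-39", "diagnosis": "diabetes"},
    {"zip": "148**", "age": "30-39", "diagnosis": "flu"},
]

print(is_k_anonymous(table, ["zip", "age"], 2))
print(is_k_anonymous(table, ["zip", "age"], 3))
```

Here the two quasi-identifier groups contain 2 and 3 rows, so the table passes the check for k = 2 and fails it for k = 3.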

Author information

Authors and Affiliations

Computer Science Dept., Technion – Israel Institute of Technology,
Arik Friedman, Assaf Schuster & Ran Wolff

Editor information

Editors and Affiliations

Knowledge Engineering Group, Technische Universität Darmstadt,
Johannes Fürnkranz
Max Planck Institute for Computer Science, Saarbrücken, Germany
Tobias Scheffer
Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, Germany
Myra Spiliopoulou

About this paper

Cite this paper

Friedman, A., Schuster, A., Wolff, R. (2006). k-Anonymous Decision Tree Induction. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Knowledge Discovery in Databases: PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871637_18

DOI: https://doi.org/10.1007/11871637_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45374-1
Online ISBN: 978-3-540-46048-0
eBook Packages: Computer Science, Computer Science (R0)