Abstract
This paper studies the problem of anonymity in multi-instance (MI) micro-data publication. The classical k-anonymity approach is shown to be insufficient or inappropriate for MI databases, so it is extended to them, yielding the more general setting of MI k-anonymity. We show that the MI k-anonymity problem is NP-hard and that the attack model for MI databases differs from that of single-instance databases. We also observe that MI k-anonymity is not a strong privacy guarantee when anonymity sets are highly unbalanced with respect to instance counts. To address this, we introduce a new anonymity principle, called p-certainty, which is unique to the MI case. A clustering algorithm for the p-certainty anonymity principle is developed and experimentally evaluated.
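The intuition behind the abstract can be illustrated with a small Python sketch. It is not the paper's formal definition of MI k-anonymity or p-certainty, neither of which is reproduced on this page. It assumes that each individual may own several records (instances), and that a group of records sharing the same generalized quasi-identifier forms an anonymity set. The function name `group_stats`, the field names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def group_stats(records):
    """Summarize each anonymity set.

    records: iterable of (owner_id, quasi_identifier) pairs, one per instance.
    Returns {quasi_identifier: {"rows", "owners", "max_share"}}.
    These statistics are illustrative assumptions, not the paper's definitions.
    """
    # Count how many instances each owner contributes to each group.
    per_group = defaultdict(lambda: defaultdict(int))
    for owner, qi in records:
        per_group[qi][owner] += 1

    stats = {}
    for qi, counts in per_group.items():
        rows = sum(counts.values())
        stats[qi] = {
            "rows": rows,                 # number of instances in the group
            "owners": len(counts),        # number of distinct individuals
            "max_share": max(counts.values()) / rows,  # largest owner's share
        }
    return stats


# Hypothetical data: alice owns three of the four instances in the first group.
records = [
    ("alice", ("3*", "F")), ("alice", ("3*", "F")), ("alice", ("3*", "F")),
    ("bob", ("3*", "F")),
    ("carol", ("4*", "M")), ("dave", ("4*", "M")), ("erin", ("4*", "M")),
]

stats = group_stats(records)
k = 3
# Counting rows, as a single-instance check would, accepts both groups.
row_level_ok = all(s["rows"] >= k for s in stats.values())
# Counting distinct individuals rejects the first group (only two owners).
owner_level_ok = all(s["owners"] >= k for s in stats.values())
```

In this toy data, a row-level check with k = 3 passes while an individual-level check fails, which is one way to read the claim that classical k-anonymity is inappropriate for MI databases. The `max_share` figure shows what an unbalanced anonymity set might look like: even a group with enough distinct individuals can be dominated by the instances of one of them.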
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Abul, O. (2013). Anonymity in Multi-Instance Micro-Data Publication. In: Gelenbe, E., Lent, R. (eds) Information Sciences and Systems 2013. Lecture Notes in Electrical Engineering, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-01604-7_32
DOI: https://doi.org/10.1007/978-3-319-01604-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01603-0
Online ISBN: 978-3-319-01604-7
eBook Packages: Computer Science (R0)