Abstract
Data administrators have traditionally used record linkage to assess re-identification risk before releasing anonymized microdata sets. In this paper we describe a record linkage procedure based on ranks, and we compare the performance of this rank-based record linkage (RBRL) against the more usual distance-based record linkage (DBRL) in re-identifying records masked with several different masking methods. We try to elicit the reasons why RBRL performs better than DBRL for certain methods and worse for others.
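The abstract does not spell out the rank-based procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that DBRL links each masked record to the nearest original record by Euclidean distance on standardized values, and that a rank-based variant does the same after replacing every attribute value by its rank within its data set. All function names (`ranks`, `to_ranks`, `standardize`, `link`, `dbrl`, `rbrl`, `reidentified`) are hypothetical, and the sketch assumes masked record i is derived from original record i.

```python
import statistics
from typing import List, Sequence

Record = Sequence[float]


def ranks(column: Sequence[float]) -> List[float]:
    """1-based ranks of the values in one attribute; tied values share the average rank."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    result = [0.0] * len(column)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and column[order[j + 1]] == column[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def to_ranks(data: Sequence[Record]) -> List[List[float]]:
    """Replace every value by its rank within its attribute (assumed RBRL preprocessing)."""
    columns = [ranks(col) for col in zip(*data)]
    return [list(row) for row in zip(*columns)]


def standardize(data: Sequence[Record]) -> List[List[float]]:
    """Center and scale each attribute (assumed DBRL preprocessing)."""
    columns = []
    for col in zip(*data):
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col) or 1.0  # avoid division by zero for constant attributes
        columns.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*columns)]


def link(original: Sequence[Record], masked: Sequence[Record]) -> List[int]:
    """For each masked record, the index of the closest original record."""
    links = []
    for m in masked:
        dists = [sum((a - b) ** 2 for a, b in zip(o, m)) for o in original]
        links.append(dists.index(min(dists)))
    return links


def dbrl(original: Sequence[Record], masked: Sequence[Record]) -> List[int]:
    return link(standardize(original), standardize(masked))


def rbrl(original: Sequence[Record], masked: Sequence[Record]) -> List[int]:
    return link(to_ranks(original), to_ranks(masked))


def reidentified(links: Sequence[int]) -> int:
    """Number of correct links, assuming masked record i came from original record i."""
    return sum(1 for i, j in enumerate(links) if i == j)


# Toy data: a monotone but nonlinear masking (cubing) leaves all ranks unchanged.
original = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [10.0, 40.0]]
masked = [[x ** 3, y ** 3] for x, y in original]
print("RBRL correct links:", reidentified(rbrl(original, masked)))
print("DBRL correct links:", reidentified(dbrl(original, masked)))
```

In this toy case the masking preserves the ordering of every attribute, so the rank vectors of the original and masked records coincide; a masking method that perturbs ranks heavily would be expected to affect the two linkage variants differently.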
Acknowledgments and Disclaimer
The second author is partly supported by the European Commission (projects H2020-644024 “CLARUS” and H2020-700540 “CANVAS”), by the Government of Catalonia (ICREA-Acadèmia prize and grant 2014 SGR 537) and by the Spanish Government (projects TIN2014-57364-C2-1-R “SmartGlacis” and TIN2015-70054-REDC). The second author leads the UNESCO Chair in Data Privacy, but the views expressed in this paper are the authors’ own and are not necessarily shared by UNESCO.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Muralidhar, K., Domingo-Ferrer, J. (2016). Rank-Based Record Linkage for Re-Identification Risk Assessment. In: Domingo-Ferrer, J., Pejić-Bach, M. (eds) Privacy in Statistical Databases. PSD 2016. Lecture Notes in Computer Science, vol 9867. Springer, Cham. https://doi.org/10.1007/978-3-319-45381-1_17
DOI: https://doi.org/10.1007/978-3-319-45381-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45380-4
Online ISBN: 978-3-319-45381-1
eBook Packages: Computer Science, Computer Science (R0)