Privacy-Preserving Record Linkage

  • Conference paper
Privacy in Statistical Databases (PSD 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6344)

Abstract

Record linkage has a long tradition in both the statistical and the computer science literature. We survey current approaches to the record linkage problem in a privacy-aware setting and contrast these with the more traditional literature. We also identify several important open questions that pertain to private record linkage from different perspectives.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hall, R., Fienberg, S.E. (2010). Privacy-Preserving Record Linkage. In: Domingo-Ferrer, J., Magkos, E. (eds) Privacy in Statistical Databases. PSD 2010. Lecture Notes in Computer Science, vol 6344. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15838-4_24

  • DOI: https://doi.org/10.1007/978-3-642-15838-4_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15837-7

  • Online ISBN: 978-3-642-15838-4

  • eBook Packages: Computer Science, Computer Science (R0)
