Datenbank-Spektrum, Volume 16, Issue 3, pp 227–236

Speeding up Privacy Preserving Record Linkage for Metric Space Similarity Measures

Technical Contribution

DOI: 10.1007/s13222-016-0222-9

Cite this article as:
Sehili, Z. & Rahm, E. Datenbank Spektrum (2016) 16: 227. doi:10.1007/s13222-016-0222-9

Abstract

The analysis of person-related data in Big Data applications faces a tradeoff between finding useful results and preserving a high degree of privacy. This is especially challenging when person-related data from multiple sources need to be integrated and analyzed. Privacy-preserving record linkage (PPRL) addresses this problem by encoding sensitive attribute values such that the identification of persons is prevented but records can still be matched. In this paper we study how to improve the efficiency and scalability of PPRL by restricting the search space for matching encoded records. We focus on similarity measures for metric spaces and investigate the use of M-trees as well as pivot-based solutions. Our evaluation shows that the new schemes outperform previous filter approaches by an order of magnitude.
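
To illustrate the general idea of restricting the search space in a metric space, the following is a minimal sketch of pivot-based filtering with the triangle inequality. It is not the authors' implementation. It assumes that encoded records are bit vectors stored as Python integers and that Hamming distance is the metric; the function names `hamming`, `build_index`, and `range_query` and the single-pivot setup are hypothetical choices made for this example only.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit vectors stored as Python ints."""
    return bin(a ^ b).count("1")


def build_index(records: List[int], pivot: int) -> List[Tuple[int, int]]:
    """Precompute the distance of every record to one pivot (assumed setup)."""
    return [(rec, hamming(rec, pivot)) for rec in records]


def range_query(
    query: int, index: List[Tuple[int, int]], pivot: int, radius: int
) -> Tuple[List[int], int]:
    """Return records within `radius` of `query` and the number of
    full distance computations that were actually needed."""
    d_qp = hamming(query, pivot)
    matches: List[int] = []
    comparisons = 0
    for rec, d_rp in index:
        # Triangle inequality: d(query, rec) >= |d(query, pivot) - d(rec, pivot)|.
        # If that lower bound already exceeds the radius, rec cannot match.
        if abs(d_qp - d_rp) > radius:
            continue
        comparisons += 1
        if hamming(query, rec) <= radius:
            matches.append(rec)
    return matches, comparisons


if __name__ == "__main__":
    records = [0b11110000, 0b11110001, 0b00001111, 0b00000000]
    pivot = 0b00000000
    index = build_index(records, pivot)
    found, needed = range_query(0b11110000, index, pivot, radius=1)
    print(found, needed)
```

The filter never discards a true match, because the triangle inequality gives a lower bound on the real distance. How many records it prunes depends on the choice of pivot and on the data.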

Keywords

Metric Space, M-Tree, Triangle Inequality, Bloom Filter, Record Linkage

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Institut für Informatik, Universität Leipzig, Leipzig, Germany
