Record matching; Re-identification
Record linkage is a computational procedure for linking each record a in file A (e.g., a file masked for disclosure protection) to a record b in file B (e.g., the original file). The pair (a, b) is a match if b is in fact the original record corresponding to a.
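As an illustration of the definition above, the following minimal sketch links each record of a masked file to its nearest record in the original file and counts how many links are matches. It uses distance-based linkage with Euclidean distance, which is only one of several possible linkage rules and is chosen here for brevity. The function names `link_nearest` and `count_matches`, the toy numeric records, and the assumption that the masked record at position i was derived from the original record at position i are all hypothetical and serve only this example.

```python
import math

def link_nearest(masked, original):
    """Link each record a in `masked` to the index of its closest record b in `original`."""
    links = []
    for a in masked:
        # Euclidean distance is an assumed choice; other distances can be swapped in.
        best = min(range(len(original)), key=lambda j: math.dist(a, original[j]))
        links.append(best)
    return links

def count_matches(links):
    """Count links (a, b) where b is the original record of a.

    Assumes masked[i] was derived from original[i], so a link is a match
    exactly when links[i] == i.
    """
    return sum(1 for i, j in enumerate(links) if i == j)

# Toy data (hypothetical): two numeric attributes per record.
original = [(25.0, 30.0), (40.0, 52.0), (61.0, 48.0), (33.0, 75.0)]
masked = [(27.0, 31.0), (38.0, 55.0), (35.0, 72.0), (34.0, 70.0)]

links = link_nearest(masked, original)
matches = count_matches(links)
print(links, matches)
```

In this toy data the third masked record has been perturbed so heavily that it is linked to the wrong original record, so only three of the four links are matches.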
Record linkage techniques were originally developed for data fusion and for improving data quality. However, they have also found an application in measuring the risk of identity disclosure in statistical disclosure control (SDC). In the SDC context, it is assumed that an intruder has an external dataset that shares some (key or outcome) attributes with the released protected dataset and additionally contains some identifier attributes (e.g., passport number, full name, etc.). The intruder is assumed to attempt to link the protected dataset with the external dataset using the shared attributes. The number of matches gives an estimate of the number of protected records whose respondent can...