Abstract
We provide the first efficient cryptanalysis of Bloom filter encryptions of a database containing more than one personal identifier. The cryptanalysis is fully automated and exposes several drawbacks of existing encryption methods based on Bloom filters. In particular, the special representation of the hash functions as linear combinations of two hash functions f and g is exploited to detect Bloom filter encryptions of single bigrams (so-called atoms). The assignment of atoms to bigrams is obtained via a modification of an algorithm originally proposed for the automated cryptanalysis of simple substitution ciphers. Using our approach, we were able to reconstruct 77.7% of the identifier values correctly. We also point to further improvements of the basic Bloom filter approach whose privacy guarantees are worth investigating in future work.
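The construction that the attack targets can be illustrated with a minimal sketch. It assumes the double-hashing form h_i(x) = (f(x) + i·g(x)) mod m for i = 0, …, k−1, which is one common way to write hash functions as linear combinations of f and g. The choice of SHA-1 and MD5 for f and g, the filter length m = 1000, the number of hash functions k = 15, the blank padding of identifiers, and all function names are assumptions made for illustration only; they are not taken from the abstract.

```python
import hashlib


def bigrams(value):
    """Return the set of 2-character substrings of a blank-padded identifier."""
    padded = f" {value.lower()} "  # padding with blanks is an assumption
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def _hash_int(name, text):
    """Interpret the digest of `text` under hash `name` as a big integer."""
    digest = hashlib.new(name, text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def atom_positions(bigram, m=1000, k=15):
    """Bit positions set by one bigram (its 'atom') under double hashing.

    Assumed form: h_i(x) = (f(x) + i * g(x)) mod m, i = 0..k-1,
    with f = SHA-1 and g = MD5 chosen here purely for illustration.
    """
    f = _hash_int("sha1", bigram) % m
    g = _hash_int("md5", bigram) % m
    return {(f + i * g) % m for i in range(k)}


def bloom_filter(values, m=1000, k=15):
    """Hash the bigrams of several identifier values into one bit list."""
    bits = [0] * m
    for value in values:
        for bg in bigrams(value):
            for pos in atom_positions(bg, m, k):
                bits[pos] = 1
    return bits


# Example: one record with two identifiers stored in a single filter.
record_filter = bloom_filter(["anna", "smith"])
print(sum(record_filter), "bits set out of", len(record_filter))
```

Under this assumed form, the positions belonging to one bigram lie on an arithmetic progression modulo m, determined entirely by the pair (f, g). That regularity is the kind of structure the abstract describes as being exploited to detect atoms.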
Acknowledgements
Research of both authors was supported by the research grant SCHN 586/19-1 of the German Research Foundation (DFG) awarded to the head of the Research Methodology Group, Rainer Schnell. We thank him and the three anonymous reviewers for their helpful comments.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kroll, M., Steinmetzer, S. (2015). Who Is 1011011111…1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers. In: Fred, A., Gamboa, H., Elias, D. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2015. Communications in Computer and Information Science, vol 574. Springer, Cham. https://doi.org/10.1007/978-3-319-27707-3_21
DOI: https://doi.org/10.1007/978-3-319-27707-3_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27706-6
Online ISBN: 978-3-319-27707-3
eBook Packages: Computer Science (R0)