Who Is 1011011111…1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers

  • Conference paper
Biomedical Engineering Systems and Technologies (BIOSTEC 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 574)

Abstract

We provide the first efficient cryptanalysis of Bloom filter encryptions of a database containing more than one personal identifier. The cryptanalysis is fully automated and shows several drawbacks of existing encryption methods based on Bloom filters. In particular, the special representation of the hash functions as linear combinations of two hash functions f and g is exploited in order to detect Bloom filter encryptions of single bigrams (so-called atoms). The assignment of atoms to bigrams is obtained via a modification of an algorithm which was originally proposed for the automated cryptanalysis of simple substitution ciphers. Using our approach, we were able to reconstruct 77.7 % of the identifier values correctly. We point to further improvements of the basic Bloom filter approach that are worth being investigated with respect to their privacy guarantees in future work.
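
The encoding that the abstract refers to can be illustrated with a minimal sketch. It assumes the double-hashing construction in which the i-th hash function is h_i(x) = (f(x) + i·g(x)) mod m, so the bits set by a single bigram (its atom) form an arithmetic progression modulo the filter length. The concrete choices below are assumptions made for illustration and are not taken from the paper: SHA-1 and SHA-256 stand in for f and g, identifier values are padded with blanks, and m = 1000 and k = 15 are arbitrary parameters.

```python
import hashlib


def bigrams(value: str) -> list[str]:
    """Split an identifier value into overlapping bigrams.

    Upper-casing and padding with one blank on each side are assumptions.
    """
    padded = f" {value.upper()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def _digest_as_int(algorithm: str, text: str) -> int:
    """Interpret a hash digest as a non-negative integer."""
    digest = hashlib.new(algorithm, text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def atom_positions(bigram: str, m: int, k: int) -> set[int]:
    """Bit positions set by one bigram: (f + i*g) mod m for i = 0..k-1."""
    f = _digest_as_int("sha1", bigram)    # illustrative stand-in for f
    g = _digest_as_int("sha256", bigram)  # illustrative stand-in for g
    return {(f + i * g) % m for i in range(k)}


def bloom_filter(values: list[str], m: int = 1000, k: int = 15) -> list[int]:
    """Encode several identifier values into one Bloom filter of length m."""
    bits = [0] * m
    for value in values:
        for bigram in bigrams(value):
            for position in atom_positions(bigram, m, k):
                bits[position] = 1
    return bits


if __name__ == "__main__":
    record = ["Smith", "John"]  # hypothetical identifier values
    encoded = bloom_filter(record)
    print("".join(map(str, encoded[:40])), "...")
    print(sorted(atom_positions("SM", 1000, 15)))
```

Because every atom is determined by only two numbers, f(x) mod m and g(x) mod m, the set of possible atoms is far smaller than the set of arbitrary k-subsets of bit positions. This regularity is the kind of structure the abstract describes as being exploited to detect atoms.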

Acknowledgements

Research of both authors was supported by the research grant SCHN 586/19-1 of the German Research Foundation (DFG) awarded to the head of the Research Methodology Group, Rainer Schnell. We thank him and the three anonymous reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Martin Kroll.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kroll, M., Steinmetzer, S. (2015). Who Is 1011011111…1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers. In: Fred, A., Gamboa, H., Elias, D. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2015. Communications in Computer and Information Science, vol 574. Springer, Cham. https://doi.org/10.1007/978-3-319-27707-3_21

  • DOI: https://doi.org/10.1007/978-3-319-27707-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27706-6

  • Online ISBN: 978-3-319-27707-3

  • eBook Packages: Computer Science, Computer Science (R0)
