Can you Really Anonymize the Donors of Genomic Data in Today’s Digital World?

  • Mohammed Alser
  • Nour Almadhoun
  • Azita Nouri
  • Can Alkan
  • Erman Ayday
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9481)


Rapid progress in genome sequencing technologies has made large amounts of genomic data available. Accelerating the pace of biomedical breakthroughs and discoveries requires not only collecting millions of genetic samples but also granting open access to genetic databases. However, one growing concern is the ability to protect the privacy of this sensitive information and of its owners. In this work, we survey a wide spectrum of cross-layer privacy-breaching strategies against human genomic data (using both public genomic databases and other public non-genomic data). We outline the principles and outcomes of each technique and assess its technological complexity and maturity. We then review potential privacy-preserving countermeasures for each threat.


Genomics, Privacy, Bioinformatics



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Mohammed Alser (1)
  • Nour Almadhoun (1)
  • Azita Nouri (1)
  • Can Alkan (1)
  • Erman Ayday (1)

  1. Computer Engineering Department, Bilkent University, Ankara, Turkey
