An Efficient Hybrid Encryption Scheme for Large Genomic Data Files

  • Conference paper
Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health (CyberDI 2019, CyberLife 2019)

Abstract

With the rapid development of genomic sequencing technology, the cost of obtaining personal genomic data and analyzing it effectively has steadily fallen. As the analysis and utilization of genomic data have come into public view, the leakage of private genomic data has drawn the attention of researchers. Genomic data has a unique format and a large volume, but existing genetic privacy protection schemes often fail to consider security, availability and efficiency together. In this paper, we analyzed widely used genomic data file formats and designed a hybrid encryption scheme for large genomic data files. First, we designed a key agreement protocol based on RSA asymmetric cryptography. Second, we used AES symmetric encryption to encrypt the genomic data, optimizing the packet processing of files and applying multithreaded encryption, and improved usability by assisting the computing platform with key management. The software implementation indicates that the scheme can be applied to the secure transmission of genomic data in a network environment and provides an efficient encryption method for protecting the privacy of genomic data.
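The overall shape of such a hybrid scheme can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: "packet processing" is read here as splitting the file into fixed-size chunks; a SHA-256 counter keystream stands in for AES so that the example needs only the Python standard library; and the RSA step is textbook RSA without padding over two known Mersenne primes. All names (`wrap_key`, `unwrap_key`, `process`, `CHUNK_SIZE`) are hypothetical, and none of this is secure enough for real data.

```python
import hashlib
import secrets
from concurrent.futures import ThreadPoolExecutor

# Toy RSA key pair from two known Mersenne primes (illustration only, not secure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent; needs Python 3.8+

CHUNK_SIZE = 1 << 16  # hypothetical chunk size in bytes (assumption)
DIGEST_SIZE = 32      # SHA-256 output length in bytes


def wrap_key(key: bytes) -> int:
    """Encrypt the symmetric key under the public key (N, E), without padding."""
    return pow(int.from_bytes(key, "big"), E, N)


def unwrap_key(wrapped: int, key_len: int) -> bytes:
    """Recover the symmetric key with the private exponent D."""
    return pow(wrapped, D, N).to_bytes(key_len, "big")


def keystream(key: bytes, nonce: bytes, start_block: int, length: int) -> bytes:
    """Stand-in for AES in counter mode: SHA-256(key || nonce || counter)."""
    out = bytearray()
    counter = start_block
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_chunk(key: bytes, nonce: bytes, index: int, chunk: bytes) -> bytes:
    """Process one chunk; its counter offset depends only on the chunk index."""
    start_block = index * (CHUNK_SIZE // DIGEST_SIZE)
    stream = keystream(key, nonce, start_block, len(chunk))
    return bytes(a ^ b for a, b in zip(chunk, stream))


def process(key: bytes, nonce: bytes, data: bytes, workers: int = 4) -> bytes:
    """Encrypt or decrypt (the same XOR operation) chunk by chunk in a thread pool."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda item: xor_chunk(key, nonce, *item), enumerate(chunks))
    return b"".join(parts)


if __name__ == "__main__":
    session_key = secrets.token_bytes(32)
    nonce = secrets.token_bytes(16)
    plaintext = b"ACGT" * 50_000                # placeholder for genomic file contents
    wrapped = wrap_key(session_key)             # sender: wrap the key for transport
    ciphertext = process(session_key, nonce, plaintext)
    recovered_key = unwrap_key(wrapped, 32)     # receiver: unwrap with the private key
    assert process(recovered_key, nonce, ciphertext) == plaintext
```

Because each chunk derives its counter offset from its index alone, chunks can be processed in any order or in parallel and still match a single sequential pass. In CPython the threads in this sketch mostly take turns rather than run concurrently; an actual speedup would likely require a native AES implementation that releases the interpreter lock while it works.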

Notes

  1. Here K is the AES encryption key, \( K_{p} \) is the RSA public key, \( K_{s} \) is the RSA private key, m is the plaintext, and \( H(\cdot) \) denotes the hash value.

Acknowledgment

This project is supported by the National Key Research and Development Program of China (No. 2016YFC1000307), the National Natural Science Foundation of China (No. 61571024, No. 61971021) and the Aeronautical Science Foundation of China (No. 2018ZC51016).

Author information

Corresponding author

Correspondence to Tao Shang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Jiang, Y., Shang, T., Liu, J., Cao, Z., Geng, Y. (2019). An Efficient Hybrid Encryption Scheme for Large Genomic Data Files. In: Ning, H. (eds) Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health. CyberDI 2019, CyberLife 2019. Communications in Computer and Information Science, vol 1137. Springer, Singapore. https://doi.org/10.1007/978-981-15-1922-2_15

  • DOI: https://doi.org/10.1007/978-981-15-1922-2_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1921-5

  • Online ISBN: 978-981-15-1922-2

  • eBook Packages: Computer Science, Computer Science (R0)
