Medical & Biological Engineering & Computing, Volume 52, Issue 11, pp 945–961

DNA-LCEB: a high-capacity and mutation-resistant DNA data-hiding approach by employing encryption, error correcting codes, and hybrid twofold and fourfold codon-based strategy for synonymous substitution in amino acids

  • Ibbad Hafeez
  • Asifullah Khan
  • Abdul Qadir
Original Article

Abstract

Data hiding in deoxyribonucleic acid (DNA) sequences can be used to develop an organic memory and to track parent genes in offspring as well as in genetically modified organisms. However, the main concerns regarding data hiding in DNA sequences are the survival of the organism and the successful extraction of the watermark from the DNA. This implies that the organism should live and reproduce without any functional disorder even in the presence of the embedded data. Consequently, synonymous substitution in amino acids becomes a primary option for watermarking. In this regard, a hybrid watermark embedding strategy that employs synonymous substitution in both twofold and fourfold codons of amino acids is proposed. This work thus presents a high-capacity and mutation-resistant watermarking technique, DNA-LCEB, for hiding secret information in the DNA of living organisms. By employing the different types of synonymous codons of amino acids, the data storage capacity is significantly increased. It is further observed that the proposed DNA-LCEB, which combines synonymous substitution, lossless compression, encryption, and Bose–Chaudhuri–Hocquenghem (BCH) coding, is secure and performs better in terms of both capacity and robustness than existing DNA data-hiding schemes. The proposed DNA-LCEB is tested against different mutations, including silent, missense, and nonsense mutations, and provides substantial improvement in terms of mutation detection/correction rate and bits per nucleotide. A web application for DNA-LCEB is available at http://111.68.99.218/DNA-LCEB.
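The idea of hiding bits through synonymous substitution in twofold and fourfold codons can be illustrated with a minimal Python sketch. It is not the DNA-LCEB algorithm: the abstract does not specify the codon-to-bit mapping, so the mapping used here (alphabetical index within a synonymous family) is an assumption, as are the names `FAMILIES`, `embed`, `extract`, and `translate`. The sketch also assumes the standard genetic code, treats the fourfold and twofold blocks of the sixfold amino acids (Leu, Ser, Arg) as separate families, and omits the compression, encryption, and BCH coding stages entirely.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)
}


def build_families():
    """Map each carrier codon to its twofold or fourfold synonymous family.

    Assumption of this sketch: codons sharing the first two bases and the
    same amino acid form one family; stop codons are never used as carriers.
    """
    groups = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            groups.setdefault((codon[:2], aa), []).append(codon)
    families = {}
    for members in groups.values():
        if len(members) in (2, 4):
            family = tuple(sorted(members))
            for codon in family:
                families[codon] = family
    return families


FAMILIES = build_families()


def codons_of(dna):
    """Split a sequence into in-frame codons, ignoring a trailing partial codon."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    return "".join(CODON_TABLE[c] for c in codons_of(dna))


def embed(dna, bits):
    """Hide a bit string: 1 bit per twofold codon, 2 bits per fourfold codon."""
    out, pos = [], 0
    for codon in codons_of(dna):
        family = FAMILIES.get(codon)
        if family is not None and pos < len(bits):
            k = len(family).bit_length() - 1          # 2 -> 1 bit, 4 -> 2 bits
            chunk = bits[pos:pos + k].ljust(k, "0")   # zero-pad the last chunk
            codon = family[int(chunk, 2)]
            pos += k
        out.append(codon)
    if pos < len(bits):
        raise ValueError("cover sequence too short for payload")
    return "".join(out) + dna[3 * len(out):]


def extract(dna, n_bits):
    """Recover n_bits from a sequence produced by embed()."""
    out, total = [], 0
    for codon in codons_of(dna):
        if total >= n_bits:
            break
        family = FAMILIES.get(codon)
        if family is not None:
            k = len(family).bit_length() - 1
            out.append(format(family.index(codon), f"0{k}b"))
            total += k
    return "".join(out)[:n_bits]


if __name__ == "__main__":
    cover = "ATGGCTAAAGGTTTTCCGGATTAA"   # hypothetical toy gene
    payload = "101101001"
    stego = embed(cover, payload)
    print(translate(cover) == translate(stego))    # protein is preserved
    print(extract(stego, len(payload)) == payload)
    print(len(payload) / len(cover), "bits per nucleotide")
```

Because every replacement codon is drawn from the same synonymous family as the original, the translated protein is unchanged, which is the property the abstract relies on for organism survival. The extractor needs the payload length, which a real scheme would have to convey or fix in advance.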

Keywords

DNA · Data hiding · Watermarking · Organic memory · Synonymous substitution · Amino acids

Notes

Acknowledgments

This work is supported by ICT R&D, Pakistan research grant project; ICTRDF/TR&D/2012/62-DEWS and COMSTECH-TWAS Joint Research Grants Program for Young Scientist; 12-216 RG/ITC/AS-C; UNESCO FR: 3240270865. We also thank Mr. Khurram Jawad for his help in improving the write-up of the manuscript.

Supplementary material

Supplementary material 1 (PDF 153 kb)

Copyright information

© International Federation for Medical and Biological Engineering 2014

Authors and Affiliations

  1. Department of Computer Science, University of Helsinki, Helsinki, Finland
  2. Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan
