Construction of DNA codes with multiple constrained properties

  • Research
  • Published in: Cryptography and Communications

Abstract

DNA sequences are prone to forming secondary structures: they fold back on themselves through non-specific hybridization of their nucleotides. Secondary structures with large stem length make the sequences chemically inactive towards synthesis and sequencing processes. In DNA computing, further constraints, such as a bound on the homopolymer run length, introduce additional complications. This paper addresses the problems caused by secondary structures in DNA sequences together with the constraint of avoiding long homopolymer runs. We present families of DNA codes with secondary structures of stem length at most two and homopolymer run length at most four. We identify \(\mathbb{Z}_{11}\) as an ideal structure for constructing DNA codes that avoid these problems. By mapping error-correcting codes over \(\mathbb{Z}_{11}\) to DNA nucleotides, we obtain DNA codes with rates 0.5765 times the corresponding code rate over \(\mathbb{Z}_{11}\), including some new secondary structure-free and better-performing codes for DNA-based data storage and DNA computing.
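The two constraints named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction. It assumes a simplified model in which a stem of length ℓ is a pair of non-overlapping substrings of length ℓ, one being the reverse complement of the other; the function names are hypothetical. The last lines note that the factor 0.5765 is numerically consistent with \(\log_4(11)/3\), which would correspond to mapping each \(\mathbb{Z}_{11}\) symbol to three nucleotides; the abstract does not state this, so it is an assumption.

```python
import math

# Watson-Crick complement of each nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def max_homopolymer_run(seq):
    """Return the length of the longest run of identical nucleotides."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def has_stem(seq, length):
    """Assumed stem model: True if some substring of the given length is
    followed, without overlap, by its own reverse complement."""
    for i in range(len(seq) - 2 * length + 1):
        target = reverse_complement(seq[i:i + length])
        # Search only from i + length onward so the two arms do not overlap.
        if seq.find(target, i + length) != -1:
            return True
    return False


def satisfies_constraints(seq, max_run=4, max_stem=2):
    """Check run length <= max_run and no stem longer than max_stem."""
    return max_homopolymer_run(seq) <= max_run and not has_stem(seq, max_stem + 1)


# Assumption, not stated in the abstract: one Z_11 symbol -> 3 nucleotides.
rate_factor = math.log(11, 4) / 3

print(satisfies_constraints("ACACACAC"))  # no long run, no stem of length 3
print(satisfies_constraints("ACGAACGT"))  # "ACG" pairs with "CGT"
print(rate_factor)
```

Under this model, "stem length at most two" is checked by ruling out any stem of length three, since a longer stem always contains one of length three.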

Acknowledgements

The authors sincerely thank the referees for their meticulous reading of this manuscript and for valuable suggestions that helped improve the final version. Part of this work was done during the visit of Prof. Abhay Kumar Singh to Prof. Udaya Parampalli, School of Computing and Information Systems, The University of Melbourne, Parkville, Australia, in September 2023. Prof. Abhay Kumar Singh thanks the School of Computing and Information Systems at The University of Melbourne, Parkville, Australia, for its hospitality and support during discussions on DNA-constrained codes.

Author information

Contributions

All authors contributed equally.

Corresponding author

Correspondence to Abhay Kumar Singh.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Bhoi, S.S., Parampalli, U. & Singh, A.K. Construction of DNA codes with multiple constrained properties. Cryptogr. Commun. (2024). https://doi.org/10.1007/s12095-024-00718-x

  • DOI: https://doi.org/10.1007/s12095-024-00718-x
