High-density information storage and random access scheme using synthetic DNA

Zhang, Shufang; Wu, Jianjun; Huang, Beibei; Liu, Yuhong

doi:10.1007/s13205-021-02882-w

Original Article
Published: 12 June 2021

3 Biotech, volume 11, article number 328 (2021)

Abstract

The high storage density, long life cycle, and low energy consumption of DNA molecules make DNA a promising medium for next-generation storage. However, DNA storage suffers from high synthesis cost and low random-access efficiency. A high-density DNA coding scheme can effectively reduce the cost of DNA synthesis. This paper first proposes a codebook-based DNA mapping method and a random access method for DNA information based on the encoded content. The mapping method satisfies two biological constraints: homopolymer length and GC content. The random access method can efficiently and selectively read specific files in the DNA pool. To increase storage density, convolutional neural networks are combined with the mapping method to generate base sequences. In the experiments, our method was compared with existing DNA information storage methods, and the results showed that the proposed scheme achieves a higher information storage density.
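
The two biological constraints named in the abstract can be illustrated with a short check. This is a minimal sketch, not the authors' mapping method: the function names and the thresholds (a maximum homopolymer run of 3 and a GC content between 40% and 60%) are assumptions chosen for illustration, since the abstract does not state the limits used.

```python
from itertools import groupby


def max_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of identical consecutive bases."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies_constraints(seq: str, max_run: int = 3,
                          gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Check a base sequence against both constraints.

    max_run, gc_low and gc_high are hypothetical defaults, not values
    taken from the paper.
    """
    return (max_homopolymer_run(seq) <= max_run
            and gc_low <= gc_content(seq) <= gc_high)
```

A codebook-based mapping would presumably only admit codewords for which a check of this kind passes, so that any concatenation of codewords stays within the limits; the exact construction is described in the full article rather than in the abstract.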

Availability of data and material

The data and materials are not publicly available.

Acknowledgements

We thank all co-authors for their contributions. The authors would like to thank Shufang Zhang for the insightful discussions and feedback.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
Shufang Zhang, Jianjun Wu & Beibei Huang
Computer Science and Engineering Department, Santa Clara University, Santa Clara, CA, 95053, USA
Yuhong Liu

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Shufang Zhang, Jianjun Wu, Beibei Huang and Yuhong Liu. The first draft of the manuscript was written by Jianjun Wu, Beibei Huang and Yuhong Liu. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Shufang Zhang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose. The authors have no conflicts of interest to declare that are relevant to the content of this article. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

Ethics approval

This research does not involve human participants or animals. The authors have disclosed all information about this research.

Consent to participate

The authors agree to participate.

Consent for publication

The submission is published with the approval of the authors.

About this article

Cite this article

Zhang, S., Wu, J., Huang, B. et al. High-density information storage and random access scheme using synthetic DNA. 3 Biotech 11, 328 (2021). https://doi.org/10.1007/s13205-021-02882-w

Received: 19 March 2021
Accepted: 03 June 2021
Published: 12 June 2021
DOI: https://doi.org/10.1007/s13205-021-02882-w
