On duplication-free codes for disjoint or equal-length errors

Designs, Codes and Cryptography

Abstract

Motivated by applications in DNA storage, we study strings affected by tandem-duplication errors. In particular, we consider two settings: disjoint tandem-duplication errors and equal-length tandem-duplication errors. We construct codes with positive asymptotic rate for each of the two settings, as well as for their combination. Our constructions are duplication-free codes, whose codewords contain no tandem duplications of specific lengths. Additionally, our codes generalize previous constructions, containing them as special cases.
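
To make the terminology concrete, the following minimal sketch illustrates a tandem-duplication error and a check for duplication-freeness with respect to a set of lengths. It assumes the standard definitions (a tandem duplication of length k inserts a copy of a length-k substring immediately after itself; a string is duplication-free for length k if it contains no factor of the form ww with |w| = k). The function names `tandem_duplicate`, `has_tandem_repeat`, and `is_duplication_free` are hypothetical and are not taken from the article, whose precise definitions and constructions may differ.

```python
from typing import Iterable


def tandem_duplicate(s: str, i: int, k: int) -> str:
    """Return s with the substring s[i:i+k] duplicated in place (assumed error model)."""
    if k <= 0 or i < 0 or i + k > len(s):
        raise ValueError("duplicated window must lie inside the string")
    return s[:i + k] + s[i:i + k] + s[i + k:]


def has_tandem_repeat(s: str, k: int) -> bool:
    """Return True if s contains a factor ww with |w| = k."""
    return any(s[j:j + k] == s[j + k:j + 2 * k]
               for j in range(len(s) - 2 * k + 1))


def is_duplication_free(s: str, lengths: Iterable[int]) -> bool:
    """Return True if s contains no tandem repeat of any length in `lengths`."""
    return not any(has_tandem_repeat(s, k) for k in lengths)


if __name__ == "__main__":
    word = "ACGT"
    corrupted = tandem_duplicate(word, 1, 2)   # duplicates "CG"
    print(corrupted)
    print(is_duplication_free(word, [1, 2]))
    print(is_duplication_free(corrupted, [2]))
```

Under these assumptions, a duplication-free codeword that suffers a tandem duplication of one of the forbidden lengths necessarily contains a repeat of that length afterwards, which is what such a check would detect.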

Data availability

No datasets were generated or analysed during the current study.

Notes

  1. The codes \(C_F\) in [11] contain all duplication-free strings of length up to \(n\), suitably padded so that every codeword has length \(n\).

References

  1. Ben-Tolila E., Schwartz M.: On the reverse-complement string-duplication system. IEEE Trans. Inform. Theory 68(11), 7184–7197 (2022).

  2. Berstel J.: Growth of repetition-free words—a review. Theor. Comp. Sci. 340(2), 280–290 (2005).

  3. Church G.M., Gao Y., Kosuri S.: Next-generation digital information storage in DNA. Science 337, 1628 (2012).

  4. Elishco O.: On the long-term behavior of \(k\)-tuples frequencies in mutation systems. arXiv preprint arXiv:2401.04020 (2024).

  5. Elishco O., Farnoud F., Schwartz M., Bruck J.: The entropy rate of some Pólya string models. IEEE Trans. Inform. Theory 65(12), 8180–8193 (2019).

  6. Farnoud F., Schwartz M., Bruck J.: The capacity of string-duplication systems. IEEE Trans. Inform. Theory 62(2), 811–824 (2016).

  7. Farnoud F., Schwartz M., Bruck J.: Estimation of duplication history under a stochastic model for tandem repeats. BMC Bioinform. 20(64), 1–11 (2019).

  8. Goldman N., Bertone P., Chen S., Dessimoz C., LeProust E.M., Sipos B., Birney E.: Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature 494(7435), 77–80 (2013).

  9. Goshkoder D., Polyanskii N., Vorobyev I.: Codes correcting a single long duplication error. In: Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT2023), Taipei, Taiwan, pp. 2708–2713 (2023).

  10. Jain S., Farnoud F., Bruck J.: Capacity and expressiveness of genomic tandem duplication. IEEE Trans. Inform. Theory 63(10), 6129–6138 (2017).

  11. Jain S., Farnoud F., Schwartz M., Bruck J.: Duplication-correcting codes for data storage in the DNA of living organisms. IEEE Trans. Inform. Theory 63(8), 4996–5010 (2017).

  12. Kovačević M.: Zero-error capacity of duplication channels. IEEE Trans. Commun. 67(10), 6735–6742 (2019).

  13. Lenz A., Wachter-Zeh A., Yaakobi E.: Duplication-correcting codes. Des. Codes Cryptogr. 87, 277–298 (2019).

  14. Lind D., Marcus B.H.: An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge (1995).

  15. Marcus B.H., Roth R.M., Siegel P.H.: An Introduction to Coding for Constrained Systems (2001). Unpublished lecture notes. https://personal.math.ubc.ca/~marcus/Handbook/index.html.

  16. Nguyen T.T., Cai K., Song W., Immink K.A.S.: Optimal single chromosome-inversion correcting codes for data storage in live DNA. In: Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT2022), Espoo, Finland, pp. 1791–1796 (2022).

  17. Shipman S.L., Nivala J., Macklis J.D., Church G.M.: CRISPR-Cas encoding of digital movie into the genomes of a population of living bacteria. Nature 547, 345–349 (2017).

  18. Tang Y., Farnoud F.: Error-correcting codes for short tandem duplication and edit errors. IEEE Trans. Inform. Theory 68(2), 871–880 (2021).

  19. Tang Y., Yehezkeally Y., Schwartz M., Farnoud F.: Single-error detection and correction for duplication and substitution channels. IEEE Trans. Inform. Theory 66(11), 6908–6919 (2020).

  20. Tang Y., Wang S., Lou H., Gabrys R., Farnoud F.: Low-redundancy codes for correcting multiple short-duplication and edit errors. IEEE Trans. Inform. Theory 69(5), 2940–2954 (2023).

  21. Yohananov L., Schwartz M.: Optimal reverse-complement-duplication error-correcting codes. arXiv preprint arXiv:2312.00394 (2023).

  22. Zeraatpisheh M., Esmaeili M., Gulliver T.A.: Construction of tandem duplication correcting codes. IET Commun. 13(15), 2217–2225 (2019).

  23. Zeraatpisheh M., Esmaeili M., Gulliver T.A.: Construction of duplication correcting codes. IEEE Access 8, 96150–96161 (2020).

Acknowledgements

This work was supported in part by the Zhejiang Lab BioBit Program (grant no. 2022YFB507). The author M. Schwartz is currently on a leave of absence from Ben-Gurion University of the Negev.

Author information

Contributions

All authors contributed to the design and implementation and wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Wenjun Yu.

Ethics declarations

Competing interest

The authors declare no competing interests.

Additional information

Communicated by T. Feng.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, W., Schwartz, M. On duplication-free codes for disjoint or equal-length errors. Des. Codes Cryptogr. (2024). https://doi.org/10.1007/s10623-024-01417-7

  • DOI: https://doi.org/10.1007/s10623-024-01417-7
