Abstract
Identifying coding regions in DNA sequences remains challenging. Various methods have been proposed, but they are limited by species dependence and by the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, whereas those in non-coding regions typically share similar structures. For short sequences, these statistical characteristics cannot be extracted correctly, or even detected. This paper introduces a new way to solve the problem: balanced estimation of diffusion entropy (BEDE).
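To make the idea of diffusion entropy concrete, the following is a minimal sketch of the plain (plug-in) diffusion entropy calculation that BEDE is designed to improve on for short sequences. It is not the balanced estimator itself, whose details are not given in this abstract. The purine/pyrimidine mapping to ±1 steps, the function names, and the window lengths are all assumptions made for illustration. For a quasi-random sequence, the entropy S(s) of the window sums is expected to grow roughly as A + δ ln s with δ close to 0.5.

```python
import math
import random
from collections import Counter


def dna_to_steps(seq):
    # Assumed mapping (not taken from the paper): purines A/G -> +1,
    # pyrimidines C/T -> -1, turning the sequence into a random walk.
    return [1 if base in "AG" else -1 for base in seq.upper()]


def diffusion_entropy(steps, window):
    # Naive plug-in estimate: Shannon entropy of the distribution of
    # displacements over all overlapping windows of the given length.
    prefix = [0]
    for x in steps:
        prefix.append(prefix[-1] + x)
    displacements = [
        prefix[i + window] - prefix[i]
        for i in range(len(steps) - window + 1)
    ]
    total = len(displacements)
    counts = Counter(displacements)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def scaling_exponent(steps, windows):
    # Least-squares slope of S(s) against ln(s), i.e. the exponent delta.
    xs = [math.log(s) for s in windows]
    ys = [diffusion_entropy(steps, s) for s in windows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic, uncorrelated sequence standing in for a quasi-random region.
    sequence = "".join(rng.choice("ACGT") for _ in range(20000))
    delta = scaling_exponent(dna_to_steps(sequence), [4, 8, 16, 32, 64])
    print(f"estimated delta: {delta:.3f}")
```

With only a few hundred bases, the histogram behind each entropy value becomes sparse and the plug-in estimate is biased, which is the short-sequence difficulty the abstract describes.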
Conflict of interest
The authors declare no conflict of interest.
Cite this article
Zhang, J., Zhang, W. & Yang, H. In search of coding and non-coding regions of DNA sequences based on balanced estimation of diffusion entropy. J Biol Phys 42, 99–106 (2016). https://doi.org/10.1007/s10867-015-9399-7