Classification analysis of a latent dinucleotide periodicity of plant genomes

  • Mathematical Models and Methods
Russian Journal of Genetics

Abstract

The information decomposition (ID) method has been used to search for dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of various plant genomes from the GenBank database, 14 766 sequences with a periodicity of two nucleotides have been found at a high level of statistical significance. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Because ID does not detect periodicities that contain nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences whose dinucleotide periodicity contains such deletions and insertions. The combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.
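
The abstract does not give the formulas behind ID, so the following is only a minimal sketch of the general idea, assuming the mutual-information formulation usually associated with the method: a 4 × 2 periodicity matrix counts each nucleotide at each position of the period, and twice the mutual information of that matrix is compared against a chi-square distribution with (4 − 1)(2 − 1) = 3 degrees of freedom. The function names, the handling of non-ACGT symbols, and the use of the plain chi-square approximation are assumptions made for illustration and are not taken from the article.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def periodicity_matrix(seq, period=2):
    """Count each nucleotide at each position of the period (rows: A, C, G, T)."""
    counts = Counter()
    for idx, base in enumerate(seq.upper()):
        if base in ALPHABET:  # assumption: ambiguous symbols are simply skipped
            counts[(base, idx % period)] += 1
    return [[counts[(b, p)] for p in range(period)] for b in ALPHABET]

def _xlogx(v):
    """v * ln(v), with the usual convention 0 * ln(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0

def mutual_information(matrix):
    """I = sum m ln m - sum x ln x - sum y ln y + L ln L, in count units (nats)."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    return (sum(_xlogx(m) for row in matrix for m in row)
            - sum(_xlogx(x) for x in row_sums)
            - sum(_xlogx(y) for y in col_sums)
            + _xlogx(total))

def chi2_sf_df3(x):
    """Survival function of the chi-square distribution with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def dinucleotide_periodicity_pvalue(seq):
    """Approximate p-value for period-2 structure; 2I is treated as chi-square(3)."""
    return chi2_sf_df3(2 * mutual_information(periodicity_matrix(seq, period=2)))

if __name__ == "__main__":
    for name, seq in [("perfect (AT)n", "AT" * 50), ("homopolymer", "A" * 100)]:
        stat = 2 * mutual_information(periodicity_matrix(seq))
        print(name, round(stat, 2), dinucleotide_periodicity_pvalue(seq))
```

For a perfect (AT)n repeat of length 100 the statistic equals 200 ln 2 ≈ 138.6, while a homopolymer of the same length gives 0, since the nucleotide carries no information about the position in the period. A sketch like this cannot tolerate insertions or deletions, which shift the phase of the period; that limitation is what the abstract says MPA is used to address.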

Author information

Corresponding author

Correspondence to A. A. Shelenkov.

Additional information

Original Russian Text © A.A. Shelenkov, K.G. Skryabin, E.V. Korotkov, 2008, published in Genetika, 2008, Vol. 44, No. 1, pp. 120–136.

About this article

Cite this article

Shelenkov, A.A., Skryabin, K.G. & Korotkov, E.V. Classification analysis of a latent dinucleotide periodicity of plant genomes. Russ J Genet 44, 101–114 (2008). https://doi.org/10.1134/S1022795408010134
