Predicting Gene Structure with the Use of Mixtures of Probability Distributions

Abstract

The authors consider the problem of reconstructing hidden state sequences for mixture distributions whose components are described by a generalization of high-order Markov chains and hidden Markov models. A new dynamic programming algorithm for this problem is proposed, together with modifications that eliminate recursion and reduce the search. The results are applied to the recognition of gene fragments in plants.
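
For orientation only, the kind of dynamic programming referred to here can be illustrated with the classical Viterbi recursion for a plain first-order hidden Markov model, written iteratively in log space. This is a minimal sketch of the standard textbook technique, not the authors' algorithm, which addresses mixtures of generalized high-order models. The function name `viterbi`, the two-state exon/intron model, and all probabilities below are hypothetical values chosen for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely hidden state path for a first-order HMM (iterative, log space)."""
    # best[t][s]: highest log-probability of any state path ending in s at position t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t - 1][s]: predecessor of s on the best path ending at position t
    for symbol in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


def logs(probabilities):
    """Convert a dict of probabilities to log-probabilities."""
    return {key: math.log(value) for key, value in probabilities.items()}


# Hypothetical toy parameters, not taken from the article.
states = ("exon", "intron")
log_start = logs({"exon": 0.5, "intron": 0.5})
log_trans = {
    "exon": logs({"exon": 0.9, "intron": 0.1}),
    "intron": logs({"exon": 0.1, "intron": 0.9}),
}
log_emit = {
    "exon": logs({"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}),
    "intron": logs({"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}),
}

path, score = viterbi("GCGCATAT", states, log_start, log_trans, log_emit)
print(path, score)
```

Working in log space avoids numerical underflow on long nucleotide sequences, and the explicit loop over positions avoids recursion, which matters for sequences of genomic length.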

Author information

Corresponding author

Correspondence to I. V. Sergienko.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 3, May–June, 2015, pp. 44–53.

About this article

Cite this article

Sergienko, I.V., Gupal, A.M. & Ostrovskiy, A.V. Predicting Gene Structure with the Use of Mixtures of Probability Distributions. Cybern Syst Anal 51, 361–369 (2015). https://doi.org/10.1007/s10559-015-9728-7

  • DOI: https://doi.org/10.1007/s10559-015-9728-7

Keywords

  • Markov chain
  • hidden variables
  • gene
  • bioinformatics
  • nucleotide
  • exon
  • intron
  • likelihood