Using Em-Algorithm for Gene Classification

Abstract

The EM algorithm is applied to the problem of separating mixtures of distributions described by Markov chains, together with the related problem of weighted likelihood maximization. Auxiliary algorithms are proposed for selecting the initial approximation and the optimal number of mixture components, along with a method that uses support vector machines to approximate the distribution mixture from given data. The results are applied to the classification of gene fragments.
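The abstract does not give the algorithm itself, so the following is only a minimal sketch of what EM for a mixture of Markov chains can look like, not the authors' method. It assumes first-order chains over the alphabet A, C, G, T, ignores the initial-state distribution, and smooths the transition estimates with a pseudocount. The names `em_markov_mixture`, `transition_counts`, `pseudo`, and `init_resp` are hypothetical and chosen for illustration. The M-step is a weighted likelihood maximization: each sequence contributes its transition counts to every component in proportion to its current responsibility.

```python
import math
import random

ALPHABET = "ACGT"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
N = len(ALPHABET)


def transition_counts(seq):
    """Count first-order transitions (previous base -> next base) in one sequence."""
    counts = [[0.0] * N for _ in range(N)]
    for prev, nxt in zip(seq, seq[1:]):
        counts[INDEX[prev]][INDEX[nxt]] += 1.0
    return counts


def em_markov_mixture(seqs, k, n_iter=50, pseudo=1.0, init_resp=None, seed=0):
    """Fit a k-component mixture of first-order Markov chains with EM.

    Assumes n_iter >= 1 and pseudo > 0 (so no transition probability is zero).
    Returns (mixture weights, transition matrices, responsibilities).
    """
    counts = [transition_counts(s) for s in seqs]
    if init_resp is None:
        # Assumption: random soft assignments serve as the initial approximation.
        rng = random.Random(seed)
        init_resp = [[rng.random() + 1e-3 for _ in range(k)] for _ in seqs]
    resp = [[r / sum(row) for r in row] for row in init_resp]

    for _ in range(n_iter):
        # M-step: weighted likelihood maximization. Each sequence adds its
        # transition counts to component c, scaled by its responsibility.
        weights = [sum(row[c] for row in resp) / len(seqs) for c in range(k)]
        trans = []
        for c in range(k):
            m = [[pseudo] * N for _ in range(N)]
            for row, cnt in zip(resp, counts):
                for i in range(N):
                    for j in range(N):
                        m[i][j] += row[c] * cnt[i][j]
            trans.append([[x / sum(r) for x in r] for r in m])

        # E-step: posterior probability of each component for each sequence,
        # computed in log space to avoid underflow on long sequences.
        resp = []
        for cnt in counts:
            logs = []
            for c in range(k):
                ll = math.log(max(weights[c], 1e-300))
                for i in range(N):
                    for j in range(N):
                        ll += cnt[i][j] * math.log(trans[c][i][j])
                logs.append(ll)
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
    return weights, trans, resp
```

A call such as `em_markov_mixture(seqs, k=2)` returns the mixture weights, one 4x4 transition matrix per component, and the per-sequence responsibilities. Because EM converges only to a local optimum, the result depends on the starting point, which is why the sketch accepts an explicit `init_resp`; the paper's own procedure for choosing the initial approximation and the number of components is not reproduced here.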

Author information

Corresponding author

Correspondence to I. V. Sergienko.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 1, January–February, 2015, pp. 48–58.

About this article

Cite this article

Sergienko, I.V., Gupal, A.M. & Ostrovskiy, A.V. Using Em-Algorithm for Gene Classification. Cybern Syst Anal 51, 41–50 (2015). https://doi.org/10.1007/s10559-015-9695-z

  • DOI: https://doi.org/10.1007/s10559-015-9695-z

Keywords

  • Markov chain
  • classification
  • gene
  • bioinformatics
  • nucleotide
  • exon
  • intron
  • likelihood