Original Paper

Computational Statistics, Volume 22, Issue 1, pp 49-69

Using a VOM model for reconstructing potential coding regions in EST sequences

  • Armin Shmilovici, Department of Information Systems Engineering, Ben-Gurion University
  • Irad Ben-Gal, Department of Industrial Engineering, Tel-Aviv University



This paper presents a method for annotating coding and noncoding DNA regions using variable order Markov (VOM) models. A main advantage of VOM models is that their order may vary for different sequences, depending on the sequences’ statistics. As a result, VOM models are more flexible with respect to model parameterization and can be trained on relatively short sequences and on low-quality datasets, such as expressed sequence tags (ESTs). The paper also presents a modified VOM model for detecting and correcting the insertion and deletion sequencing errors that are commonly found in ESTs. In a series of experiments, the proposed method is found to be robust to random errors in these sequences.
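
To make the idea of a variable order concrete, the following is a minimal sketch and not the authors' modified VOM model. It assumes a toy rule in which the model predicts each base from the longest preceding context that was seen at least `min_count` times in training, falling back to shorter contexts otherwise, with add-one smoothing. The class name `SimpleVOM`, its parameters, and the training strings are hypothetical, and the sketch does not attempt insertion or deletion correction.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class SimpleVOM:
    """Toy variable order Markov model over DNA (illustrative assumption only)."""

    def __init__(self, max_order=3, min_count=2):
        self.max_order = max_order
        self.min_count = min_count
        # counts[context][symbol] = number of times `symbol` followed `context`
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences):
        for seq in sequences:
            for i, sym in enumerate(seq):
                # Record the symbol under every context length from 0 to max_order.
                for k in range(min(i, self.max_order) + 1):
                    self.counts[seq[i - k:i]][sym] += 1
        return self

    def _context(self, history):
        # Longest suffix of `history` observed at least min_count times;
        # the empty context (order 0) is the fallback.
        for k in range(min(len(history), self.max_order), 0, -1):
            ctx = history[-k:]
            if ctx in self.counts and sum(self.counts[ctx].values()) >= self.min_count:
                return ctx
        return ""

    def prob(self, history, sym):
        table = self.counts.get(self._context(history), {})
        total = sum(table.values())
        # Add-one smoothing so unseen symbols keep a nonzero probability.
        return (table.get(sym, 0) + 1) / (total + len(ALPHABET))

    def log_likelihood(self, seq):
        return sum(math.log(self.prob(seq[:i], seq[i])) for i in range(len(seq)))


# Hypothetical training data: one model per class, then compare likelihoods.
coding = SimpleVOM().fit(["ATGATGATGATG"])
noncoding = SimpleVOM().fit(["AAAAAAAAAAAA"])
query = "ATGATG"
label = "coding" if coding.log_likelihood(query) > noncoding.log_likelihood(query) else "noncoding"
```

Because the context length is chosen per position from the training counts, a sparsely observed long context simply falls back to a shorter one, which is one way to see why such models can cope with short or noisy training sequences.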


Keywords: Variable order Markov model; Coding and noncoding DNA; Context tree; Gene annotation; Sequencing error detection and correction