Computational Statistics, Volume 22, Issue 1, pp 49–69

Using a VOM model for reconstructing potential coding regions in EST sequences

Original Paper

Abstract

This paper presents a method for annotating coding and noncoding DNA regions using variable order Markov (VOM) models. A main advantage of VOM models is that their order may vary across sequences, depending on the sequences’ statistics. As a result, VOM models are more flexible with respect to model parameterization and can be trained on relatively short sequences and on low-quality datasets, such as expressed sequence tags (ESTs). The paper also presents a modified VOM model for detecting and correcting the insertion and deletion sequencing errors that are commonly found in ESTs. In a series of experiments, the proposed method is found to be robust to random errors in these sequences.
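To make the idea of a variable-order context concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a DNA alphabet of A, C, G and T, keeps a context only if it was seen at least `min_count` times, predicts from the longest retained context, and applies add-one smoothing. The function names (`train_vom`, `next_symbol_prob`, `log_likelihood`) and the parameters `max_order` and `min_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import math

ALPHABET = "ACGT"  # assumed alphabet for this illustration


def train_vom(sequences, max_order=3, min_count=2):
    """Count next-symbol frequencies for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, sym in enumerate(seq):
            for k in range(0, min(max_order, i) + 1):
                counts[seq[i - k:i]][sym] += 1
    # Retain only contexts observed often enough (a crude stand-in for
    # context-tree pruning); the empty context is always kept.
    return {ctx: dict(nxt) for ctx, nxt in counts.items()
            if ctx == "" or sum(nxt.values()) >= min_count}


def next_symbol_prob(model, history, sym, max_order=3):
    """Probability of sym given the longest retained suffix of history."""
    for k in range(min(max_order, len(history)), -1, -1):
        ctx = history[len(history) - k:]
        if ctx in model:
            nxt = model[ctx]
            total = sum(nxt.values())
            # Add-one smoothing so unseen symbols keep nonzero probability.
            return (nxt.get(sym, 0) + 1) / (total + len(ALPHABET))
    return 1.0 / len(ALPHABET)  # untrained model: fall back to uniform


def log_likelihood(model, seq, max_order=3):
    """Sum of log-probabilities of each symbol given its preceding context."""
    return sum(math.log(next_symbol_prob(model, seq[:i], seq[i], max_order))
               for i in range(len(seq)))


model = train_vom(["ACGACGACGACG"])
print(next_symbol_prob(model, "AC", "G"))
print(log_likelihood(model, "ACGACG"), log_likelihood(model, "GGTTAA"))
```

In this sketch the effective order differs from position to position: a long context is used where it was observed often enough, and the model backs off to shorter ones elsewhere. One possible use, again only as an assumption, is to train one such model on coding sequences and another on noncoding sequences and compare their log-likelihoods for a given region.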

Keywords

Variable order Markov model; Coding and noncoding DNA; Context tree; Gene annotation; Sequencing error detection and correction

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Department of Information Systems Engineering, Ben-Gurion University, Beer-Sheva, Israel
  2. Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv, Israel