Fast and Adaptive Variable Order Markov Chain Construction
Variable order Markov chains (VOMCs) are a flexible class of models that extend the well-known Markov chains. They have been applied to a variety of problems in computational biology, e.g. protein family classification. A linear time and space construction algorithm was published in 2000 by Apostolico and Bejerano. However, neither a report of its actual running time nor an implementation of it has been published since. In this paper we use the lazy suffix tree and the enhanced suffix array to improve upon the algorithm of Apostolico and Bejerano. We introduce a new software tool that is orders of magnitude faster than current tools for building VOMCs and is suitable for large-scale sequence analysis.
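To make the idea of a variable order model concrete, the following is a minimal Python sketch of a naive VOMC built by direct context counting. It is not the algorithm of Apostolico and Bejerano, nor the suffix tree and suffix array construction described here; it runs in time proportional to the sequence length times the maximum order and is meant only to illustrate what such a model stores. The function names `build_vomc` and `predict` and the parameters `max_order` and `min_count` are assumptions made for this illustration.

```python
from collections import defaultdict


def build_vomc(text, max_order=3, min_count=2):
    """Naively estimate next-symbol probabilities for contexts of length 0..max_order.

    Hypothetical helper for illustration only; contexts observed fewer than
    min_count times are dropped, which is what makes the order variable.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(text):
        for k in range(max_order + 1):
            if i - k < 0:
                break  # not enough preceding symbols for a context of length k
            counts[text[i - k:i]][symbol] += 1

    model = {}
    for context, followers in counts.items():
        total = sum(followers.values())
        # The empty context is always kept so that prediction can fall back to it.
        if context == "" or total >= min_count:
            model[context] = {s: c / total for s, c in followers.items()}
    return model


def predict(model, history, symbol):
    """Return P(symbol | longest suffix of history that is a retained context)."""
    for start in range(len(history) + 1):
        context = history[start:]
        if context in model:
            return model[context].get(symbol, 0.0)
    return 0.0  # only reached when the model is empty


model = build_vomc("abracadabra", max_order=2, min_count=2)
p_long = predict(model, "ab", "r")   # uses the order-2 context "ab"
p_short = predict(model, "ac", "a")  # "ac" and "c" are too rare, falls back to ""
```

In this sketch the context length used for a prediction depends on how often each context occurred in the training sequence, which is the property that distinguishes a VOMC from a fixed-order Markov chain.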