A Coarse-to-Fine Approach to Computing the k-Best Viterbi Paths

  • Jesper Nielsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6661)

Abstract

The Hidden Markov Model (HMM) is a probabilistic model used widely in the fields of Bioinformatics and Speech Recognition. Efficient algorithms for solving the most common problems are well known, yet they all have a running time that is quadratic in the number of hidden states, which can be problematic for models with very large state spaces. The Viterbi algorithm is used to find the maximum likelihood hidden state sequence, and it has earlier been shown that a coarse-to-fine modification can significantly speed up this algorithm on some models. We propose combining work on a k-best version of the Viterbi algorithm with the coarse-to-fine framework. The resulting algorithm may be used to approximate the total likelihood of the model, or to evaluate the goodness of the Viterbi path on very large models.
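For orientation, the following is a minimal sketch of a plain k-best Viterbi recursion in Python. It is not the coarse-to-fine algorithm proposed in the paper, only the baseline that the abstract builds on. The function name `k_best_viterbi`, the argument layout, and the toy model are all assumptions made for illustration. The two nested loops over states show where the quadratic cost in the number of hidden states comes from, and summing the probabilities of the k best paths gives a lower bound on the total likelihood.

```python
import heapq
import math


def k_best_viterbi(init, trans, emit, obs, k):
    """Return up to k (log_prob, path) pairs, best first.

    Hypothetical argument layout (an assumption of this sketch):
      init[s]      P(first state = s)
      trans[r][s]  P(next state = s | current state = r)
      emit[s][o]   P(observation o | state s)
    """
    n = len(init)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[s] holds up to k (log_prob, path) candidates ending in state s.
    best = [[(log(init[s]) + log(emit[s][obs[0]]), (s,))] for s in range(n)]
    for o in obs[1:]:
        new_best = []
        for s in range(n):  # every target state ...
            candidates = (
                (lp + log(trans[r][s]) + log(emit[s][o]), path + (s,))
                for r in range(n)  # ... times every predecessor: n * n work
                for lp, path in best[r]
            )
            new_best.append(heapq.nlargest(k, candidates))
        best = new_best
    # Merge the per-state lists into the overall k best paths.
    return heapq.nlargest(k, (c for per_state in best for c in per_state))


# Toy two-state model, invented for this example.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 0]

top = k_best_viterbi(init, trans, emit, obs, k=3)
# The summed probability of the k best paths bounds the likelihood from below.
likelihood_lower_bound = sum(math.exp(lp) for lp, _ in top)
for lp, path in top:
    print(path, math.exp(lp))
print("lower bound on total likelihood:", likelihood_lower_bound)
```

The sketch stores each full path in its candidate tuple to stay short. An implementation aimed at large models would more likely keep back-pointers and reconstruct paths at the end.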

Keywords

coarse-to-fine; k-best; Viterbi; Hidden Markov Models

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jesper Nielsen
    • 1
  1. Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark
