Abstract
Profile HMMs, which are based on classical hidden Markov models, have been widely studied for identifying members of protein sequence families. The classical Viterbi search algorithm, traditionally used to calculate the log-odds score of the alignment of a new sequence to a profile model, is based on probability theory. To overcome the limitations of the classical HMM and to achieve improved alignment and better log-odds scores for sequences belonging to a given family, we propose a fuzzy Viterbi search algorithm based on Choquet integrals and Sugeno fuzzy measures. When calculating the score for a given state, the proposed search algorithm incorporates the ascending values of the scores of the neighboring states, which provides better alignment and improved log-odds scores. The proposed fuzzy Viterbi algorithm for profiles and the classical Viterbi search algorithm have both been tested on the globin and kinase families. The results, in terms of log-odds scores, Z-scores and other statistical analyses, establish the superiority of the fuzzy Viterbi search algorithm.
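The abstract does not give the recursion itself, so the following is only a minimal sketch of the two building blocks it names: a Sugeno lambda-measure and a discrete Choquet integral taken over values sorted in ascending order. The function names, the bisection solver, and the toy inputs are illustrative assumptions, not the authors' implementation. The sketch also assumes non-negative input values and densities strictly between 0 and 1.

```python
from math import prod


def sugeno_lambda(densities, iterations=200):
    """Solve prod(1 + lam * g_i) = 1 + lam for the non-trivial root lam > -1.

    Assumes every density g_i lies strictly between 0 and 1.
    Returns 0.0 when the densities already sum to 1 (additive case).
    """
    total = sum(densities)
    if abs(total - 1.0) < 1e-9:
        return 0.0

    def f(lam):
        return prod(1.0 + lam * g for g in densities) - (1.0 + lam)

    if total > 1.0:
        # Root lies in (-1, 0); f is positive left of it, negative right of it.
        lo, hi = -1.0, 0.0
        left_is_positive = True
    else:
        # Root lies in (0, inf); f is negative left of it, positive right of it.
        lo, hi = 0.0, 1.0
        while f(hi) <= 0.0:
            hi *= 2.0
        left_is_positive = False

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == left_is_positive:
            lo = mid  # mid is still on the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi)


def choquet_integral(values, densities, lam):
    """Discrete Choquet integral of non-negative values w.r.t. a lambda-measure.

    values[i] is the score contributed by source i and densities[i] is the
    fuzzy density of that source (both hypothetical inputs here).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # ascending values

    # Measure of each tail set {x_(k), ..., x_(n)}, built from the largest
    # value downwards using g(A u B) = g(A) + g(B) + lam * g(A) * g(B).
    tail_measure = [0.0] * n
    acc = 0.0
    for k in range(n - 1, -1, -1):
        g = densities[order[k]]
        acc = acc + g + lam * acc * g
        tail_measure[k] = acc

    # Sum of ascending increments weighted by the tail measures.
    result, previous = 0.0, 0.0
    for k in range(n):
        v = values[order[k]]
        result += (v - previous) * tail_measure[k]
        previous = v
    return result


if __name__ == "__main__":
    toy_densities = [0.25, 0.25]   # made-up fuzzy densities
    toy_scores = [0.2, 0.6]        # made-up non-negative scores
    lam = sugeno_lambda(toy_densities)
    print(lam, choquet_integral(toy_scores, toy_densities, lam))
```

For the toy inputs the solver returns a lambda close to 8 and the integral is close to 0.3. When the densities sum to 1, lambda is 0 and the integral reduces to an ordinary weighted average, which is one way to see the Choquet integral as a generalization of the probabilistic expectation.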
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bidargaddi, N.P., Chetty, M., Kamruzzaman, J. (2005). A Fuzzy Viterbi Algorithm for Improved Sequence Alignment and Searching of Proteins. In: Rothlauf, F., et al. Applications of Evolutionary Computing. EvoWorkshops 2005. Lecture Notes in Computer Science, vol 3449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32003-6_2
DOI: https://doi.org/10.1007/978-3-540-32003-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25396-9
Online ISBN: 978-3-540-32003-6
eBook Packages: Computer Science (R0)