Advances in Computational Biology, pp. 103-108
False Positive Reduction in Automatic Segmentation System
Abstract
An application has been developed for automatic segmentation of Potyvirus polyproteins through stochastic models of Pattern Recognition. These models usually find the correct location of the cleavage site but also suggest other possible locations, called false positives. To reduce the number of false positives, we evaluated three methods. The first shrinks the search range by skipping portions of the polyprotein that have a low probability of containing the cleavage site. The second and third use a measure to rank candidate locations so that the correct cleavage site is ranked as high as possible. We evaluate two alternative measures: the probability emitted by Hidden Markov Models (HMM) and the Minimum Editing Distance (MED). Our results indicate that HMM probability is a better quality measure of a candidate location than MED. This probability is useful for eliminating most false positives. It also makes it possible to quantify the quality of an automatic segmentation.
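As a rough illustration of the MED-based ranking described in the abstract, the following Python sketch scores each candidate location by the edit distance between the residues around it and a reference cleavage-site pattern, then sorts candidates from best to worst. It is only a sketch under stated assumptions, not the authors' implementation: the function names, the window sizes (`left`, `right`), the unit edit costs, and the idea of comparing against a single reference pattern are all hypothetical choices made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum edit distance (Wagner-Fischer dynamic programme).

    Assumption: insertions, deletions and substitutions all cost 1.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank_candidates(sequence, candidates, reference, left=4, right=1):
    """Order candidate positions by how closely their window matches `reference`.

    Hypothetical convention: a candidate `pos` is the index of the first
    residue after the cut, and its window spans `left` residues before the
    cut and `right` residues after it. Smaller distance means better rank.
    """
    scored = []
    for pos in candidates:
        window = sequence[max(0, pos - left): pos + right]
        scored.append((edit_distance(window, reference), pos))
    return [pos for _, pos in sorted(scored)]
```

Calling `rank_candidates(sequence, candidates, reference)` returns the candidate positions with the closest match first, so a false positive is any candidate ranked above the true cleavage site. The HMM alternative would replace `edit_distance` with the probability the model assigns to the window and sort in descending order.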
Keywords
Hidden Markov Model · Cleavage Site · Automatic Segmentation · Search Range · Candidate Location