Abstract
We consider the matching of weighted patterns against an unweighted text. We adapt the shift-add algorithm for this problem. We also present an algorithm that enumerates all strings that produce a score higher than a given score threshold when aligned against a weighted pattern and then searches for all these strings using a standard exact multipattern algorithm. We show that both of these approaches are faster than previous algorithms on patterns of moderate length and high significance levels while the good performance of the shift-add algorithm continues with lower significance levels.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Baeza-Yates, R., Gonnet, G.: A new approach to text searching. Communications of the ACM 35(10), 74–82 (1992)
Claverie, J.M., Audic, S.: The statistical significance of nucleotide position-weight matrix matches. Computer Applications in Biosciences 12(5), 431–439 (1996)
Karp, R., Rabin, M.: Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development 31, 160–249 (1987)
Liefooghe, A., Touzet, H., Varré, J.S.: Large scale matching for position weight matrices. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 401–412. Springer, Heidelberg (2006)
Matys, V., Fricke, E., Geffers, R., Gößling, E., Haubrock, M., Hehl, R., Hornischer, K., Karas, D., Kel, A., Kel-Margoulis, O., Kloos, D., Land, S., Lewicki-Potapov, B., Michael, H., Münch, R., Reuter, I., Rotert, S., Saxel, H., Scheer, M., Thiele, S., Wingender, E.: TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res. 31, 374–378 (2003)
Navarro, G., Raffinot, M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics 5(4), 1–36 (2000)
Pizzi, C., Rastas, P., Ukkonen, E.: Fast search algorithms for position specific scoring matrices. In: Hochreiter, S., Wagner, R. (eds.) BIRD 2007. LNCS (LNBI), vol. 4414, pp. 239–250. Springer, Berlin (2007)
Pizzi, C., Ukkonen, E.: Fast profile matching algorithms – a survey. Theoretical Computer Science (to appear)
Salmela, L., Tarhio, J., Kytöjoki, J.: Multi-pattern string matching with q-grams. ACM Journal of Experimental Algorithmics 11, 1–19 (2006)
Staden, R.: Methods for calculating the probabilities of finding patterns in sequences. Computer Applications in Biosciences 5, 89–96 (1989)
Wu, T., Neville-Manning, C., Brutlag, D.: Fast probabilistic analysis of sequence function using scoring matrices. Bioinformatics 16(3), 233–244 (2000)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Salmela, L., Tarhio, J. (2007). Algorithms for Weighted Matching. In: Ziviani, N., Baeza-Yates, R. (eds) String Processing and Information Retrieval. SPIRE 2007. Lecture Notes in Computer Science, vol 4726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75530-2_25
Download citation
DOI: https://doi.org/10.1007/978-3-540-75530-2_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75529-6
Online ISBN: 978-3-540-75530-2
eBook Packages: Computer ScienceComputer Science (R0)