Spaced Seeds Design Using Perfect Rulers

  • Conference paper
String Processing and Information Retrieval (SPIRE 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7024)

Abstract

We consider the problem of lossless spaced seed design for approximate pattern matching. We show that, using mathematical objects known as perfect rulers, we can derive a family of spaced seeds for matching with up to two errors. We analyze these seeds with respect to the trade-off they offer between seed weight and the minimum length of the pattern to be matched. We prove that, for patterns of length up to a few hundred, our seeds have a larger weight, and hence a better filtration efficiency, than those known in the literature. In this context, we study in depth the specific case of Wichmann rulers and prove some preliminary results on the generalization of our approach to the larger class of unrestricted rulers.
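The two notions the abstract relies on can be illustrated with a minimal Python sketch. It is not the construction from the paper, which this page does not reproduce. It assumes the usual definitions: a ruler is complete when every distance from 1 to its length is the difference of two marks (a perfect ruler is a complete ruler with the fewest possible marks for its length), and a seed is lossless for pattern length m and k errors when, for every placement of k mismatches, some shift of the seed has all its required positions on matches. The function names `is_complete_ruler` and `is_lossless` are hypothetical, and the lossless check is brute force, so it is only practical for small m and k.

```python
from itertools import combinations


def is_complete_ruler(marks):
    """Return True if every distance 1..L is the difference of two marks,
    where L is the distance between the smallest and largest mark."""
    marks = sorted(set(marks))
    length = marks[-1] - marks[0]
    measured = {b - a for a, b in combinations(marks, 2)}
    return all(d in measured for d in range(1, length + 1))


def is_lossless(seed, m, k):
    """Brute-force check (illustrative only) that `seed` is lossless for
    patterns of length m with up to k mismatches.

    `seed` is a string over {'1', '0'}: '1' marks a position that must
    match, '0' a don't-care position. Assumes k <= m. Checking exactly k
    mismatches is enough, since fewer mismatches are never harder to avoid.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    shifts = range(m - len(seed) + 1)
    for errors in combinations(range(m), k):
        bad = set(errors)
        # The seed "hits" at shift s if no required position lands on a mismatch.
        if not any(all(s + i not in bad for i in care) for s in shifts):
            return False
    return True


# {0, 1, 4, 6} measures every distance from 1 to 6.
print(is_complete_ruler([0, 1, 4, 6]))   # True
# A contiguous seed of weight 3 with one error: length 6 suffices, length 5 does not.
print(is_lossless("111", 6, 1))          # True
print(is_lossless("111", 5, 1))          # False
```

The same `is_lossless` call accepts a seed with don't-care positions and k = 2, which is the setting the abstract addresses; the weight of a seed is its number of '1' characters.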

This research is funded by the BioBITS Project Converging Technologies 2007, area: Biotechnology-ICT, Regione Piemonte.



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Egidi, L., Manzini, G. (2011). Spaced Seeds Design Using Perfect Rulers. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds) String Processing and Information Retrieval. SPIRE 2011. Lecture Notes in Computer Science, vol 7024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24583-1_5

  • DOI: https://doi.org/10.1007/978-3-642-24583-1_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24582-4

  • Online ISBN: 978-3-642-24583-1

  • eBook Packages: Computer Science, Computer Science (R0)
