Superiority and Complexity of the Spaced Seeds

Years and Authors of Summarized Original Work

  • 2006; Ma, Li, Zhang

Problem Definition

In the 1970s, sequence alignment was introduced to demonstrate the similarity of the sequences of genes and proteins [12]. A DNA sequence is a finite sequence over the four nucleotides (adenine, guanine, cytosine, and thymine), whereas a protein sequence is a finite sequence over the 20 amino acids. Homologous proteins have similar biological functions. Because they evolved from a common ancestral sequence, the sequences of homologous proteins and of their encoding genes are often highly similar. Therefore, the amino acid sequence of a protein, or the DNA sequence of its encoding gene, is often aligned with the sequences of well-studied proteins to infer the protein's biological functions.

Formally, an alignment of two sequences, S and T, on an alphabet \(\mathcal{B}\) is a two-row matrix with the following properties:

  1. The letters in S are listed in order, interspersed with space symbols “–,” in a row, where “–” represents the fact that a letter is missing at...
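
The two-row matrix view can be made concrete with a small check. The sketch below is illustrative only: the list of properties is cut off in this text, so the conditions on the second row and on columns (equal row lengths, no column holding two space symbols) are assumed from the usual textbook definition of an alignment, and the names `GAP` and `is_alignment` are hypothetical.

```python
GAP = "-"  # ASCII stand-in for the space symbol "–" used in the text


def is_alignment(row_s: str, row_t: str, s: str, t: str) -> bool:
    """Return True if (row_s, row_t) is a two-row alignment of s and t.

    Assumed conditions (the source lists only the first one in part):
      * both rows have the same number of columns;
      * deleting the space symbols from each row gives back s and t,
        with the letters in their original order;
      * no column consists of two space symbols.
    """
    if len(row_s) != len(row_t):
        return False
    if row_s.replace(GAP, "") != s or row_t.replace(GAP, "") != t:
        return False
    return all(not (a == GAP and b == GAP) for a, b in zip(row_s, row_t))


# Example: one possible alignment of S = ACGT and T = AGTT.
print(is_alignment("ACG-T", "A-GTT", "ACGT", "AGTT"))
```

Representing each row as a string keeps the column structure explicit: column i of the matrix is simply the pair `(row_s[i], row_t[i])`.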

Recommended Reading

  1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410

  2. Brejová B, Brown D, Vinař T (2004) Optimal spaced seeds for homologous coding regions. J Bioinformatics Comput Biol 1:595–610

  3. Buhler J, Keich U, Sun Y (2005) Designing seeds for similarity search in genomic DNA. J Comput Syst Sci 70:342–363

  4. Choi KP, Zhang LX (2004) Sensitivity analysis and efficient method for identifying optimal spaced seeds. J Comput Syst Sci 68:22–40

  5. Choi KP, Zeng F, Zhang LX (2004) Good spaced seeds for homology search. Bioinformatics 20:1053–1059

  6. Intl Mouse Genome Sequencing Consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562

  7. Keich U, Li M, Ma B, Tromp J (2004) On spaced seeds for similarity search. Discret Appl Math 138(3):253–263

  8. Li M, Ma B, Kisman D, Tromp J (2004) PatternHunter II: highly sensitive and fast homology search. J Bioinformatics Comput Biol 2:417–440

  9. Ma B, Yao H (2009) Seed optimization for iid similarities is no easier than optimal Golomb ruler design. Inf Process Lett 109(19):1120–1124

  10. Ma B, Tromp J, Li M (2002) PatternHunter: faster and more sensitive homology search. Bioinformatics 18:440–445

  11. Ma B, Li M (2007) On the complexity of the spaced seeds. J Comput Syst Sci 73:1024–1034

  12. Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453

  13. Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195–197

  14. Sun Y, Buhler J (2004) Designing multiple simultaneous seeds for DNA similarity search. In: Proceedings RECOMB’04, 2004, San Diego, pp 76–85

  15. Zhang LX (2007) Superiority of spaced seeds for homology search. IEEE/ACM Trans Comput Biol Bioinformatics (TCBB) 4:496–505


Author information

Corresponding author

Correspondence to Louxin Zhang.

Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Zhang, L. (2016). Superiority and Complexity of the Spaced Seeds. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_803
