Genome Sequencing and Annotation

Part of the book series: Methods in Molecular Medicine™ ((MIMM,volume 67))

Abstract

The availability of complete microbial genome sequences greatly facilitates experimental molecular investigation of the respective organisms by providing complete lists of genes, their genetic contexts, and their predicted functions. This information can be used in a number of ways to focus studies on bacterial pathogenesis and vaccine development (1,2). The complete genome sequences of two unrelated strains of Neisseria meningitidis are now available (3,4): a derivative of isolate MC58, which originally expressed serogroup B capsule, and strain Z2491, which is serogroup A. The genome sequences of both strains were determined using the whole-genome shotgun approach (5). In this approach, randomly sheared chromosomal DNA is cloned to make a small-insert library (1.5–2.0 kb for MC58; 0.5–0.8 kb and 1.0–1.5 kb for Z2491), and each insert is then sequenced from both ends using plasmid-specific primers. For the MC58 genome sequence, a large-insert lambda library (8–24 kb) was also used. In the initial sequencing phase, 6–8-fold coverage of the estimated genome size is generally achieved. The DNA sequences are then assembled (linked together) into large contigs (a term derived from the word contiguous). Polymerase chain reaction (PCR) and sequencing of large-insert libraries are then used to join the contigs, close gaps, and resolve ambiguities (see ref. 6 for a review).
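
The relationship between the coverage figure quoted above and the number of sequencing reads can be illustrated with a short calculation. The following Python sketch is illustrative only: the genome size (2.2 Mb) and average read length (500 bp) are assumed values that are not stated in the text, and the function names are hypothetical. It uses the Lander-Waterman (Poisson) model with the minimum-overlap term ignored, which is a common approximation and not necessarily the calculation used by either sequencing project.

```python
import math


def reads_needed(genome_size_bp: int, read_length_bp: int, coverage: float) -> int:
    """Number of reads required to reach a target average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson estimate of the fraction of bases not covered by any read."""
    return math.exp(-coverage)


def expected_contigs(n_reads: int, coverage: float) -> float:
    """Lander-Waterman estimate of the number of contigs, N * exp(-c).

    The minimum overlap needed to join two reads is ignored here, so this
    is an optimistic (lower) estimate.
    """
    return n_reads * math.exp(-coverage)


if __name__ == "__main__":
    GENOME_SIZE_BP = 2_200_000  # assumption: approximate genome size
    READ_LENGTH_BP = 500        # assumption: average usable read length

    for coverage in (6, 8):     # the 6-8-fold range given in the abstract
        n = reads_needed(GENOME_SIZE_BP, READ_LENGTH_BP, coverage)
        print(
            f"{coverage}x coverage: {n} reads, "
            f"{expected_uncovered_fraction(coverage):.4%} of bases uncovered, "
            f"about {expected_contigs(n, coverage):.0f} contigs expected"
        )
```

Under these assumptions, even at 6–8-fold coverage a number of gaps between contigs is still expected, which is consistent with the need for the PCR-based gap closure step described above.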

References

  1. Saunders N. J. and Moxon E. R. (1998) Implications of sequencing bacterial genomes for pathogenesis and vaccine development. Curr. Opin. Biotechnol. 9, 618–623.

  2. Field D., Hood D., and Moxon E. R. (1999) Contribution of genomics to bacterial pathogenesis. Curr. Opin. Genet. Dev. 9, 700–703.

  3. Tettelin H., Saunders N. J., Heidelberg J., Jeffries A. C., Nelson K. E., Eisen J. A., et al. (2000) Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science 287, 1809–1815.

  4. Parkhill J., Achtman M., James K. D., Bentley S. D., Churcher C., Klee S. R., et al. (2000) Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404, 502–506.

  5. Fleischmann R. D., Adams M. D., White O., Clayton R. A., Kirkness E. F., Kerlavage A. R., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512.

  6. Frangeul L., Nelson K. E., Buchrieser C., Danchin A., Glaser P., and Kunst F. (1999) Cloning and assembly strategies in microbial genome projects. Microbiology 145, 2625–2634.

  7. Salzberg S., Delcher A., Kasif S., and White O. (1998) Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 26, 544–548.

  8. Saunders N. J., Peden J. F., Hood D. W., and Moxon E. R. (1998) Simple sequence repeats in the Helicobacter pylori genome. Mol. Microbiol. 27, 1091–1098.

  9. Moxon E. R., Rainey P. B., Nowak M. A., and Lenski R. E. (1994) Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr. Biol. 4, 24–33.

  10. Saunders N. J., Jeffries A. C., Peden J. F., Hood D. W., Tettelin H., Rappuoli R., and Moxon E. R. (2000) Repeat-associated phase variable genes in the complete genome sequence of Neisseria meningitidis strain MC58. Mol. Microbiol. 37, 207–215.

  11. Tatusov R. L., Koonin E. V., and Lipman D. J. (1997) A genomic perspective on protein families. Science 278, 631–63

  12. Tatusov R. L., Galperin M. Y., Natale D. A., and Koonin E. V. (2000) The COG database: a tool for genome-scale analyses of protein functions and evolution. Nucleic Acids Res. 28, 33–36.

  13. Altschul S. F., Gish W., Miller W., Myers E. W., and Lipman D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

  14. Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., Zhang Z., Miller W., and Lipman D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

  15. Pearson W. R. (2000) Flexible sequence similarity searching with the FASTA3 program package. Methods Mol. Biol. 132, 185–219.

  16. Henikoff S. and Henikoff J. G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10,915–10,919.

  17. Galperin M. Y. and Koonin E. V. (1998) Sources of systematic error in functional annotation of genomes: domain rearrangement, non-orthologous gene displacement and operon disruption. In Silico Biol. 1, 55–67.

  18. Enright A. J., Iliopoulos I., Kyrpides N. C., and Ouzounis C. A. (1999) Protein interaction maps for complete genomes based on gene fusion events. Nature 402, 86–90.

  19. Kyrpides N. C. and Ouzounis C. A. (1999) Whole-genome sequence annotation: ‘Going wrong with confidence.’ Mol. Microbiol. 32, 881–891.

  20. Bateman A., Birney E., Durbin R., Eddy S. R., Howe K. L., and Sonnhammer E. L. L. (2000) The Pfam protein families database. Nucleic Acids Res. 28, 263–266.

  21. Bucher P. and Bairoch A. (1994) A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence interpretation, in ISMB-94 Proceedings 2nd International Conference on Intelligent Systems for Molecular Biology (Altman R., Brutlag D., Karp P., Lathrop R., and Searls D., eds.), AAAI Press, Menlo Park, CA, pp. 53–61.

  22. Corpet F., Gouzy J., and Kahn D. (1999) Recent improvements of the ProDom database of protein domain families. Nucleic Acids Res. 27, 263–267.

  23. Doolittle R. F. (1995) The multiplicity of domains in proteins. Ann. Rev. Biochem. 64, 287–314.

  24. Kanehisa M. and Goto S. (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30.

  25. Galperin M. Y. and Koonin E. V. (1999) Functional genomics and genome evolution. Genetica 106, 159–170.

  26. Galperin M. Y., Tatusov R. L., and Koonin E. V. (1999) Comparing microbial genomes: how the gene set determines the lifestyle, in Organization of the Prokaryotic Genome (Charlebois R. L., ed.), ASM, Washington, DC, pp. 91–108.

  27. Riley M. (1998) Systems for categorizing functions of gene products. Curr. Opin. Struct. Biol. 8, 388–392.

  28. Koonin E. V., Mushegian A. R., and Bork P. (1996) Non-orthologous gene displacement. Trends Genet. 12, 334–336.

  29. Henikoff S., Greene E. A., Pietrokovski S., Bork P., Attwood T. K., and Hood L. (1997) Gene families: the taxonomy of protein paralogs and chimeras. Science 278, 609–614.

  30. Tomb J. F., White O., Kerlavage A. R., Clayton R. A., Sutton G. G., Fleischmann R. D., et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547.

  31. Overbeek R., Fonstein M., D’Souza M., Pusch G. D., and Maltsev N. (1999) The use of gene clusters to infer functional coupling. Proc. Natl. Acad. Sci. USA 96, 2896–2901.

  32. Ouellette B. F. F. (1998) The GenBank sequence database, in Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis A. D. and Ouellette B. F. F., eds.), Wiley, New York, pp. 16–45.

  33. Delcher A. L., Kasif S., Fleischmann R. D., Peterson J., White O., and Salzberg S. L. (1999) Alignment of whole genomes. Nucleic Acids Res. 27, 2369–2376.

  34. Karlin S., Campbell A. M., and Mrazek J. (1998) Comparative DNA analysis across diverse genomes. Ann. Rev. Genet. 32, 185–225.

  35. Pizza M., Scarlato V., Masignani V., Giuliani M. M., Arico B., Baldi L., et al. (2000) Novel proteins for vaccine development from the meningococcus B genome. Science 287, 1816–1820.

  36. Saunders N. J., Peden J. F., and Moxon E. R. (1999) The absence of an uptake sequence in Helicobacter pylori. Microbiology 145, 3523–3528.

  37. Hosking S. L., Deadman M. E., Moxon E. R., Peden J. F., Saunders N. J., and High N. J. (1998) An in silico evaluation of Tn916 as a tool for generalised mutagenesis in Haemophilus influenzae Rd. Microbiology 144, 2525–2530.

  38. Wilson M., DeRisi J., Kristensen H-H., Imboden P., Rane S., Brown P. O., and Schoolnik G. K. (1999) Exploring drug-induced alterations in gene expression by microarray hybridization. Proc. Natl. Acad. Sci. USA 96, 12,833–12,838.

  39. Behr M. A., Wilson M. A., Gill W. P., Salamon H., Schoolnik G. K., Rane S., and Small P. M. (1999) Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284, 1520–1523.

  40. Wasinger V. C., Pollack J. D., and Humphery-Smith I. (2000) The proteome of Mycoplasma genitalium. Chaps-soluble component. Eur. J. Biochem. 267, 1571–1582.

Copyright information

© 2001 Humana Press Inc., Totowa, NJ

About this protocol

Cite this protocol

Jeffries, A.C., Saunders, N.J., Hood, D.W. (2001). Genome Sequencing and Annotation. In: Walker, J.M., Pollard, A.J., Maiden, M.C.J. (eds) Meningococcal Disease. Methods in Molecular Medicine™, vol 67. Humana Press. https://doi.org/10.1385/1-59259-149-3:215

  • DOI: https://doi.org/10.1385/1-59259-149-3:215

  • Publisher Name: Humana Press

  • Print ISBN: 978-0-89603-849-3

  • Online ISBN: 978-1-59259-149-7

  • eBook Packages: Springer Protocols
