Shallow Shotgun Sequencing as a Strategy for Finding Coding Exons

Identification of Transcribed Sequences

Abstract

Narrowing candidate genomic segments down to a size compatible with contig assembly by shotgun sequencing is a limiting step in the identification of mammalian genes by positional cloning. A 6- to 8-fold sequencing redundancy (coverage) is usually required and, most often, does not remove the need for directed approaches to close the remaining gaps. Once a complete sequence is obtained, computer analyses are used to locate candidate exons, which usually span less than 10% of the genomic sequence. Here, I propose an alternative strategy in which the need for contig assembly, and the high coverage it imposes, is removed. This strategy takes advantage of two facts: i) mammalian coding exons are on average much smaller than individual sequencing runs, and ii) computer methods for identifying coding regions depend only on local sequence information. In this context of “exon hunting” (as opposed to genomic sequencing per se), I show that a 2- to 3-fold sequencing coverage is sufficient to locate most candidate exons within genomic fragments, the size of which is now limited only by the available sequencing power. This strategy can be fully automated, as it involves a single experimental technique and real-time computer analysis of the data.
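
The coverage argument in the abstract can be illustrated with a rough calculation. The sketch below is not the chapter's own analysis: it assumes a simple Poisson model of random read placement (in the spirit of Lander and Waterman), and the read length of 400 bases and exon length of 150 bases are hypothetical values chosen only for illustration. It estimates the chance that a given base is sequenced at all, and the chance that an exon falls entirely within at least one read, for shallow and deep coverage.

```python
import math


def base_coverage(c):
    """Probability that a given base lies in at least one read.

    Assumes read starts are Poisson-distributed along the fragment,
    so the number of reads over a base is Poisson with mean c.
    """
    return 1.0 - math.exp(-c)


def exon_in_single_read(c, read_len, exon_len):
    """Probability that an exon lies entirely inside at least one read.

    A read of length read_len contains the whole exon only if it starts
    within a window of about (read_len - exon_len) bases. With read
    starts arriving at rate c / read_len per base, the number of such
    reads is Poisson with mean c * (read_len - exon_len) / read_len.
    """
    if exon_len > read_len:
        return 0.0  # the exon cannot fit in one read
    mean_reads = c * (read_len - exon_len) / read_len
    return 1.0 - math.exp(-mean_reads)


if __name__ == "__main__":
    READ_LEN = 400  # hypothetical read length, in bases
    EXON_LEN = 150  # hypothetical exon length, in bases
    for c in (2, 3, 6, 8):
        print(
            f"coverage {c}x: base covered {base_coverage(c):.3f}, "
            f"exon within one read {exon_in_single_read(c, READ_LEN, EXON_LEN):.3f}"
        )
```

Requiring the whole exon to sit inside one read is a conservative criterion. Methods that rely on local sequence information may still flag an exon that is only partly covered, so the second figure is best read as a lower bound under this model.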

Copyright information

© 1994 Springer Science+Business Media New York

About this chapter

Cite this chapter

Claverie, JM. (1994). Shallow Shotgun Sequencing as a Strategy for Finding Coding Exons. In: Hochgeschwender, U., Gardiner, K. (eds) Identification of Transcribed Sequences. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-2562-2_20

  • DOI: https://doi.org/10.1007/978-1-4615-2562-2_20

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6094-0

  • Online ISBN: 978-1-4615-2562-2

  • eBook Packages: Springer Book Archive
