Abstract
The arrival of next-generation (second-generation) sequencing has ushered in a new era of marsupial genomics, in which large-scale sequencing of marsupial transcriptomes, and soon perhaps genomes, is within the scope of many independent laboratories. This promises to reveal much about the biology of marsupial genomes and provides opportunities for comparison with eutherian genomes. These comparisons will highlight both the conserved features that are critical and the important differences where marsupials and eutherians have followed different evolutionary paths. Here we describe the current state of marsupial genomic sequencing projects and resources, including available genome and transcriptome sequences. We also survey a number of bioinformatics tools, particularly those that we have applied to marsupial, or sometimes monotreme, genomic data and found useful. Finally, we describe some of the challenges we met in dealing with marsupial sequence data, largely from next-generation platforms; we think this experience is also relevant to other non-model organisms.
Notes
- 1. Sanger sequencing is capillary-based DNA sequencing that produces long (600–1,000 nt), high-quality reads.
- 2. The Roche 454 is a next-generation sequencing platform that produces hundreds of thousands of reads 200–600 nt long.
- 3. A transcriptome is the set of all transcripts, or expressed genes, in a tissue.
- 4. The Illumina GA2 is a next-generation sequencing platform that produces millions of short reads (32–100 nt).
- 5. RNA-seq is expression analysis based on sequencing transcripts and counting the reads that map to each gene.
- 6. A flowgram is the signal intensity data from a Roche 454 sequencer. It is analogous to the chromatogram in Sanger sequencing. The signal intensity is proportional to the number of bases of the same type incorporated in each sequencing step.
- 7. NCBI Reference Sequence collection (http://www.ncbi.nlm.nih.gov/refseq).
- 8. The E-value is the number of hits expected by chance when searching a database of a particular size. Note that you should use proper scientific notation to write E-values in publications, not the common, but dreadful, computer shorthand (e.g. 1e-5).
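As an illustration of the advice in Note 8, the following minimal Python sketch converts the computer shorthand emitted by many search tools into conventional scientific notation. The function name `format_evalue` and the choice of Unicode superscripts for the exponent are assumptions made for this example, not something prescribed by the chapter; a journal may prefer typeset superscripts instead.

```python
import re

# Translation table from ASCII digits and minus sign to Unicode superscripts.
_SUPERSCRIPTS = str.maketrans("-0123456789", "⁻⁰¹²³⁴⁵⁶⁷⁸⁹")

# Mantissa (e.g. "1", "2.5", ".5") followed by e/E and a signed integer exponent.
_SHORTHAND = re.compile(r"((?:[0-9]+\.?[0-9]*)|(?:\.[0-9]+))[eE]([+-]?[0-9]+)")


def format_evalue(shorthand: str) -> str:
    """Rewrite shorthand such as '1e-5' as '1 × 10⁻⁵'.

    Plain decimals such as '0.0' or '0.003' are returned unchanged.
    Non-numeric input raises ValueError.
    """
    text = shorthand.strip()
    match = _SHORTHAND.fullmatch(text)
    if match is None:
        float(text)  # raises ValueError if the text is not a number at all
        return text
    mantissa, exponent = match.groups()
    # int() drops a leading '+' and leading zeros: '+05' -> 5, '-05' -> -5.
    exponent = str(int(exponent))
    return f"{mantissa} × 10{exponent.translate(_SUPERSCRIPTS)}"


if __name__ == "__main__":
    for raw in ["1e-5", "2.5E-20", "0.0"]:
        print(raw, "->", format_evalue(raw))
```

The mantissa is kept as a string rather than parsed to a float, so the number of significant figures reported by the search tool is preserved exactly.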
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Papenfuss, A.T., Hsu, A., Wakefield, M. (2010). Marsupial Sequencing Projects and Bioinformatics Challenges. In: Deakin, J., Waters, P., Marshall Graves, J. (eds) Marsupial Genetics and Genomics. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9023-2_6
DOI: https://doi.org/10.1007/978-90-481-9023-2_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9022-5
Online ISBN: 978-90-481-9023-2
eBook Packages: Biomedical and Life Sciences (R0)